A detailed guide to reading and writing JSON for data science

The most common uses of JSON are exchanging data with APIs and saving data to .json files for persistent storage. Working with this data in Python is straightforward.

JSON

JSON origin

JSON stands for JavaScript Object Notation. It originated as a subset of the JavaScript programming language covering object literal syntax, but it has long been a language-independent format defined by its own standard.

JSON sample

{
    "data": [
        {
            "id": "1",
            "name": "A同学",
            "state": "1",
            "createTime": "2020-01-21"
        },
        {
            "id": "2",
            "name": "B同学",
            "state": "1",
            "createTime": "2020-01-21"
        },
        {
            "id": "3",
            "name": "C同学",
            "state": "0",
            "createTime": "2020-01-21"
        }
    ]
}

Python natively supports JSON

Python comes with a built-in package json for encoding and decoding JSON data.

Import it as follows.

import json

The process of JSON encoding is often called serialization. The term refers to converting data into a series of bytes for storage or transmission over a network. Deserialization is the reverse process: decoding data that was stored or delivered in the JSON format.

Serialize JSON

Simple Python objects convert to JSON according to a straightforward mapping.

Python         JSON
dict           object
list, tuple    array
str            string
int, float     number
True           true
False          false
None           null
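
As a quick check of the table above, the following minimal sketch serializes one value of each Python type with json.dumps. The sample values and the names samples and encoded are made up for illustration.

```python
import json

# Hypothetical sample values, one per row of the table above
samples = {
    "dict": {"k": "v"},
    "list": [1, 2],
    "tuple": (1, 2),
    "str": "text",
    "int": 3,
    "float": 1.5,
    "True": True,
    "False": False,
    "None": None,
}

# Serialize each value on its own to see its JSON form
encoded = {name: json.dumps(value) for name, value in samples.items()}
for name, text in encoded.items():
    print(f"{name:6} -> {text}")
```

Note that a list and a tuple produce the same JSON array, so the distinction between them is lost once the data is serialized.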

Simple serialization example

Create a simple data structure.

data = {
    "data": [
        {
            "id": "1",
            "name": "A同学",
            "state": "1",
            "createTime": "2020-01-21"
        },
        {
            "id": "2",
            "name": "B同学",
            "state": "1",
            "createTime": "2020-01-21"
        },
        {
            "id": "3",
            "name": "C同学",
            "state": "0",
            "createTime": "2020-01-21"
        }
    ]
}

Write the data directly to a file.

with open("data_file.json", "w") as f:
    json.dump(data, f)

Serialize the data to a string instead.

json_str = json.dumps(data)
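
By default, json.dump() and json.dumps() escape non-ASCII characters, so a name such as "A同学" is written as \uXXXX sequences. The sketch below, which uses a shortened record assumed for illustration, shows how the ensure_ascii and indent parameters change the output; both forms decode to the same data.

```python
import json

# A shortened record in the style of the sample data (assumed for illustration)
record = {"id": "1", "name": "A同学", "state": "1"}

# Default: non-ASCII characters are escaped as \uXXXX sequences
escaped = json.dumps(record)

# ensure_ascii=False keeps the characters readable; indent pretty-prints
readable = json.dumps(record, ensure_ascii=False, indent=2)

print(escaped)
print(readable)
```

When writing output produced with ensure_ascii=False to a file, opening the file with encoding="utf-8" avoids depending on the platform's default encoding.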

JSON deserialization

Use load() and loads() in the json library to convert JSON encoded data to Python objects.

JSON                       Python
object                     dict
array                      list
string                     str
number (integer)           int
number (floating point)    float
true                       True
false                      False
null                       None
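
The mapping above can be checked with a short sketch. The JSON text and the variable names here are hypothetical. The second half illustrates that a round trip through JSON does not always return the original object: tuples come back as lists, and non-string dictionary keys come back as strings.

```python
import json

# Hypothetical JSON text covering each row of the table above
text = '{"obj": {"a": 1}, "arr": [1, 2], "s": "x", "i": 7, "f": 7.0, "t": true, "n": null}'
decoded = json.loads(text)

for key, value in decoded.items():
    print(key, type(value).__name__, value)

# A round trip is not always the identity: the tuple becomes a list
# and the integer key 3 becomes the string "3"
original = {"point": (1, 2), 3: "three"}
round_tripped = json.loads(json.dumps(original))
print(round_tripped)
```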

Simple deserialization example

Read back the data written to the JSON file.

with open("data_file.json", "r") as read_file:
    data = json.load(read_file)
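
In practice a file may be missing or may contain invalid JSON. The following is a minimal sketch of a defensive loader, not part of the json library: load_json_or_default is a hypothetical helper, and the temporary files exist only so that the example runs anywhere.

```python
import json
import os
import tempfile


def load_json_or_default(path, default=None):
    """Return the parsed content of path, or default if it is missing or invalid."""
    try:
        with open(path, "r", encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        return default
    except json.JSONDecodeError as err:
        print(f"Invalid JSON in {path}: {err}")
        return default


# Demonstration with temporary files so the sketch does not touch real data
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.json")
    bad = os.path.join(tmp, "bad.json")
    with open(good, "w", encoding="utf-8") as f:
        json.dump({"data": []}, f)
    with open(bad, "w", encoding="utf-8") as f:
        f.write("{not valid json")

    good_result = load_json_or_default(good)
    bad_result = load_json_or_default(bad, default={})
    missing_result = load_json_or_default(os.path.join(tmp, "missing.json"), default={})
```

Whether to return a default or let the exception propagate depends on the application; silently substituting a default can hide corrupted input.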

Parse JSON from a string.

json_string = """
{
	 "data":[
	  {
	    "id": "1",
	    "name": "A同学",
	    "state": "1",
	    "createTime": "2020-01-21"
	  },
	  {
	    "id": "2",
	    "name": "B同学",
	    "state": "1",
	    "createTime": "2020-01-21"
	  },
	  {
	    "id": "3",
	    "name": "C同学",
	    "state": "0",
	    "createTime": "2020-01-21"
	  }
	]
}
"""
data = json.loads(json_string)
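
Once parsed, the result is an ordinary Python structure that can be indexed and iterated. The sketch below uses a shortened, two-record version of the sample; the variable names are chosen for illustration, and no meaning is assumed for the "state" field beyond its string value.

```python
import json

# Shortened version of the sample document (two records instead of three)
json_string = """
{"data": [
  {"id": "1", "name": "A同学", "state": "1", "createTime": "2020-01-21"},
  {"id": "3", "name": "C同学", "state": "0", "createTime": "2020-01-21"}
]}
"""
data = json.loads(json_string)

# The top-level object is a dict; its "data" entry is a list of dicts
names_with_state_1 = [row["name"] for row in data["data"] if row["state"] == "1"]

# The ids are JSON strings in this sample, so convert them explicitly if numbers are needed
ids = [int(row["id"]) for row in data["data"]]
print(names_with_state_1, ids)
```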

Applications

A typical application is parsing the JSON returned by a web endpoint when scraping data from the Internet.

# Qinhuangdao Coal Network Weibo
import datetime
import json

import requests

url = "http://news.cqcoal.com/manage/newsaction.do?method:webListPageNewsArchivesByTypeid"
post_param = {"pageNum": "1", "pageSize": "20", "jsonStr": '{"typeid":"238"}'}
response = requests.post(url, data=post_param)
return_data = response.content.decode("utf-8")

# Parse the JSON response and print the title, link, and date of each row
for row in json.loads(return_data)["rows"]:
    title = row["title"]
    link = "http://news.cqcoal.com/blank/nc.jsp?mid=" + str(row["id"])
    timestamp = int(row["pubdate"])
    # Interpret the timestamp as UTC (utcfromtimestamp is deprecated since Python 3.12)
    published = datetime.datetime.fromtimestamp(timestamp, tz=datetime.timezone.utc)
    date = published.strftime("%Y-%m-%d")
    print(title, link, date)
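
The parsing step can be tried without network access. The sketch below assumes the response has the shape used above, a "rows" list whose items carry "id", "title", and "pubdate". The payload is made up and is not real data from the site, and parse_rows is a hypothetical helper name.

```python
import datetime
import json

# Made-up payload that mimics the assumed response shape; not real site data
sample_payload = '{"rows": [{"id": 101, "title": "Sample headline", "pubdate": "1579564800"}]}'


def parse_rows(payload):
    """Turn the JSON text into a list of (title, link, date) tuples."""
    results = []
    for row in json.loads(payload)["rows"]:
        link = "http://news.cqcoal.com/blank/nc.jsp?mid=" + str(row["id"])
        published = datetime.datetime.fromtimestamp(
            int(row["pubdate"]), tz=datetime.timezone.utc
        )
        results.append((row["title"], link, published.strftime("%Y-%m-%d")))
    return results


parsed = parse_rows(sample_payload)
print(parsed)
```

Separating the parsing from the HTTP request in this way makes the logic testable with saved or synthetic responses.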


Encode and decode

Custom data.

import json

# A basic dictionary of numbers
py_object = {"c": 0, "b": 0, "a": 0}

# JSON encoding
json_string = json.dumps(py_object)
print(json_string)
print(type(json_string))

{"c": 0, "b": 0, "a": 0}
<class 'str'>

# JSON decoding
py_obj = json.loads(json_string)

print(py_obj)
print(type(py_obj))

{'c': 0, 'b': 0, 'a': 0}
<class 'dict'>

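If you encounter an error such as "TypeError: Object of type Student is not JSON serializable", the object is an instance of a class that the json module does not know how to encode, and you need to customize the encoding and decoding.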

import json

class Student:
	def __init__(self, name, roll_no, address):
		self.name = name
		self.roll_no = roll_no
		self.address = address

	def to_json(self):
		'''
		Convert an instance of this class to JSON
		'''
		return json.dumps(self, indent = 4, default=lambda o: o.__dict__)

class Address:
	def __init__(self, city, street, pin):
		self.city = city
		self.street = street
		self.pin = pin
		
address = Address("Bulandshahr", "Adarsh Nagar", "203001")
student = Student("Raju", 53, address)

# Encoding
student_json = student.to_json()
print(student_json)
print(type(student_json))

{
    "name": "Raju",
    "roll_no": 53,
    "address": {
        "city": "Bulandshahr",
        "street": "Adarsh Nagar",
        "pin": "203001"
    }
}
<class 'str'>
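
The lambda passed as default above serializes any object through its __dict__. A more selective alternative, sketched here under the assumption that only Student and Address should be serializable, is to subclass json.JSONEncoder. StudentEncoder and sample_student are hypothetical names, and the two classes are repeated so that the block runs on its own.

```python
import json


class Address:
    def __init__(self, city, street, pin):
        self.city = city
        self.street = street
        self.pin = pin


class Student:
    def __init__(self, name, roll_no, address):
        self.name = name
        self.roll_no = roll_no
        self.address = address


class StudentEncoder(json.JSONEncoder):
    """Hypothetical encoder: only Student and Address are converted."""

    def default(self, o):
        if isinstance(o, (Student, Address)):
            return o.__dict__
        # Anything else falls through to the base class, which raises TypeError
        return super().default(o)


sample_student = Student("Raju", 53, Address("Bulandshahr", "Adarsh Nagar", "203001"))
encoded = json.dumps(sample_student, cls=StudentEncoder)
print(encoded)

# Unsupported types are still rejected instead of being dumped silently
rejected = False
try:
    json.dumps({"value": object()}, cls=StudentEncoder)
except TypeError as err:
    rejected = True
    print("Still rejected:", err)
```

Restricting the encoder to known classes keeps unexpected objects from leaking their internal attributes into the output.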

# Decoding
student = json.loads(student_json)
print(student)
print(type(student))

{'name': 'Raju', 'roll_no': 53, 'address': {'city': 'Bulandshahr', 'street': 'Adarsh Nagar', 'pin': '203001'}}
<class 'dict'>
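
As the output shows, json.loads() returns a plain dict rather than a Student. One way to rebuild the objects, sketched here as an assumption and not as the only approach, is the object_hook parameter of json.loads(). The helper as_student is hypothetical and recognizes each class by its set of keys; the classes are repeated so that the block runs on its own.

```python
import json


class Address:
    def __init__(self, city, street, pin):
        self.city = city
        self.street = street
        self.pin = pin


class Student:
    def __init__(self, name, roll_no, address):
        self.name = name
        self.roll_no = roll_no
        self.address = address


def as_student(d):
    """object_hook: called for every decoded JSON object, innermost first."""
    if d.keys() >= {"city", "street", "pin"}:
        return Address(d["city"], d["street"], d["pin"])
    if d.keys() >= {"name", "roll_no", "address"}:
        # The nested address has already been converted by an earlier call
        return Student(d["name"], d["roll_no"], d["address"])
    return d


student_json = (
    '{"name": "Raju", "roll_no": 53, '
    '"address": {"city": "Bulandshahr", "street": "Adarsh Nagar", "pin": "203001"}}'
)
restored = json.loads(student_json, object_hook=as_student)
print(type(restored).__name__, restored.address.city)
```

Matching on keys is fragile if two classes share the same fields; adding an explicit type marker to the encoded JSON is a common alternative.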

Origin blog.csdn.net/qq_20288327/article/details/124115125