1. Background
I ran a simulation in Python, but it takes a relatively long time to run. Every time I wanted to look at some intermediate data, I had to run it again, so I looked for a way to save all of that data and simply load it whenever I want to inspect it.
My first thought was JSON, but I found that some things either cannot be done in JSON or are very troublesome, so I turned to pickle.
2. JSON
JSON is a lightweight data interchange format based on a subset of JavaScript syntax. It was originally used as a replacement for XML, and by now it has largely proven that it can fill that role. Here's how to use it.
2.1. Convert Python objects to JSON strings
There are two functions for this task: dumps and dump.
First, dumps:
import json

data = {'a': 1, 'b': 'name'}
str = json.dumps(data)
print(str)
This serializes the Python object into a string.
Next, dump:
import json

data = {'a': 1, 'b': 'name'}
# json.dump writes text, so open the file in text mode ('w'), not binary ('wb')
with open('data.json', 'w') as f:
    json.dump(data, f)
This serializes the Python object and writes it to a file.
The difference between the two: dumps serializes to a string, while dump serializes to a file. The extra s in dumps can be read as standing for string.
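To make the relationship between the two concrete, here is a small sketch that checks the text written by dump against the string returned by dumps. The temporary directory and the variable names are my own choices for illustration, not part of the original simulation code.

```python
import json
import os
import tempfile

data = {'a': 1, 'b': 'name'}

# dumps returns the JSON text as a str
text = json.dumps(data)

# dump writes the JSON text to an open text-mode file
# (the scratch path below is only for this illustration)
path = os.path.join(tempfile.mkdtemp(), 'data.json')
with open(path, 'w', encoding='utf-8') as f:
    json.dump(data, f)

# Read the file back as plain text and compare it with the dumps result
with open(path, encoding='utf-8') as f:
    file_text = f.read()

print(text)
print(text == file_text)
```

With the default arguments, both calls should produce the same JSON text; only the destination differs.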
2.2. Convert JSON string to Python object
Conversely, there are also two functions for the opposite direction: loads and load. As you have probably guessed, loads deserializes a string and load deserializes a file.
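import json

# loads: deserialize from a string
json_str = '{"a": 1, "b": "name"}'
data = json.loads(json_str)
print(data)

# load: deserialize from a file opened in text mode
with open('data.json', 'r') as f:
    data = json.load(f)
print(data)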
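One of the things that made JSON troublesome for me is that a round trip does not always give back exactly what went in. The following sketch, with made-up sample data, shows two common cases: tuples come back as lists, and non-string dictionary keys come back as strings.

```python
import json

# Sample data chosen only to illustrate the type changes
original = {'point': (1, 2), 1: 'int key', 'flag': True, 'nothing': None}

restored = json.loads(json.dumps(original))

print(restored)
# The tuple became a list and the integer key became the string '1'
print(restored == original)
```

Booleans and None survive the round trip because JSON has true, false, and null, but anything without a JSON counterpart is either converted or rejected.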
2.3. Custom JSON decoder and encoder
JSON encoders and decoders can be customized by inheriting from the json.JSONEncoder and json.JSONDecoder classes.
import json

# Define the Person class
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

# Custom JSON encoder
class PersonEncoder(json.JSONEncoder):
    def default(self, obj):
        # Called only for objects the base encoder cannot serialize
        if isinstance(obj, Person):
            return {'name': obj.name, 'age': obj.age}
        return super().default(obj)

# Custom JSON decoder
class PersonDecoder(json.JSONDecoder):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, object_hook=self.object_hook, **kwargs)

    def object_hook(self, dct):
        # Rebuild a Person from any dict that has both 'name' and 'age'
        if 'name' in dct and 'age' in dct:
            return Person(dct['name'], dct['age'])
        return dct

# Use the custom encoder and decoder
data = [Person('Alice', 25), Person('Bob', 30)]
json_str = json.dumps(data, cls=PersonEncoder)
data = json.loads(json_str, cls=PersonDecoder)
However, this approach is cumbersome, and it still cannot handle very complex data; it only gives you a way to process somewhat more complex objects. I won't go into more detail here (I am not very familiar with it myself).
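For simple cases, the json module also accepts plain functions through the default argument of dumps and the object_hook argument of loads, which avoids writing subclasses. Below is a minimal sketch for sets. The '__set__' tag and the function names encode_extra and decode_extra are my own conventions for this example, not something the json module defines.

```python
import json

def encode_extra(obj):
    # Called only for objects json cannot serialize by itself
    if isinstance(obj, set):
        # '__set__' is an arbitrary tag invented for this example
        return {'__set__': sorted(obj)}
    raise TypeError(f'Cannot serialize {type(obj).__name__}')

def decode_extra(dct):
    # Called for every decoded JSON object (dict)
    if '__set__' in dct:
        return set(dct['__set__'])
    return dct

data = {'tags': {'b', 'a'}, 'count': 2}
json_str = json.dumps(data, default=encode_extra)
restored = json.loads(json_str, object_hook=decode_extra)

print(json_str)
print(restored == data)
```

The same caveat applies as with the class-based version: every extra type needs its own encoding and decoding rule, which is what makes this route tedious for complex data.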
3. pickle
pickle provides the same four functions, dump, dumps, load, and loads, and the difference between them is the same as above, so we will not repeat the details. Here is the serialization approach we ended up with:
import numpy as np
import pickle

# Define a Python object that contains a NumPy array
class DataObject:
    def __init__(self):
        self.array = np.zeros(100, dtype='float64')
        self.age = 0
        self.name = 'name'

    def reset(self):
        self.age = 100
        self.name = 'ale'

data = DataObject()

# Serialize the object to a file
with open('serialized_data.pkl', 'wb') as file:
    pickle.dump(data, file)

# Deserialize the object from the file
with open('serialized_data.pkl', 'rb') as file:
    data2 = pickle.load(file)

# Methods still work on the restored object
data2.reset()
print(data2.array)
print(data2.age)
print(data2.name)
Running the program above shows that both serialization and deserialization work. So what are the advantages and disadvantages of each approach?
4. Pros and cons
pickle advantages:
- Supports serialization of almost all Python objects, including instances of custom classes and reference relationships between multiple objects. Functions and classes are pickled by reference (their importable name), not by value.
- Saves the state of the object together with a reference to its class, so the original structure of the object, methods included, can be restored after deserialization as long as the class definition is importable at load time.
- The serialization and deserialization process is relatively simple: just call the dump() and load() functions.
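The point about reference relationships can be checked with a short sketch. The sample data below is made up: one list is stored under two keys, and we look at whether the two entries are still the same object after a round trip through each format.

```python
import json
import pickle

# One list object referenced from two places
shared = [1, 2, 3]
data = {'first': shared, 'second': shared}

via_pickle = pickle.loads(pickle.dumps(data))
via_json = json.loads(json.dumps(data))

# pickle remembers that both keys pointed at the same list
print(via_pickle['first'] is via_pickle['second'])
# JSON writes the list out twice, so two independent lists come back
print(via_json['first'] is via_json['second'])
```

This matters when later code mutates the shared object: after a pickle round trip the change is visible through both keys, after a JSON round trip it is not.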
pickle disadvantages:
- pickle is a Python-specific format, so it cannot be used across languages. Other programming languages generally cannot parse the data that pickle generates.
- The serialized data that pickle generates can be relatively large and take up more storage space, depending on the data and the protocol used.
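How the sizes compare depends a lot on what is being stored, so it is worth measuring with your own data rather than taking the claim on faith. The sketch below is one way to do that; sizes is a hypothetical helper of mine and the two sample objects are arbitrary.

```python
import json
import pickle

def sizes(obj):
    """Return (pickle_bytes, json_bytes) for obj. Hypothetical helper."""
    pickled = pickle.dumps(obj)
    as_json = json.dumps(obj).encode('utf-8')
    return len(pickled), len(as_json)

# Arbitrary sample objects: a tiny dict and a list of floats
small = {'a': 1, 'b': 'name'}
numbers = [i / 3 for i in range(1000)]

for label, obj in [('small dict', small), ('1000 floats', numbers)]:
    pickle_size, json_size = sizes(obj)
    print(f'{label}: pickle={pickle_size} bytes, json={json_size} bytes')
```

The exact numbers vary with the Python version and the pickle protocol, so treat the output as a measurement for your setup rather than a general rule.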
json advantages:
- It is a common data exchange format, and almost all programming languages support parsing and generating data in JSON format.
- The generated JSON data is relatively small, and the storage space consumption is relatively low.
- Human readable, easy to understand and debug.
json disadvantages:
- Only simple data types can be serialized, such as dictionaries, lists, strings, numbers, booleans, and None; custom objects and functions are not supported out of the box.
- Only the values of the data are saved; the class information and methods of the object are lost.
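Both JSON limitations show up in a few lines. In this sketch, Sample is a made-up class: passing an instance straight to json.dumps raises a TypeError, and the usual workaround of dumping its attribute dictionary gives back a plain dict, not a Sample.

```python
import json

class Sample:
    """Made-up class used only for this illustration."""
    def __init__(self, value):
        self.value = value

obj = Sample(42)

# Passing a custom object directly to json.dumps raises TypeError
error = None
try:
    json.dumps(obj)
except TypeError as exc:
    error = str(exc)
print('json.dumps failed:', error)

# Workaround: dump only the attribute values
restored = json.loads(json.dumps(vars(obj)))

# The values survive, but the result is a plain dict, not a Sample
print(restored, type(restored).__name__)
```

Getting a Sample back requires rebuilding it by hand, for example with Sample(**restored), which is the kind of extra work pickle does automatically.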