JSON and pickle for Python data serialization

1. Background description

I ran a simulation in Python, but it takes a relatively long time. Every time I wanted to look at some intermediate data, I had to run it again, so I looked for a way to save all the data and simply load it whenever I want to inspect it.

The first thing I thought of was JSON, but I found that some things either cannot be done with JSON or are very cumbersome, so I turned to pickle.

2. JSON

JSON is a lightweight data interchange format whose syntax is derived from JavaScript object literals. It was originally adopted as an alternative to XML, and it has largely proved that it can fill that role. Here's how to use it.

2.1. Convert Python objects to JSON strings

There are two functions for this task: dumps and dump.

First, dumps:

import json

data = {'a': 1, 'b': 'name'}
# Use a name other than str, which would shadow the built-in type
json_str = json.dumps(data)
print(json_str)

It serializes a Python object into a string.

Next, dump:

import json

data = {'a': 1, 'b': 'name'}
# json.dump writes text, so the file must be opened in text mode ('w'), not 'wb'
with open('data.json', 'w') as f:
    json.dump(data, f)

It serializes a Python object and writes the result to a file.

Difference: dumps serializes to a string, while dump serializes to a file. The extra s in dumps can be read as "string".
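As a small sketch of that difference, the following serializes the same dictionary with both functions. The in-memory io.StringIO buffer stands in for a real file and is only an illustration choice, not part of the original example.

```python
import io
import json

data = {'a': 1, 'b': 'name'}

# dumps returns the JSON text as a str
text = json.dumps(data)

# dump writes the same JSON text to a file-like object opened in text mode
buffer = io.StringIO()
json.dump(data, buffer)

print(text)
print(buffer.getvalue() == text)
```

With default arguments, both calls produce the same JSON text; only the destination differs.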

2.2. Convert JSON string to Python object

In the opposite direction there are also two functions, loads and load. As you may have guessed, loads deserializes a string and load deserializes a file.

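Deserializing from a string:

import json

json_str = '{"a": 1, "b": "name"}'  # for example, the output of json.dumps
data = json.loads(json_str)
print(data)

Deserializing from a file:

import json

with open('data.json', 'r') as f:
    data = json.load(f)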
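To check that the two directions really are inverses, here is a minimal round-trip sketch. The temporary directory and the variable names are assumptions made for illustration, so the example does not leave files behind.

```python
import json
import os
import tempfile

original = {'a': 1, 'b': 'name'}

# String round trip: dumps -> loads
restored_from_str = json.loads(json.dumps(original))

# File round trip: dump -> load, inside a temporary directory
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'data.json')
    with open(path, 'w', encoding='utf-8') as f:
        json.dump(original, f)
    with open(path, 'r', encoding='utf-8') as f:
        restored_from_file = json.load(f)

print(restored_from_str == original)
print(restored_from_file == original)
```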

2.3. Custom JSON encoder and decoder

JSON encoders and decoders can be customized by inheriting from the json.JSONEncoder and json.JSONDecoder classes.

import json


# Define the Person class
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age


# Custom JSON encoder
class PersonEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, Person):
            return {'name': obj.name, 'age': obj.age}
        return super().default(obj)


# Custom JSON decoder
class PersonDecoder(json.JSONDecoder):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, object_hook=self.object_hook, **kwargs)

    def object_hook(self, dct):
        # Any dict that has both keys is turned back into a Person
        if 'name' in dct and 'age' in dct:
            return Person(dct['name'], dct['age'])
        return dct


# Use the custom encoder and decoder
data = [Person('Alice', 25), Person('Bob', 30)]
json_str = json.dumps(data, cls=PersonEncoder)
data = json.loads(json_str, cls=PersonDecoder)

However, defining encoders and decoders this way is cumbersome, and it still cannot handle very complex data; it only offers a way to process somewhat more complex data. I won't go into more detail here (I don't know it well enough myself).
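To make the limitation concrete, the sketch below shows json.dumps rejecting a custom object, then a lighter alternative to subclassing: the default parameter of json.dumps. The Point class and the encode_point function are hypothetical names used only for illustration.

```python
import json


class Point:  # hypothetical example class
    def __init__(self, x, y):
        self.x = x
        self.y = y


point = Point(1, 2)

# Without help, json does not know how to encode a custom object
try:
    json.dumps(point)
except TypeError as exc:
    error_message = str(exc)
    print('TypeError:', error_message)


def encode_point(obj):
    """Fallback called by json.dumps for objects it cannot serialize."""
    if isinstance(obj, Point):
        return {'x': obj.x, 'y': obj.y}
    raise TypeError(f'Cannot serialize {type(obj).__name__}')


json_str = json.dumps(point, default=encode_point)
print(json_str)
```

Either way, every custom type needs its own conversion code, which is where the effort goes once the data becomes complex.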

3. pickle

pickle provides the same four functions, dump, dumps, load, and loads, with the same distinction between them, so we won't go into details. Here is the serialization approach we use:

import numpy as np
import pickle


# A Python object that contains a NumPy array
class DataObject:
    def __init__(self):
        self.array = np.zeros(100, dtype='float64')
        self.age = 0
        self.name = 'name'

    def reset(self):
        self.age = 100
        self.name = 'ale'


data = DataObject()

# Serialize the object to a file (pickle requires binary mode)
with open('serialized_data.pkl', 'wb') as file:
    pickle.dump(data, file)

# Deserialize the object from the file
with open('serialized_data.pkl', 'rb') as file:
    data2 = pickle.load(file)

# Methods still work on the restored object
data2.reset()

print(data2.array)
print(data2.age)
print(data2.name)

Running the program above shows that the object can be serialized and deserialized. The next section compares the advantages and disadvantages of pickle and JSON.
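The program above goes through a file with dump and load. For completeness, here is a minimal sketch of the in-memory pair, dumps and loads. The sample dictionary is an assumption made for illustration and uses only the standard library.

```python
import pickle

data = {'a': 1, 'b': 'name', 'c': (1, 2, 3)}

# dumps returns the serialized object as bytes, not str
payload = pickle.dumps(data)
print(type(payload))

# loads rebuilds an equal object from those bytes
restored = pickle.loads(payload)
print(restored == data)
print(type(restored['c']))  # the tuple is still a tuple
```

Because the payload is bytes, files used with pickle.dump and pickle.load must be opened in binary mode ('wb' and 'rb').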

4. Pros and cons

Advantages of pickle:

  1. It supports serialization of almost all Python objects, including instances of custom classes and reference relationships between multiple objects. Functions and classes are pickled by reference to their importable name rather than by value.
  2. It saves the object's state together with a reference to its class, so after deserialization the original structure is restored and its methods are available, provided the class definition can be imported.
  3. Serialization and deserialization are relatively simple: just call the dump() and load() functions.
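The point about reference relationships can be sketched as follows: two dictionary entries point at the same list, and only pickle keeps them as one object after a round trip. The variable names are illustrative.

```python
import json
import pickle

shared = [1, 2, 3]
data = {'first': shared, 'second': shared}

via_pickle = pickle.loads(pickle.dumps(data))
via_json = json.loads(json.dumps(data))

# pickle remembers that both keys refer to the same list
print(via_pickle['first'] is via_pickle['second'])  # True

# JSON writes the list out twice, so two separate lists come back
print(via_json['first'] is via_json['second'])      # False
```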

Disadvantages of pickle:

  1. Since pickle is a Python-specific format, it is not suited to cross-language use. Other programming languages may not be able to parse data generated by pickle correctly.
  2. The serialized data generated by pickle is often relatively large and can take up more storage space.

Advantages of json:

  1. JSON is a common data exchange format, and almost all programming languages support parsing and generating it.
  2. The generated JSON data is relatively small, so storage space consumption is relatively low.
  3. It is human readable, which makes it easy to understand and debug.

Disadvantages of json:

  1. By default, only simple data types can be serialized, such as dictionaries, lists, strings, and numbers. Custom complex objects and functions are not supported without extra conversion code.
  2. Only the values of the data are saved; the state, method, and class information of the object is lost.
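The loss of type information shows up even with built-in types. In this small sketch (the sample data is an assumption for illustration), an integer key and a tuple do not survive a JSON round trip unchanged:

```python
import json

data = {1: 'one', 'point': (1, 2)}

restored = json.loads(json.dumps(data))

# The int key became a string and the tuple became a list
print(restored)  # {'1': 'one', 'point': [1, 2]}
print(restored == data)  # False
```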

Origin blog.csdn.net/weixin_43903639/article/details/131622256