What is the difference between serialization modules pickle and json

Table of contents

What is the serialization module pickle

What is the serialization module json

What is the difference between pickle and json

Summary


What is the serialization module pickle

pickle is a module in Python's standard library for serializing Python objects into byte streams and deserializing them back. It can convert complex data structures (such as lists, dictionaries, and class instances) into a byte stream, and reconstruct the original objects from that byte stream. The pickle module uses a Python-specific binary format, which makes it convenient for persistent storage or for transferring objects between Python programs.

The main functions of the pickle module are as follows:

- `pickle.dumps(obj)`: Serialize the Python object `obj` into a byte stream and return the byte stream result.
- `pickle.loads(data)`: Deserialize the bytes object `data` back into the original object and return it.
- `pickle.dump(obj, file)`: Serialize the Python object `obj` into a byte stream and write the result to the file object `file`.
- `pickle.load(file)`: Reads a stream of bytes from a file object `file` and deserializes it to the original object.

Here is a simple pickle example:

import pickle

data = {'name': 'Alice', 'age': 30, 'city': 'New York'}

# Serialize the dictionary into a byte stream with pickle.dumps
serialized_data = pickle.dumps(data)

# Deserialize the byte stream back into the original object with pickle.loads
deserialized_data = pickle.loads(serialized_data)

print(deserialized_data)
# Output: {'name': 'Alice', 'age': 30, 'city': 'New York'}

# Serialize the dictionary and write it to a file with pickle.dump
# (the file must be opened in binary mode)
with open('data.pickle', 'wb') as file:
    pickle.dump(data, file)

# Read the byte stream from the file and deserialize it with pickle.load
with open('data.pickle', 'rb') as file:
    loaded_data = pickle.load(file)

print(loaded_data)
# Output: {'name': 'Alice', 'age': 30, 'city': 'New York'}

Note that the pickle module is not secure when processing untrusted data: a maliciously crafted pickle can execute arbitrary code during deserialization. Only unpickle data that comes from a source you trust, and prefer a safer format such as json for data that crosses a trust boundary.
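One way to reduce (not eliminate) this risk is to restrict which globals the unpickler may resolve by overriding `pickle.Unpickler.find_class`. The sketch below is a minimal illustration of that idea; the names `SAFE_BUILTINS`, `RestrictedUnpickler`, and `restricted_loads`, as well as the contents of the allow-list, are assumptions made for this example and not part of the pickle API.

```python
import builtins
import io
import os
import pickle

# Hypothetical allow-list for this sketch: the only globals that may be resolved.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the pickle stream asks for a global (class or function).
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        # Refuse everything else, e.g. os.system.
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data):
    """Like pickle.loads, but using the restricted unpickler above."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


# Plain data structures still load normally.
safe_result = restricted_loads(pickle.dumps({"name": "Alice", "ids": {1, 2, 3}}))
print(safe_result)

# A pickle that references a function outside the allow-list is rejected.
blocked = False
try:
    restricted_loads(pickle.dumps(os.system))
except pickle.UnpicklingError as exc:
    blocked = True
    print("rejected:", exc)
```

Even with such a filter, treating pickled data from unknown sources as unsafe remains the more cautious default.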

What is the serialization module json

JSON (JavaScript Object Notation) is a lightweight data exchange format that is often used to transfer and store data between different applications. JSON is a human-readable text format that is easy to parse and generate, so it is widely used in network communication and data storage.

In Python, the json module is part of the standard library. It provides functions for encoding and decoding JSON data, making it easy to convert between Python objects and JSON strings.

The main functions and methods in the json module are as follows:

- `json.dumps(obj, indent=None)`
  encodes the Python object `obj` into a JSON-formatted string and returns the result. If the `indent` parameter is specified, it defines the level of indentation, making the resulting JSON string more readable.

- `json.loads(json_str)`
  decodes the JSON-formatted string `json_str` into a Python object and returns the result.

- `json.dump(obj, file, indent=None)`
  encodes the Python object `obj` into a JSON-formatted string, and writes the result to the file object `file`. If the `indent` parameter is specified, it will define the level of indentation.

- `json.load(file)`
  reads a JSON-formatted string from the file object `file` and decodes it into a Python object.

Here is a simple example showing how to encode and decode using the json module:

import json

data = {'name': 'Alice', 'age': 30, 'city': 'New York'}

# Encode the Python object as a JSON string
json_str = json.dumps(data)
print(json_str)
# Output: {"name": "Alice", "age": 30, "city": "New York"}

# Decode the JSON string back into a Python object
decoded_data = json.loads(json_str)
print(decoded_data)
# Output: {'name': 'Alice', 'age': 30, 'city': 'New York'}

# Encode the Python object as JSON and write it to a file (text mode)
with open('data.json', 'w') as file:
    json.dump(data, file)

# Read the JSON text from the file and decode it into a Python object
with open('data.json', 'r') as file:
    loaded_data = json.load(file)

print(loaded_data)
# Output: {'name': 'Alice', 'age': 30, 'city': 'New York'}
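To make the effect of the `indent` parameter described earlier more concrete, here is a small sketch that encodes the same dictionary in compact and indented form. It also uses `sort_keys`, another `json.dumps` parameter; the variable names are only illustrative.

```python
import json

data = {'name': 'Alice', 'age': 30, 'city': 'New York'}

# Default output: everything on a single line
compact = json.dumps(data)

# indent=2 puts each key on its own line, indented by two spaces
pretty = json.dumps(data, indent=2)
print(pretty)
# Output:
# {
#   "name": "Alice",
#   "age": 30,
#   "city": "New York"
# }

# sort_keys=True additionally orders the keys alphabetically
sorted_pretty = json.dumps(data, indent=2, sort_keys=True)
print(sorted_pretty)
```

Both strings decode to the same Python object; indentation only changes how the text looks, at the cost of a slightly larger output.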

Note that JSON only supports a few basic data types: strings, numbers, booleans, lists, dictionaries, and None. Other Python objects, such as functions, class instances, sets, and dates, cannot be converted to JSON directly, and `json.dumps()` raises a `TypeError` for them. To handle such types, you can pass a custom encoding function through the `default` parameter of `json.dumps()` or `json.dump()`, and a custom decoding function through the `object_hook` parameter of `json.loads()` or `json.load()`. The json module also provides further options, such as formatted output and sorted keys, which can be configured according to specific needs.
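As a minimal sketch of custom encoding and decoding, the example below round-trips a `datetime.date` value through JSON using the `default` and `object_hook` parameters. The helper names `encode_extra` and `decode_extra` and the `"__date__"` marker key are assumptions chosen for this illustration, not a convention defined by the json module.

```python
import json
from datetime import date


def encode_extra(obj):
    """Fallback for json.dumps: called only for objects json cannot encode itself."""
    if isinstance(obj, date):
        # Wrap the ISO string in a marker dict so the decoder can recognize it.
        return {"__date__": obj.isoformat()}
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


def decode_extra(dct):
    """Hook for json.loads: called for every decoded JSON object (dict)."""
    if "__date__" in dct:
        return date.fromisoformat(dct["__date__"])
    return dct


record = {"name": "Alice", "joined": date(2023, 7, 27)}

json_str = json.dumps(record, default=encode_extra)
print(json_str)
# Output: {"name": "Alice", "joined": {"__date__": "2023-07-27"}}

restored = json.loads(json_str, object_hook=decode_extra)
print(restored == record)
# Output: True
```

The marker-key approach keeps the output valid JSON that other languages can still read, although they would need to know about the same convention to rebuild a date.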

What is the difference between pickle and json

pickle and json are two different serialization modules that differ in both how they work and where they are best applied.

1. Data format: pickle uses a Python-specific binary format, while json uses a standard text-based format. The data produced by pickle is a binary stream that is not human-readable and is suited to use within Python or to data exchange between Python systems. The data produced by json is readable text, which is suited to cross-platform and cross-language data exchange.

2. Compatibility: Since pickle is a Python-specific format, its serialized data can generally only be deserialized in a Python environment that supports the pickle protocol used. json is a general data exchange format supported by almost all programming languages, so data serialized as JSON can be exchanged between different languages.

3. Security: The pickle module is unsafe for untrusted data, because a malicious pickle can execute arbitrary code when it is loaded. In contrast, json is relatively safe: decoding JSON only builds basic data types and does not execute code.

4. Application scenarios: pickle is suitable for persistent storage of objects, inter-process communication, and data transfer within the Python environment, and it handles complex Python data structures with little effort. json is suitable for cross-platform data exchange and storage, especially when interacting with applications written in other languages.
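The differences in data format and supported types listed above can be seen in a short side-by-side sketch. The sample dictionary is made up for illustration; the behavior shown (bytes versus text, tuples becoming lists in JSON, sets being rejected by json) is standard for both modules.

```python
import json
import pickle

data = {"point": (1, 2), "tags": ["a", "b"]}

# pickle produces bytes; json produces a text string
pickled = pickle.dumps(data)
json_text = json.dumps(data)
print(type(pickled).__name__, type(json_text).__name__)
# Output: bytes str

# pickle restores the tuple exactly; JSON has no tuple type, so it becomes a list
from_pickle = pickle.loads(pickled)
from_json = json.loads(json_text)
print(from_pickle["point"], from_json["point"])
# Output: (1, 2) [1, 2]

# A set survives a pickle round trip but is rejected by json without a custom encoder
ids = {"ids": {1, 2, 3}}
pickle_ok = pickle.loads(pickle.dumps(ids)) == ids
try:
    json.dumps(ids)
    json_rejects_set = False
except TypeError:
    json_rejects_set = True
print(pickle_ok, json_rejects_set)
# Output: True True
```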

Summary

Both pickle and json are commonly used serialization modules, but they differ in data format, compatibility, security, and application scenarios. If you are working with data inside the Python environment, or packaging and transferring complex Python objects between trusted Python programs, you can choose pickle. If you need cross-language, cross-platform data exchange and storage, json is the recommended choice. In practice, select the serialization module that fits your specific needs and environment.

Origin blog.csdn.net/weixin_43856625/article/details/131953858