[Python] Understanding and using the pickle module

pickle is a module in the Python standard library for serializing and deserializing Python objects. It converts an object into a sequence of bytes (not a text string) that can be saved to a file or sent over the network.

This is useful for caching, storing configuration, and persistence. pickle can handle most Python objects, including basic data types, dictionaries, lists, tuples, sets, and user-defined classes and their instances.

Using pickle, you can serialize a Python object into a stream of bytes:

import pickle

data = [1, 2, 3, 4, 5]
# Serialize the object
pickled_data = pickle.dumps(data)
print(pickled_data)

The output (here with pickle protocol 4, as the leading \x80\x04 bytes indicate) is:

b'\x80\x04\x95\x0f\x00\x00\x00\x00\x00\x00\x00]\x94(K\x01K\x02K\x03K\x04K\x05e.'

To deserialize the bytes back into an object:

unpickled_data = pickle.loads(pickled_data)
print(unpickled_data)

The output is:

[1, 2, 3, 4, 5]

Note: pickle.dumps() returns a bytes object, which is why the printed output carries the "b" prefix. Because the data is binary, any file that stores it must be opened in binary mode.

pickle offers more than dumps() and loads(). dump() and load() serialize to and deserialize from file objects, the protocol parameter selects the pickle protocol version, and pickle.HIGHEST_PROTOCOL names the highest protocol version available. Note that pickle is not secure: unpickling data can execute arbitrary Python code. Only unpickle data that comes from a source you trust.
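To make the security warning concrete, here is a minimal sketch. The Payload class, the SAFE_BUILTINS set, and the restricted_loads() helper are names chosen for this illustration; the approach of overriding Unpickler.find_class() follows the pattern described in the official pickle documentation. It reduces the risk but is not a full sandbox.

```python
import builtins
import io
import pickle

# Hypothetical allow-list for this sketch: the only globals we permit.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the pickle stream asks for a global (class or function).
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data):
    """Like pickle.loads(), but only allows the globals listed above."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


class Payload:
    # __reduce__ tells pickle how to rebuild the object: call eval("6 * 7").
    # A real attacker would use something far less harmless.
    def __reduce__(self):
        return (eval, ("6 * 7",))


malicious = pickle.dumps(Payload())

# Plain loads() runs the embedded call during unpickling.
result = pickle.loads(malicious)
print(result)

# The restricted unpickler refuses to look up builtins.eval.
try:
    restricted_loads(malicious)
    blocked = None
except pickle.UnpicklingError as exc:
    blocked = str(exc)
print(blocked)

# Harmless data still loads.
safe = restricted_loads(pickle.dumps([1, 2, range(3)]))
print(safe)
```

Even with an allow-list like this, treating untrusted pickles as unsafe remains the more prudent default.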

To recap the terminology: converting an object into a byte stream is serialization, and converting a byte stream back into an object is deserialization. pickle does both for primitive data types, complex data types, and instances of user-defined classes, which makes it a common choice for caching, configuration, and persistence.

For example, suppose we have a Python dictionary that we want to persist to a file or send over the network. We can use pickle to do this:

import pickle

# Define a dictionary
person = {'name': 'Alice', 'age': 28, 'gender': 'Female'}

# Serialize the dictionary into a byte stream
bytes_person = pickle.dumps(person)

# Deserialize the byte stream back into an object
new_person = pickle.loads(bytes_person)

print(person)       # the original dictionary
print(new_person)   # the dictionary rebuilt from bytes

The output is:

{'name': 'Alice', 'age': 28, 'gender': 'Female'}
{'name': 'Alice', 'age': 28, 'gender': 'Female'}
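The same round trip works for instances of user-defined classes. The sketch below uses a hypothetical Person dataclass (not from the original example) and also shows one of the object types pickle cannot handle, a generator, to illustrate why the text says "most" rather than "all" objects.

```python
import pickle
from dataclasses import dataclass


@dataclass
class Person:
    # Hypothetical class for illustration only.
    name: str
    age: int


alice = Person("Alice", 28)

# Round trip: object -> bytes -> new object
restored = pickle.loads(pickle.dumps(alice))

print(restored == alice)   # compares field values
print(restored is alice)   # the restored instance is a separate object

# Not everything can be pickled; a generator is one example.
try:
    pickle.dumps(n for n in range(3))
    error = None
except TypeError as exc:
    error = exc
print(error)
```

pickle stores a reference to the class (its module and name), not the class definition itself, so the class must be importable in the program that loads the data.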

Let's take a more practical example. Suppose we have a machine learning model that we want to save and reload later in order to make predictions. We can use pickle to serialize and deserialize the model. The code below keeps the serialized bytes in memory; writing them to a file works the same way with dump() and load().

import pickle
import numpy as np
from sklearn.linear_model import LogisticRegression

# Generate some random data
X = np.random.rand(100, 5)
y = np.random.randint(0, 2, (100,))

# Instantiate a logistic regression model
clf = LogisticRegression()

# Fit the model
clf.fit(X, y)

# Serialize the model into a byte stream
bytes_model = pickle.dumps(clf)

# Deserialize the byte stream back into a model object
new_clf = pickle.loads(bytes_model)

# Predict on new data
new_X = np.random.rand(10, 5)
new_y_pred = new_clf.predict(new_X)

print(new_y_pred)

Because the data is random, the output differs between runs. One possible output is:

[1 1 0 1 0 1 1 0 1 1]
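The example above keeps the pickled model in memory. To actually write it to a file and read it back, dump() and load() can be used as in the following minimal sketch. A plain dictionary stands in for the trained model so the snippet runs without scikit-learn; the file name model.pkl and the dictionary contents are assumptions made for illustration.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for a trained model object.
model = {"coef": [0.5, -1.2, 3.0], "intercept": 0.1}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "model.pkl")

    # Pickle data is binary, so the file is opened in "wb" mode.
    with open(path, "wb") as f:
        pickle.dump(model, f, protocol=pickle.HIGHEST_PROTOCOL)

    # Read it back in "rb" mode; load() detects the protocol on its own.
    with open(path, "rb") as f:
        loaded = pickle.load(f)

print(loaded == model)
```

HIGHEST_PROTOCOL gives the most recent format the running interpreter supports; if the file must be read by an older Python version, passing an explicit lower protocol number is the safer choice.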

Origin blog.csdn.net/wzk4869/article/details/130648728