About
The author is a programmer and a thoughtful, lifelong-learning practitioner who currently leads a team at a start-up. The team's main technology stack is Android, Python, Java, and Go.
GitHub: https://github.com/hylinux1024
WeChat official account: Lifetime developer (angrycode)
0x00 marshal
marshal reads and writes Python objects in a binary format that is specific to Python but independent of machine architecture. The format also depends on the Python version: data serialized by marshal is not guaranteed to be compatible between different Python versions.
marshal is generally used to serialize Python's internal objects.
These generally include:
- Basic types
booleans, integers, floating point numbers, complex numbers
- Sequence and collection types
strings, bytes, bytearray, tuple, list, set, frozenset, dictionary
- Code objects
code object
- Other types
None, Ellipsis, StopIteration
The main role of marshal is to support reading and writing the .pyc files produced when Python "compiles" modules. This is also why the marshal format is not compatible across Python versions. If you need general-purpose serialization and deserialization, use the pickle module instead.
Common methods
marshal.dump(value, file[, version])
Serializes an object to a file.
marshal.dumps(value[, version])
Serializes an object and returns a bytes object.
marshal.load(file)
Deserializes an object from a file.
marshal.loads(bytes)
Deserializes an object from bytes binary data.
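As a minimal sketch of the methods listed above, the following round-trips a value through marshal.dumps and marshal.loads, and does the same with a code object, the kind of payload stored in .pyc files. The sample data, the "<example>" filename, and the variable names are chosen purely for illustration.

```python
import marshal

# A value built only from types that marshal supports.
data = {"numbers": [1, 2.0, 3 + 4j], "flags": (True, None), "raw": b"bytes"}

# dumps() returns a bytes object; loads() rebuilds an equal value from it.
blob = marshal.dumps(data)
restored = marshal.loads(blob)

# Code objects can be marshalled as well.
code = compile("result = 6 * 7", "<example>", "exec")
namespace = {}
exec(marshal.loads(marshal.dumps(code)), namespace)

print(type(blob).__name__)   # bytes
print(restored == data)      # True
print(namespace["result"])   # 42
print(marshal.version)       # format version used by this interpreter
```

Because the format is tied to the interpreter, bytes produced this way are best treated as short-lived and read back by the same Python version.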
0x01 pickle
The pickle module also reads and writes Python objects in a binary format. Whereas marshal provides only basic serialization, pickle supports a much wider range of objects.
Data serialized by pickle is specific to Python: other languages, such as Java, cannot read the binary data that Python produces with pickle. If serialized data must be shared with other languages, use json, which is described below.
Data types that pickle can serialize:
- None, True, and False
- integers, floating point numbers, complex numbers
- strings, bytes, bytearrays
- tuples, lists, sets, and dictionaries containing only picklable objects
- Functions defined at the top level of a module (using def, not lambda)
- Built-in functions defined at the top level of a module
- Classes defined at the top level of a module
- Instances of classes whose __dict__, or the result of calling __getstate__(), is picklable
If pickle encounters an object it cannot serialize, it raises PicklingError.
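To illustrate the rules above, here is a small sketch that tries to pickle a lambda and then a built-in function. The name square is hypothetical. The except clause also lists AttributeError as a precaution, on the assumption that the exact exception type may differ between Python versions.

```python
import pickle

square = lambda x: x * x  # a lambda has no importable name

try:
    pickle.dumps(square)
    failed = False
except (pickle.PicklingError, AttributeError) as exc:
    failed = True
    print(type(exc).__name__, exc)

# Built-in functions are pickled by reference (module plus name), so this works.
restored_len = pickle.loads(pickle.dumps(len))
print(restored_len is len)  # True
```

Functions are stored by reference rather than by value, which is why only objects reachable by name at the top level of a module can be pickled.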
Common methods
pickle.dump(obj, file, protocol=None, *, fix_imports=True)
Serializes obj to the file object file. This is equivalent to Pickler(file, protocol).dump(obj).
pickle.dumps(obj, protocol=None, *, fix_imports=True)
Serializes obj into bytes binary data.
pickle.load(file, *, fix_imports=True, encoding="ASCII", errors="strict")
Deserializes an object from the file object file. This is equivalent to Unpickler(file).load().
pickle.loads(bytes_object, *, fix_imports=True, encoding="ASCII", errors="strict")
Deserializes an object from the binary data bytes_object.
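The equivalence between the module-level functions and the Pickler and Unpickler classes can be sketched with in-memory files. This is an illustration only; the io.BytesIO buffers and the variable names are assumptions, not part of the original examples.

```python
import io
import pickle

data = {"a": [1, 2.0, 4 + 6j], "b": ("text", b"bytes")}

# Module-level helper writing to an in-memory binary file.
buf_func = io.BytesIO()
pickle.dump(data, buf_func, protocol=pickle.HIGHEST_PROTOCOL)

# The same operation spelled with an explicit Pickler object.
buf_class = io.BytesIO()
pickle.Pickler(buf_class, protocol=pickle.HIGHEST_PROTOCOL).dump(data)

# Rewind and read back with Unpickler; the protocol is detected automatically.
buf_class.seek(0)
restored = pickle.Unpickler(buf_class).load()

print(restored == data)
print(pickle.loads(buf_func.getvalue()) == data)
```

The class-based form is mainly useful when several objects are written to, or read from, the same stream.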
Serialization example
import pickle
# Define a dictionary that contains picklable objects
data = {
    'a': [1, 2.0, 3, 4 + 6j],
    'b': ("character string", b"byte string"),
    'c': {None, True, False}
}
with open('data.pickle', 'wb') as f:
    # Serialize the object to the file data.pickle,
    # using the protocol version pickle.HIGHEST_PROTOCOL
    pickle.dump(data, f, pickle.HIGHEST_PROTOCOL)
After running the script, a data.pickle file appears in the folder:
serialization
├── data.pickle
├── pickles.py
└── unpickles.py
Deserialization example
import pickle
with open('data.pickle', 'rb') as f:
    # Deserialize the object from the file data.pickle.
    # pickle detects the protocol version automatically,
    # so no version needs to be specified here
    data = pickle.load(f)
    print(data)
# Output
# {'a': [1, 2.0, 3, (4+6j)], 'b': ('character string', b'byte string'), 'c': {False, True, None}}
0x02 json
json is a language-independent and very common data interchange format. In Python, the json module has an API similar to those of marshal and pickle.
Common methods
json.dump(obj, fp, *, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, cls=None, indent=None, separators=None, default=None, sort_keys=False, **kw)
Serializes obj to the file object fp.
json.dumps(obj, *, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, cls=None, indent=None, separators=None, default=None, sort_keys=False, **kw)
Serializes obj to a JSON-formatted string.
json.load(fp, *, cls=None, object_hook=None, parse_float=None, parse_int=None, parse_constant=None, object_pairs_hook=None, **kw)
Deserializes a JSON document from the file object fp into an object.
json.loads(s, *, encoding=None, cls=None, object_hook=None, parse_float=None, parse_int=None, parse_constant=None, object_pairs_hook=None, **kw)
Deserializes a JSON document held in the string s into an object.
Conversion table between Python and JSON
Python | JSON |
---|---|
dict | object |
list, tuple | array |
str | string |
int, float, int- & float-derived Enums | number |
True | true |
False | false |
None | null |
For basic types, sequences, and containers made up of basic types, json handles serialization well.
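The conversion table implies that a round trip through JSON does not always give back the original Python types. A small sketch, with illustrative sample data, shows a tuple returning as a list and a non-string dictionary key returning as a string:

```python
import json

original = {"point": (1, 2), "ok": True, "missing": None, 3: "three"}

text = json.dumps(original)
restored = json.loads(text)

print(text)      # {"point": [1, 2], "ok": true, "missing": null, "3": "three"}
print(restored)  # {'point': [1, 2], 'ok': True, 'missing': None, '3': 'three'}
print(restored == original)  # False: the tuple became a list, key 3 became "3"
```

This matters when the deserialized data is compared with, or used in place of, the original object.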
Serialization examples
>>> import json
>>> json.dumps(['foo', {'bar': ('baz', None, 1.0, 2)}])
'["foo", {"bar": ["baz", null, 1.0, 2]}]'
>>> print(json.dumps("\"foo\bar"))
"\"foo\bar"
>>> print(json.dumps('\u1234'))
"\u1234"
>>> print(json.dumps('\\'))
"\\"
>>> print(json.dumps({"c": 0, "b": 0, "a": 0}, sort_keys=True))
{"a": 0, "b": 0, "c": 0}
>>> from io import StringIO
>>> io = StringIO()
>>> json.dump(['streaming API'], io)
>>> io.getvalue()
'["streaming API"]'
Deserialization examples
>>> import json
>>> json.loads('["foo", {"bar":["baz", null, 1.0, 2]}]')
['foo', {'bar': ['baz', None, 1.0, 2]}]
>>> json.loads('"\\"foo\\bar"')
'"foo\x08ar'
>>> from io import StringIO
>>> io = StringIO('["streaming API"]')
>>> json.load(io)
['streaming API']
For objects, the situation is more complicated.
For example, here is a JSON document, complex_data.json, that describes a complex number:
{
    "__complex__": true,
    "real": 42,
    "imaginary": 36
}
To deserialize this JSON document into a Python object, we need to define a conversion function.
# coding=utf-8
import json
# Conversion function: turns the decoded JSON content into a complex object
def decode_complex(dct):
    if "__complex__" in dct:
        return complex(dct["real"], dct["imaginary"])
    else:
        return dct
if __name__ == '__main__':
    with open("complex_data.json") as complex_data:
        # object_hook specifies the conversion function
        z = json.load(complex_data, object_hook=decode_complex)
        print(type(z))
        print(z)
# Output
# <class 'complex'>
# (42+36j)
If object_hook is not specified, a JSON object is converted to a dict by default.
# coding=utf-8
import json
if __name__ == '__main__':
    with open("complex_data.json") as complex_data:
        # No object_hook is specified here
        z2 = json.loads(complex_data.read())
        print(type(z2))
        print(z2)
# Output
# <class 'dict'>
# {'__complex__': True, 'real': 42, 'imaginary': 36}
As we can see, the JSON object was turned into a dict.
In most cases this is perfectly usable, but where well-defined types are required, a conversion step is needed.
Besides the object_hook parameter, which customizes decoding, a json.JSONEncoder subclass can be used to customize encoding.
import json
class ComplexEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, complex):
            # Encode a complex object as a [real, imag] array
            return [obj.real, obj.imag]
        # Fall back to the default handling
        return json.JSONEncoder.default(self, obj)
if __name__ == '__main__':
    c = json.dumps(2 + 1j, cls=ComplexEncoder)
    print(type(c))
    print(c)
# Output
# <class 'str'>
# [2.0, 1.0]
The json module cannot serialize every type automatically; for an unsupported type it raises TypeError.
>>> import datetime
>>> d = datetime.datetime.now()
>>> dct = {'birthday':d,'uid':124,'name':'jack'}
>>> dct
{'birthday': datetime.datetime(2019, 6, 14, 11, 16, 17, 434361), 'uid': 124, 'name': 'jack'}
>>> json.dumps(dct)
Traceback (most recent call last):
  File "<pyshell#19>", line 1, in <module>
    json.dumps(dct)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/json/__init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type datetime is not JSON serializable
For types that cannot be serialized directly, such as datetime and user-defined types, use a JSONEncoder subclass to define the conversion logic.
import json
import datetime
# A JSONEncoder for date and datetime types
class DatetimeEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, datetime.datetime):
            return obj.strftime('%Y-%m-%d %H:%M:%S')
        elif isinstance(obj, datetime.date):
            return obj.strftime('%Y-%m-%d')
        else:
            return json.JSONEncoder.default(self, obj)
if __name__ == '__main__':
    d = datetime.date.today()
    dct = {"birthday": d, "name": "jack"}
    data = json.dumps(dct, cls=DatetimeEncoder)
    print(data)
# Output
# {"birthday": "2019-06-14", "name": "jack"}
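As an alternative to subclassing JSONEncoder, the default parameter of json.dumps (visible in the signature listed earlier) accepts a plain function. The following is a minimal sketch; encode_extra is a hypothetical helper name, and a fixed date is used so that the output is reproducible.

```python
import datetime
import json

def encode_extra(obj):
    """Hypothetical helper: called only for objects json cannot serialize."""
    if isinstance(obj, datetime.datetime):
        return obj.strftime("%Y-%m-%d %H:%M:%S")
    if isinstance(obj, datetime.date):
        return obj.strftime("%Y-%m-%d")
    # Keep the usual behaviour for anything still unsupported.
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")

dct = {"birthday": datetime.date(2019, 6, 14), "name": "jack"}
data = json.dumps(dct, default=encode_extra)
print(data)  # {"birthday": "2019-06-14", "name": "jack"}
```

A function is often enough for a one-off conversion, while a JSONEncoder subclass is easier to reuse across calls.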
Conversely, when deserializing, to convert a date string in a JSON document back into a datetime.date object, use json.JSONDecoder.
# coding=utf-8
import json
import datetime
# A decoder that parses date strings found in the JSON
class DatetimeDecoder(json.JSONDecoder):
    # Constructor: register dict2obj as the object_hook
    def __init__(self):
        super().__init__(object_hook=self.dict2obj)
    def dict2obj(self, d):
        if isinstance(d, dict):
            for k in d:
                if isinstance(d[k], str):
                    # Parse strings of the form YYYY-MM-DD into date objects;
                    # any other string is left unchanged
                    dat = d[k].split("-")
                    if len(dat) == 3 and all(p.isdigit() for p in dat):
                        d[k] = datetime.date(int(dat[0]), int(dat[1]), int(dat[2]))
        return d
if __name__ == '__main__':
    # DatetimeEncoder is the class defined in the previous example
    d = datetime.date.today()
    dct = {"birthday": d, "name": "jack"}
    data = json.dumps(dct, cls=DatetimeEncoder)
    print(data)
    obj = json.loads(data, cls=DatetimeDecoder)
    print(type(obj))
    print(obj)
# Output
# {"birthday": "2019-06-14", "name": "jack"}
# <class 'dict'>
# {'birthday': datetime.date(2019, 6, 14), 'name': 'jack'}
0x03 Summary
Python's common serialization tools are marshal, pickle, and json. marshal is mainly used for Python's .pyc files and is tied to the Python version; it cannot serialize instances of user-defined classes.
pickle is Python's object serialization tool. It is more general than marshal and is compatible across different Python versions. json is a language-independent data format that is widely used in network applications, especially for exchanging data with REST API services.
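The limitation of marshal noted in the summary can be checked with a short sketch. The User class is hypothetical and exists only for this illustration.

```python
import marshal

class User:
    def __init__(self, name):
        self.name = name

try:
    marshal.dumps(User("jack"))
    rejected = False
except ValueError as exc:
    rejected = True
    print(exc)  # the message reports an unmarshallable object

# The instance's attribute dict uses only supported types, so it works.
blob = marshal.dumps(vars(User("jack")))
print(marshal.loads(blob))  # {'name': 'jack'}
```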