Python 3's json Module in Detail

This article walks through the json module in Python 3 in detail and is shared here as a reference.

1 Overview

JSON (JavaScript Object Notation) is a widely used, lightweight data format. The json module in the Python standard library provides functions for processing JSON data.
The dictionary (dict) is one of the most basic and most common data structures in Python. A typical dictionary looks like this:

d = {
    'a': 123,
    'b': {
        'x': ['A', 'B', 'C']
    }
}

The corresponding JSON is structured as follows:

{
    "a": 123,
    "b": {
        "x": ["A", "B", "C"]
    }
}

As we can see, a dictionary and JSON are very close, and the main job of the json library in Python is to convert between the two.

2 Reading JSON

The json.loads method converts a str, bytes or bytearray object containing JSON data into a Python object. Its full signature (as of Python 3.6) is:

json.loads(s, *, encoding=None, cls=None, object_hook=None, parse_float=None, parse_int=None, parse_constant=None, object_pairs_hook=None, **kw)

2.1 The simplest example

The most basic way to use json.loads is to pass it a str containing JSON data:

>>> json.loads('{"a": 123}')
{'a': 123}

Note that in Python a str literal can be enclosed in either single quotes or double quotes:

>>> 'ABC' == "ABC"
True

So when the keys and values of a dictionary are of type str, single and double quotes are both valid and equivalent:

>>> {"a": 'ABC'} == {'a': "ABC"}
True

In JSON, however, strings can only be enclosed in double quotes. Strings in the JSON text passed to json.loads must therefore use double quotes, otherwise a decoding error occurs:

>>> json.loads("{'a': 123}")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/json/__init__.py", line 354, in loads
    return _default_decoder.decode(s)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/json/decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/json/decoder.py", line 355, in raw_decode
    obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)

If the Python string literal itself is enclosed in double quotes, the double quotes inside the JSON must be escaped:

>>> json.loads("{\"a\": 123}")
{'a': 123}
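
In practice, text that looks like JSON but uses single quotes (for example the repr of a Python dict) is rejected in the same way. The following is a minimal sketch of catching that error; the helper name try_parse is made up for this illustration and is not part of the json module:

```python
import json

def try_parse(text):
    """Return the decoded object, or None if text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:
        # exc.msg, exc.lineno and exc.colno describe where decoding failed
        print('invalid JSON: %s (line %d, column %d)' % (exc.msg, exc.lineno, exc.colno))
        return None

ok = try_parse('{"a": 123}')    # double quotes: valid JSON
bad = try_parse("{'a': 123}")   # single quotes: not valid JSON
```

json.JSONDecodeError is a subclass of ValueError, so code that already catches ValueError handles it as well.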

2.2 bytes and bytearray data

json.loads can also handle JSON data held in bytes and bytearray objects:

>>> json.loads('{"a": 123}'.encode('UTF-8'))
{'a': 123}
>>> json.loads(bytearray('{"a": 123}', 'UTF-8'))
{'a': 123}

2.3 Encoding

The encoding parameter of json.loads has no practical effect; it is ignored, and it was removed from the signature in Python 3.9.
In Python 3 a str is already a sequence of Unicode characters, so when s is a str, json.loads needs no decoding step. Such a str must not start with a BOM (byte order mark).
When s is a bytes or bytearray object, json.loads automatically detects whether it is encoded as UTF-8, UTF-16 or UTF-32 and decodes it to a str before further processing.
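
The detection described above can be checked with a short sketch. It assumes Python 3.6 or later (where json.loads accepts bytes); the variable names are only illustrative:

```python
import json

data = {'a': 123}
text = json.dumps(data)

# bytes input: the encoding is detected automatically
from_utf8 = json.loads(text.encode('utf-8'))
from_utf16 = json.loads(text.encode('utf-16'))   # encoded with a BOM
from_utf32 = json.loads(text.encode('utf-32'))   # encoded with a BOM

# str input: a leading BOM character is rejected
try:
    json.loads('\ufeff' + text)
    bom_rejected = False
except json.JSONDecodeError:
    bom_rejected = True
```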

2.4 Data type conversion

JSON can represent four primitive types:

1. String
2. Number
3. Boolean
4. Null

and two structured types:

1. Object
2. Array

By default, JSON values are converted to Python objects as follows:

JSON              Python
object            dict
array             list
string            str
number (integer)  int
number (real)     float
true              True
false             False
null              None

The following example shows the actual conversions:

>>> json.loads("""
... {
... "obj": {
... "str": "ABC",
... "int": 123,
... "float": -321.89,
... "bool_true": true,
... "bool_false": false,
... "null": null,
... "array": [1, 2, 3]
... }
... }""")
{'obj': {'str': 'ABC', 'int': 123, 'float': -321.89, 'bool_true': True, 'bool_false': False, 'null': None, 'array': [1, 2, 3]}} 

For the JSON number type, note the following points:
1. The precision of a JSON real number must not exceed that of the Python float type, otherwise precision is lost, as in the following example:

>>> json.loads('3.141592653589793238462643383279')
3.141592653589793

2. The JSON standard does not include NaN, Infinity or -Infinity, but by default json.loads converts NaN, Infinity and -Infinity in a JSON string to Python's float('nan'), float('inf') and float('-inf'). Note that NaN, Infinity and -Infinity in the JSON text must be spelled in full with exactly this capitalization, as in the following example:

>>> json.loads('{"inf": Infinity, "nan": NaN, "ninf": -Infinity}')
{'inf': inf, 'nan': nan, 'ninf': -inf}
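
If the loss of precision in point 1 matters, one possible approach is to pass decimal.Decimal as the parse_float parameter (parse_float is covered in section 2.6). This is a sketch of that idea, not the only way to do it:

```python
import json
from decimal import Decimal

text = '3.141592653589793238462643383279'

as_float = json.loads(text)                         # rounded to float precision
as_decimal = json.loads(text, parse_float=Decimal)  # all digits preserved
```

Decimal works here because parse_float is called with the original text of the number, so nothing is rounded before Decimal sees it.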

2.5 Converting JSON objects to custom types

By default json.loads converts a JSON object to a dict. The object_hook parameter can be used to change what gets constructed.
object_hook takes a function. The function receives the dict built from a JSON object and returns the custom object to use in its place, as in the following example:

>>> class MyJSONObj:
...     def __init__(self, x):
...         self.x = x
...
>>> def my_json_obj_hook(data):
...     print('obj_hook data: %s' % data)
...     return MyJSONObj(data['x'])
...
>>> result = json.loads('{"x": 123}', object_hook=my_json_obj_hook)
obj_hook data: {'x': 123}
>>> type(result)
<class '__main__.MyJSONObj'>
>>> result.x
123

When JSON objects are nested, json.loads traverses the object tree depth-first and passes the data of each level to object_hook. The Python objects built from the inner JSON objects are passed, as values of the enclosing object, to the object_hook call for that enclosing object, as in the following example:

>>> class MyJSONObj:
...     def __init__(self, x, y):
...         self.x = x
...         self.y = y
...
>>> def my_json_obj_hook(data):
...     print('obj_hook data: %s' % data)
...     return MyJSONObj(**data)
...
>>> result = json.loads('{"x": {"x": 11, "y": 12}, "y": {"x": 21, "y":22}}', object_hook=my_json_obj_hook)
obj_hook data: {'x': 11, 'y': 12}
obj_hook data: {'x': 21, 'y': 22}
obj_hook data: {'x': <__main__.MyJSONObj object at 0x10417ef28>, 'y': <__main__.MyJSONObj object at 0x10417ed68>} 
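
Because object_hook is called for every JSON object in the document, a hook usually has to decide which dictionaries to convert and which to leave alone. The sketch below shows one way to do that; the Point class and the rule "exactly the keys x and y" are assumptions made for this illustration:

```python
import json

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

def point_hook(data):
    # Convert only dicts that look like a Point; return all others unchanged
    if set(data) == {'x', 'y'}:
        return Point(data['x'], data['y'])
    return data

result = json.loads(
    '{"name": "line", "start": {"x": 1, "y": 2}, "end": {"x": 3, "y": 4}}',
    object_hook=point_hook)
```

Here the outer object stays a dict, while its start and end values become Point instances.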

Besides object_hook there is also an object_pairs_hook parameter, which can likewise change the type of Python object that json.loads constructs. Unlike object_hook, the function passed here receives not a dictionary but a list of tuples. Each tuple has two elements: the first is a key from the JSON object and the second is the corresponding value. For example, the JSON object

{
"a": 123,
"b": "ABC"
} 

corresponds to the input data

[('a', 123), ('b', 'ABC')]

When both object_hook and object_pairs_hook are specified in a call to json.loads, object_pairs_hook takes priority and object_hook is ignored.
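
A small sketch makes both points concrete: the hook receives a list of (key, value) tuples, and object_hook is not called when both hooks are given. The function names record_pairs and never_called are hypothetical:

```python
import json
from collections import OrderedDict

seen = []

def record_pairs(pairs):
    # pairs is a list of (key, value) tuples in document order
    seen.append(pairs)
    return OrderedDict(pairs)

def never_called(data):
    raise AssertionError('object_hook should be ignored')

result = json.loads('{"a": 123, "b": "ABC"}',
                    object_hook=never_called,
                    object_pairs_hook=record_pairs)
```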

2.6 Customizing the conversion of JSON numbers

By default, JSON real numbers are converted to Python float and integers to int. As with object_hook, the conversion logic can be customized through the parse_float and parse_int parameters. The functions passed to them receive the str that represents the real number or integer in the JSON text. In the following example, real numbers are converted to numpy.float64 and integers to numpy.int64:

>>> import numpy
>>> def my_parse_float(f):
...     print('%s(%s)' % (type(f), f))
...     return numpy.float64(f)
...
>>> def my_parse_int(i):
...     print('%s(%s)' % (type(i), i))
...     return numpy.int64(i)
...
>>> result = json.loads('{"i": 123, "f": 321.45}', parse_float=my_parse_float, parse_int=my_parse_int)
<class 'str'>(123)
<class 'str'>(321.45)
>>> type(result['i'])
<class 'numpy.int64'>
>>> type(result['f'])
<class 'numpy.float64'>

2.6.1 Customizing the conversion of NaN, Infinity and -Infinity

Because the JSON standard does not support NaN, Infinity and -Infinity, parse_float does not receive these values. To customize how they are converted, use a separate parameter, parse_constant. The following example converts these values to numpy.float64 as well:

>>> def my_parse_constant(data):
...     print('%s(%s)' % (type(data), data))
...     return numpy.float64(data)
...
>>> result = json.loads('{"inf": Infinity, "nan": NaN, "ninf": -Infinity}', parse_constant=my_parse_constant)
<class 'str'>(Infinity)
<class 'str'>(NaN)
<class 'str'>(-Infinity)
>>> result['inf']
inf
>>> type(result['inf'])
<class 'numpy.float64'>
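
parse_constant can also be used the other way round: to refuse these non-standard constants so that only strictly valid JSON is accepted. A minimal sketch, with reject_constant as a made-up helper name:

```python
import json

def reject_constant(name):
    # name is the text of the constant: 'NaN', 'Infinity' or '-Infinity'
    raise ValueError('non-standard JSON constant: %s' % name)

strict_ok = json.loads('{"x": 1.5}', parse_constant=reject_constant)

try:
    json.loads('{"x": NaN}', parse_constant=reject_constant)
    rejected = False
except ValueError:
    rejected = True
```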

2.7 Top-level values other than objects

According to the JSON specification, a JSON document may consist of a single value rather than a complete object. That value can be a string, a number, a boolean, null or an array. With json.loads it may, beyond what the specification allows, also be NaN, Infinity or -Infinity:

>>> json.loads('"hello"')
'hello'
>>> json.loads('123')
123
>>> json.loads('123.34')
123.34
>>> json.loads('true')
True
>>> json.loads('false')
False
>>> print(json.loads('null'))
None
>>> json.loads('[1, 2, 3]')
[1, 2, 3] 

2.8 Duplicate keys

Within one JSON object, key names should not be repeated, but the JSON specification does not say how to handle this case. In json.loads, when the JSON data contains duplicate keys, a later key overwrites an earlier one:

>>> json.loads('{"a": 123, "b": "ABC", "a": 321}')
{'a': 321, 'b': 'ABC'}
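
If silently overwriting is not acceptable, the object_pairs_hook parameter from section 2.5 can be used to detect duplicates, since it sees every key/value pair before a dict is built. The following is a sketch of that idea; no_duplicates is a hypothetical helper:

```python
import json

def no_duplicates(pairs):
    # Build the dict by hand and fail on the first repeated key
    result = {}
    for key, value in pairs:
        if key in result:
            raise ValueError('duplicate key: %r' % key)
        result[key] = value
    return result

unique = json.loads('{"a": 123, "b": "ABC"}', object_pairs_hook=no_duplicates)

try:
    json.loads('{"a": 123, "b": "ABC", "a": 321}', object_pairs_hook=no_duplicates)
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
```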

2.9 Reading JSON data from a file

When JSON data is stored in a file, the json.load method can read the data from the file and convert it to a Python object. Its first parameter, fp, is a file object for the JSON data.
Suppose the file /tmp/data.json contains the following:

{
"a": 123,
"b": "ABC"
} 

The following code reads the file and converts its JSON data:

>>> with open('/tmp/data.json') as jf:
...     json.load(jf)
...
{'a': 123, 'b': 'ABC'}

Besides file objects, any file-like object that implements a read method can be passed as fp, for example io.StringIO:

>>> sio = io.StringIO('{"a": 123}')
>>> json.load(sio)
{'a': 123}

The other parameters of json.load have the same meaning and usage as those of json.loads and are not repeated here.

3 Generating JSON

The json.dumps method converts a Python object into a str containing JSON data. Its full signature is:

json.dumps(obj, *, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, cls=None, indent=None, separators=None, default=None, sort_keys=False, **kw)

Its first parameter, obj, is the data object to convert.

>>> json.dumps({'a': 123, 'b': 'ABC'})
'{"a": 123, "b": "ABC"}'

3.1 Encoding

The ensure_ascii parameter of json.dumps controls how non-ASCII characters are written to the generated JSON string. Its default value is True, in which case all non-ASCII characters are escaped. If you do not want this automatic escaping, set it to False and the characters are kept as they are, as in the following example:

>>> json.dumps({'数字': 123, '字符': '一二三'})
'{"\\u6570\\u5b57": 123, "\\u5b57\\u7b26": "\\u4e00\\u4e8c\\u4e09"}'
>>> json.dumps({'数字': 123, '字符': '一二三'}, ensure_ascii=False)
'{"数字": 123, "字符": "一二三"}'
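
Both forms describe the same data; they differ only in how the characters are written. A short sketch that checks this and compares the encoded sizes (the variable names are illustrative):

```python
import json

data = {'数字': 123, '字符': '一二三'}

escaped = json.dumps(data)                    # pure ASCII, \uXXXX escapes
raw = json.dumps(data, ensure_ascii=False)    # non-ASCII characters kept

# Both strings decode back to the same Python object
same = json.loads(escaped) == json.loads(raw) == data

# Size in bytes once encoded as UTF-8
escaped_size = len(escaped.encode('utf-8'))
raw_size = len(raw.encode('utf-8'))
```

For text like this, the unescaped form is smaller when stored as UTF-8, while the escaped form is safe to send through channels that only handle ASCII.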

3.2 Data type conversion

By default, json.dumps can only process Python objects whose values are all of type dict, list, tuple, str, int, float, bool or None. These types are converted to JSON as follows:

Python        JSON
dict          object
list, tuple   array
str           string
int, float    number
True          true
False         false
None          null

The following example shows the actual conversions:

>>> json.dumps(
...     {
...         'str': 'ABC',
...         'int': 123,
...         'float': 321.45,
...         'bool_true': True,
...         'bool_false': False,
...         'none': None,
...         'list': [1, 2, 3],
...         'tuple': (12, 34)
...     }
... )
'{"str": "ABC", "int": 123, "float": 321.45, "bool_true": true, "bool_false": false, "none": null, "list": [1, 2, 3], "tuple": [12, 34]}'

Although the JSON specification does not support NaN, Infinity and -Infinity, by default json.dumps converts float('nan'), float('inf') and float('-inf') to the constants NaN, Infinity and -Infinity, as in the following example:

>>> json.dumps(
... {
... 'nan': float('nan'),
... 'inf': float('inf'),
... '-inf': float('-inf')
... }
... )
'{"nan": NaN, "inf": Infinity, "-inf": -Infinity}'

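Because these constants may produce a JSON string that other JSON implementations cannot process, you can set the allow_nan parameter of json.dumps to False to prevent this. json.dumps then raises an exception when it meets one of these values in the Python object.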
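
A minimal sketch of the difference between the default behavior and allow_nan=False:

```python
import json

data = {'value': float('nan')}

# Default: the non-standard constant NaN is written to the output
lenient = json.dumps(data)

# Strict: out-of-range float values raise ValueError instead
try:
    json.dumps(data, allow_nan=False)
    raised = False
except ValueError:
    raised = True
```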

3.3 Circular references

json.dumps checks the Python object for circular references and raises an exception if it finds one, as in the following example:

>>> circular_obj = {}
>>> circular_obj['self'] = circular_obj
>>> circular_obj
{'self': {...}}
>>> json.dumps(circular_obj)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/json/__init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
ValueError: Circular reference detected

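If you do not want json.dumps to check for circular references, set the check_circular parameter to False. But if the Python object does contain a circular reference, this can lead to a recursion depth error or other errors, so doing this is risky, as in the following example: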

>>> json.dumps(circular_obj, check_circular=False)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
RecursionError: maximum recursion depth exceeded while encoding a JSON object

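3.4 Output format of the JSON string

The indent parameter of json.dumps controls line breaks and indentation in the JSON string.
The default value of indent is None, in which case the JSON string has no line breaks or indentation: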

>>> print(json.dumps({'a': 123, 'b': {'x': 321, 'y': 'ABC'}}))
{"a": 123, "b": {"x": 321, "y": "ABC"}} 

When indent is 0 or negative, the JSON string contains line breaks but no indentation:

>>> print(json.dumps({'a': 123, 'b': {'x': 321, 'y': 'ABC'}}, indent=-1))
{
"a": 123,
"b": {
"x": 321,
"y": "ABC"
}
}
>>> print(json.dumps({'a': 123, 'b': {'x': 321, 'y': 'ABC'}}, indent=0))
{
"a": 123,
"b": {
"x": 321,
"y": "ABC"
}
} 

When indent is a positive integer, the JSON is, in addition to the line breaks, indented by that number of spaces per nesting level:

>>> print(json.dumps({'a': 123, 'b': {'x': 321, 'y': 'ABC'}}, indent=2))
{
  "a": 123,
  "b": {
    "x": 321,
    "y": "ABC"
  }
}

indent can also be a str, in which case the JSON is indented with that string per level, for example the tab character \t:

>>> print(json.dumps({'a': 123, 'b': {'x': 321, 'y': 'ABC'}}, indent='\t'))
{
	"a": 123,
	"b": {
		"x": 321,
		"y": "ABC"
	}
}

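Another parameter of json.dumps, separators, sets the separators used in the output. Its value should be a tuple with two elements: the first is the separator between items and the second is the separator between a key and its value. Its default depends on the indent parameter described above. When indent is None, the default is (', ', ': '), so both separators are followed by a space. When indent is not None, the default is (',', ': '), so only the key-value separator is followed by a space; the item separator has none, because a line break follows it.
One possible use of separators is to strip all non-essential formatting characters and so reduce the size of the JSON string. To do this, set separators to (',', ':') and leave indent unset, or set it explicitly to None: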

>>> print(json.dumps({'a': 123, 'b': {'x': 321, 'y': 'ABC'}}, indent=None, separators=(',', ':')))
{"a":123,"b":{"x":321,"y":"ABC"}}
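
To get a feel for the effect, the following sketch compares the length of the same data in compact, default and indented form. The exact savings depend on the data; this only illustrates the ordering:

```python
import json

data = {'a': 123, 'b': {'x': 321, 'y': 'ABC'}}

default = json.dumps(data)                          # separators (', ', ': ')
pretty = json.dumps(data, indent=2)                 # line breaks and indentation
compact = json.dumps(data, separators=(',', ':'))   # no optional whitespace

sizes = (len(compact), len(default), len(pretty))
```

All three strings decode to the same object, so the choice only affects readability and size.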

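3.5 Converting custom Python objects

By default json.dumps can only convert the built-in types listed in section 3.2. To convert a custom object, use the default parameter. It takes a function that receives the Python object to convert and returns an object json.dumps can already serialize, typically a dict representing it. json.dumps walks the object reference tree from the top level down and calls the default function for each object it cannot serialize itself. You therefore do not need to implement the tree traversal yourself; you only need to handle the object at the current level, as in the following example: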

>>> class MyClass:
...     def __init__(self, x, y):
...         self.x = x
...         self.y = y
...
>>> def my_default(o):
...     if isinstance(o, MyClass):
...         print('%s.y: %s' % (type(o), o.y))
...         return {'x': o.x, 'y': o.y}
...     # Any other unsupported type cannot be converted
...     raise TypeError('Object of type %s is not JSON serializable' % type(o).__name__)
...
>>> obj = MyClass(x=MyClass(x=1, y=2), y=11)
>>> json.dumps(obj, default=my_default)
<class '__main__.MyClass'>.y: 11
<class '__main__.MyClass'>.y: 2
'{"x": {"x": 1, "y": 2}, "y": 11}'
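
The same mechanism works for standard-library types that json.dumps does not handle on its own. The sketch below converts dates to ISO 8601 strings and sets to sorted lists; both choices are assumptions made for this illustration, not conventions defined by the json module:

```python
import datetime
import json

def to_jsonable(o):
    if isinstance(o, (datetime.date, datetime.datetime)):
        return o.isoformat()          # e.g. '2020-02-21'
    if isinstance(o, set):
        return sorted(o)              # sets have no order, so sort for stable output
    # Signal that this object really cannot be converted
    raise TypeError('Object of type %s is not JSON serializable' % type(o).__name__)

text = json.dumps({'day': datetime.date(2020, 2, 21), 'tags': {'b', 'a'}},
                  default=to_jsonable)
```

Raising TypeError for everything else keeps the usual error behavior for objects the function does not know about.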

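3.6 Non-string keys

In Python, any hashable object can be used as a dict key, whereas the JSON specification only allows strings as key names. json.dumps checks for this rule, but the range of allowed keys is somewhat wider: values of type str, int, float, bool and None can all be used as keys. A key that is not a str is converted to the corresponding str value, as in the following example: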

>>> json.dumps(
... {
... 'str': 'str',
... 123: 123,
... 321.54: 321.54,
... True: True,
... False: False,
... None: None
... }
... )
'{"str": "str", "123": 123, "321.54": 321.54, "true": true, "false": false, "null": null}'

Keys of any other type raise an exception by default:

>>> json.dumps({(1,2): 123})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/json/__init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
TypeError: keys must be a string 

The skipkeys parameter of json.dumps changes this behavior. When skipkeys is set to True, a key of an unsupported type does not raise an exception; the key is skipped instead:

>>> json.dumps({(1,2): 123}, skipkeys=True)
'{}'
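
One consequence of the key conversion is that a round trip through JSON does not restore the original key types. A short sketch:

```python
import json

original = {2: 'two', 2.5: 'two and a half', None: 'nothing'}

# Keys are written as strings and stay strings when read back
restored = json.loads(json.dumps(original))
```

Code that relies on int or float keys therefore has to convert them back explicitly after loading.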

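3.7 Writing JSON to a file

To save the generated JSON data to a file, use the json.dump method. Compared with json.dumps it has one additional parameter, fp, the file object to which the JSON data is written. For example, the following code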

>>> with open('/tmp/data.json', mode='w') as jf:
...     json.dump({'a': 123}, jf)
...

writes the JSON data to the file /tmp/data.json. After the code runs, the file contains

{"a": 123} 

json.dump also accepts other file-like objects:

>>> sio = io.StringIO()
>>> json.dump({'a': 123}, sio)
>>> sio.getvalue()
'{"a": 123}'
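
json.dump and json.load are natural counterparts. The following sketch writes data to a file and reads it back; it uses a temporary directory and a made-up file name so that it does not touch any existing file:

```python
import json
import os
import tempfile

data = {'a': 123, 'b': 'ABC'}

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file name inside the temporary directory
    path = os.path.join(tmp_dir, 'data.json')

    with open(path, mode='w', encoding='utf-8') as jf:
        json.dump(data, jf)

    with open(path, encoding='utf-8') as jf:
        restored = json.load(jf)
```

Opening the file with an explicit encoding avoids depending on the platform default when the data contains non-ASCII characters.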

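The other parameters of json.dump are used in the same way as those of json.dumps and are not repeated here.

4 JSON decoder and encoder classes

The four methods json.loads, json.load, json.dumps and json.dump do their work through the two classes json.JSONDecoder and json.JSONEncoder. These classes can therefore also be used directly to achieve what was described above: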

>>> json.JSONDecoder().decode('{"a": 123}')
{'a': 123}
>>> json.JSONEncoder().encode({'a': 123})
'{"a": 123}'

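Most parameters of json.loads, json.load, json.dumps and json.dump are passed straight to the constructors of json.JSONDecoder and json.JSONEncoder, so these methods cover the vast majority of needs. When you need a custom subclass of json.JSONDecoder or json.JSONEncoder, simply pass the subclass as the cls parameter. These methods also accept **kw: when the constructor of a custom class needs parameters beyond the standard list, **kw passes them on to that constructor.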
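
As an illustration of cls and **kw together, the sketch below defines an encoder subclass whose constructor takes one extra keyword argument. The class name PrefixEncoder, the prefix parameter and the string format for complex numbers are all invented for this example:

```python
import json

class PrefixEncoder(json.JSONEncoder):
    """Hypothetical encoder that writes complex numbers as tagged strings."""

    def __init__(self, *, prefix='complex:', **kwargs):
        # Standard arguments (indent, sort_keys, ...) go to the base class
        super().__init__(**kwargs)
        self.prefix = prefix

    def default(self, o):
        if isinstance(o, complex):
            return '%s%s,%s' % (self.prefix, o.real, o.imag)
        # The base implementation raises TypeError for unsupported types
        return super().default(o)

# prefix is not a standard json.dumps parameter; it reaches the
# PrefixEncoder constructor through **kw
text = json.dumps({'z': 1 + 2j}, cls=PrefixEncoder, prefix='c:')
```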

Origin blog.csdn.net/haoxun06/article/details/104435319