Serializing data in Python

Python offers two standard modules for serializing data: json, which reads and writes the JSON format, and pickle, which uses a Python-specific format.

Serialization concept:

Serialization: The process of converting the state of an object into a form that can be stored or transmitted over a network. The target format can be JSON, XML, etc.

Deserialization: The reverse process of reading the serialized state from storage (JSON, XML, etc.) and recreating the object from it.
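
The two definitions above can be sketched as a round trip. This is a minimal illustration only: the `state` dictionary and its contents are made up for the example, and JSON is used simply because it is the first format the article covers.

```python
import json

# Hypothetical object state, held in a plain dictionary.
state = {"user": "alice", "visits": 3, "tags": ["admin", "staff"]}

# Serialization: object -> text that can be stored or sent over a network.
payload = json.dumps(state)

# Deserialization: text -> a newly created object with the same content.
restored = json.loads(payload)

print(type(payload).__name__)  # str
print(restored == state)       # True: same content
print(restored is state)       # False: a new object was created
```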

JSON module

JSON is understood by virtually every programming language, so it can be used to pass data between programs written in different languages.

JSON: A lightweight data-interchange format that is simpler than XML, easy for people to read and write, and easy for machines to parse and generate. Its syntax is based on a subset of JavaScript's object literal syntax.

In Python's json module, serialization and deserialization are called encoding and decoding, respectively.

  • encoding: Convert a Python object into a JSON string. The corresponding methods are dump and dumps.
  • decoding: Convert a JSON string into a Python object. The corresponding methods are load and loads.

The json module provides four methods: dump, load, dumps, loads

The dumps and loads methods work on strings: dumps encodes a Python object as a JSON string and returns it, and loads decodes a JSON string back into a Python object.

# dumps function
# Convert the data into a JSON string, a format that other programming languages can parse
In [1]: import json

In [2]: data = ["a", "bb", "ccc"]

In [3]: j_data = json.dumps(data)

In [4]: j_data
Out[4]: '["a", "bb", "ccc"]'

In [5]: type(j_data)
Out[5]: str

# loads function
# Convert the JSON-encoded string back into a Python data structure
In [6]: meg = json.loads(j_data)

In [7]: meg
Out[7]: ['a', 'bb', 'ccc']

In [8]: type(meg)
Out[8]: list

dump and load work the same way as dumps and loads, except that they write to and read from a file object instead of a string.

# dump function
# Convert the data into a JSON string and write it to a file
In [11]: with open("/tmp/tmp.json", "w") as fd:
   ....:     json.dump(data, fd)
   ....:
[root@db2 tmp]# cat tmp.json
["a", "bb", "ccc"]
 
# load function
# Read the JSON-encoded text from the file and convert it into a Python data structure
In [1]: import json

In [2]: with open("/tmp/tmp.json", "r") as fd:
   ...:     data = json.load(fd)
   ...:

In [3]: data
Out[3]: ['a', 'bb', 'ccc']   # Python 2 would show unicode strings here: [u'a', u'bb', u'ccc']

For dictionaries, JSON requires every key to be a string. When encoding, the json module converts keys of type int, float, bool and None to strings, and raises TypeError for any other key type unless skipkeys=True is passed. Older versions of the JSON specification (RFC 4627) required the top-level value to be an object or an array, so for maximum compatibility it is safest to encode only Python dictionaries and lists at the top level. In web applications, it is standard practice to make the top-level object a dictionary.
JSON syntax is almost the same as Python literal syntax, with a few differences: True is mapped to true, False to false, None to null, and a tuple is encoded as an array, exactly like a list, because JSON has no tuple type. As a result, a tuple comes back as a list after decoding.
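
The mapping rules above mean that an encode/decode round trip does not always return an equal object. The short sketch below illustrates this; the `original` dictionary is invented for the example.

```python
import json

# Sample data mixing the types discussed above (values are illustrative).
original = {"flag": True, "nothing": None, "pair": (1, 2), 1: "abc"}

text = json.dumps(original)
print(text)  # {"flag": true, "nothing": null, "pair": [1, 2], "1": "abc"}

decoded = json.loads(text)
print(decoded["pair"])               # [1, 2]  (the tuple came back as a list)
print("1" in decoded, 1 in decoded)  # True False  (the int key became a string)
print(decoded == original)           # False

# Keys that are not str, int, float, bool or None cannot be encoded.
try:
    json.dumps({(1, 2): "tuple key"})
    tuple_key_rejected = False
except TypeError:
    tuple_key_rejected = True
print(tuple_key_rejected)            # True
```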

Some commonly used parameters of dump and dumps (excerpt):

  • ensure_ascii is True by default, which guarantees that the output contains only ASCII characters: every non-ASCII character is escaped as \uXXXX. If the data contains Chinese or other non-ASCII characters and you want them to appear as-is in the output, set ensure_ascii to False.
  • indent defaults to None, which produces single-line output with no indentation. When set to a positive integer, the output is pretty-printed, with each nesting level indented by that many spaces.
  • separators sets the separators as an (item_separator, key_separator) tuple. The default is (', ', ': ') when indent is None and (',', ': ') otherwise. To customize the output, for example to remove the spaces after the comma and the colon, pass a tuple such as (',', ':').
  • sort_keys defaults to False. When set to True, dictionaries are output with their keys in sorted order.

In [23]: data
Out[23]: {'a': True, 'b': False, 'c': None, 'd': (1, 2), 1: 'abc', 'cn': ['中国-北京']}

jdata1 = json.dumps(data)  # Use the default parameters

print(jdata1)
{"a": true, "b": false, "c": null, "d": [1, 2], "1": "abc", "cn": ["\u4e2d\u56fd-\u5317\u4eac"]}

jdata2 = json.dumps(data, ensure_ascii=False, indent=4)  # Override ensure_ascii and indent

print(jdata2)
{
    "a": true,
    "b": false,
    "c": null,
    "d": [
        1,
        2
    ],
    "1": "abc",
    "cn": [
        "中国-北京"
    ]
}
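
The session above exercises ensure_ascii and indent. The remaining two parameters from the list, separators and sort_keys, can be sketched in the same way; the `sample` dictionary is made up for this illustration.

```python
import json

# Illustrative data with keys deliberately out of alphabetical order.
sample = {"b": [1, 2], "a": {"y": None, "x": True}}

# separators=(",", ":") drops the spaces after commas and colons,
# which gives the most compact output.
compact = json.dumps(sample, separators=(",", ":"))
print(compact)  # {"b":[1,2],"a":{"y":null,"x":true}}

# sort_keys=True sorts the keys of every dictionary, including nested ones.
ordered = json.dumps(sample, sort_keys=True)
print(ordered)  # {"a": {"x": true, "y": null}, "b": [1, 2]}
```

Sorted keys are useful when the output is compared or diffed, since two equal dictionaries then always produce the same string.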

pickle module

JSON is a data format shared by many programming languages, whereas pickle is a binary serialization format specific to Python.

Python 2 has two modules, pickle and cPickle (a faster C implementation). Python 3 has only pickle, which uses the C implementation automatically when it is available.

pickle provides the same four methods as json: dump and load, dumps and loads.
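
As a quick sketch of the in-memory pair, dumps and loads, assuming Python 3: dumps returns a bytes object rather than a str, and types that JSON cannot represent, such as sets and tuples, survive the round trip. The `record` dictionary is invented for the example. Note that unpickling can execute arbitrary code, so only load pickle data from a source you trust.

```python
import pickle

# Illustrative data containing types that JSON has no equivalent for.
record = {"ids": {1, 2, 3}, "point": (1.5, 2.5), "name": "demo"}

# dumps serializes to an in-memory bytes object (not a str).
blob = pickle.dumps(record)
print(type(blob).__name__)               # bytes

# loads rebuilds an equal object, preserving the set and the tuple.
restored = pickle.loads(blob)
print(restored == record)                # True
print(type(restored["ids"]).__name__)    # set
print(type(restored["point"]).__name__)  # tuple
```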

In [1]: l1 = [1,2,3]

In [2]: l2 = ["a","c","d"]

In [3]: l3 = {1:"a", 2:"c", 3:"d"}

In [5]: fd = open("pickle.kpi", "wb")  # pickle data is binary, so open the file in binary mode

In [6]: import pickle  # Serialize the three objects into the file in order

In [8]: pickle.dump(l1, fd)

In [9]: pickle.dump(l2, fd)

In [10]: pickle.dump(l3, fd)

In [11]: fd.close()

In [12]: fd = open("pickle.kpi", "rb")

In [13]: s1 = pickle.load(fd)  # Note: objects are read back in the order they were written (first in, first out)

In [14]: s1
Out[14]: [1, 2, 3]

In [16]: s2 = pickle.load(fd)

In [17]: s2
Out[17]: ['a', 'c', 'd']

In [19]: s3 = pickle.load(fd)

In [20]: s3
Out[20]: {1: 'a', 2: 'c', 3: 'd'}
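
When the number of pickled objects in a file is not known in advance, one common approach is to keep calling load until it raises EOFError. The sketch below assumes Python 3; the helper name `load_all` and the temporary file path are hypothetical and exist only for this example.

```python
import os
import pickle
import tempfile


def load_all(path):
    """Return every object pickled into the file, in the order written."""
    items = []
    with open(path, "rb") as fd:
        while True:
            try:
                items.append(pickle.load(fd))
            except EOFError:
                # pickle.load raises EOFError once the file is exhausted.
                break
    return items


# Write the same three objects as in the session above to a temporary file.
path = os.path.join(tempfile.mkdtemp(), "pickle.kpi")
with open(path, "wb") as fd:
    for obj in ([1, 2, 3], ["a", "c", "d"], {1: "a", 2: "c", 3: "d"}):
        pickle.dump(obj, fd)

print(load_all(path))  # [[1, 2, 3], ['a', 'c', 'd'], {1: 'a', 2: 'c', 3: 'd'}]
```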

 
