json.dumps(var, ensure_ascii=False) does not fix garbled Chinese characters
json.dumps behaves differently across Python versions. Note that the garbled-Chinese problem described below does not occur in Python 3.
Note: the following code was tested under Python 2.7.
To fix the Chinese encoding problem, you need to know how Python 2.7 handles strings:
Because of the # -*- coding: utf-8 -*- declaration, the content of the file is encoded in UTF-8, so print odata outputs the UTF-8 encoded bytes: {'a': '\xe4\xbd\xa0\xe5\xa5\xbd'}
By default, json.dumps escapes non-ASCII characters such as Chinese when serializing, so print json.dumps(odata) outputs Unicode escape sequences (\u4f60\u597d).
print json.dumps(odata, ensure_ascii=False) skips the ASCII escaping and emits the UTF-8 bytes unchanged; when the console uses GBK, those bytes are interpreted as GBK.
你好 encoded as UTF-8 is %E4%BD%A0%E5%A5%BD, and decoding those bytes as GBK gives 浣犲ソ.
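The escaping and the garbling described above can be reproduced with a short sketch. It is written so that it also runs on Python 3, so the GBK console is simulated by decoding the UTF-8 bytes as GBK explicitly. The variable names are illustrative and do not come from the original post.

```python
# -*- coding: utf-8 -*-
import json

text = u"你好"
utf8_bytes = text.encode("utf-8")      # the six bytes E4 BD A0 E5 A5 BD

# Default behavior: non-ASCII characters are replaced by \uXXXX escapes,
# so the serialized result is pure ASCII.
escaped = json.dumps({"a": text})
print(escaped)                         # {"a": "\u4f60\u597d"}

# Simulate a GBK console: the same bytes, interpreted with the wrong
# encoding, turn into unrelated characters (mojibake).
garbled = utf8_bytes.decode("gbk")
print(garbled == text)                 # False
```

The bytes themselves are never damaged; only the interpretation is wrong, which is why encoding the garbled text back to GBK recovers the original UTF-8 bytes.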
Python uses Unicode as its internal representation of text.
Therefore, an encoding conversion usually goes through Unicode as the intermediate form: first decode the source string into Unicode, then encode the Unicode into the target encoding.
decode converts a string in some other encoding into Unicode.
For example, decode('utf-8') converts a UTF-8 encoded string into Unicode.
encode converts Unicode into a string in some other encoding.
For example, encode('gb2312') converts a Unicode string into GB2312.
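The decode-then-encode route can be sketched as follows, assuming the input is the UTF-8 byte string for 你好 used earlier. The variable names are illustrative.

```python
# -*- coding: utf-8 -*-
utf8_bytes = b"\xe4\xbd\xa0\xe5\xa5\xbd"   # UTF-8 bytes of 你好

# Step 1: decode the source encoding into Unicode.
text = utf8_bytes.decode("utf-8")

# Step 2: encode the Unicode text into the target encoding.
gb_bytes = text.encode("gb2312")

# Going the other way works the same: decode GB2312, then encode UTF-8.
round_trip = gb_bytes.decode("gb2312").encode("utf-8")
print(round_trip == utf8_bytes)            # True
```

Converting directly from one byte encoding to another is not possible; Unicode is always the step in between.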
Python 3 does not have this problem, so the easiest fix is to use the __future__ module to bring the newer version's behavior into the current version.
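A minimal sketch of this approach, assuming the feature meant here is unicode_literals, which makes every plain string literal in the file a Unicode string on Python 2.7 (on Python 3 the import has no effect). The name odata follows the text above; result is illustrative.

```python
# -*- coding: utf-8 -*-
from __future__ import unicode_literals   # assumption: this is the feature intended

import json

# With unicode_literals, this literal is Unicode text rather than UTF-8 bytes.
odata = {"a": "你好"}

# json.dumps now receives Unicode and, with ensure_ascii=False, returns the
# characters unescaped instead of raw bytes in the file's encoding.
result = json.dumps(odata, ensure_ascii=False)
print(result == '{"a": "你好"}')           # True
```

Because the result is Unicode text, print can encode it for whatever encoding the console actually uses, rather than sending UTF-8 bytes to a GBK console.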
A UnicodeEncodeError: 'ascii' codec can't encode exception is raised in Python 2.7 when writing the file
An expert's solution:
Do not open the file with the built-in open; use codecs instead.
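A minimal sketch of the codecs approach, assuming a hypothetical file name out.json placed in a temporary directory. The data and variable names are illustrative.

```python
# -*- coding: utf-8 -*-
import codecs
import json
import os
import tempfile

odata = {u"a": u"你好"}

# Hypothetical output path, created in a temp directory for this sketch.
path = os.path.join(tempfile.mkdtemp(), "out.json")

# codecs.open returns a file wrapper that encodes Unicode text to UTF-8 on
# write, so Python 2.7 never falls back to the implicit ASCII codec.
with codecs.open(path, "w", encoding="utf-8") as f:
    f.write(json.dumps(odata, ensure_ascii=False))

# Reading back through the same wrapper decodes the bytes into Unicode again.
with codecs.open(path, "r", encoding="utf-8") as f:
    content = f.read()

print(json.loads(content) == odata)   # True
```

On Python 2.7, io.open(path, "w", encoding="utf-8") is a comparable alternative from the standard library.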