12th class

hashlib

The hashlib module implements a common interface to many different secure hash algorithms and message digest algorithms. It provides implementations of algorithms such as md5, sha1, sha224, sha256, sha384, and sha512.

Here is a brief overview of the main categories of data encryption:

Method | Description | Main problem solved | Common algorithms
Symmetric encryption | Encryption and decryption use the same key | Data confidentiality | DES, AES
Asymmetric encryption | Also known as public-key encryption; encryption and decryption use different keys (a key pair) | Authentication | DSA, RSA
One-way encryption | Data can only be encrypted (hashed), not decrypted | Data integrity verification | MD5, the SHA family

A digest algorithm is also called a hash algorithm. It uses a function to convert data of any length into a fixed-length value, usually represented as a hexadecimal string. MD5 is a commonly used example.
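The fixed-length property can be checked with a minimal sketch. The input values below are illustrative assumptions, not taken from any particular application.

```python
import hashlib

# Inputs of very different lengths (illustrative values).
inputs = [b"", b"a", b"hello world", b"x" * 10_000]

# An MD5 digest is always 16 bytes, i.e. 32 hexadecimal characters,
# no matter how long the input is.
hex_digests = [hashlib.md5(data).hexdigest() for data in inputs]

for data, hex_digest in zip(inputs, hex_digests):
    print(len(data), len(hex_digest), hex_digest)
```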
Attributes and functions of the hashlib module

  • hashlib.new(name[, data])
    The general hash object constructor, used to construct a hash object for the specified hash algorithm. name specifies the name of the hash algorithm, such as md5 or sha1, and is not case-sensitive; data is optional and represents the initial data.

    >>> import hashlib
    >>> hashlib.new('md5') #Build a hash object
    <md5 HASH object @ 0x00000131AE948AD0>
  • hashlib.<algorithm name>()
    A hash object can also be obtained directly through the function named after the specific hash algorithm, such as hashlib.md5().
    hashlib.md5() and hashlib.new('md5') are equivalent.

    >>> import hashlib
    >>> hashlib.md5()
    <md5 HASH object @ 0x00000131AE9489E0>
  • hashlib.algorithms_guaranteed
    Its value is the set of names of hash algorithms supported by this module on all platforms.

    >>> import hashlib
    >>> hashlib.algorithms_guaranteed
    {'sha384', 'blake2b', 'sha3_384', 'shake_128', 'sha3_256', 'sha3_224', 'md5', 'sha3_512', 'blake2s', 'sha224', 'shake_256', 'sha256', 'sha512', 'sha1'}
  • hashlib.algorithms_available
    The set of names of hash algorithms available in the running Python interpreter. algorithms_guaranteed is always a subset of it.

    >>> import hashlib
    >>> hashlib.algorithms_available
    {'md4', 'sha3_384', 'shake_128', 'whirlpool', 'SHA224', 'blake2s', 'ripemd160', 'MD4', 'sha1', 'sha384', 'SHA384', 'ecdsa-with-SHA1', 'md5', 'SHA256', 'DSA-SHA', 'SHA1', 'sha3_512', 'shake_256', 'sha', 'sha256', 'sha512', 'DSA', 'RIPEMD160', 'blake2b', 'dsaEncryption', 'SHA512', 'sha3_224', 'sha224', 'SHA', 'sha3_256', 'MD5', 'dsaWithSHA'}
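
As a quick check of the module-level items described above, the following sketch compares the two ways of constructing a hash object and verifies the subset relationship. The sample data is an assumption chosen for illustration.

```python
import hashlib

data = b"hello world"

# hashlib.new() takes the algorithm name as a string; the initial data
# can be passed as the second argument.
via_new = hashlib.new("md5", data).hexdigest()

# The named constructor builds the same kind of hash object.
via_constructor = hashlib.md5(data).hexdigest()

same_digest = via_new == via_constructor

# Every guaranteed algorithm should also be listed as available.
is_subset = hashlib.algorithms_guaranteed <= hashlib.algorithms_available

print(same_digest, is_subset)
```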

Attributes and methods of hash objects

  • hash.update(data)
    Feeds data (a bytes-like object) to the hash object; repeated calls are cumulative.
    m.update(a); m.update(b) is equivalent to m.update(a+b).
  • hash.digest()
    Returns the digest of all data passed to update() so far, as a bytes object.
  • hash.hexdigest()
    Returns the digest of all data passed to update() so far, as a string of hexadecimal digits.
  • hash.copy()
    Returns a copy ("clone") of the hash object, which can be used to efficiently compute digests of data that share a common initial substring.
  • hash.digest_size
    The size of the resulting hash in bytes, that is, the length of the bytes object returned by hash.digest(). This value is fixed for a given algorithm: md5: 16, sha1: 20, sha224: 28.
  • hash.name
    The canonical (lowercase) name of the hash algorithm used by the hash object, which can be passed directly to hashlib.new() to create another hash object of the same type.
    Example of use:
    >>> import hashlib
    >>> m = hashlib.md5() #Create a hash object
    >>> m.update('hello world'.encode('utf-8')) #Update the calculation data of the hash object
    >>> print(m)
    <md5 HASH object @ 0x00000131AE9489E0>
    >>> print(m.hexdigest()) #Return the corresponding data digest information (hexadecimal)
    5eb63bbbe01eeed093cb22bb8f5acdc3
    >>> print(m.digest()) #Return the corresponding data digest information (binary)
    b'^\xb6;\xbb\xe0\x1e\xee\xd0\x93\xcb"\xbb\x8fZ\xcd\xc3'
    >>> print(m.name) #return the hash algorithm used
    md5
    >>> print(m.digest_size) #Return the digest size in bytes
    16
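
The cumulative behavior of update() and the purpose of copy() can be illustrated with a short sketch. The prefix and suffix values are made up for the example.

```python
import hashlib

# Two update() calls are equivalent to one call with the concatenated data.
m = hashlib.md5()
m.update(b"hello ")
m.update(b"world")
one_shot = hashlib.md5(b"hello world")
cumulative_ok = m.hexdigest() == one_shot.hexdigest()

# copy() lets several digests reuse the work already done on a shared prefix.
prefix = hashlib.sha256(b"common-prefix:")
a = prefix.copy()
a.update(b"first")
b = prefix.copy()
b.update(b"second")

copy_ok = a.hexdigest() == hashlib.sha256(b"common-prefix:first").hexdigest()
size_ok = len(a.digest()) == a.digest_size

print(cumulative_ok, copy_ok, size_ok, a.name)
```

The copies are independent: updating a does not affect b or prefix.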

Important points:

  1. In practice, the hexadecimal string is usually what you want, so use hash.hexdigest().
  2. Using the hashlib module is generally a three-step process: create a hash object with hashlib.md5(); feed it the data to be hashed with update(); and obtain the hexadecimal string for the data (that is, the digest) with hexdigest().
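
The three steps can be wrapped in a small function. This is only a sketch: hex_digest is a hypothetical helper name introduced here, not part of hashlib, and the chunked input is an assumption about how the data arrives.

```python
import hashlib


def hex_digest(chunks, algorithm="md5"):
    """Hypothetical helper: hash an iterable of bytes chunks, return the hex digest."""
    h = hashlib.new(algorithm)   # step 1: create the hash object
    for chunk in chunks:
        h.update(chunk)          # step 2: feed the data, piece by piece
    return h.hexdigest()         # step 3: read the hexadecimal digest


result = hex_digest([b"hello ", b"world"])
print(result)
```

Feeding the data in chunks gives the same digest as hashing it in one piece, which is why this pattern also suits large inputs read block by block.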

StringIO

StringIO is mainly used to read and write strings in an in-memory buffer. Its interface is the same as that of a file object, so essentially all file-related methods can be used.
For details on file methods, see the article on Python file operations.
Example:
    >>> from io import StringIO
    >>> s = StringIO() #Initialize StringIO object
    >>> s.write('hello world') #write string
    11
    >>> s.getvalue() #Get the string in the instance
    'hello world'

StringIO.getvalue() returns the entire string held in the StringIO instance.
To read strings from a StringIO, you can also use the file-like read, readline, and readlines methods.

>>> s = StringIO()
>>> s.write('hello world')
11
>>> s.write('\n new line')
10
>>> s.seek(0) #Go back to the beginning of the file
0
>>> for line in s:
...     print(line)
...
hello world

 new line
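
The other file-like read methods mentioned above work the same way. The following is a minimal sketch with made-up sample text, passed to the StringIO constructor as its initial value.

```python
from io import StringIO

# The initial value is given to the constructor; the position starts at 0.
s = StringIO("first line\nsecond line\nthird line\n")

first = s.readline()      # reads up to and including the first newline
rest = s.readlines()      # the remaining lines, as a list

s.seek(0)                 # rewind to the beginning
everything = s.read()     # the whole content from the current position

print(repr(first))
print(rest)
print(repr(everything))
```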

 

BytesIO

StringIO can only operate on str; to operate on binary data, use BytesIO.
BytesIO implements reading and writing bytes in memory.

>>> from io import BytesIO
>>> f = BytesIO()
>>> f.write('中文'.encode('utf-8'))
6
>>> print(f.getvalue())
b'\xe4\xb8\xad\xe6\x96\x87'
>>> f.seek(0) #Go back to the beginning of the file
0
>>> f.read() #read
b'\xe4\xb8\xad\xe6\x96\x87'
>>> '中文'.encode('utf-8') #Compare the content read with the content written
b'\xe4\xb8\xad\xe6\x96\x87'

 

Note that what is written here is not the str '中文' but its UTF-8 encoded bytes.
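
To make the str/bytes distinction concrete, here is a small sketch showing that BytesIO rejects a str and that the stored bytes decode back to the original text. The variable names are illustrative.

```python
from io import BytesIO

f = BytesIO()
written = f.write("中文".encode("utf-8"))   # write() returns the number of bytes written

# Writing a str directly is rejected: BytesIO only accepts bytes-like objects.
try:
    f.write("中文")
    rejected = False
except TypeError:
    rejected = True

raw = f.getvalue()            # the UTF-8 encoded bytes
text = raw.decode("utf-8")    # decoding restores the original str

print(written, rejected, raw, text)
```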

JSON

JSON (JavaScript Object Notation) is a lightweight data interchange format. It is most widely used as the data format for communication between web servers and clients in AJAX, and is now also commonly used in HTTP requests.

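  • Serialization and deserialization
    The process of converting an object into a data format that can be transmitted over a network or stored on a local disk (such as XML, JSON, or a specific custom format) is called serialization; the reverse is called deserialization.
    In Python's json module, serialization and deserialization are called encoding and decoding, respectively.
    encoding: converts a Python object into a JSON string.
    decoding: converts a JSON string into a Python object.
    The json module provides the following functions for serialization and deserialization:

    #Serialization: convert a Python object into a JSON string
    dumps(obj, *, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, cls=None, indent=None, separators=None, default=None, sort_keys=False, **kw)
    #Deserialization: convert a JSON string into a Python object
    loads(s, *, cls=None, object_hook=None, parse_float=None, parse_int=None, parse_constant=None, object_pairs_hook=None, **kw)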

The following functions save the serialized JSON data to a file, and read JSON data directly from a file to deserialize it:

#Serialization: convert a Python object into a JSON string and store it in a file
dump(obj, fp, *, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, cls=None, indent=None, separators=None, default=None, sort_keys=False, **kw)
#Deserialization: read the JSON string from the given file and convert it into a Python object
load(fp, *, cls=None, object_hook=None, parse_float=None, parse_int=None, parse_constant=None, object_pairs_hook=None, **kw)

 

Correspondence between JSON and Python data types

  1. Python to JSON
Python | JSON
dict | object
list, tuple | array
str | string
int, float, int- & float-derived Enums | number
True | true
False | false
None | null
  2. JSON to Python
JSON | Python
object | dict
array | list
string | str
number (int) | int
number (real) | float
true | True
false | False
null | None
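
The two tables are not exact mirrors of each other, which a round trip makes visible. In this sketch (with made-up sample data), a tuple is encoded as a JSON array and therefore comes back as a list.

```python
import json

original = {"point": (1, 2), "missing": None, "ok": True, "ratio": 1.5, "count": 3}

encoded = json.dumps(original)   # Python object -> JSON string
decoded = json.loads(encoded)    # JSON string -> Python object

print(encoded)
print(decoded)

# The tuple was written as a JSON array, so it is read back as a list.
print(type(original["point"]).__name__, "->", type(decoded["point"]).__name__)
```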

Serialization example

>>> import json
>>> json.dumps({'a':'str', 'c': True, 'e': 10, 'b': 11.1, 'd': None, 'f': [1, 2, 3], 'g':(4, 5, 6)})
'{"a": "str", "c": true, "e": 10, "b": 11.1, "d": null, "f": [1, 2, 3], "g": [4, 5, 6]}'

 

sort_keys parameter: specifies whether the dict's keys are sorted during serialization (by default, keys are written in the dict's own order).

>>> json.dumps({'a':'str', 'c': True, 'e': 10, 'b': 11.1, 'd': None, 'f': [1, 2, 3], 'g':(4, 5, 6)}, sort_keys=True)
'{"a": "str", "b": 11.1, "c": true, "d": null, "e": 10, "f": [1, 2, 3], "g": [4, 5, 6]}'

 

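indent parameter: controls indentation, which makes the stored data more elegant and readable. If indent is a non-negative integer or a string, JSON array elements and object members are pretty-printed with that indent level; if indent is 0, negative, or an empty string, only newlines are inserted, with no indentation.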

>>> print(json.dumps({'a':'str', 'c': True, 'e': 10, 'b': 11.1, 'd': None, 'f': [1, 2, 3], 'g':(4, 5, 6)}, sort_keys=True, indent=4))
{
    "a": "str",
    "b": 11.1,
    "c": true,
    "d": null,
    "e": 10,
    "f": [
        1,
        2,
        3
    ],
    "g": [
        4,
        5,
        6
    ]
}
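
The less common indent values described above can be compared side by side. This is a small sketch with made-up sample data; repr() is used so that the newlines and tabs are visible.

```python
import json

data = {"f": [1, 2]}

zero = json.dumps(data, indent=0)        # newlines only, no indentation
negative = json.dumps(data, indent=-1)   # treated the same way as indent=0
tabbed = json.dumps(data, indent="\t")   # a string is used as the indent unit

print(repr(zero))
print(repr(negative))
print(repr(tabbed))
```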

 

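separators parameter: although the indent parameter makes the stored data more elegant and readable, it does so by padding with redundant whitespace. When JSON is used for network communication, unnecessary data should be minimized to save bandwidth and speed up transfer. In the JSON string produced by the json module, the ',' and ':' separators are each followed by a space by default; the separators parameter lets us specify the separators ourselves and remove the unneeded whitespace.
The value of this parameter should be a tuple (item_separator, key_separator):

  • If indent is None, the default is (', ', ': ')
  • If indent is not None, the default is (',', ': ')
  • Passing separators=(',', ':') eliminates the whitespace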
    >>> json.dumps({'a':'str', 'c': True, 'e': 10, 'b': 11.1, 'd': None, 'f': [1, 2, 3], 'g':(4, 5, 6)})
    '{"a": "str", "c": true, "e": 10, "b": 11.1, "d": null, "f": [1, 2, 3], "g": [4, 5, 6]}'
    >>> json.dumps({'a':'str', 'c': True, 'e': 10, 'b': 11.1, 'd': None, 'f': [1, 2, 3], 'g':(4, 5, 6)}, separators=(',',':'))
    '{"a":"str","c":true,"e":10,"b":11.1,"d":null,"f":[1,2,3],"g":[4,5,6]}'

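ensure_ascii parameter: when this parameter is True (the default), all non-ASCII characters in the output (such as Chinese characters) are escaped as \uXXXX sequences, and the result is a str consisting entirely of ASCII characters. To get human-readable output, set ensure_ascii to False.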

>>> stu={"name": "小明", "age" : 16}
>>> stu_json = json.dumps(stu)
>>> print(stu_json)
{"name": "\u5c0f\u660e", "age": 16}
>>> stu_json01 = json.dumps(stu, ensure_ascii=False)
>>> print(stu_json01)
{"name": "小明", "age": 16}

 

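Note: \u5c0f\u660e is the escaped form of the Unicode characters' code points; the corresponding Python codec is named "unicode-escape". A Unicode string and its escape sequence can be converted into each other with unicodestr.encode('unicode-escape') and the matching .decode('unicode-escape').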
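The conversion in the note can be tried directly. This is a minimal sketch; the variable names are illustrative, and the last line checks that the JSON decoder understands the same escapes.

```python
import json

text = "小明"

escaped = text.encode("unicode-escape")        # bytes containing \uXXXX escapes
restored = escaped.decode("unicode-escape")    # back to the original str

# json.loads() interprets the same \uXXXX escapes inside a JSON string.
from_json = json.loads('"\\u5c0f\\u660e"')

print(escaped, restored, from_json)
```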
Deserialization example

>>> json.loads('{"a": "str", "c": true, "b": 11.1, "e": 10, "d": null, "g": [4, 5, 6], "f": [1, 2, 3]}')
{'a': 'str', 'c': True, 'b': 11.1, 'e': 10, 'd': None, 'g': [4, 5, 6], 'f': [1, 2, 3]}
>>> json.loads('{"a":"str","c":true,"b":11.1,"e":10,"d":null,"g":[4,5,6],"f":[1,2,3]}')
{'a': 'str', 'c': True, 'b': 11.1, 'e': 10, 'd': None, 'g': [4, 5, 6], 'f': [1, 2, 3]}

 

load() and dump()

>>> # Serialize to a file
>>> with open('test.json', 'w') as fp:
...     json.dump({'a':'str中国', 'c': True, 'e': 10, 'b': 11.1, 'd': None, 'f': [1, 2, 3], 'g':(4, 5, 6)}, fp, indent=4)
...
>>> # Deserialize the contents of the file
>>> with open('test.json', 'r') as fp:
...     json.load(fp)
...
{'a': 'str中国', 'c': True, 'e': 10, 'b': 11.1, 'd': None, 'f': [1, 2, 3], 'g': [4, 5, 6]}

 

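Note: calling dump() repeatedly with the same fp to serialize multiple objects produces an invalid JSON file. In other words, dump() should be called only once per fp.
For more details, see "JSON encoder and decoder" in the official Python documentation.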
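
The problem with repeated dump() calls can be reproduced in memory. The sketch below uses io.StringIO in place of a real file, and wrapping the objects in a list is only one possible workaround, shown here as an assumption about what the caller wants.

```python
import io
import json

# Two dump() calls on the same file object write two documents back to back.
buf = io.StringIO()
json.dump({"a": 1}, buf)
json.dump({"b": 2}, buf)
combined = buf.getvalue()

# The combined text is not a single valid JSON document.
try:
    json.loads(combined)
    valid = True
except json.JSONDecodeError:
    valid = False

# One workaround: collect the objects in a list and call dump() once.
buf2 = io.StringIO()
json.dump([{"a": 1}, {"b": 2}], buf2)
restored = json.loads(buf2.getvalue())

print(repr(combined), valid, restored)
```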
