hashlib
The hashlib module implements a common interface to many different secure hash and message digest algorithms. It provides constructor functions for algorithms such as md5, sha1, sha224, sha256, sha384, and sha512.
Here is a brief overview of several common data encryption methods:
Data encryption method | Description | Main problem solved | Common algorithms |
---|---|---|---|
Symmetric encryption | Encryption and decryption use the same key | Data confidentiality | DES, AES |
Asymmetric encryption | Also known as public-key encryption; encryption and decryption use different keys (a key pair) | Authentication | DSA, RSA |
One-way encryption | Data can only be encrypted, not decrypted | Data integrity verification | MD5, SHA family |
A digest algorithm, also called a hash algorithm, uses a function to convert data of any length into a fixed-length value (usually represented as a hexadecimal string). The md5 algorithm is commonly used.
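To make the fixed-length property concrete, here is a minimal sketch that hashes inputs of very different lengths with md5 and records the length of each hexadecimal digest. The sample inputs are arbitrary and chosen only for illustration.

```python
import hashlib

# Inputs of very different lengths (arbitrary sample data).
samples = [b"", b"a", b"hello world", b"x" * 10_000]

# An md5 digest is always 16 bytes, i.e. 32 hexadecimal characters,
# regardless of how long the input was.
hex_lengths = [len(hashlib.md5(data).hexdigest()) for data in samples]
print(hex_lengths)
```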
Attributes and functions of the hashlib module
- hashlib.new(name[, data])
  The general hash object constructor, used to construct a hash object for the specified hash algorithm. name specifies the name of the hash algorithm, such as md5 or sha1, and is not case-sensitive; data is optional and represents the initial data.

      >>> import hashlib
      >>> hashlib.new('md5')  # Build a hash object
      <md5 HASH object @ 0x00000131AE948AD0>
- hashlib.<algorithm name>()
  A hash object can also be obtained directly through the constructor function named after a specific algorithm, such as hashlib.md5(). hashlib.md5() and hashlib.new('md5') are equivalent.

      >>> import hashlib
      >>> hashlib.md5()
      <md5 HASH object @ 0x00000131AE9489E0>
- hashlib.algorithms_guaranteed
  The set of names of the hash algorithms that this module supports on all platforms.

      >>> import hashlib
      >>> hashlib.algorithms_guaranteed
      {'sha384', 'blake2b', 'sha3_384', 'shake_128', 'sha3_256', 'sha3_224', 'md5', 'sha3_512', 'blake2s', 'sha224', 'shake_256', 'sha256', 'sha512', 'sha1'}
- hashlib.algorithms_available
  The set of names of the hash algorithms available in the running Python interpreter. algorithms_guaranteed is always a subset of it.

      >>> import hashlib
      >>> hashlib.algorithms_available
      {'md4', 'sha3_384', 'shake_128', 'whirlpool', 'SHA224', 'blake2s', 'ripemd160', 'MD4', 'sha1', 'sha384', 'SHA384', 'ecdsa-with-SHA1', 'md5', 'SHA256', 'DSA-SHA', 'SHA1', 'sha3_512', 'shake_256', 'sha', 'sha256', 'sha512', 'DSA', 'RIPEMD160', 'blake2b', 'dsaEncryption', 'SHA512', 'sha3_224', 'sha224', 'SHA', 'sha3_256', 'MD5', 'dsaWithSHA'}
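The relationships described in the list above can be checked directly. The sketch below is only an illustration: it reaches the same md5 digest through new() with initial data, through the named constructor, and through a later update() call, and then tests the subset relation between the two algorithm sets. The variable names are made up for this example.

```python
import hashlib

data = b"hello world"

# Three routes to the same md5 digest.
via_new = hashlib.new("md5", data).hexdigest()   # initial data passed to new()
via_named = hashlib.md5(data).hexdigest()        # named constructor
h = hashlib.new("md5")
h.update(data)                                   # data supplied afterwards
via_update = h.hexdigest()

same_digest = via_new == via_named == via_update

# Every guaranteed algorithm should also be available in this interpreter.
is_subset = hashlib.algorithms_guaranteed <= hashlib.algorithms_available
print(same_digest, is_subset)
```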
Attributes and methods of the hash object
- hash.update(data)
  Feeds more data into the hash object. Repeated calls are cumulative: m.update(a); m.update(b) is equivalent to m.update(a+b).
- hash.digest()
  Returns the digest of all data passed to update() so far, as a bytes object.
- hash.hexdigest()
  Returns the digest of all data passed to update() so far, as a string of hexadecimal digits.
- hash.copy()
  Returns a copy ("clone") of the hash object, which can be used to efficiently compute the digests of data that share a common initial substring.
- hash.digest_size
  The size of the resulting hash in bytes, that is, the length of the value returned by hash.digest(). This value is fixed for a given algorithm, for example md5: 16, sha1: 20, sha224: 28.
- hash.name
  The canonical (lowercase) name of the hash algorithm of this hash object. It can be passed directly to hashlib.new() to create another hash object of the same type.
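The behaviour of these members can be sketched as follows. This is an illustrative example, not part of the original article: it uses sha256 and an invented prefix string to show that update() calls accumulate, that copy() lets two messages reuse the work done for a shared prefix, and how digest_size and name relate to the rest of the API.

```python
import hashlib

# update() calls accumulate: two calls equal one call on the concatenation.
m1 = hashlib.sha256()
m1.update(b"hello ")
m1.update(b"world")
m2 = hashlib.sha256(b"hello world")
cumulative_ok = m1.digest() == m2.digest()

# copy() lets two messages share the hashing work for a common prefix.
prefix = hashlib.sha256(b"common-prefix:")
a = prefix.copy()
a.update(b"first")
b = prefix.copy()
b.update(b"second")
copy_ok = a.hexdigest() == hashlib.sha256(b"common-prefix:first").hexdigest()

# digest_size is the length of digest(); name can be passed back to new().
size_ok = len(m1.digest()) == m1.digest_size
same_type = hashlib.new(m1.name).name == m1.name
print(cumulative_ok, copy_ok, size_ok, same_type)
```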
Example of use

    >>> import hashlib
    >>> m = hashlib.md5()  # Create a hash object
    >>> m.update('hello world'.encode('utf-8'))  # Feed data to the hash object
    >>> print(m)
    <md5 HASH object @ 0x00000131AE9489E0>
    >>> print(m.hexdigest())  # Digest as a hexadecimal string
    5eb63bbbe01eeed093cb22bb8f5acdc3
    >>> print(m.digest())  # Digest as bytes
    b'^\xb6;\xbb\xe0\x1e\xee\xd0\x93\xcb"\xbb\x8fZ\xcd\xc3'
    >>> print(m.name)  # Name of the hash algorithm in use
    md5
    >>> print(m.digest_size)  # Digest length in bytes
    16
Important points:
- In practice you usually want the hexadecimal string, so use hash.hexdigest().
- Using the hashlib module is generally a three-step process: create a hash object with hashlib.md5(), append the data to be hashed with update(), and obtain the hexadecimal digest of the data with hexdigest().
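The three steps can be wrapped in a small helper that hashes a binary stream chunk by chunk, which avoids loading a large file into memory at once. The function name hexdigest_of_stream, its parameters, and the chunk size are assumptions made for this sketch; io.BytesIO stands in for a file opened in binary mode.

```python
import hashlib
import io


def hexdigest_of_stream(stream, algorithm="md5", chunk_size=8192):
    """Hash a binary file-like object chunk by chunk (hypothetical helper)."""
    h = hashlib.new(algorithm)           # step 1: create the hash object
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:                    # empty bytes means end of stream
            break
        h.update(chunk)                  # step 2: feed the data incrementally
    return h.hexdigest()                 # step 3: get the hexadecimal digest


# BytesIO stands in for a file opened with open(path, 'rb').
digest = hexdigest_of_stream(io.BytesIO(b"hello world"))
print(digest)
```

Because update() is cumulative, the chunk size affects only memory use, not the resulting digest.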
StringIO
StringIO is mainly used to read and write strings in an in-memory buffer. Its interface is the same as that of a file object, so essentially all file-related methods can be used.
For details on the file methods, see the article on Python file operations.
Example

    >>> from io import StringIO
    >>> s = StringIO()  # Initialize a StringIO object
    >>> s.write('hello world')  # Write a string
    11
    >>> s.getvalue()  # Get the string held by the instance
    'hello world'
StringIO.getvalue() returns the entire string held by the StringIO instance.
To read from a StringIO, you can also use the file-like methods read, readline, readlines, and so on.
    >>> s = StringIO()
    >>> s.write('hello world')
    11
    >>> s.write('\n new line')
    10
    >>> s.seek(0)  # Go back to the beginning of the buffer
    0
    >>> for line in s:
    ...     print(line)
    ...
    hello world

     new line
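The file-like read methods mentioned above can be sketched as follows. The sample text is invented for the example; each call continues from the current position, so seek(0) is needed before reading the buffer again from the start.

```python
from io import StringIO

# A StringIO created with initial text starts at position 0.
s = StringIO("first line\nsecond line\nthird line\n")

head = s.read(5)              # read the first 5 characters
rest_of_line = s.readline()   # read up to and including the next newline
remaining = s.readlines()     # read all remaining lines into a list

s.seek(0)                     # rewind before reading again
everything = s.read()         # read the whole buffer
print(repr(head), repr(rest_of_line), remaining)
```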
BytesIO
StringIO can only operate on str. To operate on binary data, use BytesIO.
BytesIO implements reading and writing bytes in memory.
    >>> from io import BytesIO
    >>> f = BytesIO()
    >>> f.write('中文'.encode('utf-8'))
    6
    >>> print(f.getvalue())
    b'\xe4\xb8\xad\xe6\x96\x87'
    >>> f.seek(0)  # Go back to the beginning of the buffer
    0
    >>> f.read()  # Read the contents
    b'\xe4\xb8\xad\xe6\x96\x87'
    >>> '中文'.encode('utf-8')  # Compare what was read with what was written
    b'\xe4\xb8\xad\xe6\x96\x87'
Note that what is written here is not the str '中文' but its UTF-8 encoded bytes.
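The str/bytes split between the two classes can be demonstrated with a short sketch. It is only an illustration: each class raises TypeError when handed the other type, so text has to be encoded before it is written to a BytesIO and decoded after it is read back.

```python
from io import BytesIO, StringIO

text = "中文"

# StringIO accepts str only; writing bytes raises TypeError.
try:
    StringIO().write(text.encode("utf-8"))
    stringio_rejects_bytes = False
except TypeError:
    stringio_rejects_bytes = True

# BytesIO accepts bytes only; writing str raises TypeError.
try:
    BytesIO().write(text)
    bytesio_rejects_str = False
except TypeError:
    bytesio_rejects_str = True

# Round trip: encode before writing, decode after reading.
buf = BytesIO()
written = buf.write(text.encode("utf-8"))   # number of bytes written
round_trip = buf.getvalue().decode("utf-8")
print(stringio_rejects_bytes, bytesio_rejects_str, written, round_trip)
```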
JSON
JSON (JavaScript Object Notation) is a lightweight data interchange format. It is most widely used as the data format for communication between web servers and clients in AJAX, and is now also common in HTTP requests in general.
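- Serialization and deserialization
  Converting an object into a data format that can be transmitted over a network or stored on a local disk (such as XML, JSON, or a custom format) is called serialization; the reverse process is called deserialization.
  In Python's json module, serialization and deserialization are called encoding and decoding, respectively.
  encoding: converting a Python object to a JSON string.
  decoding: converting a JSON string to a Python object.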
The json module provides the following functions for serialization and deserialization:

    # Serialization: convert a Python object to a JSON string
    dumps(obj, *, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, cls=None, indent=None, separators=None, default=None, sort_keys=False, **kw)
    # Deserialization: convert a JSON string to a Python object
    # (the encoding argument was removed in Python 3.9)
    loads(s, *, encoding=None, cls=None, object_hook=None, parse_float=None, parse_int=None, parse_constant=None, object_pairs_hook=None, **kw)
The following functions save the serialized JSON data to a file and deserialize JSON data read directly from a file:

    # Serialization: convert a Python object to a JSON string and store it in a file
    dump(obj, fp, *, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, cls=None, indent=None, separators=None, default=None, sort_keys=False, **kw)
    # Deserialization: read the JSON string from the given file and convert it to a Python object
    load(fp, *, cls=None, object_hook=None, parse_float=None, parse_int=None, parse_constant=None, object_pairs_hook=None, **kw)
Mapping between JSON and Python data types
- Python to JSON
Python | JSON |
---|---|
dict | object |
list,tuple | array |
str | string |
int,float,int- & float-derived Enums | number |
True | true |
False | false |
None | null |
- JSON to Python
JSON | Python |
---|---|
object | dict |
array | list |
string | str |
number(int) | int |
number(real) | float |
true | True |
false | False |
null | None |
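Because the two tables are not exact mirrors of each other, a round trip through JSON does not always return an identical object. The sketch below, using a made-up sample dict, shows a tuple coming back as a list while int and float values keep their types.

```python
import json

# Sample data covering several rows of the mapping tables.
original = {"pair": (1, 2), "count": 3, "ratio": 0.5, "ok": True, "missing": None}

encoded = json.dumps(original)   # Python -> JSON string
decoded = json.loads(encoded)    # JSON string -> Python

# The tuple was written as a JSON array, which is read back as a list.
pair_type = type(decoded["pair"]).__name__
number_types = (type(decoded["count"]).__name__, type(decoded["ratio"]).__name__)
print(encoded)
print(pair_type, number_types)
```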
Serialization example

    >>> import json
    >>> json.dumps({'a':'str', 'c': True, 'e': 10, 'b': 11.1, 'd': None, 'f': [1, 2, 3], 'g':(4, 5, 6)})
    '{"a": "str", "c": true, "e": 10, "b": 11.1, "d": null, "f": [1, 2, 3], "g": [4, 5, 6]}'
The sort_keys parameter: whether to sort the dict keys during serialization. By default the keys are emitted in the dict's own iteration order.

    >>> json.dumps({'a':'str', 'c': True, 'e': 10, 'b': 11.1, 'd': None, 'f': [1, 2, 3], 'g':(4, 5, 6)}, sort_keys=True)
    '{"a": "str", "b": 11.1, "c": true, "d": null, "e": 10, "f": [1, 2, 3], "g": [4, 5, 6]}'
The indent parameter: sets the indentation, which makes the stored data more readable. If indent is a positive integer or a non-empty string, JSON array elements and object members are pretty-printed with that indent level; if indent is 0, negative, or the empty string, only newlines are inserted, with no indentation.

    >>> print(json.dumps({'a':'str', 'c': True, 'e': 10, 'b': 11.1, 'd': None, 'f': [1, 2, 3], 'g':(4, 5, 6)}, sort_keys=True, indent=4))
    {
        "a": "str",
        "b": 11.1,
        "c": true,
        "d": null,
        "e": 10,
        "f": [
            1,
            2,
            3
        ],
        "g": [
            4,
            5,
            6
        ]
    }
The separators parameter: although indent makes the stored data more readable, it does so by padding it with redundant whitespace. When JSON is used for network communication, unnecessary data should be kept to a minimum to save bandwidth and speed up transmission. In the JSON string that the json module produces, the ',' and ':' separators are each followed by a space by default; the separators parameter lets us specify the separators ourselves and so remove that whitespace.
The value should be a tuple (item_separator, key_separator):
- If indent is None, the default is (', ', ': ').
- If indent is not None, the default is (',', ': ').
- Passing separators=(',', ':') eliminates the whitespace.

    >>> json.dumps({'a':'str', 'c': True, 'e': 10, 'b': 11.1, 'd': None, 'f': [1, 2, 3], 'g':(4, 5, 6)})
    '{"a": "str", "c": true, "e": 10, "b": 11.1, "d": null, "f": [1, 2, 3], "g": [4, 5, 6]}'
    >>> json.dumps({'a':'str', 'c': True, 'e': 10, 'b': 11.1, 'd': None, 'f': [1, 2, 3], 'g':(4, 5, 6)}, separators=(',',':'))
    '{"a":"str","c":true,"e":10,"b":11.1,"d":null,"f":[1,2,3],"g":[4,5,6]}'
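To see how much the whitespace costs, the following sketch serializes one small, made-up dict three ways and compares the lengths of the results. It is an illustration of the trade-off only; actual savings depend on the data.

```python
import json

data = {"a": "str", "f": [1, 2, 3], "d": None}

default = json.dumps(data)                          # (', ', ': ') separators
pretty = json.dumps(data, indent=4)                 # newlines plus indentation
compact = json.dumps(data, separators=(",", ":"))   # no padding at all

sizes = {"default": len(default), "pretty": len(pretty), "compact": len(compact)}
print(compact)
print(sizes)
```

All three strings decode to the same Python object; only the amount of whitespace differs.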
The ensure_ascii parameter: when this parameter is True (the default), every non-ASCII character in the output (Chinese characters, for example) is escaped as a '\uXXXX' sequence, and the result is a str instance consisting entirely of ASCII characters. To get human-readable output, set ensure_ascii to False.

    >>> stu={"name": "小明", "age" : 16}
    >>> stu_json = json.dumps(stu)
    >>> print(stu_json)
    {"name": "\u5c0f\u660e", "age": 16}
    >>> stu_json01 = json.dumps(stu, ensure_ascii=False)
    >>> print(stu_json01)
    {"name": "小明", "age": 16}
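Note: \u5c0f\u660e is the escaped form of the characters' Unicode code points; Python's name for this encoding is "unicode-escape". You can convert between a Unicode string and its escaped sequence with unicodestr.encode('unicode-escape') and decode('unicode-escape').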
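The conversion described in the note can be sketched as follows, reusing the name 小明 from the example above. encode() produces ASCII-only bytes containing the escape sequences, and decode() on those bytes restores the original string.

```python
import json

name = "小明"

escaped = name.encode("unicode-escape")       # str -> ASCII-only bytes
restored = escaped.decode("unicode-escape")   # bytes -> original str

# The JSON decoder understands the same \uXXXX escapes inside a string.
from_json = json.loads('"' + escaped.decode("ascii") + '"')
print(escaped)
print(restored, from_json)
```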
Deserialization example

    >>> json.loads('{"a": "str", "c": true, "b": 11.1, "e": 10, "d": null, "g": [4, 5, 6], "f": [1, 2, 3]}')
    {'a': 'str', 'c': True, 'b': 11.1, 'e': 10, 'd': None, 'g': [4, 5, 6], 'f': [1, 2, 3]}
    >>> json.loads('{"a":"str","c":true,"b":11.1,"e":10,"d":null,"g":[4,5,6],"f":[1,2,3]}')
    {'a': 'str', 'c': True, 'b': 11.1, 'e': 10, 'd': None, 'g': [4, 5, 6], 'f': [1, 2, 3]}
load() and dump()

    # Serialize to a file
    >>> with open('test.json', 'w') as fp:
    ...     json.dump({'a':'str中国', 'c': True, 'e': 10, 'b': 11.1, 'd': None, 'f': [1, 2, 3], 'g':(4, 5, 6)}, fp, indent=4)
    ...
    # Deserialize the contents of the file
    >>> with open('test.json', 'r') as fp:
    ...     json.load(fp)
    ...
    {'a': 'str中国', 'c': True, 'e': 10, 'b': 11.1, 'd': None, 'f': [1, 2, 3], 'g': [4, 5, 6]}
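Note: calling dump() repeatedly with the same fp to serialize multiple objects produces an invalid JSON file. In other words, dump() should be called only once per file object.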
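The problem in the note can be reproduced in memory. In this sketch io.StringIO stands in for a file on disk and the records are invented: two dump() calls leave two JSON documents back to back, which the decoder rejects, while wrapping the objects in a list and calling dump() once is one possible workaround.

```python
import io
import json

records = [{"id": 1}, {"id": 2}]

# Two dump() calls on one file object write concatenated documents.
bad = io.StringIO()
for record in records:
    json.dump(record, bad)
try:
    json.loads(bad.getvalue())
    concatenated_is_valid = True
except json.JSONDecodeError:
    concatenated_is_valid = False

# Workaround: wrap the objects in a list and call dump() once.
good = io.StringIO()
json.dump(records, good)
good.seek(0)                 # rewind before reading back
loaded = json.load(good)
print(bad.getvalue(), concatenated_is_valid, loaded)
```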
For more details, see "JSON encoder and decoder" in the official Python documentation.