Python built-in modules (Ⅰ)
1. The serialization module (very important)
Serialization: the process of converting a data structure (list, dict, ...) into a string by a specific procedure.
The problem it solves: we need a special kind of string that can be converted to and from any data structure.
A serialization module converts a data structure into a particular sequence (a particular str or bytes) and can also convert it back (deserialization).
2. The json and pickle modules
The json module: json is the serialization format recognized by all languages, but it supports only a limited set of Python data structures (str, bool, dict, int, list (tuple), float, None).
The json module converts a qualifying data structure into a special string, and can also deserialize it to restore the original.
json provides two pairs of methods (four in total).
dumps / loads: mainly used for network transmission.
import json
dic = {'k1':'v1','k2':'v2','k3':'v3'}
str_dic = json.dumps(dic)  # serialize: convert a dict into a string
print(type(str_dic),str_dic)  # <class 'str'> {"k3": "v3", "k1": "v1", "k2": "v2"}
# Note: in the json string, the strings are written with double quotes ""
dic2 = json.loads(str_dic)  # deserialize: convert a dict-formatted string back into a dict
# Note: strings inside a dict-formatted string passed to json.loads must use double quotes ""
print(type(dic2),dic2)  # <class 'dict'> {'k1': 'v1', 'k2': 'v2', 'k3': 'v3'}

list_dic = [1,['a','b','c'],3,{'k1':'v1','k2':'v2'}]
str_dic = json.dumps(list_dic)  # nested data types can be handled as well
print(type(str_dic),str_dic)  # <class 'str'> [1, ["a", "b", "c"], 3, {"k1": "v1", "k2": "v2"}]
list_dic2 = json.loads(str_dic)
print(type(list_dic2),list_dic2)  # <class 'list'> [1, ['a', 'b', 'c'], 3, {'k1': 'v1', 'k2': 'v2'}]
dump / load: mainly used to write a single piece of data to a file and read it back.
import json
f = open('json_file.json','w')
dic = {'k1':'v1','k2':'v2','k3':'v3'}
json.dump(dic,f)  # dump takes a file handle and writes the dict to the file as a json string
f.close()
# A json file is just a file, one that specifically stores a json string.
f = open('json_file.json')
dic2 = json.load(f)  # load takes a file handle and converts the json string in the file back into a data structure
f.close()
print(type(dic2),dic2)
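json.dump also accepts optional parameters that control the output format. A small sketch, assuming a made-up file name json_file2.json, showing ensure_ascii (keeps non-ASCII characters readable in the file) and indent (pretty-printing):

```python
import json

dic = {'姓名': 'boy1', 'age': 18}

# ensure_ascii=False writes non-ASCII characters as-is instead of \u escapes;
# indent=4 pretty-prints the json with 4-space indentation.
with open('json_file2.json', 'w', encoding='utf-8') as f:
    json.dump(dic, f, ensure_ascii=False, indent=4)

with open('json_file2.json', encoding='utf-8') as f:
    print(json.load(f))  # the round trip restores the original dict
```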
Storing multiple json-serialized pieces of data in the same file:
dic1 = {'name':'boy1'}
dic2 = {'name':'boy2'}
dic3 = {'name':'boy3'}
f = open('序列化',encoding='utf-8',mode='a')
str1 = json.dumps(dic1)
f.write(str1+'\n')
str2 = json.dumps(dic2)
f.write(str2+'\n')
str3 = json.dumps(dic3)
f.write(str3+'\n')
f.close()
f = open('序列化',encoding='utf-8')
for line in f:
    print(json.loads(line))
The pickle module: pickle converts Python data structures and objects of any type into bytes, and can also deserialize them to restore the original. It can only be used within Python, but it supports all Python data types.
pickle also provides two pairs of methods (four in total).
dumps / loads: only for network transmission.
import pickle
dic = {'k1':'v1','k2':'v2','k3':'v3'}
str_dic = pickle.dumps(dic)
print(str_dic)  # bytes
dic2 = pickle.loads(str_dic)
print(dic2)  # dict
# Objects can be serialized as well
import pickle
def func():
    print(666)
ret = pickle.dumps(func)
print(ret,type(ret))  # b'\x80\x03c__main__\nfunc\nq\x00.' <class 'bytes'>
f1 = pickle.loads(ret)  # f1 is a reference to the func function
f1()  # call func
dump / load: only for writing to and reading from files; the file must be opened in binary mode (wb / rb / ab).
dic = {(1,2):'oldboy',1:True,'set':{1,2,3}}
f = open('pick序列化',mode='wb')
pickle.dump(dic,f)
f.close()
with open('pick序列化',mode='wb') as f1:
    pickle.dump(dic,f1)
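The file written above can be read back with pickle.load. A minimal round-trip sketch, using a dict with a tuple key and a set value that json could not handle:

```python
import pickle

dic = {(1, 2): 'oldboy', 1: True, 'set': {1, 2, 3}}  # keys/values json cannot serialize

# Write with 'wb' and read back with 'rb': pickle produces bytes, not text.
with open('pick序列化', mode='wb') as f:
    pickle.dump(dic, f)

with open('pick序列化', mode='rb') as f:
    dic2 = pickle.load(f)

print(dic2 == dic)  # True: the round trip restores the original object
```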
Storing multiple pickle-serialized pieces of data in the same file:
dic1 = {'name':'oldboy1'}
dic2 = {'name':'oldboy2'}
dic3 = {'name':'oldboy3'}
f = open('pick多数据',mode='wb')
pickle.dump(dic1,f)
pickle.dump(dic2,f)
pickle.dump(dic3,f)
f.close()
f = open('pick多数据',mode='rb')
while True:
    try:
        print(pickle.load(f))
    except EOFError:
        break
f.close()
3. The os module
Directory: folder.
Working directory, current directory, parent directory.
Worth learning on its own: __file__ dynamically obtains the absolute path of the current file.
The current working directory is the working path from which this Python file is executed.
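A minimal sketch of using __file__ (run as a script file; __file__ is not defined in the interactive interpreter):

```python
import os

# __file__ is the path of the current script; abspath normalizes it to an absolute path.
current_file = os.path.abspath(__file__)
current_dir = os.path.dirname(current_file)
print(current_file)
print(current_dir)
```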
os.getcwd()          Get the current working directory, i.e. the directory the Python script is working in ***
os.chdir("dirname")  Change the current script's working directory; equivalent to cd in the shell **
os.curdir            Return the current directory: '.' **
os.pardir            Return the parent-directory string of the current directory: '..' **
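A short sketch exercising the working-directory functions above (what os.getcwd() prints depends on where you run it):

```python
import os

cwd = os.getcwd()            # current working directory
print(cwd)
os.chdir(os.pardir)          # move to the parent directory, like `cd ..`
print(os.getcwd())
os.chdir(cwd)                # change back to where we started
print(os.curdir, os.pardir)  # . ..
```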
Folder-related:
os.makedirs('dirname1/dirname2')  Create multi-level (recursive) directories in one call ***
os.removedirs('dirname1')         If the directory is empty, delete it and recurse to its parent; if that is also empty, delete it too, and so on ***
os.mkdir("abc")                   Create a single-level directory ***
os.rmdir("abc")                   Delete a single-level directory; if it contains anything, it will not be deleted ***
os.listdir(r"D:\s123")            List all files and subdirectories in the given directory, including hidden files, returned as a list **
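A sketch exercising the directory functions (the directory names dir1/dir2 are made up for illustration; the code cleans up after itself):

```python
import os

# makedirs creates nested directories in one call;
# exist_ok=True avoids an error if they already exist.
os.makedirs('dir1/dir2', exist_ok=True)
print(os.path.isdir('dir1/dir2'))  # True
print(os.listdir('dir1'))          # ['dir2']

# removedirs deletes dir2, then the now-empty dir1.
os.removedirs('dir1/dir2')
print(os.path.exists('dir1'))      # False
```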
Path-related ***
os.path.abspath(path)       Return the normalized absolute path of path ***
os.path.split(path)         Split path into a (directory, filename) two-tuple ***
os.path.dirname(path)       Return the directory part of path, i.e. the first element of os.path.split(path) **
os.path.basename(path)      Return the final file name of path. If path ends with / or \, an empty string is returned, i.e. the second element of os.path.split(path) **
os.path.exists(path)        Return True if path exists, False otherwise ***
os.path.isabs(path)         Return True if path is an absolute path **
os.path.isfile(path)        Return True if path is an existing file, False otherwise ***
os.path.isdir(path)         Return True if path is an existing directory, False otherwise ***
os.path.join(path1[, path2[, ...]])  Join multiple path components and return the result; components before the last absolute path are ignored ***
os.path.getatime(path)      Return the last access time of the file or directory path points to **
os.path.getmtime(path)      Return the last modification time of the file or directory path points to **
os.path.getsize(path)       Return the size of path ***
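A sketch of the most common os.path helpers (the path components are made up; the separator in the printed results is / on POSIX, \ on Windows):

```python
import os

p = os.path.join('home', 'user', 'demo.txt')  # join components with the OS separator
print(os.path.split(p))     # (directory, filename) two-tuple
print(os.path.dirname(p))   # the directory part
print(os.path.basename(p))  # the final file name: demo.txt
print(os.path.isabs(p))     # False: this is a relative path
```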
Note: os.stat('path/filename') retrieves information about a file or directory as a stat structure.
stat structure:
st_mode:  inode protection mode
st_ino:   inode number
st_dev:   device the inode resides on
st_nlink: number of links to the inode
st_uid:   user ID of the owner
st_gid:   group ID of the owner
st_size:  size in bytes of a plain file; amount of data waiting on some special files
st_atime: time of last access
st_mtime: time of last modification
st_ctime: the "ctime" as reported by the operating system. On some systems (like Unix) it is the time of the most recent metadata change; on others (like Windows) it is the creation time (see your platform's documentation for details)
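A sketch reading a few stat fields from a freshly created file (the file name stat_demo.txt is made up; the code cleans up after itself):

```python
import os

with open('stat_demo.txt', 'w') as f:
    f.write('hello')

info = os.stat('stat_demo.txt')
print(info.st_size)   # 5: size in bytes, same value as os.path.getsize
print(info.st_mtime)  # last modification time, in seconds since the epoch

os.remove('stat_demo.txt')
```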
4. The sys module
The sys module is the interface for interacting with the Python interpreter.
sys.argv      List of command-line arguments; the first element is the path of the program itself
sys.exit(n)   Exit the program; exit(0) for a normal exit, sys.exit(1) for an error exit
sys.version   Get the version information of the Python interpreter
sys.path      Return the module search path, initialized from the PYTHONPATH environment variable ***
sys.platform  Return the name of the operating system platform
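A quick sketch printing the sys attributes above (the exact values depend on your interpreter and platform):

```python
import sys

print(sys.argv)      # list of command-line arguments; argv[0] is the script path
print(sys.version)   # interpreter version string
print(sys.platform)  # e.g. 'linux', 'win32', 'darwin'
print(sys.path[:3])  # first few entries of the module search path
```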
5. The hashlib module
This module implements what are called digest algorithms, also known as encryption algorithms, hashing algorithms, hash algorithms, and so on: a function that converts data of arbitrary length into a fixed-length string according to certain rules (usually represented as a hexadecimal string). hashlib is a collection of such algorithms.
Key points about hashlib:
- bytes data ---> hashlib algorithm ---> fixed-length string
- Different bytes data always produce different results.
- The same bytes data always produce the same result.
- The conversion process is irreversible.
hashlib has two main uses:
Password hashing
Take the most common digest algorithm, MD5, as an example, and compute the MD5 value of a string:
import hashlib
md5 = hashlib.md5()
md5.update('123456'.encode('utf-8'))
print(md5.hexdigest())
# Result:
# 'e10adc3949ba59abbe56e057f20f883e'

# Verify: the same bytes data always produce the same result
import hashlib
md5 = hashlib.md5()
md5.update('123456'.encode('utf-8'))
print(md5.hexdigest())
# Result:
# 'e10adc3949ba59abbe56e057f20f883e'

# Verify: different bytes data always produce different results
import hashlib
md5 = hashlib.md5()
md5.update('12345'.encode('utf-8'))
print(md5.hexdigest())
# Result:
# '827ccb0eea8a706c4c34a16891f84e7b'
Salted hashing
Fixed salt
ret = hashlib.md5('xx流派'.encode('utf-8'))  # 'xx流派' is the fixed salt
ret.update('a'.encode('utf-8'))
print(ret.hexdigest())
Dynamic salt
username = '九阴真经'
ret = hashlib.md5(username[::2].encode('utf-8'))  # a per-account salt: each account's salt is different
ret.update('a'.encode('utf-8'))
print(ret.hexdigest())
But for businesses with higher security requirements, such as the financial industry, MD5 is not secure enough; stronger hashing may be needed, such as the sha series: sha1, sha224, sha512, and so on. The larger the number, the more complex the algorithm and the higher the security, but also the slower it runs.
# sha series: high security, higher time cost
ret = hashlib.sha1()
ret.update('jiuyinzhenjing'.encode('utf-8'))
print(ret.hexdigest())
# A salt can be added as well
ret = hashlib.sha384(b'asfdsa')
ret.update('jiuyinzhenjing'.encode('utf-8'))
print(ret.hexdigest())
# A dynamic salt also works
ret = hashlib.sha384(b'asfdsa'[::2])
ret.update('jiuyinzhenjing'.encode('utf-8'))
print(ret.hexdigest())
File consistency check
Note that in Linux everything is a file: ordinary files, videos, audio, images, and applications are all files.
Our online world is not very safe; we often run into viruses and trojans. MD5 computes its digest over bytes data: the same bytes hashed with the same algorithm always produce the same result, while different bytes (even if just a single space is removed) produce a different result. So hashlib is also an important tool for file consistency verification.
Using what we have learned about functions, a file digest check can be written as follows:
import hashlib

def file_check(file_path):
    with open(file_path,mode='rb') as f1:
        sha256 = hashlib.sha256()
        while 1:
            content = f1.read(1024)
            if content:
                sha256.update(content)
            else:
                return sha256.hexdigest()

print(file_check('pycharm-professional-2019.1.1.exe'))
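The same chunked-digest idea can verify that two files match: identical bytes always give identical digests. A self-contained sketch with made-up file names a.txt / b.txt (file_check is restated here so the example runs on its own):

```python
import hashlib

def file_check(file_path):
    # Read the file in 1024-byte chunks so large files need not fit in memory.
    with open(file_path, mode='rb') as f1:
        sha256 = hashlib.sha256()
        while True:
            content = f1.read(1024)
            if not content:
                return sha256.hexdigest()
            sha256.update(content)

# Two files with the same bytes hash to the same digest.
with open('a.txt', 'wb') as f:
    f.write(b'same content')
with open('b.txt', 'wb') as f:
    f.write(b'same content')
print(file_check('a.txt') == file_check('b.txt'))  # True
```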