Python's Common Built-in Modules in Detail (6)

Python was created by Guido van Rossum. During Christmas 1989, to pass the time in Amsterdam, he decided to develop a new script interpreter as a successor to the ABC language. Python is free software: the source code and the CPython interpreter are released under the Python Software Foundation License, which is compatible with the GPL (GNU General Public License). Python's philosophy advocates being "elegant", "clear", and "simple": solve a problem in the simplest, most elegant, and most obvious way.



OS base module

The os module provides most of the operating system interface functions. Once imported, it adapts to the operating system platform and performs the corresponding operation for each platform. Working with files and directories is routine in Python programming, so the os module is hard to avoid and is one of the most commonly used modules in development. This section walks through it in detail; first, a look at its commonly used functions and attributes.

import os

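os.getcwd()                 # get the current working directory, i.e. the directory the Python script runs in
os.chdir("dirname")         # change the current working directory, like cd in a shell
os.curdir                   # string for the current directory: '.'
os.pardir                   # string for the parent directory: '..'
os.makedirs('dir1/dir2')    # create nested directories recursively, here ./dir1/dir2
os.removedirs('dirname')    # remove the directory if empty, then repeat upward through each empty parent
os.mkdir('dirname')         # create a single new directory
os.rmdir('dirname')         # remove an empty directory; raises an error if it is not empty
os.listdir('dirname')       # return a list of all files and subdirectories, including hidden files
os.walk('dirname')          # walk the directory tree, including subdirectories
os.remove('filename')       # delete a file
os.rename("oldname","new")  # rename a file or directory
os.stat('path/filename')    # get information about a file or directory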
#-----------------------------------------------------------------------------------
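os.sep                      # platform path separator: "\\" on Windows, "/" on Linux
os.name                     # string naming the current platform: 'nt' on Windows, 'posix' on Linux
os.linesep                  # platform line terminator: "\r\n" on Windows, "\n" on Linux
os.pathsep                  # separator between entries in a search path such as PATH: ";" on Windows, ":" on Linux
os.system("shell")          # run a shell command; output goes straight to the terminal and is not captured
os.popen("shell").read()    # run a shell command and capture its output
os.environ                  # get the system environment variables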
#-----------------------------------------------------------------------------------
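os.path.abspath(path)       # return the normalized absolute path of path
os.path.split(path)         # split path into a (directory, file name) tuple
os.path.dirname(path)       # return the directory of path, i.e. the first element of os.path.split(path)
os.path.basename(path)      # return the final file name of path; empty string if path ends with / or \
os.path.exists(path)        # True if path exists, False otherwise
os.path.isabs(path)         # True if path is an absolute path
os.path.isfile(path)        # True if path is an existing file, False otherwise
os.path.isdir(path)         # True if path is an existing directory, False otherwise
os.path.join(path1, path2)  # join path components; if a component is absolute, all previous components are discarded
os.path.getatime(path)      # return the last access time of the file or directory
os.path.getmtime(path)      # return the last modification time of the file or directory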
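
As a rough illustration of how several of these calls combine, the sketch below works inside a temporary directory. The tempfile module and the names demo_dir and note.txt are assumptions made for the example; they do not come from the listing above.

```python
import os
import tempfile

# Work inside a throwaway directory so nothing else on disk is touched.
with tempfile.TemporaryDirectory() as base:
    demo_dir = os.path.join(base, "dir1", "dir2")    # hypothetical nested path
    os.makedirs(demo_dir)                             # creates dir1 and dir2 in one call

    file_path = os.path.join(demo_dir, "note.txt")   # hypothetical file name
    with open(file_path, "w") as f:
        f.write("hello")

    head, tail = os.path.split(file_path)             # (directory, file name)
    is_file = os.path.isfile(file_path)
    is_dir = os.path.isdir(head)

    # os.walk yields (dirpath, dirnames, filenames) for every directory in the tree.
    found = []
    for dirpath, dirnames, filenames in os.walk(base):
        for name in filenames:
            found.append(os.path.relpath(os.path.join(dirpath, name), base))

print(tail, is_file, is_dir, found)
```

Building paths with os.path.join rather than string concatenation keeps the same code working with either path separator.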


SYS module

The sys module provides access to variables used or maintained by the Python interpreter and to functions that interact with the interpreter. Put simply, sys is the module through which a program interacts with the Python interpreter: it offers a set of functions and variables for controlling the Python runtime environment. It is built into the interpreter and is always available.

import sys

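sys.argv              # list of command-line arguments; the first element is the path of the program itself
sys.exit(n)           # exit the program; use exit(0) for a normal exit
sys.version           # version information of the Python interpreter
sys.path              # module search path, initialized from the PYTHONPATH environment variable
sys.platform          # name of the operating system platform
sys.stdin             # standard input
sys.stdout            # standard output
sys.stderr            # standard error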

Reading command-line arguments: sys.argv is the list of command-line arguments. Its first element is the path of the program itself, and the remaining elements can be traversed to read the arguments passed in.

import sys

for x in sys.argv:
    print(x)

Determining the platform: the sys.platform attribute identifies the platform the program is running on.

>>> import sys
>>>
>>> sys.platform
'win32'

Getting the module search path: the sys.path list holds the directories Python searches for modules, and it can be indexed or traversed.

>>> sys.path[0]
''
>>> sys.path[1]
'C:\\Users\\LyShark\\AppData\\Local\\Programs\\Python\\Python37\\python37.zip'
>>> sys.path[2]
'C:\\Users\\LyShark\\AppData\\Local\\Programs\\Python\\Python37\\DLLs'
>>> sys.path[3]
'C:\\Users\\LyShark\\AppData\\Local\\Programs\\Python\\Python37\\lib'

Dynamic progress bar: a small example that uses standard output to draw a dynamic progress bar.

import sys
import time

def view_bar(num,total):
    rate = num / total
    rate_num = int(rate * 100)
    r = '\r%s%d%%' % (">"*num,rate_num)
    sys.stdout.write(r)
    sys.stdout.flush()

if __name__ == '__main__':

    for i in range(0, 100):
        time.sleep(0.1)
        view_bar(i, 100)


Hashlib module

Python's hashlib module provides a common interface to many secure hash and message digest algorithms, including the FIPS secure hash algorithms SHA1, SHA224, SHA256, SHA384, and SHA512, as well as RSA's MD5 algorithm. These are one-way hash functions rather than encryption. "Secure hash" and "message digest" are interchangeable terms: older algorithms were called message digests, and the modern term is secure hash.

MD5: the MD5 message digest algorithm is a widely used cryptographic hash function that produces a 128-bit hash value.

import hashlib

# ######## md5 ########
hash = hashlib.md5()
# help(hash.update)
hash.update(bytes('admin', encoding='utf-8'))
print(hash.hexdigest())
print(hash.digest())

SHA1: SHA (Secure Hash Algorithm) is mainly used with the Digital Signature Algorithm (DSA). SHA1 produces a 160-bit message digest and is no longer considered secure.

import hashlib

######## sha1 ########
hash = hashlib.sha1()
hash.update(bytes('admin', encoding='utf-8'))
print(hash.hexdigest())

SHA256: SHA (Secure Hash Algorithm) is mainly used with the Digital Signature Algorithm (DSA). The SHA256 algorithm produces a 256-bit hash value.

import hashlib

# ######## sha256 ########
hash = hashlib.sha256()
hash.update(bytes('admin', encoding='utf-8'))
print(hash.hexdigest())

SHA384: SHA (Secure Hash Algorithm) is mainly used with the Digital Signature Algorithm (DSA). The SHA384 algorithm produces a 384-bit hash value.

import hashlib

# ######## sha384 ########
hash = hashlib.sha384()
hash.update(bytes('admin', encoding='utf-8'))
print(hash.hexdigest())

SHA512: SHA (Secure Hash Algorithm) is mainly used with the Digital Signature Algorithm (DSA). The SHA512 algorithm produces a 512-bit hash value.

import hashlib

# ######## sha512 ########
hash = hashlib.sha512()
hash.update(bytes('admin', encoding='utf-8'))
print(hash.hexdigest())

Salted MD5: hashes of common values can be cracked by lookup against precomputed tables, so it is worth mixing a custom key (a salt) into the hash.

import hashlib

# ######## md5 ########
hash = hashlib.md5(bytes('898oaFs09f',encoding="utf-8"))
hash.update(bytes('admin',encoding="utf-8"))
print(hash.hexdigest())
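
To make the effect of the salt concrete, here is a minimal sketch. It assumes the same illustrative salt and password as above and shows that data passed to the constructor is simply hashed before any later update() call, so the salted digest differs from the plain one.

```python
import hashlib

salt = b"898oaFs09f"     # illustrative salt, same value as in the example above
password = b"admin"

# Passing data to the constructor is equivalent to calling update() with it first.
h1 = hashlib.md5(salt)
h1.update(password)

# Hashing the concatenation in one go therefore gives the same digest.
h2 = hashlib.md5(salt + password)

same = h1.hexdigest() == h2.hexdigest()
unsalted = hashlib.md5(password).hexdigest()

print(same)                        # salted digests agree with each other
print(h1.hexdigest() != unsalted)  # and differ from the unsalted digest
```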

Computing a file's hash: comparing the hash values of two files shows whether a file has been modified.

import hashlib
m = hashlib.md5()
with open(r'C:/lyshark.png','rb') as f:
    for line in f:
        m.update(line)
print(m.hexdigest())

import hashlib
m = hashlib.md5()
with open(r'D:/lyshark.png','rb') as f:
    for line in f:
        m.update(line)
print(m.hexdigest()) 
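
The two listings above can be folded into one helper. The sketch below is only one possible shape: the function name file_md5, the chunk size, and the temporary files are assumptions for illustration. It reads in fixed-size chunks, which suits binary files better than iterating by line.

```python
import hashlib
import os
import tempfile

def file_md5(path, chunk_size=65536):
    """Return the MD5 hex digest of a file, reading it in fixed-size chunks."""
    m = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read() until it returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            m.update(chunk)
    return m.hexdigest()

with tempfile.TemporaryDirectory() as d:
    a = os.path.join(d, "a.bin")   # hypothetical file names
    b = os.path.join(d, "b.bin")
    for p in (a, b):
        with open(p, "wb") as f:
            f.write(b"same content")
    unchanged = file_md5(a) == file_md5(b)

    with open(b, "ab") as f:       # modify the second file
        f.write(b"!")
    modified = file_md5(a) != file_md5(b)

print(unchanged, modified)
```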


Random module

The random module implements pseudo-random number generators. It offers functions for picking an integer uniformly from a range, picking a random element from a sequence, shuffling a list in place, and random sampling without replacement. The most commonly used functions are listed below.

import random

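random.shuffle(lists)                      # shuffle the list elements in place
random.randint(1,20)                       # random integer from 1 to 20, including 20
random.uniform(10,20)                      # random float between 10 and 20
random.randrange(1,10)                     # random integer from 1 to 10, excluding 10
random.choice(lists)                       # pick a random element from a sequence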

Generating random numbers: random.randint() generates a random integer, which chr() can turn into a random lowercase letter, uppercase letter, or digit.

>>> import random
>>>
>>> random.randint(1,10)
6
>>> random.randint(100,9999)
1189
>>> chr(random.randint(97,122))    # random character a-z
>>> chr(random.randint(65,90))     # random character A-Z
>>> chr(random.randint(48,57))     # random character 0-9

Shuffling data: random.shuffle() shuffles a list in place.

>>> import random
>>>
>>> lists = [1,2,3,4,5,6,7,8,9]
>>> print(lists)
[1, 2, 3, 4, 5, 6, 7, 8, 9]
>>>
>>> random.shuffle(lists)
>>> print(lists)
[4, 7, 1, 8, 3, 9, 5, 6, 2]

Picking a random element: random.choice() returns a random element from the given list.

>>> import random
>>>
>>> lists=[1,2,3,4,5,6,7,8,9]
>>> string=["admin","guest","lyshark"]
>>>
>>> random.choice(lists)
2
>>> random.choice(lists)
5
>>>
>>> random.choice(string)
'lyshark'
>>> random.choice(string)
'guest'

Random verification code: combining the random functions with a loop and an if statement generates a random verification code.

import random

li = []
for i in range(6):
    r = random.randint(0, 4)
    if r == 2 or r == 4:
        num = random.randrange(0, 10)
        li.append(str(num))
    else:
        temp = random.randrange(65,91)
        c = chr(temp)
        li.append(c)

result = "".join(li)
print(result)
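
A more compact variant is sketched below. It is not the same algorithm as the loop above: it draws every character uniformly from one alphabet instead of choosing between digits and letters first. The function name make_code and the use of the string module are assumptions for illustration.

```python
import random
import string

def make_code(length=6):
    """Build a code of uppercase letters and digits, one random character at a time."""
    alphabet = string.ascii_uppercase + string.digits
    return "".join(random.choice(alphabet) for _ in range(length))

code = make_code()
print(code)
```

The random module is a pseudo-random generator, so codes that must resist guessing are better drawn from the standard library's secrets module.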


Time module

The time module is implemented by calling the C library, and most of its interface matches the C standard library's time.h. Although the module is always available, not all of its functions are available on all platforms. Most functions defined here call the platform C library function of the same name, so their semantics can vary by platform.

import time

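time.sleep(4)                                    # pause execution for 4 seconds
time.clock()                                     # processor time; removed in Python 3.8, use time.process_time() instead
time.process_time()                              # return the processor time
time.time()                                      # return the current timestamp
time.ctime()                                     # current time as a formatted string
time.ctime(time.time()-86640)                    # convert a timestamp to a string
time.gmtime()                                    # current time as a struct_time (UTC)
time.gmtime(time.time()-86640)                   # convert a timestamp to a struct_time (UTC)
time.localtime(time.time()-86640)                # convert a timestamp to a struct_time in local time
time.mktime(time.localtime())                    # inverse of localtime(): convert a struct_time to a timestamp
time.strftime("%Y-%m-%d %H:%M:%S",time.gmtime()) # convert a struct_time to a string in the given format
time.strptime("2019-09-20","%Y-%m-%d")           # convert a string to a struct_time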
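
The conversions above chain together into a round trip. The sketch below uses a fixed timestamp so the result is reproducible, and it relies on calendar.timegm(), which is not mentioned above, as the UTC counterpart of time.mktime().

```python
import calendar
import time

stamp = 1568937600                                    # 2019-09-20 00:00:00 UTC

st = time.gmtime(stamp)                               # timestamp -> struct_time (UTC)
text = time.strftime("%Y-%m-%d %H:%M:%S", st)         # struct_time -> string
parsed = time.strptime(text, "%Y-%m-%d %H:%M:%S")     # string -> struct_time
back = calendar.timegm(parsed)                        # struct_time (UTC) -> timestamp

print(text, back == stamp)
```

time.mktime() would interpret the struct_time as local time, which is why the UTC round trip uses calendar.timegm() here.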


Datetime module

The datetime module provides classes for working with dates and times in both simple and complex ways. Date and time arithmetic is supported, but the implementation focuses on efficient attribute extraction for output formatting and manipulation.

import datetime
import time

datetime.date.today()                             # today's date
datetime.datetime.now()                           # the current date and time
datetime.datetime.now().timetuple()               # the current time as a struct_time
datetime.date.fromtimestamp(time.time()-864400)   # convert a timestamp to a date
#-----------------------------------------------------------------------------------
temp = datetime.datetime.now()                    # store the current time in a variable
temp.replace(2019,10,10)                          # return a copy with year, month, day replaced by 2019-10-10
#-----------------------------------------------------------------------------------
# keyword arguments accepted by replace(): year, month, day, hour, minute, second, microsecond, tzinfo
str_to_date = datetime.datetime.strptime("19/10/05 12:30", "%y/%m/%d %H:%M") # convert a string to a datetime
new_date = datetime.datetime.now() + datetime.timedelta(days=10)             # add 10 days to the current time
new_date = datetime.datetime.now() + datetime.timedelta(days=-10)            # subtract 10 days from the current time
new_date = datetime.datetime.now() + datetime.timedelta(hours=-10)           # subtract 10 hours from the current time
new_date = datetime.datetime.now() + datetime.timedelta(seconds=120)         # add 120 seconds to the current time
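
Because now() changes on every run, the same arithmetic is easier to check against a fixed starting point. The sketch below assumes an arbitrary base date and shows that timedelta arithmetic and replace() both return new objects and leave the original unchanged.

```python
import datetime

base = datetime.datetime(2019, 10, 5, 12, 30)        # arbitrary fixed starting point

later = base + datetime.timedelta(days=10)            # 10 days later
earlier = base - datetime.timedelta(hours=10)         # 10 hours earlier
moved = base.replace(year=2020, day=1)                # copy with some fields replaced

print(later.strftime("%Y-%m-%d %H:%M"))
print(earlier.isoformat())
print(moved.date(), base.date())
```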


Shutil module

The shutil module offers many high-level operations on files and collections of files, in particular copying and deleting, along with archiving and compression. It is part of Python's standard library.

File copy (1): copy the contents of /etc/passwd into /tmp/passwd using file objects.

>>> import shutil
>>>
>>> shutil.copyfileobj(open("/etc/passwd","r"),open("/tmp/passwd","w"))

File copy (2): copy the contents of /etc/passwd to /tmp/passwd; the target file does not need to exist.

>>> import shutil
>>>
>>> shutil.copyfile("/etc/passwd","/tmp/passwd")

Recursive copy: recursively copy all files under /etc to a target directory, which must not already exist. The ignore argument excludes files matching the given patterns.

>>> import shutil
>>>
>>> shutil.copytree("/etc","/tmp/etc", ignore=shutil.ignore_patterns('*.conf', 'tmp*'))

Recursive delete: recursively delete the /etc directory and everything in it.

>>> import shutil
>>>
>>> shutil.rmtree("/etc")

Moving files: move a file, or rename it.

>>> import shutil
>>>
>>> shutil.move("file1","file2")

Archiving: pack the files under /etc/ into an archive placed in the /home/ directory.

>>> import shutil
>>>
>>> ret = shutil.make_archive("/home/etc","gztar",root_dir='/etc/')

ZIP compression: the zipfile module compresses the specified files into an archive and extracts them again.

>>> import zipfile
>>>
# compress
>>> z = zipfile.ZipFile('lyshark.zip', 'w')
>>> z.write('lyshark.log')
>>> z.write('data.data')
>>> z.close()

# extract
>>> z = zipfile.ZipFile('lyshark.zip', 'r')
>>> z.extractall()
>>> z.close()
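
The same compress and extract steps can be written with context managers, which close the archive even if an error occurs. This is a sketch only: the temporary directory, the out directory, and the file contents are assumptions for illustration.

```python
import os
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as d:
    src = os.path.join(d, "lyshark.log")
    with open(src, "w") as f:
        f.write("log line\n")

    archive = os.path.join(d, "lyshark.zip")
    # arcname controls the name stored inside the archive.
    with zipfile.ZipFile(archive, "w") as z:
        z.write(src, arcname="lyshark.log")

    out_dir = os.path.join(d, "out")                  # hypothetical extraction target
    with zipfile.ZipFile(archive, "r") as z:
        names = z.namelist()
        z.extractall(out_dir)

    with open(os.path.join(out_dir, "lyshark.log")) as f:
        restored = f.read()

print(names, restored == "log line\n")
```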

TAR archiving: the tarfile module packs the specified files into an archive and extracts them again.

>>> import tarfile
>>>
# compress
>>> tar = tarfile.open('your.tar','w')
>>> tar.add('/bbs2.log', arcname='bbs2.log')
>>> tar.add('/cmdb.log', arcname='cmdb.log')
>>> tar.close()

# extract
>>> tar = tarfile.open('your.tar','r')
>>> tar.extractall()  # an extraction path can be passed
>>> tar.close()


Logging module

Many programs need to record logs. Logs may contain normal access records as well as errors, warnings, and other output. Python's logging module provides a standard logging interface and can store logs in various formats. Messages are logged at five levels: debug(), info(), warning(), error(), and critical(). Let's look at how to use it.

To print log messages to the console, call the logging functions directly. The default level is WARNING, so the debug message below is not shown.

>>> import logging
>>>
>>> logging.debug("hello debug")
>>> logging.warning("hello warning")
>>> logging.critical("hello critical")

#---output-------------------------------
WARNING:root:hello warning
CRITICAL:root:hello critical

The calls above use three different levels. In full, logging supports the following levels:

Level      Value   Description
DEBUG      10      Detailed information, usually of interest only when debugging
INFO       20      Confirmation that things are working as expected during normal operation
WARNING    30      Something unexpected happened
ERROR      40      A more serious problem than a warning; the software could not perform some function
CRITICAL   50      A serious error; the program itself may be unable to continue

To write log messages to a file, specify the file path in the configuration when the program starts.

import logging
 
logging.basicConfig(level=logging.DEBUG,
                    format='%(asctime)s %(filename)s[line:%(lineno)d] %(levelname)s %(message)s',
                    datefmt='%Y-%m-%d %H:%M:%S %p',
                    filename='test.log',
                    filemode='w')

#---logging calls-------------------------------
logging.debug('debug message')
logging.info('info message')
logging.warning('warning message')
logging.error('error message')
logging.critical('critical message')
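
The level threshold and the format string can also be tried without touching the root logger or the file system. The sketch below makes several assumptions for illustration: a named logger called demo, an in-memory stream instead of test.log, and a shortened format string.

```python
import io
import logging

stream = io.StringIO()                                # in-memory stand-in for a log file
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(name)s %(message)s"))

logger = logging.getLogger("demo")                    # hypothetical logger name
logger.setLevel(logging.INFO)                         # messages below INFO are dropped
logger.addHandler(handler)
logger.propagate = False                              # keep the root logger out of this sketch

logger.debug("debug message")                         # filtered out: DEBUG < INFO
logger.info("info message")
logger.error("error message")

lines = stream.getvalue().splitlines()
print(lines)
```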

The format string accepts the attributes listed below, so the configuration above can be customized freely.

Format            Description
%(name)s          Name of the logger
%(levelno)s       Log level in numeric form
%(levelname)s     Log level in text form
%(pathname)s      Full path of the module that issued the logging call
%(filename)s      File name of the module that issued the logging call
%(module)s        Name of the module that issued the logging call
%(funcName)s      Name of the function that issued the logging call
%(lineno)d        Line number of the statement that issued the logging call
%(created)f       Time the record was created, as a UNIX timestamp
%(asctime)s       Time the record was created, as a string
%(thread)d        Thread ID, may be unavailable
%(threadName)s    Thread name, may be unavailable
%(process)d       Process ID, may be unavailable
%(message)s       The message supplied by the user

The logging module has many more features, such as logging to multiple files. I find them cumbersome and easy to mix up in development; the common methods above are sufficient, so this section goes no further.

Process module

Early versions of Python mainly used functions such as os.system() and os.popen().read() to execute command-line instructions, along with the rarely used commands module. The official documentation now recommends the subprocess module. The os and commands functions are shown here only through a simple example; the subprocess module is the focus.

Executing a command with popen: first, a demonstration of os.popen() executing a command.

>>> import os
>>>
>>> temp=os.popen("ls -lh")
>>> temp
<open file 'ls -lh', mode 'r' at 0x7fd1d09b35d0>
>>> temp.read()
'total 4.0K\n-rw-------. 1 root root 1.2K Dec 20 01:53 anaconda-ks.cfg\n'

Running with call(): subprocess.call() executes a command and returns its status code. With shell=False the first argument must be a list; with shell=True the command can be given directly as a string.

>>> import subprocess
>>>
>>> ret = subprocess.call(["ls","-lh"],shell=False)
>>> print(ret)
0
>>> ret = subprocess.call("ls -l", shell=True)
>>> print(ret)
0

Checking with check_call(): executes the command and returns 0 if the exit status is 0; otherwise it raises an exception.

>>> import subprocess
>>>
>>> ret = subprocess.check_call(["ls", "-l"],shell=False)
>>> ret = subprocess.check_call("exit 1",shell=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python2.7/subprocess.py", line 542, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'exit 1' returned non-zero exit status 1

Checking with check_output(): runs the command and returns its output if the exit status is 0; otherwise it raises an exception. The return value is of type bytes and needs to be decoded.

>>> import subprocess
>>>
>>> ret = subprocess.check_output(["echo", "Hello World!"],shell=False)
>>> print(str(ret,encoding='utf-8'))

>>> ret = subprocess.check_output("exit 1", shell=True)
>>> print(str(ret,encoding='utf-8'))

Running with run(): added in Python 3.5, it replaces os.system and os.spawn.

>>> import subprocess
>>> 
>>> subprocess.run(["ls", "-l"])
total 56
-rw-rw-r-- 1 tomcat tomcat    61  8月 11 23:27 a.py
CompletedProcess(args=['ls', '-l'], returncode=0)
>>> 
>>> subprocess.run(["ls", "-l", "/dev/null"], stdout=subprocess.PIPE)
CompletedProcess(args=['ls', '-l', '/dev/null'], returncode=0, stdout=b'crw-rw-rw- 1 root root 1, 3  8\xe6\x9c\x88 11 09:27 /dev/null\n')

Using Popen(): this is subprocess.Popen, not os.popen(), and it is used for more complex operations.

>>> import subprocess
>>> 
>>> p = subprocess.Popen("ls -lh",shell=True,stdout=subprocess.PIPE)
>>> print(p.stdout.read())
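
The pieces above come together in subprocess.run(). The sketch below assumes Python 3.7 or later for capture_output and text, and it runs the current interpreter via sys.executable instead of ls so that it does not depend on a particular shell or platform.

```python
import subprocess
import sys

# Capture stdout as text instead of bytes (capture_output and text need Python 3.7+).
result = subprocess.run(
    [sys.executable, "-c", "print('Hello World!')"],
    capture_output=True,
    text=True,
)
print(result.returncode, result.stdout.strip())

# check=True turns a non-zero exit status into CalledProcessError,
# the same behaviour check_call() provides.
try:
    subprocess.run([sys.executable, "-c", "import sys; sys.exit(1)"], check=True)
    failed = False
except subprocess.CalledProcessError as exc:
    failed = exc.returncode == 1

print(failed)
```

Passing the command as a list with the default shell=False avoids shell quoting problems when arguments come from user input.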


Urllib module

urllib is Python's module for working with URLs. It is used frequently when crawling web pages and is also common in site testing and site status checks, though it is most often used for writing crawlers. Its main functions are shown below.

Quick page fetch: use the most basic urllib functionality to fetch the Baidu home page and save its content to the local directory.

>>> import urllib.request
>>>
>>> res=urllib.request.urlopen("https://www.baidu.com")
>>> content=res.read()              # read once; a second read() would return b''
>>> print(content.decode("utf-8"))

>>> f=open("./test.html","wb")      # save locally
>>> f.write(content)
>>> f.close()

POST request: the example above fetches Baidu with a GET request; the following sends a POST request with urllib.

>>> import urllib.parse
>>> import urllib.request
>>>
>>> data=bytes(urllib.parse.urlencode({"hello":"lyshark"}),encoding="utf-8")
>>> print(data)
>>> response = urllib.request.urlopen('http://www.baidu.com/post',data=data)
>>> print(response.read())

Setting a timeout: set a timeout on the request so the program does not wait indefinitely for a result.

import urllib.request

response = urllib.request.urlopen('http://www.baidu.com', timeout=1)
print(response.read())

Getting the response status: status, getheaders(), and getheader("server") return the status code and header information.

>>> import urllib.request
>>>
>>> res=urllib.request.urlopen("https://www.python.org")
>>> print(type(res))
<class 'http.client.HTTPResponse'>
>>>
>>> res.status
>>> res.getheaders()
>>> res.getheader("server")

Disguised access: add custom header information to the request so that it looks like it comes from a browser and is less likely to be blocked.

from urllib import request,parse

url = 'http://www.baidu.com'
headers = {
    'User-Agent': 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)',
    'Host': 'mkdirs.org'
}
dict = {
    'name': 'LyShark'
}
data = bytes(parse.urlencode(dict), encoding='utf8')
req = request.Request(url=url, data=data, headers=headers, method='POST')
response = request.urlopen(req)
print(response.read().decode('utf-8'))

URL joining: urljoin() joins a base address and a relative path into the URL for the next request.

>>> from urllib.parse import urljoin
>>>
>>> urljoin("http://www.baidu.com","about.html")
'http://www.baidu.com/about.html'
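
urljoin() resolves the second argument the way a browser resolves a link, which matters once the base URL has a path. The sketch below runs offline; the /docs/index.html path and the page parameter are assumptions for illustration.

```python
from urllib.parse import parse_qs, urlencode, urljoin, urlparse

base = "http://www.baidu.com/docs/index.html"         # hypothetical base page

sibling = urljoin(base, "about.html")                  # relative: replaces the last path segment
rooted = urljoin(base, "/about.html")                  # leading slash: resolved from the site root

query = urlencode({"hello": "lyshark", "page": 2})     # dict -> query string
full = sibling + "?" + query

parts = urlparse(full)                                 # split the URL into components
params = parse_qs(parts.query)                         # query string -> dict of lists

print(sibling)
print(rooted)
print(parts.netloc, parts.path, params)
```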


Config module

The configparser module reads configuration files in a format similar to Windows INI files. A file may contain one or more sections, each holding multiple parameters (key = value). The benefit of a configuration file is that parameters need not be hard-coded, which makes the program more flexible.

For the examples below, create a test.ini configuration file in the directory where Python runs, with the following contents.

[db]
db_host = 127.0.0.1
db_port = 69
db_user = root
db_pass = 123123
host_port = 69

[concurrent]
thread = 10
processor = 20

Getting all sections: the following returns the names of all sections in the specified file.

>>> import configparser
>>> 
>>> config=configparser.ConfigParser()
>>> config.read("test.ini",encoding="utf-8")
>>>
>>> result=config.sections()
>>> print(result)
['db', 'concurrent']

Getting key-value pairs: the following returns all key-value pairs in the specified section (concurrent).

>>> import configparser
>>> 
>>> config=configparser.ConfigParser()
>>> config.read("test.ini",encoding="utf-8")
>>>
>>> result=config.items("concurrent")
>>> print(result)
[('thread', '10'), ('processor', '20')]

Getting keys: the following returns all keys in the specified section (concurrent).

>>> import configparser
>>> 
>>> config=configparser.ConfigParser()
>>> config.read("test.ini",encoding="utf-8")
>>>
>>> result=config.options("concurrent")
>>> print(result)
['thread', 'processor']

Getting a specific value: the following returns the value for the specified key in the specified section.

>>> import configparser
>>> 
>>> config=configparser.ConfigParser()
>>> config.read("test.ini",encoding="utf-8")
>>>
>>> result=config.get("concurrent","thread")
# result = config.getint("concurrent","thread")
# result = config.getfloat("concurrent","thread")
# result = config.getboolean("concurrent","thread")
>>> print(result)
10

Checking, adding, and removing sections: check for, add, and delete a section.

>>> import configparser
>>> 
>>> config=configparser.ConfigParser()
>>> config.read("test.ini",encoding="utf-8")

#--check for a section-----------------------------------
>>> has_sec=config.has_section("db")
>>> print(has_sec)
True
#--add a section------------------------------------------
>>> config.add_section("lyshark")
>>> config.write(open("test.ini","w"))
#--remove a section---------------------------------------
>>> config.remove_section("lyshark")
True
>>> config.write(open("test.ini","w"))

Checking, setting, and removing key-value pairs: check for, set, and delete a key in the specified section.

>>> import configparser
>>> 
>>> config=configparser.ConfigParser()
>>> config.read("test.ini",encoding="utf-8")

#--check for a key in a section---------------------------
>>> has_opt=config.has_option("db","db_host")
>>> print(has_opt)
True
#--set a key in a section---------------------------------
>>> config.set("db","db_host","8888888888")
>>> config.write(open("test.ini","w"))
#--remove a key from a section----------------------------
>>> config.remove_option("db","db_host")
True
>>> config.write(open("test.ini","w"))
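
The same operations can be tried without a file on disk. The sketch below makes a few assumptions for illustration: it loads a shortened copy of the configuration with read_string(), reads typed values with getint(), uses fallback for a hypothetical timeout key that is not in the file, and writes to an in-memory buffer.

```python
import configparser
import io

text = """
[db]
db_host = 127.0.0.1
db_port = 69

[concurrent]
thread = 10
"""

config = configparser.ConfigParser()
config.read_string(text)                               # parse from a string instead of a file

port = config.getint("db", "db_port")                  # value converted to int
threads = config.getint("concurrent", "thread")
timeout = config.get("db", "timeout", fallback="30")   # hypothetical key, not present above

config.set("db", "db_host", "10.0.0.1")                # first argument is the section name
buffer = io.StringIO()
config.write(buffer)                                   # serialize back to INI text
saved = buffer.getvalue()

print(port, threads, timeout)
print(saved)
```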


JSON module

JSON (JavaScript Object Notation) is a lightweight data interchange format based on a subset of ECMAScript, the JavaScript specification standardized by Ecma International (formerly the European Computer Manufacturers Association). It stores and represents data in a text format that is completely independent of any programming language. Its simple, clear hierarchy makes JSON an ideal data interchange language: easy for people to read and write, easy for machines to parse and generate, and efficient for network transmission. The json module converts between JSON strings and Python data types and provides four functions: dumps, dump, loads, and load. Their use is detailed below.

dumps(): converts a basic Python data type into a JSON string.

>>> import json
>>>
>>> dic={"admin":"123","lyshark":"123123"}
>>>
>>> print(dic,type(dic))
{'admin': '123', 'lyshark': '123123'} <class 'dict'>
>>>
>>> result=json.dumps(dic)
>>> print(result,type(result))
{"admin": "123", "lyshark": "123123"} <class 'str'>

loads(): converts a JSON string into a basic Python data type.

>>> import json
>>>
>>> string='{"key":"value"}'
>>> print(string,type(string))
{"key":"value"} <class 'str'>

>>> dic=json.loads(string)
>>> print(dic,type(dic))
{'key': 'value'} <class 'dict'>

dump(): serializes the given data and writes it to a file for persistent storage, in one step.

>>> import json
>>>
>>> lists=[1,2,3,4,5,6,7,8,9,10]
>>> lists
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>>
>>> json.dump(lists,open("db.json","w",encoding="utf-8"))

>>> f=open("db.json","w")
>>> json.dump(lists,f)

load(): reads a file containing serialized data and deserializes its contents.

>>> import json
>>>
>>> lists=json.load(open("db.json","r",encoding="utf-8"))
>>> lists
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
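
A round trip through all four functions might look like the sketch below. The sample dictionary, the indent and sort_keys options, and the temporary directory are assumptions for illustration; the with blocks ensure the file is flushed and closed before it is read back.

```python
import json
import os
import tempfile

data = {"admin": "123", "tags": ["a", "b"], "active": True, "note": None}

# dumps/loads: Python object <-> JSON string
text = json.dumps(data, indent=2, sort_keys=True)
same = json.loads(text) == data

# dump/load: Python object <-> JSON file
with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "db.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f)
    with open(path, "r", encoding="utf-8") as f:
        loaded = json.load(f)

print(text)
print(same, loaded == data)
```

Note how True and None become true and null in the JSON text and are converted back on loading.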


XML module

XML (Extensible Markup Language) is designed for transmitting data and is used to exchange data between different languages or programs. It plays much the same role as JSON, which is simpler to use; but before JSON existed, XML was the only choice. Many traditional companies, such as those in the financial industry, still use XML for many of their system interfaces, so this module is worth learning.

For the examples that follow, create a file named lyshark.xml in Python's current directory with the following XML content.

<?xml version="1.0" encoding="UTF-8"?>
<data>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
        <year>2019</year>
        <gdppc>141100</gdppc>
        <neighbor direction="E" name="Austria" />
        <neighbor direction="W" name="Switzerland" />
    </country>
    <country name="Singapore">
        <rank updated="yes">5</rank>
        <year>2020</year>
        <gdppc>59900</gdppc>
        <neighbor direction="N" name="Malaysia" />
    </country>
    <country name="Panama">
        <rank updated="yes">69</rank>
        <year>2029</year>
        <gdppc>13600</gdppc>
        <neighbor direction="W" name="Costa Rica" />
        <neighbor direction="E" name="Colombia" />
    </country>
</data>

Creating XML documents: use the ElementTree functions to create an XML document. By default the saved XML has no indentation.

<root>
    <son name="son1">
        <grand name="grandson1"></grand>
    </son>
    <son name="son2">
        <grand name="grandson2"></grand>
    </son>
</root>
#--the following code creates the structure above--------------------------
>>> import xml.etree.ElementTree as ET
>>>
>>> root=ET.Element("root")
>>>
>>> son1=ET.Element("son",{"name":"son1"})
>>> son2=ET.Element("son",{"name":"son2"})
>>>
>>> grand1=ET.Element("grand",{"name":"grandson1"})
>>> grand2=ET.Element("grand",{"name":"grandson2"})
>>>
>>> son1.append(grand1)
>>> son2.append(grand2)
>>>
>>> root.append(son1)
>>> root.append(son2)
>>>
>>> tree=ET.ElementTree(root)
>>> tree.write('lyshark.xml',encoding='utf-8',short_empty_elements=False)

Opening XML documents: use xml.etree.ElementTree to open an XML file.

>>> import xml.etree.ElementTree as ET
>>> 
>>> tree = ET.parse("lyshark.xml")
>>> root = tree.getroot()
>>> print(root.tag)

Traversing an XML document (single layer): loop over the root to traverse its child nodes.

>>> import xml.etree.ElementTree as ET
>>> 
>>> tree=ET.parse("lyshark.xml")
>>> root=tree.getroot()
>>>
>>> for child in root:
...     print(child.tag,child.attrib)
...
country {'name': 'Liechtenstein'}
country {'name': 'Singapore'}
country {'name': 'Panama'}

Traversing an XML document (multi-layer): loop over the children of root, then over each child's own children, to traverse the subtrees of the XML file.

>>> import xml.etree.ElementTree as ET
>>> 
>>> tree=ET.parse("lyshark.xml")
>>> root=tree.getroot()
>>> # traverse the second layer of the XML document
>>> for x in root:
...     # tag name of the second-layer node
...     print("Main node: %s"%x.tag)
...     # traverse the third layer of the XML document
...     for y in x:
...         # tag name, attributes, and text of the third-layer node
...         print(y.tag,y.attrib,y.text)
...
Main node: country
rank {'updated': 'yes'} 2
year {} 2019
gdppc {} 141100
neighbor {'direction': 'E', 'name': 'Austria'} None
neighbor {'direction': 'W', 'name': 'Switzerland'} None
Main node: country
rank {'updated': 'yes'} 5
year {} 2020
gdppc {} 59900
neighbor {'direction': 'N', 'name': 'Malaysia'} None
Main node: country
rank {'updated': 'yes'} 69
year {} 2029
gdppc {} 13600
neighbor {'direction': 'W', 'name': 'Costa Rica'} None
neighbor {'direction': 'E', 'name': 'Colombia'} None

Traversing specific nodes: loop with root.iter() to traverse only the year nodes in the XML document.

>>> import xml.etree.ElementTree as ET
>>> 
>>> tree=ET.parse("lyshark.xml")
>>> root=tree.getroot()
>>>
>>> for node in root.iter("year"):
...     print(node.tag,node.text)
...
year 2019
year 2020
year 2029

Modifying XML fields: loop over the year nodes, increment each one's content by 1, and write the result back to the XML document.

>>> import xml.etree.ElementTree as ET
>>> 
>>> tree=ET.parse("lyshark.xml")
>>> root=tree.getroot()
>>>
>>> for node in root.iter("year"):     # traverse and modify each year node
...     new_year=int(node.text) + 1    # convert node.text to an integer to do the addition
...     node.text=str(new_year)        # convert back to a string and assign it to the in-memory text
...     node.set("updated","yes")      # add the attribute updated="yes" to each year node
...
>>> tree.write("lyshark.xml")          # write back to the file, overwriting it with the new data
>>> del node.attrib["updated"]         # delete an attribute from a node (here the last year node, in memory only)

Deleting XML nodes: traverse all country nodes and delete any country node whose rank is greater than 50.

>>> import xml.etree.ElementTree as ET
>>> 
>>> tree=ET.parse("lyshark.xml")
>>> root=tree.getroot()
>>> # traverse all country nodes under data
>>> for country in root.findall("country"):
...     # get the content of the rank node under each country node
...     rank=int(country.find("rank").text)
...     if rank > 50:
...         # delete this country node
...         root.remove(country)
...
>>> tree.write("output.xml",encoding="utf-8")
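
The modify and delete steps can be checked end to end without a file by parsing from a string. The sketch below assumes a trimmed copy of the sample document (only rank and year are kept) and uses ET.fromstring() in place of ET.parse().

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample document, kept in memory for the sketch.
xml_text = """<data>
    <country name="Liechtenstein"><rank>2</rank><year>2019</year></country>
    <country name="Singapore"><rank>5</rank><year>2020</year></country>
    <country name="Panama"><rank>69</rank><year>2029</year></country>
</data>"""

root = ET.fromstring(xml_text)

# findall() returns a list, so removing children of root inside the loop is safe.
for country in root.findall("country"):
    if int(country.find("rank").text) > 50:
        root.remove(country)

# Increment every remaining year by 1.
for year in root.iter("year"):
    year.text = str(int(year.text) + 1)

names = [c.get("name") for c in root.findall("country")]
years = [y.text for y in root.iter("year")]
print(names, years)
```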


Origin www.cnblogs.com/LyShark/p/11297586.html