Python introductory tutorial | Overview of Python's commonly used standard libraries

Python3 standard library overview

The Python standard library is large and provides a wide range of components that make many common tasks easy to complete.

The following are some modules in the Python3 standard library:

  • os module: The os module provides many functions for interacting with the operating system, such as creating, moving, and deleting files and directories, and accessing environment variables.

  • sys module: The sys module provides functions related to the Python interpreter and system, such as the version and path of the interpreter, and information related to stdin, stdout, and stderr.

  • time module: The time module provides functions for processing time, such as getting the current time, formatting date and time, timing, etc.

  • datetime module: The datetime module provides more advanced date and time processing functions, such as processing time zones, calculating time differences, calculating date differences, etc.

  • random module: The random module provides functions for generating random numbers, such as generating random integers, floating point numbers, sequences, etc.

  • math module: The math module provides mathematical functions, such as trigonometric functions, logarithmic functions, exponential functions, constants, etc.

  • re module: The re module provides regular expression processing functions, which can be used for text search, replacement, segmentation, etc.

  • json module: The json module provides JSON encoding and decoding functions, which can convert Python objects into JSON format and parse Python objects from JSON format.

  • urllib module: The urllib module provides functions for accessing web pages and processing URLs, including downloading files, sending POST requests, processing cookies, etc.
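
As a quick illustration of the json entry in the list above, the following sketch round-trips a small dictionary through JSON text. The settings data is made up for the example and is not from the original article.

```python
import json

# Hypothetical sample data, used only for illustration.
settings = {"name": "demo", "retries": 3, "tags": ["a", "b"], "debug": False}

# Serialize to a JSON string; sort_keys makes the key order predictable.
text = json.dumps(settings, sort_keys=True)
print(text)

# Parse the JSON text back into Python objects.
restored = json.loads(text)
print(restored == settings)
```

Note that Python's False becomes JSON's false in the text and is converted back when the text is parsed.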

Operating system interface

The os module provides many functions related to the operating system.

>>> import os
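>>> os.getcwd()      # Return the current working directory
'C:\\Users\\Lenovo'
>>> os.chdir(r'C:\Users\Lenovo\Desktop') # Change the current working directory; a raw string (r'...') avoids doubling the backslashes
>>> os.system('mkdir today') # Run the Windows shell command mkdir to create a folder named today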
0
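
The session above creates a folder by calling a shell command. As a minimal sketch of a more portable alternative, the os module can create the directory itself. The example works in a temporary directory so nothing on disk is left behind, and the environment variable name is invented for the demonstration.

```python
import os
import tempfile

# Work inside a throwaway directory so the real file system is not touched.
with tempfile.TemporaryDirectory() as base:
    target = os.path.join(base, "today")

    # Portable replacement for os.system('mkdir today').
    os.makedirs(target, exist_ok=True)

    created = os.path.isdir(target)
    entries = os.listdir(base)
    print(created, entries)

# Environment variables are exposed through the os.environ mapping.
# DEMO_UNSET_VARIABLE is a made-up name, so the fallback value is returned.
fallback = os.environ.get("DEMO_UNSET_VARIABLE", "not set")
print(fallback)
```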

Use the "import os" style rather than "from os import *". This keeps os.open(), which behaves very differently, from shadowing the built-in open() function.

The built-in dir() and help() functions are very useful when working with large modules like os:

>>> import os
>>> dir(os)
['DirEntry', 'EX_OK', 'F_OK', 'GenericAlias', 'Mapping', 'MutableMapping', 'O_APPEND', 'O_BINARY', 'O_CREAT', 'O_EXCL', 'O_NOINHERIT', 'O_RANDOM', 'O_RDONLY', 'O_RDWR', 'O_SEQUENTIAL', 'O_SHORT_LIVED', 'O_TEMPORARY', 'O_TEXT', 'O_TRUNC', 'O_WRONLY', 'P_DETACH', 'P_NOWAIT', 'P_NOWAITO', 'P_OVERLAY', 'P_WAIT', 'PathLike', 'R_OK', 'SEEK_CUR', 'SEEK_END', 'SEEK_SET', 'TMP_MAX', 'W_OK', 'X_OK', '_AddedDllDirectory', '_Environ', '__all__', '__builtins__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', '_check_methods', '_execvpe', '_exists', '_exit', '_fspath', '_get_exports_list', '_walk', '_wrap_close', 'abc', 'abort', 'access', 'add_dll_directory', 'altsep', 'chdir', 'chmod', 'close', 'closerange', 'cpu_count', 'curdir', 'defpath', 'device_encoding', 'devnull', 'dup', 'dup2', 'environ', 'error', 'execl', 'execle', 'execlp', 'execlpe', 'execv', 'execve', 'execvp', 'execvpe', 'extsep', 'fdopen', 'fsdecode', 'fsencode', 'fspath', 'fstat', 'fsync', 'ftruncate', 'get_exec_path', 'get_handle_inheritable', 'get_inheritable', 'get_terminal_size', 'getcwd', 'getcwdb', 'getenv', 'getlogin', 'getpid', 'getppid', 'isatty', 'kill', 'linesep', 'link', 'listdir', 'lseek', 'lstat', 'makedirs', 'mkdir', 'name', 'open', 'pardir', 'path', 'pathsep', 'pipe', 'popen', 'putenv', 'read', 'readlink', 'remove', 'removedirs', 'rename', 'renames', 'replace', 'rmdir', 'scandir', 'sep', 'set_handle_inheritable', 'set_inheritable', 'spawnl', 'spawnle', 'spawnv', 'spawnve', 'st', 'startfile', 'stat', 'stat_result', 'statvfs_result', 'strerror', 'supports_bytes_environ', 'supports_dir_fd', 'supports_effective_ids', 'supports_fd', 'supports_follow_symlinks', 'symlink', 'sys', 'system', 'terminal_size', 'times', 'times_result', 'truncate', 'umask', 'uname_result', 'unlink', 'unsetenv', 'urandom', 'utime', 'waitpid', 'waitstatus_to_exitcode', 'walk', 'write']
>>> help(os)
Help on module os:

NAME
    os - OS routines for NT or Posix depending on what system we're on.

MODULE REFERENCE
    https://docs.python.org/3.11/library/os.html

    The following documentation is automatically generated from the Python
    source files.  It may be incomplete, incorrect or include features that
    are considered implementation detail and may vary between Python
    implementations.  When in doubt, consult the module reference at the
    location listed above.

DESCRIPTION
    This exports:
      - all functions from posix or nt, e.g. unlink, stat, etc.
      - os.path is either posixpath or ntpath
      - os.name is either 'posix' or 'nt'
      - os.curdir is a string representing the current directory (always '.')
      - os.pardir is a string representing the parent directory (always '..')
      - os.sep is the (or a most common) pathname separator ('/' or '\\')
      - os.extsep is the extension separator (always '.')
      - os.altsep is the alternate pathname separator (None or '/')
      - os.pathsep is the component separator used in $PATH etc
      - os.linesep is the line separator in text files ('\r' or '\n' or '\r\n')
      - os.defpath is the default search path for executables
      - os.devnull is the file path of the null device ('/dev/null', etc.)

-- More  --

For daily file and directory management tasks, the shutil module provides an easy-to-use high-level interface:

C:\Users\Lenovo>cd Desktop # Switch to the Desktop folder
C:\Users\Lenovo\Desktop>python  # Start the interactive Python interpreter
Python 3.11.4 (tags/v3.11.4:d2340ef, Jun  7 2023, 05:45:37) [MSC v.1934 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import shutil # Import the shutil module
>>> shutil.copyfile('test.txt', 'test1.txt') # Copy test.txt on the desktop to a new file named test1.txt
'test1.txt'
>>> shutil.move('sourceFolder', 'targetFolder') # Move the sourceFolder directory into the targetFolder directory
'targetFolder\\sourceFolder'
>>>
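
The same two calls can be tried without preparing any files by hand. The following is a sketch that builds its own files in a temporary directory; the file and folder names simply mirror the session above.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as base:
    source = os.path.join(base, "test.txt")
    with open(source, "w", encoding="utf-8") as f:
        f.write("hello\n")

    # copyfile copies the file contents and returns the destination path.
    copy_path = shutil.copyfile(source, os.path.join(base, "test1.txt"))

    # Moving into an existing directory places the source inside it.
    source_dir = os.path.join(base, "sourceFolder")
    target_dir = os.path.join(base, "targetFolder")
    os.mkdir(source_dir)
    os.mkdir(target_dir)
    moved_to = shutil.move(source_dir, target_dir)

    with open(copy_path, encoding="utf-8") as f:
        copied_text = f.read()
    moved_ok = os.path.isdir(os.path.join(target_dir, "sourceFolder"))
    print(copied_text.strip(), moved_ok)
```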

File wildcards

The glob module provides a function for generating a file list from a directory wildcard search:

>>> import glob # Import the module
>>> glob.glob('*.py') # Match every file in the current directory whose name ends in .py
['dog.py', 'fibo.py', 'support.py', 'support1.py', 'test.py', 'using_name.py']
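
To make the wildcard behavior easy to reproduce, here is a small sketch that creates a few empty files in a temporary directory and then searches it. The file names are invented for the example.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as base:
    # Create a few empty files with made-up names.
    for name in ("fibo.py", "dog.py", "notes.txt"):
        open(os.path.join(base, name), "w").close()

    # glob returns matches in arbitrary order, so sort for stable output.
    matches = sorted(glob.glob(os.path.join(base, "*.py")))
    py_names = [os.path.basename(path) for path in matches]
    print(py_names)
```

Only the two .py files are listed; notes.txt does not match the pattern.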

Command line arguments

Utility scripts often need to process command line arguments. These arguments are stored as a list in the argv attribute of the sys module. sys.argv[0] is the name of the script (that is, the .py file being executed), sys.argv[1] is the first command line argument, sys.argv[2] is the second, and so on.

Here is a simple example to show how to use sys.argv:

import sys

def main(argv):
    # argv[0] is the name of the script
    print(f"The script name is: {argv[0]}")

    # The command line arguments we passed in start at argv[1]
    for i in range(1, len(argv)):
        print(f"Argument {i} is: {argv[i]}")

if __name__ == "__main__":
    main(sys.argv)

If you save this script as script.py and then run python script.py arg1 arg2 arg3 from the command line, you will see the following output:

C:\Users\Lenovo\Desktop>python script.py arg1 arg2 arg3
The script name is: script.py
Argument 1 is: arg1
Argument 2 is: arg2
Argument 3 is: arg3
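
Reading sys.argv by index works for simple scripts. When a script needs named options, the standard library's argparse module can do the parsing. The sketch below is only an illustration: the option names and file names are invented, and an explicit list is passed to parse_args so the example does not depend on how it is launched.

```python
import argparse

def build_parser():
    # The argument names here are illustrative, not part of the original script.
    parser = argparse.ArgumentParser(description="Show the first lines of files.")
    parser.add_argument("filenames", nargs="+", help="files to read")
    parser.add_argument("-l", "--lines", type=int, default=10,
                        help="number of lines to show (default: 10)")
    return parser

# Passing an explicit list avoids reading the real sys.argv in this demo.
args = build_parser().parse_args(["--lines", "5", "alpha.txt", "beta.txt"])
print(args.lines, args.filenames)
```

In a real script, calling parse_args() with no arguments makes argparse read sys.argv[1:] itself.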

Error output redirection and program termination

sys also has stdin, stdout, and stderr attributes. The last of these is useful for emitting warnings and error messages so that they remain visible even when stdout has been redirected:

>>> import sys
>>> sys.stderr.write('Warning, log file not found starting a new one\n')
Warning, log file not found starting a new one
47
>>>

The most direct way to terminate a script is to call "sys.exit()". The built-in "exit()" is intended mainly for the interactive interpreter.
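
The two ideas in this section can be combined as follows. This is a minimal sketch in which fail is a hypothetical helper name. The SystemExit exception is caught here only so the example can show the exit code instead of actually ending the interpreter.

```python
import sys

def fail(message, code=1):
    # Error text goes to stderr so it stays visible if stdout is redirected.
    print(f"error: {message}", file=sys.stderr)
    # sys.exit raises SystemExit; the interpreter exits if nothing catches it.
    sys.exit(code)

try:
    fail("log file not found", code=2)
except SystemExit as exc:
    exit_code = exc.code
    print("exit code would be", exit_code)
```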

String pattern matching

The re module provides regular expression facilities for advanced string processing. For complex matching and processing, regular expressions provide concise and optimized solutions:

>>> import re # Import the module
>>> re.findall(r'\bf[a-z]*', 'which foot or hand fell fastest') # Find every word in the string that starts with f (\b marks a word boundary)
['foot', 'fell', 'fastest']
>>> re.sub(r'(\b[a-z]+) \1', r'\1', 'cat in the the hat')
'cat in the hat'

If you only need simple functionality, you should consider string methods first, as they are very simple and easy to read and debug:

>>> 'tea for too'.replace('too', 'two') # Replace "too" with "two"; no import is needed for string methods
'tea for two'
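
For cases where string methods are not enough, the following sketch shows three common re operations on a made-up log line: searching with named groups, collecting all matches, and splitting on a pattern.

```python
import re

# Invented sample text for the demonstration.
text = "2023-10-04 error; 2023-10-05 ok"

# A compiled pattern with named groups makes the match easier to read.
date_pattern = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")

first = date_pattern.search(text)
print(first.group("year"), first.group("month"), first.group("day"))

# findall returns one tuple of groups per match.
all_dates = date_pattern.findall(text)
print(all_dates)

# re.split breaks the text on a pattern; here a semicolon plus optional spaces.
parts = re.split(r";\s*", text)
print(parts)
```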

Mathematics

The math module provides access to the underlying C library for floating point operations:

>>> import math # Import the math module
>>> math.cos(math.pi / 4)
0.7071067811865476
>>> math.log(1024, 2) # Logarithm of 1024 to base 2
10.0
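
Because these functions return floating point approximations, results are usually compared with a tolerance rather than with ==. A small sketch using math.isclose, together with the dedicated base-2 logarithm math.log2:

```python
import math

# Floating point results are approximations, so compare with a tolerance.
value = math.cos(math.pi / 4)
expected = math.sqrt(2) / 2
print(math.isclose(value, expected))

# math.log2 is the dedicated base-2 logarithm.
print(math.log2(1024))
```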

The random module provides tools for generating random numbers and making random selections:

>>> import random # Import the module
>>> random.choice(['apple', 'pear', 'banana']) # Pick one element of the list at random
'apple'
>>> random.sample(range(100), 10)   # Draw 10 distinct numbers from 0 to 99 (100 is excluded)
[30, 83, 16, 4, 8, 81, 41, 50, 18, 33]
>>> random.random()    # Random float in the range [0.0, 1.0)
0.17970987693706186
>>> random.randrange(6)    # Random integer from 0 to 5 (6 is excluded)
4
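
The values in a session like the one above differ on every run. When repeatable results are needed, for example in tests, one option is a private generator with a fixed seed. The seed value 42 below is arbitrary.

```python
import random

# A private generator with a fixed seed gives a repeatable sequence.
rng_a = random.Random(42)
rng_b = random.Random(42)

rolls_a = [rng_a.randrange(6) for _ in range(5)]
rolls_b = [rng_b.randrange(6) for _ in range(5)]
print(rolls_a == rolls_b)

# sample draws without replacement, so the picks are all distinct.
picks = rng_a.sample(range(100), 10)
print(len(set(picks)))
```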

Internet access

There are several modules for accessing the Internet and handling network communication protocols. Two of the simplest are urllib.request for retrieving data from URLs and smtplib for sending mail. The example below uses urllib.request:

>>> from urllib.request import urlopen
>>> for line in urlopen('https://www.baidu.com/'):
...     line = line.decode('utf-8')  # Decoding the binary data to text.
...     print(line)

# Prints the text of the Baidu home page...
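
The urllib package also contains urllib.parse for taking URLs apart and building query strings, which needs no network connection. A short sketch, using a made-up example.com URL:

```python
from urllib.parse import urlencode, urlparse, parse_qs

# A made-up URL used only for illustration.
url = "https://example.com/search?q=python&page=2"

parts = urlparse(url)
print(parts.scheme, parts.netloc, parts.path)

# parse_qs maps each query key to a list of values.
query = parse_qs(parts.query)
print(query)

# urlencode builds a query string from a dictionary.
encoded = urlencode({"q": "standard library", "page": 1})
print(encoded)
```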

Dates and times

The datetime module provides classes for working with dates and times in both simple and complex ways.

Date and time arithmetic is supported, but the implementation focuses on efficient extraction of fields for output formatting and manipulation.

>>> import datetime # Import the module
>>> current_datetime = datetime.datetime.now() # Get the current date and time
>>> print(current_datetime)
2023-10-04 21:35:23.999185
>>> current_date = datetime.date.today() # Get the current date
>>> print(current_date)
2023-10-04
>>> formatted_datetime = current_datetime.strftime("%Y-%m-%d %H:%M:%S") # Format the date and time
>>> print(formatted_datetime)
2023-10-04 21:35:23
>>>

The module also supports formatting dates and calculating the difference between them:

>>> from datetime import date   # Import the date class from the datetime module
>>> now = date.today() # Today's date
>>> now
datetime.date(2023, 10, 4)
>>> now.strftime("%m-%d-%y. %d %b %Y is a %A on the %d day of %B.") # Format the date for output
'10-04-23. 04 Oct 2023 is a Wednesday on the 04 day of October.'
>>> birthday = date(1991, 9, 20) # Create a date object representing a birthday
>>> age = now - birthday  # Subtracting two dates gives a timedelta
>>> age.days  # The days attribute of age is the difference in days
11702
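
The overview at the top of this article also mentions time zones. The sketch below shows timedelta arithmetic and a conversion between fixed offsets. The timestamp and the UTC+8 offset are chosen only for illustration.

```python
from datetime import datetime, timedelta, timezone

# A fixed, made-up timestamp keeps the result reproducible.
start = datetime(2023, 10, 4, 21, 35, tzinfo=timezone.utc)

# Adding a timedelta shifts a datetime by a duration.
deadline = start + timedelta(days=7, hours=3)
print(deadline.isoformat())

# astimezone converts to another offset; UTC+8 is used as an example.
utc_plus_8 = timezone(timedelta(hours=8))
local = start.astimezone(utc_plus_8)
print(local.strftime("%Y-%m-%d %H:%M %z"))
```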

Data compression

The following modules directly support common data archiving and compression formats: zlib, gzip, bz2, zipfile, and tarfile.

>>> import zlib # Import the module
>>> s = b'witch which has which witches wrist watch'
>>> len(s) # Length of the original data
41
>>> zlib.crc32(s)  # Compute the CRC32 checksum
226805979
>>> t = zlib.compress(s) # Compress the data
>>> len(t) # Length of the compressed data
37
>>> print(t)
b'x\x9c+\xcf,I\xceP(\xcf\xc8\x04\x92\x19\x89\xc5PV9H4\x15\xc8+\xca,.Q(O\x04\xf2\x00D?\x0f\x89'
>>> s=zlib.decompress(t) # Decompress
>>> print(s)
b'witch which has which witches wrist watch'
>>> zlib.crc32(s)  # The checksum matches the original
226805979
>>>
  • CRC32 checksums can also be used for data integrity verification. For example, when a file is copied to another location, a CRC32 checksum can be used to verify that the copied file is the same as the original file. After calculating the CRC32 checksum of the file, store the checksum in another location or transmit it with the file. Then, when the file integrity needs to be verified, the file's CRC32 checksum can be recalculated and compared to the original checksum.
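
The copy-verification idea described above might look like the following sketch. file_crc32 is a hypothetical helper, not a standard library function, and the file names are invented. The file is read in chunks so large files do not have to fit in memory.

```python
import os
import shutil
import tempfile
import zlib

def file_crc32(path, chunk_size=65536):
    """Return the CRC32 of a file, reading it in chunks."""
    checksum = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            # Passing the running value continues the checksum across chunks.
            checksum = zlib.crc32(chunk, checksum)
    return checksum

with tempfile.TemporaryDirectory() as base:
    original = os.path.join(base, "original.bin")
    duplicate = os.path.join(base, "duplicate.bin")
    with open(original, "wb") as f:
        f.write(b"witch which has which witches wrist watch" * 1000)
    shutil.copyfile(original, duplicate)

    # Equal checksums indicate the copy matches the original.
    same = file_crc32(original) == file_crc32(duplicate)
    print(same)
```

CRC32 detects accidental corruption well, but it is not designed to resist deliberate tampering.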

Performance measurement

Some users want to know how different approaches to the same problem compare in performance. Python provides a measurement tool, the timeit module, that answers such questions directly.

For example, it may be tempting to swap two values with tuple packing and unpacking instead of the traditional approach with a temporary variable. The timeit module shows that the tuple version is modestly faster. The statements below swap the values of a and b:

>>> from timeit import Timer
>>> Timer('t=a; a=b; b=t', 'a=1; b=2').timeit() # Time the swap that uses a temporary variable
0.03332749998662621
>>> Timer('a,b = b,a', 'a=1; b=2').timeit() # Time the tuple swap
0.024641399970278144

In contrast to the fine-grained measurements of timeit, the profile and pstats modules provide tools for finding time-critical sections in larger blocks of code.
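
A single timing can be noisy. As a sketch of a slightly more careful measurement, timeit.repeat collects several runs, and the minimum is commonly taken as the best estimate. The number and repeat values below are arbitrary choices, and the printed timings depend on the machine.

```python
import timeit

# number sets how many times each statement runs per measurement;
# repeat collects several measurements so the minimum can be used.
swap_temp = timeit.repeat("t = a; a = b; b = t", setup="a = 1; b = 2",
                          number=100_000, repeat=3)
swap_tuple = timeit.repeat("a, b = b, a", setup="a = 1; b = 2",
                           number=100_000, repeat=3)

print(f"temp variable: {min(swap_temp):.4f} s")
print(f"tuple swap:    {min(swap_tuple):.4f} s")
```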

Test modules

One way to develop high-quality software is to write tests for each function as it is developed and to run those tests frequently during the development process.

The doctest module provides a tool that scans modules and executes tests based on docstrings embedded in the program.

Constructing a test is as simple as cutting and pasting a typical call, along with its output, into the docstring.

doctest is a built-in module of Python that allows you to write unit tests by embedding executable examples in your code. These examples can be extracted and executed by the doctest module, and their actual output compared to the expected output. If the two match, the test passes; otherwise, the test fails.

Here is a simple example using doctest:

def add(a, b):
    """
    This function adds two numbers.

    >>> add(1, 2)
    3
    >>> add(-1, -2)
    -3
    """
    return a + b

if __name__ == "__main__":
    import doctest
    doctest.testmod()

In this example, the add function's docstring contains two doctest examples. When you run this script, doctest.testmod() will find these examples and execute them, then compare their actual output with the expected output.

When every example passes, the script prints nothing. Run it with the -v option (python script.py -v) to see a report of each example that was tried.

  • doctest compares the expected and actual output character by character. Stray trailing spaces after an expected value in the docstring make an example fail, and the report then shows Expected and Got values that look identical (for example, 3 and 3) even though the test did not pass.

The unittest module is not as easy to use as the doctest module, but it allows a more comprehensive set of tests to be maintained in a separate file.
Create a test.py script file with the following code:

import unittest

def average(values):
    return sum(values) / len(values)

class TestStatisticalFunctions(unittest.TestCase):

    def test_average(self):
        self.assertEqual(average([20, 30, 70]), 40.0)
        self.assertEqual(round(average([1, 5, 7]), 1), 4.3)
        self.assertRaises(ZeroDivisionError, average, [])
        self.assertRaises(TypeError, average, 20, 30, 70)

unittest.main() # Calling from the command line invokes all tests

The output of executing the test.py file is as follows:

C:\Users\Lenovo\Desktop>python test.py
.
----------------------------------------------------------------------
Ran 1 test in 0.000s

OK
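
As a variation on the script above, the following sketch splits the checks into separate test methods and runs them through an explicit loader and runner instead of unittest.main(). The class and method names are invented for the example.

```python
import unittest

def average(values):
    return sum(values) / len(values)

class TestAverage(unittest.TestCase):
    def test_typical_values(self):
        # assertAlmostEqual avoids relying on exact floating point equality.
        self.assertAlmostEqual(average([1, 5, 7]), 4.333, places=3)

    def test_empty_input(self):
        # The context-manager form of assertRaises reads like ordinary code.
        with self.assertRaises(ZeroDivisionError):
            average([])

# Build and run the suite explicitly instead of calling unittest.main(),
# which would also try to interpret the command line arguments.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestAverage)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())
```

Keeping one behavior per test method means a failure report points at the specific case that broke.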

The modules covered above are only a small part of the Python3 standard library. The complete standard library documentation is available in the official documentation: https://docs.python.org/zh-cn/3/library/index.html

Origin blog.csdn.net/weixin_40986713/article/details/133561965