Advanced Python Learning (1)

Background

Among programming languages, Python is generally considered an easy-to-use scripting language.

As Python is applied more and more widely, it is no longer enough to merely know how to run Python programs.

This column records some advanced usage and internal mechanisms of the Python language.

Goals for mastering the Python language:

1. Gain an in-depth grasp of Python's programming mechanisms and advanced syntax features, and be able to solve more complex programming problems.

2. Understand the principles behind common Python programming problems and be able to solve them quickly.

3. Understand in depth the roles of object-oriented and functional programming in order to write high-quality code.

4. Master advanced techniques such as concurrent and asynchronous programming to solve high-concurrency problems.

5. Master performance optimization techniques such as integrating Python with C and accelerating scientific computing.

6. Develop practical, industry-level skills for tasks with high complexity and demanding performance requirements.

[Figure: Python's running speed compared with other languages]

Knowledge points

1. exec use case: generating code from strings

def decision_process(conditions, outs):
    base_string = f"if string.startswith('{conditions[0]}'): print('{outs[0]}')"

    for c, out in zip(conditions[1:], outs[1:]):
        base_string += f"\n\telif string.startswith('{c}'): print('{out}')"

    return base_string


def create_func(conditions, outs):
    return f"""def complexit_if(string):
\t{decision_process(conditions, outs)}
    """

programming = create_func(['0', '1', '2'], ['none', 'first', 'second'])

exec(programming)  # executing this line creates the function object complexit_if
exec("complexit_if('001231')")
exec("complexit_if('112311')")
exec("complexit_if('201231')")

Explanation: This example shows that exec can execute Python code held in a string, which reflects Python's nature as an interpreted language. Because exec runs whatever code it is given, it should never be applied to untrusted input.

Extension: The related built-in eval() evaluates a single expression given as a string and returns its value, whereas exec() runs statements and returns None.
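
The difference between the two built-ins can be sketched as follows. This is only a minimal illustration; the names sandbox and double are made up for the example and do not come from the original code.

```python
# eval() evaluates one expression and returns its value.
expression_value = eval("3 * (2 + 5)")
print(expression_value)

# exec() runs statements. Passing a dict as the namespace keeps the
# names it defines out of the module's global namespace.
sandbox = {}
exec("def double(x):\n    return 2 * x", sandbox)
print(sandbox["double"](21))

# exec() itself always returns None.
exec_return = exec("y = 1", sandbox)
print(exec_return)
```

Passing an explicit namespace dictionary is a common way to keep generated names from leaking into the surrounding module, although it does not make running untrusted code safe.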

2. New feature in Python 3.10: match case

Python 3.10 introduced structural pattern matching (the match statement), which can match values by their structure as well as by equality.

The following creates a simple JSON-like dictionary as an example:

    parsed_json = {
        "Age": 19,
        "user_id": "uuid1231241",
        "goods_info": {
            "price": 100,
            "createtime": 2022
        }
    }

Next use match case to match:

    match parsed_json:
        case {"Age": age, "user_id": user_id, "goods_info": {"price": p, "createtime": time_}}:
            print(f"{age} with id {user_id} bought {p} goods")
        case {"Age": age, "user_id": user_id, "action_info": {"last_login": p}}:
            print(f"{user_id} with age {age} last login is {p}")
        case _:
            print("None")

The first case matches, so its print statement runs. Pattern matching is quite powerful and can also match other kinds of values, such as sequences and class instances.

3. A quick way to initialize a class - dataclass

Normally, we define a class and write its initializer by hand:

class OldPerson:
    def __init__(self, name="Tom", age=10, location=10.0, weight=20.0):
        self.name = name
        self.age = age
        self.location = location
        self.weight = weight

    def __repr__(self):
        return f"Person(name={self.name}, age={self.age}, location={self.location}, weight={self.weight})"

As the code above shows, this is somewhat tedious, especially for classes with many or complex parameters.

Here is a more concise way to define the class:

from dataclasses import dataclass

@dataclass
class Person:
    name: str = ""
    age: int = 18
    location: float = 10.0
    weight: float = 20.0

person = Person()
person.name = "Jack"
print(person)				# Person(name='Jack', age=18, location=10.0, weight=20.0)

This saves a lot of code: the @dataclass decorator generates __init__, __repr__, and __eq__ from the annotated fields, which makes the class easier to read.
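
A few more generated features can be sketched as below. The Order class and its fields are hypothetical and only serve the illustration; field, default_factory, and asdict are part of the standard dataclasses module.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class Order:
    user: str
    # Mutable defaults must use default_factory so that every
    # instance gets its own fresh list.
    items: list = field(default_factory=list)
    total: float = 0.0


first = Order("Jack")
second = Order("Jack")
first.items.append("book")

print(first)             # generated __repr__
print(first == second)   # generated __eq__ compares field by field
print(asdict(second))    # convert an instance to a plain dict
```

Writing `items: list = []` directly would raise a ValueError at class definition time, which is why default_factory is used for the list field.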

4. The role of generator and yield

Program logic is usually written procedurally, one step after another. The following simulates reading and processing a batch of files:

# Imports
import time
from collections import defaultdict
import datetime

# Simulate the processing of a single file
def count_words(filename):
    counts = defaultdict(int)
    time.sleep(1)  # assume the whole step takes one second
    return counts

# Take a list of files and apply the operation above to each one
def get_all_results(files):
    results = []
    for f in files:
        results.append(count_words(f))  # results are returned only after every file has been processed
    return results

# Post-process the collected data
def collect_results(files):

    for c in get_all_results(files):
        print('get one {}'.format(datetime.datetime.now()))  # print a timestamp each time one result is handled

if __name__ == '__main__':
    files = ['some_file'] * 10  # assume there are 10 files
    print('programming running at {}'.format(datetime.datetime.now()))  # time at which the program starts
    collect_results(files)  # start processing

Result:

[Figure: program output]

This approach has several problems:

1. No subsequent processing starts until every file has been read. With many files, the program appears stuck at the reading step.

2. If the program is interrupted or crashes while reading, nothing has been processed yet, and all files must be read again.

3. It uses a lot of memory, because the results for all files are held in one list at the same time.

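Using a generator with yield:

# Only the file-processing function needs to change
def get_all_results(files):

    for f in files:
        yield count_words(f)  # yield each result as soon as it is ready; this makes the function a generator

    # return (count_words(f) for f in files)  # alternatively, return a generator expression, which is also lazy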

Result:

[Figure: program output]

Explanation:

There is no need to wait for all files to be read: each file is processed as soon as it has been read. This saves memory and makes the program more robust, since an interruption does not discard the results that were already handled.

Extension:

map and filter work in a similar way: in Python 3 they return lazy iterators that compute each item only when it is requested.
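
The lazy behavior can be observed directly. The sketch below uses made-up helper names (squares, noisy_double) and records each computation in a list so that the order of events is visible.

```python
def squares(numbers, log):
    for n in numbers:
        log.append(f"computing {n}")   # record when the work actually happens
        yield n * n


log = []
gen = squares([1, 2, 3], log)
log_before = list(log)   # still empty: creating the generator runs no code
first = next(gen)        # runs the body up to the first yield
rest = list(gen)         # consumes the remaining items

calls = []


def noisy_double(x):
    calls.append(x)
    return 2 * x


lazy = map(noisy_double, [10, 20, 30])
calls_before = list(calls)   # still empty: map has not called the function yet
first_doubled = next(lazy)   # only now is noisy_double(10) called

print(log_before, first, rest)
print(calls_before, first_doubled, calls)
```

Nothing is computed until a value is requested, which is the property that lets the file-processing example handle each result as soon as it is available.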

5. Decorator

Purpose: Modify the behavior of a function without editing its body; the change can be removed again at any time by dropping the decorator.

Essence: fun = another(fun), abbreviated as @another placed above the function definition.

Code in practice:

# First define a function that modifies the function passed in
def memory(f):
    memory.cache = {}  # function attribute (shared by everything that memory decorates)

    def _wrap(n):
        if n in memory.cache:
            print('hit {}'.format(n))
            return memory.cache[n]
        else:
            r = f(n)
            memory.cache[n] = r
            return r
    return _wrap

# Implement the Fibonacci sequence
@memory
def fib(n):
    return fib(n - 1) + fib(n - 2) if n >= 2 else 1


if __name__ == '__main__':
    # fib = memory(fib)     # using the decorator is equivalent to running this line
    print(fib(10))

Explanation:

The decorator above acts as a cache (memoization): results that were already computed are looked up instead of being recomputed, so fib runs much more efficiently.
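
A decorator written with functools.wraps keeps the metadata of the original function and exposes it as __wrapped__, which is one way to bypass the change when needed. The sketch below uses a hypothetical count_calls decorator rather than the memory decorator from the text.

```python
import functools


def count_calls(func):
    @functools.wraps(func)            # copy __name__, __doc__, and set __wrapped__
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)

    wrapper.calls = 0
    return wrapper


@count_calls
def add(a, b):
    """Return a + b."""
    return a + b


result = add(1, 2)
print(result, add.calls, add.__name__)

# The undecorated function is still reachable and is not counted.
original_add = add.__wrapped__
unwrapped_result = original_add(3, 4)
print(unwrapped_result, add.calls)
```

Accepting *args and **kwargs in the wrapper lets the same decorator be applied to functions with any signature.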

6. PYTHONPATH environment variable

One environment variable is worth introducing here: PYTHONPATH.

To add the directory that contains a custom .py file to this variable, run the following in the shell:

export PYTHONPATH=${PWD}

After that, Python programs started from any other directory can import the modules and packages under that path directly.

Purpose: This is a convenient way to bring in external toolkits, avoiding duplicated project files and complicated file references.
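
The effect can be checked from Python itself. The sketch below is only an illustration: it writes a hypothetical module named my_tools into a temporary directory, then starts a child interpreter in a different directory with PYTHONPATH pointing at the first one.

```python
import os
import subprocess
import sys
import tempfile

with tempfile.TemporaryDirectory() as tool_dir, tempfile.TemporaryDirectory() as work_dir:
    # my_tools is a made-up module name used only for this example.
    with open(os.path.join(tool_dir, "my_tools.py"), "w") as fh:
        fh.write("def greet():\n    return 'hello from my_tools'\n")

    # Run a child interpreter from an unrelated working directory.
    env = dict(os.environ, PYTHONPATH=tool_dir)
    completed = subprocess.run(
        [sys.executable, "-c", "import my_tools; print(my_tools.greet())"],
        env=env,
        cwd=work_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    imported_output = completed.stdout.strip()

print(imported_output)
```

The import succeeds even though the child process runs in another directory, because the interpreter adds the PYTHONPATH entries to sys.path at startup.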

7. Some practical utility functions

(1) reduce

Purpose: apply a two-argument function cumulatively to the items of a collection, from left to right, reducing them to a single value.

Code:

from functools import reduce

some_lists = [
    [1, 2],
    [3, 5],
    [5, 6, 7, 1, 10.1, 11.1],
    [121.4, 11.34],
    [11.31, 1921, 321.],
]

print(reduce(lambda a, b: a + b, some_lists))

Explanation:

The code above concatenates several lists into one. reduce is very flexible: it works with any data type for which the two-argument function is defined.

Extension:

The operator module
```
import operator
```
This module exposes Python's built-in operators as ordinary functions, such as operator.add and operator.mul, which makes them convenient to pass to reduce. The full list of available functions is given in the module's documentation.
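
Combining reduce with the operator module might look like the following sketch. The variable names are illustrative only.

```python
import operator
from functools import reduce

# Multiply all items together: ((((1 * 2) * 3) * 4) * 5)
factorial_5 = reduce(operator.mul, [1, 2, 3, 4, 5])

# Concatenate lists; the initial value [] also covers an empty input.
flattened = reduce(operator.add, [[1, 2], [3, 5], [5, 6]], [])

# Any two-argument function works, for example keeping the larger item.
largest = reduce(lambda a, b: a if a > b else b, [3, 9, 4])

print(factorial_5, flattened, largest)
```

Supplying the optional third argument (the initial value) avoids the TypeError that reduce raises when it is given an empty sequence without one.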

(2) cache

Purpose:

It caches the results of repeated calls so that they are not recomputed, improving the program's efficiency.

Code example:

from functools import cache

@cache
def fib(n):
    return fib(n - 1) + fib(n - 2) if n > 2 else 1

Explanation: The code above implements the Fibonacci sequence; the cache decorator (available since Python 3.9) stores each result so that it is computed only once.

To limit how many results are cached, use lru_cache instead. cache itself is simply lru_cache(maxsize=None):

from functools import lru_cache

@lru_cache(maxsize=2**8)
def fib(n):
    return fib(n - 1) + fib(n - 2) if n > 2 else 1
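
Functions wrapped by lru_cache also expose cache_info() and cache_clear(), which help to check whether the cache is actually being used. A minimal sketch, reusing the same fib definition with maxsize=256:

```python
from functools import lru_cache


@lru_cache(maxsize=256)
def fib(n):
    return fib(n - 1) + fib(n - 2) if n > 2 else 1


value = fib(30)
info = fib.cache_info()   # named tuple: hits, misses, maxsize, currsize
print(value)
print(info.hits, info.misses, info.maxsize, info.currsize)

fib.cache_clear()         # drop all cached results
cleared_size = fib.cache_info().currsize
print(cleared_size)
```

Each distinct argument is computed once (a miss), and every repeated request is served from the cache (a hit), which turns the exponential recursion into a linear one.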

(3) partial

Purpose:

partial creates a partial function: a new function derived from an existing one with some of its arguments fixed in advance.

Code example:

from functools import partial

def load_info(id, name, age, sex):
    print(sex)

# Define the values of the preset arguments
id1_config = {
    "id": '001',
    "name": "hero"
}

# Create a new function with those arguments preset
load_info_1 = partial(load_info, **id1_config)
load_info_1(age=10, sex="male")

Explanation: This tool simplifies code. Instead of passing the same arguments on every call, you create one callable object with those arguments fixed and reuse it.
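
A few more properties of partial objects are sketched below. The power function and the derived names are hypothetical; func and keywords are documented attributes of partial objects.

```python
from functools import partial

# Fix a keyword argument of a built-in.
parse_binary = partial(int, base=2)
binary_value = parse_binary("1010")


def power(base, exponent):
    return base ** exponent


square = partial(power, exponent=2)
squares = [square(n) for n in range(4)]

# A partial object remembers the wrapped function and the preset arguments.
wrapped_function = square.func
preset_keywords = square.keywords

# Preset keyword arguments act like defaults and can still be overridden.
overridden = square(2, exponent=5)

print(binary_value, squares, preset_keywords, overridden)
```

Because the preset keywords behave like defaults rather than hard constraints, a caller can still override them for a single call.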

(4) singledispatch

Purpose:

singledispatch implements single dispatch: the implementation that runs is chosen by the type of the first argument. A function often has to accept several parameter types, and handling each type inside one function body means editing that body whenever a type is added or changed, which is hard to maintain. With single dispatch, each type gets its own registered implementation, so new behavior can be added without modifying the existing source code.

Code example:

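from functools import singledispatch

# Fallback implementation, used when no registered type matches
@singledispatch
def multiply(arg1, arg2):
    pass

# Dispatch looks only at the type of the first argument (arg1)
@multiply.register
def _(arg1: str, arg2: str):
    return arg1 + arg2

@multiply.register
def _(arg1: int, arg2: int):
    return arg1 * arg2

print(multiply(1, 2))  # 2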

Note: register() has supported type annotations since Python 3.7. In earlier versions, the type must be passed explicitly, as in @multiply.register(str).
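
The explicit form of register() works on all versions that provide singledispatch and can be stacked to share one implementation between several types. The describe function below is a hypothetical example, not part of the original code.

```python
from functools import singledispatch


@singledispatch
def describe(value):
    # Fallback for types without a registered implementation.
    return f"unsupported type: {type(value).__name__}"


@describe.register(int)
def _(value):
    return f"int: {value}"


# Stacking register() reuses one implementation for several types.
@describe.register(list)
@describe.register(tuple)
def _(value):
    return f"sequence of {len(value)} items"


print(describe(5))
print(describe((1, 2)))
print(describe([1, 2, 3]))
print(describe(2.5))
```

Returning a message or raising an error in the fallback makes unsupported types visible instead of silently returning None.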

Summary

The knowledge points covered in this article are:

exec
match case (Python 3.10)
dataclass (Python 3.7)
yield and generator
decorator
PYTHONPATH environment variable configuration
Utility functions: reduce, cache, partial, singledispatch

These techniques can be used in engineering projects to improve code, especially environment configuration and generators. The cache decorators store previously computed results so that the program runs more efficiently.