[Python] Improve Python running speed (general direction)

3.2 Improve the running speed of Python (general direction)

Now that we know how to measure the running time of Python code, the next step is to find ways to make it run faster. In general, the following approaches help improve the speed of Python code:

3.2.1 Try to run with multiple threads

Multithreading is a technique, supported in software or hardware, for executing multiple threads concurrently. On hardware with multi-threading support, a computer can execute more than one thread at a time, improving overall throughput.
In a program, these independently running fragments are called "threads" (Thread), and programming with them is called "multithreading". Multi-threading can significantly speed up certain kinds of code.
Note, however, that in CPython the global interpreter lock (GIL) prevents threads from executing Python bytecode in parallel, so threads mainly help I/O-bound tasks (network requests, file access); for CPU-bound work, the multiprocessing module is usually the better choice.
This technique will be covered in a future tutorial.
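As a minimal sketch of the idea (the task and delay values below are made up for illustration), the snippet simulates four I/O-bound tasks with time.sleep and runs them first sequentially, then in a thread pool:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def io_task(delay=0.05):
    # Stand-in for an I/O-bound operation, e.g. a network request
    time.sleep(delay)
    return delay

def run_sequential(n=4):
    start = time.perf_counter()
    for _ in range(n):
        io_task()
    return time.perf_counter() - start

def run_threaded(n=4):
    start = time.perf_counter()
    # Each task spends its time waiting, so the threads overlap almost fully
    with ThreadPoolExecutor(max_workers=n) as pool:
        list(pool.map(lambda _: io_task(), range(n)))
    return time.perf_counter() - start

print(f'sequential: {run_sequential():.3f}s')
print(f'threaded:   {run_threaded():.3f}s')
```

Because the tasks sleep (wait) rather than compute, the threaded version finishes in roughly the time of one task instead of four; for a CPU-bound loop the GIL would erase this advantage.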

3.2.2 Optimize loop processing

Loops come up constantly when writing code. Because a large loop body may execute thousands of times or more, optimizing even a small code fragment inside the loop can greatly improve the overall running speed.
Some practical and common optimization tips will be introduced in detail later.
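One common micro-optimization of this kind (the function names here are my own): hoist lookups that do not change out of the loop, binding them to local names once instead of resolving them on every iteration:

```python
import math
import timeit

values = list(range(1000))

def slow_loop():
    result = []
    for v in values:
        # math.sqrt is looked up through the module on every iteration,
        # and result.append is re-resolved each time as well
        result.append(math.sqrt(v))
    return result

def fast_loop():
    # Hoist the lookups out of the loop: bind them to local names once
    sqrt = math.sqrt
    result = []
    append = result.append
    for v in values:
        append(sqrt(v))
    return result

print(timeit.timeit(slow_loop, number=1000))
print(timeit.timeit(fast_loop, number=1000))
```

Both functions produce identical results; the second simply avoids repeated attribute lookups inside the hot loop.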

3.2.3 Using built-in modules, functions and data types

Python's built-in modules and functions are installed along with Python itself. They are not only easy to import, but also, being implemented in C, generally run much faster than equivalent code written by hand in Python.
Speed comparison: built-in modules

from math import factorial
from time import time
from timeit import timeit

def slow(n=100):
    if n == 0 or n == 1:
        return 1
    else:
        return n * slow(n - 1)

def fast():
    factorial(100)

b1 = time()
timeit(slow, number=100_000)
e1 = time()
print(f'Time for function slow to run 100,000 times: {e1 - b1}')
b2 = time()
timeit(fast, number=100_000)
e2 = time()
print(f'Time for function fast to run 100,000 times: {e2 - b2}')

The time for function slow to run 100,000 times is: 1.9230999946594238
The time for function fast to run 100,000 times is: 0.1594223976135254

Speed comparison: built-in functions

from mdTools import ftDecTimeIt  # a timing decorator written by the author

@ftDecTimeIt(1_000_000)
def slow():
    new_list = []
    word_list = ["i", "am", "a", "bad", "boy"]
    for word in word_list:
        new_list.append(word.capitalize())

@ftDecTimeIt(1_000_000)
def fast():
    word_list = ["i", "am", "a", "bad", "boy"]
    new_list = list(map(str.capitalize, word_list))

slow()
fast()

The total time taken to run the function slow 1,000,000 times is: 0.9304 seconds
The total time taken to run the function fast 1,000,000 times is: 0.8292 seconds

Speed comparison: built-in methods

from mdTools import ftDecTimeIt

@ftDecTimeIt(1_000_000)
def slow():
    new_list = ""
    word_list = ["i", "am", "a", "bad", "boy"]
    for word in word_list:
        new_list += word.capitalize()

@ftDecTimeIt(1_000_000)
def fast():
    word_list = ["i", "am", "a", "bad", "boy"]
    new_list = ''.join(map(str.capitalize, word_list))

slow()
fast()

The total time taken to run the function slow 1,000,000 times is: 1.0097 seconds
The total time taken to run the function fast 1,000,000 times is: 0.2878 seconds

3.2.4 Using a newer Python version

The Python language is continuously updated, and each release includes optimizations: to built-in functions and methods, to the standard library, new syntax and modules, and to the runtime itself. As of this writing, the latest release, Python 3.11, runs roughly 10-60% faster than Python 3.10 (about 25% faster on average on the standard benchmark suite).

3.2.5 Use lru_cache to cache data

When a result can be saved, do not compute it repeatedly. If a function is called frequently and returns predictable results for the same arguments, caching those results in memory pays off: a subsequent call with the same arguments returns immediately.
Python's standard library provides the decorator @functools.lru_cache, which caches a function's most recent calls. It is very useful when the cached value stays valid for a period of time, for example in recursive problems.
Speed comparison: caching data

from functools import lru_cache
from time import time
from timeit import timeit

def slow(n=10):
    if n == 1:
        return 1
    if n == 2:
        return 2
    else:
        return slow(n-2) + slow(n-1)

@lru_cache()
def fast(n=10):
    if n == 1:
        return 1
    if n == 2:
        return 2
    else:
        return fast(n - 2) + fast(n - 1)

b1 = time()
timeit(slow, number=100_000)
e1 = time()
print(f'Time for function slow to run 100,000 times: {e1 - b1}')
b2 = time()
timeit(fast, number=100_000)
e2 = time()
print(f'Time for function fast to run 100,000 times: {e2 - b2}')

The time for function slow to run 100,000 times is: 1.344538927078247
The time for function fast to run 100,000 times is: 0.007016420364379883
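@lru_cache also accepts a maxsize parameter bounding the cache, and the decorated function gains a cache_info() method for inspecting hits and misses. A small self-contained illustration (the fib function below is my own example, not from the text above):

```python
from functools import lru_cache

@lru_cache(maxsize=None)  # maxsize=None means an unbounded cache
def fib(n):
    # Standard memoized Fibonacci: each fib(k) is computed only once
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

print(fib(30))
print(fib.cache_info())  # shows hits, misses, and current cache size
```

After one call to fib(30), cache_info() reports many hits: every subproblem was answered from the cache rather than recomputed.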

3.2.6 Using specialized third-party libraries

We all know that specialists do specialist work best: they know more, have more experience, and handle problems more comfortably than generalists. Likewise, in Python programming, using a dedicated library where one exists is usually both faster and more convenient. The numba library provides the jit decorator, which JIT-compiles the function it decorates into machine code and returns a wrapper object through which Python can call that machine code.
Speed comparison: the numba library

from mdTools import ftDecTimeIt
from numba import jit

@ftDecTimeIt(1)
def slow(x=1, y=100_000_000):
    s = 0
    for i in range(x, y):
        s += i
    return s

@ftDecTimeIt(1)
@jit
def fast(x=1, y=100_000_000):
    s = 0
    for i in range(x, y):
        s += i
    return s

slow()
fast()

The total time spent running the function slow once is: 5.2450 seconds
The total time spent running the function fast once is: 0.2750 seconds

Speed comparison: the numpy library
If you need to work with very large amounts of data and perform calculations on them efficiently, the numpy library is an excellent choice. Numpy implements its key routines in C, so it processes arrays faster than native Python and stores data more compactly.
Numpy also makes it convenient and fast to create large arrays. Consider the following example:

from mdTools import ftDecTimeIt
import numpy as np

array = np.random.random(100_000_000)

@ftDecTimeIt(1)
def slow():
    sum(array)

@ftDecTimeIt(1)
def fast():
    np.sum(array)

slow()
fast()

The total time spent running the function slow once is: 8.1115 seconds
The total time spent running the function fast once is: 0.1173 seconds

Origin blog.csdn.net/crleep/article/details/131163229