Python dictionaries, lists, generators

Python notes

The efficiency of a programming language usually means two things: development efficiency and runtime efficiency. Different languages emphasize different ones, and Python clearly cares more about development efficiency. Judging from my experience practicing coding problems, Python runs anywhere from several times to several tens of times slower than C or C++, depending on the case. As a programmer, you should know not only that this happens but also why. Some reasons are listed below (each one goes quite deep, and I don't fully understand all of them yet):
First: Python is a dynamic language. The type of the object a variable refers to is determined only at runtime, so the compiler cannot predict it and has little room to optimize. A simple example: in r = a + b, a and b are added, but their types are known only at runtime. Addition is handled differently for different types, so every time the statement runs, the interpreter has to check the types of a and b and then perform the corresponding operation. In static languages such as C++ or Java, the code to run is decided at compile time.
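To make the runtime type check concrete, here is a minimal sketch. The function name add is only for illustration; the point is that the same a + b expression does different work depending on the types it meets at runtime.

```python
def add(a, b):
    # The same expression is dispatched at runtime based on the operand types.
    return a + b

int_sum = add(1, 2)            # integer addition
str_sum = add("py", "thon")    # string concatenation
list_sum = add([1], [2, 3])    # list concatenation

try:
    add(1, "2")                # a type mismatch is only detected at runtime
    mixed_ok = True
except TypeError:
    mixed_ok = False
```

A statically typed language would reject the last call at compile time; Python only finds out when the line actually runs.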
Second: Python is interpreted, and the standard interpreter (CPython) does not use a JIT compiler by default. (I plan to try the PyPy interpreter later to speed things up; PyPy includes this kind of just-in-time compiler.)
Third: Everything in Python is an object, and every object has a reference count that must be maintained, which adds extra work.
Fourth: the Python GIL (Global Interpreter Lock). The GIL is essentially a mutex, and all mutexes work the same way: they turn concurrent execution into serial execution so that shared data can be modified by only one task at a time, which keeps the data safe. Because of the GIL, only one thread in a process executes Python bytecode at any given moment, so threads cannot take advantage of multiple cores. The GIL protects data at the interpreter level only; to protect your own data, you need to add locks yourself.
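The last point, locking your own data, can be sketched as follows. The names state, state_lock, and increment are hypothetical; the sketch assumes four threads that all update one shared counter.

```python
import threading

state = {"count": 0}              # shared data owned by the program, not the interpreter
state_lock = threading.Lock()

def increment(times):
    for _ in range(times):
        # The update is a read-modify-write sequence, so guard it explicitly.
        with state_lock:
            state["count"] += 1

threads = [threading.Thread(target=increment, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

final_count = state["count"]
```

With the lock in place the final count is always 4 * 10,000. Without it, the result would depend on how the interpreter happens to switch between threads.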
Fifth: Garbage collection. Everything in Python is an object, and an object's life ends when its reference count drops to 0.
Advantages of the reference counting mechanism: 1. It is simple. 2. It is immediate: as soon as an object has no references left, its memory is released, with no need to wait for a particular moment as in other mechanisms. This immediacy brings another benefit: the cost of reclaiming memory is spread across normal execution.
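The immediacy described above can be observed in CPython with a small sketch. Payload is a hypothetical placeholder class (plain lists cannot be weakly referenced), and the behavior shown is specific to CPython's reference counting; other interpreters may free objects later.

```python
import sys
import weakref

class Payload:
    """Hypothetical placeholder class used only for this demonstration."""

obj = Payload()
alias = obj                                  # a second strong reference
count_with_alias = sys.getrefcount(obj)      # includes a temporary reference for the call itself
del alias
count_without_alias = sys.getrefcount(obj)

probe = weakref.ref(obj)    # a weak reference does not keep the object alive
del obj                     # the last strong reference is gone
freed_immediately = probe() is None
```

The weak reference goes dead as soon as the last strong reference is deleted, without any explicit collection step.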
Disadvantages of the reference counting mechanism:
1. Maintaining reference counting consumes resources
2. Circular references:
list1 = []
list2 = []
list1.append(list2)
list2.append(list1)
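list1 and list2 refer to each other. Even if no other object refers to them, their reference counts stay at 1, so reference counting alone can never reclaim the memory they occupy. (CPython supplements reference counting with a cycle-detecting garbage collector, exposed through the gc module, to handle this case.)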
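Here is a rough sketch of the same situation in CPython. With reference counting alone the cycle would stay alive; CPython's gc module is what eventually frees it. Node and make_cycle are hypothetical names, and a small class stands in for the lists because lists cannot be weakly referenced.

```python
import gc
import weakref

class Node:
    def __init__(self):
        self.other = None

def make_cycle():
    a, b = Node(), Node()
    a.other, b.other = b, a      # a and b now refer to each other
    return weakref.ref(a)        # lets us observe a without keeping it alive

gc.disable()                     # rely on reference counting only for a moment
probe = make_cycle()             # a and b are unreachable, but their counts are still 1
alive_before = probe() is not None

gc.collect()                     # an explicit cycle collection still works while disabled
alive_after = probe() is not None
gc.enable()
```

The object survives until the cycle collector runs, which is exactly the weakness of pure reference counting described above.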


1. Dictionaries and lists:

Python dictionaries are implemented as hash tables, so a lookup takes O(1) time on average. A list is really an array, so searching it means traversing the whole list, which is O(n). Membership tests and lookups are therefore faster on a dictionary than on a list. In practice, lookup = dict.fromkeys(items, True) is often used to convert a list into a dictionary and speed up queries.
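A minimal sketch of that conversion follows. The names items, lookup, and target are made up for the example, and the timings are only indicative since they depend on the machine.

```python
import timeit

items = list(range(100_000))
lookup = dict.fromkeys(items, True)    # keys are the list items, values are just True

target = 99_999                        # worst case for the list: the last element
found_in_list = target in items        # O(n) scan
found_in_dict = target in lookup       # O(1) average hash lookup

# Rough timing comparison; absolute numbers vary from machine to machine.
list_time = timeit.timeit(lambda: target in items, number=100)
dict_time = timeit.timeit(lambda: target in lookup, number=100)
```

If only membership matters and no values are needed, set(items) gives the same average O(1) lookup.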
2. Sets and lists:
To find the intersection, union, or difference of lists, convert them to sets first, for example set(lista) & set(listb).
A loop optimization that applies to all languages: compute the length once, outside the loop, instead of on every iteration.
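Both tips can be sketched briefly. lista and listb follow the names used above; the other names are illustrative.

```python
lista = [1, 2, 3, 4]
listb = [3, 4, 5]

common = set(lista) & set(listb)       # intersection
combined = set(lista) | set(listb)     # union
only_in_a = set(lista) - set(listb)    # difference

# Compute the length once, outside the loop, rather than in the loop condition.
n = len(lista)
i = 0
total = 0
while i < n:
    total += lista[i]
    i += 1
```

In Python, iterating directly with for x in lista is usually more idiomatic than an index loop; the while loop is shown only to illustrate hoisting the length calculation.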
3. String optimization:
Prefer join() over + for string concatenation. Python strings are immutable, so every operation such as concatenation or modification creates a new string object, which hurts performance.
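As a small illustration (the words list is invented for the example), both approaches give the same result, but join() builds it in a single pass:

```python
words = ["spam"] * 1000

# Repeated + creates a new string object on each iteration.
joined_with_plus = ""
for w in words:
    joined_with_plus += w

# join() assembles the final string from all the pieces at once.
joined_with_join = "".join(words)
```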
4. Use list comprehensions and generator expressions:
List comprehensions are the usual way to build lists when practicing coding problems. However, memory is limited, so a list can only grow so large; for large amounts of data, use a generator expression instead.
List comprehension: [expr for iter_var in iterable if cond_expr]
Generator expression: (expr for iter_var in iterable if cond_expr)
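The memory difference can be seen with a quick sketch. The variable names are illustrative, and sys.getsizeof reports only the size of the container object itself, not of the integers it holds.

```python
import sys

squares_list = [i * i for i in range(100_000) if i % 2 == 0]   # all values stored at once
squares_gen = (i * i for i in range(100_000) if i % 2 == 0)    # values produced on demand

list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)

total_from_list = sum(squares_list)
total_from_gen = sum(squares_gen)      # consumes the generator
leftover = list(squares_gen)           # a generator can be iterated only once
```

The generator stays small no matter how many values it yields, at the cost of being usable only once.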


a, b = b, a  # swap two variables
a = 'hello, world!'
a[::-1]  # reverse: '!dlrow ,olleh'
a = ['hello', 'world']
" ".join(a)  # join strings: 'hello world'
a = [1, 1, 1, 2, 3, 4, 4, 5]
a = list(set(a))  # remove duplicates from a list: [1, 2, 3, 4, 5]
# Copying a list
import copy
a = [1, 'a', ['x']]
# Shallow copy
b = copy.copy(a)
b = a[:]
b = list(a)  # using the list constructor
# Deep copy
b = copy.deepcopy(a)
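The difference between the two kinds of copy only shows up with nested objects. A minimal sketch, reusing the list from above (shallow and deep are illustrative names):

```python
import copy

a = [1, 'a', ['x']]
shallow = copy.copy(a)        # new outer list, but the inner list is shared
deep = copy.deepcopy(a)       # the inner list is copied as well

a[2].append('y')              # mutate the nested list through the original
a[0] = 99                     # rebind a top-level element of the original

shallow_inner = shallow[2]    # sees the appended 'y'
deep_inner = deep[2]          # unaffected
```

Rebinding a[0] affects neither copy, because both copies have their own outer list.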

odd_list = [i for i in range(1, 101) if i % 2 == 1]  # list comprehension
# Dict comprehension: quickly swap keys and values
mcase = {'a': 10, 'b': 34}
mcase_frequency = {v: k for k, v in mcase.items()}
# Reading and writing files
with open('/path/to/file', 'r') as f:
    data = f.read()  # for example; do something with f here...


Origin blog.csdn.net/Pioo_/article/details/107466588