Writing Quality Python Code: Effective Python (59 Specific Ways), Reading Notes on Items 36-38

Concurrency means that a computer appears to be doing many different things at the same time. By interleaving the execution of programs, it creates the illusion that those programs are running simultaneously.

Parallelism means that the computer really is doing many different things at the same time. A machine with multiple CPU cores can execute multiple programs simultaneously: each program's instructions run on a separate core, so all of them make forward progress at the same moment.

Within a single program, concurrency is a tool that lets programmers solve certain kinds of problems more easily.

The key difference between parallelism and concurrency is whether the program actually gets a speedup.

Writing concurrent programs in Python is relatively easy. Mechanisms such as system calls, subprocesses, and C extensions also let Python handle some work in parallel.

However, getting concurrent Python code to run in a truly parallel fashion is quite difficult.

We therefore need to understand the subtle differences between these situations so we can use the features Python offers in the most appropriate way.

 

Item 36: Use the subprocess module to manage child processes

Python provides robust libraries for running and managing child processes, which makes it a very good language for gluing together command-line utilities and other tools.

Child processes started from Python can run in parallel with the Python program itself, which lets a Python program use all of the machine's CPU cores and push its processing power as far as possible.
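As a rough illustration of that point, here is a minimal sketch of my own (not the book's example, and kept cross-platform by launching Python itself as the child): it starts five child processes with Popen, each of which sleeps for one second, yet the total elapsed time stays close to one second because the children all run at once.

import subprocess
import sys
from time import time

start = time()
procs = []
for _ in range(5):
    # Each child process just sleeps for one second.
    proc = subprocess.Popen([sys.executable, '-c', 'import time; time.sleep(1)'])
    procs.append(proc)

for proc in procs:
    proc.communicate()  # wait for each child to finish
end = time()
print('took %.3f seconds' % (end - start))  # roughly 1 second, not 5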

Running a child process with the subprocess module is fairly straightforward.

The example code for this item would not run on my Windows machine, so I skipped the book's version.
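Since the book's version did not run for me, here is a simple substitute sketch of my own that should also work on Windows: it starts a child Python process, captures its standard output through a pipe, and reads the result with communicate. (The book's echo example likely fails on Windows because echo is a shell built-in there, not a standalone executable.)

import subprocess
import sys

# Start a child Python process and capture what it prints to stdout.
proc = subprocess.Popen(
    [sys.executable, '-c', 'print("Hello from the child!")'],
    stdout=subprocess.PIPE)
out, _ = proc.communicate()  # wait for the child and collect its output
print(out.decode())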

Highlights:

1. The subprocess module can be used to run child processes and manage their input and output streams.

2. Child processes run in parallel with the Python interpreter, which lets the developer make full use of the CPU's processing power.

3. The timeout parameter can be passed to communicate() to avoid deadlocks and hung child processes (see the sketch below).
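For the third highlight, here is a small sketch of my own (not the book's exact code) showing how a timeout passed to communicate keeps the parent from hanging on an unresponsive child:

import subprocess
import sys

# A child that would sleep far longer than we are willing to wait.
proc = subprocess.Popen([sys.executable, '-c', 'import time; time.sleep(100)'])
try:
    proc.communicate(timeout=2)  # give up after 2 seconds
except subprocess.TimeoutExpired:
    proc.terminate()  # stop the unresponsive child
    proc.wait()       # reap it so it does not linger
print('Exit status', proc.poll())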

 

Item 37: Use threads for blocking I/O, but not for parallel computation

The standard implementation of Python is called CPython. CPython runs a Python program in two steps.

First, it parses the source text and compiles it into bytecode. Then it runs that bytecode on a stack-based interpreter.
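You can see that bytecode for yourself with the standard dis module; this small illustration is my own and not from the book:

import dis

def add(a, b):
    return a + b

# Prints the bytecode instructions the stack-based interpreter executes,
# e.g. LOAD_FAST, BINARY_ADD (BINARY_OP on newer versions), RETURN_VALUE.
dis.dis(add)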

While the program executes, the bytecode interpreter must maintain a coherent internal state.

Python uses a mechanism called the GIL (global interpreter lock) to guarantee that coherence.

 

The GIL is essentially a mutex that protects CPython from being disrupted by preemptive multithreaded switching.

Preemptive thread switching means that one thread can take control of the program by interrupting another thread.

If such an interruption happens at the wrong moment, it would corrupt the interpreter's state.

 

The GIL has one very significant negative side effect.

Although Python supports multiple threads of execution, the GIL ensures that only one of them makes forward progress at any given time.

This means that if you reach for threads hoping to do parallel computation and speed up your Python program, you will be sorely disappointed.

from time import time

def factorize(number):
    # Naive factorization: yield every divisor of number (purely CPU-bound work).
    for i in range(1, number + 1):
        if number % i == 0:
            yield i

numbers = [2139079, 1214759, 151673, 1852285]
start = time()
for number in numbers:
    list(factorize(number))
end = time()
print('took %.3f seconds' % (end - start))

took 0.659 seconds

 

from threading import Thread
from time import time

class FactorizeThread(Thread):
    # Runs factorize() for one number on its own thread.
    def __init__(self, number):
        super().__init__()
        self.number = number

    def run(self):
        self.factors = list(factorize(self.number))

start = time()
threads = []
for number in numbers:
    thread = FactorizeThread(number)
    thread.start()
    threads.append(thread)

for thread in threads:
    thread.join()
end = time()
print('took %.3f seconds' % (end - start))


took 0.645 seconds

 

These results show that in the standard CPython interpreter, multithreaded code is constrained by the GIL: the threads made the computation no faster.

If that is so, why does Python support threads at all?

First, threads make it easy for a program to appear to be doing several things at the same time.

Second, threads help with blocking I/O, which happens when Python performs certain kinds of system calls.

A system call is how a Python program asks the operating system to interact with the external environment on its behalf (reading and writing files, communicating over networks, talking to devices such as displays, and so on).

The operating system takes time to respond to such blocking requests, and threads let the developer isolate the Python program from these slow I/O operations.

 

The book's example code for this item did not run on Windows either: the select.select() call raised an error there.
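As a substitute for the book's select.select() example (select only accepts sockets on Windows), here is a sketch of my own that stands in for the slow system call with time.sleep. Running five of these blocking calls on five threads takes roughly the time of one call, not five, because each thread releases the GIL while it waits:

from threading import Thread
from time import sleep, time

def slow_systemcall():
    # Stand-in for a blocking I/O system call; the GIL is released while sleeping.
    sleep(0.1)

start = time()
threads = []
for _ in range(5):
    thread = Thread(target=slow_systemcall)
    thread.start()
    threads.append(thread)

for thread in threads:
    thread.join()
end = time()
print('took %.3f seconds' % (end - start))  # about 0.1 seconds, not 0.5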

 

While the GIL prevents Python code from running in parallel, it has no negative effect on system calls.

This is because a Python thread releases the GIL just before it makes a system call and only reacquires it once the call has finished, so the GIL does not get in the way of system calls.

 

Besides threads, there are other ways to handle blocking I/O, such as the built-in asyncio module.

Those approaches have significant advantages, but they require extra effort to refactor your code into a different execution model.

If you want to perform many blocking I/O operations at once without heavily modifying your program, threads are the simplest way to do it.

 

Highlights:

1. Because of the global interpreter lock, Python threads cannot execute bytecode in parallel across multiple CPU cores.

2. Despite the GIL, Python threads are still useful: they provide an easy way to appear to do several things at the same time.

3. Python threads can make multiple system calls in parallel, which lets a program perform blocking I/O at the same time as it does computation.

 

Item 38: Use Lock to prevent data races in threads

After learning about the global interpreter lock, many Python newcomers assume they no longer need mutexes in their own code.

That is not the case.

The GIL does not protect the code that developers write themselves.

It is true that only one Python thread runs at any given moment, but while that thread is operating on a data structure, another thread may interrupt it.

In other words, another thread can cut in between any two consecutive bytecode instructions executed by the interpreter.

This becomes dangerous when multiple threads access the same object at the same time.

Such an interruption can happen at any moment, and when it does it can corrupt the program's state and leave the affected data structures inconsistent.

 

from threading import Thread
from time import time

class Counter:
    def __init__(self):
        self.count = 0

    def increment(self, offset):
        # Not thread-safe: += is a read-modify-write sequence.
        self.count += offset

def worker(sensor_index, how_many, counter):
    for _ in range(how_many):
        counter.increment(1)

def run_threads(func, how_many, counter):
    threads = []
    for i in range(5):
        args = (i, how_many, counter)
        thread = Thread(target=func, args=args)
        threads.append(thread)
        thread.start()
    for thread in threads:
        thread.join()

how_many = 10**5
counter = Counter()
start = time()
run_threads(worker, how_many, counter)
end = time()
print('Counter should be %d, found %d' % (5 * how_many, counter.count))
print('took %.3f seconds' % (end - start))


Counter should be 500000, found 449003
took 0.133 seconds

To make sure every thread gets a fair chance to run, the Python interpreter gives each thread roughly equal processor time.

To enforce this, Python may pause a running thread at any point and let another thread continue.

The problem is that the developer cannot know exactly when Python will suspend a thread.

Some operations look atomic, yet Python may still suspend the thread halfway through them, which is exactly what produced the wrong count above.
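For example, the increment in the Counter above looks like a single step, but (as a rough sketch of what the book describes) counter.count += 1 breaks down into three separate operations, and a thread switch can land between any two of them:

# 'counter' is the Counter instance from the example above.
value = getattr(counter, 'count')  # 1. read the current value
# <-- another thread may run here and read the same stale value
result = value + 1                 # 2. compute the new value
setattr(counter, 'count', result)  # 3. write it back, silently discarding any
                                   #    update another thread made in between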

 

To prevent data races like this, Python's built-in threading module provides a robust set of tools that let developers protect their data structures from corruption.

The simplest and most useful of these is the Lock class, which works as a mutex.

We can use a mutex inside the Counter class so that multiple threads can read and update the count without corrupting its value.

Only one thread can hold the lock at a time.

The sample code uses the with statement to acquire and release the mutex, which makes it easy to see exactly which code runs while the thread holds the lock.

 

from threading import Thread, Lock
from time import time

class LockingCounter:
    def __init__(self):
        self.lock = Lock()
        self.count = 0

    def increment(self, offset):
        # The with statement acquires the lock and always releases it afterwards.
        with self.lock:
            self.count += offset

def worker(sensor_index, how_many, counter):
    for _ in range(how_many):
        counter.increment(1)

def run_threads(func, how_many, counter):
    threads = []
    for i in range(5):
        args = (i, how_many, counter)
        thread = Thread(target=func, args=args)
        threads.append(thread)
        thread.start()
    for thread in threads:
        thread.join()

how_many = 10**5
counter = LockingCounter()
start = time()
run_threads(worker, how_many, counter)
end = time()
print('Counter should be %d, found %d' % (5 * how_many, counter.count))
print('took %.3f seconds' % (end - start))


Counter should be 500000, found 500000
took 2.697 seconds

This is the result we wanted: the Lock has solved the data race.

(After adding the timing code, though, I found the locked version was roughly twenty times slower; I had expected a difference of only about five times.)

 

Highlights:

1. Python does have a global interpreter lock, but your own programs must still guard against multiple threads contending for the same data.

2. If you let multiple threads modify the same objects without locks, your program's data structures can be corrupted.

3. The built-in threading module provides the Lock class, which is Python's standard mutex implementation.

 

Origin www.cnblogs.com/tsxh/p/10781303.html