Python study notes 9 - multithreading and multiprocessing

1. Threads & Processes

To the operating system, a running task is a process. For example, opening a browser starts a browser process, opening Notepad starts a Notepad process, opening two Notepad windows starts two Notepad processes, and opening Word starts a Word process. A process is a collection of resources, such as memory and open files.

 

Some processes do more than one thing at a time. Word, for example, can handle typing, spell checking, and printing simultaneously. To do several things at once inside one process, you need to run several "subtasks" at the same time. These subtasks within a process are called threads.

Since each process has to do at least one thing, a process has at least one thread. A complex process like Word can have multiple threads, and they can execute at the same time. Multiple threads are scheduled the same way as multiple processes: the operating system switches between them quickly, so each one runs for a short moment in turn and they appear to run simultaneously. Truly simultaneous execution requires a multi-core CPU. A thread is the smallest unit of execution, and a process consists of at least one thread.
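
As a small illustration of "a process has at least one thread", the sketch below asks a running Python program about its own process and threads. It uses only the standard library; the variable names are chosen for this example.

```python
import os
import threading

# Every Python program runs inside one process, identified by its PID.
pid = os.getpid()

# The interpreter's initial thread, in which a script normally runs.
main_thread = threading.main_thread()

# Number of threads currently alive in this process (at least 1).
alive = threading.active_count()

print("process id:", pid)
print("main thread name:", main_thread.name)
print("threads alive:", alive)
```

Even a script that never starts a thread itself should report at least one thread alive: the main thread.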

When we do things, one person working alone is slow, and several people working together are faster. Programs are the same: to run faster, we use multiple processes or multiple threads. In Python, multithreading is criticized by many people. The reason is that the standard Python interpreter (CPython) has a global interpreter lock, the GIL, which lets only one thread execute Python code at a time, so the threads of one process cannot use multiple CPU cores in parallel. When the program runs, the threads still seem to run together, because they are executed alternately: task 1 runs for 0.01 seconds, then task 2 runs for 0.01 seconds, then task 3 runs for 0.01 seconds, and so on. The tasks actually take turns, but because the CPU is so fast, it feels as if they are all executing at the same time. Each switch from one task to another is called a context switch.
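
To see the effect of the GIL, one rough experiment is to time the same CPU-bound work run twice in a row and then run in two threads. This is only a sketch: the function `count_down` and the workload size `N` are made up for this example, it uses the threading module covered in the next section, and the timings will vary by machine and interpreter version.

```python
import threading
import time

def count_down(n):
    # Pure-Python, CPU-bound loop: the thread holds the GIL while it computes.
    while n > 0:
        n -= 1

N = 1_000_000  # hypothetical workload size

# Run the work twice, one call after the other.
start = time.perf_counter()
count_down(N)
count_down(N)
sequential = time.perf_counter() - start

# Run the same work in two threads.
t1 = threading.Thread(target=count_down, args=(N,))
t2 = threading.Thread(target=count_down, args=(N,))
start = time.perf_counter()
t1.start()
t2.start()
t1.join()
t2.join()
threaded = time.perf_counter() - start

print("sequential:  %.3f s" % sequential)
print("two threads: %.3f s" % threaded)
```

On a standard CPython build the two timings are usually close to each other, since the threads take turns instead of running in parallel.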

2. Multithreading. Multithreading in Python uses the threading module.

Here is a simple multithreading example:

    import threading
    import time

    def sayhi(num):  # The function each thread will run
        print("running on number:%s" % num)
        time.sleep(3)

    if __name__ == '__main__':
        t1 = threading.Thread(target=sayhi, args=(1,))  # Create a thread instance
        t2 = threading.Thread(target=sayhi, args=(2,))  # Create another thread instance
        t1.start()  # Start the first thread
        t2.start()  # Start the second thread

Here is another way to start threads: subclass threading.Thread and override its run method.

    import threading
    import time

    class MyThread(threading.Thread):
        def __init__(self, num):
            super().__init__()  # Initialize the base Thread class first
            self.num = num

        def run(self):  # The code each thread will run when start() is called
            print("running on number:%s" % self.num)
            time.sleep(3)

    if __name__ == '__main__':
        t1 = MyThread(1)
        t2 = MyThread(2)
        t1.start()
        t2.start()

The two approaches behave the same; they are just two ways of writing it. I personally prefer the first one, which is simpler.

Waiting for threads. When multiple threads are running, each thread runs independently and is not affected by the others. If you want to do something only after a thread has finished, you have to wait for it to complete. How? Call join on the thread, which blocks until that thread ends.

            import threading
            import time
            def run():
                print('qqq')
                time.sleep(1)
                print('done!')
            lis = []
            for i in range(5):
                t = threading.Thread(target=run)
                lis.append(t)
                t.start()
            for t in lis:
                t.join()
            print('over')
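
join also accepts an optional timeout in seconds, after which it returns whether or not the thread has finished; is_alive then tells you which case you are in. A minimal sketch, where the `worker` function and the durations are made up for illustration:

```python
import threading
import time

def worker():
    time.sleep(1)  # simulate one second of work

t = threading.Thread(target=worker)
t.start()

t.join(timeout=0.1)           # wait at most 0.1 seconds
still_running = t.is_alive()  # the worker needs about 1 s, so it has not finished

t.join()                      # now wait with no time limit
finished = not t.is_alive()

print("still running after 0.1 s:", still_running)
print("finished after join():", finished)
```

A timeout is useful when the main thread should not block forever on a worker that might hang.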

Daemon threads. What is a daemon thread? Think of yourself as a king (a non-daemon thread) with many servants (daemon threads) who exist only to serve you. Once you die, your servants are buried with you. In other words, when all non-daemon threads have finished, the program exits and any daemon threads still running are stopped.

            import threading
            import time
            def run():
                print('qqq')
                time.sleep(1)
                print('done!')
            for i in range(5):
                t = threading.Thread(target=run)
                t.daemon = True  # Mark as a daemon before start(); setDaemon() is deprecated since Python 3.10
                t.start()
            print('over')
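
One way to observe this behavior is to run a small script as a separate program and capture what it prints. The sketch below assumes Python 3.7 or later (for `capture_output`); the embedded script is a cut-down version of the example above with a single daemon thread.

```python
import subprocess
import sys
import textwrap

# A small script, run as a separate program so that its exit can be observed.
script = textwrap.dedent("""
    import threading
    import time

    def run():
        time.sleep(2)
        print('done!')

    t = threading.Thread(target=run)
    t.daemon = True   # must be set before start()
    t.start()
    print('over')
""")

result = subprocess.run(
    [sys.executable, "-c", script],
    capture_output=True,
    text=True,
)
output = result.stdout
print(output)
```

The captured output should contain 'over' but not 'done!', because the program exits as soon as the main thread finishes, without waiting for the daemon thread.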

Thread locks. When many threads operate on the same data at once, the result can be wrong. To prevent this, protect the data with a lock, so that only one thread can operate on it at a time.

        import threading
        from threading import Lock
        num = 0
        lock = Lock()#Apply for a lock
        def run():
            global num  # Modify the module-level num
            lock.acquire()#lock
            num+=1
            lock.release()#unlock
        
        lis = []
        for i in range(5):
            t = threading.Thread(target=run)
            t.start()
            lis.append(t)
        for t in lis:
            t.join()
        print('over',num)
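
A lock can also be used in a with statement, which acquires it on entry and releases it on exit, even if the code inside raises an exception. The sketch below is one possible variation on the example above: the `Counter` class and the iteration counts are invented for illustration.

```python
import threading

class Counter:
    """A counter whose increments are protected by a lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # "with" acquires the lock here and releases it when the block ends.
        with self._lock:
            self.value += 1

counter = Counter()

def add_many(times):
    for _ in range(times):
        counter.increment()

# Five threads, each adding 10000 to the shared counter.
threads = [threading.Thread(target=add_many, args=(10000,)) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print('over', counter.value)
```

Because every increment happens while the lock is held, the final value should always be 5 * 10000 = 50000.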

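Let's use a simple crawler to see the effect of multithreading. Most of a crawler's time is spent waiting for the network, so the threads can wait in parallel even with the GIL.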

        import threading
        import requests,time
        urls  ={
            "baidu":'http://www.baidu.com',
            "blog":'http://www.nnzhp.cn',
            "besttest":'http://www.besttest.cn',
            "taobao":"http://www.taobao.com",
            "jd":"http://www.jd.com",
        }
        def run(name,url):
            res = requests.get(url)
            with open(name+'.html','w',encoding=res.encoding) as fw:
                fw.write(res.text)
        
        
        start_time = time.time()
        lis = []
        for url in urls:
            t = threading.Thread(target=run,args=(url,urls[url]))
            t.start()
            lis.append(t)
        for t in lis:
            t.join()
        end_time = time.time()
        print('run time is %s'%(end_time-start_time))
        
        #The following is the execution time of a single thread
        # start_time = time.time()
        # for url in urls:
        #     run(url,urls[url])
        # end_time = time.time()
        # print('run time is %s'%(end_time-start_time))
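
The same pattern can be written with a thread pool from the standard library's concurrent.futures module, which is a different technique from the manual Thread list above. To keep the sketch runnable without network access, `fake_fetch` is a stand-in that sleeps instead of calling requests.get; the names come from the urls dictionary above.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Page names from the example above; time.sleep stands in for the network wait.
names = ["baidu", "blog", "besttest", "taobao", "jd"]

def fake_fetch(name):
    time.sleep(0.2)  # pretend the request takes 0.2 seconds
    return name + ".html"

start_time = time.time()
with ThreadPoolExecutor(max_workers=5) as pool:
    # map() calls fake_fetch for each name in a worker thread and
    # yields the results in the same order as the input.
    files = list(pool.map(fake_fetch, names))
elapsed = time.time() - start_time

print(files)
print('run time is %s' % elapsed)
```

With five workers the five simulated requests wait at the same time, so the total should be close to 0.2 seconds instead of 1 second. The pool also starts and joins the threads for you.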

3. Multiprocessing. As mentioned above, multithreading in Python cannot use multiple CPU cores. To use multiple cores, you must use multiple processes. Multiprocessing in Python uses the multiprocessing module.

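    from multiprocessing import Process
    import time

    def f(name):
        time.sleep(2)
        print('hello', name)

    # The guard is required on platforms that start child processes by
    # re-importing this module (for example Windows)
    if __name__ == '__main__':
        p = Process(target=f, args=('niu',))  # Create a child process
        p.start()  # Start it
        p.join()   # Wait for it to finish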
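
When the same function has to be applied to many inputs, the multiprocessing module also provides Pool, which spreads the calls over several worker processes. A minimal sketch, where the `square` function and the input list are placeholders for real CPU-bound work:

```python
from multiprocessing import Pool

def square(n):
    # CPU-bound work would go here; squaring keeps the sketch small.
    return n * n

if __name__ == '__main__':
    # Each worker is a separate process with its own interpreter and its own GIL.
    with Pool(processes=2) as pool:
        # map() splits the inputs among the workers and returns
        # the results in input order.
        results = pool.map(square, [1, 2, 3, 4])
    print(results)
```

Since each process has its own GIL, the workers can run on different CPU cores at the same time. Starting processes costs more than starting threads, so this pays off mainly for heavier computations.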
