Controlling the number of concurrent processes and threads in Python: a summary

I. Introduction
I originally wrote a script to brute-force passwords, but at one attempt per second, 2,220,000 passwords would take more than 25 days. I wanted to use multiple threads, but surely not 2.22 million of them at once? So I had to learn how to control the number of threads. The official documentation was hard to follow because its structure is not clear, and most of the articles I found online were not much better: they only show how to start every thread at once and never explain how to limit the count. In the end, by piecing together those articles and the official documentation, I worked out how to do it. I am posting it here so that I do not forget it later, and so that other beginners like me can use it as a reference.
First, the differences between processes and threads:
Address space: a thread is an execution unit within a process, and every process has at least one thread. The threads of a process share that process's address space, while each process has its own separate address space.
Resource ownership: the process is the unit of resource allocation and ownership; the threads within a process share that process's resources.
Scheduling: the thread is the basic unit of processor scheduling; the process is not.
Both can be executed concurrently.
If that is hard to follow, here is a simple analogy: a process is like one running program, and programs run concurrently without interfering with each other. Within a process, the work is carried out by one or more threads; they run concurrently because the CPU keeps switching back and forth between them, too quickly for you to notice.
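
The address-space point can be checked with a small experiment. The following is only a minimal sketch and is not part of the original script; the names `counter`, `bump`, and `compare` are made up for illustration. A change made by a thread to a module-level variable is visible to the main thread, while a child process only changes its own copy.

```python
import multiprocessing
import threading

counter = 0  # module-level state, used only for this illustration


def bump():
    # Increment the counter in whichever thread or process runs this function.
    global counter
    counter += 1


def compare():
    """Return the parent's counter after a thread, then a process, runs bump()."""
    global counter
    counter = 0

    t = threading.Thread(target=bump)
    t.start()
    t.join()
    after_thread = counter  # the thread shares our address space

    p = multiprocessing.Process(target=bump)
    p.start()
    p.join()
    after_process = counter  # the child changed its own copy, not ours

    return after_thread, after_process


if __name__ == "__main__":
    print(compare())
```

The thread's increment shows up in the parent, so the first value is 1; the process's increment does not, so the second value is still 1 rather than 2.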

Take the difficulty I ran into above as an example: a large amount of data has to go through the same processing, and each operation may spend some of its time waiting, so running them one after another wastes a lot of time. To run them at the same time, there are two parallel approaches:

Parallel processes, or parallel threads
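
To get a feel for how much time the waiting costs, here is a rough sketch that compares running wait-dominated work one item at a time against running it on a bounded number of threads. It uses `concurrent.futures.ThreadPoolExecutor` from the standard library, which is a different tool from the ones used in the sections below, and `fake_attempt`, `run_serial`, `run_threaded`, and `timed` are hypothetical names chosen for this example.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_attempt(x):
    # Hypothetical stand-in for one operation that spends its time waiting.
    time.sleep(0.05)
    return x * 2


def run_serial(items):
    # One item after another: total time is roughly len(items) * 0.05 s.
    return [fake_attempt(x) for x in items]


def run_threaded(items, max_workers):
    # At most max_workers threads exist at once; the remaining items wait.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fake_attempt, items))


def timed(func, *args):
    # Return the function's result together with the elapsed wall-clock time.
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    items = list(range(12))
    serial_result, serial_time = timed(run_serial, items)
    threaded_result, threaded_time = timed(run_threaded, items, 4)
    print("serial: %.2fs, 4 threads: %.2fs" % (serial_time, threaded_time))
```

Both versions return the same results; only the elapsed time differs, because the threaded version overlaps the waiting.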

Each has advantages and disadvantages depending on the situation, and neither is always better, so I will not discuss that here. Below are the two Python approaches to parallel processing (I think the comments are detailed enough, so I will not say more).

II. The process approach

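import multiprocessing
import os
import random
from time import sleep

#
# Requirement: a large batch of data must be processed by repeatedly calling one function (for example, brute-forcing passwords). Starting everything at once would create far too many processes, so only m run in parallel here while the rest wait.
#
lock = None  # the shared lock; set in each worker by init_worker

def init_worker(shared_lock):
    # Hand every worker the same lock; a module-level Lock is not shared when workers are spawned rather than forked
    global lock
    lock = shared_lock

def a(x):  # simulates the function that has to be run repeatedly
    lock.acquire()  # lock while printing, otherwise output from simultaneous processes is interleaved and unreadable
    print('Process started:', os.getpid(), 'simulated run time:', x)
    lock.release()

    sleep(x)  # simulate the actual work

    lock.acquire()
    print('Process finished:', os.getpid(), '(the next task is expected to reuse this process ID)')
    lock.release()

if __name__ == '__main__':  # required where worker processes are spawned, e.g. on Windows
    inputs = []
    for i in range(10):  # build a list of random numbers to simulate the input of each call; 10 items in total here
        inputs.append(random.randint(1, 10))

    # Create the process pool, limiting the number of parallel processes to 3
    pool = multiprocessing.Pool(processes=3, initializer=init_worker, initargs=(multiprocessing.Lock(),))
    pool.map(a, inputs)  # call a on every item; the argument must be iterable, because map iterates over it and hands each item to a worker in the pool
    pool.close()
    pool.join()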

Output: [screenshot not available]
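
The remark about process IDs being reused can be checked directly. In this sketch (the helper names `work` and `pids_used` are made up here, and the short sleep only simulates work), each task returns the PID of the worker that ran it. With `processes=3` the pool should never use more than three distinct PIDs, however many tasks there are, and `pool.map` returns the results in input order.

```python
import multiprocessing
import os
from time import sleep


def work(x):
    # Return the input together with the PID of the worker that handled it.
    sleep(0.01)
    return x, os.getpid()


def pids_used(n_tasks, n_procs):
    """Run n_tasks through a pool of n_procs workers and collect the results."""
    with multiprocessing.Pool(processes=n_procs) as pool:
        results = pool.map(work, range(n_tasks))
    inputs = [x for x, _ in results]
    pids = {pid for _, pid in results}
    return inputs, pids


if __name__ == "__main__":
    inputs, pids = pids_used(10, 3)
    print(inputs)     # results come back in input order
    print(len(pids))  # number of distinct worker PIDs, at most 3
```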
III. The threading approach

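import queue
import random
import threading
from time import sleep

#
# Requirement: a large batch of data must be processed by repeatedly calling one function (for example, brute-forcing passwords). Starting everything at once would mean far too many threads working, so only m run in parallel here while the rest wait.
#
# Subclass Thread and do the single repeated operation in the run method
class Test(threading.Thread):
    def __init__(self, work_queue, lock, num):
        # Takes a queue, a thread lock, and the semaphore that sets the number of parallel threads
        threading.Thread.__init__(self)
        self.queue = work_queue
        self.lock = lock
        self.num = num

    def run(self):
        # while True:  # without threading.Semaphore all threads start working at once, and with this loop the threads are still alive after the work is done, as the final print(threading.enumerate()) shows
        with self.num:  # only the specified number of threads run this block at once; each thread ends when it finishes
            # The single operation that needs repeating goes below
            n = self.queue.get()  # wait for an item to arrive in the queue
            with self.lock:  # lock so that simultaneous output does not get jumbled
                print('Thread started:', self.name, 'simulated run time:', n)
                print('Items left in queue:', self.queue.qsize())
                print(threading.enumerate())
            sleep(n)  # perform the single operation; sleep simulates the work here
            self.queue.task_done()  # signal that this queue item has been processed

threads = []
work_queue = queue.Queue()
lock = threading.Lock()
num = threading.Semaphore(3)  # allow 3 threads to work at the same time; the others wait
# Start all the threads
for i in range(10):  # total number of operations to run
    t = Test(work_queue, lock, num)
    t.start()
    threads.append(t)
    # Put an input on the queue; run() waits for it before doing its work. A separate for loop below would also do, but this saves a loop
    n = random.randint(1, 10)
    work_queue.put(n)  # simulate a different input for each call
# Alternative: put the inputs on the queue in a separate loop
# for t in threads:
#     n = random.randint(1, 10)
#     work_queue.put(n)
# Wait for the threads to finish
for t in threads:
    t.join()
work_queue.join()  # wait until every queued item has been processed; otherwise the statements below could run before the threads have taken their items
print('All done')
print(threading.active_count())
print(threading.enumerate())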

Output: [screenshot not available]
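
One thing to keep in mind: the script above still creates one Thread object per input and uses the semaphore only to limit how many of them are working at once. For millions of inputs, a possible alternative is to start a fixed number of worker threads that each keep pulling inputs from the queue until it is empty. The following is a sketch of that variant under my own assumptions, not the method used above; `run_workers`, `handle`, and the `state` bookkeeping are names invented for this example.

```python
import queue
import threading
from time import sleep


def run_workers(items, n_workers, handle):
    """Process items with exactly n_workers threads; return (results, peak)."""
    q = queue.Queue()
    for item in items:
        q.put(item)

    lock = threading.Lock()
    results = []
    state = {"active": 0, "peak": 0}  # bookkeeping for this demo only

    def worker():
        while True:
            try:
                item = q.get_nowait()  # every item was queued before start
            except queue.Empty:
                return  # nothing left, so this thread finishes
            with lock:
                state["active"] += 1
                state["peak"] = max(state["peak"], state["active"])
            value = handle(item)
            with lock:
                state["active"] -= 1
                results.append(value)

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, state["peak"]


def handle(n):
    # Hypothetical stand-in for one attempt; sleep simulates the waiting.
    sleep(0.01)
    return n * n


if __name__ == "__main__":
    results, peak = run_workers(range(10), 3, handle)
    print(sorted(results), peak)
```

Here the number of threads is fixed by `n_workers` regardless of how many inputs there are, so the peak number of simultaneously active operations can never exceed it.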


Origin blog.csdn.net/haoxun02/article/details/104216320