day50-1 producer-consumer model

Producer-consumer model

A model is a fixed routine for solving a particular kind of problem.

Producer: the party that generates data

Consumer: the party that processes data

What problem does it solve?

  • Example
    • The restaurant kitchen is the producer
    • We, the diners, are the consumers

Suppose the chef can only hold one dish at a time: while he is cooking we have to wait, and while we are eating he has to wait as well.

This is inefficient. Both sides end up waiting for each other because they work at different speeds.

Specific solutions:

  1. First decouple the two sides, so that different processes are responsible for different tasks
  2. Provide a shared container that balances the speed of the two sides; each side only needs to work with the container. A queue is recommended, because a queue can pass data between processes

Example:

from multiprocessing import Process, Queue
import requests, re, time, random, os

def product(urls, q):
    '''Producer: crawl the data'''
    for ind, url in enumerate(urls):
        response = requests.get(url)
        response.encoding = response.apparent_encoding
        q.put(response.text)
        print(f'Site {ind+1}: status code {response.status_code}, process id {os.getpid()}')


def customer(q):
    '''Consumer: process the data'''
    i = 0
    while True:
        text = q.get()  # blocks until the producer has put something in
        time.sleep(random.random())  # simulate processing time
        res = re.findall('src=//(.*?) width', text)
        i += 1
        print("Task %s: found %s img, %s characters" % (i, len(res), len(text)))


if __name__ == '__main__':
    urls = [
        'http://www.baidu.com',
        'http://www.jd.com',
        'http://www.taobao.com',
    ]
    q = Queue()
    p = Process(target=product, args=(urls, q))
    p.start()

    c = Process(target=customer, args=(q,))
    c.start()

Output:

Site 1: status code 200, process id 1672
Task 1: found 1 img, 2287 characters
Site 2: status code 200, process id 1672
Task 2: found 0 img, 90221 characters
Site 3: status code 200, process id 1672
Task 3: found 0 img, 141513 characters
...

There is a problem, though: the consumer does not know when to stop.

With a single producer, you can put None into q once production is finished, as an end-of-production marker. With multiple producers this approach breaks down, because the first None would stop the consumer while the other producers may still be working.
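
The single-producer case can be sketched roughly as follows. This is a minimal illustration, not the original author's code: the names producer, consumer and run, the extra results queue, and the upper-casing that stands in for "processing" are all assumptions made for the example.

```python
from multiprocessing import Process, Queue


def producer(q, items):
    """Put every item into the queue, then a None marker to signal the end."""
    for item in items:
        q.put(item)
    q.put(None)  # end-of-production marker (only safe with ONE producer)


def consumer(q, results):
    """Process items until the None marker arrives, then stop."""
    while True:
        item = q.get()
        if item is None:  # the producer has finished
            break
        results.put(item.upper())  # hypothetical stand-in for real processing


def run(items):
    """Hypothetical helper: wire one producer and one consumer together."""
    q, results = Queue(), Queue()
    p = Process(target=producer, args=(q, items))
    c = Process(target=consumer, args=(q, results))
    p.start()
    c.start()
    # Read the results before joining, so the consumer is never stuck
    # flushing data that nobody is reading.
    out = [results.get() for _ in items]
    p.join()
    c.join()
    return out


if __name__ == '__main__':
    print(run(['a', 'b', 'c']))
```

Here the consumer's loop ends by itself, so both processes can be joined normally. The sketch relies on there being exactly one None per consumer, which is why it does not carry over directly to several producers.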

JoinableQueue

  • Inherits from Queue; usage is the same
  • Adds join (wait) and task_done (mark a task as complete)

join is a blocking function: it blocks until the number of task_done calls equals the number of elements put into the queue, and then returns, which means every task in the queue has been processed.
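
The bookkeeping behind join and task_done can be seen in a small single-process sketch. The helper name drain is an assumption made for the example; the queue methods are those of multiprocessing.JoinableQueue.

```python
from multiprocessing import JoinableQueue


def drain(q, n):
    """Take n items off the queue, marking each one as done."""
    handled = []
    for _ in range(n):
        item = q.get()
        handled.append(item)
        q.task_done()  # one task_done for every successful get
    return handled


if __name__ == '__main__':
    q = JoinableQueue()
    for i in range(3):
        q.put(i)  # three unfinished tasks

    print(drain(q, 3))  # three task_done calls
    q.join()  # returns at once: the counts match

    try:
        q.task_done()  # one call too many
    except ValueError as exc:
        print("error:", exc)
```

If drain called task_done fewer times than put was called, q.join() would block forever, so every get should be paired with exactly one task_done.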

Example:

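from multiprocessing import Process, JoinableQueue
import time, random

# How do we know that today's hot dogs have really all been eaten?
#   1. Make sure the producers have finished their work
#   2. Make sure all the data they produced has been processed

# Produce hot dogs
def product(q, name):
    for i in range(3):
        dog = f"{name} hot dog {i + 1}"
        time.sleep(random.random())
        print("Produced", dog)
        q.put(dog)


# Eat hot dogs
def customer(q):
    while True:
        dog = q.get()
        time.sleep(random.random())
        print("Consumed %s" % dog)
        q.task_done()  # mark this task as processed


if __name__ == '__main__':
    # Create a container that both sides can share
    q = JoinableQueue()

    # Producer processes
    p1 = Process(target=product, args=(q, "Shanghai branch"))
    p2 = Process(target=product, args=(q, "Beijing branch"))

    p1.start()
    p2.start()

    # Consumer process
    c = Process(target=customer, args=(q,))
    # Daemon process: once the main process has confirmed that all tasks
    # are done, the consumer ends together with the main process
    c.daemon = True
    c.start()

    p1.join()
    p2.join()  # reaching this line means the producers have finished

    q.join()  # all tasks in the queue have been processed

    # c.terminate()  # alternatively, terminate the consumer process directly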

Message queues of all kinds (such as Redis-based message queues and MQ products) are typical applications of the producer-consumer model.

They are mainly used for traffic peak shaving, which keeps the server from collapsing under high concurrency.
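
The idea of smoothing out a traffic peak with a queue can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: time is divided into ticks, the function name simulate and its parameters are hypothetical, and no real message queue is involved.

```python
from collections import deque


def simulate(arrivals, capacity_per_tick):
    """Buffer bursty arrivals in a queue and drain at a fixed rate.

    arrivals[i] is the number of requests arriving in tick i.
    Returns the number of requests the server handled in each tick.
    """
    if capacity_per_tick <= 0:
        raise ValueError("capacity_per_tick must be positive")
    backlog = deque()  # the queue sitting in front of the server
    handled = []
    next_id = 0
    tick = 0
    while tick < len(arrivals) or backlog:
        # Producers: everything that arrives in this tick is queued at once.
        incoming = arrivals[tick] if tick < len(arrivals) else 0
        for _ in range(incoming):
            backlog.append(next_id)
            next_id += 1
        # Consumer: the server takes at most capacity_per_tick requests.
        done = 0
        while backlog and done < capacity_per_tick:
            backlog.popleft()
            done += 1
        handled.append(done)
        tick += 1
    return handled


if __name__ == '__main__':
    # A burst of 10 requests in the first tick, server capacity 3 per tick.
    print(simulate([10, 0, 0, 0], 3))
```

The server never sees more than capacity_per_tick requests in a tick, regardless of how large the burst is; the cost is that queued requests wait longer.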

Origin www.cnblogs.com/lucky75/p/11134837.html