Python Web Crawlers [Part 2]

I. Multiprocessing

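1. The fork method (os module, Unix/Linux systems only)

fork: called once, returns twice. The reason: the operating system copies the current process (the parent) to create a new process (the child). The two processes are almost identical, and fork returns in both of them. In the child it returns 0; in the parent it returns the child's process ID.

An ordinary function: called once, returns once.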

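import os

if __name__ == '__main__':
    # os.getpid() returns the ID of the current process
    print('current Process (%s) start ....' % os.getpid())
    try:
        pid = os.fork()
    except OSError:
        # In Python a failed fork raises OSError; it never returns a negative value
        print('error in fork')
    else:
        if pid == 0:
            # Child: fork returned 0; os.getppid() gives the parent's ID
            print('I am child process (%s) and my parent process is (%s)' % (os.getpid(), os.getppid()))
        else:
            # Parent: fork returned the child's ID
            print('I (%s) created a child process (%s).' % (os.getpid(), pid))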


Output:
current Process (3052) start ....
I (3052) created a child process (3053).
I am child process (3053) and my parent process is (3052)
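
The "returns twice" behaviour can also be checked programmatically. The following is a minimal sketch, not part of the original post, written for Python 3 on a Unix-like system. The helper name fork_and_report is hypothetical; it uses a pipe so the child can send its own PID back to the parent, and os.waitpid so the parent reaps the child.

```python
import os

def fork_and_report():
    """Hypothetical helper: fork once, return (pid seen by parent, pid reported by child, exit status)."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()  # returns 0 in the child, the child's PID in the parent
    if pid == 0:
        # Child branch: send our own PID through the pipe, then exit at once
        # so the child never continues into the caller's code.
        try:
            os.close(read_fd)
            os.write(write_fd, str(os.getpid()).encode())
            os.close(write_fd)
        finally:
            os._exit(0)
    # Parent branch: close the unused write end, then read until the child closes its end.
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        reported = int(reader.read())
    _, status = os.waitpid(pid, 0)  # reap the child so it does not linger as a zombie
    return pid, reported, status

if __name__ == '__main__':
    child_pid, reported_pid, status = fork_and_report()
    print('fork() returned %s in the parent; the child reported PID %s' % (child_pid, reported_pid))
```

The value fork returns in the parent should match the PID the child reports for itself, which is the same relationship shown in the output above.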

2. The multiprocessing module (cross-platform)

import os

# Import the Process class from the multiprocessing module
from multiprocessing import Process

def run_proc(name):
    print('Child process %s (%s) Running...' % (name, os.getpid()))

if __name__ == '__main__':
    print('Parent process %s.' % os.getpid())
    processes = []
    for i in range(5):
        p = Process(target=run_proc, args=(str(i),))
        print('Process will start.')
        # start() launches the child process
        p.start()
        processes.append(p)
    # join() blocks until a child finishes; join every child, not only the last one
    for p in processes:
        p.join()
    print('Process end')

Output:
Parent process 2392.
Process will start.
Process will start.
Process will start.
Process will start.
Process will start.
Child process 2 (10748) Running...
Child process 0 (5324) Running...
Child process 1 (3196) Running...
Child process 3 (4680) Running...
Child process 4 (10696) Running...
Process end

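3. The multiprocessing module (process pool)

Purpose: a Pool provides a specified number of processes for the user to call, defaulting to the number of CPU cores. When a new request is submitted to the Pool and the pool is not yet full, a new process is created to execute the request; otherwise the request waits until a worker process becomes free.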
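
The original post gives no code for this section, so here is a minimal sketch of how a pool might be used, written for Python 3. The names run_task, run_pool, num_workers and num_tasks are made up for illustration; Pool, apply_async, close, join and get are the standard multiprocessing calls.

```python
import os
from multiprocessing import Pool

def run_task(name):
    # Runs inside a worker process; returns the task name and that worker's PID
    return name, os.getpid()

def run_pool(num_workers=3, num_tasks=5):
    """Hypothetical helper: submit num_tasks tasks to a pool of num_workers processes."""
    pool = Pool(processes=num_workers)
    # apply_async submits a task without blocking and returns a result handle
    pending = [pool.apply_async(run_task, args=(i,)) for i in range(num_tasks)]
    pool.close()  # no further tasks may be submitted
    pool.join()   # wait for every worker to finish
    return [result.get() for result in pending]

if __name__ == '__main__':
    for name, pid in run_pool():
        print('Task %s ran in worker process %s' % (name, pid))
```

With 3 workers and 5 tasks, at most 3 distinct worker PIDs should appear, since the extra tasks wait for a free worker. Note that close() has to be called before join().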

Reposted from www.cnblogs.com/loser1949/p/9215498.html