Python: thread/threading

Background

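For one task I needed to import the data of a table in influxdb into elasticsearch,
and Python seemed like a convenient choice.
Sending the requests one after another turned out to be slow, so I decided to use multiple threads to speed it up.
I started with the thread module, but could not tell when the import had finished, so I switched to the threading module.
These are my notes.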

Environment:

Python 2.7.16

thread example

The thread module is simple to use. An example (thread_demo.py):

# coding=utf-8
import sys
import thread
import time

# The work each thread actually does
def work(threadID, delay=1):
    for i in xrange(0, 5):
        print "%s, %s" % ("#" + str(threadID), time.ctime(time.time()))
        time.sleep(delay)

try:
    thread.start_new_thread(work, (1, 1))
    thread.start_new_thread(work, (2, 2))
except Exception as e:
    print "error:", e
    sys.exit(1)

# Exit after 10 seconds. Without this line the main thread exits right away,
# the worker threads are cut off, and errors like these are reported:
# Unhandled exception in thread started by
# sys.excepthook is missing
# lost sys.stderr
time.sleep(10)

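The effect: thread 1 prints the current time once per second and thread 2 once every two seconds, five times each. After 10 seconds the main thread exits, by which point both workers have done all of their printing.
The output may look like this: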

#1, Sun Sep  8 09:42:10 2019
#2, Sun Sep  8 09:42:10 2019
#1, Sun Sep  8 09:42:11 2019
#1, Sun Sep  8 09:42:12 2019#2, Sun Sep  8 09:42:12 2019

#1, Sun Sep  8 09:42:13 2019
#1, Sun Sep  8 09:42:14 2019#2, Sun Sep  8 09:42:14 2019

#2, Sun Sep  8 09:42:16 2019
#2, Sun Sep  8 09:42:18 2019
[Finished in 10.1s]

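As you can see, on some lines the output of the two threads runs together. In Python 2 the print statement writes the text and the trailing newline as two separate steps, so another thread can get in between them.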
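One way to keep each line intact is to write the text and its newline while holding a lock. The following is a minimal sketch rather than part of the original script: print_lock and safe_print are hypothetical names, and it borrows threading.Lock from the threading module covered later in this post.

```python
from __future__ import print_function  # lets the sketch run on Python 2.7 and 3
import sys
import threading
import time

# Hypothetical name: one lock shared by every thread that prints.
print_lock = threading.Lock()

def safe_print(line, out=sys.stdout):
    # Write the text and the newline together while holding the lock,
    # so no other thread can write in the middle of the line.
    with print_lock:
        out.write(line + "\n")
        out.flush()

def work(thread_id, delay, count=5, out=sys.stdout):
    # Same loop as in thread_demo.py, but printing through safe_print.
    for _ in range(count):
        safe_print("#%s, %s" % (thread_id, time.ctime()), out)
        time.sleep(delay)
```

Starting work(1, 1) and work(2, 2) in two threads should then produce one complete line per print, at the cost of the threads briefly waiting for each other while printing.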

The same small program in Go (thread_demo.go):

package main

import (
    "fmt"
    "time"
)

func work(threadID int, delay int) {
    for i := 0; i < 5; i++ {
        fmt.Printf("#%d, %v\n", threadID, time.Now())
        time.Sleep(time.Second * time.Duration(delay))
    }
}

func main() {
    go work(1, 1)
    go work(2, 2)
    time.Sleep(time.Second * 10)
}

The output may be:

#1, 2019-09-08 09:44:11.2297827 +0800 CST m=+0.002990601
#2, 2019-09-08 09:44:11.2297827 +0800 CST m=+0.002990601
#1, 2019-09-08 09:44:12.2405974 +0800 CST m=+1.013805301
#2, 2019-09-08 09:44:13.2403188 +0800 CST m=+2.013526701
#1, 2019-09-08 09:44:13.2413163 +0800 CST m=+2.014524201
#1, 2019-09-08 09:44:14.2415413 +0800 CST m=+3.014749201
#2, 2019-09-08 09:44:15.2404524 +0800 CST m=+4.013660301
#1, 2019-09-08 09:44:15.2424471 +0800 CST m=+4.015655001
#2, 2019-09-08 09:44:17.24238 +0800 CST m=+6.015587901
#2, 2019-09-08 09:44:19.2427524 +0800 CST m=+8.015960301
> Elapsed: 11.039s

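As you can see, no two goroutines ever print onto the same line (run it a few times to check). Is that because "goroutines are non-preemptive"? (Pointers from readers are welcome.) A more likely explanation is that fmt.Printf formats the whole line, newline included, and passes it to os.Stdout in a single Write call, so there is no gap inside a line for another goroutine to write into.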

threading example

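The thread module has one problem: there is no elegant way to know when all of the threads have exited.
If what you need is to exit as soon as every thread has finished its work, then
threading is what you want.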
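For contrast, here is roughly what waiting looks like with only the low-level module: each worker gets a lock that the main thread acquires up front, and the worker releases it when it is done. This is a sketch under assumptions, not the code from the original task; run_and_wait and the results list are hypothetical, and the try/except import is only there so that it also runs on Python 3, where the module is called _thread.

```python
from __future__ import print_function
import time
try:
    import thread              # Python 2
except ImportError:
    import _thread as thread   # Python 3 name of the same low-level module

def work(thread_id, delay, done_lock, results):
    try:
        for i in range(3):
            results.append((thread_id, i))
            time.sleep(delay)
    finally:
        # Signal "this thread is finished" by releasing its lock.
        done_lock.release()

def run_and_wait(delays):
    # Hypothetical helper: start one worker per delay, then wait for all.
    results = []
    locks = []
    for thread_id, delay in enumerate(delays, 1):
        lock = thread.allocate_lock()
        lock.acquire()  # stays held until the worker releases it
        locks.append(lock)
        thread.start_new_thread(work, (thread_id, delay, lock, results))
    for lock in locks:
        lock.acquire()  # blocks until the matching worker has released
    return results
```

It works, but the bookkeeping is all manual, which is the part threading takes over with join().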

The task above, written with threading, looks roughly like this (threading_demo.py):

# coding=utf-8

import time
import threading

class WorkerThread(threading.Thread):
    def __init__(self, threadID, delay):
        threading.Thread.__init__(self)
        self.threadID = threadID
        self.delay = delay

    def run(self):
        self.work()

    def work(self):
        for i in xrange(0, 5):
            print "%s, %s" % ("#" + str(self.threadID), time.ctime(time.time()))
            time.sleep(self.delay)

t1 = WorkerThread(1, 1)
t2 = WorkerThread(2, 2)
threads = [t1, t2]

for t in threads:
    t.start()

for t in threads:
    t.join()

print "%s, %s" % ("ended at:", time.ctime(time.time()))

The output is similar to that of thread_demo.py, except that the program terminates as soon as both threads have finished their work.
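Subclassing threading.Thread is not required. When a plain function already does the work, it can be passed as target instead. A minimal sketch of the same demo in that style follows; run_all is a hypothetical helper name, and the code is written to run on both Python 2.7 and 3.

```python
from __future__ import print_function
import threading
import time

def work(thread_id, delay, count=5):
    for _ in range(count):
        print("#%s, %s" % (thread_id, time.ctime()))
        time.sleep(delay)

def run_all(specs, count=5):
    # specs is a list of (thread_id, delay) pairs.
    threads = [
        threading.Thread(target=work, args=(thread_id, delay, count))
        for thread_id, delay in specs
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # returns once that thread has finished
    return threads

if __name__ == "__main__":
    run_all([(1, 1), (2, 2)])
    print("ended at: %s" % time.ctime())
```

Which style to use is mostly a matter of taste. A subclass is handy when the thread carries its own state, while target keeps short scripts shorter.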

At first the file was named threading.py, which produced this error:

AttributeError: 'module' object has no attribute 'Thread'

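So a script cannot share its name with a module it imports? The script's own directory comes first on the module search path, so import threading loaded the script itself instead of the standard library module. Renaming the file fixes it (for example, to threading_demo.py).
If that still does not work, check for a leftover threading.pyc file and delete it before running again.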
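A quick way to confirm this kind of name clash is to look at where the imported module was actually loaded from. The helper below is a small sketch; loaded_from is a hypothetical name, and it relies only on the module's __file__ attribute.

```python
from __future__ import print_function
import os
import sys
import threading

def loaded_from(module, directory):
    # True if the module's source file lives directly in the given directory.
    path = getattr(module, "__file__", None)
    if path is None:  # some built-in modules have no file
        return False
    return os.path.dirname(os.path.abspath(path)) == os.path.abspath(directory)

if __name__ == "__main__":
    # Shows which file "import threading" really picked up.
    print(threading.__file__)
    # When run as a script, sys.path[0] is the script's own directory.
    print(loaded_from(threading, sys.path[0] or os.getcwd()))
```

If the printed path points into your project directory rather than the Python installation, a local file is shadowing the standard library module.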

Here it is in Go as well (threading_demo.go):

package main

import (
    "fmt"
    "sync"
    "time"
)

func work(wg *sync.WaitGroup, threadID int, delay int) {
    defer wg.Done()
    for i := 0; i < 5; i++ {
        fmt.Printf("#%d, %v\n", threadID, time.Now())
        time.Sleep(time.Second * time.Duration(delay))
    }
}

func main() {
    wg := &sync.WaitGroup{}
    wg.Add(2)
    go work(wg, 1, 1)
    go work(wg, 2, 2)
    wg.Wait()
}

The output is similar to that of thread_demo.go, except that the program ends as soon as both goroutines have exited.

Summary:

  • For multithreading in Python, the threading module is the more convenient choice.
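Coming back to the import task from the background section, a common shape for it is a small pool of worker threads pulling batches from a queue, with join() telling the main thread when everything has been sent. The sketch below only illustrates that shape: send_batch is a hypothetical stand-in for the real request to the target store, import_all is a made-up name, and no actual influxdb or elasticsearch API is used.

```python
from __future__ import print_function
import threading
try:
    import queue            # Python 3
except ImportError:
    import Queue as queue   # Python 2

def send_batch(batch):
    # Hypothetical stand-in for one HTTP request; returns the record count.
    return len(batch)

def worker(tasks, sent, sent_lock):
    while True:
        batch = tasks.get()
        if batch is None:  # sentinel: no more work for this worker
            break
        n = send_batch(batch)
        with sent_lock:    # protect the shared counter
            sent[0] += n

def import_all(batches, num_workers=4):
    tasks = queue.Queue()
    sent = [0]  # one-element list so the workers can update it in place
    sent_lock = threading.Lock()
    workers = [
        threading.Thread(target=worker, args=(tasks, sent, sent_lock))
        for _ in range(num_workers)
    ]
    for w in workers:
        w.start()
    for batch in batches:
        tasks.put(batch)
    for _ in workers:
        tasks.put(None)  # one sentinel per worker
    for w in workers:
        w.join()         # returns once every batch has been handled
    return sent[0]
```

For example, import_all([[1, 2, 3], [4, 5], [6]], num_workers=2) hands the three batches to two workers and returns the total number of records sent.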

Corrections and additions are welcome!

Reposted from blog.csdn.net/butterfly5211314/article/details/100619242