Detailed explanation of 8 ways to implement scheduled tasks in Python

        In daily work, tasks often need to be executed periodically. One way is to use the crond daemon that comes with Linux, combined with command-line scripts. Another way is to schedule the tasks directly in Python.

        When a program must run at regular intervals, or a task must repeat in a cycle, a scheduled task is needed. For example, if you want to crawl a certain target periodically, you need a scheduled task.

Scheduled tasks in Python are commonly implemented with the following 8 methods:

  1. while True: + sleep()
  2. The threading.Timer timer
  3. The Timeloop library
  4. The scheduling module sched
  5. The scheduling module schedule
  6. The task framework APScheduler
  7. The distributed message system Celery
  8. The scheduled tasks that come with Windows

Next, the methods above are used to run the task() function defined below. The sample code is as follows:

from datetime import datetime


def task():
    now = datetime.now()
    ts = now.strftime("%Y-%m-%d %H:%M:%S")
    print(ts)

1. Use while True: + sleep() to implement scheduled tasks

        The simplest way is to use the time module: put the task in a loop and sleep for a fixed period after each run. time.sleep(n) pauses the currently executing thread for n seconds before it continues. Pausing means that the thread enters the blocked state; when the time passed to sleep() has elapsed, the thread moves from the blocked state to the ready state and waits for the CPU to schedule it.

Sample code:

from datetime import datetime
import time


def task():
    now = datetime.now()
    ts = now.strftime("%Y-%m-%d %H:%M:%S")
    print(ts)


def func():
    while True:
        task()
        time.sleep(3)


func()


Advantages and disadvantages: it is simple to write but not easy to control. sleep() is a blocking function, so only synchronous tasks can be run, not asynchronous ones. Only an interval can be set; a specific time of day cannot be specified directly.
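The time-point limitation can be worked around inside the same loop by computing how long to sleep until the next wall-clock time. The following is only a minimal sketch and is not part of the original samples; the helper name seconds_until and its parameters are assumptions introduced for illustration, and it works in local time without handling daylight-saving changes.

```python
from datetime import datetime, timedelta


def seconds_until(hour, minute, now=None):
    """Return the seconds from `now` until the next occurrence of hour:minute."""
    now = now or datetime.now()
    target = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if target <= now:
        # The time has already passed today, so aim for tomorrow.
        target += timedelta(days=1)
    return (target - now).total_seconds()


# Deterministic example with a fixed, hypothetical "now" of 08:00:00.
fixed_now = datetime(2024, 1, 1, 8, 0, 0)
print(seconds_until(8, 30, now=fixed_now))  # 1800.0
print(seconds_until(7, 0, now=fixed_now))   # 82800.0
```

Inside the while True loop, calling time.sleep(seconds_until(8, 30)) before task() would then run the task once a day at about 08:30.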

2. Use the threading.Timer() timer to implement scheduled tasks

        Timer in the threading module is a timer that runs a function once after a given delay. Several timers can be started at the same time, and each one runs in its own thread, so the tasks are executed asynchronously and do not have to wait for one another.

Timer method                                        Description
Timer(interval, function, args=None, kwargs=None)   Create a timer
cancel()                                            Cancel the timer (only works while it is still waiting)
start()                                             Start the timer's thread
join(timeout=None)                                  Wait for the timer's thread to finish
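The methods listed above can be tried out as follows. This is a small sketch rather than one of the original samples; the function name greet and the delays of 0.05 and 5 seconds are arbitrary choices for illustration.

```python
from threading import Timer

results = []


def greet(name, punctuation="!"):
    results.append("hello " + name + punctuation)


# Runs greet("world", punctuation="?") once, 0.05 seconds after start().
t1 = Timer(0.05, greet, args=("world",), kwargs={"punctuation": "?"})
# This timer is cancelled before its 5-second delay elapses, so it never fires.
t2 = Timer(5, greet, args=("never",))

t1.start()
t2.start()
t2.cancel()

t1.join()  # wait for the first timer's thread to finish
t2.join()  # returns promptly because cancel() wakes the waiting thread
print(results)  # ['hello world?']
```

cancel() has no effect once the function has started running, so it is only useful while the timer is still counting down.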

Sample code:

from datetime import datetime
from threading import Timer


def task():
    now = datetime.now()
    ts = now.strftime("%Y-%m-%d %H:%M:%S")
    print(ts)


def func():
    task()
    t = Timer(3, func)
    t.start()


func()


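Advantages and disadvantages: the tasks run asynchronously and do not block the main thread. However, after a large number of runs an error appeared: "maximum recursion depth exceeded" (see "Pyinstaller maximum recursion depth exceeded Error Resolution"). I then thought of raising the maximum recursion depth:

sys.setrecursionlimit(100000000)

This does not solve the problem: a limit that high only postpones the failure, and once the CPU or stack is exhausted, Python kills the program outright. Note that in the sample above func() returns as soon as t.start() has been called and the next run happens in a new Timer thread, so that pattern by itself does not deepen the call stack. The error is more likely to come from a function that calls itself directly instead of going through Timer.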
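One way to repeat a task without creating a new Timer for every run is a single worker thread that waits on a threading.Event between runs. The sketch below is an illustration under that assumption, not the method used in the sample above; the class name RepeatingTask and its stop() method are made up for this example.

```python
import threading
import time
from datetime import datetime


class RepeatingTask(threading.Thread):
    """Run `function` every `interval` seconds in one thread until stop() is called."""

    def __init__(self, interval, function):
        super().__init__(daemon=True)
        self.interval = interval
        self.function = function
        self._stop_event = threading.Event()

    def run(self):
        # Event.wait() returns False on timeout and True once stop() sets the event.
        while not self._stop_event.wait(self.interval):
            self.function()

    def stop(self):
        self._stop_event.set()


timestamps = []
worker = RepeatingTask(0.05, lambda: timestamps.append(datetime.now()))
worker.start()
time.sleep(0.3)  # the main thread stays free; here it simply waits a moment
worker.stop()
worker.join()
print(len(timestamps), "runs")
```

Because the loop lives in one thread, there is no chain of nested calls and no new thread per run, and stop() ends the schedule cleanly.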

For more timer usage, see the blog post: threading.Timer() timer implements timing tasks

3. Use the Timeloop library to perform scheduled tasks

        Timeloop is a third-party library for running periodic tasks at several different intervals. It is a simple library that uses the decorator pattern to register functions and runs each registered function in a thread.
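To make the decorator pattern concrete, here is a toy scheduler built only on the standard library. It is not Timeloop's implementation; the class MiniLoop and all of its methods are hypothetical names, meant only to show how a decorator can register jobs that are later run in separate threads.

```python
import threading
import time
from datetime import timedelta


class MiniLoop:
    """Toy scheduler: a decorator registers jobs, start() runs each in its own thread."""

    def __init__(self):
        self._jobs = []  # list of (interval, function) pairs
        self._threads = []
        self._stop_event = threading.Event()

    def job(self, interval):
        def decorator(function):
            self._jobs.append((interval, function))
            return function  # the decorated function itself is left unchanged
        return decorator

    def _run(self, interval, function):
        # Wait for the interval, then call the job, until stop() sets the event.
        while not self._stop_event.wait(interval.total_seconds()):
            function()

    def start(self):
        for interval, function in self._jobs:
            thread = threading.Thread(
                target=self._run, args=(interval, function), daemon=True
            )
            thread.start()
            self._threads.append(thread)

    def stop(self):
        self._stop_event.set()
        for thread in self._threads:
            thread.join()


loop = MiniLoop()
calls = []


@loop.job(interval=timedelta(milliseconds=20))
def fast_job():
    calls.append("fast")


@loop.job(interval=timedelta(milliseconds=50))
def slow_job():
    calls.append("slow")


loop.start()
time.sleep(0.3)
loop.stop()
print(calls.count("fast"), calls.count("slow"))
```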

Sample code:

from datetime import datetime, timedelta
from timeloop import Timeloop

tl = Timeloop()


def task():
    now = datetime.now()
    ts = now.strftime("%Y-%m-%d %H:%M:%S")
    print(ts + '333!')


def task2():
    now = datetime.now()
    ts = now.strftime("%Y-%m-%d %H:%M:%S")
    print(ts + "555555!")


@tl.job(interval=timedelta(seconds=2))
def sample_job_every_2s():
    task()


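@tl.job(interval=timedelta(seconds=5))
def sample_job_every_5s():
    task2()


# Start all registered jobs; block=True keeps the main thread alive until interrupted
tl.start(block=True)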

For more Timeloop usage, see the blog post: Detailed Explanation of the Usage of Timeloop Library for Timing Tasks in Python

4. Use the scheduling module sched to implement scheduled tasks

        sched is a delayed-event scheduling mechanism in the standard library. The sched module implements a general-purpose event scheduler: the scheduler class uses a delay function to wait until a specific time and then executes the task. Multi-threaded applications are also supported: after each task is executed, the delay function is called with the argument 0 so that other threads get a chance to run.

The main methods of the scheduler object:

  • enter(delay, priority, action, argument=(), kwargs={}): schedules an event to run after delay time units. Events that are due at the same time run in order of priority (a lower number means a higher priority).
  • cancel(event): removes the event from the queue. This method raises a ValueError if the event is not currently in the queue.
  • run(): runs all scheduled events. This function waits (using the delayfunc() function passed to the constructor) for the next event and then executes it, and so on until there are no more scheduled events.
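The behaviour of enter(), cancel() and run() can be observed deterministically by giving the scheduler a fake clock in place of time.time and time.sleep. This is a sketch for illustration only; the FakeClock class and the event labels are assumptions, not part of the sched module.

```python
import sched


class FakeClock:
    """Stand-in for time.time/time.sleep so the example finishes instantly."""

    def __init__(self):
        self.now = 0.0

    def time(self):
        return self.now

    def sleep(self, seconds):
        self.now += seconds  # "sleeping" just moves the clock forward


clock = FakeClock()
scheduler = sched.scheduler(clock.time, clock.sleep)
log = []


def record(label):
    log.append((clock.time(), label))


# Two events due at t=2: the one with the lower priority number runs first.
scheduler.enter(2, 1, record, argument=("low priority",))
scheduler.enter(2, 0, record, argument=("high priority",))
scheduler.enter(1, 0, record, argument=("first",))

# A cancelled event is removed from the queue and never runs.
doomed = scheduler.enter(3, 0, record, argument=("cancelled",))
scheduler.cancel(doomed)

scheduler.run()  # returns once the queue is empty
print(log)  # [(1.0, 'first'), (2.0, 'high priority'), (2.0, 'low priority')]
```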

Sample code:

import sched
import time
from datetime import datetime

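# Initialize the scheduler class of the sched module.
# The first argument is a function that returns a timestamp; the second is a function that blocks until the scheduled time is reached.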
schedule = sched.scheduler(time.time, time.sleep)


def task(inc):
    now = datetime.now()
    ts = now.strftime("%Y-%m-%d %H:%M:%S")
    print(ts)
    schedule.enter(inc, 0, task, (inc,))


def func(inc=3):
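    # The four arguments of enter() are:
    # the delay, the priority (used to order events that are due at the same time), the function to call, and the arguments for that function (as a tuple)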
    schedule.enter(0, 0, task, (inc,))
    schedule.run()


func()


For more sched usage, see the blog post: https://blog.csdn.net/weixin_44799217/article/details/127353545

5. Use the scheduling module schedule to implement scheduled tasks

        schedule is a lightweight third-party task scheduling module that can run tasks at intervals of seconds, minutes, hours, or days, or at custom times.
        If you want to run several tasks, you can register them all with the same scheduler.

Sample code:

import schedule
import time
from datetime import datetime


def task():
    now = datetime.now()
    ts = now.strftime("%Y-%m-%d %H:%M:%S")
    print(ts)


def task2():
    now = datetime.now()
    ts = now.strftime("%Y-%m-%d %H:%M:%S")
    print(ts + '666!')


def func():
    # Clear all existing jobs
    schedule.clear()
    # Create a job that runs every 3 seconds
    schedule.every(3).seconds.do(task)
    # Create a job that runs every 2 seconds
    schedule.every(2).seconds.do(task2)
    while True:
        schedule.run_pending()
        # Sleep briefly so the loop does not busy-wait and burn CPU
        time.sleep(1)


func()


Advantages and disadvantages: it needs to be used together with a while True loop. When the loop calls run_pending() continuously without a short sleep, it takes up more CPU than the other methods, and memory use is also high.

For more schedule usage, see the blog post: https://blog.csdn.net/weixin_44799217/article/details/127352957

6. Use the task framework APScheduler to implement scheduled tasks

        APScheduler is a scheduled-task framework for Python that is used to run periodic or one-off scheduled tasks. Besides adding and removing jobs, it can store jobs in a database to make them persistent, which is very convenient.

Sample code:

from datetime import datetime
from apscheduler.schedulers.blocking import BlockingScheduler


def task():
    now = datetime.now()
    ts = now.strftime("%Y-%m-%d %H:%M:%S")
    print(ts)


def task2():
    now = datetime.now()
    ts = now.strftime("%Y-%m-%d %H:%M:%S")
    print(ts + '666!')


def func():
    # Create a BlockingScheduler
    scheduler = BlockingScheduler()
    # Add a job that runs every 3 seconds
    scheduler.add_job(task, 'interval', seconds=3, id='test_job1')
    # Add a job that runs every 5 seconds
    scheduler.add_job(task2, 'interval', seconds=5, id='test_job2')
    scheduler.start()


func()


For more details about the usage of APScheduler, please refer to the blog post: Detailed Explanation of the Usage of the Apscheduler Library for Timed Tasks in Python

7. Use the distributed message system Celery to perform scheduled tasks

        Celery is a simple, flexible, and reliable distributed system for processing large volumes of messages, while providing operations with the tools needed to maintain such a system. It can also be used for task scheduling. Configuring Celery is cumbersome, so if you only need a lightweight scheduling tool, Celery is not a good choice.

        Celery is a powerful distributed task queue. It allows tasks to be executed completely separately from the main program, and tasks can even be dispatched to other hosts. It is usually used for asynchronous tasks (async task) and scheduled tasks (crontab). Asynchronous tasks are typically time-consuming operations such as sending emails, uploading files, or processing images. Scheduled tasks are tasks that need to be executed at a specific time.

Note: Celery itself does not store tasks. Scheduled tasks have to be queued somewhere, so Celery must be used together with a tool that provides storage and retrieval, such as a message queue, Redis, or a database. The officially recommended choice is the message queue RabbitMQ, and Redis is also a good option.
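As a rough idea of what a periodic Celery task looks like, the configuration fragment below registers one task and schedules it with Celery beat. It is only a sketch under several assumptions: the file is saved as tasks.py, a Redis broker is reachable at redis://localhost:6379/0, and the task name tasks.print_timestamp and the schedule entry names are made up for this example.

```python
from datetime import datetime

from celery import Celery
from celery.schedules import crontab

# Assumption: a Redis broker is running locally on the default port.
app = Celery("tasks", broker="redis://localhost:6379/0")


@app.task(name="tasks.print_timestamp")
def print_timestamp():
    print(datetime.now().strftime("%Y-%m-%d %H:%M:%S"))


# Celery beat reads this schedule and sends the task to the queue when it is due.
app.conf.beat_schedule = {
    "timestamp-every-3-seconds": {
        "task": "tasks.print_timestamp",
        "schedule": 3.0,  # interval in seconds
    },
    "timestamp-every-morning": {
        "task": "tasks.print_timestamp",
        "schedule": crontab(hour=7, minute=30),  # daily at 07:30 in the app's timezone
    },
}
```

Nothing runs until both a worker and the beat scheduler are started. For local experiments the two can usually be combined in one process, for example with celery -A tasks worker --beat --loglevel=info.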

8. Use the scheduled tasks that come with Windows

        Omitted. I won't go into details here!


Origin blog.csdn.net/weixin_44799217/article/details/127352531