Celery quick learning notes

Purpose

Given the economic downturn, all we can do is keep expanding our technology stack and improving our competitiveness.

Structure diagram

[Architecture diagram: Broker, Worker, Backend, Task]
1. Broker: usually Redis or a message queue (MQ); it stores the task queue.
2. Worker: the consumer that executes tasks.
3. Backend: the result store, also usually Redis or MQ.
4. Task: the task itself; in my view it plays the role of the producer.

Install

pip install celery

I went straight to 5.3.1, the latest version at the time I started, which led to a pitfall: older tutorials written for earlier versions don't always apply.

The first pitfall

Traceback (most recent call last):
  File "C:\Python39\lib\site-packages\billiard\pool.py", line 362, in workloop
    result = (True, prepare_result(fun(*args, **kwargs)))
  File "C:\Python39\lib\site-packages\celery\app\trace.py", line 635, in fast_trace_task
    tasks, accept, hostname = _loc
ValueError: not enough values to unpack (expected 3, got 0)

This error is raised when a task is executed via your_func.delay(). I asked GPT and only got led in circles, but later found a solution online, which I copy below.

Solution

  • Windows 10 system.
    According to others' descriptions, this problem occurs when running celery 4.x (and later) on Windows 10. The fix is as follows (the underlying reason is unclear to me): first install eventlet.
pip install eventlet

Then add a parameter when starting the worker, as follows:

celery -A celery_tasks.main worker -l info -P eventlet

Then tasks can be called normally.
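
As an aside, the -P flag selects the worker's execution pool. Eventlet works for me; if you would rather not install an extra package, the built-in single-process pool is another commonly suggested option on Windows (treat this as a hedged alternative, not part of the original tutorial):

celery -A celery_tasks.main worker -l info -P solo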

Basic usage

Create a python file named: my_task.py

import celery
import time

backend = 'redis://127.0.0.1:6379/1'
broker = 'redis://127.0.0.1:6379/2'
cel = celery.Celery('first_task', backend=backend, broker=broker)

@cel.task
def send_email(name):
    print("Sending email to %s..." % name)
    time.sleep(5)
    print("Finished sending email to %s" % name)
    return "ok"

@cel.task
def send_msg(name):
    print("Sending SMS to %s..." % name)
    time.sleep(5)
    print("Finished sending SMS to %s" % name)
    return "ok"

Note the naming above: my_task (the file), cel (the app variable), and first_task (the app name). Deliberately use different names so that later, when you read the debug output, you can tell which value corresponds to which parameter.

Run

D:\代码\my_code\study\study_celery>celery -A my_task worker -l info -P eventlet

 -------------- celery@WINDOWS-3T5S0CL v5.1.2 (sun-harmonics)
--- ***** -----
-- ******* ---- Windows-10-10.0.10586-SP0 2023-07-02 20:24:05
- *** --- * ---
- ** ---------- [config]
- ** ---------- .> app:         first_task:0x2b9d54582e0
- ** ---------- .> transport:   redis://:**@120.79.235.7:6379/63
- ** ---------- .> results:     redis://:**@120.79.235.7:6379/63
- *** --- * --- .> concurrency: 8 (eventlet)
-- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
--- ***** -----
 -------------- [queues]
                .> celery           exchange=celery(direct) key=celery


[tasks]
  . my_task.send_email
  . my_task.send_msg

[2023-07-02 20:24:06,312: INFO/MainProcess] Connected to redis://:**@120.79.235.7:6379/63
[2023-07-02 20:24:06,394: INFO/MainProcess] mingle: searching for neighbors
[2023-07-02 20:24:08,250: INFO/MainProcess] mingle: sync with 1 nodes
[2023-07-02 20:24:08,254: INFO/MainProcess] mingle: sync complete
[2023-07-02 20:24:08,368: INFO/MainProcess] pidbox: Connected to redis://:**@120.79.235.7:6379/63.
[2023-07-02 20:24:08,500: INFO/MainProcess] celery@WINDOWS-3T5S0CL ready.

This is why I suggested creating distinct names above: it makes it easy to tell what each item in the startup output represents.

  • celery -A my_task worker -l info -P eventlet
    In the startup command, my_task is the module (file) name; do not include .py in the command.

  • app:         first_task:0x2b9d54582e0
    The app name is the first argument passed to celery.Celery() when creating the app.

  • transport:   redis://:**@120.79.235.7:6379/63
    results:     redis://:**@120.79.235.7:6379/63
    These are the broker Redis and the result Redis. If you have several environments (project, test, production), double-check them every time you start.

  • [tasks]
    . my_task.send_email
    . my_task.send_msg
    Here you can clearly see which tasks are registered in your celery app.

  • celery@WINDOWS-3T5S0CL ready.
    When you see this message, the worker is up and its Redis connection is working normally.

Don't underestimate the startup banner; check the information in it every time you start a worker.

Let your celery app run

First create a file, produce_task.py:

from my_task import send_email, send_msg

email_result = send_email.delay("hello")
print(email_result.id)

msg_result = send_msg.delay("world")
print(msg_result.id)

Running the code above sends two task messages, send_email and send_msg, to the running celery worker. The arguments passed to delay() are the arguments of the task function itself.

Another pitfall

Here is another pitfall I hit: with the broker and backend pointed at the same Redis database, only one of the two delay() calls was actually executed, and I could not figure out why. For now, just use two separate Redis databases; we can look for the root cause later instead of holding up the learning progress.

View execution results


All execution results can be viewed in the backend Redis. Both successes and failures are stored as string-type keys directly at the top level of the configured database.
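
As a quick way to peek at those keys, here is a minimal sketch using redis-py; it assumes the backend configured in my_task.py (db 1 on 127.0.0.1) and the fact that Celery's Redis backend stores each result under a key named celery-task-meta-<task_id>:

import json
import redis

# connect to the backend database configured in my_task.py
r = redis.Redis(host='127.0.0.1', port=6379, db=1)

# every result lives in a string key named celery-task-meta-<task_id>
for key in r.keys('celery-task-meta-*'):
    raw = r.get(key)
    print(key.decode(), json.loads(raw)['status'])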

Get execution results

from celery.result import AsyncResult
from my_task import cel

# a result that succeeded
async_succ_result = AsyncResult(id="23c7ebfd-5bf6-4f83-b7c8-9c870b168d17", app=cel)
# a result that failed
async_fail_result = AsyncResult(id="923360d1-caa4-41ca-9887-3bc404ff0803", app=cel)
# check whether the task succeeded
success_flag1 = async_succ_result.successful()
success_flag2 = async_fail_result.successful()
# check whether the task failed
failure_flag1 = async_succ_result.failed()
failure_flag2 = async_fail_result.failed()
# get the result; on success this returns the task function's return value
result = async_succ_result.get()
# get the result; on failure this re-raises the task's exception
try:
    result = async_fail_result.get()
except Exception as e:
    print(e)

Task status

# run status of the task result
if async_succ_result.status == 'PENDING':
    print('the task is waiting to be executed')
elif async_succ_result.status == 'RETRY':
    print('the task is being retried after an error')
elif async_succ_result.status == 'STARTED':
    print('the task has started executing')
elif async_succ_result.status == 'FAILURE':
    print('the task failed')
elif async_succ_result.status == 'SUCCESS':
    print('the task succeeded')

A task result can only be in one of these five states.

Delete results

async_succ_result.forget()

Note: once a result has been forgotten, a later get() on it will block, waiting for a result that no longer exists.
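
To avoid hanging in that situation, a small sketch (reusing the task id from the example above as a placeholder): pass a timeout to get(), so Celery raises an exception instead of blocking indefinitely.

from celery.result import AsyncResult
from celery.exceptions import TimeoutError
from my_task import cel

result = AsyncResult(id="23c7ebfd-5bf6-4f83-b7c8-9c870b168d17", app=cel)
result.forget()  # drop the stored result from the backend
try:
    # wait at most 3 seconds instead of blocking forever
    print(result.get(timeout=3))
except TimeoutError:
    print("no result available within the timeout")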

Multi-task structure

  • When you need to run a lot of tasks, you should organize them into a layered structure and split them across multiple files for easier maintenance.

Create a new directory named my_multi_tasks as the root of the multi-task structure.

  • Create celery.py file
import celery

backend = 'redis://127.0.0.1:6379/1'
broker = 'redis://127.0.0.1:6379/2'
second_cel = celery.Celery(
    'second_tasks',
    backend=backend,
    broker=broker,
    include=[
        'my_multi_tasks.my_task_01',
        'my_multi_tasks.my_task_02',
    ]
)

The file here must be named celery.py: when the celery command-line program loads the package passed to -A, it looks for a celery module inside it to find the app and configure the multi-task structure. If the file has any other name, an error like the following is reported:

D:\代码\my_code\study\study_celery>celery -A my_multi_tasks worker -l info -P eventlet
Usage: celery [OPTIONS] COMMAND [ARGS]...

Error: Invalid value for '-A' / '--app':
Unable to load celery application.
Module 'my_multi_tasks' has no attribute 'celery'

Module 'my_multi_tasks' has no attribute 'celery' means that the my_multi_tasks package has no celery attribute; evidently the program appends .celery to the package name by default.

  • Create my_task_01.py file
import time
from my_multi_tasks.celery import second_cel

@second_cel.task
def second_send_email(name):
    print(f"Sending email to {name}".center(30, '1'))
    time.sleep(5)
    print(f"Finished sending email to {name}".center(30, '1'))
    return "ok"
  • Create my_task_02.py file
import time
from my_multi_tasks.celery import second_cel

@second_cel.task
def second_send_msg(name):
    print(f"Sending SMS to {name}".center(30, '2'))
    time.sleep(5)
    print(f"Finished sending SMS to {name}".center(30, '2'))
    return "ok"

Now second_send_email and second_send_msg live in my_task_01 and my_task_02 respectively. Remember to decorate each function with the multi-task app's decorator, @second_cel.task.

  • Create the produce_task_second.py file
from my_multi_tasks.my_task_01 import second_send_email
from my_multi_tasks.my_task_02 import second_send_msg

email_result = second_send_email.delay('second_hello')
msg_result = second_send_msg.delay('second_world')
  • After starting celery with the command celery -A my_multi_tasks worker -l info -P eventlet, run the produce_task_second.py file to send the tasks; you can then watch them being executed in the worker output.

Scheduled tasks

The task code itself is exactly the multi-task structure above, copied as is; for scheduled tasks we only add one more producer file.

from datetime import datetime, timedelta
from my_multi_tasks.my_task_01 import second_send_email
from my_multi_tasks.my_task_02 import second_send_msg

# send at a specified time
time_temp = datetime(2023, 7, 4, 6, 19, 00)
time_temp_utc = datetime.utcfromtimestamp(time_temp.timestamp())
result = second_send_email.apply_async(args=['kola', ], eta=time_temp_utc)
print(result.id)

# send after a delay
time_now = datetime.now()
time_now_utc = datetime.utcfromtimestamp(time_now.timestamp())
time_now_utc_new = time_now_utc + timedelta(seconds=20)
result2 = second_send_msg.apply_async(args=['cat', ], eta=time_now_utc_new)
print(result2.id)
  • Note:
    1. Celery interprets eta times as UTC by default, so China local time must be converted to UTC first.
    2. Once the statements above have run, the producing script exits; the tasks are already in the queue and will be executed when the scheduled time arrives.
    3. Pay attention to the args parameter: it must be a sequence of positional arguments, e.g. the list ['kola'] or the one-element tuple ('kola',), where the trailing comma is what makes it a tuple.
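
For the delayed-send case there is a simpler option worth knowing about: apply_async also accepts a countdown argument (seconds from now), which avoids the manual UTC arithmetic. A minimal sketch reusing the task above:

from my_multi_tasks.my_task_02 import second_send_msg

# run the task roughly 20 seconds from now; Celery handles the timing internally
result = second_send_msg.apply_async(args=['cat'], countdown=20)
print(result.id)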

Start scheduled tasks through configuration information and beat

beat is celery's producer for scheduled tasks, and it is very important. Like the worker, it is a script service that keeps running once started. Let's configure it now.

import celery
from celery.schedules import crontab


backend = 'redis://127.0.0.1:6379/1'
broker = 'redis://127.0.0.1:6379/2'
second_cel = celery.Celery(
    'second_tasks',
    backend=backend,
    broker=broker,
    include=[
        'my_multi_tasks.my_task_01',
        'my_multi_tasks.my_task_02',
    ]
)


# set the time zone used when scheduling tasks to China and disable UTC
second_cel.conf.timezone = 'Asia/Shanghai'
second_cel.conf.enable_utc = False


# the scheduler configuration beat uses to send tasks
second_cel.conf.beat_schedule = {
    # the entry name is arbitrary
    'add-task-every-1-minute': {
        # positional arguments for the task
        'args': ('张三',),
        # dotted path of the task function, here second_send_email
        'task': 'my_multi_tasks.my_task_01.second_send_email',
        # schedule
        # 'schedule': 10.0,                     # run every N seconds
        # 'schedule': timedelta(seconds=6),     # also every N seconds, but timedelta is richer (days, hours, ...)
        'schedule': crontab(minute="*/1"),      # run once every minute
    },
    'add-task-birthday': {
        'args': ('李四',),
        'task': 'my_multi_tasks.my_task_02.second_send_msg',
        # runs every year on July 5 at 05:10
        'schedule': crontab(minute=10, hour=5, day_of_month=5, month_of_year=7),
    },
}

Celery supports changing the default time zone used for executing tasks via second_cel.conf.timezone, and generates the scheduled task list from the configuration in second_cel.conf.beat_schedule.

  • Startup command: celery -A my_multi_tasks beat -l info
    Note that -P eventlet is not needed here: -P selects the consumer's execution pool, and beat is a producer.
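
As a hedged aside (not part of the original notes): for development it can be convenient to run beat embedded inside the worker process via the worker's -B/--beat flag, which is generally not recommended for production:

celery -A my_multi_tasks worker -B -l info -P eventlet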

After running this and observing for a while, you can see that the producer (beat) and the consumer (worker) are completely separate. beat is just another program: once started from the command line, it pushes scheduled tasks onto the broker's celery key according to the times set in the schedule table. Below is an example of one such message:

{
	"body": "W1siXHU2NzRlXHU1NmRiIl0sIHt9LCB7ImNhbGxiYWNrcyI6IG51bGwsICJlcnJiYWNrcyI6IG51bGwsICJjaGFpbiI6IG51bGwsICJjaG9yZCI6IG51bGx9XQ==",
	"content-encoding": "utf-8",
	"content-type": "application/json",
	"headers": {
    
    
		"lang": "py",
		"task": "my_multi_tasks.my_task_02.second_send_msg",
		"id": "1dce9df1-da1e-4c4e-ac6b-be170b815400",
		"shadow": null,
		"eta": null,
		"expires": null,
		"group": null,
		"group_index": null,
		"retries": 0,
		"timelimit": [null, null],
		"root_id": "1dce9df1-da1e-4c4e-ac6b-be170b815400",
		"parent_id": null,
		"argsrepr": "['\u674e\u56db']",
		"kwargsrepr": "{}",
		"origin": "gen12600@WINDOWS-3T5S0CL",
		"ignore_result": false
	},
	"properties": {
    
    
		"correlation_id": "1dce9df1-da1e-4c4e-ac6b-be170b815400",
		"reply_to": "3f3250fb-6ed4-3d13-83db-a7d8e30c8ee9",
		"delivery_mode": 2,
		"delivery_info": {
    
    
			"exchange": "",
			"routing_key": "celery"
		},
		"priority": 0,
		"body_encoding": "base64",
		"delivery_tag": "88d2a3dd-5612-4ef7-ae36-0d2484b9a468"
	}
}

In practice only two fields are really worth watching: one is the task function, the other is its arguments:
"task": "my_multi_tasks.my_task_02.second_send_msg",
"argsrepr": "['\u674e\u56db']",

  • Note: after beat is started, if tasks are not consumed in time they keep accumulating in the broker, and when a worker finally starts it will execute all of them. Some kind of expiry mechanism may be needed so that stale tasks are not executed.
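
One way to get such expiry behaviour, sketched here as an assumption rather than something from the original notes: each beat_schedule entry accepts an options dict that is passed through to apply_async, and apply_async supports an expires value (seconds, or an absolute datetime). A task still sitting in the queue past its expiry is revoked by the worker instead of executed.

# in my_multi_tasks/celery.py, extending the beat_schedule shown above
from celery.schedules import crontab

second_cel.conf.beat_schedule = {
    'add-task-every-1-minute': {
        'task': 'my_multi_tasks.my_task_01.second_send_email',
        'args': ('张三',),
        'schedule': crontab(minute="*/1"),
        # discard the message if no worker has picked it up within 50 seconds
        'options': {'expires': 50},
    },
}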

Celery application in Django

Create a new Django project; in More Settings set the Application name to app01 so that an app is created together with the project.

In urls.py, create a test path pointing to a view of app01, then add the test function to app01's views.py. Simply verify that visiting 127.0.0.1/test in the browser returns ok.
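
A minimal sketch of that wiring, assuming Django's default project layout (any names beyond test and app01 are placeholders):

# app01/views.py
from django.http import HttpResponse

def test(request):
    # bare-bones view just to confirm routing works
    return HttpResponse('ok')

# urls.py of the project
from django.contrib import admin
from django.urls import path
from app01 import views

urlpatterns = [
    path('admin/', admin.site.urls),
    path('test/', views.test),
]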


  • Then create a my_celery folder under the Django project root to hold the celery code, laid out as above: a my_email package and a my_sms package (each with its own tasks.py), config.py as the configuration file, and main.py as the main program.
  • config.py (the configuration file; pay special attention to the two variable names broker_url and result_backend, they must be spelled exactly like this)
broker_url = 'redis://127.0.0.1:6379/15'
result_backend = 'redis://127.0.0.1:6379/14'
  • tasks.py (the task file; celery tasks must be written in a file named tasks.py, other file names are not discovered). The sms tasks are shown here; a sketch of the matching my_email/tasks.py appears after this list.
import time
import logging
from my_celery.main import app
log = logging.getLogger("django")

@app.task  # name can be used to set the task name; if omitted, the function name is used as the task name
def send_sms(mobile):
    """Send an SMS"""
    print("SMS successfully sent to phone number %s!" % mobile)
    time.sleep(5)
    return "send_sms OK"

@app.task  # name can be used to set the task name; if omitted, the function name is used as the task name
def send_sms2(mobile):
    print("SMS successfully sent to phone number %s!" % mobile)
    time.sleep(5)
    return "send_sms2 OK"
  • main.py (main program, creates celery app and loads Django configuration file)
import os
from celery import Celery

# create the celery app instance; for decoupling, broker and backend are not passed here but set through the config file
app = Celery('my_django_celery')

# integrate celery with Django by putting Django's settings module into the environment so it can be found and loaded
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'celeryPros.settings.dev')

# load the configuration from config.py into celery via the app object
app.config_from_object('my_celery.config')

# load the tasks. Note: by default celery looks for a tasks.py file inside each listed package,
# which is why the file name tasks.py is significant
app.autodiscover_tasks([
    'my_celery.my_email',
    'my_celery.my_sms',
])

# command to start Celery
# it is strongly recommended to cd to the directory containing my_celery before starting
# celery -A my_celery.main worker --loglevel=info
  • Startup command: celery -A my_celery.main worker --loglevel=info
    Note that this should be run from just outside the my_celery folder (the project root), so that the my_celery.main path can be resolved.
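
The views shown below also import send_email and send_email2 from my_celery.my_email.tasks, which these notes never list. As a hedged sketch, that file presumably mirrors the sms one:

# my_celery/my_email/tasks.py (assumed, mirroring my_sms/tasks.py)
import time
from my_celery.main import app

@app.task
def send_email(name):
    print("Email sent to %s!" % name)
    time.sleep(5)
    return "send_email OK"

@app.task
def send_email2(name):
    print("Email sent to %s!" % name)
    time.sleep(5)
    return "send_email2 OK"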

After the above configuration is complete, go back to views.py in Django's app01 app and trigger your tasks from there.

from django.shortcuts import render, HttpResponse
from my_celery.my_sms.tasks import send_sms
from my_celery.my_sms.tasks import send_sms2
from my_celery.my_email.tasks import send_email
from my_celery.my_email.tasks import send_email2

# Create your views here.

def test(request):
    # asynchronous tasks
    # 1. the caller needs exactly the same task functions as celery; importing them from the package solves this
    # send_sms.delay("110")
    # send_sms2.delay("119")
    # # send_sms.delay()        # if the task function takes no arguments, pass nothing

    # scheduled task
    from datetime import timedelta, datetime
    time_now = datetime.now()
    time_now_utc = datetime.utcfromtimestamp(time_now.timestamp())
    time_now_utc_new = time_now_utc + timedelta(seconds=20)
    send_email.apply_async(args=['cat', ], eta=time_now_utc_new)

    return HttpResponse('ok')

Start your Django project, then visit http://127.0.0.1:8000/test/ in the browser; accessing the test view function triggers the task.

Origin blog.csdn.net/weixin_43651674/article/details/131504406