Guide to Locust Performance Testing with Python (A Detailed Long-Form Walkthrough)

Table of contents

Locust

why we choose locust

Core components of Locust

Locust internally runs the calling link

locust practice

checkpoint (assertion)

weight ratio

to parameterize

Tag

assembly point

distributed

docker run locust

High performance FastHttpUser

Test other protocols like gRPC

other



Locust

Locust is a relatively common performance testing tool, the bottom layer is based on  gevent . The official introduction is  that it is an easy-to-use, scriptable and extensible performance testing tool that allows us to define user behavior using regular Python code without having to fall into UI or restrictive domain-specific languages.

Locust is highly extensible: as long as client Python code can be written for a protocol, Locust can be used for performance testing of that protocol.

This article is a record of learning related content when developing a performance automation comparison platform.

why we choose locust

Feature Description
open source free Locust is an open source project, free to use and customize.
easy to learn and use Written in Python, it has a gentle learning curve, rich libraries and community support.
High scalability and flexibility Tests can be customized as needed to more accurately assess application performance. There are many third-party plug-ins, which are easy to expand.
real-time statistics Provides real-time statistics function and web interface to facilitate monitoring and analysis of test results.
easy to integrate Can be easily integrated with continuous integration and continuous deployment tools to automatically run performance tests.
Suitable for large-scale performance testing Distributed support makes it easy to run tests on multiple machines to simulate a large number of users. This makes it ideal for large-scale performance testing.

Core components of Locust

Master node

Responsible for coordinating and managing the entire testing process, including starting and stopping testing, distributing tasks, collecting and summarizing test results, etc.

Worker node

The node that actually executes the test task simulates user behavior according to the tasks assigned by the Master node.

Web UI

Provide a visual test interface, which is convenient for users to view test results and monitor test progress.

Test script (Load Test Script)

Test scripts, which define the logic and parameters that simulate user behavior, are executed by Worker nodes.

Locust internally runs the calling link

The timing diagram is as follows:

The sequence diagram, described in text:

1. The Runner starts the test (start()).
2. The EventHook fires the test-start event (fire()).
3. The Environment is created and the user classes are registered.
4. Users are started and their tasks begin to run.
5. Each TaskSet executes run().
6. Requests are sent and statistics are recorded.
7. The runner waits for the users to finish.
8. Users and tasks are stopped.
9. The Runner stops the test (stop()) and the EventHook fires the test-stop event (fire()).

Note: The fire() method is a method in the EventHook class in Locust, which is used to fire the event. In Locust's test life cycle, there are multiple events that can be triggered, such as test start, test end, user start, user complete task, etc. When these events occur, the EventHook class calls the fire() method, passing the event to all callback functions registered for the event.
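The pattern is easy to see in a small stand-alone sketch. The following illustrates the hook mechanism only; it is not Locust's actual implementation:

```python
class EventHook:
    """Minimal sketch of an event hook: register callbacks, then fire() them."""

    def __init__(self):
        self._handlers = []

    def add_listener(self, handler):
        # usable as a decorator, analogous to @events.test_start.add_listener in Locust
        self._handlers.append(handler)
        return handler

    def fire(self, **kwargs):
        # pass the event to every callback registered for it
        for handler in self._handlers:
            handler(**kwargs)


test_start = EventHook()

@test_start.add_listener
def on_test_start(host, **kwargs):
    print(f"test started against {host}")

test_start.fire(host="http://www.example.com")
# prints: test started against http://www.example.com
```

Every listener receives the event's keyword arguments, which is why Locust listeners conventionally accept `**kwargs`.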

locust practice

locust install

Step Description
Install Python 3.7+ Recent Locust versions require Python 3.7 or later
pip install locust Install locust
Run locust -V Check whether the installation succeeded
$ locust -V
locust 2.15.1 from /Users/bingohe/Hebinz/venvnew/lib/python3.9/site-packages/locust (python 3.9.17)

Getting started example

When writing test cases with locust, convention over configuration applies:

test_xxx (general test framework convention)
dockerfile (docker convention)
locustfile.py (locust convention)

# locustfile.py
from locust import HttpUser, task, between

class HelloWorldUser(HttpUser):  # subclasses User; represents an HTTP "user" of the system under load. Each user corresponds to a greenlet (coroutine) that interacts with the system under test
    wait_time = between(1, 5)  # think time between tasks

    @task  # marks this method as a task, i.e. an operation the user performs (e.g. visit home page, log in, create/read/update/delete)
    def hello_world(self):
        # self.client sends HTTP requests, simulating user actions
        self.client.get("/helloworld")

start test

GUI mode start locust

In the directory containing the locustfile.py file, simply run the locust command, then visit http://0.0.0.0:8089/ to see the following interface:

$ locust   
[2023-07-06 16:15:16,868] MacBook-Pro.local/INFO/locust.main: Starting web interface at http://0.0.0.0:8089 (accepting connections from all network interfaces)
[2023-07-06 16:15:16,876] MacBook-Pro.local/INFO/locust.main: Starting Locust 2.15.1

(screenshot: the Locust web UI start page)

Detailed indicators:

  • Number of users: the number of simulated users; default 1
  • Spawn rate: the number of users started per second (analogous to JMeter's Ramp-Up Period); default 1
  • Host (e.g. http://www.example.com): the base address of the target service under test

After filling in the host and clicking start, a request will be made to the service under test http://{host}/helloworld . Request statistics are as follows:

(screenshot: request statistics in the Locust web UI)

Description of WebUI Tab:

Tab name Functional description
New test Click this button to edit the total number of virtual users simulated and the number of virtual users started per second
Statistics Aggregate report, similar to a Listener's aggregate report in JMeter
Charts The curve display graph of the test result trend, including the number of completed requests per second (RPS), response time, and the number of virtual users at different times
Failures Display interface for failed requests
Exceptions Display interface for abnormal requests
Download Data Test-data download module; provides three downloads in CSV format: statistics, response times, and exceptions

It should be noted that the webui mode has many limitations and is mainly used for debugging. The command line mode described below is more commonly used.

Command line mode start locust
locust -f locustfile.py --headless -u 500 -r 10  --host http://www.example.com  -t 1000s

The locust framework is driven from the command line; the common parameters are:

parameter meaning
-f or --locustfile Specifies the path to the test script file
--headless Run tests in non-GUI mode
-u or --users Specify the number of concurrent users
-r or --spawn-rate Specifies the user generation rate (i.e. the number of users generated per second)
-t or --run-time Specify the maximum run time of the test (e.g. 1000s); used together with --headless
--csv Output test results to a CSV file
--html Output test results as HTML report
--host or -H Specify the address of the service under test
-L log level, default is INFO

checkpoint (assertion)

By default, Locust will judge whether the request is successful or not based on the HTTP status code. For responses with HTTP status codes in the range 200-399, Locust will consider them successful. For responses with HTTP status codes between 400-599, Locust will treat them as failures.

If you need to judge whether the request is successful based on the response content or other conditions, you need to manually set the checkpoint:

  • Pass the catch_response=True parameter to self.client's request methods; the request then yields a ResponseContextManager (provided by locust) through which the checkpoint can be set manually.
  • ResponseContextManager has two methods for declaring the outcome: success and failure. The failure method requires an argument describing the reason for the failure.
from locust import HttpUser, task, between

class MyUser(HttpUser):
    # think time: simulates how a real user behaves while browsing the application
    wait_time = between(1, 5)

    @task
    def my_task(self):
        # use catch_response=True so the request yields Locust's ResponseContextManager,
        # allowing the result to be marked as success or failure manually
        with self.client.get("/some_page", catch_response=True) as response:
            # check that the status code is 200 and the response contains "some_text"
            if response.status_code == 200 and "some_text" in response.text:
                # condition met: mark the response as successful
                response.success()
            else:
                # condition not met: build an error message for the specific case
                error_message = "Unexpected status code: " + str(response.status_code) if response.status_code != 200 else "Expected text not found in the response"
                # mark the response as failed and report the error message
                response.failure(error_message)

weight ratio

If different requests need to occur in different ratios, Locust lets you assign weights to tasks by passing a parameter to the @task decorator. The higher the weight, the more frequently the task is executed.

from locust import HttpUser, task, between

class MyUser(HttpUser):
    wait_time = between(1, 5)

    # weight 3: this task is executed more frequently
    @task(3)
    def high_frequency_task(self):
        self.client.get("/high_frequency_page")

    # weight 1: this task is executed less frequently
    @task(1)
    def low_frequency_task(self):
        self.client.get("/low_frequency_page")

In this example, we set a weight of 3 for high_frequency_task and a weight of 1 for low_frequency_task. This means that when a simulated user picks a task to execute, high_frequency_task is chosen about 3 times as often as low_frequency_task. By setting weights, we can tune the execution frequency of different tasks in a performance test to match real-world needs.

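Conceptually (a simplified sketch of the idea, not Locust's exact source), the weighting works by repeating each task in a candidate list according to its weight and having each user pick uniformly at random from that list:

```python
import random

def build_task_list(weighted_tasks):
    """weighted_tasks: mapping of task name -> weight; returns the expanded candidate list."""
    candidates = []
    for name, weight in weighted_tasks.items():
        candidates.extend([name] * weight)  # a task with weight 3 appears 3 times
    return candidates

candidates = build_task_list({"high_frequency_task": 3, "low_frequency_task": 1})
print(candidates)
# ['high_frequency_task', 'high_frequency_task', 'high_frequency_task', 'low_frequency_task']

# Each simulated user repeatedly draws its next task from this list, so
# high_frequency_task is chosen about 3x as often as low_frequency_task.
next_task = random.choice(candidates)
```

A uniform draw from the expanded list reproduces the 3:1 ratio in expectation.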

to parameterize

In the real world, user behavior is usually diverse. They may use different devices, operating systems, network conditions, etc. To better simulate these scenarios, we need to use different parameters in the tests.

In performance testing, parameterization is a very important technical means. It allows us to run the same test scenario with a different dataset, which better simulates real-world user behavior. There are two commonly used parameterization methods.

Generate random data in the script

Locust has no dedicated random-data module, so the Python standard library is typically used:

import random
import string

from locust import HttpUser, task, between

def random_string(length):
    # random ASCII string of the given length
    return "".join(random.choices(string.ascii_letters, k=length))

class MyUser(HttpUser):
    wait_time = between(1, 5)

    @task
    def random_data(self):
        random_str = random_string(10)
        random_num = random.randint(0, 100)
        self.client.post("/random", json={"text": random_str, "number": random_num})

Read parameters from external file

Take the authentication session that has been configured as a whitelist as an example:

import csv
from locust import HttpUser, task


class CSVReader:
    def __init__(self, file, **kwargs):
        try:
            file = open(file)  # accept a file path
        except TypeError:
            pass  # already a file-like object
        self.file = file
        self.reader = csv.reader(file, **kwargs)  # iterator

    def __next__(self):
        try:
            return next(self.reader)
        except StopIteration:
            # no next row: rewind and read from the beginning again
            self.file.seek(0, 0)
            return next(self.reader)


session_reader = CSVReader("session.csv")

class MyUser(HttpUser):
    @task
    def index(self):
        customer = next(session_reader)
        self.client.get(f"/pay?session={customer[0]}")
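The wrap-around behavior of such a reader can be checked stand-alone with an in-memory file. The class is reproduced here so the snippet runs on its own, and the session values are made up:

```python
import csv
import io

class CSVReader:
    """Round-robin CSV reader: restarts from the first row when exhausted."""

    def __init__(self, file, **kwargs):
        self.file = file
        self.reader = csv.reader(file, **kwargs)

    def __next__(self):
        try:
            return next(self.reader)
        except StopIteration:
            self.file.seek(0, 0)  # wrap around to the first row
            return next(self.reader)

# two hypothetical sessions in an in-memory "file"
reader = CSVReader(io.StringIO("sess-a\nsess-b\n"))
rows = [next(reader)[0] for _ in range(5)]
print(rows)  # ['sess-a', 'sess-b', 'sess-a', 'sess-b', 'sess-a']
```

Because the file is rewound on StopIteration, the parameter data cycles indefinitely no matter how many requests the test makes.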

Tag

In Locust, a tag is a way to classify and filter tasks. By adding tags to tasks, Locust can be told at runtime to execute only the tasks that carry specific tags. This is useful when performance-testing a specific scenario or organizing a large number of tasks.

Use cases

Sometimes multiple test scenarios live in the same file but only some of them should run. When a test file contains more than one task, the tasks can be categorized with the @tag decorator, and then only tasks with a given label are executed via --tags <name>.

Here is an example using tags:

from locust import HttpUser, task, between, tag

class MyUser(HttpUser):
    wait_time = between(1, 5)

    # 给任务添加一个名为 "login" 的标签
    @tag("login")
    @task
    def login_task(self):
        self.client.post("/login", json={"username": "user", "password": "pass"})

    # 给任务添加一个名为 "profile" 的标签
    @tag("profile")
    @task
    def profile_task(self):
        self.client.get("/profile")

    # 给任务添加两个标签:"shopping" 和 "checkout"
    @tag("shopping", "checkout")
    @task
    def checkout_task(self):
        self.client.post("/checkout")

In this example, we added tags to each of the three tasks: login_task has the "login" tag, profile_task has the "profile" tag, and checkout_task has two tags, "shopping" and "checkout".

When running Locust, you can specify which tags to execute or exclude using the --tags and --exclude-tags options. For example, to execute only tasks tagged "login", run:

locust --tags login

To exclude tasks tagged "shopping", run:

locust --exclude-tags shopping

This way, we can perform scenario-specific performance testing as needed without modifying the code.


assembly point

What is a rendezvous point?

Rendezvous points synchronize virtual users so that an operation is performed by all of them at exactly the same moment. For example, a test plan may require the system to withstand 1,000 users submitting data simultaneously. You can place a rendezvous point before the submission step: each virtual user that reaches it checks how many users have arrived, and if fewer than 1,000 have, it waits there. Once 1,000 users are waiting at the rendezvous point, all 1,000 submit data at the same time, satisfying the requirement in the test plan.

Note: the Locust framework does not provide a rendezvous-point primitive itself; one has to be implemented indirectly on top of gevent's concurrency mechanisms, using a gevent lock (semaphore).

Before implementing rendezvous in Locust, we first understand two concepts:

  • the Semaphore in gevent
  • the spawning_complete event hook in Locust
Semaphore

A semaphore is a synchronization primitive used to control access to shared resources. It is an important concept in computer science and concurrent programming, first proposed by the famous computer scientist Edsger Dijkstra in the 1960s. Semaphores are used to solve critical section problems in multithreaded or multiprocess environments to prevent competing access to shared resources.

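The blocking behavior at the heart of a rendezvous point can be demonstrated with the standard library's threading.Semaphore; the semantics are the same idea as the gevent.lock.Semaphore used later, just with OS threads instead of greenlets:

```python
import threading
import time

gate = threading.Semaphore(value=1)
gate.acquire()          # counter drops to 0: later acquires will block

arrived = []

def user(name):
    gate.acquire()      # wait at the "rendezvous point"
    gate.release()      # immediately let the next waiter through
    arrived.append(name)

workers = [threading.Thread(target=user, args=(f"user-{i}",)) for i in range(3)]
for w in workers:
    w.start()

time.sleep(0.2)
print(arrived)          # [] -- everyone is still blocked at the gate

gate.release()          # open the gate: all users proceed together
for w in workers:
    w.join()
print(sorted(arrived))  # ['user-0', 'user-1', 'user-2']
```

All waiters are held back until the single release, which is exactly the "wait until everyone is ready, then go" behavior a rendezvous point needs.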

The spawning_complete event

Events are a very important concept in Locust. Events allow us to perform custom actions at specific moments in the Locust lifecycle. By listening to and handling these events, we can extend the functionality of Locust to meet testing needs.

spawning_complete is a Locust event indicating that all simulated users have been spawned. When Locust starts running a test it creates user instances gradually; the spawning_complete event fires once every user has been created. You can hook this event to perform specific actions, such as writing a log message, collecting statistics, or other custom behavior.

To listen to  spawning_complete events, you can use  locust.events.spawning_complete event hooks. For example:

from locust import events

@events.spawning_complete.add_listener
def on_spawning_complete(user_count, **kwargs):
    print(f"All {user_count} users have been spawned!")

In this example, a message is printed once all users have been spawned. You can substitute other operations as needed.


After understanding the above two concepts, we only need to take two steps:

  • At script startup, use all_locust_spawned.acquire() to block the process
  • Write a function that triggers all_locust_spawned.release() when all users are created

Sample code:

from locust import HttpUser, task, between, events
from gevent.lock import Semaphore

all_locust_spawned = Semaphore()
all_locust_spawned.acquire()  # block: the semaphore counter drops to 0


class MyUser(HttpUser):
    wait_time = between(1, 1)

    def on_start(self):
        # each user waits here (up to 3 s) until the semaphore is released
        all_locust_spawned.wait(3)

    @task
    def task_rendezvous(self):
        self.client.get("/rendezvous")


# rendezvous handler: fired when all Locust users have been spawned
@events.spawning_complete.add_listener
def on_spawning_complete(**_kwargs):
    all_locust_spawned.release()

distributed

When a large number of concurrent users is needed, a single computer may not be able to generate enough load to simulate the situation. Distributed stress testing solves this problem: by distributing the test across multiple computers, we can generate a larger load and evaluate the system's performance more accurately.

Limitations of Locust

Locust implements concurrency within a single process using gevent coroutines. However, due to Python's Global Interpreter Lock (GIL), a single Python process cannot fully utilize a multi-core CPU.

To solve this problem, Locust supports running multiple worker nodes on a single computer, which can take full advantage of the performance of multi-core CPUs.

When running multiple slave nodes on a single computer, each slave node will run in a separate process, avoiding the limitations of the GIL. In this way, we can take full advantage of the performance of multi-core CPUs and generate larger loads.

Stand-alone master-slave mode

Note: the number of worker nodes should not exceed the number of processor cores on the machine

In stand-alone master-slave mode, both master and slave nodes run on the same computer. This mode is suitable for stress testing in a local development environment, or on a single server with a multi-core CPU. The following are the steps to implement distributed stress testing in stand-alone master-slave mode:

  1. Install Locust: Install Locust on your computer, use  pip install locust the command to install.

  2. Write a Locust test script: Write a Locust test script that will be run on the master and slave nodes. Save this script as  locustfile.py.

  3. Start the master node: Run the command on the computer  locust --master to start the master node, listening on the default port (8089).

  4. Start a slave node: Run the command on your computer  locust --worker --master-host 127.0.0.1 to start a slave node. Multiple slave nodes can be started as needed.

  5. Run the distributed stress test: Visit Locust's web interface ( http://127.0.0.1:8089 ) to start the test.

In stand-alone mode, how to make each slave node run on a different CPU

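A straightforward approach is to start one worker process per remaining CPU core and let the OS scheduler spread the processes across the cores (strict pinning to specific cores would additionally need something like taskset on Linux). The helper below is a hypothetical sketch that only builds the command lines; each one would then be launched with subprocess.Popen, assuming locust is on the PATH:

```python
import multiprocessing

def build_locust_commands(master_host="127.0.0.1"):
    """One master plus one worker per remaining CPU core."""
    cores = multiprocessing.cpu_count()
    commands = [["locust", "--master", "--expect-workers", str(cores - 1)]]
    for _ in range(cores - 1):
        commands.append(["locust", "--worker", "--master-host", master_host])
    return commands

for cmd in build_locust_commands():
    print(" ".join(cmd))
# e.g. on a 4-core machine:
#   locust --master --expect-workers 3
#   locust --worker --master-host 127.0.0.1   (three times)
```

Because each worker is a separate OS process, each one gets its own interpreter and its own GIL, so together they can saturate all cores.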

Multi-machine master-slave mode

The procedure is basically the same as in stand-alone mode, except that Locust's web interface is accessed at the master node's address (http://MASTER_IP_ADDRESS:8089).

Because the master and worker nodes communicate over the network, make sure the network connection between the chosen machines is reliable. Also, for accurate test results, the network latency between the master and the workers should be low.

Command parameters in distributed mode

Parameter Description
--master Run the current Locust instance as the master node.
--worker Run the current Locust instance as a worker node.
--master-host Specify the IP address or hostname of the master node. The default value is  127.0.0.1.
--master-port Specifies the port number of the master node. The default value is  5557.
--master-bind-host Specifies the IP address or hostname to which the master node is bound. The default is  *(all interfaces).
--master-bind-port Specifies the port number that the master node binds to. The default value is  5557.
--expect-workers Specifies the number of slave nodes that the master expects to connect to. The default value is  1.

--expect-workers The parameter is used to specify the number of slave nodes that the master node expects to connect to. If the number of slave nodes actually connected does not reach this value, the master node will continue to wait until enough slave nodes are connected.

When actually running a distributed stress test, the master node will display the number of connected slave nodes on the web interface. If the number of slave nodes actually connected does not reach  --expect-workers the specified value, you can see a warning message on the web interface, prompting you that the master node is waiting for more slave nodes to connect.

docker run locust

The advantages and disadvantages of running locust in a container are obvious:

Advantage Description
Environment consistency Docker ensures that the Locust environment is identical across different machines.
Easy to deploy Using Docker simplifies the Locust deployment process.
Easy to scale Docker can be combined with container orchestration tools to scale Locust worker nodes automatically.
Isolation Docker containers provide a degree of isolation, separating the Locust runtime from the host system.
Disadvantage Description
Performance overhead Docker containers may incur some performance loss compared with running Locust directly on the host.
Learning curve Users unfamiliar with Docker may need time to learn its basic concepts and usage.
Resource usage Running Docker containers consumes system resources (CPU, memory, disk space, etc.).

That said, in the following scenarios, running Locust in Docker is the better choice:

  1. Distributed stress testing: distributed tests run Locust master and worker nodes on multiple machines. Docker ensures all nodes share an identical runtime environment and simplifies deployment.

  2. Cloud deployment: for stress testing in cloud environments (AWS, Azure, GCP, etc.), Docker simplifies deployment and lets you take full advantage of the platforms' container services (Amazon ECS, Google Kubernetes Engine, etc.).

  3. CI/CD integration: to integrate stress testing into a continuous integration / continuous deployment (CI/CD) pipeline, Docker simplifies the integration. Many CI/CD tools (Jenkins, GitLab CI, Travis CI, etc.) support Docker.

  4. Avoiding environment conflicts: if other Python applications are already installed in your development or test environment, dependency conflicts may occur. Docker isolates the Locust runtime from the host system, avoiding potential conflicts.

  5. Team collaboration: Docker ensures every team member uses the same Locust runtime environment, avoiding problems caused by environment differences.

Usage steps

  1. First, make sure Docker is installed. If not, see the official Docker documentation for installation instructions for your operating system.

  2. Write a Locust test script. For example, create a file named locustfile.py with the following content:

from locust import HttpUser, task, between

class MyUser(HttpUser):
    wait_time = between(1, 5)

    @task
    def my_task(self):
        self.client.get("/")
  3. Pull the official Locust image from Docker Hub:
docker pull locustio/locust
  4. Run Locust in Docker with the following command:
docker run --rm -p 8089:8089 -v $PWD:/mnt/locust locustio/locust -f /mnt/locust/locustfile.py --host TARGET_HOST

In this command, the current directory (containing the locustfile.py file) is mounted into the container at /mnt/locust. The -f option then specifies the Locust test script to run, and --host specifies the target host address.

  5. Open Locust's web interface: visit http://localhost:8089 in a browser and you will see the Locust UI, where you can start the stress test and view the results.

With the steps above you can run Locust inside Docker without installing Locust in your local environment.

In short, when you need environment consistency, a simplified deployment process, CI/CD integration, conflict-free environments, or team collaboration, Docker is a very good way to run Locust. It lets you easily run stress tests on different machines or in cloud environments, enabling larger-scale distributed stress testing.

High performance FastHttpUser

Locust's default HTTP client is based on python-requests. If you plan to run tests at very high throughput and the hardware running Locust is limited, it is sometimes not efficient enough.

FastHttpUser is a special user class provided by Locust for making HTTP requests. Unlike the default HttpUser, FastHttpUser uses the much faster geventhttpclient library, which can sometimes increase the maximum request rate on given hardware by 5 to 6 times. Under the same concurrency, using FastHttpUser noticeably reduces resource consumption on the load generator and therefore sustains more HTTP requests.

Advantages

  1. Performance: FastHttpUser's main advantage is performance. Because it uses a faster HTTP client, it typically outperforms the default HttpUser, so the same hardware can generate a larger load.

  2. Resource usage: compared with the default HttpUser, FastHttpUser usually consumes fewer resources (CPU, memory, etc.), so more concurrent users can run on the same machine.

  3. Higher concurrency: thanks to its performance and resource advantages, FastHttpUser better supports tests with very large numbers of concurrent users, which is useful when simulating high-traffic web applications, APIs, and so on.

Note, however, that FastHttpUser also has limitations. For example, it may not support some specific HTTP features (such as custom SSL certificates or proxy settings). Choosing FastHttpUser means weighing its performance advantage against feature support; if the test scenario does not need massive concurrency, or needs specific HTTP features, the default HttpUser may be more suitable.

Below is an example Locust test script using FastHttpUser:

from locust import FastHttpUser, task, between

class MyFastHttpUser(FastHttpUser):
    wait_time = between(1, 5)

    @task
    def my_task(self):
        self.client.get("/")

Test other protocols like gRPC

Locust is not only an HTTP interface testing tool; "HttpUser" is merely a built-in example. In theory, as long as a client is provided, Locust can test any protocol.

other

Comparison of mainstream performance testing tools

Below is a comparison of the strengths, weaknesses and supported features of four performance testing tools: Locust, JMeter, Wrk and LoadRunner:

Tool Strengths Weaknesses Supported features
Locust - Simple and easy to use; tests written in Python
- Test scenarios are written in code, giving high flexibility
- Supports distributed deployment for large-scale tests
- Supports Web and WebSocket testing
- Relatively few features; no GUI for test authoring
- Less friendly to non-Python developers
- Distributed nodes must be managed manually in large-scale tests
- HTTP(S) and WebSocket testing
- Supports assertions, parameterization, data-driven tests, etc.
- Supports distributed testing
JMeter - Feature-rich; supports many protocols
- Has a GUI; easy to use
- Supports distributed deployment for large-scale tests
- Supports plug-in extensions
- Weaker performance; less suited to high-concurrency tests
- High memory footprint; needs a lot of memory
- Steep learning curve
- Tests HTTP(S), FTP, JDBC, JMS, LDAP, SMTP, TCP, UDP and many other protocols
- Supports assertions, parameterization, data-driven tests, etc.
- Supports distributed testing
Wrk - Excellent performance; supports high-concurrency testing
- Lua scripting support; high flexibility
- Multiple output formats for convenient result analysis
- relatively few functions, no GUI support
- only HTTP protocol testing
- steep learning curve
- HTTP(S) testing
- supports assertion, parameterization, data-driven, etc.
LoadRunner - Rich in functions, supports multiple protocols
- Supports GUI, easy to use
- Supports distributed deployment, supports large-scale testing
- Supports plug-in extensions, which can extend functions
- The price is high, not suitable for small teams
- The learning curve is steep
- The support for non-Windows platforms is not friendly enough
- Testing of various protocols such as HTTP(S), FTP, JDBC, JMS, LDAP, SMTP, TCP, UDP, etc.
- Support functions such as assertion, parameterization, and data drive
- Support distributed testing

Note that the strengths, weaknesses and feature lists above are relative; the right tool should be chosen according to actual needs and scenarios.



Origin blog.csdn.net/GDYY3721/article/details/132131819