Concurrency patterns of different performance testing tools

The performance testing tools most people know are LoadRunner and JMeter, along with more niche tools such as Locust, nGrinder, and Gatling. But do you know how these tools differ? Why can some tools simulate thousands or even tens of thousands of concurrent users on a single machine, while others manage only one or two thousand? This article looks at a side of performance testing tools that is often overlooked: the concurrency mode.

1. Multi-process/multi-thread concurrent mode

Multiprocessing: Execute multiple programs at the same time. For example, run WeChat, QQ, and various browsers (you can see multiple programs running in the process list).
Multithreading: Execute multiple threads at the same time. For example, use a browser to watch news, listen to songs, and download at the same time (only one browser process is started, running multi-threaded tasks).

1. Process and thread switching mode

The representative tool that supports both process and thread modes is LoadRunner (LR).

LoadRunner can run each Vuser as a thread or as a process. The two options differ as follows:

(1) Run Vuser as a thread. By default, LR starts one mmdrv.exe process for every 50 Vusers; when the Controller scenario finishes, the mmdrv.exe processes end as well.

With Run-Time Settings set to run Vusers as threads, if the number of Vusers in the Controller is 50 or fewer, the Windows Task Manager shows one mmdrv.exe process; with 51 to 100 Vusers, it shows two.
(2) Run Vuser as a process. LR starts one mmdrv.exe process for each Vuser; when the Controller scenario finishes, the mmdrv.exe processes end as well.
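The process counts described above can be estimated with a little arithmetic. The following is a minimal sketch, assuming the default grouping of 50 Vusers per mmdrv.exe in thread mode; the function name and the mode labels are hypothetical and are not part of LoadRunner.

```python
import math

# Default number of Vusers served by one mmdrv.exe in thread mode,
# as described in the text above.
VUSERS_PER_PROCESS = 50


def mmdrv_process_count(vusers: int, mode: str) -> int:
    """Estimate how many mmdrv.exe processes a scenario starts.

    mode is a hypothetical label: "thread" or "process".
    """
    if vusers < 0:
        raise ValueError("vusers must be non-negative")
    if mode == "thread":
        # One process per group of up to 50 Vusers.
        return math.ceil(vusers / VUSERS_PER_PROCESS)
    if mode == "process":
        # One process per Vuser.
        return vusers
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    for n in (50, 51, 100, 101):
        print(n, mmdrv_process_count(n, "thread"), mmdrv_process_count(n, "process"))
```

The gap between the two columns grows quickly, which is why process mode supports fewer Vusers on the same hardware.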


Process mode consumes far more resources because it needs a large number of mmdrv.exe processes (each process has its own exclusive resources, whereas threads share one memory space), so it supports less concurrency on the same hardware. What it buys in exchange is stability and safety: because processes do not contend for shared memory the way threads do, errors are rare and exceptions are unlikely during a stress test. The following protocols do not support multi-threaded concurrency in LR: Sybase-Dblib, Informix, Tuxedo, and PeopleSoft-Tuxedo, because they are not natively thread-safe (contention over shared data would occur).

2. Multi-threaded concurrent mode

The representative tool that supports multi-threaded concurrent mode is JMeter.

(1) It depends heavily on how well the development language and the operating system support multithreading;
(2) Thread switching consumes a relatively large amount of resources, so the effective concurrency generated on the same hardware is limited;
(3) Multithreading is also relatively error-prone, for example deadlocks and corrupted shared data;
(4) The errors above, when introduced by secondary development, can be reduced by relying on the tool's rich interfaces;
(5) Insufficient concurrency can be made up for with extension development and plug-ins that enable distributed execution;
(6) Multi-threading is a relatively mature technology and will continue to be used in many performance testing tools for a long time.
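To make the thread-per-user idea concrete, here is a minimal Python sketch. It is not JMeter code: the function name is hypothetical, and time.sleep stands in for a blocking HTTP request. Each virtual user occupies one thread for its whole lifetime, and the shared counter needs a lock, which is the kind of shared-data hazard listed in point (3).

```python
import threading
import time


def run_threaded_users(num_users: int, requests_per_user: int,
                       think_time: float = 0.001) -> int:
    """Run one thread per virtual user and return the number of completed requests."""
    completed = 0
    lock = threading.Lock()

    def virtual_user() -> None:
        nonlocal completed
        for _ in range(requests_per_user):
            # Stand-in for a blocking request: the thread is held while waiting.
            time.sleep(think_time)
            # Shared state must be protected, otherwise updates can be lost.
            with lock:
                completed += 1

    threads = [threading.Thread(target=virtual_user) for _ in range(num_users)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return completed


if __name__ == "__main__":
    print(run_threaded_users(num_users=20, requests_per_user=5))
```

Raising num_users raises the number of operating system threads one for one, which is where the context-switching cost comes from.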

As the representative multi-threaded tool, JMeter is clearly lighter than multi-process tools, but its effective concurrency on one machine is still limited, so distributed agents are needed. However, a distributed agent can start only one process (the slave), and that process can run only one job at a time (the process has a single communication port and generates concurrency through multi-threading), so JMeter's distributed mode does not support running multiple tasks concurrently. On the other hand, the JMeter master (main node) can be started multiple times as separate processes. Some stress testing platforms, such as MeterSphere, take advantage of this: they control multiple JMeter instances to run tasks concurrently (parallel tasks across processes plus multi-threaded concurrency within each test) instead of relying on single-process, multi-threaded distributed agents.

For details, please refer to my article "Understanding the Performance Test Architecture of MeterSphere".

3. Multi-process and multi-thread mode

The representative tool that takes full advantage of both process and thread concurrency is nGrinder.

Conversion relationship of virtual users:
Number of processes: how many processes are started from each server to run
Threads: the number of new threads created by each process
Concurrency = number of agents x number of processes x number of threads
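The conversion above is simple multiplication, and it can also be inverted to size a test. The sketch below is only an illustration; both function names are hypothetical and are not part of nGrinder.

```python
import math


def total_concurrency(agents: int, processes_per_agent: int,
                      threads_per_process: int) -> int:
    """Concurrency = number of agents x number of processes x number of threads."""
    return agents * processes_per_agent * threads_per_process


def agents_needed(target_concurrency: int, processes_per_agent: int,
                  threads_per_process: int) -> int:
    """Smallest number of agents that reaches the target concurrency."""
    per_agent = processes_per_agent * threads_per_process
    if per_agent <= 0:
        raise ValueError("processes and threads must be positive")
    return math.ceil(target_concurrency / per_agent)


if __name__ == "__main__":
    # 3 agents, each with 2 processes of 50 threads.
    print(total_concurrency(3, 2, 50))
    # Agents required for 1000 virtual users with the same per-agent settings.
    print(agents_needed(1000, 2, 50))
```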


nGrinder supports multiple tests and dynamic agent assignment, so agents are assigned to a test only when that test is actually executed. According to nGrinder, this sets it apart from its competitors. As a result, multiple users can run multiple tests simultaneously with a relatively small number of agents. The number of possible concurrent tests depends on the number of free agents.

Summary: Compared with multi-processing, multi-threading is obviously much lighter, and it can make full use of the parallel processing capability of multi-core CPUs, so it is much more efficient. However, each thread handles the whole cycle of sending a request, waiting, and receiving the response, and its resources are not released until that cycle ends. Threads therefore have to compete for CPU time through context switching, and heavy context switching seriously slows down their execution. As a result, the effective thread concurrency of a single machine is limited, and higher concurrency can only be reached by adding agents.

2. Message loop (EventLoop) concurrency mode

The representative tool is Locust, which is based on coroutines (micro-threads that are roughly as cheap as function calls, and therefore much lighter than threads) rather than on callbacks. A single Locust process uses only one CPU core; to use multiple cores it has to run in distributed mode.

1. The biggest advantage of the EventLoop model is that a large amount of concurrency is handled in one thread, which avoids the various problems caused by multithreading. Sending and receiving messages are independent of each other, and no thread has to stay tied to one request from start to finish, so there is no context switching between threads.
2. The disadvantage is that a single event loop cannot use several cores of a multi-core processor at the same time. Because one thread carries all the concurrency, one core is enough for it and the other cores sit idle, which wastes hardware resources. This has to be made up for by running multiple instances in distributed mode.
3. In this concurrency model, the number of concurrent users can only be set to a fixed value and cannot be changed during the stress test. This differs from JMeter and Gatling, both of which can change the number of concurrent users while a test is running.
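The single-threaded behaviour described in the points above can be sketched in a few lines. Locust itself is built on gevent; this example swaps in the standard-library asyncio event loop as a stand-in, and the function names are hypothetical. asyncio.sleep plays the role of a non-blocking request: while one virtual user waits, the loop runs the others.

```python
import asyncio
import threading


async def virtual_user(user_id: int, results: list) -> None:
    # Simulated non-blocking request: control returns to the event loop here.
    await asyncio.sleep(0.01)
    # Record which OS thread handled this user.
    results.append((user_id, threading.get_ident()))


async def run_users(num_users: int) -> list:
    results: list = []
    # All virtual users are scheduled on the same event loop.
    await asyncio.gather(*(virtual_user(i, results) for i in range(num_users)))
    return results


def simulate(num_users: int) -> list:
    """Run num_users coroutines on one event loop and return (user_id, thread_id) pairs."""
    return asyncio.run(run_users(num_users))


if __name__ == "__main__":
    outcome = simulate(500)
    thread_ids = {thread_id for _, thread_id in outcome}
    print(len(outcome), "users handled by", len(thread_ids), "thread(s)")
```

Because every coroutine runs on the same thread, no lock is needed around the shared list, but only one CPU core does the work.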

3. Actor Concurrency Mode

This concurrency mode is relatively new to performance testing; it is an old technology put to new use. The representative tool is Gatling (it was released relatively late, so it was able to adopt this approach).
With the arrival of the multi-core era and distributed systems, the shared-memory model (the multi-threading described above) has become increasingly ill-suited to concurrent programming, so the Actor model, which appeared decades ago, has received renewed attention. MapReduce is often cited as a typical application of the same idea, and Erlang, a programming language that supports Actors at the language level, has become popular again. Scala also provides Actors, though through libraries rather than at the language level. Java has third-party Actor packages, and the channel mechanism in Go is an Actor-like model as well.

1. This model combines the advantages of the multi-threaded concurrency model and the message loop concurrency model, avoiding multi-threading problems while making full use of hardware resources;
2. It is based on message passing. Each virtual user is an Actor, so virtual users are relatively independent of each other (no locking is needed) and communicate only through messages, which makes high concurrency possible even in a single thread;
3. The mailbox is the communication bridge between Actors; it stores the messages from senders in a FIFO queue;
4. The number of concurrent virtual users (Actors) can easily be increased or decreased at runtime.
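The points above can be illustrated with a toy Actor system. This is a teaching sketch in Python, not Gatling's implementation: the class names, the "collector" actor, and the round-robin dispatcher are all assumptions made for the example. Each actor owns a FIFO mailbox and private state, actors interact only by sending messages, and new actors can be spawned at any time.

```python
from collections import deque


class Actor:
    """Base actor: private state plus a FIFO mailbox."""

    def __init__(self, name: str, system: "ActorSystem") -> None:
        self.name = name
        self.system = system
        self.mailbox: deque = deque()

    def receive(self, message) -> None:
        raise NotImplementedError


class ActorSystem:
    """Single-threaded dispatcher that delivers one message per actor per pass."""

    def __init__(self) -> None:
        self.actors: dict = {}

    def spawn(self, actor_cls, name: str) -> Actor:
        # Actors can be added at any time, including while messages are pending.
        actor = actor_cls(name, self)
        self.actors[name] = actor
        return actor

    def send(self, target: str, message) -> None:
        self.actors[target].mailbox.append(message)

    def run(self) -> None:
        # Keep delivering until every mailbox is empty.
        progressed = True
        while progressed:
            progressed = False
            for actor in list(self.actors.values()):
                if actor.mailbox:
                    actor.receive(actor.mailbox.popleft())
                    progressed = True


class Collector(Actor):
    """Gathers results; its list is touched only by its own receive()."""

    def __init__(self, name: str, system: ActorSystem) -> None:
        super().__init__(name, system)
        self.responses: list = []

    def receive(self, message) -> None:
        self.responses.append(message)


class VirtualUser(Actor):
    """Pretends to execute a request, then reports to the collector by message."""

    def receive(self, message) -> None:
        self.system.send("collector", (self.name, message))


if __name__ == "__main__":
    system = ActorSystem()
    collector = system.spawn(Collector, "collector")
    system.spawn(VirtualUser, "u1")
    system.spawn(VirtualUser, "u2")
    system.send("u1", "GET /a")
    system.send("u1", "GET /b")
    system.send("u2", "GET /a")
    system.run()
    print(collector.responses)
```

No lock appears anywhere in the sketch because no two actors ever touch the same state; real Actor runtimes schedule mailboxes across a pool of threads to use all cores.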

Because the Actor model is lightweight and highly concurrent, and because Scala runs on the JVM, Gatling's concurrency model combines the advantages of JMeter and Locust: it avoids many of the problems of multithreading while still making full use of multi-core hardware. In addition, the Actor model is built on message passing, so like the message loop model it can sustain high concurrency in a single thread, and it can easily increase or decrease the number of concurrent virtual users (Actors) at runtime. Although the concurrency model is very good, test scripts have to be developed in Scala, which discourages many testers, so Gatling is not very widely used. That does not stop people from being drawn to the technology, and future performance testing tools are likely to favor this concurrency mode.

If you have any questions about Actors, please refer to the article "Understanding Actor Mode in Ten Minutes".

4. Traffic copy playback mode

Traffic copy and playback is not a concurrency mode, nor is it related to concurrency technology. I include it here because it is also a mode used by performance testing tools. It does not need to create concurrency; it only needs to copy and amplify the traffic of the production environment to simulate the effect of millions of concurrent users. From a business perspective, the purpose of simulating concurrency is also to simulate a large amount of traffic. In the Internet era, traffic is the lifeblood of a business, so copying and reusing real traffic is sometimes more meaningful than simply simulating concurrency.

What is traffic copying?
We define the data transmission caused by user access to the system as traffic. While users access the system, we can copy the incoming and outgoing data and either save it for later use (offline mode) or forward it to a new server for immediate use (online mode).
What is traffic playback?
After obtaining the copied traffic, we send it to the service under test one request at a time, in the order it was received, and let the service generate the corresponding responses. This is equivalent to having real users do the testing for us.
Playback testing usually takes one of the following forms:
(1) Play back whatever was copied, that is, full playback;
(2) Filter the copied content by preset rules, or apply special processing, before playing it back, that is, selective playback;
(3) Process the copied content to extract the necessary data items, such as search terms, that is, keyword playback.
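The three playback forms above can be sketched for the offline case. The code below is not how TCPCopy or GoReplay work internally: the RecordedRequest type, the function names, the amplify factor, and the q query parameter are all hypothetical, and the plan is only built, not sent over the network.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional
from urllib.parse import parse_qs, urlsplit


@dataclass(frozen=True)
class RecordedRequest:
    timestamp: float  # seconds since the start of the capture
    method: str
    path: str


def build_replay_plan(recorded: Iterable[RecordedRequest],
                      keep: Optional[Callable[[RecordedRequest], bool]] = None,
                      amplify: int = 1) -> List[RecordedRequest]:
    """Order captured requests by time, optionally filter and amplify them.

    keep=None gives full playback; a predicate gives selective playback.
    amplify repeats each request to multiply the traffic volume.
    """
    if amplify < 1:
        raise ValueError("amplify must be at least 1")
    selected = [r for r in recorded if keep is None or keep(r)]
    selected.sort(key=lambda r: r.timestamp)
    plan: List[RecordedRequest] = []
    for request in selected:
        plan.extend([request] * amplify)
    return plan


def extract_search_terms(recorded: Iterable[RecordedRequest],
                         param: str = "q") -> List[str]:
    """Keyword playback: pull one query parameter out of each captured URL."""
    terms: List[str] = []
    for request in recorded:
        terms.extend(parse_qs(urlsplit(request.path).query).get(param, []))
    return terms


if __name__ == "__main__":
    captured = [
        RecordedRequest(2.0, "GET", "/search?q=phone"),
        RecordedRequest(1.0, "POST", "/login"),
        RecordedRequest(3.0, "GET", "/home"),
    ]
    print([r.path for r in build_replay_plan(captured)])
    print([r.path for r in build_replay_plan(captured, keep=lambda r: r.method == "GET", amplify=2)])
    print(extract_search_terms(captured))
```

A real replayer would also reproduce the original timing between requests and send each entry of the plan to the service under test.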
TCPCopy is a fairly common and capable tool for copying and replaying traffic, but it places high demands on the network setup. Its networking architecture is as follows:

1. Online Server (OS): TCPCopy is deployed on this server. It captures request packets at the data link layer (pcap interface) and sends packets out from the IP layer;
2. Test Server (TS): The TS sets routing information so that its response packets are routed to the Assistant Server;
3. Assistant Server (AS): This is an independent auxiliary server. In principle, an idle server on the same network segment must be used for it. The AS intercepts the response packets at the data link layer, extracts the useful information from them, and returns it to the TCPCopy process on the corresponding OS.

In addition to TCPCopy, there is another popular traffic replay tool, Gor (GoReplay), which interested readers can explore on their own.

That concludes this introduction to the concurrency modes of performance testing tools. The content of this article is taken from a small part of my recorded course "Perplexity of Core Knowledge of Performance Testing". If you are interested, you are welcome to study the recorded course; the following is the knowledge structure diagram of the course:

My course:  https://edu.csdn.net/lecturer/5782

Reference articles - Liu Ran: "Gatling, a new generation of server performance testing tools" and "A comparison of concurrency models of performance testing tools (JMeter, Locust and Gatling)"


Origin blog.csdn.net/smooth00/article/details/112600406