A common misunderstanding among performance testing novices: number of users vs. pressure

If the same project and the same performance requirements are tested by different testers, will the results be the same?

Suppose there is a small forum, and the requirement given to the performance testers is "support 50 concurrent users, and the response time must be within 3 seconds." Testers A and B start performance testing at the same time, each working independently.

Consider only the posting operation. A designs a scenario in which 50 people post at the same instant, and measures an average completion time of 5 seconds. He files a defect, concluding that the system does not meet the performance requirement and needs to be optimized by the developers.

B designs a scenario in which 50 people are online and each publishes one post within 5 minutes, that is, 10 posts per minute. He measures an average completion time of 2 seconds and concludes that the system passes the performance test and can handle the pressure of going live.

Two people got different test results and completely opposite test conclusions. Who made the mistake?

Perhaps this example is too extreme. Absolute concurrency and evenly distributed access pressure are of course completely different. Let's look at a more realistic example.

It is still a small forum, and the requirement is "when 100 people are online, the page response time should be less than 3 seconds." A and B start working at the same time. By now they are more experienced and know how to design more realistic test scenarios. Assume they have confirmed that the user's workflow is "log in - enter the sub-forum - (browse list - browse post) × 10 - post", that is, each user reads 10 posts and publishes one. So both record the same test script.

A believes that a 30-second interval between a user's operations is appropriate, so he adds a 30-second wait (think time) between every two transactions in the script.

B thinks back to how he browses forums himself. There seems to be about 1 minute on average between his mouse clicks, so he adds a 1-minute wait between every two transactions in the script.

Both believe their scenario is closer to reality, but unfortunately the test results differ. With half the think time, scenario A generates roughly twice the pressure of scenario B. So who is wrong? Some people will say the cause is unclear requirements. In that case, what kind of requirements do you need?

Questions of exactly this kind are easy to find online (for example on 51testing). Many performance testers must receive requirements like this every day and run their tests in just this way. The results can be imagined.
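The gap between the two scripts can be estimated before running anything. The following is a minimal sketch, assuming a closed workload in which every virtual user loops through "send request, wait for the response, think", so that throughput is roughly users / (think time + response time). The function name and the 1-second response time are assumptions for illustration; they do not come from the example.

```python
def requests_per_second(users: int, think_time_s: float, response_time_s: float) -> float:
    """Approximate steady-state throughput of a closed workload (Little's Law)."""
    return users / (think_time_s + response_time_s)


# 100 online users, as in the requirement; the 1-second response time is assumed.
load_a = requests_per_second(users=100, think_time_s=30, response_time_s=1.0)
load_b = requests_per_second(users=100, think_time_s=60, response_time_s=1.0)

print(f"A: {load_a:.2f} req/s, B: {load_b:.2f} req/s, ratio: {load_a / load_b:.2f}")
# prints "A: 3.23 req/s, B: 1.64 req/s, ratio: 1.97"
```

The same 100 "online users" produce very different request rates depending on a think time that the requirement never mentions.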

I would like to ask a few questions here, and I hope you will think about them after reading the small examples above:

If another person tests the same system as you, will your test results be consistent?
If not, who is correct?
How can you prove that the test results are valid?

If you have some doubts and are less confident about the previous test results, then please continue.

Server perspective vs. user perspective

A very important part of performance testing is to simulate the expected pressure and test the user experience when the system is running under this pressure.

So what is pressure? Pressure is the work the server is continuously processing, often many things at the same time. It is something the server handles directly; it is not the users sitting far away on the other side of the network.

In the figure below, each colored line segment represents one operation. At any moment, the server knows that it has 10 transactions to process, and that those 10 transactions come from 10 users. What it does not know is how many different users generated all the transactions over the whole time period.

That sounds a bit convoluted, so here is a more concrete explanation. At time 1, the 10 current transactions are initiated by 10 users. At time 2, there are still 10 ongoing transactions, but they may be initiated by 10 completely different people. Throughout this period the server is processing 10 transactions at every moment, but the number of people who took part (and put pressure on the server) may be in the hundreds, or it may be just the original 10.

So, what is the pressure on the server? Obviously there are only these 10 simultaneous transactions at each moment, and there is not much difference whether there are 10 people or 1,000 people (disregarding session and other issues for the time being).

Now look at it from the user's perspective. In practice, users do not all start their operations at the same moment; they arrive one after another over time. As shown in the figure below, a total of 23 users performed operations during this time period.

But can the server see these users? All it knows is how many transactions are executing at a given point in time. If you count, you will find there are still 10 concurrent transactions at any moment in this figure.

In fact, these two pictures describe the same scene, but the observer's perspective is different.
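The two perspectives can be reproduced from a simple transaction log. This is a minimal sketch with made-up sample data (the tuples, user IDs, and function names are hypothetical): it counts how many transactions overlap at once, which is what the server sees, and how many distinct users produced them, which is what the requirement usually talks about.

```python
from typing import List, Tuple

# (user_id, start_time, end_time) of one transaction; sample data is made up.
Transaction = Tuple[str, float, float]


def peak_concurrency(transactions: List[Transaction]) -> int:
    """Largest number of transactions in flight at the same instant."""
    events = []
    for _, start, end in transactions:
        events.append((start, 1))   # a transaction begins
        events.append((end, -1))    # a transaction finishes
    # At equal timestamps, process finishes before starts so that
    # back-to-back transactions are not counted as overlapping.
    events.sort(key=lambda e: (e[0], e[1]))
    current = peak = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak


def distinct_users(transactions: List[Transaction]) -> int:
    """Number of different users who issued at least one transaction."""
    return len({user for user, _, _ in transactions})


# Two users who keep working for the whole period.
same_users = [("u1", 0, 10), ("u2", 0, 10), ("u1", 10, 20),
              ("u2", 10, 20), ("u1", 20, 30), ("u2", 20, 30)]
# Six users who each perform a single transaction, two at a time.
many_users = [("u1", 0, 10), ("u2", 0, 10), ("u3", 10, 20),
              ("u4", 10, 20), ("u5", 20, 30), ("u6", 20, 30)]

for name, log in (("same users", same_users), ("many users", many_users)):
    print(name, "concurrency:", peak_concurrency(log), "users:", distinct_users(log))
```

Both logs give the server the same concurrency of 2, while the user count is 2 in one case and 6 in the other.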

So think about it, what do the "concurrent users" most commonly seen in performance requirements refer to?

Concurrent users

Many people who use the term "concurrent users" are not thinking from a server perspective. They think about the number of people sitting in front of a computer using the system and stressing the system. For this reason, I rarely use this misleading term, but make a more detailed classification. There are mainly the following: number of system users (number of registered users), number of online users (relative number of concurrent users), and absolute number of concurrent users.

The "concurrent users" mentioned in the examples above are actually the number of online users. I prefer to call it the relative number of concurrent users, because that name better conveys that these users put pressure on the system. The relative number of concurrent users refers to the number of users who interact with the server and put pressure on it within a period of time. This time period can be one day or one hour, and it is exactly this information that the people writing the requirements must provide.

Absolute concurrent users are mainly used to test a single operation: multiple users initiate the same request at the same instant. This can verify whether there are concurrency problems in the logic, such as thread-safety bugs or deadlocks. It can also provide some performance reference information. For example, if 1 user takes 1 second but 10 concurrent users take 30 seconds, there is likely a problem that requires attention, because even processing the 10 requests one after another in a queue should take only 10 seconds. But this kind of absolute concurrency test has little to do with the user experience under real pressure.
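An absolute concurrency check can be sketched with standard-library threads. This is only an illustration under stated assumptions: `fake_post` is a hypothetical stand-in for the real posting request, and the helper names are invented here. A barrier holds every thread until all are ready, so the calls start as close to simultaneously as threads allow.

```python
import threading
import time
from typing import Callable, List


def run_absolute_concurrency(operation: Callable[[], None], users: int) -> List[float]:
    """Release `users` threads at the same moment; return each call's duration in seconds."""
    barrier = threading.Barrier(users)
    durations = [0.0] * users

    def worker(index: int) -> None:
        barrier.wait()  # block until every thread has reached this point
        started = time.perf_counter()
        operation()
        durations[index] = time.perf_counter() - started

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(users)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return durations


def worse_than_queueing(single_user_s: float, users: int, concurrent_s: float) -> bool:
    """True when the concurrent time exceeds serving the requests one by one."""
    return concurrent_s > single_user_s * users


def fake_post() -> None:
    """Hypothetical stand-in for the real 'post' request."""
    time.sleep(0.01)


durations = run_absolute_concurrency(fake_post, users=10)
print(f"slowest of 10 concurrent calls: {max(durations):.3f}s")

# The example from the text: 1 user takes 1 s, 10 concurrent users take 30 s.
print(worse_than_queueing(single_user_s=1, users=10, concurrent_s=30))  # prints True
```

A result worse than plain queueing suggests contention such as locking, which is worth investigating even though the scenario itself is not a realistic usage pattern.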

Going back to the concept of relative concurrency, what is its relationship with the pressure on the server? If you understand all the previous content, then you will know that the two are not directly related (of course, in the same test case, the more users there are, the greater the pressure). In other words, with the performance requirements you get, you have no way of knowing how much pressure the server will be under.

So how should performance testing be conducted?

How to simulate pressure

Now that we know that pressure is defined from the server's perspective, and that the transactions the server has to handle are the pressure, we can start from there to work out what information performance testing needs. Still using the small forum as an example, suppose we need to test whether the system can still provide a good user experience when there are 500 active users.

Assume that there are currently 50 active users (or that the figures can be derived from a similar system), that an average of 50 posts are published per day, and that posts are viewed 500 times per day; that is, each person publishes one post and views ten posts per day. (For ease of explanation, assume the forum has only these two basic functions.) Then we can infer that when active users reach 500, the daily business volume will grow proportionally: on average, 500 new posts will be published and posts will be viewed 5,000 times a day.

Further analysis of the data reveals something else. Forum usage is highly concentrated, mostly between 11:00 and 13:00 at midday and between 18:00 and 20:00 in the evening. In other words, this daily workload is actually completed within 4 hours, as shown in the picture below (the picture is only a rough illustration; its numbers do not match this example).

Then our test scenario is to use 500 users to complete the workload of "each person posts one post and views ten posts" within 4 hours.
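The scenario translates into concrete numbers for a load tool. The following is a rough sketch that assumes the work is spread evenly across the 4-hour window, which is a simplification; the variable names are chosen here for illustration.

```python
ACTIVE_USERS = 500
POSTS_PER_USER = 1
VIEWS_PER_USER = 10
WINDOW_HOURS = 4

window_s = WINDOW_HOURS * 3600

total_posts = ACTIVE_USERS * POSTS_PER_USER      # 500 new posts
total_views = ACTIVE_USERS * VIEWS_PER_USER      # 5,000 post views
total_transactions = total_posts + total_views   # 5,500 transactions

# Average rate the server must sustain during the busy window.
arrival_rate = total_transactions / window_s

# Average gap between two operations of the same user, if each user
# spreads their 11 operations evenly over the window.
pacing_s = window_s / (POSTS_PER_USER + VIEWS_PER_USER)

print(f"{total_transactions} transactions in {WINDOW_HOURS} h")
print(f"average arrival rate: {arrival_rate:.2f} transactions/s")  # 0.38
print(f"per-user pacing: {pacing_s:.0f} s between operations")     # 1309
```

These derived figures, rather than the bare "500 users", are what two testers need to agree on to produce the same pressure.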

Pay attention to the two phrases above, "on average every day..." and "distributed over 4 hours...". Observant testers will notice that this scenario measures average pressure, that is, the pressure on the system on a typical day. I like to call it daily pressure.

Obviously, in addition to daily pressure, the system will also have more stressful usage scenarios. For example, if an important thing happens one day, users will discuss it more enthusiastically. This pressure, which I am accustomed to calling peak pressure, requires a specially designed test scenario.

What data does this scenario need? We can again derive it from existing data. Where the daily scenario used "the average daily total number of posts...", this time we look for the highest single-day business volume on record. "Distributed over 4 hours" also needs to be adjusted, for example by checking the historical distribution to see whether there was a more concentrated period, or by applying the simpler and more general 80-20 principle: 80% of the work is completed in 20% of the time. Based on these data, the scenario can be adjusted to represent peak periods.
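The 80-20 adjustment can be expressed as a small calculation. This is a sketch under explicit assumptions: the rule is applied to the 4-hour usage window (the text does not say which period it applies to), and the peak-day volume of 11,000 transactions is a hypothetical figure standing in for the real historical maximum.

```python
def concentrated_rate(total_transactions: int, window_hours: float,
                      work_share: float = 0.8, time_share: float = 0.2) -> float:
    """Transactions per second during the busiest slice of the window.

    `work_share` of the transactions are assumed to arrive within
    `time_share` of the window (80-20 by default).
    """
    busy_seconds = window_hours * 3600 * time_share
    return total_transactions * work_share / busy_seconds


# Hypothetical highest single-day volume; replace with real historical data.
PEAK_DAY_TRANSACTIONS = 11_000
WINDOW_HOURS = 4

uniform = PEAK_DAY_TRANSACTIONS / (WINDOW_HOURS * 3600)
concentrated = concentrated_rate(PEAK_DAY_TRANSACTIONS, WINDOW_HOURS)

print(f"evenly spread: {uniform:.2f} transactions/s")
print(f"80-20 concentrated: {concentrated:.2f} transactions/s")
```

With the default 80-20 split, the busy slice always runs at four times the evenly spread rate, whatever the total volume is.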

In actual work, more test scenarios may be needed, such as spike pressure scenarios. What is spike pressure? For example, a bank website may see a sudden surge in visits after a major news release. This sudden pressure also needs to be considered by performance testers.

Note the difference between peak pressure and spike pressure. Peak pressure is a high point of the system's normal, expected pressure. Spike pressure is pressure outside normal expectations, which may occur only once every few years.

This is only the simplest example; actual work is much more complicated. What data is needed, and how can it be obtained? Obtaining it will likely take considerable effort. This involves a very important topic, the building of user models and pressure models, which will be discussed in a dedicated article in the future.

Why spend so much effort collecting this information? This is because only through these effective data can we accurately simulate user scenarios, accurately simulate pressure, and obtain a more realistic user experience. Only in this way will it be possible for "different testers to measure the same results", and the results will be accurate and effective.

Key points review

Finally, let’s summarize and review through a few small questions:

Do you really understand the meaning of "concurrent users"?

What are user perspective and server perspective?

What is pressure?

How to simulate expected pressure?

Origin blog.csdn.net/AI_Green/article/details/132887808