Introduction to performance testing (1): what the indicators in performance testing tell us

Performance Testing

Performance testing uses automated tools to simulate a variety of normal, peak, and abnormal load conditions and measure the various performance indicators of a system.

Depending on the goal, it can be divided into load testing, stress testing, capacity testing, and stability testing. Outside of professional testing organizations, what developers or operations staff do in their day-to-day work is basically stress testing.

Stress testing determines the maximum service level a system can provide by identifying its bottlenecks or the points at which performance becomes unacceptable.

Performance indicators

QPS

In the industry today, the easiest way to describe the performance of a system is to quote its QPS. QPS is sometimes also called TPS, and refers to requests (or transactions) per second. When someone says their interface handles "3,000 concurrent", they usually mean QPS = 3000, which can be understood as the system accepting and processing 3,000 requests in one second.

QPS is calculated as: number of completed requests / time taken to complete them.

For example, if within 10 seconds the system receives 3,000 requests, successfully returns 2,000, and reports errors for the remaining 1,000, then its QPS = 2000 / 10 = 200.
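A minimal sketch of this calculation in Python; the request-log format here is invented for illustration:

```python
# Compute QPS from a hypothetical request log: only successfully completed
# requests count, errors are excluded.
def qps(request_log, elapsed_seconds):
    completed = sum(1 for r in request_log if r["ok"])
    return completed / elapsed_seconds

# 3,000 requests in a 10-second window: 2,000 succeed, 1,000 fail.
request_log = [{"ok": True}] * 2000 + [{"ok": False}] * 1000
print(qps(request_log, 10))  # 200.0
```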

Response time

For a single request, response time is the time it takes the service to respond to that request. In performance testing, however, the response time of a single request has little reference value; what is usually reported is the average or the median response time across all requests.

The average response time is easy to understand: the total time taken to complete the requests divided by the number of completed requests. The average is a little unreliable, though, because a system does not run perfectly smoothly: a handful of unusually short or unusually long requests can skew it significantly. Think of the "average salary" or "average house price" figures you read in the news and you will see why the average is not always trustworthy. That is why we sometimes use the median response time instead.

The median is the value in the middle when a set of data is sorted by size, which means that at least 50% of the data points lie at or below it (and at least 50% at or above it). The most rigorous statistical practice, however, is to report percentiles, i.e. TP (Top Percentile) values: TP50 means that 50% of requests complete within a given time, and TP90 means that 90% of requests complete within a given time.
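A quick Python sketch of these statistics, using the nearest-rank percentile method on invented sample data:

```python
import math

# Compute the average and TP (Top Percentile) values from a list of
# response times in milliseconds. The sample data is made up for illustration.
def top_percentile(samples, p):
    """Smallest value such that at least p% of the samples are <= it (nearest rank)."""
    ordered = sorted(samples)
    rank = math.ceil(p / 100 * len(ordered))  # 1-based nearest rank
    return ordered[rank - 1]

response_times = [12, 13, 13, 14, 14, 15, 15, 16, 250, 900]  # ms

print("average:", sum(response_times) / len(response_times))  # 126.2, skewed by two outliers
print("TP50   :", top_percentile(response_times, 50))         # 14
print("TP90   :", top_percentile(response_times, 90))         # 250
print("TP99   :", top_percentile(response_times, 99))         # 900
```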

Concurrency

Concurrency is a particularly confusing concept because it does not mean the same thing in every context. From the perspective of performance test results:

Concurrency refers to the number of requests the server is processing at a given point in time

In day-to-day work, however, we often hear it used in other ways, for example:

An engineer may say "2,000 concurrent per second", but what he actually means is QPS = 2000.

A webmaster may say "we have 1,000 concurrent users", but what he actually means is that the peak number of online users is 1,000. Having 1,000 users online does not mean they are all interacting with the server at the same moment, so the server's concurrency has not actually reached 1,000.

An operations engineer may say "I set Tomcat's concurrency to 500", meaning that this Tomcat instance can spawn up to 500 threads to accept requests at the same time. In other words, the maximum concurrency the server can reach is 500, although in practice it may never reach that value because of CPU, OS, and other constraints.

A performance tester may say "I set the concurrency in LoadRunner to 3,000", meaning that he configured 3,000 simulated concurrent users in the test tool, so in theory up to 3,000 simulated requests hit the server per unit of time. In practice, though, the client may not be able to generate that much pressure because of its own CPU, OS, think-time, and other limitations; and even if it does, the server will queue requests and reject some when it overflows, so it still cannot be said that the server's concurrency has reached 3,000.

Therefore, the concurrency reported in a performance test is a value calculated from the test results, using the formula

Concurrency = QPS × Average Response Time (with response time in seconds)
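A small Python sketch of this formula (it is essentially Little's Law); the example numbers are invented:

```python
# Concurrency derived from test results: QPS * average response time (in seconds).
def concurrency(qps, avg_response_time_ms):
    return qps * (avg_response_time_ms / 1000.0)

# e.g. 200 QPS with an average response time of 50 ms
print(concurrency(200, 50))   # 10.0 requests in flight on average
# the same QPS with a 500 ms average response time keeps far more in flight
print(concurrency(200, 500))  # 100.0
```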

Throughput

Throughput refers to the total amount of data transmitted over the network during a performance test. For interactive applications, throughput reflects the pressure placed on the server; in capacity-planning tests it is a key indicator because it shows the system-level load capacity, and it is also valuable during performance tuning. An analogy: a large factory may be extremely efficient and fast, producing 100,000 tons of goods a day, but if its transportation capacity is poor and two small tricycles can only haul 2 tons of goods a day, then that transportation capacity is the bottleneck of the whole system. The analogy is a bit exaggerated, but that is the point it is meant to make.

Tip: throughput alone is a very inaccurate measure of a system's output capacity. Take the simplest example: one faucet left open for a day and a night lets out 10 tons of water, while 10 faucets opened for just 1 second let out 0.1 tons. The single faucet certainly has the larger throughput, but can you say its output capacity is stronger than that of the 10 faucets? We therefore have to add the unit of time and compare how much water flows out per second: that is the throughput rate.
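The faucet analogy, worked through in Python to show the difference between total throughput and throughput rate:

```python
# Throughput (total volume) vs. throughput rate (volume per unit time),
# using the faucet analogy from the text.
def throughput_rate(total_output, seconds):
    return total_output / seconds

one_faucet = throughput_rate(10.0, 24 * 60 * 60)  # 10 tons over a day and a night
ten_faucets = throughput_rate(0.1, 1)             # 0.1 tons over 1 second

print(f"1 faucet  : {one_faucet:.6f} tons/s")   # ~0.000116 tons/s
print(f"10 faucets: {ten_faucets:.6f} tons/s")  # 0.100000 tons/s
```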

Maximum concurrency

With the above indicators understood, the purpose of performance testing becomes finding the maximum concurrency of the system under specific conditions (fixed hardware, usually with network bottlenecks excluded). As concurrency rises to a certain level, the system's response time stays within an acceptable range and the service does not fail, or its failure rate stays within an acceptable range; beyond that level, the system's indicators become unacceptable. That level is taken to be the system's maximum concurrency.
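A minimal Python sketch of such a search, stepping up the load until the indicators become unacceptable. The run_load() function here is a stand-in faked with a toy latency model; in a real test it would drive your load tool (JMeter, LoadRunner, wrk, ...) and report the measured TP99 and failure rate:

```python
import random

# Step up the load until TP99 or the failure rate crosses the acceptable limit;
# the last acceptable level is reported as the maximum concurrency.
def run_load(concurrency):
    # Toy model: latency grows sharply once concurrency passes ~400.
    tp99_ms = 80 + max(0, concurrency - 400) * 5 + random.uniform(0, 10)
    failure_rate = 0.0 if concurrency < 600 else 0.05
    return tp99_ms, failure_rate

def find_max_concurrency(start=50, step=50, max_tp99_ms=500, max_failure_rate=0.01):
    best, level = 0, start
    while True:
        tp99_ms, failure_rate = run_load(level)
        if tp99_ms > max_tp99_ms or failure_rate > max_failure_rate:
            return best                      # last level whose indicators were acceptable
        best, level = level, level + step

print(find_max_concurrency())  # 450 with this toy model
```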

Applying the maximum concurrency

The maximum concurrency of a system is only a theoretical, technical value: it is often bragged about inside the technical team, but few business colleagues care about it. Just when you feel pleased with your maximum concurrency of 1,000, a customer asks, "How many users can that 1,000 concurrency carry?" You are usually stumped, and have to ask back, "Do you mean registered users or simultaneous online users?" The reality is that QPS and maximum concurrency are measured per functional module, and it is hard to translate them into a business-level user count for the whole site. Real life is far more complicated than what the test environment simulates: how heavily each feature is used, users' operating habits, and so on. Even if you use a sophisticated tool such as LoadRunner to test the different modules together, you usually only get an estimate. And the biggest problem you run into is often that you do not have a test machine powerful enough to drive thousands of simulated clients; even if you do, you rarely have enough bandwidth, and once the pressure goes up the network becomes the bottleneck first (which is exactly why testing services such as Alibaba's PTS exist).

Fortunately, we usually do not need a particularly precise value to serve the business; an approximate figure is enough, so we can still estimate from experience, for example from the PV (page views) of a previous project, weighting each module according to our own judgment. You can even divide the daily PV by the length of the peak period to get an estimated peak QPS, then use the maximum QPS to estimate the maximum daily PV the system can support, and from that work out the UV (unique visitors).
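A rough worked example of this estimation in Python; every traffic number below is invented for illustration:

```python
# Rough capacity estimation from page views, following the approach above.
daily_pv = 2_000_000            # page views per day from a previous project
peak_hours = 4                  # assume most traffic lands in a 4-hour peak window

peak_qps = daily_pv / (peak_hours * 3600)
print(f"estimated peak QPS: {peak_qps:.1f}")    # ~138.9

# Going the other way: if the measured maximum QPS is 500, estimate the
# largest daily PV the system could carry under the same traffic shape.
max_qps = 500
max_daily_pv = max_qps * peak_hours * 3600
pages_per_visit = 10            # assumed pages viewed per user visit
max_uv = max_daily_pv / pages_per_visit
print(f"supported daily PV: {max_daily_pv:,}")  # 7,200,000
print(f"supported daily UV: {max_uv:,.0f}")     # 720,000
```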

There is a post that classifies the QPS levels of common web sites; you can refer to it here:

http://www.cnblogs.com/yiwd/p/3711677.html

According to that post, 90% of websites actually sit in the first two tiers.

How to do performance testing rigorously

The following content comes from the well-known blogger Left Ear Mouse (of CoolShell).

See the original text: https://coolshell.cn/articles/17381.html

Generally speaking, a performance test should consider several factors: throughput, latency (response time), resource utilization (CPU/MEM/IO/Bandwidth…), success rate, and system stability.

The performance testing methods below basically come from my former employer, Thomson Reuters, a company that builds real-time financial data systems.

First, define the system's response time (latency), TP99 is recommended, and its success rate. For example, Reuters' definition: 99.9% of responses must complete within 1 ms, the average response time must be within 1 ms, and 100% of requests must succeed.

Second, find the highest throughput under this response-time constraint. The test data should include requests of various sizes, large, medium, and small, possibly mixed; ideally, use data from production.

Third, run a soak test at this throughput: for example, use the throughput found in step two to drive the system continuously for 7 days, while collecting CPU, memory, disk/network IO, and other metrics to check that the system is stable (e.g. CPU and memory usage stay flat). If it is, this value is the system's performance.

Fourth, find the system's limit value. For example: the highest throughput the system can sustain for 10 minutes with a 100% success rate (regardless of response time).

Fifth, run a burst test. Drive the system at the step-two throughput for 5 minutes, then at the step-four limit value for 1 minute, then back to the step-two throughput for 5 minutes, then the step-four limit value for 1 minute again, and keep alternating like this for a period of time, say 2 days. Collect system data (CPU, memory, disk/network IO, etc.) and watch their curves and the corresponding response times to make sure the system stays stable (a sketch of such a schedule appears after these steps).

Sixth, test low throughput and small network packets. Low throughput can sometimes increase latency, for example if TCP_NODELAY is not enabled (see TCP for details), and small network packets can leave the bandwidth under-utilized and hold performance down. Depending on the actual situation, the performance test should therefore also selectively cover these two scenarios.

(Note: at Reuters, the throughput obtained in step two multiplied by 66.7% is used as the system's soft alarm line and 80% as its hard alarm line; the limit value is only used to absorb sudden peaks.)
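A minimal Python sketch of the alternating schedule from step five, plus the alarm lines from the note above; apply_load() is a hypothetical hook into whatever load generator you use, and here it only prints:

```python
# Burst-test schedule: 5 minutes at the step-two throughput, 1 minute at the
# step-four limit value, repeated for the whole test window.
def apply_load(throughput, seconds):
    print(f"driving {throughput} req/s for {seconds}s")
    # in a real test, hold this load level for `seconds` here

def burst_test(normal_tp, limit_tp, total_hours=48):
    cycle_seconds = (5 + 1) * 60                  # one 5-minute + 1-minute cycle
    cycles = int(total_hours * 3600 / cycle_seconds)
    for _ in range(cycles):
        apply_load(normal_tp, 5 * 60)             # step-two throughput for 5 minutes
        apply_load(limit_tp, 1 * 60)              # step-four limit value for 1 minute

# e.g. a step-two throughput of 3,000 req/s and a limit value of 4,500 req/s,
# shortened to 1 hour so the demo prints only a few cycles
burst_test(3000, 4500, total_hours=1)

# Alarm lines as described in the note above:
print(f"soft alarm line: {3000 * 0.667:.1f} req/s")  # 2001.0
print(f"hard alarm line: {3000 * 0.8:.1f} req/s")    # 2400.0
```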
