Avoid these 12 stress testing misunderstandings to make your stress tests effective

Stress testing is essentially a matter of experience. The tooling is well supported today, and where people are confused about the supporting tools, that too comes down to experience. This article should be especially useful for people who picked up stress testing informally, without a systematic method.

1. Misunderstandings in stress testing

First, I will go through the misunderstandings in stress testing and summarize each one briefly. Those that need expanding are described in detail in the second part, on how to stress test effectively.

Misunderstanding 1: Performance testing starts with writing scripts

The most important thing is to work out why you need a stress test, what its purpose is, and which scenarios it covers. Writing a script looks like the first step only when you have no clear idea of the rest of performance requirements analysis. Starting directly with the script is not a sound order of work.

Misunderstanding 2: Performance testing must come after functional testing

This is waterfall thinking. Everyone is talking about shifting testing to the left. Why can’t performance testing be shifted to the left?

Unit-level performance testing is also possible, for example by benchmarking at the method level with a benchmark library (a library for benchmarking individual functions, used much like a unit testing framework). In practice, companies generally do not have time for unit-level testing. If an interface is implemented first, you can start by running a baseline performance test against that interface.
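As a rough illustration of a method-level benchmark, here is a minimal sketch using Python's standard `timeit` module. The function `build_order_key` is a hypothetical stand-in for the method under test, and the repeat counts are arbitrary assumptions rather than recommended values.

```python
import timeit


def build_order_key(user_id: int, sku: str) -> str:
    """Hypothetical method under test; replace with your own function."""
    return f"{user_id}:{sku.upper()}"


def benchmark(func, *args, repeat: int = 5, number: int = 10_000) -> float:
    """Return the best observed time per call, in seconds.

    timeit.repeat runs `number` calls per round and `repeat` rounds;
    taking the minimum reduces noise from other processes.
    """
    timings = timeit.repeat(lambda: func(*args), repeat=repeat, number=number)
    return min(timings) / number


if __name__ == "__main__":
    per_call = benchmark(build_order_key, 42, "sku-1")
    print(f"{per_call * 1e6:.3f} microseconds per call")
```

Running a check like this alongside unit tests gives an early baseline per method, long before a full stress test environment exists.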

Misunderstanding 3: Performance testing should cover more scenarios like functional testing

Performance testing selects scenarios based on the user access model and targets them, rather than covering everything the way functional testing does. That said, the industry is currently promoting business stability testing, and when time and resources permit, multi-scenario coverage is also very valuable. I still recommend keeping the two apart so that you reach the core goal quickly.

Misunderstanding 4: Improving hardware can improve system performance

Performance bottlenecks have many causes, and hardware is only one of them. Only a layman would claim, without any analysis, that adding hardware improves performance. Better hardware does not necessarily mean better system performance. For example, if the system is deadlocked or stuck in an infinite loop, adding hardware is useless.

Misunderstanding 5: Unrealistic performance metrics

The business side often talks about "million-level concurrency", but such figures have to be converted into measurable performance test indicators. Some testers do not understand the conversion and directly equate the number of threads in the stress testing tool with the concurrency figure quoted by the business.
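One common way to relate tool threads to throughput is Little's Law: the number of concurrently active virtual users equals throughput multiplied by the time each user spends per iteration (response time plus think time). The sketch below assumes that relationship holds for a steady-state test; the function names and the sample numbers are illustrative assumptions, not figures from any real system.

```python
import math


def required_threads(target_tps: float, avg_response_s: float,
                     think_time_s: float = 0.0) -> int:
    """Threads needed to sustain target_tps in steady state (Little's Law).

    N = X * (R + Z), where X is throughput, R is average response time
    and Z is think time per iteration.
    """
    # round() guards against floating point noise before taking the ceiling
    return math.ceil(round(target_tps * (avg_response_s + think_time_s), 9))


def achievable_tps(threads: int, avg_response_s: float,
                   think_time_s: float = 0.0) -> float:
    """Upper bound on TPS that a given number of threads can generate."""
    return threads / (avg_response_s + think_time_s)


if __name__ == "__main__":
    # Assumed example: 400 TPS target, 250 ms average response, no think time
    print(required_threads(400, 0.25))
    # Same target with 750 ms of think time between requests
    print(required_threads(400, 0.25, think_time_s=0.75))
```

The point of the calculation is that a thread count only means something together with response time and think time, so a business "concurrency" figure cannot be copied straight into the tool.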

Misunderstanding 6: Offline pressure testing is meaningless

This one is quite typical. Some testers get relatively poor results from an offline stress test, take them to the developers, and are turned away: "Your stress test environment is too different from the online environment, so the results are meaningless." This sounds true on the surface, but it actually swaps one concept for another. The purpose of offline stress testing is not to match the actual online figures but to discover basic performance problems, such as slow methods, index problems, and deadlocks. So think independently and do not let the developers sway you.

Misunderstanding 7: There is a lot of logic in the script

A script only needs the complete core logic. Piling on if-style branching and logic controller plug-ins wastes load generator performance and is counterproductive.

Misunderstanding 8: It runs without parameterization, so there is no need for parameterization

The script runs, but the scenario is unrealistic. It returns 200, but you are only looking at the surface: the outside looks the same while the inside is different. If the stress test data is not parameterized, a large share of requests is served from cache, which may not match the real scenario, and the results are misleading.

Misunderstanding 9: The script does not add checkpoints or has too many checkpoints

Without checkpoints, the script may drift away from the intended business flow during the stress test without anyone noticing. Too many checkpoints waste load generator performance. In particular, do not connect to the database to verify results by query while the stress test is running.
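A lightweight checkpoint can usually be limited to the status code plus one business field in the response body. The sketch below assumes a JSON response with a `code` field where `0` means success; that field name and convention are hypothetical and should be replaced with whatever your interface actually returns.

```python
import json


def check_response(status_code: int, body: str) -> bool:
    """Cheap checkpoint: HTTP status plus one business-level field.

    Assumes (hypothetically) that a successful business response is a JSON
    object containing "code": 0. No database access is involved.
    """
    if status_code != 200:
        return False
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        # A 200 with a non-JSON body (e.g. an error page) is a failure
        return False
    return isinstance(payload, dict) and payload.get("code") == 0


if __name__ == "__main__":
    print(check_response(200, '{"code": 0, "orderId": "A1"}'))
    print(check_response(200, '{"code": 500, "msg": "sold out"}'))
```

A check of this size catches the "returns 200 but the business failed" case without adding noticeable cost per request.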

Misunderstanding 10: Do scripts have to add rendezvous points?

Start from the access model of the performance test. For scenarios such as flash sales, you can add rendezvous points to verify that nothing is oversold. For pre-sale or query-type scenarios they are not necessarily required. Decide based on actual needs.
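To make the idea of a rendezvous point concrete, here is a minimal sketch using `threading.Barrier` from the Python standard library: all simulated buyers wait at the barrier and are released together, then the test checks that no more items were sold than were in stock. The `Inventory` class is a hypothetical in-memory stand-in for the real ordering service.

```python
import threading


class Inventory:
    """Hypothetical in-memory stock counter standing in for the real service."""

    def __init__(self, stock: int) -> None:
        self._stock = stock
        self._lock = threading.Lock()
        self.sold = 0

    def buy(self) -> bool:
        # The lock makes check-and-decrement atomic, which prevents overselling
        with self._lock:
            if self._stock > 0:
                self._stock -= 1
                self.sold += 1
                return True
            return False


def flash_sale(stock: int, buyers: int) -> int:
    """Release all buyers at the same moment and return the number sold."""
    inventory = Inventory(stock)
    barrier = threading.Barrier(buyers)  # the rendezvous point

    def buyer() -> None:
        barrier.wait()  # block until every buyer has arrived
        inventory.buy()

    threads = [threading.Thread(target=buyer) for _ in range(buyers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return inventory.sold


if __name__ == "__main__":
    print(flash_sale(stock=10, buyers=50))
```

Load testing tools offer the same mechanism under names such as rendezvous point or synchronizing timer. The sketch only shows why it matters for oversell checks: the contention has to happen at the same instant.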

Misunderstanding 11: Bottleneck diagnosis starts with the server

To troubleshoot stress test bottlenecks, first make sure the bottleneck is not in the stress test itself, including the environment that generates the load. For example:

  • Are you generating the load from your own laptop?
  • Is the load going out through your office's public network bandwidth?

Misunderstanding 12: Must performance testing be done to find performance problems?

This comes down to experience: you do not need a performance test to discover performance problems. For example, if a single call to an interface is already too slow, you can trace it. If you suspect that an index has not been added, you can check the execution plan.
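Checking an execution plan needs no load at all. The sketch below uses SQLite through Python's built-in `sqlite3` module purely as a convenient stand-in. The `orders` table and the index name are made up for illustration, and other databases use their own `EXPLAIN` syntax and output format.

```python
import sqlite3

QUERY = "SELECT * FROM orders WHERE user_id = ?"


def build_demo_db() -> sqlite3.Connection:
    """Create a small in-memory table (hypothetical schema) without an index."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, amount INTEGER)"
    )
    conn.executemany(
        "INSERT INTO orders (user_id, amount) VALUES (?, ?)",
        [(i % 100, i) for i in range(1000)],
    )
    return conn


def query_plan(conn: sqlite3.Connection, sql: str, params=()) -> list:
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]


if __name__ == "__main__":
    conn = build_demo_db()
    print("before index:", query_plan(conn, QUERY, (7,)))
    conn.execute("CREATE INDEX idx_orders_user ON orders (user_id)")
    print("after index: ", query_plan(conn, QUERY, (7,)))
```

Before the index exists the plan reports a full table scan, and afterwards it names the index. That difference is visible with a single query, long before any stress test is run.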

2. How to stress test effectively?

Whether a stress test is effective depends on how thorough and how accurate the requirements research is. The core items are goal setting, business rule sorting, test data sorting, the system deployment architecture, and monitoring.

1. Goal setting and business rule sorting

① Goal setting

The three most important indicators for measuring performance are TPS, response time, and error rate. So how do you set performance test targets, and on what basis? Let me list some common but mistaken answers:

  • I use the 80/20 principle. The boss said that we have one million daily users, and 80% of users visit in 20% of the time. The response time follows the industry's 2-5-8 rule;
  • Based on data analysis of competing products, I found that their PV should be in the millions, so we set the targets for our product the same way;
  • This indicator was set by the business side. They have discussed it with the development team, so there should be no problem.

These answers are not only shaky in theory, they also contain no information that can actually be used to set performance test targets. Performance testing is rigorous work, and indirect or one-size-fits-all rules cannot replace analysis of the specific system. How someone answers this question basically tells you whether they have actually done performance testing.

Every interface has an access count. This is common practice in the industry and is one of the most accurate indicators of the load an interface actually receives. Large companies generally build their own monitoring tools, and growing companies use open source or commercial tools. For example, you can extract this data from ELK and set the target TPS based on the actual access frequency.
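Once per-request records have been exported from the log platform, the peak requests per second for each interface can be computed in a few lines. The sketch below assumes the export has already been reduced to `(epoch_second, endpoint)` pairs. That record shape is an assumption for illustration and not an ELK format.

```python
from collections import Counter, defaultdict


def peak_tps_by_endpoint(records) -> dict:
    """Return the highest number of requests seen in any one second, per endpoint.

    `records` is an iterable of (epoch_second, endpoint) pairs, one per request.
    """
    per_second = defaultdict(Counter)
    for epoch_second, endpoint in records:
        # Truncate to whole seconds so sub-second timestamps share a bucket
        per_second[endpoint][int(epoch_second)] += 1
    return {
        endpoint: max(counts.values())
        for endpoint, counts in per_second.items()
    }


if __name__ == "__main__":
    sample = [
        (1700000000, "/order"),
        (1700000000, "/order"),
        (1700000000.9, "/order"),
        (1700000001, "/order"),
        (1700000000, "/cart"),
    ]
    print(peak_tps_by_endpoint(sample))
```

The observed peak is a starting point for the target TPS. How much headroom to add on top of it is a business decision that the data alone does not answer.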

② Research on business rules

For performance testing, understanding the business rules is also indispensable. In some companies, when the performance testing team runs a stress test, the testers of the business line are also asked to support it, which shows how important business knowledge is.

A full understanding of the business not only makes script writing more efficient, it also helps you build more realistic performance testing scenarios. A simple example: when you simulate placing an order, do you consider product attributes and member attributes, such as whether the item is a single product or a package? How many products are in the shopping cart when the order is placed? Different combinations of these conditions affect the performance test results. This part is very important. If you have tested a promotion, you know that combinations of features can generate thousands of cases and that a campaign triggers many rules. If you only follow a simple flow, the logic is nowhere near as complex, and the measured performance will differ greatly from the real logic. Transactions are the core link, and if something goes wrong there the blame will be heavy, so this needs careful thought.

2. Deployment architecture research

 

What does the deployment architecture of the business under test mean? Simply put: which components does the tested service involve, which servers is each component deployed on, and how are those servers configured? You need to draw a schematic diagram of the deployment architecture. Only with this diagram can you know how to achieve comprehensive monitoring and which service to start with when a problem occurs.

I will use an architecture diagram I drew to illustrate this, as shown in the figure below. This is a classic link: the request goes from the client to the server side, where it passes through the proxy layer to the application layer and finally to the data layer. Note that you need to know how many nodes each service has in the environment under test. For example, how many load generator nodes are there at the client layer, and which network segment are they on? The nodes of the service layer and data layer can be investigated in the same way.

Most deployments now run on cloud services with elastic scaling, so the number of instance nodes differs from one point in time to another. These should be recorded as well.

3. Conduct research on test data

Test data research covers a lot of ground. For functional testing, it means obtaining the parameters needed to run through the planned scenarios. So what data research does performance testing need? I will go through it item by item.

① Analysis of basic data volume of database

The basic data volume of the database is the actual data volume of the current online database. Why do we need to count it? Many companies have an independent performance testing environment, but its database holds far less data than production. A SQL statement may then execute quickly in the performance testing environment yet very slowly in production. The tester feels that everything that should be tested has been tested, but problems still occur after release.

A missing index, for example, may never be exposed because the performance environment holds so little data. Therefore, the data volume in the performance test environment must be consistent with production. To achieve this, some companies desensitize a backup of production data, while others require you to write your own scripts to create data in batches according to business rules.
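For the "write your own script" route, the usual pattern is to generate rows according to the business rules and insert them in batches rather than one at a time. The sketch below uses SQLite via the standard `sqlite3` module only so that it runs anywhere. The `orders` schema, value ranges, and batch size are assumptions for illustration.

```python
import random
import sqlite3


def seed_orders(conn: sqlite3.Connection, total_rows: int,
                batch_size: int = 1000, seed: int = 0) -> int:
    """Insert `total_rows` synthetic orders in batches; return rows inserted."""
    rng = random.Random(seed)  # fixed seed keeps the data set reproducible
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders ("
        "id INTEGER PRIMARY KEY, user_id INTEGER, amount_cents INTEGER)"
    )
    inserted = 0
    while inserted < total_rows:
        size = min(batch_size, total_rows - inserted)
        rows = [
            (rng.randint(1, 100_000), rng.randint(100, 50_000))
            for _ in range(size)
        ]
        conn.executemany(
            "INSERT INTO orders (user_id, amount_cents) VALUES (?, ?)", rows
        )
        conn.commit()  # commit per batch to bound transaction size
        inserted += size
    return inserted


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    print(seed_orders(connection, total_rows=2500))
```

Uniformly random values are the simplest choice but rarely match production distributions. If the real data is skewed, for example a few very active users, the generator should reproduce that skew.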

② Analysis of incremental data from stress testing

In addition to the basic data volume of the database, we also need to consider how much data one round of performance testing will add. The added data usually ends up in the database and may pass through middleware such as Redis and MQ on the way, so every link involved may see a surge in data volume. It is therefore necessary to prepare fallback plans (that is, temporary solutions for the worst case) based on the expected increase.

③ Analysis of hot and cold data

In my experience, few companies consider the distribution of hot and cold data at the planning stage. They usually trace back to it from anomalies in performance test results or from actual production problems. Next, I will explain what hot and cold data are and what can happen if they are not analyzed.

  • Cold data refers to data that is not frequently accessed. Usually it is stored in the database, and the reading and writing efficiency is relatively low.
  • Hot data is data that is frequently accessed by users and is generally placed in the cache.

During a performance test, cold data that is accessed frequently turns into hot data. If the parameterized data set is relatively small, TPS climbs higher and higher as the stress test continues. In a real big promotion, tens of millions of users often access the system directly, and most of what they touch is cold data, so the system may run into problems before it reaches the capacity indicated by the stress test results. Therefore, during requirements research you also need to ask: will the data be cached, and for how long?
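The effect of a small parameter set can be shown with a toy simulation: requests pick keys at random from a pool, and an LRU cache of fixed capacity sits in front of the "database". The cache model, pool sizes, and capacity below are illustrative assumptions and not measurements from a real system.

```python
import random
from collections import OrderedDict


def cache_hit_ratio(key_pool_size: int, requests: int,
                    cache_capacity: int, seed: int = 0) -> float:
    """Simulate uniform random access through an LRU cache; return hit ratio."""
    rng = random.Random(seed)
    cache = OrderedDict()
    hits = 0
    for _ in range(requests):
        key = rng.randrange(key_pool_size)
        if key in cache:
            hits += 1
            cache.move_to_end(key)  # mark as most recently used
        else:
            cache[key] = True
            if len(cache) > cache_capacity:
                cache.popitem(last=False)  # evict least recently used
    return hits / requests


if __name__ == "__main__":
    # Small parameter file: every key quickly becomes hot
    print(cache_hit_ratio(key_pool_size=100, requests=10_000, cache_capacity=1000))
    # Key space closer to a big promotion: almost everything is cold
    print(cache_hit_ratio(key_pool_size=1_000_000, requests=10_000, cache_capacity=1000))
```

With the small pool nearly every request is a cache hit, which is why TPS keeps rising in such a test. With the large pool nearly every request falls through to the slow path.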

4. Test monitoring

Why is monitoring important? It is the eye that detects problems. Moreover, if a problem is not caught by monitoring while the test is running, the scene is hard to reconstruct afterwards. You need to list the monitoring tools and access methods clearly, and also state clearly what you monitor. For me, the first key word in monitoring is completeness.

How should "complete" be understood? Here is a typical example. Sometimes when starting a new project, I ask the support colleagues whether monitoring has been deployed. They say it has, but when I actually use it, I find that only the CPU of one application server is monitored. I believe most people have seen this, so let me spell out what complete means. It includes at least two aspects:

  • All servers involved;
  • Basic monitoring on each server, including CPU, disk, memory, network, etc.

The second key word is layering: monitor not only the hardware but also the call chain, using tools such as SkyWalking or Pinpoint. Here is a large diagram that I have used in slides before. Adapt it to your own needs.

 

As shown in the figure above, there is a lot to monitor.

Another very important point in monitoring is to set thresholds that trigger alarms. Both online and offline performance tests need alarms, because manual observation often fails to catch problems quickly enough. When an alarm fires in time, the people involved can respond quickly and keep the risk as low as possible.
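A threshold alarm is usually tied to a sustained breach rather than a single spike, so that it does not fire on noise. The sketch below shows one simple way to express that rule. The threshold of 90 and the requirement of three consecutive samples are arbitrary assumptions, and real monitoring systems provide their own alert rule syntax.

```python
def sustained_breaches(samples, threshold: float, consecutive: int = 3) -> list:
    """Return sample indexes at which an alarm should fire.

    An alarm fires once when the metric has stayed above `threshold`
    for `consecutive` samples in a row, then re-arms after the metric
    drops back to or below the threshold.
    """
    alarms = []
    run_length = 0
    for index, value in enumerate(samples):
        run_length = run_length + 1 if value > threshold else 0
        if run_length == consecutive:
            alarms.append(index)
    return alarms


if __name__ == "__main__":
    # Hypothetical CPU utilisation samples in percent
    cpu_percent = [50, 91, 92, 93, 94, 40, 95]
    print(sustained_breaches(cpu_percent, threshold=90, consecutive=3))
```

The same rule applies equally to response time or error rate samples. Only the threshold and the window need to change.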


Origin blog.csdn.net/hlsxjh/article/details/132676046