[High Concurrency] Interviewer: What are the metrics for performance optimization? What needs attention?

Write in front

Recently, many readers have told me that they have never done performance optimization work: they only do CRUD work at their companies and have no exposure to optimization. Then, when they go out for job interviews, interviewers always ask tricky questions that stump them. So what should they do? I'll write up some questions about high-concurrency systems that come up often in interviews. Today, let's talk about the metrics used to measure performance optimization in high-concurrency scenarios, and what issues need attention when doing optimization.

Interview scene

Interviewer: Have you done any work related to performance optimization in your usual work?

First, let's analyze the interviewer's question. Based on my own experience on the hiring side, when an interviewer asks a question like this, they are not simply looking for a yes-or-no answer. They want to use this simple question to gauge the candidate's thinking and depth of understanding. Essentially, the interviewer hopes the candidate will use this question to describe their own experience with performance optimization work, along with some theoretical understanding of it, such as the metrics for measuring performance optimization and the issues that need attention along the way.

If the candidate fails to grasp the interviewer's intent and, when answering, squeezes out points like toothpaste from a tube, then in most cases the interviewer will conclude that this person has no real experience with performance optimization. The candidate's standing in the interviewer's mind drops sharply, and there is a very high probability the interview will end coldly.

Metrics

For performance optimization, there are many metrics to measure, which can roughly be divided into: performance (throughput and response speed), response time, concurrency, second-open rate, and correctness. The diagram below represents these metrics.

[Figure: the metrics for performance optimization]

Next, we will explain these measurement indicators separately.

Performance

Performance indicators include throughput and response speed. The QPS, TPS, and HPS we often talk about all fall under throughput. Many readers may not be familiar with these acronyms, so let's first spell out what they mean.

  • QPS represents the number of queries per second.
  • TPS represents the number of transactions per second.
  • HPS represents the number of HTTP requests per second.

These are all metrics related to throughput.
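As a minimal sketch of what these throughput figures mean (assuming we simply count completed requests over a wall-clock window; the function name is illustrative, not from any particular monitoring library):

```python
def throughput_per_second(completed_requests: int, elapsed_seconds: float) -> float:
    """Generic throughput: completed requests (queries for QPS, transactions
    for TPS, or HTTP requests for HPS) divided by elapsed wall-clock time."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return completed_requests / elapsed_seconds

# 10,000 queries served over a 20-second window gives a QPS of 500
print(throughput_per_second(10_000, 20.0))  # 500.0
```

In practice these numbers come from monitoring systems sampling over sliding windows, but the underlying arithmetic is just this ratio.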

When we do optimization work, we must first clarify what we are optimizing for. Is the goal to improve the system's throughput, or its response speed? A concrete example: suppose our program batches some database or cache operations. Even though each individual read now responds more slowly, if our optimization goal is throughput and the system's overall throughput rises significantly, we have still improved the program's performance.

So, optimizing performance is not just about improving the response speed of the system.

In other words, performance optimization is not about blindly maximizing throughput or blindly maximizing response speed, but about finding a balance between the two, using limited server resources to deliver the best possible user experience.

Response time

For response time, there are two very important metrics: the average response time and the percentile.

(1) Average response time

Generally, the average response time reflects the average processing capability of a service interface. It is calculated by summing the time spent on all requests and dividing by the number of requests. A simple example: we send 5 requests to a website, taking 1ms, 2ms, 1ms, 3ms, and 2ms respectively; the average response time is (1+2+1+3+2)/5 = 1.8ms.

The average response time has a weakness: if requests become very slow for a short period and then recover quickly, the average will fail to reflect the performance fluctuation.
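A minimal sketch illustrating both the calculation and its weakness (the sample data is made up for illustration):

```python
def average_response_time_ms(samples_ms):
    """Average response time: total time over all requests / request count."""
    return sum(samples_ms) / len(samples_ms)

# The worked example from the text: five requests taking 1, 2, 1, 3, 2 ms
print(average_response_time_ms([1, 2, 1, 3, 2]))  # 1.8

# The weakness: one brief 500ms spike among 999 normal 2ms requests
# barely moves the average, so the fluctuation is hidden.
print(average_response_time_ms([2] * 999 + [500]))  # 2.498
```

This is exactly why the average alone is not enough and percentiles, covered next in the text, matter.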

(2) Percentile

To compute a percentile, we delineate a time window, collect the time spent on each request into a list, and sort those times in ascending order. We then take the time at a specific percentile position in that sorted list; that number is the TP value.

The meaning of a TP value is: N% of requests return within X time. For example, TP90 = 50ms means that 90% of requests return within 50ms.

Percentiles are also very important: they reflect the overall distribution of an application interface's response times.

We generally track TP50, TP90, TP95, TP99, TP99.9 and other segments. The higher the percentile we set requirements on, the higher the demand on the stability of the system's response.
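A minimal sketch of the TP calculation described above, using the nearest-rank method (real monitoring systems often use interpolation or histogram approximations instead):

```python
import math

def tp(samples_ms, percentile):
    """Nearest-rank TP value: the time within which `percentile`% of the
    requests (by sorted response time) have returned."""
    ordered = sorted(samples_ms)
    rank = max(1, math.ceil(len(ordered) * percentile / 100))
    return ordered[rank - 1]

latencies = list(range(1, 101))  # 100 requests taking 1..100 ms
print(tp(latencies, 50))  # 50 -> TP50 = 50ms
print(tp(latencies, 90))  # 90 -> TP90 = 90ms: 90% of requests return within 90ms
print(tp(latencies, 99))  # 99 -> TP99 = 99ms
```

Note how TP99 is dominated by the slowest requests, which is why high percentiles demand stable response behavior.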

Concurrency

Concurrency refers to the number of requests the system can handle at the same time, and reflects the system's load capacity.

When optimizing high-concurrency systems, we often tune for concurrency as well, and there are various tuning methods to improve the system's ability to process simultaneous requests.

In general, concurrency is a relatively simple metric to understand, so I won't describe it further.

Second opening rate

The second-open rate mainly applies to front-end web pages and mobile apps. If a page or app opens smoothly within 1 second (especially the loading of the home page), users will feel it is very smooth to use. If it takes 3 seconds or more, users may exit the page directly or stop using the app altogether.

Therefore, optimizing a program for high-concurrency scenarios means optimizing not only the back-end program but also the front end and the app.

Correctness

Correctness means that no matter what method or means we use to optimize an application, the data the optimized application produces must still be correct. We cannot have a situation where performance was low before optimization but the data was correct, while performance is high after optimization but the data is wrong.

Optimization issues that need attention

  • Unless necessary, don't optimize early (especially during the development phase)
  • Some optimization guidelines are outdated; consider the current software and hardware environment (don't stick rigidly to old rules)
  • Don't over-emphasize individual system-level indicators, such as cache hit rate; focus on performance bottlenecks instead
  • Don't follow others blindly; test to find the system's actual performance bottleneck, then decide on the optimization approach
  • Weigh the cost of an optimization against its benefits (some optimizations require adjusting the existing architecture and increase development and operations costs)
  • The goals of optimization are user experience and lower hardware cost (shrink the cluster size rather than relying on the high performance of a single machine)
  • Optimizations that work in the test environment may not work in production (optimization must be based on real conditions)

Well, that's all for today! Don't forget to like, share, and forward this article so that more people can see it, and we can learn and make progress together!

Origin blog.csdn.net/l1028386804/article/details/108655417