Performance metrics, response time, concurrency... Let's talk about performance optimization metrics

Abstract: Today, let's talk about the metrics for performance optimization in high concurrency scenarios, and what issues need to be paid attention to when doing optimization.

This article is shared from HUAWEI CLOUD Community " [High Concurrency] What are the metrics for performance optimization? What to watch out for? ", Author: Glacier.

Recently, many friends have told me: "I have never done performance optimization work. I only do CRUD work at my company and have no exposure to performance optimization. Now when I go out for job interviews, the interviewer always asks tricky questions to put me on the spot, and I can't answer many of them. What should I do?" So I decided to write up some questions that come up frequently in interviews about high-concurrency systems. Today, let's talk about the metrics for performance optimization in high-concurrency scenarios, and what issues need attention when doing optimization.

Interview scenario

Interviewer: Have you ever done some performance optimization-related work in your daily work?

First, let's analyze the interviewer's question. Based on my own experience conducting recruitment interviews, when an interviewer asks a question like this, the point is not simply to have the candidate answer "yes" or "no". Rather, the interviewer is using this simple question to test the candidate's ability to think and to understand the problem. What the interviewer really wants is for the candidate to use this question to describe their own experience with performance optimization work, along with some theoretical understanding of it, such as performance optimization metrics and the issues that need attention along the way.

If the candidate fails to grasp the interviewer's intent during the interview and, when answering, squeezes out points bit by bit like toothpaste from a tube, then in most cases the interviewer will conclude that this person has no experience with performance optimization. The interviewer's impression of the candidate will drop considerably, and the interview will very likely end badly.

Metrics

For performance optimization, there are many metrics to measure, which can be roughly divided into: performance metrics, response time, concurrency, second-open rate, and correctness. We can use the graph below to represent these metrics.

Next, we describe these metrics separately.

Performance

Performance metrics include throughput and response speed. What we usually call QPS, TPS, and HPS all fall under throughput. Many readers may not be familiar with QPS, TPS, and HPS, so let's first spell out what these abbreviations mean.

  • QPS stands for the number of queries per second.
  • TPS stands for the number of transactions per second.
  • HPS stands for the number of HTTP requests per second.

These are all throughput-related metrics.
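As a concrete illustration of how a throughput figure like QPS comes about, here is a minimal sketch in Python. The function name `measure_qps` and the toy handler are my own illustrative choices, not from the original article: run a batch of requests, time them with a wall clock, and divide.

```python
import time

def measure_qps(handle_request, requests):
    """Rough QPS measurement: run the given requests and
    count how many complete per second of wall-clock time."""
    start = time.perf_counter()
    completed = 0
    for req in requests:
        handle_request(req)
        completed += 1
    elapsed = time.perf_counter() - start
    return completed / elapsed if elapsed > 0 else float("inf")

# A trivial "handler" that just simulates a little CPU work.
qps = measure_qps(lambda r: sum(range(1000)), range(5000))
print(f"approx. {qps:,.0f} queries per second")
```

In a real system, the measurement would be taken at the load balancer or with a benchmarking tool rather than in-process, but the arithmetic is the same.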

When we do optimization work, we must first clarify what we are optimizing for. For example: are we optimizing to increase the system's throughput, or to improve its response speed? To give a concrete example: suppose our program performs some batch operations against a database or cache. Although the response speed of an individual data read drops, if our optimization goal is throughput, then as long as the overall system throughput increases significantly, we have still improved the program's performance.

So, optimizing performance is not just about improving system responsiveness.

In other words, optimizing performance is not simply about optimizing throughput or optimizing response speed in isolation, but about finding a balance between the two and using limited server resources to deliver the best possible user experience.

Response time

There are two very important metrics for response time: average response time and percentiles.

(1) Average response time

The average response time reflects the average processing capability of a service interface. It is calculated by adding up the time spent on all requests and dividing by the number of requests. A simple example: suppose we send 5 requests to a website, and the time spent on each request is 1ms, 2ms, 1ms, 3ms, and 2ms. Then the average response time is (1+2+1+3+2)/5 = 1.8ms.

There is a problem with the average response time as a metric: if requests become very slow for a short period and then quickly recover, the average cannot reflect such performance fluctuations well.

(2) Percentile

To compute a percentile, we delineate a time window, collect the elapsed time of every request in that window into a list, and sort the times in ascending order. We then take the elapsed time at a specific percentile position; that number is the TP value.

The meaning of a TP value is that N% of requests return within X time. For example, TP90 = 50ms means that 90% of requests return within 50ms.

The percentile metric is also very important, because it reflects the overall responsiveness of the application interface, including its slow tail.

We generally report percentiles at TP50, TP90, TP95, TP99, TP99.9, and so on. The higher the percentile we set a target for, the higher the demand on the stability of the system's response capability.
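The procedure described above (collect, sort ascending, take the value at the percentile position) can be sketched directly. The function name `tp` and the sample timings are illustrative; this uses the nearest-rank method, one of several common percentile definitions.

```python
import math

def tp(times_ms, percentile):
    """TP value via the nearest-rank method: sort the request
    times ascending and take the value at the percentile rank."""
    ordered = sorted(times_ms)
    rank = math.ceil(percentile * len(ordered) / 100)
    return ordered[max(rank - 1, 0)]

# Ten requests, one slow outlier at 50ms.
times = [1, 2, 1, 3, 2, 50, 4, 2, 3, 2]
print(tp(times, 90))  # 4  -> 90% of requests finish within 4ms
print(tp(times, 99))  # 50 -> the outlier dominates the high percentile
```

Notice how TP90 stays small while TP99 jumps to 50ms: this is why high percentiles are the stricter stability requirement.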

Concurrency

Concurrency refers to the number of requests that the system can process at the same time, and reflects the load capacity of the system.

When we optimize a high-concurrency system, we often tune the amount of concurrency, and there are various tuning methods for doing so. The purpose is to improve the system's ability to process requests simultaneously.

In general, the concurrency indicator is relatively simple to understand, so I will not describe it too much.
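One common way concurrency is tuned in practice is to cap the number of requests in flight so the system stays within its load capacity. The sketch below is a hypothetical Python illustration using a semaphore; `MAX_CONCURRENT` and the handler are my own illustrative names, not from the article.

```python
import threading
import time

MAX_CONCURRENT = 4                      # tuning knob: allowed in-flight requests
slots = threading.BoundedSemaphore(MAX_CONCURRENT)
peak = 0                                # highest observed concurrency
active = 0
lock = threading.Lock()

def handle(req):
    global peak, active
    with slots:                         # blocks once 4 requests are in flight
        with lock:
            active += 1
            peak = max(peak, active)
        time.sleep(0.005)               # stand-in for real request work
        with lock:
            active -= 1

threads = [threading.Thread(target=handle, args=(i,)) for i in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(f"peak concurrency observed: {peak}")  # never exceeds MAX_CONCURRENT
```

Raising or lowering `MAX_CONCURRENT` is exactly the kind of knob that concurrency tuning adjusts: too low wastes capacity, too high overloads downstream resources.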

Second-open rate

The second-open rate mainly applies to front-end web pages and mobile apps. If a web page or app opens smoothly within 1 second, especially the home page, users will feel that it is very smooth to use. If it takes 3 seconds or longer, users may simply leave the page or stop using the app altogether.

Therefore, when optimizing programs for high-concurrency scenarios, it is necessary to optimize not only the back-end programs, but also the front end and the app.

correctness

Correctness means that no matter how we optimize an application, the data it produces after optimization must remain correct. If performance was low before optimization but the data was correct, and performance is high after optimization but the data is wrong, then the optimization has failed.
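A simple way to guard correctness during optimization is to check the tuned implementation against a trusted baseline on many inputs before trusting its speed. The sketch below is a hypothetical example; both function names and the random-input check are my own illustrative choices.

```python
import random

def baseline_sum_of_squares(xs):
    """Trusted, unoptimized reference implementation."""
    return sum(x * x for x in xs)

def optimized_sum_of_squares(xs):
    """Stand-in for the tuned version (imagine it is vectorized,
    cached, batched, etc.). It must produce identical results."""
    total = 0
    for x in xs:
        total += x * x
    return total

# Cross-check the two implementations on random inputs.
for _ in range(100):
    xs = [random.randint(-1000, 1000) for _ in range(random.randint(0, 50))]
    assert optimized_sum_of_squares(xs) == baseline_sum_of_squares(xs)

print("optimized implementation matches the baseline")
```

Keeping the slow baseline around as a test oracle is cheap insurance: any optimization that changes the answers is caught before it ships.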

Points to note when optimizing

  • Don't optimize prematurely unless necessary (especially during development)
  • Some optimization guidelines are outdated; consider the current hardware and software environment (don't stick rigidly to old rules)
  • Don't overemphasize system-level metrics such as cache hit rate; focus on performance bottlenecks instead
  • Don't follow others blindly; test, find the system's actual performance bottleneck, and then choose the optimization method
  • Carefully weigh the costs and benefits of optimization (some optimizations may require adjusting the existing architecture and increase development and operations costs)
  • The goals of optimization are user experience and lower hardware cost (shrinking cluster size rather than relying on high-performance single machines)
  • An optimization that works in the test environment may not work in production (optimization must be based on real conditions)

 

