How to improve system throughput (QPS/TPS)

One. Elements for measuring system throughput:

The throughput (load-bearing capacity) of a system is closely tied to how much CPU each request consumes and to the external interfaces, IO, and other resources it depends on. The more CPU a single request consumes and the slower the external interfaces and IO respond, the lower the system throughput, and vice versa.

Several important parameters of system throughput: QPS (TPS), concurrent number, response time

QPS (TPS): the number of requests/transactions processed per second

Concurrent number: The number of requests/transactions processed by the system at the same time

Response time: generally, the average response time is used

(Many people confuse the concurrent number with TPS.)

After understanding the meaning of the above three elements, you can calculate the relationship between them:

QPS (TPS) = Concurrent number/Average response time or Concurrent number = QPS*Average response time

For example:

Consider a bank branch that opens at 8 in the morning with 10 service windows, where each customer takes 5 minutes on average to handle their business. The figures can be calculated as follows.

Concurrent number = 10 windows

Average response time = 5*60 seconds

QPS = 10/(5*60) transactions/sec
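
The bank calculation can be checked with a few lines of Python. This is only a minimal sketch of the formula; the function names `qps` and `concurrency_needed` are illustrative and do not come from any library.

```python
def qps(concurrency: float, avg_response_time_s: float) -> float:
    """QPS (TPS) = concurrent number / average response time (in seconds)."""
    if avg_response_time_s <= 0:
        raise ValueError("average response time must be positive")
    return concurrency / avg_response_time_s


def concurrency_needed(target_qps: float, avg_response_time_s: float) -> float:
    """Concurrent number = QPS * average response time (in seconds)."""
    return target_qps * avg_response_time_s


# Bank example: 10 windows, 5 minutes (300 seconds) per customer.
bank_qps = qps(10, 5 * 60)
print(f"{bank_qps:.4f} transactions/sec")       # about 0.0333
print(f"{bank_qps * 3600:.0f} customers/hour")  # 120
```

Read the other way round, the same formula estimates how many windows (or threads) a target load would need: a target of 0.1 transactions/sec at 300 seconds each would need about 30 windows.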

The throughput of a system is usually determined by two factors: QPS (TPS) and the concurrent number. Every system has a practical upper limit for each of these two values. Under load, as soon as either one reaches its limit, throughput stops rising. If the load keeps increasing, throughput actually drops, because the system is overloaded and overhead such as context switching and memory consumption degrades performance.

Factors that determine system response time

When we run a project, we need to schedule it. Several people can work on tasks concurrently, or one or more people can work through tasks serially. Either way there is always a critical path, and the length of that path is the duration of the project.

The response time of a system call works like a project plan: it also has a critical path, and the length of that critical path is the system response time.

The critical path is composed of CPU operations, IO, external system responses, and so on.
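
The critical-path idea can be sketched as follows: serial steps add up, while steps that run in parallel cost only as much as the slowest one. The request layout and all timings below are made-up numbers for illustration, and `serial` and `parallel` are hypothetical helper names.

```python
def serial(*durations_ms: float) -> float:
    """Steps that run one after another: their times add up."""
    return sum(durations_ms)


def parallel(*durations_ms: float) -> float:
    """Steps that run at the same time: the slowest one dominates."""
    return max(durations_ms)


# Hypothetical request: CPU work, then two external calls issued
# concurrently, then a database write. Timings are in milliseconds.
response_time_ms = serial(
    2,                 # CPU: parse and validate the request
    parallel(30, 80),  # external systems A (30 ms) and B (80 ms)
    15,                # IO: database write
)
print(response_time_ms)  # 97
```

In this sketch, speeding up external system A changes nothing, because A is not on the critical path. Only reducing the CPU step, system B, or the database write shortens the response time.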

Two. How to improve system QPS?

From the formula above, QPS (TPS) = concurrent number/average response time, we can see that there are two ways to improve QPS:

2.1 Increase the concurrent number

1. Increase the number of concurrent worker threads, for example in Tomcat. A thread count that matches the server's capacity lets the system serve more requests at the same time.

2. Increase the number of database connections and pre-establish a suitable number of TCP connections.

3. Keep back-end services as stateless as possible, so that they scale horizontally and can absorb larger traffic.

4. Avoid single points among the systems and services on the call chain. Capacity should be balanced from end to end, so that no single link becomes the bottleneck.

5. Use a thread pool for RPC calls where possible, and establish an appropriate number of connections in advance.
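
The effect of raising the concurrent number can be illustrated with a small experiment. This is a sketch under stated assumptions: `handle_request` is a hypothetical stand-in that simulates an IO-bound call with a 50 ms sleep, and Python's standard `ThreadPoolExecutor` plays the role of the server's worker pool.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def handle_request(_: int) -> None:
    """Stand-in for an IO-bound request, e.g. waiting on a remote service."""
    time.sleep(0.05)


def measure_qps(workers: int, total_requests: int = 20) -> float:
    """Process total_requests with a pool of `workers` threads; return QPS."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Consume the iterator so that every request has finished.
        list(pool.map(handle_request, range(total_requests)))
    elapsed = time.perf_counter() - start
    return total_requests / elapsed


for workers in (1, 5, 10):
    print(f"{workers:>2} workers: {measure_qps(workers):.0f} QPS")
```

With the response time fixed at roughly 50 ms, the measured QPS should grow roughly in proportion to the number of workers, as the formula predicts. The gain only holds while requests mostly wait on IO. Once the threads compete for CPU, adding more of them mainly adds context-switching overhead.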

2.2 Reduce average response time

1. End each request as early as possible, so that the load does not reach downstream systems. Caches can be added at each layer.

2. Peak shaving. Let through only as much traffic as the system can handle, and immediately return an error or other prompt for requests that cannot be processed, much as a dam regulates water flow.

3. Shorten the call chain.

4. Optimize the program.

5. Reduce network overhead, and use long-lived (persistent) connections where appropriate.

6. Optimize the database and build indexes.
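
Points 1 and 2 above can be sketched together: a cache answers repeated requests without touching the back end, and a small admission gate rejects excess requests immediately instead of letting them queue up. Everything here is illustrative. `load_profile`, `LoadShedder`, and `handle` are hypothetical names, and the limit of 2 is an arbitrary assumption.

```python
import threading
from functools import lru_cache


@lru_cache(maxsize=1024)
def load_profile(user_id: int) -> str:
    """Hypothetical slow lookup; repeated ids are served from the cache."""
    return f"profile-{user_id}"


class LoadShedder:
    """Admit at most `limit` requests at once; reject the rest right away."""

    def __init__(self, limit: int) -> None:
        self._slots = threading.BoundedSemaphore(limit)

    def try_enter(self) -> bool:
        # Non-blocking: returns False at once when no slot is free.
        return self._slots.acquire(blocking=False)

    def leave(self) -> None:
        self._slots.release()


def handle(shedder: LoadShedder, user_id: int) -> str:
    if not shedder.try_enter():
        return "busy, please retry later"  # fail fast instead of queueing
    try:
        return load_profile(user_id)
    finally:
        shedder.leave()


shedder = LoadShedder(limit=2)
print(handle(shedder, 7))               # profile-7 (cache miss)
print(handle(shedder, 7))               # profile-7 (served from the cache)
print(load_profile.cache_info().hits)   # 1
```

Rejecting early keeps the response time of admitted requests stable under a traffic spike. A real service would more likely apply such limits at a gateway or in dedicated rate-limiting middleware than in application code like this.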



Origin blog.51cto.com/15082402/2644346