A detailed explanation of "high concurrency" in Internet architecture

1. What is high concurrency?

High concurrency is one of the factors that must be considered in the design of an Internet distributed system architecture. It usually means ensuring, through design, that the system can process many requests in parallel at the same time.


Some commonly used indicators related to high concurrency include response time (Response Time), throughput (Throughput), queries per second (QPS), and the number of concurrent users.
 

Response time: the time it takes for the system to respond to a request. For example, if the system takes 200ms to process an HTTP request, that 200ms is the system's response time.

Throughput: the number of requests processed per unit of time.

QPS: the number of requests responded to per second. In the Internet field, the distinction between this indicator and throughput is not so sharp.

Number of concurrent users: the number of users using the system normally at the same time. For example, in an instant messaging system, the number of simultaneously online users reflects, to a certain extent, the system's number of concurrent users.
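
These indicators are not independent: by Little's Law, the number of requests in flight equals QPS multiplied by response time. A minimal Python sketch of this back-of-the-envelope calculation (the numbers are illustrative, not from this article):

    # Little's Law: requests in flight = QPS x response time.
    response_time_s = 0.2      # 200 ms per request, as in the example above
    concurrent_requests = 100  # requests being processed in parallel (illustrative)

    qps = concurrent_requests / response_time_s
    print(f"QPS ~= {qps:.0f}")  # ~500 requests per second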

2. How to improve the concurrency capability of the system

There are two main methodologies in Internet distributed architecture design for improving a system's concurrency: vertical expansion (Scale Up) and horizontal expansion (Scale Out).

Vertical expansion: improve the processing capability of a single machine. There are two ways to expand vertically:

(1) Enhance stand-alone hardware performance, for example: increase the number of CPU cores (e.g. to 32 cores), upgrade to a better network card (e.g. 10G), upgrade to a better hard drive (e.g. SSD), expand hard drive capacity (e.g. to 2T), and expand system memory (e.g. to 128G);

(2) Improve the performance of the single-machine architecture, for example: use a cache to reduce the number of I/O operations, use asynchronous processing to increase single-service throughput, and use lock-free data structures to reduce response time (a caching sketch follows this list);
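
As one concrete illustration of the caching idea, here is a minimal Python sketch. The function load_user_profile is hypothetical, and the sleep merely stands in for a slow storage read:

    import functools
    import time

    @functools.lru_cache(maxsize=1024)
    def load_user_profile(uid):
        # Stand-in for a slow disk or database read; caching means repeated
        # requests for the same uid skip the I/O entirely.
        time.sleep(0.01)
        return {"uid": uid}

    load_user_profile(42)  # slow: goes to "storage"
    load_user_profile(42)  # fast: served from the in-process cache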


In the early days, when an Internet business is developing very rapidly and budget is not an issue, it is strongly recommended to "enhance stand-alone hardware performance" to improve system concurrency, because at this stage the company's strategy is usually to grow the business and seize time, and "enhancing stand-alone hardware performance" is often the fastest way.

Whether you improve stand-alone hardware performance or stand-alone architecture performance, there is a fatal shortcoming: single-machine performance always has a limit. Therefore, the ultimate solution for high-concurrency design in Internet distributed architecture is horizontal expansion.

Horizontal expansion: system performance can be expanded linearly simply by increasing the number of servers. Horizontal expansion places requirements on the system architecture design. How to design for horizontal expansion at each layer of the architecture, and the common horizontal expansion practices at each layer of an Internet company's architecture, are the focus of this article.
 

3. Common Internet layered architecture


A common Internet distributed architecture is layered as follows:

(1) Client layer: the typical caller is a browser or a mobile app (APP)

(2) Reverse proxy layer: the system entrance; reverse proxy

(3) Site application layer: implements the core application logic and returns html or json

(4) Service layer: this layer exists if the architecture has been service-oriented

(5) Data-cache layer: the cache accelerates access to storage

(6) Data-database layer: the database provides durable data storage

How is horizontal expansion implemented at each layer of the system?

4. Layered horizontal expansion architecture practice

1. Horizontal expansion of the reverse proxy layer  

The horizontal expansion of the reverse proxy layer is achieved through "DNS polling" (round-robin DNS): the dns-server is configured with multiple resolution IPs for one domain name, and each time a resolution request reaches the dns-server, these IPs are returned in rotation.

When nginx becomes the bottleneck, simply increase the number of servers, deploy new nginx services, and add their external network IPs to the DNS records to expand the performance of the reverse proxy layer and achieve theoretically unlimited concurrency.
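
A minimal Python sketch of what DNS polling looks like from the client side; www.example.com and port 80 are placeholders, and real resolvers add caching and TTL handling that this ignores:

    import itertools
    import socket

    def resolve_all(hostname):
        # Ask the resolver for every A record configured for the domain;
        # with DNS polling, one domain maps to several nginx entry IPs.
        infos = socket.getaddrinfo(hostname, 80, socket.AF_INET, socket.SOCK_STREAM)
        return sorted({info[4][0] for info in infos})

    ips = resolve_all("www.example.com")  # placeholder domain
    rotation = itertools.cycle(ips)       # rotate across the reverse proxies
    print(next(rotation))

Deploying one more nginx server is then just one more A record on the dns-server.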

2. Horizontal expansion of the site layer 

Horizontal expansion of the site layer is achieved through "nginx": by modifying nginx.conf, multiple web backends can be configured behind one nginx.

When the web backend becomes the bottleneck, simply increase the number of servers, deploy new web services, and configure the new web backends in nginx to expand the performance of the site layer and achieve theoretically unlimited concurrency.
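
Conceptually, what an nginx upstream block does is round-robin requests over the configured backends. A minimal Python sketch of that selection logic (the backend addresses are invented for illustration; this is not nginx's actual implementation):

    import itertools

    # Hypothetical web backends; in nginx these would be the server entries
    # of an "upstream" block in nginx.conf.
    BACKENDS = ["192.168.0.1:8080", "192.168.0.2:8080", "192.168.0.3:8080"]

    _rotation = itertools.cycle(BACKENDS)

    def pick_backend():
        # nginx's default policy is round-robin: each request goes to the
        # next backend in the list. Appending a server here (or a line in
        # nginx.conf) is the horizontal expansion step.
        return next(_rotation)

    for _ in range(4):
        print(pick_backend())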

3. Horizontal expansion of the service layer

The horizontal expansion of the service layer is achieved through the "service connection pool".

When the site layer calls the downstream service layer (RPC-server) through an RPC-client, the connection pool in the RPC-client establishes multiple connections with the downstream service. When the service becomes the bottleneck, simply increase the number of servers, deploy new service instances, and establish new downstream connections in the RPC-client to expand the performance of the service layer and achieve theoretically unlimited concurrency. If the service layer needs to expand automatically in an elegant way, the configuration center may need to support automatic service discovery.
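
A minimal sketch of such a client-side connection pool; RpcConnectionPool and its methods are illustrative names, not a real RPC library, and a production pool would hold live sockets, health-check them, and drop dead hosts:

    import itertools
    import threading

    class RpcConnectionPool:
        def __init__(self, service_addrs, conns_per_host=2):
            self._lock = threading.Lock()
            # One logical "connection" per (host, slot); a real pool would
            # keep live sockets here instead of plain tuples.
            self._conns = [(addr, i) for addr in service_addrs
                           for i in range(conns_per_host)]
            self._rotation = itertools.cycle(self._conns)

        def acquire(self):
            # Round-robin across every connection to every downstream host.
            with self._lock:
                return next(self._rotation)

        def add_host(self, addr, conns_per_host=2):
            # Horizontal expansion: registering a newly deployed RPC-server
            # adds fresh connections without touching existing ones.
            with self._lock:
                self._conns = self._conns + [(addr, i) for i in range(conns_per_host)]
                self._rotation = itertools.cycle(self._conns)

    pool = RpcConnectionPool(["10.0.0.1:9000", "10.0.0.2:9000"])
    print(pool.acquire())
    pool.add_host("10.0.0.3:9000")  # scale out by one server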

4. Horizontal expansion of the data layer

When the amount of data is large, the data layer (cache, database) expands horizontally by splitting the data: data originally stored on one server is split horizontally across different servers, achieving the goal of expanding system performance.

There are several common horizontal splitting methods at the Internet data layer. Taking the database as an example:

5. Split horizontally by range

 

Each data service stores a certain range of data, for example:

user0 library: stores uids in the range 1-1kw (1kw = 10 million)

user1 library: stores uids in the range 1kw-2kw

The benefits of this solution are:

(1) The rules are simple, and the service only needs to determine the uid range to route to the corresponding storage service;

(2) The data is well balanced;

(3) It is relatively easy to expand, and you can add a data service with uid [2kw, 3kw] at any time;

The shortcomings are:

(1) The request load is not necessarily balanced. Generally, newly registered users are more active than old users, so the service holding the larger uid range comes under greater pressure (a routing sketch for this scheme follows);
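
A minimal Python sketch of range routing, following the two shards above; the bounds and library names come from the article's example, and KW is just shorthand for 10 million:

    KW = 10_000_000  # "1kw" in the article = 10 million

    # (exclusive upper bound, library name); appending (3 * KW, "user2")
    # is all it takes to expand by one shard.
    RANGES = [(1 * KW, "user0"), (2 * KW, "user1")]

    def route_by_range(uid):
        for upper, library in RANGES:
            if uid < upper:
                return library
        raise ValueError(f"uid {uid} not covered; add a new range shard")

    print(route_by_range(123))         # -> user0
    print(route_by_range(15_000_000))  # -> user1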


6. Split horizontally by hash

 

Each database stores part of the data, partitioned by the hash of a certain key value, for example:

user0 library: stores data for even uids

user1 library: stores data for odd uids

The benefits of this solution are:

(1) The rules are simple: the service only needs to hash the uid to route to the corresponding storage service;

(2) The data is well balanced;

(3) The request load is also fairly uniform;

The shortcomings are:

(1) It is not easy to expand: when a data service is added, the hash method changes and data migration may be required (see the sketch below);
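
A minimal Python sketch of hash routing for the odd/even example above; uid % N_SHARDS mirrors the article's two libraries, and the closing comment illustrates the migration shortcoming:

    N_SHARDS = 2  # user0 and user1 in the example above

    def route_by_hash(uid):
        # Even uids -> user0, odd uids -> user1.
        return f"user{uid % N_SHARDS}"

    print(route_by_hash(42))  # -> user0
    print(route_by_hash(7))   # -> user1

    # The shortcoming: changing N_SHARDS to 3 reroutes most uids, so the
    # corresponding rows must be migrated between libraries.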

Note that expanding system performance through horizontal splits is fundamentally different from expanding database performance through master-slave synchronization with read-write separation.

Expanding database performance through horizontal splits:

(1) The amount of data stored on each server is 1/n of the total, so single-machine performance also improves;

(2) The data on the n servers has no intersection, and the union of the data on the servers is the complete data set;

(3) The data is horizontally split across n servers. Theoretically, read performance expands n times and write performance also expands n times (in fact by more than n times, because each machine's data volume becomes 1/n of the original);

Expanding database performance through master-slave synchronization with read-write separation:

(1) The amount of data stored on each server is the same as the total;

(2) The data on the n servers is identical; each holds the complete set;

(3) Theoretically, read performance expands n times, but writing is still a single point, so write performance is unchanged;


The horizontal splitting of the cache layer is similar to that of the database layer; it is mostly range splitting or hash splitting, and will not be expanded upon here.
