Designing an architecture that can withstand tens of millions of requests

With the growth of the Internet, the user base of all kinds of software keeps increasing. Once the number of users reaches a certain scale and tens of millions of requests arrive at once, keeping the system running smoothly and responding instantly becomes critical, just as it is for Taobao on Double 11. So how do we design an architecture that can withstand tens of millions of requests?

 

First, we need to establish some principles when designing the architecture.

 

1. Achieve high concurrency

 

Service splitting: split the project into multiple sub-projects or modules, divide and conquer, and scale the system horizontally.

Service orientation: solve service registration and discovery once service call chains become complex.

Message queues: decoupling and asynchronous processing.

Caching: the concurrency gains brought by caching at various levels.

  

2. Achieve high availability 

 

Clustering, rate limiting, and graceful degradation.

 

3. Business Design

 

Idempotency: an operation produces the same result whether the user triggers it once or many times; repeated clicks cause no side effects, just like the number 1 in mathematics, which equals 1 no matter what power it is raised to. The simplest example is payment. A user pays for a product and the deduction succeeds, but the network fails while the result is being returned. The money has already been deducted, yet the user clicks the button again; a second deduction is made, and this time the success result comes back. The user then checks the balance, finds that too much money was deducted, and sees two transaction records instead of one.

 

For details, see "Idempotency in Distributed Systems": https://www.cnblogs.com/vveiliang/p/6643874.html
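The payment scenario above can be made idempotent with a client-supplied request id, so that a retry returns the stored result instead of deducting again. Below is a minimal sketch; the class name, the in-memory store, and the `doDeduct` helper are illustrative assumptions, not code from the article.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of an idempotent payment deduction keyed by a request id.
// A real system would persist the id and result in a database or Redis.
public class IdempotentPayment {

    private final Map<String, String> processed = new ConcurrentHashMap<>();

    // requestId identifies one logical payment; a retry reuses the same id
    public String pay(String requestId, long amountCents) {
        // computeIfAbsent runs the deduction at most once per requestId
        return processed.computeIfAbsent(requestId, id -> doDeduct(amountCents));
    }

    private String doDeduct(long amountCents) {
        // real code would debit the account here
        return "deducted " + amountCents;
    }

    public static void main(String[] args) {
        IdempotentPayment svc = new IdempotentPayment();
        System.out.println(svc.pay("req-1", 999)); // deducted 999
        System.out.println(svc.pay("req-1", 999)); // retry: same result, no second deduction
    }
}
```

With this pattern, the duplicate click in the payment example returns the first deduction's result instead of charging the user twice.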

 

Duplicate-submission prevention: prevent the same data from being submitted repeatedly.

 

Besides disabling the button on the client side after it is clicked, duplicate submissions can also be prevented on the server side:

 

The server generates a unique random token and saves it in the current user's session, then sends it to the client's form, which stores it in a hidden field. When the form is submitted, the token is sent back along with it, and the server checks whether the token submitted by the client matches the one it generated. If they match, the form is processed and the token is cleared from the user's session; if not, the submission is treated as a repeat and is not processed.

 

The server refuses to process a submitted form in any of the following cases:

 

1) The token stored in the session does not match the token submitted with the form.
2) There is no token in the current user's session.
3) There is no token in the submitted form data.
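The token scheme above can be sketched in a few lines. This is a minimal illustration, not the article's code: the "session" is simulated with a map keyed by session id, whereas a real web application would use HttpSession.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of server-side duplicate-submission prevention with a one-time token.
public class FormTokenGuard {

    // sessionId -> token issued for the pending form
    private final Map<String, String> sessionTokens = new ConcurrentHashMap<>();

    // Called when the form page is rendered: issue a token and store it in the session.
    public String issueToken(String sessionId) {
        String token = UUID.randomUUID().toString();
        sessionTokens.put(sessionId, token);
        return token; // embedded in a hidden field of the client's form
    }

    // Called on submission: accept only if the submitted token matches,
    // then clear it so a repeated submission is rejected.
    public boolean tryConsume(String sessionId, String submittedToken) {
        if (submittedToken == null) {
            return false; // case 3: no token in the form data
        }
        // atomically remove only if the stored token matches (covers cases 1 and 2)
        return sessionTokens.remove(sessionId, submittedToken);
    }

    public static void main(String[] args) {
        FormTokenGuard guard = new FormTokenGuard();
        String token = guard.issueToken("session-1");
        System.out.println(guard.tryConsume("session-1", token)); // first submit: true
        System.out.println(guard.tryConsume("session-1", token)); // repeat submit: false
    }
}
```

The atomic `Map.remove(key, value)` removes the token and validates it in one step, so two concurrent submissions of the same form cannot both succeed.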

 

State machine

 

In software design, a state machine generally refers to a finite-state machine (FSM), also called a finite-state automaton: a mathematical model of a finite number of states, the transitions between those states, and the actions they trigger.
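In business design, a state machine also guards against illegal repeats: an order that is already paid cannot be paid again. The sketch below uses a Java enum; the states and transitions are illustrative assumptions, not taken from the article.

```java
// Minimal finite-state machine for an order, modeled with a Java enum.
public class OrderFsm {

    enum State { CREATED, PAID, SHIPPED, CLOSED }

    private State state = State.CREATED;

    // Allow a transition only if it is legal from the current state;
    // an illegal transition (e.g. paying twice) is simply rejected.
    public synchronized boolean fire(State target) {
        boolean legal =
                (state == State.CREATED && (target == State.PAID || target == State.CLOSED))
             || (state == State.PAID && target == State.SHIPPED);
        if (legal) {
            state = target;
        }
        return legal;
    }

    public synchronized State state() {
        return state;
    }

    public static void main(String[] args) {
        OrderFsm order = new OrderFsm();
        System.out.println(order.fire(State.PAID)); // true: CREATED -> PAID
        System.out.println(order.fire(State.PAID)); // false: duplicate payment rejected
    }
}
```

Rejecting transitions that are not legal from the current state gives the same idempotency guarantee discussed above: replaying "pay" on an already-paid order is a no-op.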

 

Below we focus on the concept of rate limiting, with examples.

 

Purpose of rate limiting

 

The purpose of rate limiting is to protect system availability by limiting the rate of concurrent access or the number of requests within a time window; once the limit is reached, further requests can be denied. Take a phone pre-sale as an example: if only 30,000 units are for sale, the system only needs to accept requests from 30,000 users. The remaining requests can be filtered out with a prompt such as "The server is currently busy, please try again later."

 

Rate limiting methods:

 

1. Limit instantaneous concurrency: for example, at the entry layer, use nginx's ngx_http_limit_conn_module to limit the number of connections from the same source IP and defend against malicious attacks.

2. Limit total concurrency: constrain overall concurrency by sizing database connection pools and thread pools.

3. Limit the average rate within a time window: at the interface level, control concurrent requests by limiting the access rate.

4. Other methods: limit the call rate of remote interfaces, or limit the consumption rate of an MQ consumer.

 

Common rate limiting algorithms

 

1. Sliding window protocol: A common flow control technique used to improve throughput.

 

Origin of the sliding window protocol:

 

The sliding window is a flow control technique. In early network communication, the two parties simply sent data without considering network congestion. Since neither side knew the congestion state and both sent at once, intermediate nodes became blocked and dropped packets, and no one could get data through; the sliding window mechanism was introduced to solve this problem. Both the sender and the receiver maintain a sequence of data frames, called a window.

 

Definition: the sliding window protocol, as used in TCP, provides flow control during network data transmission to avoid congestion. It allows the sender to transmit multiple packets before stopping to wait for an acknowledgment. Because the sender does not have to stop and wait after every packet, the protocol speeds up transmission and improves network throughput.

 

Send window: the sequence numbers of the frames the sender is allowed to transmit continuously. The sender can keep sending without waiting for an acknowledgment; this can be tuned by setting the window size.

 

Receive window: the sequence numbers of the frames the receiver is allowed to accept. Every frame that falls inside the receive window must be processed by the receiver; frames that fall outside it are discarded. The number of frames the receiver may accept at a time is the size of the receive window.

Demo address: https://media.pearsoncmg.com/aw/ecs_kurose_compnetwork_7/cw/content/interactiveanimations/selective-repeat-protocol/index.html
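The same windowing idea, applied to throttling rather than TCP, counts requests inside a moving time window. A minimal sketch follows; the limit and window length are illustrative assumptions.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch of sliding-window rate limiting: keep the timestamps of recent
// requests and reject a new request once the window already holds the limit.
public class SlidingWindowLimiter {

    private final int maxRequests;   // max requests allowed per window
    private final long windowMillis; // window length in milliseconds
    private final Deque<Long> timestamps = new ArrayDeque<>();

    public SlidingWindowLimiter(int maxRequests, long windowMillis) {
        this.maxRequests = maxRequests;
        this.windowMillis = windowMillis;
    }

    public synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        // evict timestamps that have slid out of the window
        while (!timestamps.isEmpty() && now - timestamps.peekFirst() >= windowMillis) {
            timestamps.pollFirst();
        }
        if (timestamps.size() < maxRequests) {
            timestamps.addLast(now);
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        SlidingWindowLimiter limiter = new SlidingWindowLimiter(3, 1000);
        for (int i = 0; i < 5; i++) {
            System.out.println(limiter.tryAcquire()); // first 3 true, then false
        }
    }
}
```

Because the window slides continuously instead of resetting at fixed boundaries, this avoids the burst-at-the-edge problem of the plain counter algorithm described later.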

 

2. Leaky bucket: the leaky bucket algorithm can forcibly limit the data transmission rate.

 

The idea of the leaky bucket algorithm is very simple: requests first enter the bucket, and the bucket leaks water at a fixed rate; when the incoming water exceeds the bucket's capacity, it overflows and is rejected. The leaky bucket thus forcibly caps the output rate, and the input side does not need to care about the outflow rate, much like an MQ: the producer only puts messages into the queue and does not care whether a consumer has received them.

 

The overflowed water, i.e. the filtered requests, can simply be discarded, or temporarily saved in some way such as being added to a queue, similar to the rejection policies a thread pool applies to overflowing tasks.
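The leaky bucket described above can be sketched as follows. The capacity and leak rate are illustrative assumptions; a production implementation would also need a drain for the queued requests.

```java
// Sketch of the leaky bucket: requests pour water into the bucket, the bucket
// leaks at a fixed rate, and anything that would overflow is rejected.
public class LeakyBucket {

    private final long capacity;       // bucket size
    private final double leakPerMilli; // outflow rate, units of water per ms
    private double water = 0;
    private long lastLeakTime = System.currentTimeMillis();

    public LeakyBucket(long capacity, double leakPerMilli) {
        this.capacity = capacity;
        this.leakPerMilli = leakPerMilli;
    }

    public synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        // leak out water in proportion to the elapsed time
        water = Math.max(0, water - (now - lastLeakTime) * leakPerMilli);
        lastLeakTime = now;
        if (water + 1 <= capacity) {
            water += 1; // request accepted into the bucket
            return true;
        }
        return false;   // bucket full: the request overflows and is dropped
    }
}
```

Note that the output rate is fixed regardless of how bursty the input is, which is exactly why the next algorithm, the token bucket, is preferred when bursts must be allowed.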

3. Token bucket: a rate-controlled rate limiting algorithm.

 

In many scenarios, besides limiting the average transmission rate, some degree of burst traffic must also be allowed. The leaky bucket algorithm may not fit here, while the token bucket does. Its principle is that the system puts tokens into a bucket at a constant rate; a request must first obtain a token from the bucket before it is processed, and when no tokens are available, service is denied.

 

Set Rate = 2: the number of tokens put in per second

 

Bucket size: 100

 

Here is a small demo implementing the token bucket with Guava's RateLimiter:

 

import java.io.IOException;
import java.util.Random;
import java.util.concurrent.CountDownLatch;

import com.google.common.util.concurrent.RateLimiter;

public class TokenDemo {

    // qps: queries handled per second; tps: transactions handled per second
    // here the permitted qps is 10
    RateLimiter rateLimiter = RateLimiter.create(10);

    public void doSomething() {
        if (rateLimiter.tryAcquire()) {
            // try to get a token; true means the token was acquired
            System.out.println("normal processing");
        } else {
            System.out.println("processing failed");
        }
    }

    public static void main(String[] args) throws IOException {
        /*
         * CountDownLatch is implemented with an internal counter whose initial
         * value is the number of operations the waiting threads depend on.
         * A thread that must wait calls await(), which puts it to sleep until
         * the count reaches zero.
         * When an operation finishes, countDown() decrements the counter by 1;
         * when it reaches 0, all threads sleeping in await() are woken up and
         * resume.
         */
        CountDownLatch latch = new CountDownLatch(1);
        Random random = new Random(10);
        TokenDemo tokenDemo = new TokenDemo();
        for (int i = 0; i < 20; i++) {
            new Thread(() -> {
                try {
                    latch.await();
                    Thread.sleep(random.nextInt(1000));
                    tokenDemo.doSomething();
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }).start();
        }
        latch.countDown();
        System.in.read();
    }
}

 

Result:

 

normal processing
normal processing
normal processing
normal processing
processing failed
normal processing
processing failed
processing failed
normal processing
normal processing
processing failed
normal processing
processing failed
normal processing
normal processing
normal processing
normal processing
processing failed
processing failed

 

As the output shows, when tokens run out, acquisition fails, which achieves the rate limiting effect.

 

4. Counter: the simplest algorithm, which controls the number of requests within a fixed time period.
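A minimal sketch of the counter algorithm follows; the limit and window values are illustrative assumptions. The counter resets when a new window starts, so unlike the sliding window it permits a burst of up to twice the limit across a window boundary.

```java
// Sketch of counter-based rate limiting over a fixed time window.
public class CounterLimiter {

    private final int limit;        // max requests per window
    private final long windowMillis;
    private int count = 0;
    private long windowStart = System.currentTimeMillis();

    public CounterLimiter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    public synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        if (now - windowStart >= windowMillis) {
            windowStart = now; // a new window begins: reset the counter
            count = 0;
        }
        if (count < limit) {
            count++;
            return true;
        }
        return false;
    }
}
```

Its simplicity makes it a common first choice, with the boundary-burst weakness addressed by the sliding window, leaky bucket, or token bucket algorithms above.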

 

Source: http://www.cnblogs.com/GodHeng/p/8834810.html
