35 | Traffic scheduling and load balancing

Compared with desktop programs, server programs depend not only on the operating system and the programming language, but also on two more categories of basic software:

Load balancing (Load Balance);

Database or other forms of storage (DB/Storage).

Why is Load Balance needed? Today we will talk about traffic scheduling and load balancing.

In the previous lecture, we drew the architecture diagram of a server program.

What is "traffic scheduling"? We first need to understand the concepts related to several common server program running instances (processes):

Number of connections;

IOPS;

Traffic, divided into inbound traffic and outbound traffic.

A basic service interaction with a server program usually consists of a request packet (Request) and a response packet (Response). One such request/response exchange constitutes a complete service call.

The number of connections, sometimes called the concurrency, is the number of requests in flight at the same time, that is, requests whose Request has been sent but whose Response has not yet been received.

IOPS is the average number of requests (one complete request/response exchange) completed per second. It can be used to judge the efficiency of the server program.

Traffic is divided into inbound traffic and outbound traffic. Inbound traffic can be estimated as follows:

Average number of request packets (Requests) received per second * Average size of request packets.

Similarly, outbound traffic can be estimated as follows:

Average number of response packets (Response) returned per second * Average size of response packets.

If we ignore invalid request packets, i.e. requests that never receive a response (these certainly exist in real production environments), then the average number of request packets (Requests) received per second and the average number of response packets (Responses) returned per second are both equal to the IOPS. Therefore:

Inbound traffic ≈ IOPS * average request packet size

Outbound traffic ≈ IOPS * average response packet size
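
To make this estimate concrete, here is a minimal sketch in Go. The numbers (10,000 IOPS, 2 KB requests, 20 KB responses) are made-up assumptions for illustration, not measurements from any real system:

```go
package main

import "fmt"

func main() {
	// Hypothetical workload figures, for illustration only.
	iops := 10000.0   // requests completed per second
	avgReqKB := 2.0   // average request packet size, in KB
	avgRespKB := 20.0 // average response packet size, in KB

	inbound := iops * avgReqKB   // KB/s
	outbound := iops * avgRespKB // KB/s

	fmt.Printf("inbound  ≈ %.1f MB/s\n", inbound/1024)
	fmt.Printf("outbound ≈ %.1f MB/s\n", outbound/1024)
}
```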

Traffic scheduling, then, is the process of distributing a massive volume of concurrent client requests across different server program instances according to a specific strategy.

There are many ways to do traffic scheduling.

DNS traffic scheduling

The most basic way is through DNS, as shown in the figure below.

Through DNS, a single domain name resolves to multiple IPs, and each IP corresponds to a different server program instance. No conventional load balancing (Load Balance) software is involved, yet traffic scheduling is accomplished.
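
This multi-IP resolution is easy to observe from a client. Below is a minimal Go sketch using the standard library's net.LookupHost; example.com stands in for whatever domain is being scheduled:

```go
package main

import (
	"fmt"
	"net"
)

func main() {
	// A round-robin DNS setup returns several A records for one
	// domain name, one per server program instance.
	addrs, err := net.LookupHost("example.com") // placeholder domain
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	for _, addr := range addrs {
		fmt.Println("instance:", addr)
	}
}
```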

So what are the disadvantages of this approach?

The first problem is the inconvenience of upgrading.

To upgrade the server program instance behind IP1, we must first remove IP1 from the DNS record, wait until the IP1 instance receives no more traffic, upgrade the instance, and finally add IP1 back to the DNS record.

This sounds fine, but do not forget that DNS resolution results are cached. Even if we remove IP1 from the DNS record with a TTL of 15 minutes, a trickle of user requests may still reach the IP1 instance a day later.

Upgrading by adjusting DNS records is therefore highly unpredictable, and the cycle to upgrade a single instance is extremely long.

If upgrading one instance takes a day and we have 10 instances in total, the whole rollout takes 10 days, which is absurd.

The second problem is unbalanced traffic scheduling.

DNS servers can do a certain amount of balancing. For example, with round-robin resolution, the first lookup returns IP1 first, the second puts IP2 first, and so on, so the IP list is handed out in a rotated, balanced fashion.

However, domain name resolution balance does not represent true traffic balance.

On the one hand, not every user request triggers a DNS resolution, because the client has its own cache. On the other hand, DNS resolution itself passes through layers of caching, so only a tiny fraction of requests ever reach the authoritative DNS server.

Traffic scheduling based on DNS resolution is therefore very coarse, and the balance it actually achieves is uncontrollable.

So, how can traffic scheduling be truly balanced?

Network layer load balancing

The first approach is to perform load balancing at the network layer (IP layer).

LVS (Linux Virtual Server), the load balancing software created by Dr. Zhang Wensong, works at this layer. Let us take LVS as a representative and walk through how it works.

LVS supports three scheduling modes.

VS/NAT: scheduling is done via Network Address Translation (NAT). Both requests and responses pass through the scheduler, so this mode has the worst performance.

VS/TUN: the request packet is forwarded to the real server through an IP tunnel, and the real server returns the response directly to the client, so the scheduler only handles request packets. This performs much better than VS/NAT.

VS/DR: the request packet's MAC address is rewritten so that the packet goes to the real server, and the real server returns the response directly to the client. Compared with VS/TUN, this removes the IP tunnel overhead and has the best performance.

We will focus on the VS/DR technique.

As shown in the figure, let the client's IP and MAC be CIP and CMAC.

Step 1: the client initiates a request. In its IP packet, the source IP is the client's CIP and the destination IP is the VIP; the source MAC is CMAC and the destination MAC is DMAC (the Director's MAC).

Step 2: the request packet arrives at the LVS scheduler (Director Server). The scheduler keeps the source and destination IPs unchanged, rewrites only the destination MAC to RMAC, and forwards the request to the real business server instance RS (Real Server).

Step 3: the RS receives the packet, processes it, and sends the response directly back to the client.

The key trick is that the VIP is bound to multiple machines, which is why it is called a virtual IP: it is bound both to the LVS scheduler (Director Server) and to every business server instance RS (Real Server).

Of course, a very important detail remains: when an ARP broadcast asks for the MAC address behind the VIP, who answers? The answer must be the LVS scheduler (Director Server) alone. On each real business server instance RS (Real Server), we therefore bind the VIP to the lo loopback interface and suppress ARP responses for it (on Linux, typically via the arp_ignore and arp_announce kernel parameters), thus avoiding IP conflicts.

LVS does load balancing low in the network stack. Compared with other load balancing technologies, its strengths are generality (it does not need to understand the application protocol) and high performance.

But it also has drawbacks. If a business server instance RS dies and the LVS scheduler (Director Server) has not yet noticed, requests forwarded to that instance in the meantime will fail, and such failures can only be resolved by client retries.
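
To give a sense of what that client-side retry looks like, here is a minimal Go sketch; the VIP URL is hypothetical, and a production client would add backoff and confirm the request is idempotent before retrying:

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

func main() {
	// If a request lands on a dead RS before the scheduler notices,
	// the client simply tries again after a short pause.
	var resp *http.Response
	var err error
	for attempt := 0; attempt < 3; attempt++ {
		resp, err = http.Get("http://vip.example.com/api") // hypothetical VIP address
		if err == nil {
			break
		}
		time.Sleep(time.Second)
	}
	if err != nil {
		fmt.Println("all retries failed:", err)
		return
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}
```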

Application layer load balancing

Is there a way to avoid this request failure?

Yes, there is: server-side retry.

How do we do server-side retry? With application layer load balancing, sometimes also called an application gateway.

HTTP is the most widely used application layer protocol, so most application gateways today are HTTP application gateways.

Nginx and Apache are the most familiar HTTP application gateways. Because they understand the details of the application layer protocol, HTTP application gateways are often very powerful. We will return to this later; today we focus on load balancing (Load Balance).

After the HTTP gateway receives an HTTP request (Request), it forwards the request to a back-end real business server instance RS (Real Server) according to some scheduling algorithm. After receiving the response (Response) from the RS, it forwards the response to the client.

The logic of the whole process is very simple, and retrying is also very easy to do.
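
Here is a minimal sketch of that forwarding logic in Go, built on the standard library's httputil.ReverseProxy with a round-robin choice of RS. The backend addresses are placeholders, and a real gateway such as Nginx adds health checks, connection pooling, and much more:

```go
package main

import (
	"net/http"
	"net/http/httputil"
	"net/url"
	"sync/atomic"
)

func main() {
	// Hypothetical back-end business server instances (Real Servers).
	backends := []*url.URL{
		{Scheme: "http", Host: "10.0.0.1:8080"},
		{Scheme: "http", Host: "10.0.0.2:8080"},
	}

	var next uint64
	proxy := &httputil.ReverseProxy{
		Director: func(req *http.Request) {
			// Pick the next RS in round-robin order and rewrite
			// the request to point at it.
			rs := backends[atomic.AddUint64(&next, 1)%uint64(len(backends))]
			req.URL.Scheme = rs.Scheme
			req.URL.Host = rs.Host
		},
	}

	// The gateway listens for client requests and relays responses back.
	http.ListenAndServe(":80", proxy)
}
```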

When the HTTP gateway discovers that an RS instance is down, it can resend the same HTTP request (Request) to another RS instance.

An important detail: to support retries, the HTTP request (Request) must be saved. Retrying without saving it is possible, but that only covers the case where the business instance fails before a single byte of the request has been sent. In situations such as power outages or abnormal crashes there will obviously be many in-flight requests that do not meet this condition, and those could not be retried.

Most HTTP requests are small and can be buffered directly in memory at low cost. For file upload requests, however, the request body contains the file content, so you may need temporary files or other means to save the request.
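
A minimal sketch of the save-and-retry idea, assuming in-memory buffering only (so it would not suit large file uploads) and a hypothetical forwardWithRetry helper:

```go
package main

import (
	"bytes"
	"errors"
	"io"
	"net/http"
)

// forwardWithRetry buffers the request body in memory so the same
// request can be replayed against another backend if one fails.
func forwardWithRetry(req *http.Request, backends []string) (*http.Response, error) {
	body, err := io.ReadAll(req.Body) // save the request for possible retries
	if err != nil {
		return nil, err
	}
	for _, rs := range backends {
		retry, err := http.NewRequest(req.Method, rs+req.URL.RequestURI(), bytes.NewReader(body))
		if err != nil {
			return nil, err
		}
		retry.Header = req.Header
		resp, err := http.DefaultClient.Do(retry)
		if err == nil {
			return resp, nil // first RS that answers wins
		}
		// This RS failed mid-request; replay the saved body on the next one.
	}
	return nil, errors.New("all backends failed")
}
```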

Graceful upgrade

With load balancing in place, not only is balanced traffic scheduling achieved, but upgrading business servers also becomes much more convenient.

For scenarios where the front end is a network layer load balancer such as LVS, the core upgrade steps are:

The upgrade system notifies the LVS scheduler (Director Server) to take the business server (Real Server) instance to be upgraded offline.

The LVS scheduler (Director Server) removes the instance from the RS set so that new traffic is no longer scheduled to it.

The upgrade system notifies the RS instance to be upgraded to exit.

The RS instance finishes all pending requests and then exits on its own.

The upgrade system updates the RS instance to the new version and restarts it.

The upgrade system adds the RS instance back to the RS set so it participates in scheduling again.

For load balancing scenarios where the front end is an HTTP application gateway, the upgrade process can be simpler:

The upgrade system notifies the business server (Real Server) instance to be upgraded to exit.

The RS instance to be upgraded enters an exiting state: new incoming requests are rejected outright (with a special Status Code), and once all in-flight requests have been processed, the instance exits on its own (see the sketch after this list).

The upgrade system updates the RS instance to the new version and restarts it.
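
The "finish in-flight requests, then exit" step maps naturally onto the graceful shutdown built into Go's net/http. A minimal sketch, omitting the upgrade-system notification and the special Status Code rejection:

```go
package main

import (
	"context"
	"net/http"
	"os"
	"os/signal"
	"time"
)

func main() {
	srv := &http.Server{Addr: ":8080"}
	done := make(chan struct{})

	go func() {
		// On an exit signal, stop accepting new connections and
		// drain the requests already in flight.
		quit := make(chan os.Signal, 1)
		signal.Notify(quit, os.Interrupt)
		<-quit
		ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
		defer cancel()
		srv.Shutdown(ctx) // blocks until in-flight requests finish
		close(done)
	}()

	srv.ListenAndServe() // returns as soon as Shutdown is called
	<-done               // wait for the drain to complete before exiting
}
```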

Because the HTTP application gateway supports retries, requests rejected by an exiting instance are simply replayed to other instances, so the business server's upgrade process becomes much simpler.

Conclusion

Today we started with traffic scheduling and talked about several typical scheduling methods and load balancing methods.

From the perspective of traffic scheduling, the greatest value of load balancing is to even out the pressure across multiple business servers. An implicit premise here is that load balancing software can withstand far more pressure than the business servers it fronts.

This shows up in two ways: first, the ratio of load balancer instances to business server instances is usually far less than 1; second, since the DNS scheduling in front of the load balancers is itself unbalanced, the pressure on individual load balancer instances is uneven, and some may be under very heavy load.

Of course, the value of load balancing is not only balanced traffic scheduling; it also makes graceful upgrades of our business servers possible.
