The role of load balancers in system design

When a website becomes very popular, its traffic increases, and so does the load on its servers. Once concurrent traffic exceeds what a single server can handle, the website becomes unresponsive to users. To handle this volume of requests and return correct responses quickly and reliably, we need to scale: we add more servers to the network and distribute the requests among them. But... who decides which request is routed to which server?

[Figure: Load balancer - system design interview]

The answer is... a load balancer. Let's understand the concept of a load balancer in detail...

What is a load balancer?

The load balancer is like a "traffic policeman" standing in front of the servers, distributing client requests to all of them. It effectively spreads a set of request operations (such as database writes or cache queries) across multiple servers and ensures that no single server is overwhelmed with requests, which would otherwise degrade the overall performance of the application. A load balancer can be a physical device, a virtual instance running on dedicated hardware, or a software process.

Consider a scenario where an application runs on a single server and clients connect directly to that server without load balancing. It might look something like this...

[Figure: clients connecting directly to a single server]

There are two main issues with this model that we need to discuss...

• Single point of failure: If the server crashes or something goes wrong, the entire application becomes unavailable to users for some period of time. This creates a bad user experience, which is unacceptable for a service provider.
• Server overload: There is a limit to the number of requests a web server can handle. As the business grows and the number of requests increases, the server becomes overloaded. To handle the growing request volume, we need to add more servers and distribute the requests across that cluster.

To solve these problems and distribute the load, we can add a load balancer in front of the web servers. By spreading requests across multiple servers, the service can scale to any request volume simply by adding web servers to the network. If one of the servers goes offline for some reason, the service still continues. Per-request latency also drops, because no single server is constrained by its RAM, disk, or CPU.

[Figure: application with a load balancer in front of multiple servers]

• A load balancer can minimize server response time and maximize throughput.
• A load balancer ensures high availability and reliability by sending requests only to servers that are online.
• A load balancer performs continuous health checks to monitor each server's ability to handle requests.
• Based on the number of requests or the demand, a load balancer adds or removes servers.

Where are load balancers typically placed?

Here's an illustration of where a load balancer might be placed...

[Figure: typical load balancer placements]

• Between the client application/user and the server
• Between the server and the application/job servers
• Between the application servers and the cache servers
• Between the cache servers and the database servers

Types of load balancers

We can achieve load balancing in three ways. They are...

1. Software load balancer in client

As the name suggests, all of the load balancing logic lives in the client application (such as a mobile app). The application is given a list of web/application servers to interact with. It selects the first server in the list and requests data from it. If failures persist (after a configurable number of retries) and that server becomes unavailable, the application discards it and selects another server from the list to continue the process. This is one of the cheapest ways to implement load balancing.
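As a concrete illustration, here is a minimal Python sketch of this client-side approach, assuming a static server list; the SERVERS addresses, the fetch helper, and MAX_RETRIES are hypothetical placeholders, not part of any real client SDK.

import urllib.request

# Hypothetical list of servers shipped to the client; addresses are placeholders.
SERVERS = ["http://10.0.0.1:8080", "http://10.0.0.2:8080", "http://10.0.0.3:8080"]
MAX_RETRIES = 3  # the "configurable number of retries" per server

def fetch(path: str) -> bytes:
    candidates = list(SERVERS)
    while candidates:
        server = candidates[0]
        for _ in range(MAX_RETRIES):
            try:
                with urllib.request.urlopen(server + path, timeout=2) as resp:
                    return resp.read()  # success: use this server's response
            except OSError:  # URLError and timeouts are both OSError subclasses
                continue  # transient failure: retry the same server
        candidates.pop(0)  # retries exhausted: discard this server, try the next
    raise RuntimeError("no server in the list is available")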

2. Software load balancer in service

These load balancers are software processes that receive a set of requests and redirect them according to a set of rules. This kind of load balancer offers great flexibility because it can be installed on any standard device (such as a Windows or Linux machine). It is also cheaper, because there is no physical equipment to buy or maintain, unlike a hardware load balancer. You can choose an off-the-shelf software load balancer, or write your own custom software (for example, to load balance Active Directory queries for Microsoft Office365).
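To make the idea concrete, below is a minimal Python sketch of such a software load balancer running as an ordinary process: it accepts HTTP requests and forwards each one to a backend chosen by a trivial alternation rule. The BACKENDS addresses, the port numbers, and the rule itself are illustrative assumptions, not a production design.

from http.server import BaseHTTPRequestHandler, HTTPServer
import urllib.request

# Hypothetical backend servers this process redirects traffic to.
BACKENDS = ["http://127.0.0.1:9001", "http://127.0.0.1:9002"]

class LoadBalancerHandler(BaseHTTPRequestHandler):
    request_count = 0  # drives the (placeholder) redirection rule

    def do_GET(self):
        # Rule: alternate between backends on each incoming request.
        backend = BACKENDS[LoadBalancerHandler.request_count % len(BACKENDS)]
        LoadBalancerHandler.request_count += 1
        with urllib.request.urlopen(backend + self.path) as resp:
            body = resp.read()
            status = resp.status
        self.send_response(status)  # relay the backend's response to the client
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), LoadBalancerHandler).serve_forever()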

3. Hardware load balancer

As the name suggests, hardware load balancers are physical devices that distribute traffic across a cluster of web servers. Also known as layer 4-7 routers, they can handle all kinds of HTTP, HTTPS, TCP, and UDP traffic. A hardware load balancer (HLB) presents a virtual server address to the outside world. When a request arrives from a client application, the HLB forwards the connection to the most appropriate real server, performing bidirectional network address translation (NAT) along the way. An HLB can handle large volumes of traffic, but it is expensive and offers limited flexibility.

The HLB continuously performs health checks on each server to ensure it responds correctly. If any server fails to produce the expected response, the HLB immediately stops sending traffic to it. These load balancers are costly to acquire and configure, which is why many service providers use them only as the first entry point for user requests; internal software load balancers then redirect traffic behind the infrastructure wall.
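The health-check loop itself is simple in concept. Here is a minimal Python sketch, assuming each server exposes a hypothetical /health endpoint; the SERVERS list and the probe interval are placeholders.

import time
import urllib.request

SERVERS = ["http://10.0.0.1:8080", "http://10.0.0.2:8080"]  # placeholder addresses
healthy = set(SERVERS)  # the set of servers currently receiving traffic

def probe(server: str) -> bool:
    """Return True if the server answers its health endpoint with HTTP 200."""
    try:
        with urllib.request.urlopen(server + "/health", timeout=1) as resp:
            return resp.status == 200
    except OSError:
        return False

while True:
    for server in SERVERS:
        if probe(server):
            healthy.add(server)      # server recovered: resume sending traffic
        else:
            healthy.discard(server)  # unexpected response: stop traffic immediately
    time.sleep(5)  # re-check after a short interval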

Different categories of load balancing

Typically, load balancers are divided into three categories...

1. Layer 4 (L4) Load Balancer

In the OSI model, layer 4 is the transport layer (TCP/SSL), and routing decisions are made at this layer. A layer 4 load balancer is also called a network load balancer, and as the name suggests, it uses network-level information to make traffic routing decisions. It can handle millions of requests per second and all forms of TCP/UDP traffic. The decision is based on the TCP or UDP ports used by the packets and their source and destination IP addresses. A layer 4 load balancer also performs network address translation (NAT) on request packets, but it never examines the actual contents of each packet. This class of load balancer maximizes utilization and availability by distributing traffic across IP addresses, switches, and routers.
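As an illustration of an L4-style decision, here is a minimal Python sketch that picks a backend by hashing only the connection's addresses and ports, never its payload. The BACKENDS list and the choice of hash are assumptions for demonstration.

import hashlib

BACKENDS = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]  # hypothetical real servers

def pick_backend(src_ip: str, src_port: int, dst_ip: str, dst_port: int,
                 proto: str = "TCP") -> str:
    """Hash the connection tuple so every packet of a flow hits the same backend."""
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}/{proto}".encode()
    digest = hashlib.md5(key).digest()
    return BACKENDS[int.from_bytes(digest[:4], "big") % len(BACKENDS)]

print(pick_backend("203.0.113.7", 51514, "198.51.100.1", 443))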

2. Layer 7 (L7) Load Balancer

Layer 7 load balancers are also known as application load balancers or HTTP(S) load balancers . This is one of the oldest forms of load balancing. In the OSI model, layer 7 is the application layer (HTTP/HTTPS), and routing decisions are performed at this layer. Layer 7 adds content switching for load balancing, which uses information such as HTTP headers, cookies, uniform resource identifiers, SSL session IDs, and HTML form data to decide which server to route the request to.
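Here is a minimal Python sketch of this kind of content switching, routing on the URI and a cookie. The pool names, paths, and the session_id cookie are hypothetical examples, not a standard API.

def pick_pool(path: str, headers: dict) -> str:
    """Choose a backend pool from HTTP-level information (URI, cookies)."""
    if path.startswith("/video/"):
        return "video-servers"  # send media traffic to its own pool
    if "session_id" in headers.get("Cookie", ""):
        return "app-servers"    # keep logged-in sessions on the app pool
    return "web-servers"        # default pool for everything else

print(pick_pool("/video/intro.mp4", {}))                       # -> video-servers
print(pick_pool("/account", {"Cookie": "session_id=abc123"}))  # -> app-servers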

3. Global Server Load Balancing (GSLB)

Today, many applications are hosted in cloud data centers across multiple geographic regions. This is why many organizations are turning to a different kind of load balancer, one that can deliver applications to any device or location with greater reliability and lower latency. GSLB meets these expectations of IT organizations: it extends the capabilities of L4 and L7 load balancers across different geographic locations and efficiently distributes large volumes of traffic among multiple data centers. It also ensures a consistent user experience as users move between applications and services in the digital workspace.

Load balancing algorithms

We need a load balancing algorithm to decide which request is redirected to which backend server. Different systems select servers in different ways, and companies use a variety of load balancing techniques depending on their configuration. Here are some common load balancing algorithms:

1. Round Robin

Requests are distributed among the servers in a sequential, rotating fashion: the first request is sent to the first server, the second request to the second server, the third request to the third server, and so on, wrapping around to the first server once the list is exhausted. This approach is easy to implement, but it does not take the current load on each server into account, so there is a risk that one server receives a large number of heavy requests and becomes overloaded.
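A minimal Python sketch of round robin, with hypothetical server names:

import itertools

servers = ["server1", "server2", "server3"]
rotation = itertools.cycle(servers)  # endlessly rotate through the list

for request_id in range(6):
    print(f"request {request_id} -> {next(rotation)}")
# request 0 -> server1, request 1 -> server2, request 2 -> server3,
# request 3 -> server1, ... and so on, regardless of each server's load.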

2. Weighted Round Robin

This is very similar to round robin. The only difference is that each server in the list is assigned a weight, and requests are distributed to the servers in proportion to those weights. As a result, some servers receive a larger share of the overall requests.
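A minimal Python sketch, with hypothetical weights; it simply expands each server by its weight and rotates as in plain round robin (real implementations usually interleave more smoothly):

import itertools

# Hypothetical weights: big-server should receive 3 of every 4 requests.
weights = {"big-server": 3, "small-server": 1}

# Expand each server by its weight, then rotate over the expanded list.
expanded = [server for server, w in weights.items() for _ in range(w)]
rotation = itertools.cycle(expanded)

for request_id in range(8):
    print(f"request {request_id} -> {next(rotation)}")
# big-server, big-server, big-server, small-server, then the cycle repeats.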

3. Least Connection Method

In this method, requests are directed to the server with the smallest number of active connections. To do this, the load balancer performs some additional bookkeeping to identify the server with the fewest connections. This can be slightly more expensive than round robin, but the decision is based on the current load of each server. The algorithm is most useful when traffic is unevenly distributed among servers.
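A minimal Python sketch, with hypothetical connection counts; the extra work compared with round robin is the scan for the minimum:

# Active connections the balancer tracks per server (illustrative numbers).
active_connections = {"server1": 12, "server2": 4, "server3": 9}

def pick_server() -> str:
    """Scan all servers and return the one with the fewest active connections."""
    return min(active_connections, key=active_connections.get)

server = pick_server()           # -> "server2" (only 4 active connections)
active_connections[server] += 1  # the new request opens one more connection
print(server)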

4. Least Response Time Method

This technique is more sophisticated than the least connection method. Here, requests are forwarded to the server with the fewest active connections and the lowest average response time. A server's response time reflects its load and the overall user experience it can be expected to deliver.
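A minimal Python sketch; the tie-breaking rule (connections first, then average response time) and all numbers are illustrative assumptions, since real balancers combine these factors in their own ways:

# Per-server statistics the balancer tracks (illustrative numbers).
stats = {
    "server1": {"connections": 5, "avg_ms": 120},
    "server2": {"connections": 5, "avg_ms": 40},
    "server3": {"connections": 9, "avg_ms": 35},
}

def pick_server() -> str:
    # Prefer fewer active connections; break ties on average response time.
    return min(stats, key=lambda s: (stats[s]["connections"], stats[s]["avg_ms"]))

print(pick_server())  # -> "server2": tied with server1 on connections, but faster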

5. Source IP Hash

In this method, requests are sent to a server based on the client's IP address: a hash is computed from the client's IP address (and the IP address of the receiving compute instance) using a cryptographic algorithm, and that hash determines which server handles the request. This keeps a given client pinned to the same server across requests.
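A minimal Python sketch using SHA-256 as the cryptographic hash; the server names and client address are hypothetical:

import hashlib

SERVERS = ["server1", "server2", "server3"]

def pick_server(client_ip: str) -> str:
    """Map a client IP to a server via a cryptographic hash."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return SERVERS[int.from_bytes(digest[:4], "big") % len(SERVERS)]

print(pick_server("203.0.113.7"))
print(pick_server("203.0.113.7"))  # same client -> same server, every time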
