Understanding Load Balancing, Session Retention, and Session Synchronization

One, what is load balancing?

A new website does not need load balancing: its traffic is small, so there is no reason to bother. As traffic grows quickly, however, a single server is limited by its own hardware and struggles to handle the load. At that point there are two options:
1. Upgrade the hardware of the single server, for example from dual-core to quad-core, and add memory.
2. Add more servers to share the load, which increases both the available network bandwidth and the overall processing capacity.

 

The first approach is vertical scaling, which always runs into a limit.
The second approach is the right way to solve the problem. Load balancing can be implemented in two ways: in software, or in hardware (including combined hardware and software solutions). A software load balancer itself consumes some system resources and adds to response time. Examples include LVS, which works at the transport layer, and nginx, HAProxy, and Apache, which work at the application layer. Software load balancing suits websites whose traffic is not especially heavy. For sites with very heavy traffic, such as sina and 163, hardware load balancing is the more obvious choice.

There are many load balancing algorithms: some are based on the number of requests, some on the source IP, some on traffic, and so on. I often use two of them.

The first distributes by the number of requests
a. Each server gets a roughly even share of client requests, and if one server goes down the others simply pick up its share.
b. State held on individual servers, such as sessions, is not shared, so some other mechanism is needed to synchronize it.

The second distributes by IP
a. The ip_hash algorithm maps each IP to one server, which avoids the session synchronization problem.
b. The disadvantage of ip_hash is that if one server goes down, the users mapped to that server are the ones who suffer.
c. ip_hash can easily lead to unbalanced load. For example, Google search is often filtered and Google becomes unreachable for a while, so many of its users go through overseas proxies. Everyone behind the same proxy shares the proxy's IP and is assigned to the same server, which leads to unbalanced load or even failure.
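
The two algorithms can be compared with a minimal Python sketch. The server names and IP addresses are made up for illustration, and the hash used here is an assumption chosen for simplicity, not the function that nginx's ip_hash actually uses.

```python
import hashlib
from itertools import cycle

SERVERS = ["web-a", "web-b", "web-c"]  # hypothetical backend names

def make_round_robin(servers):
    """Return a picker that hands out servers in turn, one per request."""
    ring = cycle(servers)
    return lambda client_ip: next(ring)

def pick_by_ip_hash(client_ip, servers):
    """Map a client IP to a fixed server by hashing the address."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:4], "big") % len(servers)]

round_robin = make_round_robin(SERVERS)

# By request count: the same client lands on a different server each time.
rr_picks = [round_robin("203.0.113.7") for _ in range(3)]

# By IP: the same client always lands on the same server.
hash_picks = [pick_by_ip_hash("203.0.113.7", SERVERS) for _ in range(3)]

# Many users behind one proxy share its IP, so they all pile onto one server.
proxy_picks = {pick_by_ip_hash("198.51.100.20", SERVERS) for _user in range(1000)}
```

Here rr_picks walks through all three servers, while hash_picks and proxy_picks each contain only one server, which is both the strength and the weakness of hashing by IP.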

Two, what is session retention and what does it do?

Session retention (also called session persistence or sticky sessions) is a mechanism on the load balancer which ensures that, while load is being balanced, requests from the same user are always sent to the same server.

What is session retention for? For example, suppose a user's request is assigned to server A and the user logs in there. A moment later the user sends another request. Without session retention, that request may well be assigned to server B, where the user is not logged in, so the user has to log in again. The user has no idea where each request is sent; all they see is "I already logged in, why do I have to log in again?", which is a very bad experience.
Likewise, when you buy something on Taobao, the steps log in => pick items => add address => pay form a single flow. The whole flow should be handled by one server and must not be scattered across different servers by the load balancer.

Session retention has a time limit (except when clients are mapped to a fixed server, as with ip_hash). Load balancing tools such as LVS and Apache provide a setting for this retention time. PHP has a related setting as well, session.gc_maxlifetime, which controls the session lifetime. Setting the retention time longer than the session lifetime reduces how often sessions need to be synchronized, but it cannot eliminate the need. So session synchronization still has to be done.
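
To make the time limit concrete, here is a minimal Python sketch of a retention table. The class name, the 300-second timeout, and the injectable clock are all assumptions for illustration; real load balancers implement this in their own ways.

```python
import time

class StickyTable:
    """Remembers which server each client was sent to, for a limited time."""

    def __init__(self, servers, timeout_seconds, clock=time.monotonic):
        self.servers = servers
        self.timeout = timeout_seconds
        self.clock = clock          # injectable so the example can fake time
        self.entries = {}           # client_id -> (server, last_seen)
        self.next_index = 0

    def route(self, client_id):
        now = self.clock()
        entry = self.entries.get(client_id)
        if entry is not None and now - entry[1] <= self.timeout:
            server = entry[0]       # still within the retention window
        else:
            # No entry, or it expired: fall back to taking servers in turn.
            server = self.servers[self.next_index % len(self.servers)]
            self.next_index += 1
        self.entries[client_id] = (server, now)
        return server

fake_now = [0.0]
table = StickyTable(["web-a", "web-b"], timeout_seconds=300, clock=lambda: fake_now[0])

first = table.route("alice")      # new client, assigned in turn
fake_now[0] = 100.0
second = table.route("alice")     # within 300 s: same server as before
fake_now[0] = 1000.0
third = table.route("alice")      # idle for 900 s: entry expired, reassigned
```

The third request shows why retention alone is not enough: once the entry expires, the user can land on a server that has never seen their session.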

Three, session synchronization

Why sessions need to be synchronized was already covered in the discussion of session retention. For the specific methods, see the three methods of session synchronization in a web cluster.

3 methods of session synchronization in a web cluster

After building a web cluster, session synchronization is the first issue you will have to consider. Once load balancing is in place, requests from the same IP for the same page may be sent to different servers. If sessions are not synchronized, a logged-in user will appear logged in one moment and logged out the next. This article therefore gives three different ways to solve the problem:

1. Use a database to synchronize sessions
I did not use this method when I did multi-server session synchronization. If I had to use it, I think there are two ways to do it:
a. Use a low-end machine to host a database dedicated to the web servers' sessions, or put this dedicated database on the file server. When a user visits a web server, the server checks the session in this dedicated database, which keeps sessions synchronized.
b. Put the session table alongside the other tables in the main database. If MySQL is also clustered, every MySQL node must have this table, and its data must be replicated in real time.
Note: Synchronizing sessions through the database adds load to the database, which is already a likely bottleneck; putting sessions there as well makes things worse. Of the two approaches above, the first is better: keeping the session table separate reduces the load on the main database.
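
Approach a can be sketched in Python as follows. This is only an illustration under stated assumptions: an in-memory SQLite database stands in for the dedicated session database, and the table layout, class name, and session id are hypothetical. The default lifetime of 1440 seconds mirrors PHP's default for session.gc_maxlifetime.

```python
import json
import sqlite3
import time

class DbSessionStore:
    """Session store backed by one shared table (hypothetical schema)."""

    def __init__(self, connection):
        self.db = connection
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS sessions ("
            "session_id TEXT PRIMARY KEY, data TEXT NOT NULL, expires_at REAL NOT NULL)"
        )

    def save(self, session_id, data, lifetime_seconds=1440):
        expires_at = time.time() + lifetime_seconds
        self.db.execute(
            "INSERT OR REPLACE INTO sessions VALUES (?, ?, ?)",
            (session_id, json.dumps(data), expires_at),
        )
        self.db.commit()

    def load(self, session_id):
        # Expired rows are ignored, so a stale session reads as missing.
        row = self.db.execute(
            "SELECT data FROM sessions WHERE session_id = ? AND expires_at > ?",
            (session_id, time.time()),
        ).fetchone()
        return json.loads(row[0]) if row else None

# Both "web servers" share one connection here; in a cluster each would
# connect to the same dedicated session database over the network.
shared = sqlite3.connect(":memory:")
web_a = DbSessionStore(shared)
web_b = DbSessionStore(shared)

web_a.save("sess-123", {"user": "alice", "logged_in": True})
seen_by_b = web_b.load("sess-123")
```

Every page view costs at least one query against this table, which is exactly the extra database load the note above warns about.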

2. Use cookies to synchronize sessions
A session is stored on the server as a file, and a cookie is stored on the client as a file. How can they be used for synchronization? The method is simple: put the session generated when the user visits a page into a cookie, using the cookie as a relay. You visit web server A, which creates a session and puts it in the cookie. Your next request is assigned to web server B. Web server B first checks whether it has the session itself. If not, it checks whether the client's cookie contains it. If the cookie does not contain it either, the session really does not exist. If the cookie does contain it, web server B copies the session from the cookie to itself, and the session is synchronized.

Note: This method is simple and convenient to implement and adds no load to the database. However, if the client disables cookies, the session cannot be synchronized, which costs the website users. Cookies are also not very secure: even when encrypted, they can still be forged.
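
The cookie relay can be sketched in Python like this. The HMAC signature is an addition of this sketch, not part of the method as described: it is one common way to make a forged cookie detectable, assuming every web server holds the same secret key. The key, class, and session id are hypothetical.

```python
import base64
import hashlib
import hmac
import json

SECRET_KEY = b"shared-by-all-web-servers"   # hypothetical; every server needs the same key

def encode_cookie(session):
    """Serialize session data and append an HMAC so tampering is detectable."""
    payload = base64.urlsafe_b64encode(json.dumps(session).encode("utf-8"))
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return payload.decode("ascii") + "." + signature

def decode_cookie(cookie):
    """Return the session data, or None if the signature does not match."""
    payload, _, signature = cookie.rpartition(".")
    expected = hmac.new(SECRET_KEY, payload.encode("ascii"), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        return None
    return json.loads(base64.urlsafe_b64decode(payload))

class WebServer:
    def __init__(self):
        self.local_sessions = {}   # session_id -> session data

    def get_session(self, session_id, cookie):
        session = self.local_sessions.get(session_id)
        if session is None and cookie:
            session = decode_cookie(cookie)        # fall back to the client's cookie
            if session is not None:
                self.local_sessions[session_id] = session
        return session

server_a, server_b = WebServer(), WebServer()
server_a.local_sessions["sess-123"] = {"user": "alice"}
cookie = encode_cookie(server_a.local_sessions["sess-123"])

# Server B has no local copy, so it restores the session from the cookie.
restored = server_b.get_session("sess-123", cookie)

# A cookie whose signature has been altered is rejected.
tampered = cookie[:-1] + ("0" if cookie[-1] != "0" else "1")
forged = WebServer().get_session("sess-123", tampered)
```

Signing only detects tampering. The session data is still readable by the client, so nothing secret should be stored in the cookie.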

3. Use memcache to synchronize sessions
Memcache can be distributed; without that capability it could not be used for session synchronization. It can combine the memory of the web servers into one "memory pool". Whichever server generates a session can put it into this pool, and all the other servers can use it.
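
The "memory pool" idea can be illustrated with a small Python sketch. Plain dictionaries stand in for memcache nodes so the example runs without a server, and the node names and the hash-based placement are assumptions; a real memcache client library decides for itself how keys are spread over nodes.

```python
import hashlib

class SessionPool:
    """Spreads sessions over several cache nodes by hashing the session id.

    Plain dicts stand in for memcache nodes so the sketch runs without a server.
    """

    def __init__(self, node_names):
        self.nodes = {name: {} for name in node_names}
        self.names = sorted(node_names)

    def _node_for(self, session_id):
        digest = hashlib.sha256(session_id.encode("utf-8")).digest()
        name = self.names[int.from_bytes(digest[:4], "big") % len(self.names)]
        return self.nodes[name]

    def set(self, session_id, data):
        self._node_for(session_id)[session_id] = data

    def get(self, session_id):
        return self._node_for(session_id).get(session_id)

pool = SessionPool(["cache-1", "cache-2", "cache-3"])
pool.set("sess-123", {"user": "alice"})   # written while web server A handles a request
seen_by_b = pool.get("sess-123")          # read while web server B handles a request
stored_copies = sum("sess-123" in node for node in pool.nodes.values())
```

Each session lives on exactly one node, yet any web server can find it, because every server computes the same node from the same session id.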

Advantages: Synchronizing sessions this way adds no load to the database, and it is much more secure than using cookies. Reading sessions from memory is also much faster than reading them from files.
Disadvantages: memcache divides memory into storage blocks of several fixed sizes. Because an item rarely fills its block exactly, memcache cannot use memory fully, and memory fragmentation results. If it runs short of storage blocks, it runs out of room and sessions can be lost.
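
The wasted space can be estimated with a short Python sketch. The starting block size of 96 bytes, the growth factor of 1.25, and the rounding to multiples of 8 are assumptions for illustration; the actual sizes depend on the memcache version and its startup options.

```python
def chunk_sizes(smallest=96, factor=1.25, largest=1024 * 1024):
    """List the fixed block sizes, each about `factor` times the previous one."""
    sizes = []
    size = smallest
    while size < largest:
        sizes.append(size)
        size = -(-int(size * factor) // 8) * 8  # grow, then round up to a multiple of 8
    sizes.append(largest)
    return sizes

def wasted_bytes(item_size, sizes):
    """Bytes left unused when an item goes into the smallest block that fits."""
    chunk = next(s for s in sizes if s >= item_size)
    return chunk - item_size

SIZES = chunk_sizes()
waste_for_500 = wasted_bytes(500, SIZES)   # a 500-byte session lands in a 600-byte block
```

Under these assumptions a 500-byte session wastes 100 bytes, about a sixth of its block, and that overhead repeats for every session of similar size.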

Four, summary

All three methods above are feasible.
The first method has the biggest impact on system speed and is not recommended.
The second method works well, but it carries security risks.
The third method is, in my opinion, the best, and I recommend it to everyone.

Origin http://10.200.1.11:23101/article/api/json?id=327005211&siteId=291194637