Internet Service Architecture Sharing (1)

This note is mainly about the access layer and is based on our actual project. Our experience is limited and there may be mistakes, so please point them out. Notes on the logic layer and storage layer will follow in a later post.


First of all, what are our goals?

  • Guarantee 24/7 (7*24) availability of the system
  • Guarantee user access response times
  • Ensure system security
  • Provide a unified access layer to standardize application deployment

  Generally speaking, Nginx alone is enough, but once traffic reaches a certain scale that architecture becomes unstable: user response times get too long, and the service occasionally goes down. To improve availability and overall throughput, an access layer is introduced with LVS layer-4 load balancing: DNS resolves the domain to the IP address of LVS, LVS forwards to Nginx, and Nginx finally proxies to the back-end real servers (RS). As shown in the figure:

Many companies start with a single-point service, where one service does everything -> later they isolate services and introduce Nginx to ensure high availability -> then move to Nginx/LVS/F5/HAProxy for load balancing, plus traffic splitting, peak shaving and valley filling, and hotspot caching; these topics will be covered at length in the next note.
Now let's look at the next diagram:

  After seeing this diagram you may be at a loss: why do it this way? LVS + HAProxy + Nginx seem to be able to do the same job, and all three of the most widely used load balancers are in play at once. In fact, this access-layer architecture was iterated on by the operations team according to the company's specific business traffic.

  First look at the access process:

  1. The user types the domain name into the browser and presses Enter.
  2. The browser first checks its local caches for an IP address matching the domain (browser cache plus OS cache; in Google Chrome you can open chrome://net-internals/#dns to see the cached entries, and the local override lives in the hosts file).
  3. If there is no cached entry, the request goes to DNS for resolution; the authoritative DNS hands resolution over, via a CNAME record, to the CDN's dedicated DNS server.
  4. To obtain the actual IP, the browser sends the content URL request to the CDN's global load-balancing device.
  5. The global load-balancing DNS returns the IP of the node closest to the user at that moment, based on the user's IP (URL-related policies configured in the CDN are not considered here).
  6. The browser gets the IP, makes the request, and the traffic arrives at the LVS layer (see the sketch after this list).
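  The resolution part of this flow can be observed from the client side with a few lines of Python. This is only a minimal sketch, assuming a placeholder domain www.example.com; the point is just that whatever the caches, CNAME, and CDN decide, the browser ends up with a plain IP to connect to (which, past the CDN, ultimately leads to the LVS VIP).

```python
# Minimal client-side view of DNS resolution. "www.example.com" is a
# placeholder domain; with a CDN CNAME in place, the addresses returned here
# would be the nearest edge node at that moment.
import socket

def resolve(domain: str) -> list[str]:
    """Return the unique addresses the local resolver hands back for `domain`."""
    infos = socket.getaddrinfo(domain, 80, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})

if __name__ == "__main__":
    for ip in resolve("www.example.com"):
        print(ip)
```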

  We know that LVS works at layer 4, while HAProxy works at both layer 4 and layer 7 and provides a comprehensive load-balancing solution for TCP and HTTP applications; here we use it at layer 7, and Nginx at layer 7 as well. For their detailed characteristics you can search for yourself; I won't go into them here. Layer-4 load balancing is based on the layer-3 IP address (the VIP) plus the layer-4 port number. In our case LVS really only provides high availability rather than load balancing; the real balancing is done by HAProxy. What LVS does do is distribution, that is, traffic splitting: different businesses are routed to different services (a toy sketch of this idea follows below). When traffic grows later, LVS can take on load balancing too, and it scales out easily, so the design has strong scalability. One thing to realize is that in this setup the response to a request does not travel back through LVS. The project implements LVS/DR plus Keepalived for dual-machine hot standby.
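  To make layer-4 shunting concrete, here is a toy Python sketch. All addresses, ports, and pools are made-up assumptions, and real LVS does this in the kernel (configured via ipvsadm) rather than in application code; the only point is that at layer 4 the routing key is just (VIP, port), not URLs or headers.

```python
# Toy illustration of layer-4 shunting: different businesses are told apart by
# the VIP/port they hit. All addresses and pools are made-up; real LVS does
# this in the kernel, not in application code.
import itertools
from typing import Dict, List, Tuple

POOLS: Dict[Tuple[str, int], List[str]] = {
    ("10.0.0.100", 80):   ["192.168.1.11:80", "192.168.1.12:80"],     # web business
    ("10.0.0.100", 8080): ["192.168.1.21:8080", "192.168.1.22:8080"], # api business
}
_cursors = {key: itertools.cycle(pool) for key, pool in POOLS.items()}

def pick_real_server(vip: str, port: int) -> str:
    """Choose a downstream server for the given virtual (IP, port) pair."""
    return next(_cursors[(vip, port)])

print(pick_real_server("10.0.0.100", 80))    # rotates through the web pool
print(pick_real_server("10.0.0.100", 8080))  # rotates through the api pool
```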

  HAProxy implements the load balancing. The strategy is static-rr, which polls backends according to their weights (a rough sketch follows below); HAProxy also handles HTTP reverse proxying, connection rejection settings, virtual hosts, and session persistence. Around its monitoring service we built an alerting mechanism: when a service is abnormal or goes down, SMS and email notifications are sent to the operations team. The same is true for the business services, which have an even more robust monitoring system. In terms of real-world efficiency it is considerably better than Nginx, and its stability is close to F5's.
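  As a rough illustration of the weighted polling idea behind static-rr, here is a naive sketch. The backend names and weights are purely hypothetical, and HAProxy's actual scheduler is more refined than this simple expansion; over a full cycle, though, the effect is the same: each backend is picked in proportion to its weight.

```python
# Naive weighted round-robin in the spirit of static-rr: over one full cycle
# each backend is chosen proportionally to its weight. Names and weights are
# hypothetical; HAProxy's real static-rr ordering differs in detail.
from typing import List, Tuple

def build_rotation(backends: List[Tuple[str, int]]) -> List[str]:
    """Expand (name, weight) pairs into a fixed rotation list."""
    rotation: List[str] = []
    for name, weight in backends:
        rotation.extend([name] * weight)
    return rotation

rotation = build_rotation([("nginx-a", 3), ("nginx-b", 1)])
for i in range(8):
    print(rotation[i % len(rotation)])  # nginx-a appears 3x as often as nginx-b
```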

  On the Nginx side, each Nginx instance fronts one Tomcat service, with Nginx and Tomcat on the same server, and every service runs two such pairs that back each other up to improve availability. The specific functions are: caching, to raise the hit rate and reduce back-to-origin requests; interface access filtering, since every interface carries information such as a version number and a token (a sketch of this check follows below); and logging, including access logs, whose importance is self-evident in this age of information explosion, and error logs, which when read properly not only expose performance bottlenecks but also surface problems in time, plus log segmentation. Here you can build an ELK stack directly to query and display the logs in a friendly way.
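  The interface access filtering boils down to a simple check. The header names, version whitelist, and token rule in this sketch are assumptions for illustration only; in our setup the real check lives at the Nginx layer, not in Python.

```python
# Hedged sketch of interface access filtering: every API request is expected to
# carry a version number and a token, and anything missing or unknown is
# rejected before it reaches Tomcat. Header names and rules are illustrative.
from typing import Mapping

ALLOWED_VERSIONS = {"1.0", "1.1"}
TOKEN_LENGTH = 32  # e.g. an MD5-style hex token

def allow_request(headers: Mapping[str, str]) -> bool:
    """Return True only if the request carries a known version and a well-formed token."""
    version = headers.get("X-Api-Version", "")
    token = headers.get("X-Api-Token", "")
    return version in ALLOWED_VERSIONS and len(token) == TOKEN_LENGTH

print(allow_request({"X-Api-Version": "1.0", "X-Api-Token": "a" * 32}))  # True
print(allow_request({"X-Api-Version": "2.0", "X-Api-Token": ""}))        # False
```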

  The content above explains why we do it this way, not that you should do it this way; we also only use part of what these tools offer, and there may be things I have not understood well. A company's architecture should use different technologies according to the site's scale and stage. If you only have dozens of QPS, is it really necessary to use hardware load balancers such as F5, Radware, Array, or NetScaler? Even if you have the budget, you still need to start from reality; otherwise, what would be left to optimize later?

 
