Nginx Epoll Redis network

Nginx

Load balancing

The principle of nginx

Nginx uses a multi-process (single thread) & multiple IO multiplexing model

[Figure: nginx master process and worker processes]
1. After Nginx starts, there is one master process and several independent worker processes.
2. The master process receives signals from the outside world and sends signals to each worker process; any of the worker processes may end up handling a given connection.
3. The master process monitors the running state of the worker processes; when a worker process exits abnormally, a new worker process is started automatically.

  • The number of worker processes is generally set to the number of CPU cores on the machine, because extra workers only make the processes compete with each other for CPU and cause unnecessary context switches.
  • The multi-process model not only improves concurrency but also keeps the processes independent of each other: if one worker process hangs, the other worker processes are not affected.

Detailed Nginx process

After Nginx is started, there will be a master process and multiple worker processes

master process

Mainly used to manage the worker processes, including: receiving signals from the outside world, sending signals to each worker process, monitoring the running state of the worker processes, and automatically starting a new worker process when a worker process exits abnormally.

The master process acts as the interface between the whole process group and the user, and it also supervises the worker processes. It does not handle network events and is not responsible for executing business logic; by managing the worker processes it implements restarting the service, smooth (zero-downtime) upgrades, log file rotation, and applying configuration changes on the fly.

worker process

Basic network events are handled in the worker processes. The worker processes are peers: they compete equally for requests from clients, and each process is independent of the others. A request is processed in exactly one worker process, and a worker process cannot take over requests belonging to other processes.

When we provide an HTTP service on port 80 and a connection request arrives, every worker process could potentially handle it. How is that achieved? Each worker process is forked from the master process. The master process first creates the socket that needs to be listened on (listenfd) and then forks the worker processes. When a new connection arrives, the listenfd of every worker process becomes readable. To make sure only one process handles the connection, the worker processes grab the accept_mutex before registering the listenfd read event; only the process that grabs the lock registers the event and calls accept in its handler to accept the connection. Once a worker process has accepted the connection, it reads the request, parses it, processes it, generates the response, returns it to the client, and finally closes the connection. That is one complete request.
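
The flow above can be illustrated with a minimal pre-fork sketch. This is plain Python on a Unix-like system, not nginx's actual C implementation, and it omits accept_mutex and the event loop; the port and worker count are arbitrary:

import os
import socket

# The master creates and listens on the socket first
listenfd = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listenfd.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
listenfd.bind(("0.0.0.0", 8080))
listenfd.listen(128)

# Fork worker processes; each child inherits listenfd
for _ in range(4):
    if os.fork() == 0:                    # child = worker
        while True:
            conn, addr = listenfd.accept()   # workers compete for new connections
            conn.sendall(b"HTTP/1.0 200 OK\r\n\r\nhello\r\n")
            conn.close()
        os._exit(0)

# The master only supervises the workers
for _ in range(4):
    os.wait()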

How does Nginx achieve high concurrency

Asynchrony, non-blocking I/O, epoll, and a large amount of low-level optimization.
Nginx's asynchronous, non-blocking way of working makes use of the time that would otherwise be spent waiting: whenever a request has to wait for something, the worker processes are free to serve other requests, so a small number of processes can handle a very large number of concurrent connections.

When a request comes in, one worker process handles it, but not from start to finish in one go. How far does it go? Up to the point where it might block, for example forwarding the request to an upstream (back-end) server and waiting for the reply. Instead of waiting, the worker registers an event: "when the upstream replies, notify me and I will continue", and then moves on. If another request arrives in the meantime, it is handled the same way. Once the upstream server replies, the event fires, the worker picks the request back up, and processing continues.

In this way, with multiple worker processes plus epoll, Nginx achieves high concurrency.

Nginx load balancing strategy

  • round-robin: the default method
  • weight: weighted distribution
  • ip_hash: distribute requests according to the client IP
  • least_conn: send requests to the server with the fewest connections (see the config sketches below)
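
A hedged illustration of the four strategies as upstream blocks (the upstream names and server addresses are made up):

upstream app_rr      { server 10.0.0.1:8080; server 10.0.0.2:8080; }                      # round-robin (default)
upstream app_weight  { server 10.0.0.1:8080 weight=3; server 10.0.0.2:8080 weight=1; }    # weight
upstream app_iphash  { ip_hash;    server 10.0.0.1:8080; server 10.0.0.2:8080; }          # ip_hash
upstream app_least   { least_conn; server 10.0.0.1:8080; server 10.0.0.2:8080; }          # least_conn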

Thundering herd ("shock group") phenomenon

The master process first creates a listening socket file descriptor with socket(), then forks the worker (child) processes. Each child inherits the parent's sockfd (socket file descriptor); after accept() a child gets its own connected descriptor and communicates with the client through it.
Because every child inherits the same sockfd, when a connection arrives all of the children are notified and "compete" to accept it; this is the thundering herd. Many processes are woken up and then suspended again while only one of them can actually accept() the connection, which of course wastes system resources.

How Nginx handles the thundering herd

Nginx provides accept_mutex, a shared lock around accept: each worker process must acquire the lock before registering the listenfd read event, and a worker that fails to get the lock gives up calling accept(). With this lock, only one process calls accept() at a time, so the thundering herd problem disappears. accept_mutex is a configurable option: it is on by default and can be turned off explicitly.

nginx configuration

nginx.conf configuration file example

#user administrator administrators;  # run workers as this user/group; default is nobody nobody
#worker_processes 2;  # number of worker processes to spawn; default is 1
#pid /nginx/pid/nginx.pid;   # where the nginx master pid file is stored
error_log log/error.log debug;  # error log path and level; allowed in the main, http, and server contexts; levels: debug|info|notice|warn|error|crit|alert|emerg
events {
    accept_mutex on;   # serialize accept() across workers to prevent the thundering herd; default on
    multi_accept on;  # whether a worker accepts multiple new connections at once; default off
    #use epoll;      # event model: select|poll|kqueue|epoll|resig|/dev/poll|eventport
    worker_connections  1024;    # maximum connections per worker; default 512
}
http {
    include       mime.types;   # map of file extensions to MIME types
    default_type  application/octet-stream; # default MIME type; default is text/plain
    #access_log off; # disable the access log
    log_format myFormat '$remote_addr–$remote_user [$time_local] $request $status $body_bytes_sent $http_referer $http_user_agent $http_x_forwarded_for'; # custom log format
    access_log log/access.log myFormat;  # "combined" is the default log format
    sendfile on;   # allow sendfile() for file transfers; default off; allowed in http, server, and location blocks
    sendfile_max_chunk 100k;  # maximum bytes transferred per sendfile() call per worker; default 0 (no limit)
    keepalive_timeout 65;  # keep-alive timeout; default 75s; allowed in http, server, and location blocks
    upstream mysvr {   
      server 127.0.0.1:7878;
      server 192.168.10.121:3333 backup;  # hot standby
    }
    error_page 404 https://www.baidu.com; # error page
    server {
        keepalive_requests 120; # maximum requests per keep-alive connection
        listen       4545;   # listening port
        server_name  127.0.0.1;   # listening address
        location  ~*^.+$ {       # regex URL match; ~ is case sensitive, ~* is case insensitive
           #root path;  # document root
           #index vv.txt;  # default index page
           proxy_pass  http://mysvr;  # forward requests to the servers defined in upstream mysvr
           deny 127.0.0.1;  # denied IP
           allow 172.18.5.54; # allowed IP
        } 
    }
}

location match

A matched URL can be intercepted and, via proxy_pass and related directives, forwarded to another address, including this machine.

Syntax

location [ = | ~ | ~* | ^~ ] uri { … }
=: exact match (the URI must be exactly equal)
~: regex match, case sensitive
~*: regex match, case insensitive
^~: matches only URIs that begin with the given prefix
@: named location for internal redirects

=, exact match
location = / {
    # rules
}
# matches only requests for http://www.example.com/ exactly
~, case sensitive
location ~ /Example/ {
        # rules
}
# request examples
# http://www.example.com/Example/  [match]
# http://www.example.com/example/  [no match]
~*, ignore case
location ~* /Example/ {
            # rules
}
# the case of the uri part is ignored
# http://www.example.com/Example/  [match]
# http://www.example.com/example/  [match]
^~, only matches requests beginning with the uri
location ^~ /img/ {
        # rules
}
# every request starting with /img/ matches
# http://www.example.com/img/a.jpg   [match]
# http://www.example.com/img/b.mp4   [match]
@, nginx internal jump
location /img/ {
    error_page 404 @img_err;
}

location @img_err {
    # rules
}
# for requests starting with /img/, if the response status is 404, the request is handed to the @img_err location

rewrite url

usage

The rewrite (URL rewriting) directive can appear under server{} or under location{}. A rewrite that appears directly under server{} executes before location matching; a rewrite inside location{} naturally executes after its location has matched, but because rewrite changes the URI of the HTTP request, the rewritten URI must then be matched against the locations again, just like a brand-new HTTP request.
Example:

location  /bbb.html {
            rewrite "^/bbb\.html$" /ccc.html;
}
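
For contrast, a hedged sketch of a server-level rewrite (the /old/ path is hypothetical); it runs before any location is matched, and the rewritten URI is then matched against the locations:

server {
    listen 80;
    rewrite ^/old/ /bbb.html last;   # executed before location matching

    location /bbb.html {
        rewrite "^/bbb\.html$" /ccc.html;
    }
}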

upstream

upstream backend {
    sticky;     # or simple round-robin
    server 172.29.88.226:8080 weight=2;
    server 172.29.88.226:8081 weight=1 max_fails=2 fail_timeout=30s ;
    server 172.29.88.227:8080 weight=1 max_fails=2 fail_timeout=30s ;
    server 172.29.88.227:8081;
    check interval=5000 rise=2 fall=3 timeout=1000 type=http;
    check_http_send "HEAD / HTTP/1.0\r\n\r\n";
    check_http_expect_alive http_2xx http_3xx;
}
server {
    location / {
        proxy_pass http://backend;
    }
    location /status {
        check_status;
        access_log   off;
        allow 172.29.73.23;
        deny all;
    }
}

The check directive can only appear inside an upstream block. It detects unhealthy back-end servers so that subsequent requests are not forwarded to them:

  • interval: The interval of health check packets sent to the backend.
  • fall: If the number of consecutive failures reaches fall_count, the server is considered down.
  • rise: If the number of consecutive successes reaches rise_count, the server is considered up.
  • timeout: The timeout period of the back-end health request.
  • type: the type of health check packet. The following types are supported:
    tcp: a simple TCP connection; if the connection succeeds, the backend is considered healthy.
    http: send an HTTP request and decide whether the backend is alive from the status of the response it sends back.
    ajp: send an AJP-protocol Cping packet to the backend and decide it is alive if a Cpong packet is received.
    ssl_hello: send an initial SSL hello packet and expect the server's SSL hello packet in return.
    mysql: connect to the MySQL server and decide whether the backend is alive from the greeting packet it sends.
    fastcgi: send a FastCGI request and decide whether the backend is alive by receiving and parsing the FastCGI response.

If the type is http, check_http_send configures the request that the health check sends. To keep the amount of transferred data small, the HEAD method is recommended. When a long (keep-alive) connection is used for the health check, the keep-alive request header must be added to this directive, for example: HEAD / HTTP/1.1\r\nConnection: keep-alive\r\n\r\n. When the GET method is used, the request URI should not be too large, so that the transfer can finish within one interval; otherwise the health check module will treat it as a backend or network failure.

check_http_expect_alive specifies which HTTP response statuses count as success; by default, 2xx and 3xx statuses are considered healthy.
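
A hedged keep-alive variant of the health check above (note that the check* directives come from the third-party nginx_upstream_check_module / Tengine, not stock nginx; the addresses are made up):

upstream backend_keepalive {
    server 10.0.0.1:8080;
    server 10.0.0.2:8080;
    check interval=3000 rise=2 fall=3 timeout=1000 type=http;
    check_http_send "HEAD / HTTP/1.1\r\nConnection: keep-alive\r\nHost: backend\r\n\r\n";
    check_http_expect_alive http_2xx http_3xx;
}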

Common optimization configurations of Nginx

  1. worker_processes
    The number of worker processes Nginx spawns. The common best practice is to run one worker process per CPU core.
  2. worker_connections
    The maximum number of clients each worker can serve at the same time. Combined with worker_processes it gives the maximum number of clients that can be served: max clients = worker_processes * worker_connections. To get the most out of Nginx, worker_connections is typically set to 1024.
  3. Enable caching for static files
    Caching static files reduces bandwidth and improves performance. The following directive tells clients to cache static assets:
location ~* \.(jpg|jpeg|png|gif|ico|css|js)$ {
    expires 365d;
}
  4. Timeouts
    Keep-alive connections reduce the CPU and network overhead of opening and closing connections; the related timeouts are worth tuning for best performance.
  5. access_log
    The access log records every request Nginx handles, which costs CPU and disk I/O and reduces performance; disable it if it is not needed. (The points above are combined into a sketch below.)
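
A hedged sketch combining the tuning points above (the values are illustrative, not recommendations):

worker_processes auto;          # roughly one worker per CPU core
events {
    worker_connections 1024;    # maximum simultaneous connections per worker
}
http {
    keepalive_timeout 65;       # reuse connections instead of reopening them
    access_log off;             # drop access logging if the I/O cost matters more than the logs
    server {
        listen 80;
        location ~* \.(jpg|jpeg|png|gif|ico|css|js)$ {
            expires 365d;       # let clients cache static assets
        }
    }
}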

proxy_set_header

proxy_set_header is used to redefine the request header sent to the back-end server.
Syntax format:

proxy_set_header Field Value;

Value can contain text, variables, or a combination of them. Common settings:
proxy_set_header Host $proxy_host;  # the Host header is rewritten to the host of the proxied (upstream) address
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $remote_addr;
proxy_set_header Host $host;  # $host is the value of the "Host" request header when the request carries one; when it does not, it falls back to the primary server_name of the virtual host (similar to $http_host)
proxy_set_header Host $http_host;  # the Host header is passed through unchanged when forwarding

Case

Problem: this nginx hosts the aaa.example.com virtual host, and requests to http://aaa.example.com/api/xx/client/ need to be forwarded to http://bbb.example.com/api/xx/client/; the bbb.example.com virtual host is on another nginx.

So the following configuration was added:

location ~ ^/api/([0-9]+)(\.[0-9]+)*/client/ {
    proxy_pass http://bbb.example.com;
}

However, it reported 404.
Solution:

location ~ ^/api/([0-9]+)(\.[0-9]+)*/client/ {
    proxy_pass http://bbb.example.com;
    proxy_set_header Host $proxy_host;
}

Reason: the nginx config contained proxy_set_header Host $http_host;. With Host set to $http_host, the value of the Host request header is passed through unchanged, so when the request was forwarded to bbb.example.com it still carried Host: aaa.example.com, and the target nginx could not match the right virtual host, which caused the problem. With Host set to $proxy_host, the Host header is rewritten to bbb.example.com and the request is routed correctly.

Common commands

Start nginx: ./sbin/nginx
Stop nginx: ./sbin/nginx -s stop  or  ./sbin/nginx -s quit
Reload the configuration: ./sbin/nginx -s reload  or  service nginx reload
Check that the configuration file is correct: ./sbin/nginx -t
What are $1, $2, $3 in nginx? $1 represents the first parameter captured by the regular expression from the path, $2 the second, and so on.

Reference blog: http://www.ha97.com/5194.html

Epoll


select

int s = socket(AF_INET, SOCK_STREAM, 0);  
bind(s, ...)
listen(s, ...)

int fds[] = ...;  // the sockets to monitor

while(1){
    fd_set readfds;
    FD_ZERO(&readfds);
    for(int i = 0; i < nfds; i++)
        FD_SET(fds[i], &readfds);  // select needs the whole set rebuilt before every call

    select(maxfd + 1, &readfds, NULL, NULL, NULL);  // blocks until at least one fd is readable

    for(int i = 0; i < nfds; i++){
        if(FD_ISSET(fds[i], &readfds)){
            // handle data on fds[i]
        }
    }
}

[Figure: select adds the process to the waiting queue of every monitored socket]
The program monitors three sockets, sock1, sock2, and sock3, at the same time, as in the figure above. After select is called, the operating system adds process A to the waiting queue of each of these three sockets. When any of the sockets receives data, the interrupt handler wakes the process up.
When process A is woken, it is removed from all of the waiting queues and put back on the work queue. At that point it only knows that at least one socket has received data, so the program has to traverse the whole socket list once to find the sockets that are ready.

epoll

int s = socket(AF_INET, SOCK_STREAM, 0);   
bind(s, ...)
listen(s, ...)

int epfd = epoll_create(1);
epoll_ctl(epfd, EPOLL_CTL_ADD, ...);  // add every socket that should be monitored to epfd

struct epoll_event events[MAX_EVENTS];
while(1){
    int n = epoll_wait(epfd, events, MAX_EVENTS, -1);  // returns only the sockets that are ready
    for(int i = 0; i < n; i++){
        // handle data on events[i].data.fd
    }
}

[Figure: epoll places an eventpoll object between the sockets and the process]
The kernel maintains an eventpoll object which, like a socket, has its own waiting queue. When a socket is added for monitoring, the kernel puts a reference to the eventpoll object on that socket's waiting queue, and the process is added to the waiting queue of the eventpoll object instead.
When a socket receives data, the interrupt handler adds a reference to that socket to eventpoll's ready list (rdlist) and wakes up the process waiting on eventpoll, so process A becomes runnable again (as shown in the figure below). Thanks to rdlist, process A knows exactly which sockets have changed.
The eventpoll object acts as an intermediary between the sockets and the process: receiving data on a socket does not affect the process directly; instead it changes the state of the process by changing eventpoll's ready list.
[Figure: when data arrives, the socket is added to eventpoll's ready list and the waiting process is woken up]

Redis

Cache penetration

Cache penetration refers to requests for data that exists neither in the cache nor in the database, so every such request falls through to the database. Querying for data that does not exist anywhere is called cache penetration.

Solution

  1. Cache empty values and set an expiration time for them (see the sketch below)
  2. Add validation at the API layer, such as authentication and basic checks on the id; for example, reject requests with id <= 0 outright
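
A minimal sketch of solution 1, assuming redis-py; query_db is a hypothetical stand-in for the real database lookup, not part of the original text:

import redis

r = redis.Redis()

def query_db(user_id):
    # hypothetical database lookup; returns None when the row does not exist
    return None

def get_user(user_id):
    if user_id <= 0:                       # basic validation at the API layer (solution 2)
        return None
    key = f"user:{user_id}"
    cached = r.get(key)
    if cached is not None:
        return None if cached == b"" else cached
    row = query_db(user_id)
    if row is None:
        r.setex(key, 60, "")               # cache the empty value with a short TTL (solution 1)
        return None
    r.setex(key, 3600, row)
    return row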

Cache breakdown

In an ordinary high-concurrency system, when a large number of requests query the same key at the exact moment that key expires, they all miss the cache and hit the database. This phenomenon is called cache breakdown.
It causes a sudden flood of database requests at that instant, and the load spikes sharply.

Solution

  1. Set hot data to never expire
  2. Add a mutex (see the sketch below)
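
A sketch of the mutex approach (solution 2), assuming redis-py; rebuild_cache is a hypothetical helper that refills the cache from the database:

import time
import redis

r = redis.Redis()

def rebuild_cache(key):
    # hypothetical: query the database and refill the cache
    value = "fresh value from the database"
    r.setex(key, 3600, value)
    return value

def get_hot(key):
    while True:
        val = r.get(key)
        if val is not None:
            return val
        # the key has just expired: only one caller wins the lock and rebuilds the cache
        if r.set("lock:" + key, "1", nx=True, ex=10):
            try:
                return rebuild_cache(key)
            finally:
                r.delete("lock:" + key)
        time.sleep(0.05)                   # everyone else waits briefly and retries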

Cache avalanche

Cache avalanche refers to a large amount of cached data reaching its expiration time at once, so that the flood of queries falls on the database and overloads it or even brings it down. It differs from cache breakdown: cache breakdown is many concurrent queries for the same piece of data, while a cache avalanche is many different keys expiring together, so many lookups miss the cache and go to the database.

Solution

  1. Randomize the expiration time of cached data so that a large amount of data does not expire at the same moment (see the sketch below)
  2. If the cache is deployed as a distributed cluster, spread the hot data evenly across the different cache nodes
  3. Set hot data to never expire
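
A sketch of solution 1, assuming redis-py; the base TTL and jitter window are arbitrary:

import random
import redis

r = redis.Redis()

def cache_set(key, value, base_ttl=3600):
    ttl = base_ttl + random.randint(0, 600)   # spread expirations over an extra 0-10 minutes
    r.setex(key, ttl, value)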

Network

Three-way handshake and four-way wave

[Figure: TCP three-way handshake and four-way close]
First, the client sends a connection request (SYN). The server, accepting the connection, replies with SYN+ACK and allocates resources for the connection. After receiving the SYN+ACK, the client sends an ACK back to the server and allocates its own resources, and with that the TCP connection is established.

Assume the client initiates the close by sending a FIN, meaning "I have no more data to send you." The server may still have data it has not sent, so it does not close the socket in a hurry; it first replies with an ACK: "I received your request, but I am not ready yet, please keep waiting for my message." The client then enters the FIN_WAIT state and waits for the server's FIN. When the server has finished sending its data, it sends its own FIN to the client: "OK, I have finished sending, I am ready to close the connection." After receiving the FIN, the client knows the connection can be closed, but it still does not fully trust the network: the server cannot know whether its FIN was acknowledged, so after sending the ACK the client enters TIME_WAIT, ready to resend the ACK if the server never receives it. Once the server receives the ACK, it knows the connection can be torn down. The client waits 2MSL; if nothing more arrives, it concludes the server has closed normally and closes its side as well. That is how a TCP connection is closed.
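
In short, the standard message sequence is:

Three-way handshake (open)           Four-way wave (close)
client  -- SYN -->      server       client  -- FIN -->  server
client  <-- SYN+ACK --  server       client  <-- ACK --  server
client  -- ACK -->      server       client  <-- FIN --  server
                                     client  -- ACK -->  server   (client then waits 2MSL)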


Source: blog.csdn.net/modouwu/article/details/113079858