Nginx Advanced

1. Load balancing

1. Overview of load balancing

Load balancing provides a cheap, effective, and transparent way to expand the bandwidth of network devices and servers on top of the existing network structure. It increases throughput, enhances network data processing capability, and improves network flexibility and availability. As the common description puts it, a load balancer acts as a "traffic commander": it stands in front of the servers and handles all requests between clients and servers, maximizing response speed and capacity utilization while ensuring that no single server is overloaded. If one server fails, the load balancer redirects traffic to the remaining servers in the cluster to keep the service stable; when a new server is added to the group, load balancing automatically starts sending it client requests.

The role of load balancing :

  • Solve the pressure of high concurrency on servers and improve application processing performance
  • Provide failover and achieve high availability
  • Enhance website scalability by adding or reducing the number of servers
  • Filtering on the load balancer can improve the security of the system

2. Processing method

2.1 User manual selection

This is a fairly primitive approach. The idea is to offer different lines or different server entry points on the site's home page and let users choose which specific server to visit, spreading the load by user choice.

For example, Lanzou Cloud (a file-hosting site) lets users pick among several download lines.

2.2 DNS polling

DNS (Domain Name System) is a distributed network directory service mainly used to translate between domain names and IP addresses.

Most domain name registrars support adding multiple A records to the same host name; this is DNS round-robin (DNS polling). The DNS server distributes resolution requests across the different IPs in the order of the A records, which achieves simple load balancing. DNS round-robin costs very little and is often used for less critical services.

Implementing round-robin with DNS requires little investment, but despite the low cost it has obvious drawbacks:

  1. Low reliability

    Major broadband providers cache large numbers of DNS records to save lookup time, so DNS changes are not propagated in real time.

    Windows command to flush the DNS cache: ipconfig /flushdns

  2. Unbalanced load balancing

    DNS knows nothing about server capacity or current load, so some servers end up lightly loaded while others are overloaded and respond slowly: a highly configured server may receive few requests while a low-end server receives many.
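The rotation itself can be sketched in a few lines of Python (the A records below are placeholder IPs); note that the sketch, like DNS, knows nothing about server load:

```python
from itertools import cycle

# Hypothetical A records for one host name (placeholder addresses)
a_records = ["192.0.2.10", "192.0.2.11", "192.0.2.12"]

def make_resolver(records):
    """Return a resolver that hands out the records in rotation,
    roughly how DNS round-robin spreads clients across servers."""
    rotation = cycle(records)
    return lambda: next(rotation)

resolve = make_resolver(a_records)
answers = [resolve() for _ in range(6)]
# Each record is handed out equally often, regardless of how busy the
# server behind it actually is -- the "unbalanced" weakness above.
```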

2.3 Four/seven layer load balancing

The OSI (Open Systems Interconnection) model is a network architecture standardized by ISO that is independent of any specific hardware, operating system, or protocol suite. It divides the work of network communication into seven layers.

  • Four-layer load balancing: works at the transport layer of the OSI model, balancing mainly on IP + port
  • Seven-layer load balancing: works at the application layer, balancing mainly on the virtual URL or host name

3. Seven-layer load balancing

Nginx implements seven-layer load balancing through the proxy_pass directive of the proxy module. The module is compiled in by default, so nothing extra needs to be installed. Nginx's load balancing builds on its reverse proxy: user requests are distributed, according to a chosen algorithm, across a group of backend servers (an upstream pool).

3.1 Seven-layer load balancing instructions
3.1.1 upstream

This directive defines a group of servers. They can listen on different ports, and servers listening on TCP and Unix sockets can be mixed. Each server can be given a weight, which defaults to 1.

Syntax    Default    Context
upstream name {…}    -    http
3.1.2 server

This directive specifies a backend server's address and parameters. The address can be a domain name, an IP (with optional port), or a Unix socket.

Syntax    Default    Context
server name [parameters]    -    upstream
3.2 Implementation process

Usage example:

server {
	listen		9001;
    server_name 	localhost;
    default_type 	text/html;
    location / {
    	return 200 "port=9001";
    }
}
server {
	listen		9002;
    server_name 	localhost;
    default_type 	text/html;
    location / {
    	return 200 "port=9002";
    }
}
upstream backend {
	server localhost:9001;
    server localhost:9002;
}
server {
	listen  	80;
    server_name localhost;
    location /	{
    	proxy_pass http://backend;  # backend is the name of the upstream group
    }
}
3.3 Load balancing status

The state parameters available for servers in an upstream group are:

Parameter    Description
down    the server temporarily does not participate in load balancing
backup    marks the server as a reserved backup
max_fails    the number of failed requests allowed
fail_timeout    how long the server is paused after max_fails failures
max_conns    limits the maximum number of concurrent connections
3.3.1 down

Marks the server as permanently unavailable; it will not participate in load balancing:

upstream backend {
	server localhost:9001 down;
    server localhost:9002;
}
server {
	listen  	80;
    server_name localhost;
    location /	{
    	proxy_pass http://backend;  # backend is the name of the upstream group
    }
}

This status is generally set for servers that need to be shut down for maintenance.

3.3.2 backup

Mark this server as a backup server that will be used to pass requests when the primary server is unavailable.

upstream backend {
	server localhost:9001 backup;  # backup server
    server localhost:9002 down;  
    server localhost:9003;  # primary server
}
server {
	listen  	80;
    server_name localhost;
    location /	{
    	proxy_pass http://backend;  # backend is the name of the upstream group
    }
}

To verify the backup behavior, we need to block access to port 9003, simulating downtime of the only server providing external service; the backup server then starts serving requests. For this test we intercept traffic with a firewall.

Here we use the ufw tool to control the firewall:

sudo apt install ufw  # install ufw
sudo ufw enable  # enable the firewall
sudo ufw deny 9003/tcp  # block port 9003
sudo ufw allow 9001:9002/tcp  # allow ports 9001-9002
sudo ufw allow http  # allow HTTP (port 80)

The commands above are for Ubuntu.

3.3.3 max_conns

max_conns=number: sets the maximum number of simultaneous active connections to the proxied server. The default is 0, meaning no limit. Set it according to the concurrency the backend can actually handle, so the backend is not overwhelmed.

3.3.4 max_fails & fail_timeout

max_fails=number: sets the number of failed requests to the proxied server before it is considered unavailable. The default is 1.

fail_timeout=time: sets both the window during which max_fails failures are counted and how long the server is paused after they occur. The default is 10 seconds.

upstream backend {
	server localhost:9001 backup;  # backup server
    server localhost:9002 down;  
    server localhost:9003 max_fails=3 fail_timeout=15;  # primary server
}
server {
	listen  	80;
    server_name localhost;
    location /	{
    	proxy_pass http://backend;  # backend is the name of the upstream group
    }
}
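The interplay of max_fails and fail_timeout amounts to a passive health check: after max_fails failures the peer is skipped, and after fail_timeout it is tried again. A simplified Python sketch with an injectable clock (the class name and bookkeeping are illustrative, not Nginx's internals):

```python
import time

class PeerHealth:
    """Passive health check in the spirit of max_fails/fail_timeout:
    after max_fails failures the peer is skipped for fail_timeout
    seconds. A simplified sketch, not Nginx's exact bookkeeping."""
    def __init__(self, max_fails=3, fail_timeout=15, clock=time.monotonic):
        self.max_fails = max_fails
        self.fail_timeout = fail_timeout
        self.clock = clock
        self.fails = 0
        self.checked_at = 0.0

    def record_failure(self):
        self.fails += 1
        self.checked_at = self.clock()

    def record_success(self):
        self.fails = 0

    def available(self):
        if self.fails < self.max_fails:
            return True
        if self.clock() - self.checked_at >= self.fail_timeout:
            self.fails = 0  # pause over: give the peer another chance
            return True
        return False

# Deterministic fake clock for demonstration
now = [0.0]
peer = PeerHealth(max_fails=3, fail_timeout=15, clock=lambda: now[0])
for _ in range(3):
    peer.record_failure()
down_now = peer.available()   # False: 3 failures inside the window
now[0] = 16.0
up_again = peer.available()   # True: fail_timeout has elapsed
```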
3.4 Load balancing strategy

Having covered the load-balancing directives, we can already distribute user requests across different servers. Besides the default method, which scheduling algorithms are available?

Algorithm    Description
round-robin    the default: requests distributed in turn
weight    weighted distribution
ip_hash    distribution by client IP
least_conn    distribution by least connections
url_hash    distribution by request URL
fair    distribution by response time
3.4.1 Round-robin (polling)

Round-robin is the default strategy of the upstream module: each request is dispatched to a different backend server in turn, in order of arrival. It needs no extra configuration.

upstream backend {
	server localhost:9001; 
    server localhost:9002;  
    server localhost:9003;  
}
server {
	listen  	80;
    server_name localhost;
    location /	{
    	proxy_pass http://backend;
    }
}
3.4.2 weight

weight=number: sets the server's weight; the default is 1. The larger the weight, the higher the probability that requests are assigned to the server. The weight is mainly tuned to match differing backend hardware, so this strategy suits environments where server specifications vary considerably.

upstream backend {
	server localhost:9001 weight=10;    
    server localhost:9002;  
    server localhost:9003; 
}
server {
	listen  	80;
    server_name localhost;
    location /	{
    	proxy_pass http://backend;
    }
}
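With weights, Nginx's scheduling is commonly described as "smooth" weighted round-robin, which interleaves picks instead of sending them in bursts. A Python sketch of that algorithm (the function name and server labels are mine):

```python
def smooth_wrr(servers, n):
    """Smooth weighted round-robin, the scheduling commonly attributed
    to Nginx's weighted round-robin. servers: {name: weight}."""
    current = {name: 0 for name in servers}
    total = sum(servers.values())
    picks = []
    for _ in range(n):
        # every peer gains its weight, the leader is picked and pays
        # back the total -- this is what spreads the heavy peer out
        for name, weight in servers.items():
            current[name] += weight
        best = max(current, key=current.get)
        current[best] -= total
        picks.append(best)
    return picks

# Weight-5 server takes 5 of every 7 slots, interleaved rather than bursty
sequence = smooth_wrr({"a": 5, "b": 1, "c": 1}, 7)
# sequence == ["a", "a", "b", "a", "c", "a", "a"]
```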
3.4.3 ip_hash

When several dynamic application servers sit behind the load balancer, the ip_hash directive pins requests from a given client IP to the same backend server via a hash algorithm. A user who logs in on backend web server A therefore keeps reaching server A when visiting other URLs of the site.

Syntax    Default    Context
ip_hash;    -    upstream
upstream backend {
    ip_hash;
	server localhost:9001;    
    server localhost:9002;  
    server localhost:9003; 
}
server {
	listen  	80;
    server_name localhost;
    location /	{
    	proxy_pass http://backend;
    }
}
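The per-client affinity can be sketched in Python. Per the Nginx docs, only the first three octets of an IPv4 address feed the hash, so a whole /24 lands on one backend; the md5 used here and the helper name ip_hash_pick are illustrative stand-ins, not Nginx's internal hash:

```python
import hashlib

def ip_hash_pick(client_ip, servers):
    """Sketch of ip_hash-style affinity: hash the first three octets
    of the IPv4 address (per the Nginx docs) so every client in the
    same /24 reaches the same backend. md5 is a stand-in hash."""
    key = ".".join(client_ip.split(".")[:3])
    digest = hashlib.md5(key.encode()).digest()
    return servers[int.from_bytes(digest[:4], "big") % len(servers)]

servers = ["localhost:9001", "localhost:9002", "localhost:9003"]
# Two clients in the same /24 always reach the same backend
same = ip_hash_pick("203.0.113.7", servers) == ip_hash_pick("203.0.113.99", servers)
```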
3.4.4 least_conn

Least connections: each request is forwarded to the backend with the fewest active connections. Round-robin spreads requests evenly so loads are roughly equal, but some requests take a long time and leave their backend heavily loaded; in that situation least_conn achieves a better balance.

upstream backend {
    least_conn;
	server localhost:9001;    
    server localhost:9002;  
    server localhost:9003; 
}
server {
	listen  	80;
    server_name localhost;
    location /	{
    	proxy_pass http://backend;
    }
}

This strategy suits workloads where widely varying request processing times would otherwise overload individual servers.
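The least-connections choice itself is simple to sketch (the helper name and connection counts are illustrative, and the weighted comparison is a simplification, not Nginx's exact arithmetic):

```python
def least_conn_pick(conns, weights=None):
    """Pick the backend with the fewest active connections; with
    weights, connections are compared relative to weight (a simplified
    take on weighted least-connections)."""
    weights = weights or {name: 1 for name in conns}
    return min(conns, key=lambda s: conns[s] / weights[s])

# Hypothetical live connection counts per backend
active = {"localhost:9001": 12, "localhost:9002": 3, "localhost:9003": 7}
choice = least_conn_pick(active)  # the backend with only 3 connections
```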

3.4.5 url_hash

Requests are distributed by a hash of the requested URL, so each URL always goes to the same backend server; this is usually combined with caching. Without it, repeated requests for the same resource may land on different servers, causing redundant downloads, poor cache hit rates, and wasted time. With URL hashing, identical resource requests reach the same server, and once the resource is cached there, later requests are served from the cache.

upstream backend {
    hash $request_uri;  # URL hashing in stock Nginx uses the hash directive (1.7.2+)
	server localhost:9001;    
    server localhost:9002;  
    server localhost:9003; 
}
server {
	listen  	80;
    server_name localhost;
    location /	{
    	proxy_pass http://backend;
    }
}
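The effect of URL hashing can be sketched as follows; md5 stands in for Nginx's internal hash function, and the helper name is mine:

```python
import hashlib

def url_hash_pick(uri, servers):
    """Map each request URI to a fixed backend -- the effect of
    `hash $request_uri;`. md5 is an illustrative stand-in for the
    real hash function."""
    digest = hashlib.md5(uri.encode()).digest()
    return servers[int.from_bytes(digest[:4], "big") % len(servers)]

servers = ["localhost:9001", "localhost:9002", "localhost:9003"]
first = url_hash_pick("/static/logo.png", servers)
second = url_hash_pick("/static/logo.png", servers)
# first == second: the same URL always reaches the same (cached) backend
```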
3.4.6 fair

fair does not use the built-in round-robin algorithm; it balances load intelligently based on page size and load time. So how do we use the fair strategy provided by the third-party module?

upstream backend {
    fair;
	server localhost:9001;    
    server localhost:9002;  
    server localhost:9003; 
}
server {
	listen  	80;
    server_name localhost;
    location /	{
    	proxy_pass http://backend;
    }
}

Used directly, however, it reports an error, because fair is implemented by a third-party module. We need to add nginx-upstream-fair. How?

  1. Download the nginx-upstream-fair module: https://github.com/gnosek/nginx-upstream-fair.git

    cd
    git clone https://github.com/gnosek/nginx-upstream-fair.git
    
  2. Use the ./configure command to add the module to the Nginx build

    nginx -V  # copy the original configure arguments
    cd /root/nginx-1.22.1/  # enter the Nginx source directory
    ./configure <original configure arguments> --add-module=/root/nginx-upstream-fair  # add the third-party module
    make  # compile
    mv objs/nginx /usr/local/nginx/sbin/  # replace the old Nginx binary with the newly built one
    make upgrade  # perform an online upgrade
    

    During compilation you will find that it fails. We need to edit:

    vim src/http/ngx_http_upstream.h
    

    Find the corresponding struct and add the missing declaration (the original post showed the exact addition in a screenshot, which is not reproduced here; a commonly reported fix is adding a default_port member to ngx_http_upstream_srv_conf_t), then compile again.

4. Four-layer load balancing

Nginx provides the stream module to implement forwarding, proxying, and load balancing for layer-4 protocols. Its usage is similar to http: configure a set of servers listening on TCP or UDP, forward requests with proxy_pass, and add multiple backends through upstream to achieve load balancing.

Layer-4 load balancing is usually implemented with LVS, HAProxy, F5, and the like, but Nginx's configuration is comparatively simple and quick to set up. Layer-4 load balancing is not used that often; a basic understanding is enough.

Nginx does not compile this module by default. To use the stream module, add --with-stream when compiling; the steps are similar to adding any other module.

4.1 Four-layer load balancing instructions
4.1.1 stream

This directive provides the configuration context in which stream server directives are specified; it sits at the same level as the http block.

Syntax    Default    Context
stream {…}    -    main
4.1.2 upstream

This directive is similar to http's upstream directive.

4.2 Usage examples
# Only one stream block is allowed, so both upstreams live in it
stream {
    upstream redisbackend {
    	server localhost:6379;  # first Redis server
        server localhost:6378;  # second Redis server
    }
    upstream flaskbackend {
    	server localhost:8080;  # Flask server
    }
    server {
    	listen 81;  
        proxy_pass redisbackend;  
    }
    server {
    	listen 82;
        proxy_pass flaskbackend;
    }
}
http {
	server {
        listen       80;
        server_name  localhost;

        charset koi8-r;

        access_log  logs/host.access.log  main;

        location / {
            root   html;
            index  index.html index.htm;
        }
        
        error_page   500 502 503 504  /50x.html;
        location = /50x.html {
            root   html;
        }
    }
}

Port 81 is load-balanced across the two Redis servers, and port 82 proxies to the Flask server.

2. Nginx cache integration

1. The concept of cache

A cache is a buffer for data exchange. When a user requests data, the cache is queried first: on a hit, the data is returned directly; on a miss, the request goes on to the server, the data is returned to the user, and a copy is placed in the cache, so that the next request is served straight from it.
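The read-through flow just described, as a minimal Python sketch (the names are mine):

```python
def make_cached_fetch(origin):
    """Read-through cache: serve from the cache on a hit; on a miss,
    fetch from the origin, store the result, then return it."""
    cache = {}
    stats = {"hits": 0, "misses": 0}
    def fetch(key):
        if key in cache:
            stats["hits"] += 1
            return cache[key]
        stats["misses"] += 1
        value = origin(key)   # go to the backend only on a miss
        cache[key] = value
        return value
    return fetch, stats

fetch, stats = make_cached_fetch(lambda k: f"content-of-{k}")
fetch("/index.html"); fetch("/index.html"); fetch("/app.js")
# stats == {"hits": 1, "misses": 2}
```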

Caching application scenarios:

Scenario    Effect
Operating system disk cache    fewer mechanical disk operations
Database cache    fewer file-system I/O operations
Application cache    fewer database queries
Web server cache    fewer requests to the application server
Browser cache    fewer round trips to the backend

Advantages of caching:

  1. Less data transfer: saves network bandwidth, speeds up responses, and improves the user experience
  2. Reduced server load
  3. Higher availability on the server side

Disadvantages of caching:

  1. Data can become inconsistent
  2. Increased cost

2. Cache related instructions

Nginx's web caching is implemented mainly through the directive set of the ngx_http_proxy_module module.

2.1 proxy_cache_path

This instruction is used to set the storage path of cache files.

Syntax    Default    Context
proxy_cache_path path [levels=number] keys_zone=zone_name:zone_size [inactive=time] [max_size=size];    -    http

path: the cache directory, e.g.: /usr/local/proxy_cache

levels: specifies the directory hierarchy of the cache. Up to three levels can be set, and each level's value is 1 or 2 (characters), e.g.:

levels=1:2  # two directory levels: the first is 1 character, the second is 2 characters
# a final storage path then looks like /usr/local/proxy_cache/d/07

keys_zone: sets the name and size of the shared-memory zone for this cache

inactive: cached data that is not accessed within the specified time is deleted

max_size: sets the maximum size of the cache on disk. When the cache fills up, the least recently used resources are evicted.
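How levels maps a key to a file can be sketched from the proxy_cache_path documentation: the cache key is MD5-hashed and the directory names are taken from the end of the hex digest. The helper below is an illustration, not Nginx's code:

```python
import hashlib, os

def cache_file_path(root, key, levels=(1, 2)):
    """Sketch of the proxy_cache_path layout: the cached file is named
    after the md5 of its key, and each directory level is a slice taken
    from the end of the hex digest (per the docs' levels=1:2 example)."""
    name = hashlib.md5(key.encode()).hexdigest()
    parts, pos = [], len(name)
    for width in levels:
        parts.append(name[pos - width:pos])  # slice from the end
        pos -= width
    return os.path.join(root, *parts, name)

# Hypothetical cache key; levels=1:2 gives <root>/<1 char>/<2 chars>/<md5>
path = cache_file_path("/usr/local/proxy_cache", "http://backend/icon.png")
```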

2.2 proxy_cache

This directive enables or disables proxy caching; when enabling, specify which cache zone to use.

Syntax    Default    Context
proxy_cache zone_name | off;    proxy_cache off;    http, server, location

zone_name: Specifies the name of the cache area to use

2.3 proxy_cache_key

This directive sets the key for the web cache; Nginx stores cache entries under the MD5 hash of this key.

Syntax    Default    Context
proxy_cache_key key;    proxy_cache_key $scheme$proxy_host$request_uri;    http, server, location
2.4 proxy_cache_valid

This directive is used to set different cache times for URLs with different return status codes.

Syntax    Default    Context
proxy_cache_valid [code …] time;    -    http, server, location
2.5 proxy_cache_min_uses

This directive sets how many times a resource must be requested before it is cached.

Syntax    Default    Context
proxy_cache_min_uses number;    proxy_cache_min_uses 1;    http, server, location
2.6 proxy_cache_methods

This directive sets which HTTP methods are cached.

Syntax    Default    Context
proxy_cache_methods GET | HEAD | POST;    proxy_cache_methods GET HEAD;    http, server, location

By default the GET and HEAD methods are cached; POST is not.

3. Clearing the Nginx cache

3.1 Delete the cache directory
rm -rf /usr/local/proxy_cache/...
3.2 Use an extension module

Use the ngx_cache_purge module:

  1. Download the source: https://github.com/FRiCKLE/ngx_cache_purge.git

    git clone https://github.com/FRiCKLE/ngx_cache_purge.git
    
  2. Install it

    nginx -V  # check the original configure arguments
    ./configure <original configure arguments> --add-module=/root/nginx-1.22.1/modules/ngx_cache_purge
    make  
    mv objs/nginx /usr/local/nginx/sbin/
    make upgrade
    
Syntax    Default    Context
proxy_cache_purge zone_name proxy_cache_key    -    http, server, location

For detailed usage, see: https://github.com/FRiCKLE/ngx_cache_purge

4. Excluding resources from the cache

Not all data is suitable for caching, for example data that changes frequently.

Two directives handle this:

  1. proxy_no_cache

    This directive defines the conditions under which a response will not be saved to the cache.

    Syntax    Default    Context
    proxy_no_cache string …;    -    http, server, location
    proxy_no_cache $cookie_nocache $arg_nocache $arg_comment;
    
    • $cookie_nocache

      the value of the request cookie named nocache

    • $arg_nocache

      the value of the request argument named nocache

    • $arg_comment

      the value of the request argument named comment

  2. proxy_cache_bypass

    This directive sets the conditions under which a response will not be taken from the cache.

    Syntax    Default    Context
    proxy_cache_bypass string …;    -    http, server, location
    proxy_cache_bypass $cookie_nocache $arg_nocache $arg_comment;

Both directives take one or more condition values; the condition is met when at least one value is non-empty and not equal to "0".
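The "at least one value non-empty and not 0" rule can be sketched as (helper name and sample values are mine):

```python
def condition_holds(*values):
    """proxy_no_cache / proxy_cache_bypass evaluation: the condition is
    met when at least one given value is non-empty and not "0"."""
    return any(v not in ("", "0") for v in values)

# e.g. the values of $cookie_nocache, $arg_nocache, $arg_comment
cached = not condition_holds("", "0", "")  # nothing set: may be cached
bypassed = condition_holds("", "1", "")    # one flag set: cache is skipped
```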

Usage example:

http{
    proxy_cache_path /usr/local/nginx/proxy_cache levels=2:1 keys_zone=temp:200m inactive=1d max_size=20g;  
    upstream backend {
    	server localhost:8080;
    }
    server {
        location / {
            set $nocache 0;  # default, avoids an uninitialized-variable warning
            # do not cache requests for .js files
            if ($request_uri ~ /.*\.js$) {
                set $nocache 1;
            }
            proxy_cache temp;  
            proxy_cache_key temp_;
            proxy_cache_min_uses 5;  
            proxy_cache_valid 200 5d;
            proxy_cache_valid 404 30s;
            proxy_cache_valid any 1m;
            proxy_no_cache $nocache;
            proxy_cache_bypass $nocache;
            proxy_pass http://backend/;
          }
        location ~ /purge(/.*) {
        	proxy_cache_purge temp temp_;
        }
    }
}

3. Serving static and dynamic content separately with Nginx

1. Concept

What is dynamic/static separation?

  • Dynamic: the business processing done by the backend application
  • Static: the site's static resources
  • Separation: deploying and serving the two independently. Concretely, everything related to static resources is served by Nginx, while non-static content is served by an application server such as Flask.

Why separate static content?

  • Nginx is extremely efficient at serving static resources and ranks among the best for concurrent access, whereas servers such as Flask are comparatively weaker. Handing static files to Nginx relieves pressure on the Flask server and speeds up access to static resources.
  • Separation also reduces the coupling between dynamic and static resources: if the dynamic side goes down, the static resources can still be served.

How is it implemented?

  • There are many ways. Static resources can be deployed to a CDN or to servers such as Nginx, and dynamic resources to something like Flask. Here we use Nginx + Flask.

2. Requirements

Dynamic requests are forwarded to Flask, and static requests are served directly by Nginx.

http {
    include       mime.types;
    default_type  application/octet-stream;

    sendfile        on;

    keepalive_timeout  65;
    upstream flaskservice{
        server localhost:8080;
    }

    server {
        listen       80;
        server_name  localhost;
        location ~ /.*\.(html|htm|js|css|ico|png|jpg|gif|svg|txt) {
                root /root/www/static;
        }
        location /demo{
                proxy_pass http://flaskservice;
        }
        error_page   500 502 503 504  /50x.html;
        location = /50x.html {
            root   html;
        }
    }
}
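The static location's regex can be checked outside Nginx; here it is in Python's re dialect (Nginx uses PCRE, which agrees for this simple pattern; it is anchored with $ here for clarity, while the location regex itself is unanchored):

```python
import re

# The static-resource pattern from the location block above
STATIC = re.compile(r"/.*\.(html|htm|js|css|ico|png|jpg|gif|svg|txt)$")

def served_by_nginx(uri):
    """True if the URI would match the static location and be served
    from the static root rather than proxied to Flask."""
    return STATIC.search(uri) is not None

static_hit = served_by_nginx("/assets/site.css")  # True: a .css file
dynamic_hit = served_by_nginx("/demo/users")      # False: no static suffix
```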

4. Nginx examples

1. Building a download site with Nginx

First, let's clarify what a download site is:

  • Take a look at this site: https://nginx.org/download/
  • A site whose main purpose is letting users download resources is called a download site.

How do we build one?

  • Nginx implements it with the ngx_http_autoindex_module module, which processes requests ending in a slash and generates a directory listing.
  • The module is compiled in automatically, but it is disabled by default; the directives below configure it.

Directives:

  1. autoindex: enables or disables directory listing output

    Syntax    Default    Context
    autoindex on | off;    autoindex off;    http, server, location
  2. autoindex_exact_size: for the HTML format, whether the listing shows exact file sizes

    The default is on: exact sizes in bytes. When off, approximate sizes are shown in KB, MB, or GB.

  3. autoindex_format: sets the directory listing format

    Syntax    Default    Context
    autoindex_format html | xml | json | jsonp;    autoindex_format html;    http, server, location
  4. autoindex_localtime: for the HTML format, whether times in the listing are local

    The default is off: file times are shown in GMT. When on, file times are shown in the server's local time.

location /download {
    root /root/www;
	autoindex on;
    autoindex_localtime on;
}

There is a download directory under www; and since it lives under /root, you also need to add user root; to the global block.

2. User authentication

Access to system resources often needs restricting: who may access them and who may not. This is the authentication part: based on the username and password the user enters, decide whether the user is legitimate; if so, let the request through, otherwise deny access.

Nginx implements user authentication through the ngx_http_auth_basic_module module, which restricts access to resources by validating a username and password with the "HTTP Basic Authentication" protocol. The module is compiled in by default; to build without it, use --without-http_auth_basic_module.

The module's directives are simple:

  1. auth_basic: enables username/password validation using the "HTTP Basic Authentication" protocol

    Syntax    Default    Context
    auth_basic string | off;    auth_basic off;    http, server, location, limit_except

    When enabled, the server returns 401 and the given string is sent to the client as a prompt; different browsers display it differently.

  2. auth_basic_user_file: specifies the file holding the usernames and passwords

    Syntax    Default    Context
    auth_basic_user_file file;    -    http, server, location, limit_except

    Specifies the file path. The passwords in the file must be encrypted; they can be generated with a tool.

# download site configuration
location /download {
    root /root/www;
    autoindex on;
    autoindex_localtime on;
    auth_basic "please input your credentials";
    auth_basic_user_file htpasswd;
}

Generate the username and password file:

# generate it with the htpasswd tool
yum install -y httpd-tools

cd /usr/local/nginx/conf
htpasswd -c htpasswd username  # create a new file and record a username/password
htpasswd -b htpasswd username password  # add a username/password to an existing file
htpasswd -D htpasswd username  # delete a user from the file
htpasswd -v htpasswd username  # verify a username and password
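Under the hood, "HTTP Basic Authentication" just base64-encodes user:password into an Authorization header that the browser sends after the 401 challenge. A sketch with placeholder credentials:

```python
import base64

def basic_auth_header(user, password):
    """Build the Authorization header a client sends for Basic auth:
    base64 of "user:password" after the Basic scheme name."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return f"Authorization: Basic {token}"

header = basic_auth_header("username", "password")
# header == "Authorization: Basic dXNlcm5hbWU6cGFzc3dvcmQ="
```

Note the encoding is trivially reversible: Basic auth is only safe over HTTPS.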


Origin blog.csdn.net/qq_62789540/article/details/128661324