How nginx forwards requests: location matching and rewrite rules

location  = / {
  # Exact match for /; nothing may follow the host name
  [ configuration A ] 
}

location  / {
  # Every URI starts with /, so this rule matches all requests,
  # but regex matches and longer prefix matches take priority
  [ configuration B ] 
}

location /documents/ {
  # Matches any URI starting with /documents/; after this match the search continues,
  # and this block is used only if none of the regex locations match
  [ configuration C ] 
}

location ~ /documents/Abc {
  # Case-sensitive regex: matches any URI containing /documents/Abc
  # Regexes are tested in file order and the first match wins, so this overrides prefix match C
  [ configuration CC ] 
}

location ^~ /images/ {
  # Matches any URI starting with /images/; when this is the longest prefix match, regexes are not checked and this block is used
  [ configuration D ] 
}

location ~* \.(gif|jpg|jpeg)$ {
  # Matches all requests ending in gif, jpg, or jpeg
  # However, images under /images/ are handled by config D, because ^~ keeps them from reaching this regex
  [ configuration E ] 
}

location /images/ {
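  # Prefix match for /images/; the same prefix is already declared with ^~ in D
  # (shown only to illustrate ordering: a real nginx rejects this block as a duplicate location)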
  [ configuration F ] 
}

location /images/abc {
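  # Longest prefix match for /images/abc; this block has no ^~, so regexes are still checked afterwards
  # The order in which F and G are written does not matter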
  [ configuration G ] 
}

location ~ /images/abc/ {
  # Used when the longest prefix match is G (which has no ^~) and this regex then matches
  [ configuration H ] 
}

location ~* /js/.*/\.js

  • A leading = means exact matching;
    for example, A matches only a request for the root itself, with nothing after it.
  • A leading ^~ means the URI starts with the given literal string; it is a prefix match, not a regex match
  • A leading ~ means case-sensitive regex matching
  • A leading ~* means case-insensitive regex matching
  • / is the general match; any request matches it if nothing else does

Matching priority, independent of the order in the file:
(location =) > (location ^~ path, when it is the longest matching prefix) > (location ~, ~* regexes, in file order) > (longest matching plain prefix path) > (/)

Matching results
With the locations written above, the following matches hold:

  • / -> config A
    Exact match; /index.html, for instance, does not match A
  • /downloads/download.html -> config B
    After matching B, nothing further down matches, so B is used
  • /images/1.gif -> config D
    The longest prefix match is /images/, which carries ^~, so the regexes are skipped
  • /images/abc/def -> config H
    The longest prefix match is G, which has no ^~, so the regexes are checked in order and H matches.
    D stops the regex search only when /images/ itself is the longest matching prefix; F and G are written here just to illustrate the matching order
  • /documents/document.html -> config C
    Matches C, and no regex further down matches, so C is used
  • /documents/1.jpg -> config E
    Matches C, then the regex search continues and matches E
  • /documents/Abc.jpg -> config CC
    The longest prefix match is C; the regexes are then checked in file order and CC matches first, so E is never reached
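
One way to check results like these on a running server is to let each location return its own label. The following is a minimal sketch rather than part of the configuration above: the port 8080 and the label strings are assumptions chosen for illustration, and F is left out because nginx refuses to load a second plain-prefix location for /images/.

```nginx
server {
    listen 8080;                                   # assumed test port

    location = /                  { return 200 "A\n"; }
    location /                    { return 200 "B\n"; }
    location /documents/          { return 200 "C\n"; }
    location ~ /documents/Abc     { return 200 "CC\n"; }
    location ^~ /images/          { return 200 "D\n"; }
    location ~* \.(gif|jpg|jpeg)$ { return 200 "E\n"; }
    location /images/abc          { return 200 "G\n"; }
    location ~ /images/abc/       { return 200 "H\n"; }
}
```

Requesting a path with a tool such as curl (for example curl -s http://localhost:8080/documents/Abc.jpg) then prints the label of whichever location nginx selected.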

Practical recommendations

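In practice, I think at least three matching rules should be defined, as follows:
# Match the site root directly. The home page is requested often through the domain name, and the official site says this speeds up processing.
# Here the request is forwarded straight to the backend application server; it could also be a static home page.
# First required rule
location = / {
    proxy_pass http://tomcat:8080/index;
}
# Second required rule: handle static file requests, which is nginx's strength as an HTTP server.
# There are two configuration styles, directory matching or suffix matching; use either one or both.
location ^~ /static/ {
    # root is prepended to the full URI, so /static/a.css is looked up under /webroot/static/static/
    root /webroot/static/;
}
location ~* \.(gif|jpg|jpeg|png|css|js|ico)$ {
    root /webroot/res/;
}
# Third rule: the general rule, which forwards dynamic requests to the backend application server.
# Anything that is not a static file request is treated as dynamic by default; adjust this to your own situation.
# After all, with today's popular frameworks, URLs with .php or .jsp suffixes are rare.
location / {
    proxy_pass http://tomcat:8080/;
}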

Rewrite rules

Rewrite uses the global variables provided by nginx, or variables you set yourself, together with regular expressions and flags to rewrite and redirect URLs. The rewrite directive can only be placed in server{}, location{}, and if{}, and it only operates on the part of the URL after the domain name, excluding the query string. For example, for http://seanlook.com/a/we/index.php?id=1&u=str it only sees /a/we/index.php. Syntax: rewrite regex replacement [flag];

To act on the domain name or the query string instead, match against global variables (in an if block) or use a proxy_pass reverse proxy.
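
As a rough sketch of what matching on global variables can look like, the block below tests $host for the domain and $arg_id for a query parameter. The host names, the /special/index.php target, and the value 1 are hypothetical and only serve the illustration.

```nginx
server {
    listen 80;
    server_name old.example.com new.example.com;    # hypothetical names

    # rewrite never sees the host, so test $host instead.
    if ($host = old.example.com) {
        # $request_uri already contains the query string; the trailing ?
        # keeps nginx from appending it a second time.
        rewrite ^ http://new.example.com$request_uri? permanent;
    }

    location /a/we/ {
        # The query string is not part of what rewrite matches;
        # $arg_id holds the value of ?id=...
        if ($arg_id = 1) {
            rewrite ^ /special/index.php last;
        }
    }
}
```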

rewrite and location look somewhat similar, since both can send a request elsewhere. The main difference is that rewrite changes the path used to fetch a resource within the same domain name, while location controls access to, or reverse-proxies, a class of paths, possibly passing them to other machines with proxy_pass. In many cases rewrite is also written inside a location. The execution order is:

  1. Execute the rewrite directives of the server block
  2. Perform location matching
  3. Execute the rewrite directives in the selected location

If the URI is rewritten in step 3, steps 2-3 are repeated until the URI is no longer rewritten; if the loop runs more than 10 times, a 500 Internal Server Error is returned.
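
The order above can be traced with a small sketch. The paths /old/, /app/, /static/ and the document root /var/www are assumptions made up for this example.

```nginx
server {
    listen 80;

    # Step 1: the server-level rewrite runs before any location is chosen.
    rewrite ^/old/(.*)$ /app/$1;

    # Step 2: the rewritten URI /app/... selects this location.
    location /app/ {
        # Step 3: a rewrite inside the selected location. Because the URI
        # changes, nginx starts another round of location matching.
        rewrite ^/app/(.*)$ /static/$1;
    }

    # The second round of matching ends here; no rewrite changes the URI
    # any further, so the file is served.
    location /static/ {
        root /var/www;                   # assumed document root
    }
}
```

Under these assumptions a request for /old/site.css would end up being served from /var/www/static/site.css.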

flag

  • last : Equivalent to Apache's [L] flag; stops processing the current set of rewrite directives and starts a new location search with the rewritten URI
  • break : Stops processing rewrite directives; the request continues in the current location
  • redirect : Returns a 302 temporary redirect; the address bar shows the redirected address
  • permanent : Returns a 301 permanent redirect; the address bar shows the redirected address

301 and 302 responses cannot consist of a status code alone; they must also carry a redirect URL. The return directive can send them as well, provided a URL is given (return 301 URL;). The difference between last and break is harder to grasp:

  1. last is generally written in server and if blocks, while break is generally used in location blocks
  2. last does not end location matching for the rewritten URI: the new URI goes through the matching process again. break ends it, and the request stays in the current location
  3. Both break and last prevent the rewrite directives that follow them from executing
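
The difference can be seen side by side in a sketch like the following. The /blog/, /articles/, and /download/ paths and the document root are assumptions for illustration, not taken from a real site.

```nginx
server {
    listen 80;
    root /var/www;                       # assumed document root

    # last: stop here and search for a location again with the new URI.
    rewrite ^/blog/(\d+)$ /articles/$1.html last;

    location /articles/ {
        # Served as a static file from /var/www/articles/
    }

    location /download/ {
        # break: keep the request in this location and serve the new path.
        # With last instead, nginx would search for a location again, land
        # back in this block, and reach the return 403 below.
        rewrite ^/download/(.*)/media/(.*)\..*$ /download/$1/mp3/$2.mp3 break;
        return 403;                      # reached only when the rewrite did not match
    }
}
```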

The if directive and global variables

The syntax of the if directive is if (condition) {...}. It evaluates the given condition, and if the condition is true, the directives inside the braces are executed. The condition can be any of the following:

  • A variable on its own: it is false if its value is an empty string or "0" (before nginx 1.0.1, any string starting with 0 was false)
  • A comparison of a variable with a string, using = or !=
  • A regular expression match: ~ is case-sensitive, ~* is case-insensitive, and !~ and !~* are the negated forms
  • -f and !-f test whether a file exists
  • -d and !-d test whether a directory exists
  • -e and !-e test whether a file or directory exists
  • -x and !-x test whether a file is executable

E.g:

if ($http_user_agent ~ MSIE) {
    rewrite ^(.*)$ /msie/$1 break;
} # If the UA contains "MSIE", rewrite the request into the /msie/ directory

if ($http_cookie ~* "id=([^;]+)(?:;|$)") {
    set $id $1;
} # If the cookie matches the regex, set the variable $id to the captured group

if ($request_method = POST) {
    return 405;
} # If the request method is POST, return status 405 (Method Not Allowed). For 301 and 302, return also needs a URL

if ($slow) {
    limit_rate 10k;
} # Rate limiting; $slow can be set with the set directive

if (!-f $request_filename){
    break;
    proxy_pass  http://127.0.0.1; 
} # If the requested file does not exist, reverse proxy to localhost. The break here also stops rewrite processing

if ($args ~ post=140){
    rewrite ^ http://example.com/ permanent;
} # If the query string contains "post=140", redirect permanently to example.com

location ~* \.(gif|jpg|png|swf|flv)$ {
    valid_referers none blocked www.jefflei.com www.leizhenfang.com;
    if ($invalid_referer) {
        return 404;
    } # Anti-hotlinking
}

Global variables
The following global variables can be used in if conditions:

  • $args : The arguments in the request line; same as $query_string.
  • $content_length : The Content-Length field in the request header.
  • $content_type : The Content-Type field in the request header.
  • $document_root : The value of the root directive for the current request.
  • $host : The host name from the request line or the Host header field; otherwise the server name.
  • $http_user_agent : Client agent information.
  • $http_cookie : Client cookie information.
  • $limit_rate : Setting this variable limits the response rate of the connection.
  • $request_method : The action requested by the client, usually GET or POST.
  • $remote_addr : The IP address of the client.
  • $remote_port : The port of the client.
  • $remote_user : The user name verified by the Auth Basic module.
  • $request_filename : The file path of the current request, built from the root or alias directive and the request URI.
  • $scheme : The request scheme (http or https).
  • $server_protocol : The protocol used by the request, usually HTTP/1.0 or HTTP/1.1.
  • $server_addr : The server address; determining this value normally requires a system call.
  • $server_name : The server name.
  • $server_port : The port number on which the request reached the server.
  • $request_uri : The original URI including the request arguments, without the host name, such as "/foo/bar.php?arg=baz".
  • $uri : The current URI without request arguments and without the host name, such as "/foo/bar.html".
  • $document_uri : Same as $uri.

Example: http://localhost:88/test1/test2/test.php
$host: localhost
$server_port: 88
$request_uri: /test1/test2/test.php
$document_uri: /test1/test2/test.php
$document_root: /var/www/html
$request_filename: /var/www/html/test1/test2/test.php
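
To inspect such values on a real server, one option is a throwaway location that echoes a few variables back. This is only a sketch: the port and root follow the example URL above, and the location itself is an assumption added for debugging.

```nginx
server {
    listen 88;                           # port from the example URL
    root /var/www/html;

    location /test1/ {
        default_type text/plain;
        # Echo selected variables instead of serving the file.
        return 200 "host=$host port=$server_port\nrequest_uri=$request_uri\nuri=$uri\nfile=$request_filename\n";
    }
}
```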

Commonly used regular expressions

  • . : Match any character except newline
  • ? : Repeat 0 times or 1 time
  • + : Repeat 1 or more times
  • * : Repeat 0 or more times
  • \d : Match a digit
  • ^ : Match the beginning of the string
  • $ : Match the end of the string
  • {n} : Repeat n times
  • {n,} : Repeat n times or more
  • [c] : Match the single character c
  • [a-z] : Match any one lowercase letter from a to z

The content matched inside parentheses () can be referenced later as $1, and $2 refers to the content of the second (). What tends to be confusing in regular expressions is the \ escaping of special characters.
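
A short sketch may make the capture groups and the escaping concrete. The /archive/ and /posts/ paths are hypothetical.

```nginx
location /archive/ {
    # \. matches a literal dot; an unescaped . would match any character.
    # (\d{4}) becomes $1, (\d{2}) becomes $2, ([a-z]+) becomes $3.
    # The pattern is quoted because it contains { and }.
    rewrite '^/archive/(\d{4})-(\d{2})\.([a-z]+)$' /posts/$1/$2/index.$3 last;
}
```

With this rule, /archive/2020-08.html is rewritten to /posts/2020/08/index.html.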

Rewrite examples

Example 1:

http {
    # Define the log format for images
    log_format imagelog '[$time_local] ' $image_file ' ' $image_type ' ' $body_bytes_sent ' ' $status;
    # Enable rewrite logging
    rewrite_log on;

    server {
        root /home/www;

        location / {
            # Rewrite log messages are written at the notice level
            error_log logs/rewrite.log notice;
            # Note: wrap the pattern in single quotes because it contains {}
            rewrite '^/images/([a-z]{2})/([a-z0-9]{5})/(.*)\.(png|jpg|gif)$' /data?file=$3.$4;
            # Note: do not add the "last" flag to the rule above, otherwise the set directives below will not run
            set $image_file $3;
            set $image_type $4;
        }

        location /data {
            # Use the image log format to analyze image type and size
            access_log logs/images.log imagelog;
            root /data/images;
            # Use the file argument set by the rewrite: serve the file if it exists, otherwise fall back to the last URI
            try_files /$arg_file /image404.html;
        }

        location = /image404.html {
            # Return a specific message when the image does not exist
            return 404 "image not found\n";
        }
    }
}

A request of the form /images/ef/uh7b3/test.png is rewritten to /data?file=test.png and then matches location /data. nginx first checks whether the file /data/images/test.png exists. If it does, the response is served normally; if it does not, try_files redirects internally to the image404 location, which returns the 404 status code directly.

Example 2:

rewrite ^/images/(.*)_(\d+)x(\d+)\.(png|jpg|gif)$ /resizer/$1.$4?width=$2&height=$3? last;

A file request of the form /images/bla_500x400.jpg is rewritten to /resizer/bla.jpg?width=500&height=400, and nginx then continues by trying to match a location for the new URI.
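
To see what arrives after this rewrite, the rule can be paired with a stand-in location. The /resizer/ block below is an assumption that merely echoes its input; a real setup would pass the request to an image-resizing backend instead.

```nginx
server {
    listen 80;

    rewrite ^/images/(.*)_(\d+)x(\d+)\.(png|jpg|gif)$ /resizer/$1.$4?width=$2&height=$3? last;

    location /resizer/ {
        # Stand-in for a real resizer: report the file and the dimensions.
        return 200 "file=$uri width=$arg_width height=$arg_height\n";
    }
}
```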


Origin blog.csdn.net/wx_15323880413/article/details/108265000