We recently started running nginx in production and I was not familiar with how it is structured, so I wrote this detailed analysis and study summary. It borrows from https://www.zybuluo.com/phper/note/89391 and seanlook.com/2015/05/17/nginx-install-and-config/ (very well written).
About proxies
Forward proxy: this is what people usually mean by "proxy". It works like a springboard. Suppose I want to access google.com but cannot reach it directly; I can reach A, and A can reach google.com, so A can act as a forward proxy for me. A forward proxy is a server that sits between a client and an origin server (such as google): the client sends its request to the proxy, and the proxy forwards the request to the origin server and returns the obtained content to the client.
Reverse proxy: the opposite case. To the client it looks like the origin server, and the client needs no special settings. The client sends a normal request for content in the reverse proxy's namespace, the reverse proxy decides where to forward the request (which origin server), and returns the obtained content to the client as if the content were its own.
Nginx is a very good reverse proxy server. Its main configuration file is nginx.conf, whose structure is relatively clear and simple, roughly divided into the following blocks:
main (global setting)
events (nginx working mode) {
    ....
}
http {
    ....
    upstream myproject (load balancer settings) {
        .....
    }
    server (host settings) {
        ....
        location (url match) {
            ....
        }
    }
    server {
        ....
        location {
            ....
        }
    }
    ....
}
Comment lines in the nginx configuration file start with #. The main (global) part is configured mainly as follows:
user nobody;
worker_processes auto;
pid logs/nginx.pid;
worker_cpu_affinity auto;
user is the user that the nginx worker processes run as, which also determines the read and write permissions the processes have on the file system.
pid is the path of the pid file. On linux/unix systems a daemon usually writes its process id to a file on disk; the file has only one line and can be viewed with the cat command. Using the file as a process lock prevents several copies of the process from starting: only the process that obtains write permission on the pid file (fixed path, fixed file name) can start normally and write its own pid into the file, and any other copy exits automatically.
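The pid-file-as-lock idea described above can be sketched in Python as follows. This is only a generic illustration of the pattern, not nginx's own implementation; the function name acquire_pid_lock and the choice of flock for locking are assumptions made for the example.

```python
import fcntl
import os


def acquire_pid_lock(path):
    """Try to become the single running instance for the pid file at `path`.

    Returns an open file descriptor on success (keep it open while running),
    or None if another open descriptor already holds the lock.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # Non-blocking exclusive lock: fails at once if somebody else holds it.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        os.close(fd)
        return None
    # We own the lock: replace the file content with our pid on a single line.
    os.ftruncate(fd, 0)
    os.write(fd, f"{os.getpid()}\n".encode())
    return fd
```

A second copy calling acquire_pid_lock on the same path gets None and can exit, while cat on the file shows the pid of the copy that is running.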
worker_processes is the number of worker processes. After nginx starts, several workers handle the http requests. The master process does not handle requests itself; it reads the configuration and starts and manages the worker processes accordingly, while each worker process accepts connections and actually handles the requests.
The optimal worker_processes value depends on many factors, including but not limited to the number of cpu cores. If it is set to auto, nginx detects the number of cpu cores and sets worker_processes accordingly. If nginx is doing cpu-intensive work, set this value to the number of cpus or cpu cores.
The events module specifies the working mode of nginx and the upper limit on the number of connections:
events {
    use kqueue;    # mac
    worker_connections 51200;
}
use specifies the working mode of nginx. The supported values are select, poll, kqueue, epoll, rtsig and /dev/poll. select and poll are the standard working modes; kqueue and epoll are the efficient ones. The difference between the two is that epoll is used on Linux and kqueue on BSD systems; since the Mac is based on BSD, kqueue is the mode to use on a Mac. On Linux, epoll is the first choice.
worker_connections is the maximum number of connections that each worker process can have open at the same time. Note that this number includes all connections, for example connections to the proxied servers, not only connections with clients. Also note that the actual value cannot exceed the current limit on the number of open files (worker_rlimit_nofile).
The maximum number of connections of a process is limited by the maximum number of open files of a Linux process, so a large worker_connections value only takes effect after that limit has been raised, for example with the operating system command "ulimit -n 65536".
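The relationship between worker_connections and the open-file limit can be written down as a little arithmetic. The following Python sketch is a rough estimate only; the function names are made up for the example, and the rule that a proxied client occupies two connections (one to the client, one to the upstream) is an assumption of the sketch.

```python
import resource


def nofile_soft_limit():
    """Soft limit on open files for this process (what `ulimit -n` reports)."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft


def usable_connections(worker_connections, nofile_limit):
    """Connections one worker can really hold: each one needs a file descriptor."""
    return min(worker_connections, nofile_limit)


def max_clients(worker_processes, worker_connections, nofile_limit, proxied=False):
    """Rough upper bound on simultaneous clients.

    Assumption: when nginx proxies a request, one client occupies two
    connections (client side and upstream side).
    """
    per_worker = usable_connections(worker_connections, nofile_limit)
    per_client = 2 if proxied else 1
    return worker_processes * per_worker // per_client
```

With worker_connections 51200 and an open-file limit of 1024, usable_connections returns 1024; only after raising the limit to 65536 does it return the configured 51200.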
The http module is the core module, responsible for the configuration of http server-related properties:
include mime.types;
default_type application/octet-stream;

#log_format main '$remote_addr - $remote_user [$time_local] "$request" '
#                '$status $body_bytes_sent "$http_referer" '
#                '"$http_user_agent" "$http_x_forwarded_for"';

# access_log takes a log path (or off); "on" would be treated as a file name, so it is not used here
log_format main '[$time_local] $remote_addr - $remote_user - $server_name to: $upstream_addr: $request upstream_response_time $upstream_response_time msec $msec request_time $request_time';
access_log /usr/local/etc/nginx/logs/access.log main;

sendfile on;
#tcp_nopush on;

#keepalive_timeout 0;
keepalive_timeout 65;

#gzip on;
include pulls in the mime type definitions. The types are defined in the mime.types file in the configuration directory and tell nginx how to identify file types.
default_type sets the default type to a binary stream, which is used when a file's type is not defined. For example, if no location is configured to handle asp files, nginx does not parse them, and requesting an asp file in a browser triggers a download.
sendfile on enables the efficient file transfer mode. Setting tcp_nopush and tcp_nodelay to on as well helps prevent network congestion.
keepalive_timeout sets the timeout period for the client connection to keep alive, after which the server will close the connection.
The server module lives inside the http module and is a sub-module used to define a virtual host. A simple server definition example:
server {    # Start of the virtual host definition
    listen 8080;    # Port the virtual host listens on
    server_name localhost 192.168.1.130 www.aaa.com;    # IP address or domain name; multiple names are separated by spaces
    # Global definition: if everything lives in this directory, this is the simplest setup.
    root /Users/yangyi/www;    # Web root directory for the whole virtual host
    index index.php index.html index.htm;    # Default index files
    charset utf-8;    # Default page encoding
    access_log /usr/local/var/log/host.access.log main;    # Access log
    error_log /usr/local/var/log/host.error.log error;    # Error log
    ....
}
The location module is used very widely in nginx and is very important. It matches and dispatches on URLs, offers powerful regular expression support, and supports conditional matching. With the location directive, nginx can handle dynamic and static pages differently.
location / matches access to the site root. root specifies the web directory of the virtual host for that location; the directory can be a relative path (relative to the nginx directory) or an absolute path.
index sets the default home page, that is, the file served when we enter only the domain name.
location / {
    root /Users/aaa/www;
    index index.php index.html index.htm;
}
In addition, location can also perform regular matching, such as the configuration of the php environment:
location ~ \.php$ {
    root /Users/aaa/www;
    fastcgi_pass 127.0.0.1:9000;
    fastcgi_index index.php;
    include fastcgi.conf;
}
nginx provides powerful yet simple load balancing through the upstream module. A simple example:
upstream webservers {
    server 192.168.33.11 weight=10;
    server 192.168.33.12 weight=10;
    server 192.168.33.13 weight=10;
}
server {
    listen 80;
    server_name upstream.iyangyi.com;
    access_log /usr/local/var/log/nginx/upstream.iyangyi.access.log main;
    error_log /usr/local/var/log/nginx/upstream.iyangyi.error.log error;
    location / {
        proxy_pass http://webservers;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
The proxy_pass in the location directs requests to webservers (the name can be chosen freely), and the upstream block of that name distributes them across the 3 servers according to their weight. The server lines in the upstream module accept additional parameters:
- max_fails: the number of failed requests allowed, default 1. What counts as a failure is defined by the proxy_next_upstream directive; once the limit is reached the server is considered unavailable;
- fail_timeout: how long the server is suspended after it has failed max_fails times; used together with max_fails to check server health. In the following example, after 2 failures the server is taken out of service for 30 seconds:
server 192.168.1.130 weight=1 max_fails=2 fail_timeout=30s;
- down: the server temporarily takes no part in load balancing, which is equivalent to commenting it out;
- backup: the server is a backup server, used only when the other servers are unavailable;
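The interplay of max_fails and fail_timeout can be modeled with a few lines of Python. This is a simplified sketch of the bookkeeping, not nginx's actual code; the class UpstreamServer and its method names are invented for the example, and the default fail_timeout of 10 seconds is taken from the nginx documentation rather than from the text above.

```python
class UpstreamServer:
    """Simplified model of max_fails / fail_timeout bookkeeping for one server."""

    def __init__(self, address, max_fails=1, fail_timeout=10.0):
        self.address = address
        self.max_fails = max_fails
        self.fail_timeout = fail_timeout  # seconds
        self.fails = 0
        self.last_failure = None  # time of the most recent failure

    def report_failure(self, now):
        # Failures older than fail_timeout no longer count.
        if self.last_failure is not None and now - self.last_failure >= self.fail_timeout:
            self.fails = 0
        self.fails += 1
        self.last_failure = now

    def report_success(self):
        self.fails = 0

    def is_available(self, now):
        # max_fails=0 switches the failure accounting off.
        if self.max_fails == 0 or self.fails < self.max_fails:
            return True
        # Suspended: usable again once fail_timeout has passed since the last failure.
        return now - self.last_failure >= self.fail_timeout
```

For a server created with max_fails=2 and fail_timeout=30, failures reported at seconds 0 and 1 make is_available return False until second 31.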
upstream can schedule requests with one of several load balancing algorithms. Currently supported are:
- weighted round-robin (default): requests are assigned to the servers in turn, in proportion to their weights, and a server that is down is skipped automatically;
- ip_hash: each request is assigned according to a hash of the client ip, so visitors from the same ip always reach the same server, which solves the session sharing problem of dynamic web pages;
- fair: a load balancing algorithm that balances load intelligently based on page size and loading time, that is, it assigns requests according to the response time of the backend servers; it is not built in, so upstream_fair must be downloaded to support it;
- url_hash: requests are assigned according to a hash of the requested url, so each url always goes to the same server, which further improves the efficiency of backend cache servers; the hash package must be downloaded to support it;
Note that when ip_hash is selected, the weight and backup parameters will be invalid.
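To make the scheduling strategies above more concrete, here is a minimal Python sketch of weighted round-robin, ip_hash and url_hash. It is an illustration under assumptions, not nginx's implementation: the round-robin uses the "smooth" variant of the algorithm, crc32 stands in for whatever hash nginx really uses, hashing only the first three octets of an IPv4 address follows the description in the nginx documentation, and all names are invented for the example.

```python
import zlib


class WeightedRoundRobin:
    """Smooth weighted round-robin over a {server: weight} mapping."""

    def __init__(self, weights):
        self.weights = dict(weights)
        self.current = {server: 0 for server in self.weights}
        self.total = sum(self.weights.values())

    def pick(self):
        # Every server gains its weight; the leader is chosen and pays back the total.
        for server, weight in self.weights.items():
            self.current[server] += weight
        chosen = max(self.current, key=self.current.get)
        self.current[chosen] -= self.total
        return chosen


def pick_by_ip_hash(client_ip, servers):
    """Clients from the same network always map to the same server."""
    key = ".".join(client_ip.split(".")[:3])  # first three octets of an IPv4 address
    return servers[zlib.crc32(key.encode()) % len(servers)]


def pick_by_url_hash(url, servers):
    """The same url always maps to the same server."""
    return servers[zlib.crc32(url.encode()) % len(servers)]
```

With weights a=5, b=1, c=1, seven calls to pick() return a, a, b, a, c, a, a: server a gets 5 of every 7 requests, but they are spread out instead of arriving in one burst. With three equal weights, as in the upstream example above, the servers are simply picked in turn.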
Logs
There are two kinds of logs in nginx: access logs and error logs. The access log records every request a client makes to nginx; from it you can derive information such as the users' geographic origin, the referring page, the client terminal, and the traffic of a particular url. The format of the error log cannot be customized; the error log can be used to find the performance bottleneck of a service or server in the system.
The access log is configured in the http block: log_format defines a log format, and access_log specifies the path of the log file and the format to use.
# access_log takes a log path (or off); "on" would be treated as a file name, so it is not used here
log_format main '[$time_local] $remote_addr - $remote_user - $server_name to: $upstream_addr: $request upstream_response_time $upstream_response_time msec $msec request_time $request_time';
access_log /usr/local/etc/nginx/logs/access.log main;
log_format specifies the format of the log, where each $ is followed by the name of a variable defined in nginx. With the configuration above, a log entry looks like this:
[31/Oct/2016:20:43:27 +0800] 127.0.0.1 - - - localhost to: 119.254.109.163:8081: GET /home/homePage/getHomeImg.json HTTP/1.1 upstream_response_time 0.165 msec 1477917807.635 request_time 0.165
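A log line in this custom format can be split back into its fields with a regular expression. The Python sketch below is written for exactly the log_format shown above; the names LOG_RE and parse_line are made up for the example, and the pattern would need to change if the format changes.

```python
import re

# One named group per variable in the log_format above.
LOG_RE = re.compile(
    r"\[(?P<time_local>[^\]]+)\] "
    r"(?P<remote_addr>\S+) - (?P<remote_user>\S+) - (?P<server_name>\S+) "
    r"to: (?P<upstream_addr>\S+): "
    r"(?P<request>.+?) "
    r"upstream_response_time (?P<upstream_response_time>\S+) "
    r"msec (?P<msec>\S+) "
    r"request_time (?P<request_time>\S+)$"
)


def parse_line(line):
    """Return the fields of one access log line as a dict, or None if it does not match."""
    match = LOG_RE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # request_time is always numeric; upstream_response_time is left as text
    # because it is not a number when no upstream was contacted.
    fields["request_time"] = float(fields["request_time"])
    return fields
```

Applied to the sample line above, parse_line yields remote_addr 127.0.0.1, upstream_addr 119.254.109.163:8081 and request_time 0.165, which makes it easy to, say, list the slowest urls.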
The meaning of the properties in the log:
property name | meaning | example
$remote_addr and $http_x_forwarded_for | Record the client's ip address |
$time_local | Access time and time zone |
$remote_user | Client user name |
$request | Requested url and http protocol |
$request_uri | Request uri including its arguments | /stat.php?id=1585378&web_id=1585378
$request_time | Time taken to process the request, in seconds | 0.205
$connection | Serial number of the connection used |
$http_referer | The page from which the link was followed |
$http_user_agent | Information about the client browser | “Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; SV1; GTB7.0; .NET4.0C;
$body_bytes_sent | Size of the body content sent to the client |
$status | Request status; success is 200 |
$http_host | The request address, that is, the address (IP or domain name) entered in the browser | img.alipay.com, 10.253.70.103
$upstream_status | Upstream status | 200
$ssl_protocol | SSL protocol version | TLSv1
$ssl_cipher | Cipher used for the exchanged data | RC4-SHA
nginx FAQ
The nginx pid file cannot be opened. nginx -s reload sends the reload signal to the process whose id (pid) is stored in the pid file. If no nginx process is running, the signal cannot be sent to it and the error below is reported. In that case, just start nginx directly.
nginx: [error] open() "/usr/local/var/run/nginx.pid" failed (2: No such file or directory)
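Before running nginx -s reload, a script can check whether the pid file points at a live process. The Python sketch below is a hypothetical helper (the name nginx_is_running is made up); note that it only checks that some process with that pid exists, not that the process really is nginx.

```python
import os


def nginx_is_running(pid_path):
    """Return True if the pid file exists and names a live process."""
    try:
        with open(pid_path) as f:
            pid = int(f.read().strip())
    except (FileNotFoundError, ValueError):
        return False  # no pid file (the error shown above) or unreadable content
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 only checks existence, nothing is delivered
    except ProcessLookupError:
        return False  # stale pid file: the process is gone
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True
```

If it returns False for /usr/local/var/run/nginx.pid, start nginx directly instead of reloading.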
In Taobao's improved tengine, some access log fields are not supported, for example $upstream_addr does not work in our environment.
For the specific usage of the location module in nginx, you can check
https://www.zybuluo.com/phper/note/133244