Reading notes on "In-depth Understanding of Nginx: Module Development and Architecture Analysis" (Second Edition)

Origin

I have been in contact with nginx (hereinafter referred to as ng) for quite some time; in fact, as early as 2015 I was already using ng in projects. At the time, however, my knowledge was limited, there was much I still had to learn, and I never managed to understand ng in depth. Recently the project has been a little idle, so I decided to get a deeper understanding of ng. The way I chose is to read Tao Hui's "In-depth Understanding of Nginx: Module Development and Architecture Analysis (Second Edition)", and this article records some notes and takeaways from reading that book.

About the intended reader
Before reading this book I already had some understanding of ng and had used it in projects, so this article is not a user manual for ng. Before reading this article, the reader should already have a basic understanding of how ng is used.

About Configuration

ng itself can be understood as a container into which many modules can be introduced. Some modules are required by ng itself, while others only need to be introduced when we want to extend ng. ng has a great many modules, each with its own configuration items. In this article we will only explain the configuration of a few of the modules we use.

Configuration items for debugging processes and locating problems

(1) Whether Nginx runs as a daemon
Syntax: daemon on | off;
Default: daemon on;
A daemon is a process that detaches from the terminal and runs in the background. It detaches from the terminal so that no information produced during execution is displayed on any terminal, and so that the process is not interrupted by any signals generated by the terminal. Nginx is without doubt a service that needs to run as a daemon, so it does so by default.
However, Nginx does offer a way to turn daemon mode off. The reason for providing this option is to make it easier to trace and debug Nginx; after all, the most troublesome part of debugging a process with gdb is following the child processes it forks. This is useful when studying the Nginx architecture in Part III of the book.

(2) Whether to work in master/worker mode
Syntax: master_process on | off;
Default: master_process on;
By default, nginx runs with one master process managing multiple worker processes; almost all production environments run Nginx this way.
Like the daemon configuration item, master_process is provided to make tracing and debugging Nginx more convenient. If master_process is turned off, the master process will not fork worker child processes to handle requests; instead, requests are handled by the master process itself.
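For debugging, the two items above are typically combined. A minimal sketch (both are standard core directives; this is not a production configuration):

daemon off;
master_process off;

With this in nginx.conf, nginx stays attached to the terminal and the master process handles requests itself, so gdb can follow it without chasing forked workers.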

(3) Setting the error log
Syntax: error_log /path/file level;
Default: error_log logs/error.log error;
The error log is the best tool for locating Nginx problems; we can set the error log's path and level appropriately according to our needs.
The /path/file parameter can be a concrete file: by default it is the logs/error.log file, and it is best placed somewhere with sufficient disk space. /path/file can also be /dev/null, in which case no log is output at all; this is the only way to turn the error log off. /path/file can also be stderr, in which case the log is written to standard error.
level is the output level of the log, ranging over debug, info, notice, warn, error, crit, alert, emerg, with severity increasing from left to right. When a level is set, logs at that level or above are output to the /path/file file, while logs below that level are not output. For example, when the level is set to error, logs at the error, crit, alert and emerg levels are all output.
If the log level is set to debug, all logs are output; the volume of data will be large, so you need to make sure in advance that the disk holding /path/file has enough space.
Note that to set the log level to debug, the --with-debug option must be added when running configure.
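A few sketches of the directive (the paths here are only illustrative):

error_log /var/log/nginx/error.log warn;
error_log /dev/null;
error_log stderr notice;

The second form is, as noted above, the only way to switch the error log off.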

(4) Whether to stop the process at a few special debug points
Syntax: debug_points [stop|abort];
This configuration item is also used to help users trace and debug Nginx. It accepts two parameters: stop and abort. Nginx places debug points in some critical error-handling logic (version 1.0.14 has 8 of them). If debug_points is set to stop, then when Nginx code reaches one of these debug points it sends itself a SIGSTOP signal, for debugging. If debug_points is set to abort, a coredump file is produced instead, and gdb can then be used to inspect all kinds of Nginx state at that moment.
This configuration item is not normally used.
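If it were needed, enabling it takes a single top-level line (a sketch):

debug_points stop;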

(5) Outputting debug-level logs only for specified clients
Syntax: debug_connection [IP|CIDR]
This configuration item actually belongs to the event class, so it must be placed inside events {...} to be valid. Its value can be an IP address or a CIDR block, for example:
events {
debug_connection 10.224.66.14;
debug_connection 10.224.57.0/24;
}
This way, only requests from the IP addresses above output debug-level logs; all other requests still use the log level configured by error_log.
This configuration item is very useful for fixing bugs, especially for locating problems that only appear under high concurrency.
Note that before using debug_connection you must make sure the --with-debug argument was included when running configure, otherwise it will not take effect.

(6) Limiting the size of coredump core files
Syntax: worker_rlimit_core size;
On Linux, when a process encounters an error or receives a termination signal (for example when an illegal operation such as an out-of-bounds memory access causes the operating system to kill it), the system writes the process's memory contents at that moment (the core image) to a file (the core file) for debugging purposes; this is the so-called core dump. From the core file produced when an Nginx process crashes we can obtain the stack, registers, files and other information at the time, which helps us locate the problem. But much of the information in the core file is not necessarily what the user needs; if no limit is set, a single core file can reach several GB, and just a few coredumps will fill the disk and cause serious problems. The worker_rlimit_core configuration item limits the size of core files, which effectively helps users locate problems (a combined example follows item (7) below).

(7) Specifying the directory where coredump files are generated
Syntax: working_directory path;
This sets the working directory of the worker processes. The sole purpose of this configuration item is to set the directory where coredump files are placed, to help locate problems. You therefore need to make sure that the worker processes have permission to write files into the directory specified by working_directory.
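A sketch combining items (6) and (7); the size and path are illustrative, and the directory must already exist and be writable by the worker processes:

worker_rlimit_core 50m;
working_directory /tmp/nginx-cores;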

Configuration items for normal operation

(1) Defining environment variables
Syntax: env VAR|VAR=VALUE
This configuration item lets the user set environment variables on the operating system directly. For example:

env TESTPATH=/tmp/;

(2) Embedding other configuration files
Syntax: include /path/file;
The include configuration item embeds other configuration files into the current nginx.conf. Its parameter can be an absolute path or a relative path (relative to the Nginx configuration directory, i.e. the directory where nginx.conf resides), for example:

include mime.types;
include vhost/*.conf;

As you can see, the parameter value can be an explicit file name or a file name containing the wildcard *, and multiple configuration files can be embedded at once.

(3) Path of the pid file
Syntax: pid path/file;
Default: pid logs/nginx.pid;
This is the path where the pid file saving the master process ID is stored. By default it is the same as the path specified by the --pid-path configure parameter, and it can be changed at any time; but be sure Nginx has permission to create the pid file in the target directory, since this file directly affects whether Nginx can run.
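For example, an absolute path is common in practice (the path here is illustrative):

pid /var/run/nginx.pid;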

(4) The user and group that Nginx worker processes run as
Syntax: user username [groupname];
Default: user nobody nobody;
user sets the user and group under which the worker processes forked by the master process run. When set in the form "user username;", the group name is the same as the username.
If the --user=username and --group=groupname parameters were used when running configure, nginx.conf defaults to the user and group specified by those parameters.
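For example (the account name is hypothetical and must exist on the system):

user www-data www-data;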

(5) Specifying the maximum number of handle descriptors an Nginx worker process may open
Syntax: worker_rlimit_nofile limit;
This sets the maximum number of file handles a single worker process can open.
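For example (the value is illustrative; since every connection consumes at least one descriptor, it should be no smaller than the worker_connections value described later):

worker_rlimit_nofile 65535;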

(6) Limiting the signal queue
Syntax: worker_rlimit_sigpending limit;
This sets the size of the queue for signals sent to Nginx by each user. When a user's signal queue is full, any further signals sent by that user are dropped.
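For example (the value is illustrative):

worker_rlimit_sigpending 1024;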

Configuration items for optimizing performance

(1) The number of Nginx worker processes
Syntax: worker_processes number;
Default: worker_processes 1;
This defines the number of worker processes in the master/worker run mode.
The number of worker processes directly affects performance. So how many worker processes should the user configure? That actually depends on the business.
Each worker process is a single-threaded process that calls the various modules to implement all kinds of functionality. If we can be sure those modules never make blocking calls, there should be exactly as many processes as CPU cores; conversely, if blocking calls are possible, somewhat more worker processes should be configured.
For example, if the business causes a large number of user requests to read static resource files from the local disk, and the server has relatively little memory, so that most requests must read the disk (disk-head seeks are slow) rather than the in-memory disk cache, then disk I/O calls may block worker processes for small amounts of time and drag down the overall performance of the service.
Multiple worker processes can take full advantage of multi-core architectures, but if there are more worker processes than CPU cores, the cost of switching between processes increases (Linux is a preemptive kernel). In general, the user should configure as many worker processes as there are CPU cores, and bind them to the cores with the worker_cpu_affinity configuration below.

(2) Binding Nginx worker processes to specific CPU cores
Syntax: worker_cpu_affinity cpumask [cpumask ...];
Why bind worker processes to specific CPU cores? Suppose every worker process is very busy: if multiple worker processes scramble for the same CPU, synchronization problems arise. If instead each worker process has a CPU core to itself, true concurrency is achieved at the level of the kernel's scheduling policy.
For example, with four CPU cores, the following configuration can be used:

worker_processes 4;
worker_cpu_affinity 1000 0100 0010 0001;

Note: worker_cpu_affinity is only valid on Linux. Linux implements this feature with the sched_setaffinity() system call.

(3) SSL hardware acceleration
Syntax: ssl_engine device;
If the server has SSL hardware acceleration devices, they can be configured here to speed up processing of the SSL protocol. The user can check whether SSL hardware acceleration devices are available with the openssl command:

openssl engine -t

(4) The execution frequency of the gettimeofday system call
Syntax: timer_resolution t;
By default, gettimeofday is executed once each time a kernel event call (e.g. epoll, select, poll, kqueue, etc.) returns, to update Nginx's cached clock from the kernel clock. In early Linux kernels the cost of executing gettimeofday was not small, since it involved a memory copy from kernel space to user space. When the frequency of gettimeofday calls needs to be reduced, the timer_resolution configuration item can be used. For example, "timer_resolution 100ms;" means there are at least 100ms between successive calls to gettimeofday.
In most current kernels, however, such as on the x86-64 architecture, gettimeofday is just a vsyscall that merely reads data from a shared memory page rather than making an ordinary system call, so its cost is small and this configuration item is generally not used. On the other hand, if you want the timestamp printed on each log line to be more accurate, you can also use it.

(5) Setting the priority of Nginx worker processes
Syntax: worker_priority nice;
Default: worker_priority 0;
This configuration item sets the nice priority of the Nginx worker processes.
On Linux and other UNIX-like operating systems, when many processes are in the ready state, the kernel decides which one to execute according to the priorities of all of them. The size of the CPU time slice a process is allocated is also related to its priority: the higher the priority, the larger the time slice (for example, in the default configuration the smallest time slice is only 5ms and the largest is 800ms). A high-priority process therefore occupies more system resources.
Priority is determined jointly by the static priority and a dynamic adjustment (currently only ±5) made by the kernel according to the process's behaviour. The nice value is the static priority of a process; it ranges from -20 to +19, where -20 is the highest priority and +19 the lowest. So if the user wants Nginx to occupy more system resources, the nice value can be configured to a smaller number, but it is not advisable to make it smaller than the nice value of kernel processes (usually -5).
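For example, to give the workers slightly more CPU without going below the usual -5 of kernel processes (the value is illustrative):

worker_priority -5;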

Event class configuration items

(1) Whether to enable the accept lock
Syntax: accept_mutex [on|off];
Default: accept_mutex on;
accept_mutex is Nginx's load-balancing lock; Chapter 9 of the book explains in detail how the event-handling framework implements load balancing. Here the reader only needs to know that the accept_mutex lock lets worker processes take turns accepting new TCP connections from clients in sequence. When the number of connections established by a worker process reaches 7/8 of the maximum configured by worker_connections, the chance of that worker trying to establish new TCP connections is greatly reduced, so that client requests are handled by all worker processes as evenly as possible.
The accept lock is enabled by default. If it is turned off, establishing a TCP connection takes less time, but the load among worker processes becomes very uneven, so turning it off is not recommended.

(2) Path of the lock file
Syntax: lock_file path/file;
Default: lock_file logs/nginx.lock;
The accept lock may need this lock file. If the accept lock is turned off, the lock_file configuration is completely ineffective. If the accept lock is turned on but, because of the compiler, the operating system architecture or other factors, Nginx does not support atomic locks, the accept lock is implemented with a file lock (section 14.8.1 introduces the use of file locks), and the lock file specified by lock_file then takes effect.
Note that on operating systems based on the i386, AMD64, Sparc64 or PPC64 architectures, when Nginx is compiled with the GCC, Intel C++ or SunPro C++ compilers, you can be sure Nginx supports atomic locks, because Nginx implements them in assembly language using CPU features (section 14.3 covers the implementation of atomic operations on the x86 architecture). In that case the lock_file configuration is meaningless.

(3) Setting the delay between attempts to take the accept lock
Syntax: accept_mutex_delay Nms;
Default: accept_mutex_delay 500ms;
When the accept lock is in use, only one worker process can hold it at any one time. The accept lock is not a blocking lock: if it cannot be taken, the attempt returns immediately. If a worker process tries to take the accept lock and fails, it must wait at least the interval defined by accept_mutex_delay before it can try to take the lock again.

(4) Establishing new connections in batches
Syntax: multi_accept [on|off];
Default: multi_accept off;
When the event model gives notice of new connections, this determines whether to establish connections, in the current round of scheduling, for as many of the clients' pending TCP connection requests as possible.

(5) Selecting the event model
Syntax: use [kqueue|rtsig|epoll|/dev/poll|select|poll|eventport];
Default: Nginx automatically uses the most appropriate event model.
On Linux, the event-driven model can be chosen from poll, select and epoll. epoll is without question the highest-performing of the three; section 9.6 explains why epoll can handle large numbers of concurrent connections.

(6) The maximum number of connections per worker
Syntax: worker_connections number;
This defines the maximum number of connections each worker process can handle simultaneously.
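Putting the event-class items above together, a sketch of a complete events block might look like this (all values are illustrative):

events {
    use epoll;                  # on Linux, nginx would normally choose epoll by itself
    worker_connections 10240;   # per-worker ceiling on simultaneous connections
    accept_mutex on;            # the default; keeps load even across workers
    accept_mutex_delay 500ms;   # wait before retrying a failed lock attempt
    multi_accept off;           # the default; accept connections one at a time
}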


Source: blog.csdn.net/qq32933432/article/details/97394359