What are Apache and Nginx? | What are Nginx and Reactor? | The essence of network IO | Blocking queues | Asynchronous non-blocking IO

Foreword

Before the main content, the blogger would like to recommend some columns packed with useful material!

First is a collection of the blogger's highest-quality posts. The posts in this column are the ones written with the most care, full of practical content; I hope they help everyone.

High-quality blog collection: https://blog.csdn.net/yu_cblog/category_12379430.html?spm=1001.2014.3001.5482


Preparatory concepts

First of all, to understand Nginx and Apache, we must understand the concepts around network IO; this is very important. What blocking IO is, what non-blocking IO is, what asynchronous IO is, what IO multiplexing is, and what the Reactor pattern is: only once all of these are clear will the content of this post make sense.

The blogger explained all of the above in detail in the previous post. If you are not yet clear on these network IO concepts, read that one first and then come back to this one~

What is IO? The essence of IO | How to make IO efficient? What does efficient mean? | Asynchronous IO | IO multiplexing | Reactor pattern: https://blog.csdn.net/Yu_Cblog/article/details/131778346?spm=1001.2014.3001.5501

Apache and Nginx

The underlying principle of the Apache HTTP server

The Apache HTTP server adopts the classic multi-process/multi-thread model. Its main components are the master process/thread and the worker processes/threads. When Apache starts, the master process is created first; it listens on the specified port, waiting for client connections. When a new connection request arrives, the master accepts the connection and dispatches it to an available worker.

In layman's terms, this is multi-threading or multi-processing; threads are generally preferred because they are cheaper to create and schedule. In essence there is a main thread that uniformly manages a set of worker threads. When the server starts, the main thread is created and calls accept. When a new connection arrives, it creates a new thread dedicated to handling that specific connection until the connection is closed. In other words, every incoming connection gets its own new thread to pick up the task; that is the essence of the model. Of course, this scheme can be optimized: usually a thread pool is used. A batch of threads is created up front, and when a connection arrives, one pool thread takes over the task. If the pool runs out of threads, the server can either create more or block and wait for another thread to be released.
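The thread-per-connection scheme above can be sketched in a few lines. This is a minimal Python illustration of the idea (the function names are the blogger's own, not Apache source code): one thread accepts, and each connection gets a dedicated thread that blocks on its socket until the peer closes.

```python
import socket
import threading

def handle_connection(conn: socket.socket) -> None:
    """One dedicated thread services this connection until it closes."""
    with conn:
        while True:
            data = conn.recv(4096)   # blocks this thread while the peer is silent
            if not data:
                break                # peer closed the connection
            conn.sendall(data)       # echo the message back

def serve_forever(listen_sock: socket.socket) -> None:
    """Main thread: accept, then hand each connection to a new thread."""
    while True:
        conn, _addr = listen_sock.accept()
        threading.Thread(target=handle_connection,
                         args=(conn,), daemon=True).start()
```

Note where the cost lives: `recv` blocks its whole thread, so an idle connection pins an entire thread, which is exactly the weakness discussed next.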

So what is the problem with this approach? Suppose a long-lived connection arrives: it neither leaves nor sends you any message. What happens to the worker thread assigned to it? It is pinned down by this connection that occupies the slot without doing work; it can neither be released nor do anything else. If there are many such long connections, a huge number of threads sit idle, wasting CPU and memory!

The underlying principle of Nginx

Nginx adopts an event-driven, asynchronous non-blocking model, built from a master process plus a small, fixed number of single-threaded worker processes. Its underlying structure consists of multiple modules, including the event module, the HTTP module, the reverse proxy module, and so on.

The core component is the event module, which uses the event notification mechanisms provided by the operating system (such as epoll and kqueue) to achieve efficient event handling. Each Nginx worker is an event-driven Reactor, monitoring and accepting client connections through its event loop. The master process manages the workers; when new connections arrive, the workers themselves accept and handle them.

The worker processes are where Nginx actually handles requests. Each worker is independent and runs its own event loop, servicing many connections at once. Workers process requests in an event-driven manner: reading the request, parsing the request headers, running the request logic, generating the response, and so on. Throughout, Nginx uses non-blocking I/O operations and makes full use of the operating system's event notification mechanisms to improve concurrent processing capability.

Reactor (taking epoll under Linux as an example)

The Reactor is the core component here. Let me explain by example what the event loop is and what "monitoring" means; one example and everyone will understand. Consider an HTTP request:

For a server, there must be a listening socket, listensock. Once the server is up, many clients will try to complete the three-way handshake with it, so connections keep arriving, and we have to accept them, right? But in epoll-based multiplexing mode, we must not call accept on listensock directly! Why? Because we do not know when a connection will arrive; if none has arrived yet, a call to accept will block. Therefore, we register listensock in epoll and return immediately, without blocking! After registration, when a connection arrives, the read event on listensock is ready (a socket is essentially a file descriptor; the blogger will not repeat these basics), and epoll notifies me. Only then do I call accept, and at that point it will definitely not block, because epoll has told me the read event on listensock is ready!
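This accept stage can be sketched concretely. The following Python snippet uses the standard `selectors` module (which is backed by epoll on Linux) as a stand-in for raw epoll calls; the helper name `accept_ready_connections` is the blogger's own illustration:

```python
import selectors
import socket

sel = selectors.DefaultSelector()    # epoll on Linux, kqueue on BSD/macOS

listen_sock = socket.socket()
listen_sock.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
listen_sock.listen()
listen_sock.setblocking(False)       # accept() must never block us

# Register the listening socket for read events; this returns immediately.
sel.register(listen_sock, selectors.EVENT_READ)

def accept_ready_connections(timeout: float = 1.0) -> list:
    """Call accept() only after the selector reports listensock readable."""
    new_conns = []
    for key, _mask in sel.select(timeout):
        if key.fileobj is listen_sock:
            conn, _addr = listen_sock.accept()   # readiness reported: no block
            new_conns.append(conn)
    return new_conns
```

The key point is the order of operations: register first, wait for the readiness notification, and only then accept.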

Similarly, the socket returned by accept, an ordinary data socket, may receive messages from the client. Can we just call read on it directly, as before? Definitely not! If no message has arrived, what is there to read? A read with no data waiting will block. Such a low-level mistake has no place in an epoll design. So, again, register it in epoll! When a message arrives, epoll will tell you; until then, don't worry about it, just return immediately.

The whole flow divides a request into multiple stages; each stage is registered and then handled by the relevant modules, and all operations are asynchronous and non-blocking. "Asynchronous" here means the server does not wait for the result after initiating a task; it receives a notification automatically when the task completes.

The whole flow is single-process and single-threaded, yet highly concurrent! A long idle connection no longer scares us: it just sits registered in epoll, and if it sends nothing, we spend no time on it (no read call), so this approach is very efficient! It enables the server to handle many concurrent requests and perform other work while waiting for I/O, improving overall performance.

What does epoll look like underneath? What advantages does it have over select and poll? These are explained very clearly in the blogger's GitHub project on IO multiplexing!

Multiplexing-high-performance-IO-server: https://github.com/Yufccode/Multiplexing-high-performance-IO-server

Does Reactor only mean epoll?

The Reactor pattern is a design pattern for building event-driven applications. In the Reactor pattern, an event loop is responsible for listening for events and dispatching the corresponding handlers. The underlying implementation can use a variety of technologies and system calls; epoll is one commonly used event notification mechanism on Linux.

On Linux, epoll provides an efficient I/O event notification mechanism that enables a server to handle a large number of concurrent connections. Therefore, many Reactor implementations choose epoll as the underlying event notification mechanism to achieve high-performance event handling.

However, the underlying implementation of the Reactor pattern is not limited to epoll. It can also use other event notification mechanisms, such as select and poll, or the corresponding mechanisms on other operating systems, such as kqueue (on FreeBSD and macOS) or IOCP (on Windows).

Therefore, the Reactor pattern does not depend on a specific underlying implementation, but focuses on event-driven design ideas and patterns. The specific underlying implementation depends on the operating system and the event notification mechanism chosen by the developer.

Some other features of Nginx

In addition, Nginx also provides a powerful modular architecture, users can choose and configure different modules according to their needs. Nginx modules can implement functions such as load balancing, caching, reverse proxy, SSL/TLS encryption, etc. Modules can be loaded and configured through configuration files, making Nginx highly flexible and scalable.
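As a flavor of that configurability, here is a minimal illustrative nginx.conf fragment combining the reverse proxy and load balancing modules. The upstream addresses, domain, and certificate paths are hypothetical placeholders, not a recommended production setup:

```nginx
# Hypothetical upstream pool for load balancing (addresses are examples).
upstream backend {
    least_conn;                  # pick the server with fewest active connections
    server 10.0.0.11:8080;
    server 10.0.0.12:8080;
}

server {
    listen 443 ssl;
    server_name example.com;     # placeholder domain
    ssl_certificate     /etc/nginx/certs/example.crt;
    ssl_certificate_key /etc/nginx/certs/example.key;

    location / {
        proxy_pass http://backend;       # reverse proxy to the pool
        proxy_set_header Host $host;     # pass the original Host header through
    }
}
```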

An HTTP server based on the Reactor model

Recently, the blogger has been building exactly such an HTTP server, implemented on top of a Reactor with asynchronous IO and IO multiplexing underneath, which meets the requirement of high efficiency.

The backend of this project is now basically complete, and some details are still being polished. I hope everyone will support this project~~

Reactor-based-HyperWebServer: https://github.com/Yufccode/Reactor-based-HyperWebServer

Summary

Whether it is Nginx or a reverse proxy server such as Squid, they all adopt an event-driven network model. Event-driven is actually an old technique: select and poll were used in the early days, and later more advanced mechanisms based on kernel notification appeared, such as epoll (as wrapped by libraries like libevent), which improved event-driven performance. The core of event-driven is still the I/O event: the application switches quickly among many I/O handles to achieve so-called asynchronous I/O. Event-driven servers are very well suited to I/O-intensive tasks. A reverse proxy, for example, acts as a pure data relay between the client and the web server, involving only I/O operations and no heavy computation, so event-driven is a good choice for building one: a single worker process can run without the overhead of managing many processes and threads, and CPU and memory consumption stay small.

Therefore, servers such as Nginx and Squid are implemented this way. Of course, Nginx can also combine multiple processes with the event-driven model: a few processes each run an event loop, rather than hundreds of processes as with Apache. Nginx also performs well when serving static files, because static files are themselves disk I/O and are handled the same way. As for the boast of tens of thousands of concurrent connections, that alone does not mean much; any competently written network program can hold tens of thousands of connections open, and if most clients are blocked somewhere, it is of little value.

Now look at application servers such as Apache or Resin. They are called application servers because they run real business logic: scientific computing, graphics and image processing, database reads and writes. These are likely to be CPU-intensive services, and event-driven is not suitable for such cases. For example, if one computation takes 2 seconds, the process is fully blocked for 2 seconds, and the event mechanism helps not at all. Imagine MySQL switching to event-driven: one large join or sort would block every other client. In such cases, multiple processes or threads show their advantage, since each can perform its task independently without blocking or interfering with the others. Of course, with modern CPUs getting faster, a single computation may block only briefly, but as long as there is blocking, event programming has no advantage. Therefore, processes and threads will not disappear; they complement the event mechanism and will be with us for a long time.

In summary, event-driven designs suit I/O-intensive services, while multi-process or multi-threaded designs suit CPU-intensive services. Each has its strengths, and neither is going to replace the other.

References for this text:

Copyright statement: this article is an original article by CSDN blogger "Xi Feijian", under the CC 4.0 BY-SA license; reprints must include the original source link and this statement. Original link: "Apache and Nginx network model", Xi Feijian's Blog, CSDN.


Origin: blog.csdn.net/Yu_Cblog/article/details/131777668