High-performance network programming (2)

Discussion on the solution of the C10K problem
To solve this problem, from the perspective of pure network programming technology, there are two main ideas:

    One is to allocate a separate process/thread for each connection processing; the other idea is to use the same process/thread to handle several connections at the same time.

8.1 Idea 1: Each process/thread handles one connection
This is the most straightforward idea. However, every process or thread occupies considerable system resources, and managing a large number of processes or threads puts additional pressure on the system, so this solution does not scale well.

Therefore, this idea is not feasible when server resources are limited. Even when resources are abundant, it is not efficient enough. In short, implementations of this idea consume too many resources and scale poorly.


8.2 Idea 2: Each process/thread handles multiple connections at the same time (IO multiplexing)
There are many technical implementations of IO multiplexing. Let's take a look at the advantages and disadvantages of the following implementations one by one.

● Implementation 1: The simplest traditional method is to process the connections one by one in a loop, where each connection corresponds to a socket. This works as long as every socket has data available.

However, if the application reads from a socket whose data is not ready yet, the entire application blocks there waiting for that file handle, and it cannot move on even if other file handles are ready.

Implementation summary: Process multiple connections directly in a loop.
Problem summary: If any one file handle is not ready, the entire application blocks.

● Implementation 2: select was designed to solve the blocking problem above. The idea is simple: check the status of each file handle before reading it, process it if it is ready, and skip it if it is not. An fd_set structure tells the kernel to monitor multiple file handles at the same time. The call returns when the state of one of the handles changes as specified (for example, a handle goes from not readable to readable) or when the timeout expires. The application then uses FD_ISSET to check, one by one, which file handles have changed state.

This works well for a small number of connections, but when the number of connections (and therefore file handles) is large, checking the status one by one is slow. In addition, select has an upper limit on the number of handles it can manage (FD_SETSIZE, commonly 1024). Finally, because the same fd_set records both the events of interest and the events that occurred, it must be re-initialized before each call.

    int select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout);


Implementation summary: Check whether a file handle is ready before processing it.
Problem summary: Handle upper limit + repeated initialization + inefficient one-by-one checking of all file handles.

● Implementation 3: poll mainly solves the first two problems of select. It passes the events of interest to the kernel through a pollfd array, which removes the upper limit on file handles, and it uses separate fields for the events of interest and the events that occurred, which avoids repeated initialization.

Implementation summary: Design a new data structure to improve efficiency.
Problem summary: Checking the status of all file handles one by one is still inefficient.

● Implementation 4: Since poll is inefficient because it checks the status of all file handles one by one, a natural improvement is to hand the application only the file handles whose status has changed (typically, data is ready) when the call returns. Checking then becomes much more efficient. epoll adopts this design and is suitable for large-scale scenarios. Experiments show that when the number of file handles exceeds 10, epoll performs better than select and poll; when the number of file handles reaches 10K, epoll outperforms select and poll by two orders of magnitude.

Implementation summary: Return only the file handles whose state has changed.
Problem summary: Depends on a specific platform (Linux).

Because Linux is the most widely used operating system in Internet companies, epoll has become synonymous with terms such as C10K killer, high concurrency, high performance, and asynchronous non-blocking. FreeBSD introduced kqueue, Linux introduced epoll, Windows introduced IOCP, and Solaris introduced /dev/poll. These operating system facilities were all designed to solve the C10K problem. The programming model built on epoll is asynchronous, non-blocking, and callback-based; it is also called Reactor, event-driven, or an event loop (EventLoop). Nginx, libevent, and Node.js are all products of the epoll era.

● Implementation 5: Because epoll, kqueue, and IOCP each have their own interface and characteristics, porting programs between them is difficult, so these interfaces need to be wrapped to make them easy to use and port. The libevent library is one such wrapper. It is cross-platform: it encapsulates the calls of the underlying platform and provides a unified API, while automatically selecting the appropriate mechanism on each platform.

According to the official libevent website, the library works as follows: when a specific event occurs on a file descriptor (such as readable, writable, or error), or when a timer expires, libevent automatically executes a user-specified callback function to handle the event. libevent currently supports /dev/poll, kqueue, event ports, select, poll, and epoll. Its internal event mechanism is built entirely on whichever interface is in use, so libevent is highly portable and easy to extend. It has been compiled on Linux, BSD, Mac OS X, Solaris, and Windows. Development with libevent is simple and ports easily across Unix platforms.
