The differences between select, poll, and epoll

(1) select ==> time complexity O(n)

select only tells us that an I/O event has occurred, not which streams it occurred on (it may be one, several, or even all of them). We have to poll every stream indiscriminately to find the ones that have data to read or can be written, and then operate on them. select therefore has O(n) indiscriminate-polling complexity: the more streams being handled, the longer each polling pass takes.
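As a minimal sketch of this behaviour, the Python snippet below uses the standard select module (a thin wrapper over the system call). The socket pairs and the helper name ready_with_select are made up for illustration. Note that the whole list of watched sockets is handed over on every call.

```python
import select
import socket


def ready_with_select(socks, timeout=1.0):
    """Return the sockets in `socks` that are readable right now."""
    # The full list is passed to select() on every call; the kernel
    # checks each descriptor in turn and reports the ready ones.
    readable, _, _ = select.select(socks, [], [], timeout)
    return readable


# Three connected socket pairs; only the second one receives data.
pairs = [socket.socketpair() for _ in range(3)]
watched = [a for a, _ in pairs]
pairs[1][1].send(b"hello")

ready = ready_with_select(watched)
received = [s.recv(16) for s in ready]
print(len(ready), received)

for a, b in pairs:
    a.close()
    b.close()
```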

(2) poll ==> time complexity O(n)

poll is essentially no different from select: it copies the array passed in by the user into kernel space and then queries the device state of each fd. However, it has no limit on the maximum number of connections, because it stores the descriptors in a linked list.
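The same idea with poll might look like the following sketch, again using Python's select module. The socket pairs are illustrative only. Instead of a fixed-size set, each fd is registered with the events of interest, and poll() reports (fd, event mask) pairs.

```python
import select
import socket

pairs = [socket.socketpair() for _ in range(3)]

poller = select.poll()
for a, _ in pairs:
    # Register interest in "readable" for each descriptor.
    poller.register(a.fileno(), select.POLLIN)

# Only the third pair receives data.
pairs[2][1].send(b"ping")
expected_fd = pairs[2][0].fileno()

# The timeout for poll() is given in milliseconds.
events = poller.poll(1000)
for fd, mask in events:
    print("fd", fd, "readable:", bool(mask & select.POLLIN))

for a, b in pairs:
    a.close()
    b.close()
```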

(3) epoll ==> time complexity O(1)

epoll can be understood as "event poll". Unlike busy polling and indiscriminate polling, epoll tells us which stream had which I/O event. This is why epoll is described as event-driven (each event is associated with an fd), and every operation we then perform on these streams is meaningful. (The complexity is reduced to O(1), in the sense that the cost of finding the ready streams no longer grows with the number of streams being watched.)
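To illustrate, here is a small Linux-only sketch using select.epoll from the Python standard library. Fifty socket pairs are registered (the number is arbitrary), one of them receives data, and the wait call hands back only that one fd rather than making us scan all fifty.

```python
import select
import socket

pairs = [socket.socketpair() for _ in range(50)]

ep = select.epoll()
for a, _ in pairs:
    ep.register(a.fileno(), select.EPOLLIN)

# Only one of the fifty watched sockets becomes readable.
pairs[42][1].send(b"x")
expected_fd = pairs[42][0].fileno()

# The timeout for epoll.poll() is given in seconds.
events = ep.poll(1.0)
print(len(events), "event(s) out of", len(pairs), "watched sockets")

ep.close()
for a, b in pairs:
    a.close()
    b.close()
```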

select, poll, and epoll are all I/O multiplexing mechanisms. I/O multiplexing is a mechanism for monitoring multiple descriptors at once: as soon as a descriptor becomes ready (usually read-ready or write-ready), the program is notified so it can perform the corresponding read or write. In essence, however, select, poll, and epoll are all synchronous I/O, because once the event is ready the program itself is responsible for the read or write, and that read or write blocks. With asynchronous I/O the program does not do the reading and writing itself; the asynchronous I/O implementation is responsible for copying the data from kernel space to user space.
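The "synchronous" point can be seen in a short sketch (Python, with an illustrative socket pair). A readiness notification does not move any data: the socket stays readable until the program itself calls recv, which is the step that copies the data out of the kernel.

```python
import select
import socket

a, b = socket.socketpair()
b.send(b"payload")

# Readiness is only a notification; the data is still in the kernel.
first, _, _ = select.select([a], [], [], 1.0)
# Nothing has been read yet, so the socket is reported ready again.
second, _, _ = select.select([a], [], [], 1.0)

# The program performs the read itself.
data = a.recv(1024)

# With the buffer drained, a zero-timeout check finds nothing ready.
third, _, _ = select.select([a], [], [], 0)

print(len(first), len(second), data, len(third))
a.close()
b.close()
```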

epoll and select both provide I/O multiplexing solutions. Current Linux kernels support both. epoll is Linux-specific, whereas select is specified by POSIX and is implemented by most operating systems.

 

The following comparison of the three I/O multiplexing mechanisms is summarized from reference books and online material:

1. select implementation

The select call proceeds as follows:

(1) copy_from_user copies the fd_set from user space to kernel space.

(2) The callback function __pollwait is registered.

(3) All fds are traversed, and the corresponding poll method is called for each (for a socket, the poll method is sock_poll, which calls tcp_poll, udp_poll, or datagram_poll depending on the case).

(4) Taking tcp_poll as an example, its core is __pollwait, the callback function registered above.

(5) The main job of __pollwait is to hang current (the current process) on the device's wait queue. Different devices have different wait queues; for tcp_poll, the wait queue is sk->sk_sleep (note that hanging the process on a wait queue does not mean the process is already asleep). When the device receives a message (network device) or finishes filling in file data (disk device), it wakes the processes sleeping on its wait queue, and current is woken up.

(6) The poll method returns a mask describing whether read and write operations are ready, and the fd_set is filled in according to this mask.

(7) If all fds have been traversed and none returned a readable or writable mask, schedule_timeout is called to put the process that called select (that is, current) to sleep. When a device driver finds that its resource has become readable or writable, it wakes the processes sleeping on its wait queue. If the timeout (specified to schedule_timeout) expires and nothing has woken it, the process that called select is woken up anyway, gets the CPU, and traverses the fds again to check whether any are ready.

(8) The fd_set is copied from kernel space to user space.
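The eight steps can be mimicked with a toy model in plain Python. This is not kernel code: ToyDevice, toy_select, and the sleeps list (callables that simulate what happens while the process is asleep) are all invented for illustration. It only shows the shape of the loop: scan everything, sleep if nothing is ready, scan everything again.

```python
class ToyDevice:
    """Stands in for a device with a wait queue (like sk->sk_sleep)."""

    def __init__(self):
        self.ready = False
        self.wait_queue = []

    def poll(self, process):
        # Plays the role of __pollwait: hang the caller on the wait
        # queue, then report whether the device is ready right now.
        if process not in self.wait_queue:
            self.wait_queue.append(process)
        return self.ready


def toy_select(fd_set, devices, process, sleeps):
    """Return (ready fds, number of full scans performed)."""
    kernel_set = sorted(fd_set)              # step 1: copy in
    scans = 0
    while True:
        scans += 1
        # steps 2-6: call poll on every fd and collect the ready ones
        ready = {fd for fd in kernel_set if devices[fd].poll(process)}
        if ready or not sleeps:              # ready, or timed out
            break
        sleeps.pop(0)()                      # step 7: sleep until woken
    for fd in kernel_set:                    # leave the wait queues
        devices[fd].wait_queue.remove(process)
    return ready, scans                      # step 8: copy out


toy_devices = {3: ToyDevice(), 4: ToyDevice(), 5: ToyDevice()}


def data_arrives_on_4():
    toy_devices[4].ready = True


result = toy_select({3, 4, 5}, toy_devices, "current", [data_arrives_on_4])
print(result)
```

In this run the first scan finds nothing, the simulated sleep ends when data arrives on fd 4, and a second full scan over all three fds is needed to discover which one it was.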

To sum up:

select has several major disadvantages:

(1) Every call to select copies the fd set from user mode to kernel mode; this overhead is large when there are many fds.

(2) Every call to select also traverses, in the kernel, every fd that was passed in; this overhead is also large when there are many fds.

(3) The number of file descriptors select supports is too small; the default is 1024.
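The 1024 limit follows from fd_set being a fixed-size bitmap with one bit per descriptor. The Python class below is a simplified model of that bitmap, not the real C type; the name FdSet and the byte-wise layout are assumptions made for illustration, with FD_SETSIZE set to the default of 1024 mentioned above.

```python
FD_SETSIZE = 1024  # the default mentioned above


class FdSet:
    """Simplified model of fd_set: one bit per possible descriptor."""

    def __init__(self):
        self.bits = bytearray(FD_SETSIZE // 8)

    def _check(self, fd):
        if not 0 <= fd < FD_SETSIZE:
            raise ValueError("fd %d does not fit in the set" % fd)

    def set(self, fd):
        self._check(fd)
        self.bits[fd // 8] |= 1 << (fd % 8)

    def isset(self, fd):
        self._check(fd)
        return bool(self.bits[fd // 8] & (1 << (fd % 8)))


fds = FdSet()
fds.set(5)
fds.set(1023)
print(len(fds.bits), fds.isset(5), fds.isset(6))

try:
    fds.set(1024)
    overflow_rejected = False
except ValueError:
    overflow_rejected = True
```

The whole bitmap (128 bytes per set in this model) has to be handed to the kernel on each call, no matter how few bits are actually set.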

2. poll implementation

  The implementation of poll is very similar to that of select. The only difference is how the fd set is described: poll uses the pollfd structure rather than select's fd_set structure. Almost everything else is the same: both manage multiple descriptors by polling and act on each descriptor's state, but poll has no limit on the maximum number of file descriptors. poll shares one drawback with select: the whole array of file descriptors is copied between user mode and the kernel address space regardless of whether those descriptors are ready, so the overhead grows linearly with the number of file descriptors.
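To make the linear cost concrete, the sketch below mirrors the three fields of struct pollfd (fd, events, revents) with ctypes and estimates how many bytes of array are passed in per call. The helper bytes_per_call is made up for illustration, and the estimate ignores everything except the array itself.

```python
import ctypes


class PollFd(ctypes.Structure):
    """Mirror of struct pollfd: one entry per watched descriptor."""
    _fields_ = [
        ("fd", ctypes.c_int),         # descriptor to watch
        ("events", ctypes.c_short),   # events the caller asks about
        ("revents", ctypes.c_short),  # events filled in as the result
    ]


def bytes_per_call(nfds):
    """Size of the pollfd array handed to a single poll() call."""
    return nfds * ctypes.sizeof(PollFd)


for n in (10, 1000, 100000):
    print(n, "fds ->", bytes_per_call(n), "bytes per call")
```

Unlike the fixed bitmap of select, the array can be as long as needed, but its size (and the work of walking it) grows in step with the number of descriptors.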

3. epoll

  Since epoll is an improvement on select and poll, it should avoid the three drawbacks above. How does epoll do this? First, look at how the call interfaces differ. select and poll each provide only a single function, select or poll. epoll provides three functions: epoll_create, epoll_ctl, and epoll_wait. epoll_create creates an epoll handle, epoll_ctl registers the event types to listen for, and epoll_wait waits for events to occur.
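Python's select.epoll object wraps these three calls, so their division of labour can be sketched as follows (Linux only; the socket pair is illustrative). The comments note which underlying call each method corresponds to.

```python
import select
import socket

a, b = socket.socketpair()

ep = select.epoll()                        # epoll_create: make the handle
ep.register(a.fileno(), select.EPOLLIN)    # epoll_ctl with EPOLL_CTL_ADD

b.send(b"x")
events = ep.poll(1.0)                      # epoll_wait: wait for events
watched_fd = a.fileno()

ep.modify(a.fileno(), select.EPOLLOUT)     # epoll_ctl with EPOLL_CTL_MOD
ep.unregister(a.fileno())                  # epoll_ctl with EPOLL_CTL_DEL

# Nothing is registered any more, so a zero-timeout wait is empty.
after = ep.poll(0)
print(events, after)

ep.close()
a.close()
b.close()
```

Registration happens once up front; the wait call itself takes no list of descriptors at all.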

  For the first drawback, epoll's solution lies in the epoll_ctl function. Each time a new event is registered on the epoll handle (by specifying EPOLL_CTL_ADD in epoll_ctl), the fd is copied into the kernel at that point, rather than being copied again on every epoll_wait. epoll guarantees that each fd is copied only once over the whole process.

  For the second drawback, epoll does not add current to the wait queue of each fd's device on every call, as select and poll do. It hangs current only once, at epoll_ctl time (this one time is unavoidable), and specifies a callback function for each fd. When a device becomes ready and wakes the waiters on its wait queue, this callback function is called, and it adds the ready fd to a ready list. The real work of epoll_wait is to check whether this ready list contains any ready fd (it uses schedule_timeout() to sleep for a while and then check again, similar to step 7 of the select implementation).
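The callback and ready-list idea can be modelled in a few lines of plain Python. This is a toy, not the kernel's data structures: ToyEpoll, CallbackDevice, and their methods are invented names, and sleeping is left out entirely. The point is that wait() never looks at the devices; it only drains the ready list that the callbacks filled.

```python
class CallbackDevice:
    """Toy device that runs registered callbacks when data arrives."""

    def __init__(self):
        self.callbacks = []

    def data_arrives(self):
        for callback in self.callbacks:
            callback()


class ToyEpoll:
    def __init__(self):
        self.interest = {}     # fd -> device, filled once by ctl_add
        self.ready_list = []   # fds whose callback has fired

    def ctl_add(self, fd, device):
        # Done once per fd: remember it and install its callback.
        self.interest[fd] = device
        device.callbacks.append(lambda: self._on_ready(fd))

    def _on_ready(self, fd):
        if fd not in self.ready_list:
            self.ready_list.append(fd)

    def wait(self):
        # Only the ready list is examined, never the interest set.
        ready, self.ready_list = self.ready_list, []
        return ready


toy = ToyEpoll()
devices = {fd: CallbackDevice() for fd in range(1000)}
for fd, dev in devices.items():
    toy.ctl_add(fd, dev)

devices[7].data_arrives()
devices[500].data_arrives()
first_wait = toy.wait()
second_wait = toy.wait()
print(first_wait, second_wait)
```

With 1000 registered descriptors and two events, the wait call returns just the two ready fds without visiting the other 998.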

  For the third drawback, epoll has no such limit. The upper limit on the FDs it supports is the maximum number of files that can be opened, which is generally far greater than 2048; for example, it is about 100,000 on a machine with 1 GB of memory. The exact number can be viewed with cat /proc/sys/fs/file-max; in general, this number depends heavily on the amount of system memory.
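The same value can be read from a program, as in this small sketch. The helper name system_file_max is made up, and it returns None where the /proc file is not available. The sketch also prints the per-process limit (RLIMIT_NOFILE), which is a separate setting and is often the limit a single process runs into first.

```python
import os
import resource


def system_file_max(path="/proc/sys/fs/file-max"):
    """Return the system-wide open-file limit, or None if unavailable."""
    if not os.path.exists(path):
        return None
    with open(path) as f:
        return int(f.read().split()[0])


file_max = system_file_max()
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)

print("system-wide file-max:", file_max)
print("per-process limit (soft, hard):", soft, hard)
```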

To sum up:

(1) select and poll have to poll the entire fd set themselves, over and over, until a device is ready, possibly alternating between sleeping and waking several times along the way. epoll also has to call epoll_wait to keep polling the ready list, and may likewise alternate between sleeping and waking several times, but when a device becomes ready it calls the callback function, which puts the ready fd on the ready list and wakes the process sleeping in epoll_wait. Although both alternate between sleeping and waking, select and poll traverse the entire fd set each time they are "awake", whereas epoll only needs to check whether the ready list is empty. This saves a great deal of CPU time, and is the performance gain brought by the callback mechanism.

(2) On every call, select and poll copy the fd set from user mode to kernel mode once and hang current on every device wait queue once. epoll copies each fd only once and hangs current on a wait queue only once (at the start of epoll_wait; note that this wait queue is not a device wait queue but one defined inside epoll). This also saves a good deal of overhead.


Origin blog.csdn.net/taoqilin/article/details/103964979