The queues underlying the TCP/IP implementation

Note: this is an original post by the author and may not be reproduced without permission. https://blog.csdn.net/wufaliang003/article/details/91354801

Since studying TCP/IP congestion control algorithms last time, I have wanted to understand more of the underlying principles of TCP/IP. I searched a lot of material online, and Tao Hui's column on high-performance network programming taught me a great deal. Today I summarize what I learned and add some thoughts of my own.

I know Java best, and my network programming in Java has mostly gone through the Netty framework. Norman Maurer, a core Netty contributor, gave this advice for network development with Netty: "Never block the event loop, reduce context-switching." That is, try not to block the I/O threads, and minimize thread switching.

Why must the I/O threads that read from the network never block? To answer that, we have to start with the classic C10K problem: how does a server support 10,000 concurrent requests? The root cause of C10K is the network I/O model. Linux traditionally handled network I/O in a synchronous, blocking way, assigning one process or thread to each request, so supporting 10,000 concurrent connections would require 10,000 threads. The scheduling, context switching, and memory consumed by those 10,000 threads become the bottleneck. The common solution to C10K is I/O multiplexing, which is exactly what Netty uses; a minimal sketch follows.
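Here is a minimal sketch of I/O multiplexing on Linux with epoll. Error handling and non-blocking mode are omitted, and the port, backlog, and buffer sizes are only illustrative; the point is that a single thread can wait on the listening socket and all established connections at once instead of dedicating a thread to each request.

```c
/* Minimal epoll-based event loop: one thread multiplexes many sockets.
 * Sketch only: error handling and non-blocking I/O details are omitted. */
#include <sys/epoll.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>

int main(void) {
    int listen_fd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(8080);
    bind(listen_fd, (struct sockaddr *)&addr, sizeof(addr));
    listen(listen_fd, 128);                        /* 128: accept-queue hint */

    int epfd = epoll_create1(0);
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = listen_fd };
    epoll_ctl(epfd, EPOLL_CTL_ADD, listen_fd, &ev);

    struct epoll_event events[64];
    for (;;) {
        int n = epoll_wait(epfd, events, 64, -1);  /* one thread waits on all fds */
        for (int i = 0; i < n; i++) {
            if (events[i].data.fd == listen_fd) {
                int conn = accept(listen_fd, NULL, NULL);
                struct epoll_event cev = { .events = EPOLLIN, .data.fd = conn };
                epoll_ctl(epfd, EPOLL_CTL_ADD, conn, &cev);
            } else {
                char buf[4096];
                ssize_t r = recv(events[i].data.fd, buf, sizeof(buf), 0);
                if (r <= 0) close(events[i].data.fd);
                /* else: hand the bytes to business logic; keep this loop unblocked */
            }
        }
    }
}
```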

Netty has one thread group (the mainReactor) responsible for listening for and accepting new connections, an I/O thread group (the subReactor) responsible for reading from and writing to established connections, and a dedicated worker thread group (a thread pool) for processing business logic.

Keeping the three groups independent of one another brings several benefits. First, a dedicated thread group that does nothing but accept and handle new connections keeps the TCP/IP half-connection (syn) queue and full-connection (accept) queue from filling up. Second, separating the I/O threads from the worker threads lets network I/O and business logic run in parallel, so the I/O threads are never blocked and the TCP/IP receive queues do not fill up. Of course, if the business logic is cheap, i.e. an I/O-intensive and computation-light workload, it can run directly on the I/O thread to avoid a thread switch, which is the second half of Norman Maurer's advice.

So why does TCP/IP have so many queues? Today we take a closer look at several of them: the half-connection (syn) queue and full-connection (accept) queue used while establishing a connection, and the receive, out_of_order, prequeue, and backlog queues used while receiving packets.

Queues used when establishing a connection

As the figure above shows, there are two queues: the syn queue (half-connection queue) and the accept queue (full-connection queue). In the three-way handshake, after the server receives the client's SYN it stores the connection information in the half-connection queue and replies to the client with SYN+ACK. When the server receives the client's ACK in the third step, if the full-connection queue is not full it moves the entry out of the half-connection queue and into the full-connection queue; otherwise it acts according to the value of tcp_abort_on_overflow, either aborting the connection outright or dropping the ACK so that the handshake is retried after a while.
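As a small illustration (not taken from the article's figure), the sketch below creates a listening socket. The backlog argument to listen() sizes the full-connection (accept) queue, which the kernel additionally caps at net.core.somaxconn, while the half-connection (syn) queue is bounded separately by net.ipv4.tcp_max_syn_backlog. The port and backlog values are arbitrary.

```c
/* The listen() backlog sizes the full (accept) queue; the kernel caps it at
 * net.core.somaxconn.  The SYN (half-open) queue is bounded separately by
 * net.ipv4.tcp_max_syn_backlog.  Values below are illustrative. */
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>
#include <unistd.h>

int make_listener(unsigned short port, int backlog) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) { close(fd); return -1; }

    /* If the accept queue is full when the final ACK arrives, the kernel either
     * drops the ACK (tcp_abort_on_overflow = 0, so the handshake is retried)
     * or replies with RST (tcp_abort_on_overflow = 1). */
    if (listen(fd, backlog) < 0) { close(fd); return -1; }
    return fd;
}
```

On Linux, `ss -lnt` shows a listening socket's current and maximum accept-queue length in its Recv-Q and Send-Q columns, which is a handy way to check whether the queue is overflowing.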

Queues used when receiving packets

Compared with establishing a connection, TCP's processing logic when receiving packets is more complex and involves more queues and configuration parameters.

The application program receiving a TCP packet and the operating system receiving a TCP packet from the network are two independent processes. Both operate on the same socket, and lock contention decides which of them controls the socket at any given moment, which gives rise to many different scenarios. For example, what happens when the application is in the middle of receiving data and the operating system receives another packet through the network card? And if the application never calls recv or read, what does the operating system do with the packets it receives?

Below, three figures illustrate three scenarios of receiving TCP packets and introduce the four receive-related queues.

Scenario one: receiving packets

The figure above is a schematic of the first scenario of receiving TCP packets: the operating system first stores received packets in the socket's receive queue, and then the user process calls recv to read them.

1) When the network card receives a packet and determines that it is TCP, it is passed up through the layers until the kernel's tcp_v4_rcv method is eventually called. Because the connection currently expects sequence number S1 and this packet carries exactly S1, tcp_v4_rcv adds it directly to the receive queue. The receive queue holds TCP packets that have already been received, had their TCP headers removed, and been sorted into order, so that the user process can read them directly in sequence. Since the socket is not currently in the context of a user process (no user process is reading it) and the packet is the expected S1, it goes into the receive queue.

2) Packet S3 is received. Since the next expected sequence number is S2, S3 is added to the out_of_order queue, where all out-of-order packets are placed.

3) Next, the expected packet S2 arrives and goes directly into the receive queue. Because the out_of_order queue is not empty at this point, it must also be checked.

4) Every time a packet is added to the receive queue, the out_of_order queue is checked. Since packet S2 has arrived, the expected sequence number becomes S3, so packet S3 is moved from the out_of_order queue to the receive queue.

5) The user process begins reading the socket: it allocates a block of memory in the process and then calls read or recv. The socket has a series of attributes with default values; for example, it is blocking by default and its SO_RCVLOWAT attribute defaults to 1. Methods such as recv also take a flags parameter, which can be set to MSG_WAITALL, MSG_PEEK, MSG_TRUNC, and so on; here we assume the most common value, 0. The process calls recv.

6) This calls the tcp_recvmsg method.

7) tcp_recvmsg first locks the socket. A socket can be used by multiple threads, and the operating system uses it as well, so concurrency has to be handled: to manipulate the socket, the lock must be acquired first.

8) At this point the receive queue holds three packets. The first packet is copied into user-space memory. Because the flags passed in step 5 did not include MSG_PEEK, the packet is then removed from the receive queue and its kernel memory is freed. With MSG_PEEK, by contrast, the packet would remain in the receive queue, which is why MSG_PEEK is mainly used when multiple processes read the same socket (a small MSG_PEEK sketch follows this list).

9) The second packet is copied. Of course, before each copy the remaining space in the user-space buffer is checked; if it cannot hold the current packet, recv returns immediately with the number of bytes already copied.

10) The third packet is copied.

11) The receive queue is now empty, so the SO_RCVLOWAT minimum threshold is checked. If the number of bytes already copied is less than SO_RCVLOWAT, the process goes to sleep and waits for more data. The default SO_RCVLOWAT is 1, meaning recv may return as soon as anything at all has been read.

12) The backlog queue is checked. The backlog queue is where the kernel puts packets arriving from the network card while a user process is busy copying data (i.e. while it holds the socket lock). If the backlog queue has data at this point, it is processed now; if it is empty, the lock is released in preparation for returning to user mode.

13) Execution returns to user-space code, and recv returns the number of bytes copied from the kernel.
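As a small aside to step 8, the sketch below contrasts MSG_PEEK with a normal read. It assumes fd is an already connected, blocking TCP socket with data queued, and it only illustrates the flag from the application side, not the kernel internals described above.

```c
/* MSG_PEEK leaves the data in the socket's receive queue: the second recv()
 * below returns the same bytes again.  Sketch only; assumes 'fd' is a
 * connected TCP socket with data available. */
#include <sys/socket.h>
#include <stdio.h>

void peek_then_read(int fd) {
    char buf[128];

    ssize_t peeked = recv(fd, buf, sizeof(buf), MSG_PEEK); /* copy, don't dequeue */
    printf("peeked %zd bytes\n", peeked);

    ssize_t got = recv(fd, buf, sizeof(buf), 0);           /* now actually consume */
    printf("read   %zd bytes\n", got);
}
```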

Scenario two: receiving packets

The second figure shows the second scenario, which involves the prequeue. The user process calls recv while none of the socket's queues hold any packets; the socket is blocking, so the process goes to sleep. The operating system then receives a packet, and the prequeue comes into play. In this scenario tcp_low_latency is 0 (the default), SO_RCVLOWAT has its default value of 1, and the socket is blocking, as shown below.

Steps 2 and 3 are handled as before; we start from step 4.

4) In this case the receive, prequeue, and backlog queues are all empty, so not a single byte has been copied to user memory. Since the socket's SO_RCVLOWAT requires at least 1 byte to be copied, the process enters the socket's blocking wait; the maximum wait time is given by SO_RCVTIMEO (a sketch of bounding the wait with SO_RCVTIMEO follows this list). Before the wait begins, the socket lock is released; this matters for step 5, because otherwise a newly arriving packet could only enter the backlog queue.

5) Packet S1 arrives and is added to the prequeue.

6) Inserting into the prequeue wakes up the process sleeping on the socket.

7) After the user process wakes up, it reacquires the socket lock; from this point on, newly received packets can only enter the backlog queue.

8) The process checks the receive queue, which is of course still empty; it then checks the prequeue and finds packet S1, which carries exactly the sequence number it is waiting for, so it copies S1 from the prequeue directly into user memory and frees the packet's kernel memory.

9) Now that one packet's worth of bytes has been copied into user memory, the copied length is checked against the minimum threshold, which is the smaller of len and SO_RCVLOWAT.

10) Since SO_RCVLOWAT keeps its default value of 1, the number of bytes copied exceeds the minimum threshold and the call prepares to return to user mode; on the way out it checks whether the backlog queue has data (it does not in this case) and then releases the socket lock.

11) recv returns the number of bytes that have been copied to the user.
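Step 4 mentions that the sleeping process waits at most SO_RCVTIMEO. The sketch below shows, from the application side, how that upper bound is set; the helper name and the timeout value are only illustrative.

```c
/* Bounding the blocking wait of scenario two: with SO_RCVTIMEO set, a recv()
 * that would otherwise sleep until data arrives returns after the timeout
 * with -1 and errno = EAGAIN/EWOULDBLOCK.  Sketch only. */
#include <sys/socket.h>
#include <sys/time.h>
#include <errno.h>
#include <stdio.h>

ssize_t recv_with_timeout(int fd, void *buf, size_t len, int seconds) {
    struct timeval tv = { .tv_sec = seconds, .tv_usec = 0 };
    setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));

    ssize_t n = recv(fd, buf, len, 0);          /* blocks until data or timeout */
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        fprintf(stderr, "recv timed out after %d s\n", seconds);
    return n;
}
```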

Scenario three: receiving packets

In the third scenario, the system parameter tcp_low_latency is 1 and the SO_RCVLOWAT attribute is explicitly set on the socket. The server first receives packet S1, which is shorter than SO_RCVLOWAT. The user process calls recv and reads part of the data, but the minimum threshold is not reached, so the process goes to sleep. Meanwhile, the out-of-order packet S3 arrives before the process sleeps and goes straight into the backlog queue. Then packet S2 arrives; since there is no prequeue (tcp_low_latency is set) and S2 carries exactly the next sequence number to be copied, it is copied directly into user memory, and the total number of copied bytes now satisfies SO_RCVLOWAT. Finally, before returning to the user, packet S3, which arrived out of order and was parked via the backlog queue, is also copied to the user.

1) Packet S1 is received. It carries the expected sequence number, so it is added directly to the ordered receive queue.

2) The system parameter tcp_low_latency is set to 1, indicating that the server program wants to receive TCP data promptly. The user calls recv on the blocking socket; the socket's SO_RCVLOWAT is larger than the first packet, and the user supplies a sufficiently large buffer of length len (a sketch of raising SO_RCVLOWAT follows this list).

3) tcp_recvmsg is called to carry out the receive, and it locks the socket.

4) The kernel prepares to process the packets in the receive queue.

5) The packet in the receive queue can be copied directly; its size is less than len, so it is copied straight into user memory.

6) While step 5 is executing, the kernel receives packet S3. The socket is locked at this moment, so S3 goes directly into the backlog queue. Note that this packet is out of order.

7) Step 5 copied packet S1 into user memory, but its size is smaller than the SO_RCVLOWAT value. Because the socket is blocking, the user process is about to go to sleep; before sleeping, it first processes the backlog queue. Packet S3 is out of order, so it is moved into the out_of_order queue.

8) The process sleeps until a timeout occurs or the receive queue becomes non-empty.

9) The kernel receives packet S2. Note that because tcp_low_latency is enabled, the packet cannot go into the prequeue to wait for the process.

10) Since S2 is exactly the packet being waited for, and a user process is asleep waiting for it, S2 is copied directly into user memory.

11) Whenever an in-order packet has been handled, whether copied to the receive queue or directly to user memory, the out_of_order queue is checked for packets that can now be processed. Packet S3 is copied to user memory, and then the user process is woken up.

12) The user process wakes up.

13) It checks whether the number of bytes copied exceeds SO_RCVLOWAT and whether the backlog queue is empty. Both conditions are satisfied, so recv prepares to return.
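Scenario three hinges on SO_RCVLOWAT being raised above its default of 1. The sketch below shows the application-side setup this scenario assumes: with a larger low-water mark, a blocking recv does not return until at least that many bytes have been queued (or the connection closes or an error occurs). The helper name and the values are only illustrative.

```c
/* Scenario three relies on SO_RCVLOWAT: a blocking recv() does not return
 * until at least 'lowat' bytes are queued (or the connection closes / an
 * error occurs).  Sketch only; values are illustrative. */
#include <sys/socket.h>
#include <stdio.h>

ssize_t recv_at_least(int fd, char *buf, size_t len, int lowat) {
    /* Raise the receive low-water mark from its default of 1 byte. */
    setsockopt(fd, SOL_SOCKET, SO_RCVLOWAT, &lowat, sizeof(lowat));

    ssize_t n = recv(fd, buf, len, 0);   /* sleeps until >= lowat bytes arrived */
    printf("recv returned %zd bytes (low-water mark was %d)\n", n, lowat);
    return n;
}
```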

To summarize, the four queues play the following roles (a small self-contained model follows the list).

  • The receive queue is the true receive queue: TCP packets that the operating system has received, checked, and processed are stored here.

  • backlog is the "standby queue." When the socket is held by a user process context (that is, the user is making a system call such as recv on the socket), the operating system puts newly received packets into the backlog queue and returns immediately.

  • prequeue is the "pre-receive queue." When the socket is not being used by a user process — that is, a user process has called read or recv but is asleep — the operating system puts received packets directly into the prequeue and returns.

  • out_of_order is the "out-of-order queue." It stores out-of-order packets: when a received TCP packet does not carry the sequence number the operating system expects next, it is placed in the out_of_order queue to await later processing.
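To tie the four queues together, here is a deliberately simplified, self-contained model of how an arriving segment is routed. It is not kernel source: all names and the struct are invented for illustration, and it reflects the pre-4.13 behaviour described in this article (the prequeue path was removed around Linux 4.13, and tcp_low_latency no longer has an effect on recent kernels).

```c
/* A self-contained, simplified model (not kernel code) of routing an arriving
 * segment to one of the four queues described above.  All names are illustrative. */
#include <stdbool.h>
#include <stdio.h>

enum queue { RECEIVE, OUT_OF_ORDER, PREQUEUE, BACKLOG };

struct conn {
    bool user_holds_lock;     /* a process is inside recv() right now        */
    bool reader_sleeping;     /* a process sleeps in recv() waiting for data */
    bool tcp_low_latency;     /* sysctl: 1 disables the prequeue             */
    unsigned next_expected;   /* next in-order sequence number               */
};

enum queue route_segment(const struct conn *c, unsigned seq) {
    if (c->user_holds_lock)
        return BACKLOG;                    /* processed when the lock is released  */
    if (c->reader_sleeping && !c->tcp_low_latency)
        return PREQUEUE;                   /* reader handles it when it wakes up   */
    if (seq == c->next_expected)
        return RECEIVE;                    /* in-order: ready for the application  */
    return OUT_OF_ORDER;                   /* hole in the stream: park it for now  */
}

int main(void) {
    struct conn c = { .user_holds_lock = false, .reader_sleeping = false,
                      .tcp_low_latency = false, .next_expected = 1 };
    printf("seq 3 -> queue %d (out_of_order)\n", route_segment(&c, 3));
    printf("seq 1 -> queue %d (receive)\n",      route_segment(&c, 1));
    return 0;
}
```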

Follow my WeChat official account and Toutiao feed; more articles are on the way.

 

 
