[Repost] Linux networking: how data packets are received

Reposted from: https://segmentfault.com/a/1190000008836467

-----------------------------------------------------------------------------------------------------------------

This article describes, step by step, how a packet travels from the network card to the process that receives it on a Linux system.

If reading English is not a problem for you, I strongly recommend the first two articles in the reference list below, which cover the topic in more detail.

This article covers only physical Ethernet NICs, not virtual devices, and uses a process receiving a UDP packet as the example.

The function call relationships shown here come from kernel 3.13.0. If your kernel is a different version, the function names and code paths may differ, but the underlying principles should be the same (or differ only slightly).

From the NIC to memory

A NIC needs a driver to work. The driver is a kernel module that is loaded into the kernel and connects the NIC to the kernel's network module. When it is loaded, the driver registers itself with the network module; when the corresponding NIC receives a packet, the network module calls the appropriate driver to process the data.

The following figure shows how a data packet gets into memory and how processing by the kernel's network module is triggered:

                   +-----+
                   |     |                            Memory
+--------+   1     |     |  2  DMA     +--------+--------+--------+--------+
| Packet |-------->| NIC |------------>| Packet | Packet | Packet | ...... |
+--------+         |     |             +--------+--------+--------+--------+
                   |     |<--------+
                   +-----+         |
                      |            +---------------+
                      |                            |
                    3 | Raise IRQ                  | Disable IRQ
                      |                          5 |
                      |                            |
                      ↓                            |
                   +-----+                   +------------+
                   |     |  Run IRQ handler  |            |
                   | CPU |------------------>| NIC Driver |
                   |     |       4           |            |
                   +-----+                   +------------+
                                                   |
                                                6  | Raise soft IRQ
                                                   |
                                                   ↓
  • 1: A packet enters the physical NIC from the outside network. If the destination address is not this NIC's and the NIC is not in promiscuous mode, the NIC discards the packet.
  • 2: The NIC writes the packet to a designated memory address via DMA. That memory is allocated and initialized by the NIC driver. Note: old NICs may not support DMA, but newer ones generally do.
  • 3: The NIC notifies the CPU with a hardware interrupt (IRQ) to tell it that data has arrived.
  • 4: The CPU looks up the interrupt table and calls the registered interrupt function, which in turn calls the corresponding function in the NIC driver.
  • 5: The driver first disables the NIC's interrupt, indicating that it already knows there is data in memory. From then on the NIC writes subsequent packets directly to memory without notifying the CPU, which improves efficiency and keeps the CPU from being interrupted constantly.
  • 6: The driver raises a soft interrupt, after which the hardware interrupt handler returns. A hardware interrupt handler cannot itself be interrupted while it runs, so if it takes too long the CPU cannot respond to other hardware interrupts. The kernel introduced soft interrupts for this reason: the time-consuming part of the work is moved out of the hardware interrupt handler into a soft interrupt handler, where it can be processed at leisure.
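
Steps 2 to 6 can be sketched as a toy model in Python. This is not kernel code: the class `ToyNic` and all of its attributes are hypothetical names chosen for illustration. The model only shows that the hard interrupt handler does very little (mask the IRQ, flag a soft interrupt), so that later packets land in memory without interrupting the CPU again.

```python
from collections import deque


class ToyNic:
    """Toy model of steps 2-6; every name here is illustrative, not a kernel API."""

    def __init__(self):
        self.ring = deque()           # memory region the NIC fills via DMA
        self.irq_enabled = True       # whether the NIC may raise a hardware IRQ
        self.softirq_pending = False  # set by the hard IRQ handler (step 6)
        self.hard_irqs = 0            # how many times the CPU was interrupted

    def packet_arrives(self, payload):
        self.ring.append(payload)     # step 2: DMA into memory
        if self.irq_enabled:          # step 3: raise an IRQ only if not masked
            self.hard_irq_handler()

    def hard_irq_handler(self):
        self.hard_irqs += 1           # step 4: CPU runs the driver's handler
        self.irq_enabled = False      # step 5: mask further NIC interrupts
        self.softirq_pending = True   # step 6: defer the real work


nic = ToyNic()
for data in (b"p1", b"p2", b"p3"):
    nic.packet_arrives(data)

# Three packets are in memory, but the CPU was interrupted only once.
print(nic.hard_irqs, len(nic.ring), nic.softirq_pending)
```

In this sketch only the first packet costs a hardware interrupt; the other two simply wait in the ring until the deferred soft interrupt work picks them up.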

Kernel network module

The soft interrupt triggers the soft interrupt handler in the kernel's network module. The subsequent steps are as follows:

                                                     +-----+
                                             17      |     |
                                        +----------->| NIC |
                                        |            |     |
                                        |Enable IRQ  +-----+
                                        |
                                        |
                                  +------------+                                      Memory
                                  |            |        Read           +--------+--------+--------+--------+
                 +--------------->| NIC Driver |<--------------------- | Packet | Packet | Packet | ...... |
                 |                |            |          9            +--------+--------+--------+--------+
                 |                +------------+
                 |                      |    |        skb
            Poll | 8      Raise softIRQ | 6  +-----------------+
                 |                      |             10       |
                 |                      ↓                      ↓
         +---------------+  Call  +-----------+       +------------------+        +--------------------+  12  +---------------------+
         | net_rx_action |<-------| ksoftirqd |       | napi_gro_receive |------->| enqueue_to_backlog |----->| CPU input_pkt_queue |
         +---------------+   7    +-----------+       +------------------+   11   +--------------------+      +---------------------+
                                                               |                                                          | 13
                                                            14 |        + - - - - - - - - - - - - - - - - - - - - - - - - +
                                                               ↓        ↓
                                                   +--------------------------+    15      +------------------------+
                                                   | __netif_receive_skb_core |----------->| packet taps(AF_PACKET) |
                                                   +--------------------------+            +------------------------+
                                                               |
                                                               | 16
                                                               ↓
                                                      +-----------------+
                                                      | protocol layers |
                                                      +-----------------+
  • 7: The kernel's ksoftirqd thread is responsible for handling soft interrupts. When it receives one, it calls the corresponding soft interrupt handler. For the soft interrupt raised by the NIC driver in step 6, ksoftirqd calls the network module's net_rx_action function.
  • 8: net_rx_action calls the poll function in the NIC driver to process the packets one by one.
  • 9: In the poll function, the driver reads the packets that the NIC wrote into memory one at a time. Only the driver knows the in-memory format of those packets.
  • 10: The driver converts each packet into the skb format that the kernel network module understands, then calls napi_gro_receive.
  • 11: napi_gro_receive handles GRO, that is, it merges packets that can be merged so that the protocol stack only needs to be called once for them. It then checks whether RPS is enabled; if it is, enqueue_to_backlog is called.
  • 12: enqueue_to_backlog puts the packet on the input_pkt_queue of a CPU's softnet_data structure and returns. If input_pkt_queue is full, the packet is dropped. The queue size is configured with net.core.netdev_max_backlog.
  • 13: That CPU then processes the data in its input_pkt_queue in its own soft interrupt context (by calling __netif_receive_skb_core).
  • 14: If RPS is not enabled, napi_gro_receive calls __netif_receive_skb_core directly.
  • 15: __netif_receive_skb_core checks whether there are any sockets of type AF_PACKET (what we usually call raw sockets); if so, it copies the data to them. This is where tcpdump captures packets.
  • 16: The corresponding protocol stack function is called and the packet is handed to the protocol stack.
  • 17: Once all the packets in memory have been processed (that is, the poll function has finished), the NIC's hardware interrupt is re-enabled, so that the next time the NIC receives data it notifies the CPU again.
enqueue_to_backlog is also called by the netif_rx function, and netif_rx is the function that the lo device calls when it sends a packet.
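
The queue-full drop in step 12 and the draining in step 13 can be illustrated with a small Python model. This is a simplified sketch, not the kernel's logic: `ToyBacklog`, `max_backlog`, `enqueue`, and `process` are hypothetical names, and the exact comparison the kernel uses against net.core.netdev_max_backlog may differ from the one shown here.

```python
from collections import deque


class ToyBacklog:
    """Toy per-CPU input_pkt_queue with a length limit (illustrative only)."""

    def __init__(self, max_backlog):
        self.max_backlog = max_backlog  # stands in for net.core.netdev_max_backlog
        self.queue = deque()
        self.dropped = 0

    def enqueue(self, skb):
        # Step 12: drop the packet when the queue is already full.
        if len(self.queue) >= self.max_backlog:
            self.dropped += 1
            return False
        self.queue.append(skb)
        return True

    def process(self, handler):
        # Step 13: the owning CPU drains its queue in soft interrupt context.
        handled = 0
        while self.queue:
            handler(self.queue.popleft())
            handled += 1
        return handled


backlog = ToyBacklog(max_backlog=2)
results = [backlog.enqueue(f"skb{i}") for i in range(3)]
seen = []
backlog.process(seen.append)
print(results, backlog.dropped, seen)
```

On a Linux machine the real limit can be read from /proc/sys/net/core/netdev_max_backlog; raising it trades memory for fewer drops when packets arrive in bursts.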

Protocol stack

IP layer

Since this is a UDP packet, it first enters the IP layer and is then passed on through the functions level by level:

          |
          |
          ↓         promiscuous mode &&
      +--------+    PACKET_OTHERHOST (set by driver)   +-----------------+
      | ip_rcv |-------------------------------------->| drop this packet|
      +--------+                                       +-----------------+
          |
          |
          ↓
+---------------------+
| NF_INET_PRE_ROUTING |
+---------------------+
          |
          |
          ↓
      +---------+
      |         | enabled ip forward  +------------+        +-----------------+
      | routing |-------------------->| ip_forward |------->| NF_INET_FORWARD |
      |         |                     +------------+        +-----------------+
      +---------+                                                    |
          |                                                          |
          | destination IP is local                                  ↓
          ↓                                                  +---------------+
 +------------------+                                        | dst_output_sk |
 | ip_local_deliver |                                        +---------------+
 +------------------+
          |
          |
          ↓
 +------------------+
 | NF_INET_LOCAL_IN |
 +------------------+
          |
          |
          ↓
    +-----------+
    | UDP layer |
    +-----------+
  • ip_rcv: ip_rcv is the entry function of the IP module. The first thing it does is discard junk packets (packets whose destination MAC address is not this NIC's but that were received because the NIC is in promiscuous mode). It then calls the functions registered on the NF_INET_PRE_ROUTING hook.
  • NF_INET_PRE_ROUTING: a netfilter hook in the protocol stack. Handlers can be attached to it through iptables to modify or discard packets. If the packet is not dropped, it continues onward.
  • routing: the routing decision. If the destination IP is not a local IP and IP forwarding is not enabled, the packet is dropped. If IP forwarding is enabled, the packet goes to ip_forward.
  • ip_forward: ip_forward first calls the functions registered on the netfilter NF_INET_FORWARD hook. If the packet is not dropped, it goes on to call dst_output_sk.
  • dst_output_sk: this function calls the appropriate IP layer function to send the packet out. This step is the second half of the packet sending flow, which will be described in the next article.
  • ip_local_deliver: called when the routing step above finds that the destination IP is a local IP. It first calls the functions registered on the NF_INET_LOCAL_IN hook; if the packet passes, it is delivered to the UDP layer.
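
The decisions in the IP layer can be condensed into a short Python sketch. It is a toy model under stated assumptions: `ip_receive`, `run_hooks`, the dictionary-based packet, and the string outcomes are all hypothetical, and each netfilter hook is reduced to a list of functions that return True to accept a packet. The addresses are documentation-range examples.

```python
def run_hooks(hooks, packet):
    """Toy netfilter chain: the packet survives only if every hook accepts it."""
    return all(hook(packet) for hook in hooks)


def ip_receive(packet, local_ips, ip_forward_enabled, hooks):
    """Toy version of the IP layer flow; returns a string naming the outcome."""
    if packet.get("other_host"):                  # PACKET_OTHERHOST: not for us
        return "drop"
    if not run_hooks(hooks.get("PRE_ROUTING", []), packet):
        return "drop"
    if packet["dst"] in local_ips:                # routing: destination is local
        if not run_hooks(hooks.get("LOCAL_IN", []), packet):
            return "drop"
        return "deliver_to_udp"
    if not ip_forward_enabled:                    # not local, forwarding is off
        return "drop"
    if not run_hooks(hooks.get("FORWARD", []), packet):
        return "drop"
    return "forward"


local_ips = {"192.0.2.1"}
drop_port_53 = {"LOCAL_IN": [lambda p: p.get("dport") != 53]}

outcomes = [
    ip_receive({"dst": "192.0.2.1", "dport": 8080}, local_ips, False, {}),
    ip_receive({"dst": "198.51.100.7"}, local_ips, False, {}),
    ip_receive({"dst": "198.51.100.7"}, local_ips, True, {}),
    ip_receive({"dst": "192.0.2.1", "dport": 53}, local_ips, False, drop_port_53),
]
print(outcomes)
```

The four calls exercise local delivery, a drop because forwarding is off, forwarding, and a drop by a LOCAL_IN rule, in that order.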

UDP layer

          |
          |
          ↓
      +---------+            +-----------------------+
      | udp_rcv |----------->| __udp4_lib_lookup_skb |
      +---------+            +-----------------------+
          |
          |
          ↓
 +--------------------+      +-----------+
 | sock_queue_rcv_skb |----->| sk_filter |
 +--------------------+      +-----------+
          |
          |
          ↓
 +------------------+
 | __skb_queue_tail |
 +------------------+
          |
          |
          ↓
  +---------------+
  | sk_data_ready |
  +---------------+
  • udp_rcv: udp_rcv is the entry function of the UDP module. It calls other functions, mainly to perform the necessary checks. One important call is __udp4_lib_lookup_skb, which looks up the socket that matches the destination IP and port. If no matching socket is found, the packet is dropped; otherwise processing continues.
  • sock_queue_rcv_skb: does two main things. First it checks whether the socket's receive buffer is full and discards the packet if so. Then it calls sk_filter to check whether the packet satisfies the filter: if a filter is set on the socket and the packet does not meet its conditions, the packet is discarded. (On Linux, each socket can have a filter defined on it, just as in tcpdump; packets that do not meet the conditions are dropped.)
  • __skb_queue_tail: appends the packet to the tail of the socket's receive queue.
  • sk_data_ready: notifies the socket that a packet is ready.
Once sk_data_ready has been called, processing of the packet is complete and it waits for the application to read it. All of the functions above run in soft interrupt context.
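
The three ways a packet can be dropped in the UDP layer, and the path of a packet that survives, can be sketched as follows. This is a toy model: `ToySocket`, `udp_receive`, and the outcome strings are hypothetical, and the buffer accounting counts only payload bytes, which is a simplification of what the kernel actually charges against a socket's receive buffer.

```python
from collections import deque


class ToySocket:
    """Toy UDP socket: a bounded receive queue plus an optional filter."""

    def __init__(self, rcvbuf_bytes, packet_filter=None):
        self.rcvbuf_bytes = rcvbuf_bytes    # receive buffer limit (simplified)
        self.used_bytes = 0
        self.packet_filter = packet_filter  # returns True to keep a packet
        self.receive_queue = deque()
        self.ready_events = 0

    def data_ready(self):
        self.ready_events += 1              # stands in for sk_data_ready


def udp_receive(sockets, dst_ip, dst_port, payload):
    sock = sockets.get((dst_ip, dst_port))  # role of __udp4_lib_lookup_skb
    if sock is None:
        return "drop: no socket"
    if sock.used_bytes + len(payload) > sock.rcvbuf_bytes:
        return "drop: receive buffer full"
    if sock.packet_filter and not sock.packet_filter(payload):
        return "drop: filtered"
    sock.receive_queue.append(payload)      # role of __skb_queue_tail
    sock.used_bytes += len(payload)
    sock.data_ready()                       # wake up whoever is waiting
    return "queued"


sock = ToySocket(rcvbuf_bytes=8, packet_filter=lambda d: d.startswith(b"ok"))
sockets = {("192.0.2.1", 9000): sock}

outcomes = [
    udp_receive(sockets, "192.0.2.1", 9000, b"ok-1"),
    udp_receive(sockets, "192.0.2.1", 9000, b"bad!"),
    udp_receive(sockets, "192.0.2.1", 9001, b"ok-2"),
    udp_receive(sockets, "192.0.2.1", 9000, b"ok-34567"),
]
print(outcomes, list(sock.receive_queue), sock.ready_events)
```

Only the first datagram reaches the receive queue; the others are dropped by the filter, by the socket lookup, and by the full buffer respectively.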

socket

There are two ways for an application to receive data. One is to block in recvfrom waiting for data; when the socket is notified, recvfrom is woken up and reads from the receive queue. The other is to monitor the socket with epoll or select and, on notification, call recvfrom to read from the receive queue. Either way, the packet is received normally.
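
Both styles can be tried from user space with Python's standard `socket` and `selectors` modules. The sketch below is a minimal demo under a few assumptions: it sends to itself over the loopback address 127.0.0.1 on an ephemeral port, and the payloads and the 5 second timeouts are arbitrary values chosen so the demo cannot hang.

```python
import selectors
import socket

# Receiver bound to an ephemeral port on loopback (address chosen for the demo).
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
address = receiver.getsockname()

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

# Style 1: wait inside recvfrom until the socket has data.
# The timeout is only a safety net for the demo.
sender.sendto(b"first", address)
receiver.settimeout(5.0)
blocking_data, _ = receiver.recvfrom(2048)

# Style 2: wait for readiness with a selector (epoll-based on Linux),
# then call recvfrom once the socket is reported readable.
selector = selectors.DefaultSelector()
selector.register(receiver, selectors.EVENT_READ)
sender.sendto(b"second", address)
polled_data = b""
for key, _mask in selector.select(timeout=5.0):
    polled_data, _ = key.fileobj.recvfrom(2048)

selector.close()
sender.close()
receiver.close()
print(blocking_data, polled_data)
```

The selector style scales to many sockets in one thread, while the blocking style is simpler when a thread serves a single socket.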

Conclusion

Understanding how packets are received helps us work out where packets can be monitored and modified and under what circumstances they may be dropped, which is a useful reference when dealing with network problems. Knowing where the netfilter hooks sit also helps in understanding how iptables is used, and it makes the later material on Linux virtual network devices easier to follow.

The next few articles will introduce virtual network devices and iptables under Linux.

References

Monitoring and Tuning the Linux Networking Stack: Receiving Data
Illustrated Guide to Monitoring and Tuning the Linux Networking Stack: Receiving Data
NAPI

Origin www.cnblogs.com/oxspirt/p/12041537.html