TCP packet loss solution

1. Introduction

Continuing from the previous article, TCP packet loss investigation (tingmailang's blog on CSDN): there, the packet loss was confirmed to occur in the environment and at the network link layer. This post follows the rest of the investigation and the solution, shared here in full.

2. Causes of packet loss - operating system level

The previous article located the occasional packet loss at the TCP handshake. The O&M team asked Alibaba Cloud to check whether the loss was related to the cloud platform, and Alibaba Cloud's system monitoring confirmed that packets were indeed being dropped.


Alibaba Cloud explained that on kernel-4.19.91-25.1.al7 and earlier kernel versions, when an application initiates many TCP connection requests at the same time, the packets all pass through the NAT table and may be assigned duplicate source ports. The conntrack module detects the duplicate during its confirmation phase and drops the affected TCP packets.
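To make the trigger concrete, below is a minimal Java sketch of the kind of simultaneous connection burst that exercises NAT source-port selection; the target host, port, and counts are placeholders, not values from the article. On affected kernels, two connections racing through the NAT table at the same instant can be handed the same source port, and one of them is then dropped at confirmation.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ConnectBurst {
    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(64);
        for (int i = 0; i < 200; i++) {
            pool.submit(() -> {
                // Each connect() makes the NAT table pick a source port;
                // simultaneous requests race for the "next" port.
                try (Socket s = new Socket()) {
                    s.connect(new InetSocketAddress("10.0.0.10", 8080), 3000);
                } catch (IOException e) {
                    // A dropped handshake surfaces here as a connect timeout.
                }
            });
        }
        pool.shutdown();
    }
}
```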

iptables is used to create filter and NAT rules and is available on every Linux distribution, so the duplicate-port problem comes from the Linux NAT port-allocation mechanism itself, not from anything specific to our environment.

Some readers may wonder: isn't Kubernetes isolated, so how can ports be shared? If a physical machine ran only a single k8s container, it would indeed be fully isolated, but that defeats the purpose of k8s, which is to isolate mutually interfering resources such as memory and disk on shared physical machines rather than dedicating one machine to one service.

In production, a single physical machine typically hosts more than a dozen services, so CPU cores and network ports are in fact shared by the services running in the k8s containers.

 

To summarize, a request in the current microservice environment passes through the following steps:

① Service x sends a request; the packet's network address (IP + port) is translated through the iptables NAT table, which assigns a source port.

② A TCP handshake packet is sent from that port, and the Linux conntrack module self-checks and confirms it (the packet passes through nf_conntrack_in() and nf_conntrack_confirm(), i.e. the create (new) and confirm stages).

③ The request reaches service mesh 1.

④ The mesh translates the packet's network address (IP + port) through the iptables NAT table, obtains a port, and establishes a TCP connection.

⑤ The mesh sends its TCP handshake packet, and again the Linux conntrack module self-checks and confirms it (nf_conntrack_in() / nf_conntrack_confirm(), the new and confirm stages).

⑥ The server's operating system receives the handshake packet, obtains a port through its iptables NAT table, and uses that port to establish the TCP connection with the client-side mesh.

⑦ Mesh 1 forwards the data to mesh 2.

⑧ Service mesh 2 receives the request data and forwards it to service y for processing.

The problem this time lies in steps ② and ⑤: when the obtained port is a duplicate, the packet is dropped outright during the Linux self-check confirmation, the handshake times out, and TCP initiates the handshake again.
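This also suggests a simple client-side way to spot the symptom. Linux's initial SYN retransmission timeout is about 1 second (doubling on each retry), so a handshake that normally completes in a few milliseconds but occasionally takes roughly 1 s or 3 s very likely lost its first packet. A rough probe, with a placeholder target address:

```java
import java.net.InetSocketAddress;
import java.net.Socket;

public class HandshakeProbe {
    public static void main(String[] args) throws Exception {
        for (int i = 0; i < 100; i++) {
            long start = System.nanoTime();
            try (Socket s = new Socket()) {
                s.connect(new InetSocketAddress("10.0.0.10", 8080), 5000);
            }
            long ms = (System.nanoTime() - start) / 1_000_000;
            // ~1000 ms or ~3000 ms outliers point at retransmitted SYNs.
            if (ms > 500) {
                System.out.println("slow handshake #" + i + ": " + ms + " ms");
            }
        }
    }
}
```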

3. Packet loss solution - operating system level

According to Alibaba Cloud, NAT ports on 4.19 kernels are allocated sequentially, so duplicates occur relatively often. The fix is to upgrade via yum to a 5.x or later kernel: 5.x allocates ports randomly, which makes duplicate ports much rarer.

This solution is probabilistic: the risk of duplicate-port packet loss still exists, just at a lower frequency. Alibaba Cloud has also made a further optimization: when a packet is dropped because of a duplicate port, TCP is notified immediately so it can retransmit at once, cutting the added delay to the millisecond level.
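As a toy illustration of why random allocation lowers the frequency without eliminating it, here is a birthday-problem style Monte Carlo; the port-range size and concurrency are assumed numbers, and this is not the kernel's actual algorithm. Sequentially allocated ports collide almost every time two connections race for the same "next" port, whereas random draws collide only occasionally:

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ThreadLocalRandom;

public class PortCollisionEstimate {
    public static void main(String[] args) {
        int range = 28_000;   // assumed ephemeral port range size
        int burst = 50;       // simultaneous connections per burst
        int trials = 100_000;
        int collisions = 0;
        for (int t = 0; t < trials; t++) {
            Set<Integer> used = new HashSet<>();
            for (int i = 0; i < burst; i++) {
                // Random allocation: each connection draws independently.
                if (!used.add(ThreadLocalRandom.current().nextInt(range))) {
                    collisions++;
                    break;
                }
            }
        }
        System.out.printf("bursts with a duplicate port (random): %.2f%%%n",
                100.0 * collisions / trials);
    }
}
```

With these assumed numbers the duplicate rate comes out around 4% per burst, versus a near-certain collision when two racing connections both read the same sequential counter.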

4. Packet loss solution - service level

Although the packet loss is an operating-system and network-level problem, there is still an optimization to make at the service level. Strictly speaking it is not a "solution" either; like the kernel upgrade, it only reduces the frequency of packet loss as much as possible.

According to the analysis in the previous article, the loss almost always hits the first handshake. So can we avoid creating new TCP connections and reuse existing ones more?
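The general idea of "handshake less, reuse more" looks like this in plain Java; a sketch assuming a Java 11+ java.net.http.HttpClient and a placeholder URL, since the article does not name the HTTP client in use. A single shared client keeps a connection pool, so repeated requests to the same host reuse the established TCP connection instead of re-handshaking:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ReuseDemo {
    // One shared client == one connection pool for the whole service.
    private static final HttpClient CLIENT = HttpClient.newHttpClient();

    public static void main(String[] args) throws Exception {
        HttpRequest req = HttpRequest
                .newBuilder(URI.create("http://10.0.0.10:8080/ping"))
                .build();
        for (int i = 0; i < 3; i++) {
            // Requests after the first reuse the pooled TCP connection.
            HttpResponse<String> resp =
                    CLIENT.send(req, HttpResponse.BodyHandlers.ofString());
            System.out.println(resp.statusCode());
        }
    }
}
```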

For our services there is indeed such an opportunity. The JSON deserialization uses fasterxml.jackson with its default EAGER_DESERIALIZER_FETCH behavior. When a response body is read this way, the trailing \r\n may be left unread; the next request that reuses the pooled connection then finds the leftover "dirty" \r\n, closes the connection, and opens a brand-new TCP connection.

Enabling FAIL_ON_TRAILING_TOKENS reduces the new TCP connections created in that situation. The side effect on Jackson itself: a string like "{json}xxxxx" that could previously be deserialized will now fail, but such malformed JSON does not appear in normal business scenarios.
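A minimal sketch of that configuration (the sample strings are illustrative; FAIL_ON_TRAILING_TOKENS is available since Jackson 2.9):

```java
import java.util.Map;

import com.fasterxml.jackson.databind.DeserializationFeature;
import com.fasterxml.jackson.databind.ObjectMapper;

public class JsonConfig {
    public static void main(String[] args) throws Exception {
        ObjectMapper mapper = new ObjectMapper()
                // Fail if tokens remain after the root JSON value, so the
                // parser reads through to the real end of the document.
                .enable(DeserializationFeature.FAIL_ON_TRAILING_TOKENS);

        // Well-formed JSON (trailing whitespace/CRLF included) still works:
        Map<?, ?> ok = mapper.readValue("{\"a\":1}\r\n", Map.class);
        System.out.println(ok);

        // Trailing garbage now raises MismatchedInputException instead of
        // being silently ignored:
        try {
            mapper.readValue("{\"a\":1}xxxxx", Map.class);
        } catch (Exception e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```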

Of course, this change does not target packet loss directly; its main purpose is to reuse TCP connections more. Its effect is therefore also probabilistic: new connections are unavoidable, and a connection will not be reused forever, so whether a burst of network packet loss happens to hit a reused connection or a fresh handshake comes down to chance.

5. Summary

In the end, the TCP handshake packet loss was traced to the Alibaba Cloud kernel readily picking duplicate NAT ports when establishing TCP connections; when the Linux self-check confirmation finds a port already in use, it drops the handshake packet.

The solution works the same way, by reducing frequency: upgrade the kernel (ports are allocated randomly instead of sequentially, and TCP is notified to retransmit immediately when a duplicate-port drop occurs), and configure the service's Jackson with FAIL_ON_TRAILING_TOKENS (fewer new TCP connections).


Source: blog.csdn.net/m0_69270256/article/details/126710697