TCP: calling recv() without checking the length of the receive buffer returns 0, which is misread as a disconnected link

    When calling recv(fd,buf,len,0), the return value must be checked. A return value of 0 normally means that the peer has closed the link. However, if len is 0, recv() also returns 0, and concluding that the link is disconnected is then plainly wrong. While testing a new feature recently, I ran into exactly this old bug. Here is how I tracked it down:
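The ambiguity is easy to reproduce. The sketch below is not the original code: it uses a connected UNIX-domain stream socket pair from socketpair() as a stand-in for the TCP link between A and B, and the helper name recv_with_len is made up for illustration. A few bytes are queued first so that recv() has data waiting and does not block.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: queue 5 bytes on a connected stream socket pair,
 * then call recv() on the other end with the given len.
 * Returns recv()'s return value, or -2 if the setup itself failed.
 * Both ends stay open until after recv() has returned.
 */
ssize_t recv_with_len(size_t len)
{
    int sv[2];
    char buf[16];
    ssize_t n;

    if (len > sizeof(buf))
        len = sizeof(buf);              /* never read past the local buffer */
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -2;
    if (send(sv[0], "hello", 5, 0) != 5) {
        close(sv[0]);
        close(sv[1]);
        return -2;
    }

    n = recv(sv[1], buf, len, 0);       /* the link is still up at this point */

    close(sv[0]);
    close(sv[1]);
    return n;
}
```

On Linux, assert(recv_with_len(0) == 0) should hold even though neither end has been closed, which is exactly the case that gets misread as a disconnect.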

    The test scenario for the new feature was this: A frantically sends asynchronous messages to B, and B calls my interface. Symptom: B very quickly saw recv() return 0, judged that the link was disconnected, and tried to reconnect to A. I had tested high-frequency asynchronous messages before without any problem, but never at this "crazy" rate. That led me to believe, wrongly, that my code was fine and could handle high-frequency asynchronous messages. In reality, the "high frequency" I had tested was simply not high enough to expose the problem.

    Troubleshooting: After B dropped the link, the code logic on both A and B concluded that the other side had actively disconnected, so I fell back on the all-purpose tool, packet capture. Fortunately, A and B were not on the same machine; if they had been, their traffic would not have gone through the network card and I would not have been able to capture it there. The capture showed B sending an RST packet to A. An RST is sent in this situation: a socket is closed while its receive buffer still holds unread data, so instead of a normal close the peer is told that the connection was aborted. In other words, B's receive buffer still contained data, yet B called close(fd), and A then received the RST sent by B. So the link was actively closed by B. The question is, why would B close the link on its own initiative? Evidently B first decides that the other side has disconnected, and then closes the link itself.
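The RST behavior described above can be observed without a packet capture. The following is a minimal sketch, assuming Linux TCP behavior on the loopback interface; the function name and the roles "a" and "b" are illustrative and not taken from the original code. B closes its socket while A's bytes are still unread, and A's next recv() reports the reset.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical demo: "a" plays the sender A, "b" plays the receiver B.
 * Returns the errno that A sees from recv() after B has closed with unread
 * data, 0 if A saw an orderly shutdown instead, or -1 if the setup failed.
 */
int errno_after_peer_closes_with_unread_data(void)
{
    struct sockaddr_in addr;
    socklen_t alen = sizeof(addr);
    struct pollfd pfd;
    char buf[16];
    ssize_t n;
    int result = -1;
    int b = -1;
    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    int a = socket(AF_INET, SOCK_STREAM, 0);

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                      /* let the kernel pick a free port */

    if (lfd < 0 || a < 0)
        goto out;
    if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) != 0 ||
        listen(lfd, 1) != 0 ||
        getsockname(lfd, (struct sockaddr *)&addr, &alen) != 0 ||
        connect(a, (struct sockaddr *)&addr, sizeof(addr)) != 0)
        goto out;
    b = accept(lfd, NULL, NULL);
    if (b < 0 || send(a, "data", 4, 0) != 4)
        goto out;

    /* Wait until A's bytes have actually arrived in B's receive buffer. */
    pfd.fd = b;
    pfd.events = POLLIN;
    if (poll(&pfd, 1, 1000) != 1)
        goto out;

    close(b);                               /* closed with unread data */
    b = -1;

    /* Wait for A to notice, then read the outcome. */
    pfd.fd = a;
    pfd.events = POLLIN;
    if (poll(&pfd, 1, 1000) < 1)
        goto out;
    n = recv(a, buf, sizeof(buf), 0);
    result = (n < 0) ? errno : 0;

out:
    if (b >= 0) close(b);
    if (a >= 0) close(a);
    if (lfd >= 0) close(lfd);
    return result;
}
```

Under these assumptions the function is expected to return ECONNRESET. If B had read all pending data before calling close(), A would see an orderly shutdown (recv() returning 0) instead.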

    Since A was sending asynchronous messages frantically, could B's TCP receive buffer have been too small to take in the flood of data from A? I adjusted the system's TCP parameters, but the problem remained. On reflection, the size of the TCP receive buffer does not decide whether data can be received, because TCP has flow control built on the sliding window mechanism. If B cannot keep up, the advertised window shrinks and A sends more slowly. B's TCP buffer will not overflow and cause the link to be disconnected.

    Since the system was not at fault, the code had to be. But after reading the code several times, I still could not see anything wrong. That is normal; it is hard to spot problems in your own code. So I checked the possibilities one at a time.

First, could the cause be something other than the network, such as threads or locks? That seemed unlikely, since no other thread actively closes the link, but I tested it anyway: I commented out the recv() call so that B stopped receiving data. As expected, the link stayed up. Because B received nothing, it never judged the link to be disconnected, so it never destroyed its resources and never closed the link.

So the problem was still with recv(). Thinking further: since both A and B believed they had not actively closed the link, could the link actually have been healthy and never disconnected at all? I added a special case to B's logic: when B judges the link to be abnormal, it ignores the condition and keeps receiving. Sure enough, B continued to receive data. The link was fine, and B had wrongly judged it to be disconnected!

This result puzzled me even more. The documentation clearly says that a return value of 0 from recv() means the link is disconnected, so why was this happening to me? How was I supposed to tell whether the link was up or down? I asked a colleague, and one question from him stopped me in my tracks: if len in recv(fd,buf,len,0) is 0, what is the return value? That really was possible! I changed the code to print len before each recv() call. Sure enough, whenever len was 0, recv() also returned 0. The cause was found: when calling recv(), len must not be 0.
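One way to make the distinction explicit is to interpret the return value together with the len that was passed in. This is a minimal sketch; the enum and the function name classify_recv are invented here and are not part of the original code.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

enum recv_outcome {
    RECV_GOT_DATA,     /* n > 0: n bytes were copied into buf              */
    RECV_PEER_CLOSED,  /* n == 0 and len > 0: the peer closed the link     */
    RECV_ZERO_LEN,     /* n == 0 and len == 0: says nothing about the link */
    RECV_FAILED        /* n < 0: inspect errno                             */
};

/* Classify the result of recv(fd, buf, len, 0), where n is its return value. */
enum recv_outcome classify_recv(size_t len, ssize_t n)
{
    if (n > 0)
        return RECV_GOT_DATA;
    if (n < 0)
        return RECV_FAILED;
    return (len == 0) ? RECV_ZERO_LEN : RECV_PEER_CLOSED;
}
```

For example, classify_recv(0, 0) yields RECV_ZERO_LEN, while classify_recv(8, 0) yields RECV_PEER_CLOSED; only the latter should trigger a reconnect.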

    Postscript: I had in fact checked whether len was 0 before calling recv(). But as the code evolved, other logic was inserted between the check and the call, and that logic modified len so that it could become 0. I did not remember this; I only remembered that I had already checked len. I need to be more careful when writing code. A check like this should sit right next to the call it protects, so that I cannot change the value in between, forget about it, and still believe the case has been handled. Also, in network programming, test abnormal conditions more thoroughly, such as crazy sending rates.
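As an illustration of keeping the check next to the call, here is a sketch under assumed names (struct rx_buf, rx_fill, RX_CAP and RX_FULL are all hypothetical and not from the original code). len is derived from the remaining buffer space immediately before recv(), and a full buffer is reported with its own sentinel instead of reaching recv() with len equal to 0.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

#define RX_CAP  8      /* deliberately tiny so the buffer fills quickly     */
#define RX_FULL (-2)   /* hypothetical sentinel: no room, recv() not called */

struct rx_buf {
    char   data[RX_CAP];
    size_t used;       /* bytes already stored in data */
};

/*
 * Receive into the free space of rb.
 * Returns the recv() result, or RX_FULL when there is no free space left.
 * A return value of 0 therefore always means that the peer closed the link.
 */
ssize_t rx_fill(int fd, struct rx_buf *rb)
{
    size_t len;
    ssize_t n;

    assert(rb->used <= RX_CAP);
    len = RX_CAP - rb->used;   /* computed here, directly before the call */
    if (len == 0)
        return RX_FULL;        /* caller must drain the buffer first */

    n = recv(fd, rb->data + rb->used, len, 0);
    if (n > 0)
        rb->used += (size_t)n;
    return n;
}
```

Because the guard and the call live in the same small function, later changes elsewhere cannot slip in between them and turn len into 0 unnoticed.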

 
 

Origin http://43.154.161.224:23101/article/api/json?id=325345655&siteId=291194637