Causes of and solutions to the TCP sticky-packet / split-packet problem

Table of contents

I. Causes of the sticky-packet/split-packet problem

1. How the sending end and the receiving end exchange data

(1) The TCP three-way handshake

(2) The TCP four-way wave

2. The structure of a frame of data

3. The Nagle algorithm

4. Why packets stick together or get split

II. Ways to solve the sticky-packet problem


I. Causes of the sticky-packet/split-packet problem

        Before looking at what causes sticky packets and split packets, we need a basic understanding of how TCP/IP transmits data.

1. How the sending end and the receiving end exchange data

(1) The TCP three-way handshake

        The three-way handshake takes place while a connection is being established. It is initiated by the client's connect call and is answered by the server's kernel once the server socket has been put into the listening state with listen; accept then returns the already-established connection.

        The three-way handshake lets each side confirm that the other can both send and receive data. It is also the step in which the two sides synchronize their initial sequence numbers.
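
        To make the roles of listen, connect and accept concrete, here is a minimal sketch that builds a connected TCP pair over the loopback address inside a single process. The helper name loopback_pair is hypothetical and not part of the article's example; it assumes a POSIX system where 127.0.0.1 is reachable.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: builds a connected TCP pair over 127.0.0.1.
 * Returns 0 and stores both descriptors on success, -1 on failure. */
int loopback_pair(int *client_fd, int *server_fd)
{
    assert(client_fd != NULL && server_fd != NULL);

    int listen_fd = socket(AF_INET, SOCK_STREAM, 0);
    if (listen_fd == -1)
        return -1;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0; /* let the kernel pick a free port */
    socklen_t len = sizeof(addr);

    /* After listen() the kernel answers incoming SYNs on this socket. */
    if (bind(listen_fd, (struct sockaddr *)&addr, len) == -1 ||
        listen(listen_fd, 1) == -1 ||
        getsockname(listen_fd, (struct sockaddr *)&addr, &len) == -1) {
        close(listen_fd);
        return -1;
    }

    /* connect() sends the SYN and returns once the handshake has completed
     * from the client's point of view. */
    int c = socket(AF_INET, SOCK_STREAM, 0);
    if (c == -1 || connect(c, (struct sockaddr *)&addr, sizeof(addr)) == -1) {
        if (c != -1)
            close(c);
        close(listen_fd);
        return -1;
    }

    /* accept() only takes an already-established connection off the queue. */
    int s = accept(listen_fd, NULL, NULL);
    close(listen_fd);
    if (s == -1) {
        close(c);
        return -1;
    }

    *client_fd = c;
    *server_fd = s;
    return 0;
}
```

        Note that connect returns before accept is ever called: the handshake is carried out by the kernel, which is why a single thread can play both roles here.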

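(2) The TCP four-way wave

        The four-way wave takes place while a connection is being closed and is initiated by whichever side closes first (either the server or the client can do so).

        Take a close initiated by the client as an example: the client sends a FIN, the server acknowledges it with an ACK, the server sends its own FIN once it has finished sending data, and the client acknowledges that FIN with a final ACK.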
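
        The close sequence can be observed from user space. The sketch below is only an illustration under the assumption of a POSIX system with a working loopback interface; tcp_pair and half_close_demo are hypothetical names. It uses shutdown(SHUT_WR) to send the client's FIN while keeping the other direction open.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: connected TCP pair over loopback.
 * fds[0] is the client side, fds[1] the server side. */
static int tcp_pair(int fds[2])
{
    assert(fds != NULL);
    struct sockaddr_in a;
    socklen_t len = sizeof(a);
    int l = socket(AF_INET, SOCK_STREAM, 0);
    if (l == -1)
        return -1;
    memset(&a, 0, sizeof(a));
    a.sin_family = AF_INET;
    a.sin_addr.s_addr = htonl(INADDR_LOOPBACK); /* port 0: kernel picks one */
    if (bind(l, (struct sockaddr *)&a, len) == -1 || listen(l, 1) == -1 ||
        getsockname(l, (struct sockaddr *)&a, &len) == -1) {
        close(l);
        return -1;
    }
    fds[0] = socket(AF_INET, SOCK_STREAM, 0);
    if (fds[0] == -1 || connect(fds[0], (struct sockaddr *)&a, sizeof(a)) == -1) {
        if (fds[0] != -1)
            close(fds[0]);
        close(l);
        return -1;
    }
    fds[1] = accept(l, NULL, NULL);
    close(l);
    if (fds[1] == -1) {
        close(fds[0]);
        return -1;
    }
    return 0;
}

/* Returns 0 if the close sequence behaved as described, -1 otherwise. */
int half_close_demo(void)
{
    int fds[2];
    char buf[8];
    int ok = 0;
    if (tcp_pair(fds) == -1)
        return -1;
    int client = fds[0], server = fds[1];

    /* Waves 1 and 2: the client's FIN goes out and the server's kernel ACKs it. */
    if (shutdown(client, SHUT_WR) == 0 &&
        /* The server sees the FIN as end-of-stream: recv() returns 0. */
        recv(server, buf, sizeof(buf), 0) == 0 &&
        /* The connection is only half closed: server -> client still works. */
        send(server, "bye", 3, 0) == 3 &&
        recv(client, buf, sizeof(buf), 0) == 3 &&
        memcmp(buf, "bye", 3) == 0) {
        /* Waves 3 and 4: the server's FIN, ACKed by the client's kernel. */
        close(server);
        server = -1;
        ok = (recv(client, buf, sizeof(buf), 0) == 0);
    }
    if (server != -1)
        close(server);
    close(client);
    return ok ? 0 : -1;
}
```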

2. The structure of a frame of data

        MTU (Maximum Transmission Unit) is the largest amount of data the link layer can carry in a single frame.

        MSS (Maximum Segment Size) is the largest amount of application data that TCP places in a single segment at the transport layer. It is normally the MTU minus the IP and TCP headers (for example 1500 - 20 - 20 = 1460 bytes on Ethernet with IPv4 and no options), so that one segment fits into one frame without fragmentation.
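
        The relationship between MTU and MSS is simple arithmetic. The following sketch spells it out; the function names are hypothetical, and the header sizes are passed in as parameters because IP and TCP options can make the headers larger than the usual 20 bytes each.

```c
#include <assert.h>

/* Largest TCP payload that fits in one link-layer frame. */
int mss_from_mtu(int mtu, int ip_header_len, int tcp_header_len)
{
    assert(mtu > ip_header_len + tcp_header_len);
    return mtu - ip_header_len - tcp_header_len;
}

/* Number of segments needed to carry payload_len bytes (ceiling division). */
int segments_needed(int payload_len, int mss)
{
    assert(payload_len >= 0 && mss > 0);
    return (payload_len + mss - 1) / mss;
}
```

        With an Ethernet MTU of 1500 and 20-byte headers, mss_from_mtu gives 1460, and a 4000-byte write needs three segments.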

3. The Nagle algorithm

        The Nagle algorithm reduces the number of small packets on the network and was originally proposed in the context of congestion control. From the description above we know that in TCP/IP every piece of data that is sent gets protocol headers added in front of it, and the receiver has to send an ACK to confirm it. To make good use of network bandwidth, TCP prefers to send reasonably large segments, ideally of MSS size, which is why the MSS is negotiated at the transport layer. The Nagle algorithm tries to send data in large blocks and to avoid sending many small blocks, whose header and acknowledgment overhead would waste resources.
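
        The sender's decision can be modeled in a few lines. This is a simplified sketch of the commonly described rule, not kernel code, and the function name nagle_may_send_now is hypothetical: a full segment is always sent, while a small one is sent only when no earlier data is still waiting for its ACK.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified model of the Nagle decision.
 * queued:  bytes waiting in the send buffer
 * mss:     maximum segment size of the connection
 * unacked: bytes already sent but not yet acknowledged */
bool nagle_may_send_now(size_t queued, size_t mss, size_t unacked)
{
    assert(mss > 0);
    if (queued == 0)
        return false;    /* nothing to send */
    if (queued >= mss)
        return true;     /* a full segment is always worth sending */
    return unacked == 0; /* a small segment goes out only if nothing is in flight */
}
```

        In this model a 10-byte write on an idle connection is sent at once, but a second 10-byte write made before the ACK arrives stays in the buffer and is merged with whatever is written next.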

4. Why packets stick together or get split

        TCP is a connection-oriented transport-layer protocol that transfers data as a byte stream. It has no notion of messages or application-level packets, so the application-layer protocol has to define its own message boundaries. Most everyday network programming is done on top of the transport layer, which is why sticky and split packets are mainly discussed for TCP (UDP, by contrast, preserves datagram boundaries). Splitting and reassembly also happen at the network and link layers, for example IP fragmentation, but those layers handle it transparently to the application.

        In general, the Nagle algorithm affects how much data goes out in each segment. When we send small pieces of data several times in a row, the first piece is sent immediately. While its ACK is outstanding, the following small pieces are held in the send buffer and merged (unless they are urgent data), and they go out together once the ACK arrives or once close to an MSS of data has accumulated. Several application messages then travel in one segment and are read by the receiver as one block: this is the sticky-packet phenomenon. The same effect appears on the receiving side when several segments pile up in the receive buffer before the application calls recv. The other situation is that the sender writes more data than fits in one MSS. TCP then splits the data into segments of at most MSS bytes, sends what it can and keeps the rest in the buffer, so the receiver may get one message in several pieces. This phenomenon is called packet splitting (unpacking).

II. Ways to solve the sticky-packet problem

Method 1:

        The sender puts a header at the start of each packet that states the size of the packet, so the receiver can read the header first and then knows exactly how many bytes belong to that packet.
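
        A minimal in-memory sketch of Method 1 follows. It is not taken from the article's example: the 4-byte big-endian length header, the HEADER_LEN constant and the names frame_encode and frame_decode are all assumptions made for illustration. The decoder works on whatever bytes have accumulated so far, so it copes with both merged and split messages.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define HEADER_LEN 4 /* assumed header: payload length, network byte order */

/* Writes header + payload into out. Returns bytes written, or 0 if out is too small. */
size_t frame_encode(const void *payload, uint32_t len, uint8_t *out, size_t out_cap)
{
    assert(out != NULL && (payload != NULL || len == 0));
    if (out_cap < HEADER_LEN || out_cap - HEADER_LEN < len)
        return 0;
    uint32_t be = htonl(len);
    memcpy(out, &be, HEADER_LEN);
    if (len > 0)
        memcpy(out + HEADER_LEN, payload, len);
    return HEADER_LEN + (size_t)len;
}

/* Looks for one complete frame at the start of buf[0..buf_len).
 * Returns the bytes consumed and sets *payload and *len, or returns 0
 * when more bytes must be received first (a split message). */
size_t frame_decode(const uint8_t *buf, size_t buf_len,
                    const uint8_t **payload, uint32_t *len)
{
    assert(buf != NULL && payload != NULL && len != NULL);
    if (buf_len < HEADER_LEN)
        return 0;
    uint32_t be;
    memcpy(&be, buf, HEADER_LEN);
    uint32_t n = ntohl(be);
    if (buf_len - HEADER_LEN < n)
        return 0;
    *payload = buf + HEADER_LEN;
    *len = n;
    return HEADER_LEN + (size_t)n;
}
```

        A receiver would append every recv result to a buffer and call frame_decode in a loop until it returns 0. Production code would normally also reject lengths above some agreed maximum so that a corrupt header cannot make the receiver wait for gigabytes of data.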

Method 2:

        The sender places a special delimiter at the end of each message, and the receiver finds the message boundaries by searching the received data for that delimiter. The delimiter must not occur inside the message itself, or it has to be escaped.
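
        As a rough illustration of Method 2, the sketch below scans a receive buffer for a one-byte delimiter. The function name next_delimited and the choice of '\n' in the usage note are assumptions, not something the article prescribes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Finds the first complete message in buf[0..buf_len), where every message
 * ends with delim. Returns the bytes consumed (message + delimiter) and sets
 * *msg_len to the message length without the delimiter. Returns 0 if no
 * delimiter has arrived yet, meaning more data must be received. */
size_t next_delimited(const char *buf, size_t buf_len, char delim, size_t *msg_len)
{
    assert(buf != NULL && msg_len != NULL);
    const char *end = memchr(buf, delim, buf_len);
    if (end == NULL)
        return 0;
    *msg_len = (size_t)(end - buf);
    return *msg_len + 1;
}
```

        Given the accumulated bytes "GET a\nGET b\nGE" and the delimiter '\n', two calls yield "GET a" and "GET b", and the third returns 0 because "GE" is the start of a message whose end has not arrived yet.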

Method 3:

        The sender wraps every message in a packet of fixed length (unused space is filled with 0), so the receiver can separate the messages by always reading exactly that many bytes from the receive buffer.

        Below is a small example I wrote to illustrate this approach.

       (This example lets the client download files from the directory in which the server is running.)

Server side code: 

#include <arpa/inet.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>
// Print the error message together with the file, function and line where it occurred
#define PRINT_ERR(msg)                                      \
    do {                                                    \
        printf("%s %s:%d\n", __FILE__, __func__, __LINE__); \
        perror(msg);                                        \
        return -1;                                          \
    } while (0)

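// Fixed-size packet exchanged between server and client
// (count is sent in host byte order, so both ends must agree on endianness and int size)
typedef struct MSG {
    int count;      // number of bytes of user data carried in this packet
    char txt[128];  // file name, status string, or content read from the file
} msg_t;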

int main(int argc, const char* argv[])
{
    // Validate the command-line arguments
    if (3 != argc) {
        printf("Usage : %s <ip> <port>\n", argv[0]);
        exit(-1);
    }

    // 1. Create the socket
    //    IPv4 (AF_INET), stream socket (SOCK_STREAM), default protocol (0)
    int sockfd = socket(AF_INET, SOCK_STREAM, 0);
    if (-1 == sockfd) {
        PRINT_ERR("socket error");
    }

    // 2. Fill in the server address structure
    struct sockaddr_in serveraddr;
    memset(&serveraddr, 0, sizeof(serveraddr));
    // AF_INET selects IPv4
    serveraddr.sin_family = AF_INET;
    // Port: convert the command-line string to an integer, then from host to network byte order
    serveraddr.sin_port = htons(atoi(argv[2]));
    // IP address: inet_addr converts the dotted-decimal string to a 32-bit address in network byte order
    serveraddr.sin_addr.s_addr = inet_addr(argv[1]);
    // bind() takes the size of the address structure as its last argument
    socklen_t serveraddr_len = sizeof(serveraddr);

    // 3. Bind the socket to the server address
    //    The cast to (struct sockaddr*) matches bind()'s parameter type and avoids a compiler warning
    if (-1 == bind(sockfd, (struct sockaddr*)&serveraddr, serveraddr_len)) {
        PRINT_ERR("bind error");
    }

    // 4. Put the socket into the passive listening state
    //    The second argument is the backlog: how many pending connections may queue up for accept()
    if (-1 == listen(sockfd, 5)) {
        PRINT_ERR("listen error");
    }

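    // Structure that receives the client's address
    struct sockaddr_in clientaddr;
    memset(&clientaddr, 0, sizeof(clientaddr));
    socklen_t clientaddr_len = sizeof(clientaddr);

    int nbytes = 0;         // number of bytes returned by recv() or read()
    msg_t buff;             // packet used for all transfers with the client
    int acceptfd = 0;       // descriptor of the connection to the current client
    int fd = 0;             // descriptor of the file opened with file I/O
    char file[128] = { 0 }; // file name sent by the client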
    // The server loops forever, waiting for clients
    while (1) {
        printf("Waiting for a client to connect..\n");
        // 5. Block until a client connects
        clientaddr_len = sizeof(clientaddr); // accept() overwrites the length, so reset it each time
        if (-1 == (acceptfd = accept(sockfd, (struct sockaddr*)&clientaddr, &clientaddr_len))) {
            PRINT_ERR("accept error");
        }
        printf("Client [%s:%d] connected..\n", inet_ntoa(clientaddr.sin_addr), ntohs(clientaddr.sin_port));
        // Serve requests: read a file name, then send the file back in fixed-size packets
        while (1) {
            // Receive one whole packet; MSG_WAITALL keeps recv() from returning a partial packet
            memset(&buff, 0, sizeof(buff));
            if (-1 == (nbytes = recv(acceptfd, &buff, sizeof(buff), MSG_WAITALL))) {
                PRINT_ERR("recv error");
            } else if (nbytes < (int)sizeof(buff)) { // 0 or a short read: the client closed the connection (crash, power loss, signal, ...)
                printf("Client [%s:%d] disconnected..\n", inet_ntoa(clientaddr.sin_addr), ntohs(clientaddr.sin_port));
                break;
            }
            buff.txt[sizeof(buff.txt) - 1] = '\0'; // make sure the received string is terminated
            if (!strcmp(buff.txt, "quit")) { // a "quit" packet means the client is leaving on purpose
                printf("Client [%s:%d] quit..\n", inet_ntoa(clientaddr.sin_addr), ntohs(clientaddr.sin_port));
                break;
            }

            strcpy(file, buff.txt); // remember the requested file name
            printf("Requested file name: [%s]\n", buff.txt);
            memset(&buff, 0, sizeof(buff));

            if (-1 == (fd = open(file, O_RDONLY))) { // open the file in the current directory read-only
                strcpy(buff.txt, "***NOT_EXIST***"); // the file cannot be opened: reply with a status packet
                if (-1 == send(acceptfd, &buff, sizeof(buff), 0))
                    PRINT_ERR("send error");
            } else {
                strcpy(buff.txt, "***EXIST***");
                if (-1 == send(acceptfd, &buff, sizeof(buff), 0))
                    PRINT_ERR("send error");
                memset(&buff, 0, sizeof(buff));
                // Read the file chunk by chunk and send each chunk to the client
                while (0 < (nbytes = read(fd, buff.txt, sizeof(buff.txt)))) {
                    buff.count = nbytes; // number of bytes actually read from the file
                    // Always send a packet of the same fixed size to avoid the sticky-packet problem
                    if (-1 == send(acceptfd, &buff, sizeof(buff), 0))
                        PRINT_ERR("send error");
                    memset(&buff, 0, sizeof(buff));
                }
                // A packet with count == 0 tells the client that the file is complete
                buff.count = 0;
                if (-1 == send(acceptfd, &buff, sizeof(buff), 0))
                    PRINT_ERR("send error");
                close(fd);
            }
        }
        // Close the connection to this client
        close(acceptfd);
    }
    close(sockfd);

    return 0;
}

Client code:

#include <arpa/inet.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>
// Print the error message together with the file, function and line where it occurred
#define PRINT_ERR(msg)                                      \
    do {                                                    \
        printf("%s %s:%d\n", __FILE__, __func__, __LINE__); \
        perror(msg);                                        \
        return -1;                                          \
    } while (0)

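// Fixed-size packet exchanged between client and server (must match the server's definition)
typedef struct MSG {
    int count;      // number of bytes of user data carried in this packet
    char txt[128];  // file name, status string, or file content
} msg_t;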

int main(int argc, const char* argv[])
{
    // Validate the command-line arguments
    if (3 != argc) {
        printf("Usage : %s <ip> <port>\n", argv[0]);
        exit(-1);
    }
    // 1. Create the socket
    int sockfd = socket(AF_INET, SOCK_STREAM, 0);
    if (-1 == sockfd) {
        PRINT_ERR("socket error");
    }

    // 2. Fill in the server address structure
    struct sockaddr_in serveraddr;
    memset(&serveraddr, 0, sizeof(serveraddr));
    serveraddr.sin_family = AF_INET;
    serveraddr.sin_port = htons(atoi(argv[2]));
    serveraddr.sin_addr.s_addr = inet_addr(argv[1]);
    socklen_t serveraddr_len = sizeof(serveraddr);

    // 3. Connect to the server
    if (-1 == connect(sockfd, (struct sockaddr*)&serveraddr, serveraddr_len)) {
        PRINT_ERR("connect error");
    }
    printf("Connected to the server..\n");

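    // Exchange data
    msg_t buff;
    char file[128] = { 0 };
    int fd, ret = 0;
    while (1) {
        memset(&buff, 0, sizeof(buff));
        memset(file, 0, sizeof(file));
        printf("Enter a file name: ");
        if (NULL == fgets(buff.txt, sizeof(buff.txt), stdin)) // read the file name from the terminal
            break; // end of input
        buff.txt[strcspn(buff.txt, "\n")] = '\0'; // strip the trailing newline
        if (!strcmp(buff.txt, "quit")) { // typing quit ends the client
            if (-1 == send(sockfd, &buff, sizeof(buff), 0)) // tell the server we are leaving
                PRINT_ERR("send error");
            break; // leave the loop and end the process
        }
        // Send the file name
        if (-1 == send(sockfd, &buff, sizeof(buff), 0))
            PRINT_ERR("send error");
        strcpy(file, buff.txt);
        printf("Requested file [%s]\n", file);
        // Receive the reply; MSG_WAITALL waits for the whole fixed-size packet
        memset(&buff, 0, sizeof(buff));
        if (-1 == (ret = recv(sockfd, &buff, sizeof(buff), MSG_WAITALL)))
            PRINT_ERR("recv error");
        if (ret < (int)sizeof(buff)) {
            printf("The server closed the connection\n");
            break;
        }
        if (!strcmp(buff.txt, "***EXIST***")) {
            if (-1 == (fd = open(file, O_WRONLY | O_CREAT | O_TRUNC, 0666)))
                PRINT_ERR("open error");
            while (1) {
                memset(&buff, 0, sizeof(buff));
                // Receive the file content packet by packet
                if (-1 == (ret = recv(sockfd, &buff, sizeof(buff), MSG_WAITALL)))
                    PRINT_ERR("recv error");
                if (ret < (int)sizeof(buff) || buff.count < 0 || buff.count > (int)sizeof(buff.txt)) {
                    printf("Transfer aborted: incomplete or invalid packet\n");
                    break;
                }
                if (0 == buff.count) { // a packet with count == 0 marks the end of the file
                    printf("File transfer complete\n");
                    break; // leave the receive loop
                }
                // Write exactly count bytes; strlen() would truncate binary data
                if (-1 == write(fd, buff.txt, buff.count))
                    PRINT_ERR("write error");
            }
            close(fd);
        } else if (!strcmp(buff.txt, "***NOT_EXIST***")) { // the file does not exist: ask for another name
            printf("The file does not exist, please enter another name\n");
        }
    }

    // Close the socket
    close(sockfd);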

    return 0;
}

Note: Because every packet is padded to the full fixed length, this method wastes bandwidth and buffer space under high concurrency or heavy traffic.
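
        Fixed-length framing only works if the receiver really consumes one whole packet per step, and a single read or recv on a stream may return fewer bytes than requested. One common way to guarantee this is a loop like the following sketch. The name read_full is hypothetical; it uses read, which on POSIX systems works on sockets as well as on pipes and files.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Reads exactly len bytes from fd unless the stream ends or an error occurs.
 * Returns len on success, a smaller count if the peer closed early, -1 on error. */
ssize_t read_full(int fd, void *buf, size_t len)
{
    assert(buf != NULL);
    size_t got = 0;
    while (got < len) {
        ssize_t n = read(fd, (char *)buf + got, len - got);
        if (n == 0)
            break;        /* end of stream */
        if (n == -1) {
            if (errno == EINTR)
                continue; /* interrupted by a signal: retry */
            return -1;
        }
        got += (size_t)n;
    }
    return (ssize_t)got;
}
```

        A receiver would call read_full(sockfd, &buff, sizeof(buff)) and treat any return value other than sizeof(buff) as a closed or broken connection.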

Method 4:

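        Turn off the Nagle algorithm by setting the TCP_NODELAY option on the socket with setsockopt. This only stops the sender from merging small writes. TCP is still a byte stream and the receiver can still read several messages in one recv call, so this method is best combined with one of the framing methods above.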

    // TCP_NODELAY is declared in <netinet/tcp.h>
    int flag = 1;
    if (-1 == setsockopt(sockfd, IPPROTO_TCP, TCP_NODELAY, &flag, sizeof(flag))) {
        perror("setsockopt error");
    }
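
        To check that the option really took effect, it can be read back with getsockopt. The helpers below are a small sketch with hypothetical names (set_nodelay, get_nodelay), assuming a POSIX system that provides <netinet/tcp.h>.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h> /* TCP_NODELAY */
#include <sys/socket.h>

/* Turns the Nagle algorithm off (enable = 1) or back on (enable = 0).
 * Returns 0 on success, -1 on error. */
int set_nodelay(int sockfd, int enable)
{
    assert(enable == 0 || enable == 1);
    return setsockopt(sockfd, IPPROTO_TCP, TCP_NODELAY, &enable, sizeof(enable));
}

/* Returns 1 if TCP_NODELAY is set, 0 if it is not, -1 on error. */
int get_nodelay(int sockfd)
{
    int value = 0;
    socklen_t len = sizeof(value);
    if (getsockopt(sockfd, IPPROTO_TCP, TCP_NODELAY, &value, &len) == -1)
        return -1;
    return value != 0;
}
```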

Origin blog.csdn.net/Little_Star0/article/details/129298110