TCP custom protocol, serialization and deserialization

Table of contents

Foreword

1. Understand the protocol

2. Network version calculator

2.1 Design ideas

2.2 Interface design

2.3 Code implementation:

2.4 Compilation test

Summary


Foreword

        In the previous article we said that TCP is byte-stream-oriented, but that idea may still feel abstract. In this article we make it concrete in code: we define a custom application-layer protocol and implement serialization and deserialization on top of it. After reading, you should have a clear picture of what "byte-stream-oriented" means for TCP. Let's take a look together.

1. Understand the protocol

        Earlier we introduced protocols informally: in networking, a protocol is an agreement between the communicating parties. Today we look at how that agreement shows up when data is actually transmitted over the network.

From the TCP server we wrote earlier, we already know the socket API. When it reads and writes data, everything is sent and received as "strings" of bytes. What if we want to transmit "structured data"?

What is structured data?

A simple example: when you send a message on WeChat, what goes over the network is not only the message text but also the nickname, the time, the avatar, and so on. Together these fields form structured data.

The process of packing structured data into a string is called serialization, and the process of turning that string back into structured data is called deserialization.
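To make the two terms concrete, here is a minimal sketch of serializing and deserializing a chat message. The ChatMessage struct, its field names, and the '|' delimiter are all assumptions made up for illustration; they are not WeChat's real format, and the sketch assumes the nickname and time never contain '|'.

```cpp
#include <string>

// Hypothetical structured message; the fields are illustrative only.
struct ChatMessage
{
    std::string nickname;
    std::string time;
    std::string text;
};

// Assumed field delimiter; nickname and time must not contain it.
const char kFieldSep = '|';

// Serialization: structured data -> "nickname|time|text"
std::string serialize(const ChatMessage &m)
{
    return m.nickname + kFieldSep + m.time + kFieldSep + m.text;
}

// Deserialization: "nickname|time|text" -> structured data
bool deserialize(const std::string &in, ChatMessage *out)
{
    auto first = in.find(kFieldSep);
    if (first == std::string::npos)
        return false;
    auto second = in.find(kFieldSep, first + 1);
    if (second == std::string::npos)
        return false;
    out->nickname = in.substr(0, first);
    out->time = in.substr(first + 1, second - first - 1);
    out->text = in.substr(second + 1); // the rest of the string is the text
    return true;
}
```

The sender calls serialize before writing to the socket and the receiver calls deserialize after reading, so both sides must agree on the field order and the delimiter. That agreement is exactly what a protocol is.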


The process of TCP sending and receiving data


As programmers, we define a buffer at the application layer and call send to send the data. The call does not put the data directly on the network; it copies the data into the send buffer that the operating system maintains at the transport layer. Likewise, read copies data from the transport layer's receive buffer up to the application layer. Once the data has been copied to the transport layer, when and how it is actually sent is decided by the operating system, which is why TCP is called the Transmission Control Protocol. And because each end of a connection has both a send buffer and a receive buffer, the client can send to the server while the server sends to the client, so TCP is full-duplex.

Now that we understand how TCP sends and receives data, consider what byte-stream orientation implies. When the client sends data to the server, could the server's receive buffer hold less than one complete message? Could the upper layer fall behind, so that several messages pile up in the receive buffer? Both can happen. How, then, do we extract exactly one complete message?

To solve these problems we need a custom protocol that specifies how large a complete message is and where one message ends and the next begins. Only then can the receiver reliably extract one correct message.

There are generally three strategies adopted:

1. Fixed length

2. Special symbols

3. Self-describing way

Next, we build a custom protocol in code; the one used below combines the special-symbol and self-describing approaches.
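Before the real implementation, the three strategies can be compared side by side on the sending end. This is only a sketch: the function names, the 16-byte fixed size, and the choice of "\n" as the special symbol are assumptions for illustration and are not part of the calculator's protocol.

```cpp
#include <cstddef>
#include <string>

// 1. Fixed length: every message occupies exactly kFixedLen bytes,
//    padded with '\0'. The receiver simply reads kFixedLen bytes at a time.
const std::size_t kFixedLen = 16; // assumed size, for illustration only
bool frameFixed(const std::string &payload, std::string *out)
{
    if (payload.size() > kFixedLen)
        return false; // the payload does not fit in one fixed-size message
    *out = payload;
    out->resize(kFixedLen, '\0');
    return true;
}

// 2. Special symbol: a delimiter marks the end of each message.
//    The payload itself must never contain the delimiter.
std::string frameDelimited(const std::string &payload)
{
    return payload + "\n";
}

// 3. Self-describing: a header in front of the payload states its length.
//    The receiver parses the header, then reads that many bytes.
std::string frameLengthPrefixed(const std::string &payload)
{
    return std::to_string(payload.size()) + "\r\n" + payload;
}
```

Fixed length wastes space on short messages and cannot carry long ones, and a delimiter restricts what the payload may contain. A length header avoids both problems, which is one reason the protocol below is built around it.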

2. Network version calculator

Description: To demonstrate how protocol customization, serialization, and deserialization are implemented in code, and how TCP's byte-stream nature shows up there, we implement a network calculator.

Conventions for the network calculator:

The user types an expression of the form "1+1" at the client;
the expression contains two operands, both integers;
between the two numbers there is one operator character, one of +, -, *, / or %;
there are no spaces between the numbers and the operator.

2.1 Design ideas

        The client serializes the calculation request into a string and sends it to the server. The server uses the custom protocol to read exactly one complete request, deserializes it into structured data, and runs the business logic to compute the result. It then serializes the result into a string and sends it back, and the client in turn uses the custom protocol to read exactly one complete response from the server. With that, we have a network calculator built on TCP.

2.2 Interface design

To fulfill the above requirements, several interfaces must be included:

a. Request serialization and deserialization

b. Serialization and deserialization of responses

c. Protocol customization

d. Calculate business logic

e. Accurately obtain a message

f. Client and server writing

2.3 Code implementation:

1. Serialization and deserialization of requests

// Note: SEP and SEP_LEN are the macros defined in part 2 below. All of this
// code belongs to protocol.hpp, which also needs <string> and <cstring>.
class Request
{
public:
    Request()
    :x(0),y(0),op(char()){}
    Request(int x_, int y_, char op_) : x(x_), y(y_), op(op_)
    {}
    bool serialize(std::string *out)
    {
        *out = "";
        // structured data -> "x op y"
        std::string x_string = std::to_string(x);
        std::string y_string = std::to_string(y);

        *out = x_string;
        *out += SEP;
        *out += op;
        *out += SEP;
        *out += y_string;
        return true;
    }

    bool deserialize(const std::string &in)
    {
        // "x op y" -> structured data
        auto left = in.find(SEP);
        auto right = in.rfind(SEP);
        if (left == std::string::npos || right == std::string::npos)
            return false;
        if (left == right)
            return false;
        // exactly one operator character between the two separators
        if (right - (left + SEP_LEN) != 1)
            return false;

        std::string x_string = in.substr(0, left); // substr(start, count): the range [0, left)
        std::string y_string = in.substr(right + SEP_LEN);

        if (x_string.empty())
            return false;
        if (y_string.empty())
            return false;
        try
        {
            x = std::stoi(x_string);
            y = std::stoi(y_string);
        }
        catch (...) // std::stoi throws if the text is not a number or does not fit in an int
        {
            return false;
        }
        op = in[left + SEP_LEN];
        return true;
    }

public:
    int x;
    int y;
    char op;
};

Serialization result: x, y, op -> "x op y" (the length header and "\r\n" are added later by enLength)

Deserialization result: "x op y" -> x, y, op
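A quick way to check this format is a round trip on free functions. The sketch below is not the article's Request class: serializeRequest, deserializeRequest, and parseInt are hypothetical names, and a single space is assumed as the separator. It also shows one way to reject input such as "12ab", where std::stoi on its own would silently parse the leading digits.

```cpp
#include <cstddef>
#include <exception>
#include <string>

// Hypothetical helper: parse the whole string as an int.
// Rejects empty input, trailing junk such as "12ab", and out-of-range values.
bool parseInt(const std::string &s, int *out)
{
    try
    {
        std::size_t used = 0;
        int value = std::stoi(s, &used);
        if (used != s.size())
            return false; // some characters were left unparsed
        *out = value;
        return true;
    }
    catch (const std::exception &) // std::invalid_argument or std::out_of_range
    {
        return false;
    }
}

// x, op, y -> "x op y"
std::string serializeRequest(int x, char op, int y)
{
    return std::to_string(x) + " " + op + " " + std::to_string(y);
}

// "x op y" -> x, op, y
bool deserializeRequest(const std::string &in, int *x, char *op, int *y)
{
    auto left = in.find(' ');
    auto right = in.rfind(' ');
    if (left == std::string::npos || left == right)
        return false; // fewer than two separators
    if (right - left != 2)
        return false; // the operator must be exactly one character
    if (!parseInt(in.substr(0, left), x))
        return false;
    if (!parseInt(in.substr(right + 1), y))
        return false;
    *op = in[left + 1];
    return true;
}
```

With these, serializeRequest(10, '+', 20) yields "10 + 20", and deserializing that string gives back 10, '+' and 20, while "10+20" (no separators) and "a + 1" are rejected.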

2. Serialization and deserialization of the response

#define SEP " "
#define SEP_LEN strlen(SEP) // not sizeof(): that would also count the trailing '\0'
#define LINE_SEP "\r\n"
#define LINE_SEP_LEN strlen(LINE_SEP) // not sizeof(), for the same reason
class Response
{
public:
    Response()
    :exitcode(0),result(0) {}
    Response(int exitcode_, int result_) : exitcode(exitcode_), result(result_)
    {}
    bool serialize(std::string *out)
    {
        *out = "";
        // structured data -> "exitcode result"
        std::string ec_string = std::to_string(exitcode);
        std::string res_string = std::to_string(result);

        *out = ec_string;
        *out += SEP;
        *out += res_string;
        return true;
    }
    bool deserialize(const std::string &in)
    {
        // "exitcode result" -> structured data
        auto mid = in.find(SEP);
        if (mid == std::string::npos)
            return false;
        std::string ec_string = in.substr(0, mid);
        std::string res_string = in.substr(mid + SEP_LEN);
        if (ec_string.empty() || res_string.empty())
            return false;

        try
        {
            exitcode = std::stoi(ec_string);
            result = std::stoi(res_string);
        }
        catch (...) // std::stoi throws if the text is not a number or does not fit in an int
        {
            return false;
        }
        return true;
    }
public:
    int exitcode;
    int result;
};

Serialization result: exitcode, result -> "exitcode result" (the length header and "\r\n" are added later by enLength)

Deserialization result: "exitcode result" -> exitcode, result

3. Protocol customization

Description: We combine the self-describing approach with special symbols. The length of the message body is placed in a header in front of it, and the special symbol "\r\n" separates the length header from the body and also marks the end of the message.

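#include <string>
#include <cstring> // strlen

#define SEP " "
#define SEP_LEN strlen(SEP) // not sizeof(): that would also count the trailing '\0'
#define LINE_SEP "\r\n"
#define LINE_SEP_LEN strlen(LINE_SEP) // not sizeof(), for the same reason
// enLength and deLength pack and unpack a message so that the server and
// the client can each extract exactly one message
// "x op y" -> "content_len"\r\n"x op y"\r\n
// "exitcode result" -> "content_len"\r\n"exitcode result"\r\n
std::string enLength(const std::string &text)
{
    std::string send_string = std::to_string(text.size());
    send_string += LINE_SEP;
    send_string += text;
    send_string += LINE_SEP;

    return send_string;
}

// "content_len"\r\n"exitcode result"\r\n -> "exitcode result"
bool deLength(const std::string &package, std::string *text)
{
    auto pos = package.find(LINE_SEP);
    if (pos == std::string::npos)
        return false;
    std::string text_len_string = package.substr(0, pos);
    int text_len = 0;
    try
    {
        text_len = std::stoi(text_len_string);
    }
    catch (...) // the header is not a valid number
    {
        return false;
    }
    if (text_len < 0 || package.size() < pos + LINE_SEP_LEN + text_len)
        return false; // the body is shorter than the header claims
    *text = package.substr(pos + LINE_SEP_LEN, text_len);
    return true;
}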
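As a worked example of this wire format, the request "1 + 1" has a 5-byte body, so the complete package is "5\r\n1 + 1\r\n". The sketch below reproduces the same format with two hypothetical functions, encodeFrame and decodeFrame (names chosen here so they are not mistaken for the article's enLength and deLength). The decoder is deliberately strict: it is assumed to receive exactly one package and rejects anything whose size or trailing "\r\n" does not match the header.

```cpp
#include <cstddef>
#include <exception>
#include <string>

const std::string kLineSep = "\r\n";

// text -> "<len>\r\n<text>\r\n"
std::string encodeFrame(const std::string &text)
{
    return std::to_string(text.size()) + kLineSep + text + kLineSep;
}

// "<len>\r\n<text>\r\n" -> text; false if the package is malformed.
bool decodeFrame(const std::string &package, std::string *text)
{
    auto pos = package.find(kLineSep);
    if (pos == std::string::npos || pos == 0)
        return false; // no header at all
    std::size_t text_len = 0;
    try
    {
        std::size_t used = 0;
        long long value = std::stoll(package.substr(0, pos), &used);
        if (used != pos || value < 0)
            return false; // header contains junk or a negative length
        text_len = static_cast<std::size_t>(value);
    }
    catch (const std::exception &)
    {
        return false; // header is not a number
    }
    std::size_t body = pos + kLineSep.size();
    if (package.size() != body + text_len + kLineSep.size())
        return false; // total size disagrees with the header
    if (package.compare(body + text_len, kLineSep.size(), kLineSep) != 0)
        return false; // the package does not end with "\r\n"
    *text = package.substr(body, text_len);
    return true;
}
```

Because the receiver trusts the length rather than searching the body for "\r\n", the body itself may contain any bytes, including the separator.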

4. Calculate business logic

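// req is the deserialized request; run the business logic on it and fill in res.
// OK, DIV_ZERO, MOD_ZERO and OP_ERROR are exit-code constants whose definition is not shown in this article.
bool cal(const Request& req,Response& res)
{
    // req is already structured data, so its fields can be used directly
    res.exitcode = OK;
    res.result = 0; // only meaningful when exitcode stays OK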

    switch (req.op)
    {
    case '+':
        res.result = req.x + req.y;
        break;
    case '-':
        res.result = req.x - req.y;
        break;
    case '*':
        res.result = req.x * req.y;
        break;
    case '/':
    {
        if (req.y == 0)
            res.exitcode = DIV_ZERO;
        else
            res.result = req.x / req.y;
    }
    break;
    case '%':
    {
        if (req.y == 0)
            res.exitcode = MOD_ZERO;
        else
            res.result = req.x % req.y;
    }
    break;
    default:
        res.exitcode = OP_ERROR;
        break;
    }

    return true;
}
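The function above relies on the constants OK, DIV_ZERO, MOD_ZERO and OP_ERROR, which the article never defines. One plausible definition is sketched here; the numeric values are an assumption, and describeExitCode is a hypothetical helper a client could use to print something friendlier than a bare number.

```cpp
#include <string>

// Assumed definitions: the article uses these names but does not show them.
enum
{
    OK = 0,
    DIV_ZERO,
    MOD_ZERO,
    OP_ERROR
};

// Hypothetical helper: turn an exit code into readable text.
std::string describeExitCode(int exitcode)
{
    switch (exitcode)
    {
    case OK:
        return "ok";
    case DIV_ZERO:
        return "division by zero";
    case MOD_ZERO:
        return "modulo by zero";
    case OP_ERROR:
        return "unknown operator";
    default:
        return "unknown exit code";
    }
}
```

Whatever values are chosen, the client and the server must share them, since only the number travels over the network.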

5. Accurately obtain a message

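// Read from sock into inbuffer until inbuffer holds at least one complete
// package, then move that package ("content_len"\r\n"text"\r\n) into *text.
// TCP is a byte stream, so one recv() may deliver only part of a package or
// several packages at once; leftover bytes stay in inbuffer for the next call.
bool recvPackage(int sock, std::string &inbuffer, std::string *text)
{
    char buffer[1024];
    while (true)
    {
        // Check the buffer first: a complete package may already be waiting
        // from an earlier recv(), and then we must not block in recv() again.
        auto pos = inbuffer.find(LINE_SEP);
        if (pos != std::string::npos)
        {
            std::string text_len_string = inbuffer.substr(0, pos);
            int text_len = 0;
            try
            {
                text_len = std::stoi(text_len_string);
            }
            catch (...) // the header is not a valid number
            {
                return false;
            }
            if (text_len < 0)
                return false;
            // text_len_string + "\r\n" + text + "\r\n" <= inbuffer.size()
            size_t total_len = text_len_string.size() + 2 * LINE_SEP_LEN + text_len;
            std::cout << "inbuffer before processing:\n" << inbuffer << std::endl;
            if (inbuffer.size() >= total_len)
            {
                // at least one complete package is available
                *text = inbuffer.substr(0, total_len);
                inbuffer.erase(0, total_len);
                std::cout << "inbuffer after processing:\n" << inbuffer << std::endl;
                return true;
            }
            std::cout << "the package is not complete yet, waiting for the rest" << std::endl;
        }

        ssize_t n = recv(sock, buffer, sizeof(buffer), 0);
        if (n <= 0)
            return false; // 0: the peer closed the connection, -1: error
        inbuffer.append(buffer, n); // append exactly the n bytes that were read
    }
}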
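The buffer handling can be tried without any socket. The sketch below isolates that logic in a hypothetical function, extractPackage, and its three-way return value is a design choice made here, not something from the article. Feeding it half a package, then the rest plus a second package and the start of a third, shows both situations a byte stream can produce.

```cpp
#include <cstddef>
#include <exception>
#include <string>

// Try to cut one complete "<len>\r\n<text>\r\n" package off the front of inbuffer.
// Returns 1 on success, 0 if more bytes are needed, -1 if the header is malformed.
int extractPackage(std::string &inbuffer, std::string *package)
{
    const std::string sep = "\r\n";
    auto pos = inbuffer.find(sep);
    if (pos == std::string::npos)
        return 0; // the length header has not fully arrived yet
    std::size_t text_len = 0;
    try
    {
        std::size_t used = 0;
        int value = std::stoi(inbuffer.substr(0, pos), &used);
        if (used != pos || value < 0)
            return -1;
        text_len = static_cast<std::size_t>(value);
    }
    catch (const std::exception &)
    {
        return -1; // header is empty, not a number, or out of range
    }
    std::size_t total_len = pos + sep.size() + text_len + sep.size();
    if (inbuffer.size() < total_len)
        return 0; // the body has not fully arrived yet
    *package = inbuffer.substr(0, total_len);
    inbuffer.erase(0, total_len); // keep whatever follows for the next call
    return 1;
}
```

Starting from the buffer "5\r\n1 +", the function returns 0 because the body is incomplete. After appending " 1\r\n7\r\n10 * 20\r\n3\r", two calls return the packages "5\r\n1 + 1\r\n" and "7\r\n10 * 20\r\n", and a third call returns 0 with "3\r" left in the buffer, waiting for more bytes.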

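Note: This function is where TCP's byte-stream nature becomes visible. A single read may return less than one message or more than one, so the receiver has to find the message boundaries itself.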

6. Client and server implementation:

calServer.hpp:

#pragma once

#include <iostream>
#include <string>
#include <cstring>
#include <cstdlib>
#include <functional>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <sys/wait.h>
#include <signal.h>
#include "log.hpp"
#include "protocol.hpp" // requests are read according to the protocol
using namespace std;
namespace server
{
    enum
    {
        USAGE_ERR = 1,
        SOCKET_ERR,
        BIND_ERR,
        LISTEN_ERR
    };
    static const uint16_t gport = 8080;
    static const int gbacklog = 5;
    typedef function<bool(const Request& req,Response& res)> func_t;
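    // Handles one connection: reads requests and sends responses.
    // Kept outside the server class so the two stay decoupled.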
    void handlerEnter(int sock,func_t fun)
    {
       string inbuffer;
       while(true)
       {
            // 1. Read one package: "content_len"\r\n"x op y"\r\n
            // 1.1 recvPackage guarantees that we get exactly ONE complete request
            string req_text, req_str;
            if (!recvPackage(sock,inbuffer,&req_text))
                return;
            std::cout << "request with header:\n" << req_text << std::endl;
            // req_str: the body with the header removed
            if (!deLength(req_text, &req_str))
                return;
            std::cout << "body without header:\n" << req_str << std::endl;
            // 2. Deserialize the request
            // 2.1 This yields a structured request object
            Request req;
            if(!req.deserialize(req_str))
                return;
            // 3. Business logic: compute with req.x, req.op, req.y
            // 3.1 This yields a structured response
            Response res;
            fun(req,res); // the result goes into res; the callback decouples the business logic from the server
            // 4. Serialize the response
            // 4.1 This yields a string
            string resp_str;
            if(!res.serialize(&resp_str))
                return;
            std::cout << "calculation done, serialized response: " <<  resp_str << std::endl;
            // 5. Send the response
            // 5.1 Build a complete package first
            std::string send_string = enLength(resp_str);
            std::cout << "complete response package:\n" <<  send_string << std::endl;
            send(sock, send_string.c_str(), send_string.size(), 0); // this send is not fully correct either (it may write only part of the data); more on that later
       }
    }
    class CalServer
    {
    public:
        CalServer(const uint16_t &port = gport) : _listensock(-1), _port(port)
        {}
        void initServer()
        {
            // 1. Create the socket
            _listensock = socket(AF_INET, SOCK_STREAM, 0);
            if (_listensock < 0)
            {
                logMessage(FATAL, "create socket error");
                exit(SOCKET_ERR);
            }
            logMessage(NORMAL, "create socket success: %d", _listensock);

            // 2. Bind our own address and port
            struct sockaddr_in local;
            memset(&local, 0, sizeof(local));
            local.sin_family = AF_INET;
            local.sin_port = htons(_port);
            local.sin_addr.s_addr = INADDR_ANY;
            if (bind(_listensock, (struct sockaddr *)&local, sizeof(local)) < 0)
            {
                logMessage(FATAL, "bind socket error");
                exit(BIND_ERR);
            }
            logMessage(NORMAL, "bind socket success");

            // 3. Put the socket into the listening state
            if (listen(_listensock, gbacklog) < 0) // the second argument, backlog, will be explained later
            {
                logMessage(FATAL, "listen socket error");
                exit(LISTEN_ERR);
            }
            logMessage(NORMAL, "listen socket success");
        }
        void start(func_t fun)
        {
            for (;;)
            {
                // 4. Accept a new connection
                // sock is the fd used to communicate with the client
                struct sockaddr_in peer;
                socklen_t len = sizeof(peer);
                int sock = accept(_listensock, (struct sockaddr *)&peer, &len);
                if (sock < 0)
                {
                    logMessage(ERROR, "accept error, next");
                    continue;
                }
                logMessage(NORMAL, "accept a new link success, get new sock: %d", sock); // ?

                // version 2: multi-process version
                pid_t id = fork();
                if (id == 0) // child
                {
                    close(_listensock);
                    handlerEnter(sock,fun);
                    close(sock);
                    exit(0);
                }
                close(sock);

                // parent: waitpid blocks here, so the next client is accepted only after this child exits
                pid_t ret = waitpid(id, nullptr, 0);
                if (ret > 0)
                {
                    logMessage(NORMAL, "wait child success"); // ?
                }
            }
        }
        ~CalServer() {}

    private:
        int _listensock; // not used for data transfer; it listens for incoming connections and accepts them
        uint16_t _port;
    };

} // namespace server
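The comment next to send() in handlerEnter hints that a single call is not enough. The reason is the same buffer mechanism described earlier: send() copies data into the kernel's send buffer and may copy only part of it when the buffer is nearly full. A common remedy is to loop until everything has been written. The helper below, sendAll, is a hypothetical name and a minimal sketch for a blocking socket; it is not the author's code.

```cpp
#include <cerrno>
#include <cstddef>
#include <string>
#include <sys/types.h>
#include <sys/socket.h>

// Hypothetical helper: keep calling send() until the whole string is written.
// Returns false if the connection fails before all bytes are sent.
bool sendAll(int sock, const std::string &data)
{
    std::size_t sent = 0;
    while (sent < data.size())
    {
        ssize_t n = send(sock, data.data() + sent, data.size() - sent, 0);
        if (n > 0)
        {
            sent += static_cast<std::size_t>(n); // send() may accept only part of the data
            continue;
        }
        if (n < 0 && errno == EINTR)
            continue; // interrupted by a signal before sending anything: retry
        return false; // any other error
    }
    return true;
}
```

In handlerEnter, the call send(sock, send_string.c_str(), send_string.size(), 0) could then be replaced by sendAll(sock, send_string), returning from the handler when it reports failure.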

calClient.hpp:

#pragma once

#include <iostream>
#include <string>
#include <cstring>
#include <cstdlib> // exit
#include <cctype>  // isdigit
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>
#include "protocol.hpp"

#define NUM 1024

class CalClient
{
public:
    CalClient(const std::string &serverip, const uint16_t &serverport)
        : _sock(-1), _serverip(serverip), _serverport(serverport)
    {}
    void initClient()
    {
        // 1. Create the socket
        _sock = socket(AF_INET, SOCK_STREAM, 0);
        if (_sock < 0)
        {
            std::cerr << "socket create error" << std::endl;
            exit(2);
        }
    }
    void start()
    {
        struct sockaddr_in server;
        memset(&server, 0, sizeof(server));
        server.sin_family = AF_INET;
        server.sin_port = htons(_serverport);
        server.sin_addr.s_addr = inet_addr(_serverip.c_str());

        if (connect(_sock, (struct sockaddr *)&server, sizeof(server)) != 0)
        {
            std::cerr << "socket connect error" << std::endl;
        }
        else
        {
            std::string line;
            std::string inbuffer;
            while (true)
            {
                std::cout << "mycal>>> ";
                if (!std::getline(std::cin, line)) // stop on EOF or input error
                    break;
                Request req = ParseLine(line); // e.g. "1+1"
                std::string content;
                req.serialize(&content);
                std::string send_string = enLength(content);
                send(_sock, send_string.c_str(), send_string.size(), 0); // may send only part of the data; ignored here

                std::string package, text;
                //  "content_len"\r\n"exitcode result"\r\n
                if (!recvPackage(_sock, inbuffer, &package))
                    break; // the server closed the connection or recv failed
                if (!deLength(package, &text))
                    continue;
                // "exitcode result"
                Response resp;
                if (!resp.deserialize(text))
                    continue;
                std::cout << "exitCode: " << resp.exitcode << std::endl;
                std::cout << "result: " << resp.result << std::endl;
            }
        }
    }
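    Request ParseLine(const std::string &line)
    {
        // A simple state machine
        // "1+1" "123*456" "12/0"
        int status = 0; // 0: before the operator, 1: at the operator, 2: after the operator
        int i = 0;
        int cnt = line.size();
        std::string left, right;
        char op = 0; // stays 0 if the line contains no operator
        while (i < cnt)
        {
            switch (status)
            {
            case 0:
            {
                if(!isdigit(static_cast<unsigned char>(line[i])))
                {
                    op = line[i];
                    status = 1;
                }
                else left.push_back(line[i++]);
            }
            break;
            case 1:
                i++;
                status = 2;
                break;
            case 2:
                right.push_back(line[i++]);
                break;
            }
        }
        try
        {
            int x = std::stoi(left), y = std::stoi(right);
            std::cout << x << " " << y << " " << op << std::endl;
            return Request(x, y, op);
        }
        catch (...) // an operand is empty, not a number, or out of range
        {
            return Request(); // op == 0, which cal() on the server reports as OP_ERROR
        }
    }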
    ~CalClient()
    {
        if (_sock >= 0)
            close(_sock);
    }

private:
    int _sock;
    std::string _serverip;
    uint16_t _serverport;
};
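ParseLine treats the first non-digit character as the operator, so a negative left operand such as "-3*4" cannot be entered. A stricter parser is sketched below as an alternative. parseExpression is a hypothetical free function, not part of the article's client. It accepts an optional sign on the left operand and reports malformed input through its return value instead of throwing. It does not check that the operator is one the server supports, because the server already answers unknown operators with OP_ERROR.

```cpp
#include <cctype>
#include <cstddef>
#include <exception>
#include <string>

// Parse "<int><op><int>" with no spaces, e.g. "1+1", "-3*4" or "12/0".
// Returns false on malformed input.
bool parseExpression(const std::string &line, int *x, char *op, int *y)
{
    std::size_t i = 0;
    if (i < line.size() && (line[i] == '+' || line[i] == '-'))
        ++i; // optional sign on the left operand
    std::size_t digits_begin = i;
    while (i < line.size() && std::isdigit(static_cast<unsigned char>(line[i])))
        ++i;
    if (i == digits_begin || i >= line.size())
        return false; // no left operand, or nothing after it
    std::string lhs = line.substr(0, i);
    std::string rhs = line.substr(i + 1); // everything after the operator
    if (rhs.empty() || std::isspace(static_cast<unsigned char>(rhs[0])))
        return false; // missing right operand, or a space where none is allowed
    try
    {
        std::size_t used = 0;
        int left = std::stoi(lhs);
        int right = std::stoi(rhs, &used);
        if (used != rhs.size())
            return false; // trailing junk such as "2x"
        *x = left;
        *op = line[i];
        *y = right;
        return true;
    }
    catch (const std::exception &) // not a number, or out of range for int
    {
        return false;
    }
}
```

A client using it could print an error and prompt again whenever it returns false, and build the Request from x, op and y otherwise.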

2.4 Compilation test

As the test run in the figure shows, the network calculator works correctly.

Summary

        By writing the code above, including the custom protocol and the serialization and deserialization routines, we have seen why protocols matter in network transmission and what it means for TCP to be byte-stream-oriented. Thank you for reading. I hope this helps, and see you next time.


Origin blog.csdn.net/qq_65307907/article/details/132207935