Linux network programming: write your own high-performance HTTP server framework (3)

GitHub: https://github.com/froghui/yolanda

Buffer object

As the name implies, a buffer object caches data: both the data received from a socket and the data waiting to be sent to it.

For data received from the socket, the event-handling callback keeps appending data to the buffer object, while the application keeps consuming data from it, which frees up space for more incoming data.

For data sent to the socket, the application keeps appending data to the buffer object, while the event-handling callback keeps calling the send function on the socket, which drains the data held in the buffer object.

So the same buffer object type serves as both an input buffer and an output buffer. What differs between the two cases is who writes to it and who reads from it.

The following shows the design of the buffer object:

                       

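// Data buffer
struct buffer {
    char *data;          // underlying storage
    int readIndex;       // position of the next byte to read
    int writeIndex;      // position of the next byte to write
    int total_size;      // total capacity in bytes
};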

In the buffer object, writeIndex marks the position where the next byte can be written, and readIndex marks the position of the next byte to be read. In the figure, the red region from readIndex to writeIndex holds the data still to be read, and the green region from writeIndex to the end of the buffer is the space that can be written.
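
The three regions can be expressed as small helper functions. The function names below are the ones used by make_room later in this article, but their bodies are not shown in the text, so treat this as a minimal sketch of how they might be written rather than the framework's actual code:

```c
#include <assert.h>
#include <stddef.h>

/* Same layout as the struct buffer shown above. */
struct buffer {
    char *data;
    int readIndex;
    int writeIndex;
    int total_size;
};

/* Bytes written but not yet consumed: [readIndex, writeIndex). */
int buffer_readable_size(const struct buffer *b) {
    assert(b != NULL);
    return b->writeIndex - b->readIndex;
}

/* Free bytes at the tail: [writeIndex, total_size). */
int buffer_writeable_size(const struct buffer *b) {
    assert(b != NULL);
    return b->total_size - b->writeIndex;
}

/* Already-consumed bytes at the head: [0, readIndex). */
int buffer_front_spare_size(const struct buffer *b) {
    assert(b != NULL);
    return b->readIndex;
}
```

For a 16-byte buffer with readIndex at 4 and writeIndex at 10, these return 6 readable bytes, 6 writable bytes and 4 spare bytes at the front.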

Over time, as readIndex and writeIndex move closer to the end of the buffer, the front spare region before readIndex grows large, and the data in it is stale because it has already been read. At that point the buffer needs to be reorganized: the red region is moved to the left, the green region shifts left with it, and the contiguous writable space grows.

The make_room function does this. If the contiguous green space on the right cannot hold the new data, but the gray region on the left plus the green region on the right can, the readable data is moved to the front. The red region ends up at the far left, and everything to its right becomes one contiguous green writable region that can hold the new data. The following figure illustrates the process.

                                    

void make_room(struct buffer *buffer, int size) {
    // Enough contiguous space at the tail already: nothing to do.
    if (buffer_writeable_size(buffer) >= size) {
        return;
    }
    // If the front spare space plus the writable space can hold the data,
    // move the readable data to the start of the buffer.
    if (buffer_front_spare_size(buffer) + buffer_writeable_size(buffer) >= size) {
        int readable = buffer_readable_size(buffer);
        // Source and destination may overlap, so use memmove rather than memcpy.
        memmove(buffer->data, buffer->data + buffer->readIndex, readable);
        buffer->readIndex = 0;
        buffer->writeIndex = readable;
    } else {
        // Otherwise grow the buffer.
        void *tmp = realloc(buffer->data, buffer->total_size + size);
        if (tmp == NULL) {
            return;  // allocation failed; the buffer is left unchanged
        }
        buffer->data = tmp;
        buffer->total_size += size;
    }
}

If the red region takes up too much of the buffer, so that the writable space would still be insufficient even after compaction, the buffer has to grow. Here the expansion is done by calling realloc.
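
To make the two branches concrete, here is a small worked variant. It is only an illustrative sketch: make_room_traced and the room_result enum are hypothetical names that do not exist in the framework. The logic mirrors make_room, but it returns which path was taken so the behavior can be checked.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct buffer {
    char *data;
    int readIndex;
    int writeIndex;
    int total_size;
};

/* Hypothetical result codes, for illustration only. */
enum room_result { ROOM_ALREADY, ROOM_COMPACTED, ROOM_GROWN, ROOM_FAILED };

/* Same decisions as make_room, but reports which branch ran. */
enum room_result make_room_traced(struct buffer *b, int size) {
    assert(b != NULL && size >= 0);
    int writable = b->total_size - b->writeIndex;
    int readable = b->writeIndex - b->readIndex;

    if (writable >= size) {
        return ROOM_ALREADY;            /* tail space is enough */
    }
    if (b->readIndex + writable >= size) {
        /* Slide the readable bytes to the front; ranges may overlap. */
        memmove(b->data, b->data + b->readIndex, (size_t) readable);
        b->readIndex = 0;
        b->writeIndex = readable;
        return ROOM_COMPACTED;
    }
    char *tmp = realloc(b->data, (size_t) b->total_size + (size_t) size);
    if (tmp == NULL) {
        return ROOM_FAILED;             /* buffer left unchanged */
    }
    b->data = tmp;
    b->total_size += size;
    return ROOM_GROWN;
}
```

Take a 16-byte buffer with readIndex at 8 and writeIndex at 12. A request for 10 bytes cannot fit in the 4 writable bytes, but 8 spare bytes plus 4 writable bytes is enough, so the data is compacted and the indices become 0 and 4. A following request for 20 bytes exceeds the 12 bytes now free, so the buffer grows to 36 bytes.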

                                    

TCP byte stream processing

  • Receive data

Receiving data from the socket is handled by handle_read in tcp_connection.c. This function calls buffer_socket_read to read the byte stream from the socket into the buffer object. It then passes the buffer object and the tcp_connection object to messageCallBack, the application's actual processing function, for message parsing. An example of this is covered in the HTTP message parsing part below.

int handle_read(void *data) {
    struct tcp_connection *tcpConnection = (struct tcp_connection *) data;
    struct buffer *input_buffer = tcpConnection->input_buffer;
    struct channel *channel = tcpConnection->channel;

    if (buffer_socket_read(input_buffer, channel->fd) > 0) {
        // The application does the real work of consuming the data in the buffer
        if (tcpConnection->messageCallBack != NULL) {
            tcpConnection->messageCallBack(input_buffer, tcpConnection);
        }
    } else {
        // 0 means the peer closed the connection; a negative value means a read error
        handle_connection_closed(tcpConnection);
    }
    return 0;
}

buffer_socket_read calls readv to read into two buffers at once: the buffer object itself and a stack-allocated additional_buffer. The reason is that the amount of data waiting on the socket is not known in advance, so the buffer object may be too small to hold it, and there is no way to know how much to grow it before the read. With the additional buffer, once the amount of data read exceeds the writable size of the buffer object, the overflow is appended with buffer_append, which calls the make_room function described earlier to expand the buffer object.

int buffer_socket_read(struct buffer *buffer, int fd) {
    char additional_buffer[INIT_BUFFER_SIZE];
    struct iovec vec[2];
    int max_writable = buffer_writeable_size(buffer);
    vec[0].iov_base = buffer->data + buffer->writeIndex;
    vec[0].iov_len = max_writable;
    vec[1].iov_base = additional_buffer;
    vec[1].iov_len = sizeof(additional_buffer);
    int result = readv(fd, vec, 2);
    if (result < 0) {
        return -1;
    } else if (result <= max_writable) {
        buffer->writeIndex += result;
    } else {
        buffer->writeIndex = buffer->total_size;
        buffer_append(buffer, additional_buffer, result - max_writable);
    }
    return result;
}
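
The scatter-read behavior that buffer_socket_read relies on can be tried in isolation. The helper below is a hypothetical sketch (scatter_read is not a framework function): it reads into a primary region first and reports how many bytes spilled over into the second region, which is the quantity buffer_socket_read passes to buffer_append.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical helper: read from fd into `first`, letting any excess
 * land in `spill`. On return *spilled holds the number of bytes that
 * ended up in `spill`. Returns what readv returned.
 */
ssize_t scatter_read(int fd, char *first, size_t first_len,
                     char *spill, size_t spill_len, size_t *spilled) {
    assert(first != NULL && spill != NULL && spilled != NULL);

    struct iovec vec[2];
    vec[0].iov_base = first;
    vec[0].iov_len = first_len;
    vec[1].iov_base = spill;
    vec[1].iov_len = spill_len;

    /* readv fills vec[0] completely before it touches vec[1]. */
    ssize_t n = readv(fd, vec, 2);

    *spilled = (n > 0 && (size_t) n > first_len) ? (size_t) n - first_len : 0;
    return n;
}
```

With 10 bytes waiting on a descriptor and a 4-byte primary region, one call returns 10 and reports 6 spilled bytes. The cost of this design is a copy of the spilled bytes, but it avoids a second system call and avoids oversizing every connection's buffer up front.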

  • Send data

When the application needs to send data to the socket, that is, after the read-decode-compute-encode steps are done, it writes the encoded data into a buffer object and calls tcp_connection_send_buffer to send the data in that buffer out through the socket.

int tcp_connection_send_buffer(struct tcp_connection *tcpConnection, struct buffer *buffer) {
    int size = buffer_readable_size(buffer);
    int result = tcp_connection_send_data(tcpConnection, buffer->data + buffer->readIndex, size);
    buffer->readIndex += size;
    return result;
}

If the channel has not registered the WRITE event and the output buffer of the tcp_connection holds no pending data, the function calls write directly to send the data. If this write does not send everything, the remaining data is copied into the output buffer of the tcp_connection and the WRITE event is registered with the event_loop. From then on the framework owns that data, and the application can release its own copy.

// Entry point called by the application layer
int tcp_connection_send_data(struct tcp_connection *tcpConnection, void *data, int size) {
    ssize_t nwrited = 0;  // signed, so that a -1 returned by write() can be detected
    size_t nleft = size;
    int fault = 0;
    struct channel *channel = tcpConnection->channel;
    struct buffer *output_buffer = tcpConnection->output_buffer;

    // First try to send the data directly on the socket
    if (!channel_write_event_registered(channel) && buffer_readable_size(output_buffer) == 0) {
        nwrited = write(channel->fd, data, size);
        if (nwrited >= 0) {
            nleft = nleft - nwrited;
        } else {
            nwrited = 0;
            if (errno != EWOULDBLOCK) {
                if (errno == EPIPE || errno == ECONNRESET) {
                    fault = 1;
                }
            }
        }
    }

    if (!fault && nleft > 0) {
        // Copy the remainder into the output buffer; the framework takes over this data
        buffer_append(output_buffer, (char *) data + nwrited, nleft);
        if (!channel_write_event_registered(channel)) {
            channel_write_event_add(channel);
        }
    }
    return nwrited;
}
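
The other half of this hand-off, the code that runs when the WRITE event fires, is not shown in this article. As a rough sketch under that caveat, a handler could flush the output buffer as follows. The name drain_output_buffer and its return convention are assumptions made for illustration, not the framework's API.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

struct buffer {
    char *data;
    int readIndex;
    int writeIndex;
    int total_size;
};

/*
 * Hypothetical write-event handler body.
 * Returns 1 when the buffer is fully drained (the caller could then
 * unregister the WRITE event), 0 when data remains, -1 on a hard error.
 */
int drain_output_buffer(int fd, struct buffer *out) {
    assert(out != NULL);
    int pending = out->writeIndex - out->readIndex;

    if (pending > 0) {
        ssize_t n = write(fd, out->data + out->readIndex, (size_t) pending);
        if (n < 0) {
            /* Not writable right now: keep the data for the next event. */
            return (errno == EAGAIN || errno == EWOULDBLOCK) ? 0 : -1;
        }
        out->readIndex += (int) n;   /* consume only what was really sent */
    }

    if (out->readIndex == out->writeIndex) {
        /* Nothing left: rewind so the whole buffer is writable again. */
        out->readIndex = 0;
        out->writeIndex = 0;
        return 1;
    }
    return 0;
}
```

Advancing readIndex only by the number of bytes the kernel accepted is what makes partial writes safe: whatever was not sent stays in the readable region until the next WRITE event.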

HTTP protocol implementation

To implement HTTP, we first define an http_server structure. An http_server is essentially a TCPServer, but the callback it exposes to the application is simpler: the application only deals with the http_request and http_response structures.

typedef int (*request_callback)(struct http_request *httpRequest, struct http_response *httpResponse);

struct http_server {
    struct TCPserver *tcpServer;
    request_callback requestCallback;
};

The key job of http_server is to parse the incoming message and turn it into an http_request object. This is done in the http_onMessage callback, which calls parse_http_request to do the parsing.

// The buffer has been set up by the framework and already holds some data.
// Note that the full request may not have arrived yet, so incomplete data must be handled.
int http_onMessage(struct buffer *input, struct tcp_connection *tcpConnection) {
    yolanda_msgx("get message from tcp connection %s", tcpConnection->name);

    struct http_request *httpRequest = (struct http_request *) tcpConnection->request;
    struct http_server *httpServer = (struct http_server *) tcpConnection->data;

    if (parse_http_request(input, httpRequest) == 0) {
        char *error_response = "HTTP/1.1 400 Bad Request\r\n\r\n";
        // Use strlen here: sizeof(error_response) is only the size of the pointer
        tcp_connection_send_data(tcpConnection, error_response, strlen(error_response));
        tcp_connection_shutdown(tcpConnection);
        return 0;
    }

    // All request data has been processed; now encode and send the response
    if (http_request_current_state(httpRequest) == REQUEST_DONE) {
        struct http_response *httpResponse = http_response_new();

        // The requestCallback exposed by httpServer
        if (httpServer->requestCallback != NULL) {
            httpServer->requestCallback(httpRequest, httpResponse);
        }

        // Encode httpResponse and hand it to the socket's send path
        struct buffer *buffer = buffer_new();
        http_response_encode_buffer(httpResponse, buffer);
        tcp_connection_send_buffer(tcpConnection, buffer);

        if (http_request_close_connection(httpRequest)) {
            tcp_connection_shutdown(tcpConnection);
        }
        // Reset in every case, so that the next request on a kept-alive
        // connection is parsed from the REQUEST_STATUS state again
        http_request_reset(httpRequest);
    }
    return 0;
}

HTTP uses a carriage return followed by a line feed (CRLF) as the boundary between the lines of a message.
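
Locating that boundary is the first step of every parsing stage. As a rough sketch, assuming a helper named find_crlf that is not part of the framework (its own buffer_find_CRLF works on a struct buffer and may be implemented differently), a CRLF search over a byte range could look like this:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return a pointer to the first "\r\n" inside
 * [start, start + len), or NULL if no complete CRLF is present yet.
 */
const char *find_crlf(const char *start, size_t len) {
    assert(start != NULL);
    const char *p = start;
    const char *end = start + len;

    while (p < end) {
        const char *cr = memchr(p, '\r', (size_t) (end - p));
        if (cr == NULL || cr + 1 >= end) {
            return NULL;        /* no '\r', or '\r' is the last byte so far */
        }
        if (cr[1] == '\n') {
            return cr;          /* found the CR LF pair */
        }
        p = cr + 1;             /* lone '\r': keep searching after it */
    }
    return NULL;
}
```

A NULL result does not mean the request is malformed. On a TCP byte stream it usually means the rest of the line has not arrived yet, so the parser should keep its state and wait for more data.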

                 

The idea of parse_http_request is to find the message boundaries while recording the current state of the parsing. In order, the parsing is divided into four states: REQUEST_STATUS, REQUEST_HEADERS, REQUEST_BODY and REQUEST_DONE, and each state is parsed differently.

When parsing the status line, the line is first delimited by locating the CRLF. Within the line, space characters are then used as the separators between fields.

When parsing the headers, each key-value pair is likewise delimited by locating the CRLF, and the colon is then used as the separator between key and value.

Finally, when a line contains no colon, header parsing is complete.
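
The status-line step described above can be sketched as follows. The struct request_line and the function split_request_line are hypothetical names used only for this illustration. The framework's process_status_line is not shown in this article and may store the fields differently.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the three fields of a request line. */
struct request_line {
    char *method;
    char *url;
    char *version;
};

/* Copy [start, end) into a freshly allocated NUL-terminated string. */
static char *copy_range(const char *start, const char *end) {
    size_t len = (size_t) (end - start);
    char *out = malloc(len + 1);
    if (out != NULL) {
        memcpy(out, start, len);
        out[len] = '\0';
    }
    return out;
}

/*
 * Split "METHOD SP URL SP VERSION" found in [start, end), where end
 * points at the CRLF. Returns 0 on success, -1 if a space is missing
 * or an allocation fails. The caller frees the three strings.
 */
int split_request_line(const char *start, const char *end, struct request_line *out) {
    assert(start != NULL && end != NULL && out != NULL);

    const char *sp1 = memchr(start, ' ', (size_t) (end - start));
    if (sp1 == NULL) {
        return -1;
    }
    const char *sp2 = memchr(sp1 + 1, ' ', (size_t) (end - sp1 - 1));
    if (sp2 == NULL) {
        return -1;
    }

    out->method = copy_range(start, sp1);
    out->url = copy_range(sp1 + 1, sp2);
    out->version = copy_range(sp2 + 1, end);
    if (out->method == NULL || out->url == NULL || out->version == NULL) {
        free(out->method);
        free(out->url);
        free(out->version);
        return -1;
    }
    return 0;
}
```

For the line "GET /network?x=1 HTTP/1.1" this yields the method GET, the URL /network?x=1 and the version HTTP/1.1. Header lines can be split the same way, using the colon instead of the space.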

The parse_http_request function implements this state machine. The version shown here goes straight from REQUEST_HEADERS to REQUEST_DONE and does not parse a body:

int parse_http_request(struct buffer *input, struct http_request *httpRequest) {
    int ok = 1;
    while (httpRequest->current_state != REQUEST_DONE) {
        if (httpRequest->current_state == REQUEST_STATUS) {
            char *crlf = buffer_find_CRLF(input);
            if (crlf == NULL) {
                break;  // no complete line yet: wait for more data instead of spinning
            }
            int request_line_size = process_status_line(input->data + input->readIndex, crlf, httpRequest);
            if (request_line_size == 0) {
                ok = 0;  // the status line could not be parsed
                break;
            }
            input->readIndex += request_line_size;  // request line size
            input->readIndex += 2;  // CRLF size
            httpRequest->current_state = REQUEST_HEADERS;
        } else if (httpRequest->current_state == REQUEST_HEADERS) {
            char *crlf = buffer_find_CRLF(input);
            if (crlf == NULL) {
                break;  // no complete line yet: wait for more data
            }
            /**
             *    <start>-------<colon>:-------<crlf>
             */
            char *start = input->data + input->readIndex;
            int request_line_size = crlf - start;
            char *colon = memmem(start, request_line_size, ": ", 2);
            if (colon != NULL) {
                char *key = malloc(colon - start + 1);
                strncpy(key, start, colon - start);
                key[colon - start] = '\0';
                // The value starts after the two-byte ": " separator
                char *value = malloc(crlf - colon - 2 + 1);
                strncpy(value, colon + 2, crlf - colon - 2);
                value[crlf - colon - 2] = '\0';

                http_request_add_header(httpRequest, key, value);

                input->readIndex += request_line_size;  // header line size
                input->readIndex += 2;  // CRLF size
            } else {
                // No colon found: this is the empty line that ends the headers
                input->readIndex += 2;  // CRLF size
                httpRequest->current_state = REQUEST_DONE;
            }
        } else {
            break;  // REQUEST_BODY is not handled in this version
        }
    }
    return ok;
}

After all the request data has been processed, the response is encoded and sent. An http_response object is created, and the requestCallback provided by the application is called to fill it in. Then a buffer object is created, and http_response_encode_buffer converts the data in http_response into the byte stream defined by the HTTP protocol.

As you can see, http_response_encode_buffer writes the response headers, such as Content-Length, followed by the response body.

void http_response_encode_buffer(struct http_response *httpResponse, struct buffer *output) {
    char buf[32];
    // Status line, for example "HTTP/1.1 200 OK"
    snprintf(buf, sizeof buf, "HTTP/1.1 %d ", httpResponse->statusCode);
    buffer_append_string(output, buf);
    buffer_append_string(output, httpResponse->statusMessage);
    buffer_append_string(output, "\r\n");

    // Note: in this code a non-zero keep_connected emits "Connection: close"
    if (httpResponse->keep_connected) {
        buffer_append_string(output, "Connection: close\r\n");
    } else {
        // strlen returns size_t, so the matching format is %zu
        snprintf(buf, sizeof buf, "Content-Length: %zu\r\n", strlen(httpResponse->body));
        buffer_append_string(output, buf);
        buffer_append_string(output, "Connection: Keep-Alive\r\n");
    }

    // Any extra headers set by the application
    if (httpResponse->response_headers != NULL && httpResponse->response_headers_number > 0) {
        for (int i = 0; i < httpResponse->response_headers_number; i++) {
            buffer_append_string(output, httpResponse->response_headers[i].key);
            buffer_append_string(output, ": ");
            buffer_append_string(output, httpResponse->response_headers[i].value);
            buffer_append_string(output, "\r\n");
        }
    }

    // An empty line separates the headers from the body
    buffer_append_string(output, "\r\n");
    buffer_append_string(output, httpResponse->body);
}
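
To see the byte stream this produces without the rest of the framework, here is a stripped-down sketch that writes the same keep-alive layout into a plain character array. The function encode_minimal_response is a hypothetical stand-in written for this illustration: it uses snprintf instead of the buffer object and omits the application's extra headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: encode a keep-alive response into out[0..cap).
 * Returns the number of bytes written (excluding the trailing NUL),
 * or -1 if the output does not fit.
 */
int encode_minimal_response(char *out, size_t cap, int status,
                            const char *message, const char *body) {
    assert(out != NULL && message != NULL && body != NULL);
    int n = snprintf(out, cap,
                     "HTTP/1.1 %d %s\r\n"        /* status line */
                     "Content-Length: %zu\r\n"   /* body length in bytes */
                     "Connection: Keep-Alive\r\n"
                     "\r\n"                      /* end of headers */
                     "%s",
                     status, message, strlen(body), body);
    return (n < 0 || (size_t) n >= cap) ? -1 : n;
}
```

Encoding status 200, message "OK" and body "hello" gives a status line, a Content-Length of 5, the Connection header, an empty line and then the five body bytes. Content-Length is what lets the client find the end of the body on a connection that stays open.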

Complete HTTP server example

Now writing an HTTP server example is straightforward. The most important part is the onRequest callback. By the time onRequest runs, parse_http_request has already finished, so the callback can compute its response from the information in http_request. The logic in the example is simple: it returns a different http_response depending on the URL path of the request. For example, a request for the root path gets a 200 response with an HTML body.

#include <stdlib.h>
#include <string.h>
#include <lib/acceptor.h>
#include <lib/http_server.h>
#include "lib/common.h"
#include "lib/event_loop.h"

// Callback invoked once a complete request has been parsed
int onRequest(struct http_request *httpRequest, struct http_response *httpResponse) {
    char *url = httpRequest->url;
    // The path is the part of the URL before the query string, if there is one
    char *question = strchr(url, '?');
    size_t path_len = (question != NULL) ? (size_t) (question - url) : strlen(url);
    char *path = malloc(path_len + 1);  // +1 for the terminating '\0'
    if (path == NULL) {
        return -1;
    }
    memcpy(path, url, path_len);
    path[path_len] = '\0';

    if (strcmp(path, "/") == 0) {
        httpResponse->statusCode = OK;
        httpResponse->statusMessage = "OK";
        httpResponse->contentType = "text/html";
        httpResponse->body = "<html><head><title>This is network programming</title></head><body><h1>Hello, network programming</h1></body></html>";
    } else if (strcmp(path, "/network") == 0) {
        httpResponse->statusCode = OK;
        httpResponse->statusMessage = "OK";
        httpResponse->contentType = "text/plain";
        httpResponse->body = "hello, network programming";
    } else {
        httpResponse->statusCode = NotFound;
        httpResponse->statusMessage = "Not Found";
        httpResponse->keep_connected = 1;
    }

    free(path);
    return 0;
}


int main(int c, char **v) {
    // event_loop of the main thread
    struct event_loop *eventLoop = event_loop_init();

    // Initialize the tcp_server. The number of threads can be specified: with 0, the
    // acceptor and I/O run in this thread; with 1, there is one I/O thread.
    // The tcp_server has its own event_loop.
    struct http_server *httpServer = http_server_new(eventLoop, SERV_PORT, onRequest, 2);
    http_server_start(httpServer);

    // main thread for acceptor
    event_loop_run(eventLoop);
    return 0;
}

After starting the program, we can access it from a browser or with the curl command. Several browsers and curl commands can be connected at the same time, which shows that the server handles concurrent connections.

$ curl -v http://127.0.0.1:43211/
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 43211 (#0)
> GET / HTTP/1.1
> Host: 127.0.0.1:43211
> User-Agent: curl/7.54.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Content-Length: 116
< Connection: Keep-Alive
<
* Connection #0 to host 127.0.0.1 left intact
<html><head><title>This is network programming</title></head><body><h1>Hello, network programming</h1></body></html>%

                        

In this lecture, we covered the byte stream processing of the framework and introduced the buffer object. On top of that, we added the HTTP pieces (http_server, http_request and http_response) to complete the high-performance HTTP server. The example program uses the capabilities of the framework to implement a simple HTTP server.

 

Learn the new by reviewing the past!

 

Origin blog.csdn.net/qq_24436765/article/details/105049360