Linux network programming: write your own high-performance HTTP server framework (1)

Before starting to write a high-performance HTTP server, it is easier to build a high-performance network programming framework that supports TCP, and then add support for HTTP features.

GitHub: https://github.com/froghui/yolanda

Requirements

The high-performance TCP network framework needs to meet three requirements:

First, it uses the reactor model, with poll or epoll selectable as the event distribution mechanism.

Second, it supports multi-threading: both a single-threaded, single-reactor mode and a multi-threaded, master-slave reactor mode, in which I/O events on sockets are distributed across multiple threads.

Third, it encapsulates read and write operations in buffer objects.

Following these three requirements, the overall design is explained in three parts: the reactor model design, the I/O model and multi-threaded model design, and the buffer and data read/write encapsulation.

Reactor model design ideas

The goal is a reactor framework based on event distribution and callbacks. The main objects in this framework are:

event_loop

You can think of the event_loop object as an infinite event loop bound to a thread. The event_loop abstraction appears in many languages. Put simply, it is an event dispatcher running in an infinite loop: once an event occurs, it invokes a predefined callback function to handle the event.

Specifically, event_loop uses poll or epoll to block the thread while waiting for I/O events to occur.

channel

We abstract every object registered on an event_loop as a channel, for example listening events and socket read/write events. Channel objects appear in the APIs of many languages, and their meaning there is broadly consistent with the design here.

acceptor

The acceptor object represents the server-side listener. It is ultimately registered on the event_loop as a channel object, so that connection-completion events can be detected and distributed.

event_dispatcher

event_dispatcher is an abstraction of the event distribution mechanism: you can implement a poll-based poll_dispatcher or an epoll-based epoll_dispatcher. A unified event_dispatcher structure abstracts these behaviors.

channel_map

channel_map stores the mapping from file descriptor to channel, so that when an event occurs, the event handling functions in the channel object can be found quickly from the file descriptor the event belongs to.

I/O model and multi-threaded model design ideas

This part addresses which thread each event_loop runs on, and which thread executes event distribution and callbacks.

thread_pool

struct thread_pool {
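    // the main thread's event_loop, which created this thread_pool
    struct event_loop *mainLoop;
    // whether the pool has been started
    int started;
    // number of threads
    int thread_number;
    // pointer to the array of event_loop_thread objects created by the pool
    struct event_loop_thread *eventLoopThreads;
    // position in the array, used to decide which event_loop_thread serves next
    int position;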

};

struct thread_pool *thread_pool_new(struct event_loop *mainLoop, int threadNumber);
void thread_pool_start(struct thread_pool *);
struct event_loop *thread_pool_get_loop(struct thread_pool *);

thread_pool maintains a list of sub-reactor threads for the main reactor thread to use. Each time a new connection is established, the main reactor obtains a thread from thread_pool and registers the read/write events of the new connection's socket on it, which separates the I/O threads from the main reactor thread.
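One plausible way to pick a sub-reactor for each new connection is round-robin over the thread array, using the position field. The sketch below is an assumption about how thread_pool_get_loop could behave, not the project's actual code; mini_loop, mini_pool, and mini_pool_get_loop are hypothetical stand-ins.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct event_loop. */
struct mini_loop {
    int id;
};

/* Hypothetical, reduced version of struct thread_pool. */
struct mini_pool {
    struct mini_loop *main_loop;   /* loop of the main reactor thread */
    struct mini_loop *sub_loops;   /* one loop per sub-reactor thread */
    int thread_number;             /* number of sub-reactor threads */
    int position;                  /* index of the next loop to hand out */
};

/* Returns the loop that should serve the next connection (round-robin). */
struct mini_loop *mini_pool_get_loop(struct mini_pool *pool) {
    assert(pool != NULL);

    /* Single-reactor mode: the main loop handles the connection itself. */
    if (pool->thread_number <= 0)
        return pool->main_loop;

    struct mini_loop *selected = &pool->sub_loops[pool->position];
    pool->position = (pool->position + 1) % pool->thread_number;
    return selected;
}
```

With three sub-reactors, successive calls return loops 0, 1, 2 and then wrap back to 0, so connections are spread evenly without any locking, provided only the main reactor thread calls this function.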

event_loop_thread

struct event_loop_thread {
    struct event_loop *eventLoop;
    pthread_t thread_tid;        /* thread ID */
    pthread_mutex_t mutex;
    pthread_cond_t cond;
    char * thread_name;
    long thread_count;    /* # connections handled */
};

// initialize an event_loop_thread whose memory has already been allocated
int event_loop_thread_init(struct event_loop_thread *, int);
// called by the main thread: starts a child thread and lets it begin running its event_loop
struct event_loop *event_loop_thread_start(struct event_loop_thread *);

event_loop_thread is the thread implementation of a reactor; read/write event detection for connected sockets is done entirely in this thread.
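The mutex and cond fields suggest a startup handshake: the main thread creates the child and waits until the child reports that its event loop exists. The following is a minimal sketch of that pattern under this assumption; mini_loop_thread and its functions are hypothetical, and loop_ready stands in for the eventLoop pointer becoming non-NULL.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical, reduced version of struct event_loop_thread. */
struct mini_loop_thread {
    int loop_ready;            /* stands in for eventLoop != NULL */
    pthread_t tid;
    pthread_mutex_t mutex;
    pthread_cond_t cond;
};

/* Child thread body: publish the loop, then (in a real framework) run it. */
static void *mini_loop_thread_run(void *arg) {
    struct mini_loop_thread *t = arg;

    pthread_mutex_lock(&t->mutex);
    t->loop_ready = 1;                 /* a real child would create its event_loop here */
    pthread_cond_signal(&t->cond);
    pthread_mutex_unlock(&t->mutex);

    /* A real child would now call event_loop_run() and block in it. */
    return NULL;
}

/* Called by the main thread; returns 0 once the child has published its loop. */
int mini_loop_thread_start(struct mini_loop_thread *t) {
    assert(t != NULL);
    t->loop_ready = 0;
    if (pthread_mutex_init(&t->mutex, NULL) != 0) return -1;
    if (pthread_cond_init(&t->cond, NULL) != 0) return -1;
    if (pthread_create(&t->tid, NULL, mini_loop_thread_run, t) != 0) return -1;

    /* Wait in a loop so that spurious wakeups are harmless. */
    pthread_mutex_lock(&t->mutex);
    while (!t->loop_ready)
        pthread_cond_wait(&t->cond, &t->mutex);
    pthread_mutex_unlock(&t->mutex);
    return 0;
}
```

Because the flag is checked under the mutex before waiting, the handshake works whether the child signals before or after the parent reaches pthread_cond_wait.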

Buffer and data reading and writing design ideas

buffer

#define INIT_BUFFER_SIZE 65536
// data buffer
struct buffer {
    char *data;          // underlying storage
    int readIndex;       // read position
    int writeIndex;      // write position
    int total_size;      // total capacity
};

struct buffer *buffer_new();
void buffer_free(struct buffer *buffer);
int buffer_writeable_size(struct buffer *buffer);
int buffer_readable_size(struct buffer *buffer);
int buffer_front_spare_size(struct buffer *buffer);

// append data to the buffer
int buffer_append(struct buffer *buffer, void *data, int size);
// append a single character to the buffer
int buffer_append_char(struct buffer *buffer, char data);
// append a string to the buffer
int buffer_append_string(struct buffer*buffer, char * data);
// read data from the socket and write it into the buffer
int buffer_socket_read(struct buffer *buffer, int fd);
// read one character from the buffer
char buffer_read_char(struct buffer *buffer);
// search the buffer data for CRLF
char * buffer_find_CRLF(struct buffer * buffer);

The buffer object shields the application from reading and writing the socket directly. Without it, the read/write event handlers of a connected socket would have to deal with the raw byte stream themselves, which is inconvenient. We therefore provide a basic buffer object to hold both the data received from a connected socket and the data the application is about to send.
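The three size queries follow directly from the readIndex, writeIndex, and total_size fields. The sketch below shows that arithmetic and one possible append strategy (reuse the already-consumed space at the front first, grow the allocation only if that is not enough). The mini_buffer names and the growth policy are assumptions for illustration, not the framework's actual implementation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical buffer with the same fields as struct buffer. */
struct mini_buffer {
    char *data;
    int readIndex;
    int writeIndex;
    int total_size;
};

struct mini_buffer *mini_buffer_new(int capacity) {
    assert(capacity > 0);
    struct mini_buffer *b = malloc(sizeof *b);
    if (b == NULL) return NULL;
    b->data = malloc((size_t)capacity);
    if (b->data == NULL) { free(b); return NULL; }
    b->readIndex = 0;
    b->writeIndex = 0;
    b->total_size = capacity;
    return b;
}

void mini_buffer_free(struct mini_buffer *b) {
    if (b == NULL) return;
    free(b->data);
    free(b);
}

/* Bytes written but not yet read. */
int mini_buffer_readable_size(const struct mini_buffer *b) { return b->writeIndex - b->readIndex; }
/* Free bytes after the write position. */
int mini_buffer_writeable_size(const struct mini_buffer *b) { return b->total_size - b->writeIndex; }
/* Already-consumed bytes before the read position. */
int mini_buffer_front_spare_size(const struct mini_buffer *b) { return b->readIndex; }

/* Ensures at least `size` writable bytes: compact first, grow only if needed. */
static int mini_buffer_make_room(struct mini_buffer *b, int size) {
    if (mini_buffer_writeable_size(b) >= size) return 0;

    int readable = mini_buffer_readable_size(b);
    if (mini_buffer_front_spare_size(b) + mini_buffer_writeable_size(b) >= size) {
        /* Slide unread data to the front to reclaim consumed space. */
        memmove(b->data, b->data + b->readIndex, (size_t)readable);
        b->readIndex = 0;
        b->writeIndex = readable;
        return 0;
    }

    char *grown = realloc(b->data, (size_t)b->total_size + (size_t)size);
    if (grown == NULL) return -1;
    b->data = grown;
    b->total_size += size;
    return 0;
}

/* Appends `size` bytes; returns 0 on success, -1 on failure. */
int mini_buffer_append(struct mini_buffer *b, const void *data, int size) {
    assert(b != NULL && size >= 0);
    if (size == 0) return 0;
    if (data == NULL || mini_buffer_make_room(b, size) != 0) return -1;
    memcpy(b->data + b->writeIndex, data, (size_t)size);
    b->writeIndex += size;
    return 0;
}
```

Compacting before growing keeps memory use flat for a connection that reads and writes at a similar pace, at the cost of an occasional memmove.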

tcp_connection

struct tcp_connection {
    struct event_loop *eventLoop;
    struct channel *channel;
    char *name;
    struct buffer *input_buffer;   // receive buffer
    struct buffer *output_buffer;  // send buffer

    connection_completed_call_back connectionCompletedCallBack;
    message_call_back messageCallBack;
    write_completed_call_back writeCompletedCallBack;
    connection_closed_call_back connectionClosedCallBack;

    void * data; //for callback use: http_server
    void * request; // for callback use
    void * response; // for callback use
};

struct tcp_connection *
tcp_connection_new(int fd, struct event_loop *eventLoop, connection_completed_call_back 
    connectionCompletedCallBack, connection_closed_call_back connectionClosedCallBack,
    message_call_back messageCallBack, write_completed_call_back writeCompletedCallBack);

// entry point for application-layer calls
int tcp_connection_send_data(struct tcp_connection *tcpConnection, void *data, int size);
// entry point for application-layer calls
int tcp_connection_send_buffer(struct tcp_connection *tcpConnection, struct buffer * buffer);
void tcp_connection_shutdown(struct tcp_connection * tcpConnection);

The tcp_connection object describes an established TCP connection. Its attributes include the receive buffer, the send buffer, and the channel object, all of which are natural properties of a TCP connection. tcp_connection is the data structure through which most applications interact with the framework. We do not want to expose the lowest-level channel object to the application, because a channel can represent more than a tcp_connection: the listening socket mentioned earlier is a channel, and so is the wake-up socketpair mentioned later. We therefore designed the tcp_connection object to give users a clearer programming entry point.

Detailed design of the reactor model

How event_loop runs:

When event_loop_run is called, the thread enters the loop and first runs the dispatcher's dispatch function to wait for events. Once an event occurs, channel_event_activate is called, which in turn invokes the event callbacks eventReadCallback and eventWriteCallback. Finally, event_loop_handle_pending_channel runs to update the set of monitored events, and the thread returns to the event distribution loop.

event_loop analysis

It is no exaggeration to say that event_loop is the core of the entire reactor model design. First look at its data structure. The most important member is the event_dispatcher object. You can simply think of event_dispatcher as poll or epoll: it lets the thread suspend and wait for an event to occur. There is a small trick here: event_dispatcher_data is declared as void *, so each implementation, such as poll or epoll, can store whatever data object it needs. event_loop also keeps several members related to multi-threading. owner_thread_id records the ID of the thread that owns the event loop, and mutex and cond are used for thread synchronization. socketPair is used by the parent thread to notify the child thread that there is a new event to process. pending_head and pending_tail form the list of new events waiting to be processed in the child thread.
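The socketPair notification can be pictured as follows: one end is written by the notifying thread, the other end is registered for read events in the loop, so a single byte is enough to make a blocked poll or epoll call return. This is a minimal sketch of that idea with hypothetical mini_ function names; the byte value and the use of poll are assumptions.

```c
#include <assert.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Notifying side: writes one byte so the loop blocked on the other end wakes up. */
int mini_wakeup_notify(int write_fd) {
    char one = 'a';
    ssize_t n = write(write_fd, &one, sizeof one);
    return n == (ssize_t)sizeof one ? 0 : -1;
}

/* Loop side: read callback that drains the wake-up byte. */
int mini_wakeup_drain(int read_fd) {
    char one;
    ssize_t n = read(read_fd, &one, sizeof one);
    return n == (ssize_t)sizeof one ? 0 : -1;
}

/* Returns 1 if read_fd becomes readable within timeout_ms, 0 on timeout, -1 on error. */
int mini_wait_for_wakeup(int read_fd, int timeout_ms) {
    struct pollfd pfd = { .fd = read_fd, .events = POLLIN, .revents = 0 };
    int n = poll(&pfd, 1, timeout_ms);
    if (n < 0) return -1;
    return (n > 0 && (pfd.revents & POLLIN)) ? 1 : 0;
}
```

A pair created with socketpair(AF_UNIX, SOCK_STREAM, 0, fds) is enough to try this out: notify on fds[0], then wait and drain on fds[1]. Draining matters, because an undrained byte would keep the descriptor readable and make every later wait return immediately.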

struct event_loop {
    int quit;
    const struct event_dispatcher *eventDispatcher;

    /** Data belonging to the corresponding event_dispatcher. */
    void *event_dispatcher_data;
    struct channel_map *channelMap;

    int is_handle_pending;
    struct channel_element *pending_head;
    struct channel_element *pending_tail;

    pthread_t owner_thread_id;
    pthread_mutex_t mutex;
    pthread_cond_t cond;
    int socketPair[2];
    char *thread_name;
};

Let's take a look at the main method of event_loop, the event_loop_run method. As mentioned earlier, event_loop is an infinite while loop that continuously distributes events.

/**
 * 1. Validate the arguments.
 * 2. Call the dispatcher to distribute events; the event handler callbacks run after dispatch.
 */
int event_loop_run(struct event_loop *eventLoop) {
    assert(eventLoop != NULL);

    const struct event_dispatcher *dispatcher = eventLoop->eventDispatcher;
    if (eventLoop->owner_thread_id != pthread_self()) {
        exit(1);
    }

    yolanda_msgx("event loop run, %s", eventLoop->thread_name);
    struct timeval timeval;
    timeval.tv_sec = 1;
    timeval.tv_usec = 0;  // set both fields so the timeout is exactly one second

    while (!eventLoop->quit) {
        //block here to wait I/O event, and get active channels
        dispatcher->dispatch(eventLoop, &timeval);
        //handle the pending channel
        event_loop_handle_pending_channel(eventLoop);
    }

    yolanda_msgx("event loop end, %s", eventLoop->thread_name);
    return 0;
}

The code reflects this directly: as long as the event_loop has not been asked to quit, the loop body calls the dispatcher's dispatch method to wait for events and then handles any pending channels.
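The pending-channel step can be sketched as a small queue: other threads append add/remove/update requests under a mutex, and the loop thread drains them after each dispatch so that the registrations themselves always happen on the I/O thread. The names, the numeric type codes, and the choice to detach the list before processing are assumptions made for this sketch.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical pending request, analogous to a channel_element. */
struct mini_pending {
    int fd;
    int type;                      /* 1 = add, 2 = remove, 3 = update (made-up codes) */
    struct mini_pending *next;
};

struct mini_pending_queue {
    pthread_mutex_t mutex;
    struct mini_pending *head;
    struct mini_pending *tail;
};

void mini_pending_queue_init(struct mini_pending_queue *q) {
    pthread_mutex_init(&q->mutex, NULL);
    q->head = NULL;
    q->tail = NULL;
}

/* May be called from any thread: appends a request to the tail. */
int mini_pending_push(struct mini_pending_queue *q, int fd, int type) {
    struct mini_pending *e = malloc(sizeof *e);
    if (e == NULL) return -1;
    e->fd = fd;
    e->type = type;
    e->next = NULL;

    pthread_mutex_lock(&q->mutex);
    if (q->tail != NULL) q->tail->next = e;
    else q->head = e;
    q->tail = e;
    pthread_mutex_unlock(&q->mutex);
    return 0;
}

/* Called by the loop thread: detaches the list under the lock, then handles
 * each request outside the lock. Returns the number of requests handled. */
int mini_pending_drain(struct mini_pending_queue *q,
                       void (*handle)(int fd, int type, void *ctx), void *ctx) {
    pthread_mutex_lock(&q->mutex);
    struct mini_pending *e = q->head;
    q->head = NULL;
    q->tail = NULL;
    pthread_mutex_unlock(&q->mutex);

    int handled = 0;
    while (e != NULL) {
        struct mini_pending *next = e->next;
        handle(e->fd, e->type, ctx);
        free(e);
        e = next;
        handled++;
    }
    return handled;
}

/* Example handler: counts "add" requests into the int that ctx points to. */
void mini_count_adds(int fd, int type, void *ctx) {
    (void)fd;
    if (type == 1) ++*(int *)ctx;
}
```

Detaching the whole list before processing keeps the critical section short, so threads pushing new requests are never blocked behind a slow handler.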

event_dispatcher analysis

In order to implement different event distribution mechanisms, poll, epoll, etc. are abstracted into an event_dispatcher structure. The specific implementation of event_dispatcher includes poll_dispatcher and epoll_dispatcher.

/** Abstract event_dispatcher structure; implementations wrap I/O multiplexing such as select, poll, or epoll. */
struct event_dispatcher {
    /** Name of the implementation */
    const char *name;

    /** Initialization function */
    void *(*init)(struct event_loop * eventLoop);

    /** Tell the dispatcher to add a channel event */
    int (*add)(struct event_loop * eventLoop, struct channel * channel);

    /** Tell the dispatcher to delete a channel event */
    int (*del)(struct event_loop * eventLoop, struct channel * channel);

    /** Tell the dispatcher to update the events of a channel */
    int (*update)(struct event_loop * eventLoop, struct channel * channel);

    /** Distribute events, then call event_loop's event_activate method to run the callbacks */
    int (*dispatch)(struct event_loop * eventLoop, struct timeval *);

    /** Clean up data */
    void (*clear)(struct event_loop * eventLoop);
};
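To illustrate how a table of function pointers lets the loop swap multiplexing mechanisms, here is a deliberately reduced sketch. It uses a hypothetical mini_dispatcher with a single wait_readable operation and pairs poll with select (rather than epoll) so that it stays portable across POSIX systems; none of these names come from the framework.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical, reduced dispatcher: a name plus one operation. */
struct mini_dispatcher {
    const char *name;
    /* Returns 1 if fd is readable within the timeout, 0 on timeout, -1 on error. */
    int (*wait_readable)(int fd, struct timeval *timeout);
};

static int poll_wait_readable(int fd, struct timeval *timeout) {
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int ms = (int)(timeout->tv_sec * 1000 + timeout->tv_usec / 1000);
    int n = poll(&pfd, 1, ms);
    if (n < 0) return -1;
    return (n > 0 && (pfd.revents & (POLLIN | POLLHUP))) ? 1 : 0;
}

static int select_wait_readable(int fd, struct timeval *timeout) {
    fd_set readfds;
    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);
    struct timeval tv = *timeout;      /* select may modify its timeout argument */
    int n = select(fd + 1, &readfds, NULL, NULL, &tv);
    if (n < 0) return -1;
    return (n > 0 && FD_ISSET(fd, &readfds)) ? 1 : 0;
}

static const struct mini_dispatcher poll_mini_dispatcher = { "poll", poll_wait_readable };
static const struct mini_dispatcher select_mini_dispatcher = { "select", select_wait_readable };

/* Picks an implementation; callers only ever see the common interface. */
const struct mini_dispatcher *mini_dispatcher_choose(int prefer_poll) {
    return prefer_poll ? &poll_mini_dispatcher : &select_mini_dispatcher;
}

/* Usage sketch: returns 1 if the dispatcher sees a pipe go from idle to readable. */
int mini_dispatcher_demo(const struct mini_dispatcher *d) {
    int fds[2];
    if (pipe(fds) != 0) return -1;

    struct timeval tv = { 0, 100000 };             /* 100 ms */
    int before = d->wait_readable(fds[0], &tv);    /* nothing written yet */
    char byte = 'x';
    int after = -1;
    if (write(fds[1], &byte, 1) == 1)
        after = d->wait_readable(fds[0], &tv);

    close(fds[0]);
    close(fds[1]);
    return before == 0 && after == 1;
}
```

The calling code in mini_dispatcher_demo is identical for both implementations, which is the property the full event_dispatcher structure provides for init, add, del, update, dispatch, and clear.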

channel object analysis

The channel object is the main structure used to interact with event_dispatcher, and it abstracts the target of event distribution. A channel corresponds to one file descriptor, on which READ or WRITE events can occur. The channel object is bound to the event handling functions event_read_callback and event_write_callback.

typedef int (*event_read_callback)(void *data);
typedef int (*event_write_callback)(void *data);

struct channel {
    int fd;
    int events;   // event types of interest

    event_read_callback eventReadCallback;
    event_write_callback eventWriteCallback;
    void *data; // callback data: may be an event_loop, a tcp_server, or a tcp_connection
};

channel_map object analysis

After the event_dispatcher obtains the list of active events, it needs to find the corresponding channel from the file descriptor, so that it can invoke the event handling functions event_read_callback and event_write_callback on that channel. The channel_map object is designed for this purpose.

/**
 * channel map; the key is the corresponding socket file descriptor
 */
struct channel_map {
    void **entries;

    /* The number of entries available in entries */
    int nentries;
};

The channel_map object is an array: the index is the file descriptor, and the element is the address of the channel object. For example, the channel corresponding to file descriptor 3 can be obtained directly like this:

struct channel *channel = map->entries[3];

When event_dispatcher needs to invoke the read or write callback on a channel, it calls channel_event_activate, whose implementation follows. After finding the corresponding channel object, it calls the read function or the write function according to the event type. Note that EVENT_READ and EVENT_WRITE are used here to abstract over the read and write event types of poll and epoll.

int channel_event_activate(struct event_loop *eventLoop, int fd, int revents) {
    struct channel_map *map = eventLoop->channelMap;
    yolanda_msgx("activate channel fd == %d, revents=%d, %s", fd, revents, eventLoop->thread_name);

    if (fd < 0)
        return 0;
    if (fd >= map->nentries)
        return (-1);

    struct channel *channel = map->entries[fd];
    assert(fd == channel->fd);
    if (revents & (EVENT_READ)) {
        if (channel->eventReadCallback) 
            channel->eventReadCallback(channel->data);
    }
    if (revents & (EVENT_WRITE)) {
        if (channel->eventWriteCallback) 
            channel->eventWriteCallback(channel->data);
    }
    return 0;
}
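A poll-based dispatcher needs to translate between poll's flags and the framework's abstract event types before calling channel_event_activate. The sketch below shows one way this mapping could look. The MINI_EVENT_ values are made up for illustration, and treating POLLHUP and POLLERR as readable (so that the read callback observes EOF or the error) is a design assumption rather than something stated in the source.

```c
#include <assert.h>
#include <poll.h>

/* Hypothetical abstract event bits; the framework's real values may differ. */
#define MINI_EVENT_READ  0x02
#define MINI_EVENT_WRITE 0x04

/* Converts the revents reported by poll() into abstract event bits. */
int mini_events_from_poll(short revents) {
    int events = 0;
    /* Hang-up and error are surfaced through the read path. */
    if (revents & (POLLIN | POLLHUP | POLLERR))
        events |= MINI_EVENT_READ;
    if (revents & POLLOUT)
        events |= MINI_EVENT_WRITE;
    return events;
}

/* Converts abstract event bits into the flags to request from poll(). */
short mini_events_to_poll(int events) {
    short flags = 0;
    if (events & MINI_EVENT_READ)
        flags |= POLLIN;
    if (events & MINI_EVENT_WRITE)
        flags |= POLLOUT;
    return flags;
}
```

An epoll-based dispatcher would carry the same pair of functions with EPOLLIN and EPOLLOUT in place of POLLIN and POLLOUT, which is why the callbacks above never need to know which mechanism is in use.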

Adding, deleting, and modifying channel events

So how is a new channel event added? The following functions add, delete, and modify channel events.

int event_loop_add_channel_event(struct event_loop *eventLoop, int fd, struct channel *channel1);
int event_loop_remove_channel_event(struct event_loop *eventLoop, int fd, struct channel *channel1);
int event_loop_update_channel_event(struct event_loop *eventLoop, int fd, struct channel *channel1);

These three functions are only entry points; the actual work is done by the following three functions:

int event_loop_handle_pending_add(struct event_loop *eventLoop, int fd, struct channel *channel);
int event_loop_handle_pending_remove(struct event_loop *eventLoop, int fd, struct channel *channel);
int event_loop_handle_pending_update(struct event_loop *eventLoop, int fd, struct channel *channel);

Let's look at one of the implementations. event_loop_handle_pending_add adds a new key-value pair to the channel_map of the current event_loop: the key is the file descriptor and the value is the address of the channel object. It then calls the add method of the event_dispatcher object to register the channel event. Note that this function is always executed in the current I/O thread.

// in the i/o thread
int event_loop_handle_pending_add(struct event_loop *eventLoop, int fd, struct channel *channel) {
    yolanda_msgx("add channel fd == %d, %s", fd, eventLoop->thread_name);
    struct channel_map *map = eventLoop->channelMap;
    if (fd < 0)
        return 0;
    if (fd >= map->nentries) {
        if (map_make_space(map, fd, sizeof(struct channel *)) == -1)
            return (-1);
    }

    // first registration for this fd: add it
    if ((map)->entries[fd] == NULL) {
        map->entries[fd] = channel;
        //add channel
        const struct event_dispatcher *eventDispatcher = eventLoop->eventDispatcher;
        eventDispatcher->add(eventLoop, channel);
        return 1;
    }
    return 0;
}
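The function above relies on map_make_space to grow the array when a file descriptor exceeds the current size, but the article does not show it. The following is a sketch of what such a helper might do, assuming a doubling strategy that starts at 32 slots; mini_channel_map, mini_map_make_space, and mini_map_lookup are hypothetical names, and the signature differs from the one used in the article.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical map with the same fields as struct channel_map. */
struct mini_channel_map {
    void **entries;
    int nentries;
};

/* Grows the map so that index `slot` is valid. Returns 0 on success, -1 on failure. */
int mini_map_make_space(struct mini_channel_map *map, int slot) {
    assert(map != NULL && slot >= 0);
    if (slot < map->nentries) return 0;

    int nentries = map->nentries > 0 ? map->nentries : 32;
    while (nentries <= slot) {
        if (nentries > INT_MAX / 2) return -1;   /* doubling would overflow */
        nentries *= 2;
    }

    void **grown = realloc(map->entries, (size_t)nentries * sizeof(void *));
    if (grown == NULL) return -1;

    /* New slots start as NULL so that "entries[fd] == NULL" means "not registered". */
    for (int i = map->nentries; i < nentries; i++)
        grown[i] = NULL;

    map->entries = grown;
    map->nentries = nentries;
    return 0;
}

/* Bounds-checked lookup: returns NULL for unknown or out-of-range descriptors. */
void *mini_map_lookup(const struct mini_channel_map *map, int fd) {
    if (fd < 0 || fd >= map->nentries) return NULL;
    return map->entries[fd];
}
```

Indexing by file descriptor works well because the kernel hands out the lowest available descriptor number, so the array stays dense, and doubling keeps the amortized cost of growth constant.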

Summary

In this lecture, we introduced the main design ideas and basic data structures of the high-performance network programming framework, as well as specific practices related to reactor design.

Learn the new by reviewing the past!
