Linux network programming: write your own high-performance HTTP server framework (2)

GitHub: https://github.com/froghui/yolanda

I/O model and multi-threaded model implementation

  • Several considerations for multithreaded design

In our design, the main reactor thread is the acceptor thread. Once created, it runs as an event_loop and blocks in the dispatch method of event_dispatcher, waiting for an event on the listening socket, that is, for a connection to be established. When a connection is established, the tcp_connection object and the channel object for it are created.

When the user asks for multiple sub-reactor threads, the main thread creates them. Each child thread starts running as soon as it is created and initializes itself in the startup function specified by the main thread. This raises the first key question: how does the main thread know that a child thread has finished initializing, so that it can continue?

With multiple threads configured, the read and write events of a newly established connected socket must be handed to a sub-reactor thread. A thread is therefore picked from thread_pool and notified that a new event has been added. That thread, however, is most likely blocked in the event dispatch call. How the main thread hands data over to the child thread in a coordinated way is the second key question.

Each child thread is an event_loop thread that blocks in dispatch. When an event occurs, it looks up channel_map, finds the corresponding handler, and executes it. It then adds, deletes, or modifies the pending events and enters the next round of dispatch.

The following figure illustrates how the threads relate to one another at run time:

[Figure: running relationship of the threads]

To make this easier to follow, a second figure lists the corresponding function implementations.

[Figure: the functions that implement each step]

  • The main thread waits for multiple sub-reactor sub-threads to be initialized

The main thread has to wait for each child thread to finish initializing. In other words, it needs feedback on a piece of data that belongs to the child thread, and that data is set up during the child thread's initialization. This is a classic notification problem between threads. The solution, as mentioned earlier, uses two tools: a mutex and a condition variable.

The following code is where the main thread creates the child threads: it calls event_loop_thread_init to initialize each one and then event_loop_thread_start to start it. Note that if the application specifies a thread pool size of 0, the function returns immediately, so the acceptor and the I/O events are handled in the same main thread, which degenerates into the single-reactor mode.

// Must be called from the main thread
void thread_pool_start(struct thread_pool *threadPool) {
    assert(!threadPool->started);
    assertInSameThread(threadPool->mainLoop);

    threadPool->started = 1;
    // With no sub-reactor threads, everything runs in the main loop
    if (threadPool->thread_number <= 0) {
        return;
    }

    threadPool->eventLoopThreads = malloc(threadPool->thread_number * sizeof(struct event_loop_thread));
    for (int i = 0; i < threadPool->thread_number; ++i) {
        // Initialize the bookkeeping for thread i, then start it and wait until it is ready
        event_loop_thread_init(&threadPool->eventLoopThreads[i], i);
        event_loop_thread_start(&threadPool->eventLoopThreads[i]);
    }
}

Now look at event_loop_thread_start, which must also be run by the main thread. It uses pthread_create to create the child thread, and the child thread starts executing event_loop_thread_run as soon as it is created. As we will see, event_loop_thread_run is what initializes the child thread. The most important part of event_loop_thread_start is that it locks and unlocks with pthread_mutex_lock and pthread_mutex_unlock, and uses pthread_cond_wait to wait until the eventLoop field of eventLoopThread has been set.

// Called by the main thread: creates a child thread and has it start running its event_loop
struct event_loop *event_loop_thread_start(struct event_loop_thread *eventLoopThread) {
    pthread_create(&eventLoopThread->thread_tid, NULL, &event_loop_thread_run, eventLoopThread);

    // Keep the calls outside assert() so they still run when NDEBUG is defined
    int rc = pthread_mutex_lock(&eventLoopThread->mutex);
    assert(rc == 0);

    // Wait until the child thread has published its event loop
    while (eventLoopThread->eventLoop == NULL) {
        rc = pthread_cond_wait(&eventLoopThread->cond, &eventLoopThread->mutex);
        assert(rc == 0);
    }
    rc = pthread_mutex_unlock(&eventLoopThread->mutex);
    assert(rc == 0);
    (void) rc;

    yolanda_msgx("event loop thread started, %s", eventLoopThread->thread_name);
    return eventLoopThread->eventLoop;
}

Why is it done this way? The child thread's code makes it clear. The child thread's entry function, event_loop_thread_run, also takes the lock and then initializes the event_loop object. When initialization is complete, it calls pthread_cond_signal to notify the main thread, which may be blocked in pthread_cond_wait at that moment. The main thread then wakes up from the wait and continues with the code that follows. The child thread itself calls event_loop_run and enters an infinite event dispatch loop, waiting for the events registered on its reactor.

void *event_loop_thread_run(void *arg) {
    struct event_loop_thread *eventLoopThread = (struct event_loop_thread *) arg;

    pthread_mutex_lock(&eventLoopThread->mutex);

    // Initialize the event loop, then notify the main thread
    eventLoopThread->eventLoop = event_loop_init();
    yolanda_msgx("event loop thread init and signal, %s", eventLoopThread->thread_name);
    pthread_cond_signal(&eventLoopThread->cond);

    pthread_mutex_unlock(&eventLoopThread->mutex);

    // Run the child thread's event loop
    eventLoopThread->eventLoop->thread_name = eventLoopThread->thread_name;
    event_loop_run(eventLoopThread->eventLoop);
    return NULL;
}

As you can see, the variable shared by the main thread and the child thread is the eventLoop field of each event_loop_thread. It is NULL at first and becomes non-NULL only once the child thread has initialized it. This change marks the completion of the child thread's initialization, and it is the state protected by the mutex and signalled through the condition variable. Using the lock together with the condition variable solves the synchronization problem between the main thread and the child thread: once the child thread is initialized, the main thread continues.

struct event_loop_thread {
    struct event_loop *eventLoop;
    pthread_t thread_tid;        /* thread ID */
    pthread_mutex_t mutex;
    pthread_cond_t cond;
    char * thread_name;
    long thread_count;    /* # connections handled */
};

You may ask: the main thread waits for the child threads one by one in a loop. What if, when it reaches the second iteration and is about to wait for the second child thread, that thread has already finished initializing? Note that the check is done under the lock. Once the main thread holds the lock and finds that the eventLoop field of that event_loop_thread is already non-NULL, it knows the thread has been initialized, so it skips the wait, releases the lock, and moves on.

You may also ask whether the lock must be held when pthread_cond_wait is called. It must. When the parent thread calls pthread_cond_wait, it atomically releases the mutex it holds and goes to sleep. When it returns from pthread_cond_wait (after the child thread's pthread_cond_signal), it holds the lock again.
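
The handshake described in this section can be reduced to a small standalone sketch. It is not the framework's code: struct worker, worker_run, and worker_start are hypothetical names, and an int flag named ready stands in for the eventLoop pointer changing from NULL to non-NULL.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for struct event_loop_thread: `ready` plays the
 * role of the eventLoop pointer changing from NULL to non-NULL. */
struct worker {
    pthread_t tid;
    pthread_mutex_t mutex;
    pthread_cond_t cond;
    int ready;
};

/* Child thread: publish the state change under the lock, then signal. */
static void *worker_run(void *arg) {
    struct worker *w = arg;
    pthread_mutex_lock(&w->mutex);
    w->ready = 1;
    pthread_cond_signal(&w->cond);
    pthread_mutex_unlock(&w->mutex);
    return NULL;
}

/* Parent thread: start the child and block until it reports ready.
 * Returns 0 on success, or the pthread_create error code otherwise. */
int worker_start(struct worker *w) {
    assert(w != NULL);
    w->ready = 0;
    pthread_mutex_init(&w->mutex, NULL);
    pthread_cond_init(&w->cond, NULL);

    int rc = pthread_create(&w->tid, NULL, worker_run, w);
    if (rc != 0) {
        return rc;
    }

    pthread_mutex_lock(&w->mutex);
    /* Re-checking the predicate in a loop covers both a signal that was
     * sent before we got here and a spurious wakeup. */
    while (!w->ready) {
        pthread_cond_wait(&w->cond, &w->mutex);
    }
    pthread_mutex_unlock(&w->mutex);
    return 0;
}
```

A caller would declare a struct worker, call worker_start on it, and later pthread_join the thread. If the child signals before the parent reaches the loop, the parent sees ready already set and never calls pthread_cond_wait at all, which is exactly the "already initialized" case discussed above.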

  • Add connected socket events to the sub-reactor thread

As mentioned earlier, the main thread is the main reactor thread and is responsible for detecting events on the listening socket. When such an event occurs, a connection has been established. If there are sub-reactor threads, what we want is for the I/O events of this connected socket to be handed to a sub-reactor thread for detection. The advantage is that the main reactor only accepts connections, so it stays highly responsive, and on a multi-core machine the sub-reactors can take advantage of the extra cores.

We know that a sub-reactor thread runs an infinite event loop. When none of its registered events occur, the thread is blocked in the dispatch of event_dispatcher; you can simply think of it as blocked in poll or epoll_wait. In this situation, how can the main thread hand the connected socket over to the sub-reactor thread?

If we can make the sub-reactor thread return from dispatch and then have it register the events of the new connected socket, the job is done.

How do we make the sub-reactor thread return from dispatch? The answer is to create a pipe-like descriptor and register it with event_dispatcher. Whenever we want to wake the sub-reactor thread, we write one character to it.

In event_loop_init, socketpair is called to create a pair of connected sockets that serves exactly this purpose: when one end is written to, the other end sees a read event. A UNIX pipe could be used here as well, with the same effect.

struct event_loop *event_loop_init() {
    ...
    // Create a socket pair whose purpose is to wake up the child thread
    eventLoop->owner_thread_id = pthread_self();
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, eventLoop->socketPair) < 0) {
        LOG_ERR("socketpair set failed");
    }
    eventLoop->is_handle_pending = 0;
    eventLoop->pending_head = NULL;
    eventLoop->pending_tail = NULL;
    eventLoop->thread_name = "main thread";

    // Watch socketPair[1] for readability; handleWakeup runs when a byte arrives
    struct channel *channel = channel_new(eventLoop->socketPair[1], EVENT_READ, handleWakeup, NULL, eventLoop);
    event_loop_add_channel_event(eventLoop, eventLoop->socketPair[1], channel);

    return eventLoop;
}

Pay special attention to the following line. It tells event_loop to register the READ event on the socketPair[1] descriptor; when a READ event occurs, handleWakeup is called to handle it.

struct channel *channel = channel_new(eventLoop->socketPair[1], EVENT_READ, handleWakeup, NULL, eventLoop);

In fact, this function does nothing more than read one character from the socketPair[1] descriptor. Its sole purpose is to wake the child thread from its blocking dispatch.

int handleWakeup(void *data) {
    struct event_loop *eventLoop = (struct event_loop *) data;
    char one;
    // Consume the wake-up byte so the descriptor stops reporting readable
    ssize_t n = read(eventLoop->socketPair[1], &one, sizeof one);
    if (n != sizeof one) {
        LOG_ERR("handleWakeup failed");
    }
    yolanda_msgx("wakeup, %s", eventLoop->thread_name);
    return 0;
}
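
The wake-up mechanism can also be tried in isolation. The following is a minimal sketch rather than the framework's code: it uses poll directly instead of event_dispatcher, and wait_readable, loop_wakeup, and drain_wakeup are hypothetical helper names. Only socketpair, poll, read, and write are real system calls.

```c
#include <assert.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if fd becomes readable within timeout_ms, 0 on timeout,
 * -1 on error. This is where a sub-reactor would be blocked. */
int wait_readable(int fd, int timeout_ms) {
    assert(fd >= 0);
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int n = poll(&pfd, 1, timeout_ms);
    if (n < 0) {
        return -1;
    }
    return (n > 0 && (pfd.revents & POLLIN)) ? 1 : 0;
}

/* Waker side (e.g. the main thread): write one byte to end [0]. */
int loop_wakeup(int sp[2]) {
    char one = 'a';
    return write(sp[0], &one, sizeof one) == (ssize_t) sizeof one ? 0 : -1;
}

/* Loop side: read the byte from end [1] so the next poll blocks again. */
int drain_wakeup(int sp[2]) {
    char one;
    return read(sp[1], &one, sizeof one) == (ssize_t) sizeof one ? 0 : -1;
}
```

After socketpair(AF_UNIX, SOCK_STREAM, 0, sp) succeeds, wait_readable(sp[1], ...) times out until another thread calls loop_wakeup(sp). Draining the byte matters: if it were left unread, every later poll would return immediately.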

Now let's go back and see what the main thread does when a new connection arrives. In handle_connection_established, the connected socket is obtained by calling accept and set to non-blocking (remember this step), and then thread_pool_get_loop is called to obtain an event_loop. The logic of thread_pool_get_loop is simple: it picks the threads in the thread_pool in order, one after another. After that, the tcp_connection object is created.

// Callback invoked when a connection has been established
int handle_connection_established(void *data) {
    struct TCPserver *tcpServer = (struct TCPserver *) data;
    struct acceptor *acceptor = tcpServer->acceptor;
    int listenfd = acceptor->listen_fd;

    struct sockaddr_in client_addr;
    socklen_t client_len = sizeof(client_addr);
    // Accept the established socket and make it non-blocking
    int connected_fd = accept(listenfd, (struct sockaddr *) &client_addr, &client_len);
    make_nonblocking(connected_fd);

    yolanda_msgx("new connection established, socket == %d", connected_fd);

    // Pick an event loop from the thread pool to serve the new connected socket
    struct event_loop *eventLoop = thread_pool_get_loop(tcpServer->threadPool);

    // Create a tcp_connection object for the new socket and pass it the application's callbacks
    struct tcp_connection *tcpConnection = tcp_connection_new(connected_fd, eventLoop,tcpServer->connectionCompletedCallBack,tcpServer->connectionClosedCallBack,tcpServer->messageCallBack,tcpServer->writeCompletedCallBack);
    // For use inside the callbacks
    if (tcpServer->data != NULL) {
        tcpConnection->data = tcpServer->data;
    }
    return 0;
}
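
thread_pool_get_loop itself is not listed in this article. As a rough idea of what "picking threads in order" means, here is a minimal round-robin sketch. It is an assumption, not the framework's implementation: struct loop, struct pool, pool_get_loop, and the position field are hypothetical stand-ins for the real types.

```c
#include <assert.h>
#include <stddef.h>

struct loop { const char *name; };   /* stand-in for struct event_loop */

struct pool {                        /* stand-in for struct thread_pool */
    struct loop *main_loop;   /* used when there are no sub-reactors */
    struct loop **loops;      /* one entry per sub-reactor thread */
    int thread_number;
    int position;             /* index of the next loop to hand out */
};

/* Returns the next sub-reactor loop in round-robin order, or the main
 * loop when the pool has no sub-reactor threads (single-reactor mode). */
struct loop *pool_get_loop(struct pool *p) {
    assert(p != NULL);
    struct loop *selected = p->main_loop;
    if (p->thread_number > 0) {
        selected = p->loops[p->position];
        p->position = (p->position + 1) % p->thread_number;
    }
    return selected;
}
```

With two sub-reactors, successive calls return the first loop, the second, then the first again. With thread_number set to 0, every call returns the main loop, which matches the single-reactor fallback described for thread_pool_start.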

In tcp_connection_new, which creates the tcp_connection object, you can see that a channel object is created first with the READ event registered, and then event_loop_add_channel_event is called to add the channel object to the child thread.

struct tcp_connection *
tcp_connection_new(int connected_fd, struct event_loop *eventLoop,
                   connection_completed_call_back connectionCompletedCallBack,
                   connection_closed_call_back connectionClosedCallBack,
                   message_call_back messageCallBack, write_completed_call_back writeCompletedCallBack) {
    ...
    // Create a channel for the new connection and register the read event
    struct channel *channel1 = channel_new(connected_fd, EVENT_READ, handle_read, handle_write, tcpConnection);
    tcpConnection->channel = channel1;

    // Invoke the connectionCompleted callback
    if (tcpConnection->connectionCompletedCallBack != NULL) {
        tcpConnection->connectionCompletedCallBack(tcpConnection);
    }

    // Register the socket's channel with the event_loop's event dispatcher
    event_loop_add_channel_event(tcpConnection->eventLoop, connected_fd, tcpConnection->channel);
    return tcpConnection;
}

Note that everything so far has run in the main thread, and event_loop_do_channel_event below is no exception. Its first step should be familiar by now: taking the lock. Once the lock is acquired, the main thread calls event_loop_channel_buffer_nolock to add the channel event to be processed to the child thread's data; all added channel objects are kept in the child thread's data structure as a list. The next part is the key point. If the caller is not the thread that owns the event loop, event_loop_wakeup is called to wake up the event_loop child thread. Waking it up is simple: write one byte to socketPair[0]. Don't forget that event_loop has registered the readable event on socketPair[1]. If the caller is the event loop thread itself, it calls event_loop_handle_pending_channel directly to process the list of newly added channel events.

int event_loop_do_channel_event(struct event_loop *eventLoop, int fd, struct channel *channel1, int type) {
    // Get the lock
    pthread_mutex_lock(&eventLoop->mutex);
    assert(eventLoop->is_handle_pending == 0);
    // Append the new channel to this loop's pending channel list
    event_loop_channel_buffer_nolock(eventLoop, fd, channel1, type);
    // Release the lock
    pthread_mutex_unlock(&eventLoop->mutex);
    // If called from the main thread, wake the child thread with event_loop_wakeup
    if (!isInSameThread(eventLoop)) {
        event_loop_wakeup(eventLoop);
    } else {
        // If called from the child thread itself, process the list directly
        event_loop_handle_pending_channel(eventLoop);
    }
    return 0;
}
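
The pattern used here, appending to a list under a lock and letting the owning thread drain it later, can be sketched on its own. This is an illustration under my own assumptions, not the framework's code: struct pending, struct loop, loop_enqueue, and loop_drain are hypothetical names, and the items are bare file descriptors rather than channel objects. The sketch detaches the whole list before processing it so that handlers run without the lock held; the framework may do this differently.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

struct pending {                 /* hypothetical list node */
    int fd;
    struct pending *next;
};

struct loop {                    /* hypothetical stand-in for struct event_loop */
    pthread_mutex_t mutex;
    struct pending *head, *tail;
};

void loop_init(struct loop *l) {
    assert(l != NULL);
    pthread_mutex_init(&l->mutex, NULL);
    l->head = l->tail = NULL;
}

/* Producer side (any thread): append under the lock. Returns 0 or -1. */
int loop_enqueue(struct loop *l, int fd) {
    struct pending *node = malloc(sizeof *node);
    if (node == NULL) {
        return -1;
    }
    node->fd = fd;
    node->next = NULL;

    pthread_mutex_lock(&l->mutex);
    if (l->tail == NULL) {
        l->head = node;
    } else {
        l->tail->next = node;
    }
    l->tail = node;
    pthread_mutex_unlock(&l->mutex);
    return 0;
}

/* Consumer side (loop thread): detach the whole list under the lock,
 * then process it without the lock. Returns the number of items. */
int loop_drain(struct loop *l, void (*handle)(int fd)) {
    pthread_mutex_lock(&l->mutex);
    struct pending *node = l->head;
    l->head = l->tail = NULL;
    pthread_mutex_unlock(&l->mutex);

    int count = 0;
    while (node != NULL) {
        struct pending *next = node->next;
        if (handle != NULL) {
            handle(node->fd);
        }
        free(node);
        node = next;
        ++count;
    }
    return count;
}
```

In the real flow, the enqueue step would be followed by the one-byte wake-up write whenever the caller is not the loop's own thread, and the drain step would run right after dispatch returns.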

Once the event_loop has been woken up, event_loop_handle_pending_channel is executed next. As you can see, the loop body also calls event_loop_handle_pending_channel right after returning from dispatch.

int event_loop_run(struct event_loop *eventLoop) {
    assert(eventLoop != NULL);

    struct event_dispatcher *dispatcher = eventLoop->eventDispatcher;
    // The loop may only be run by the thread that created it
    if (eventLoop->owner_thread_id != pthread_self()) {
        exit(1);
    }

    yolanda_msgx("event loop run, %s", eventLoop->thread_name);
    struct timeval timeval;
    timeval.tv_sec = 1;
    timeval.tv_usec = 0;

    while (!eventLoop->quit) {
        // Block here to wait for I/O events, and get active channels
        dispatcher->dispatch(eventLoop, &timeval);

        // Process pending channels; this also runs right away when the child thread is woken up
        event_loop_handle_pending_channel(eventLoop);
    }
    yolanda_msgx("event loop end, %s", eventLoop->thread_name);
    return 0;
}

event_loop_handle_pending_channel traverses the list of pending channel events in the current event loop and applies them to event_dispatcher, updating the set of events of interest. One point is worth noting here: after the event loop thread receives an event, it invokes the event handler, so application code such as onMessage also runs in the event loop thread. If the business logic there is too heavy, event_loop_handle_pending_channel runs late, which in turn delays I/O detection. A common practice is therefore to separate the I/O threads from the business logic threads: the I/O threads handle only I/O, and the business threads handle the business logic.
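
One way to picture that separation is a small hand-off queue between an I/O thread and a business thread. The sketch below is only an illustration and is not part of the framework: struct task_queue, queue_submit, queue_close, and business_thread are hypothetical names, the tasks are plain integers, and squaring a number stands in for slow business logic.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define QUEUE_CAP 16

struct task_queue {
    pthread_mutex_t mutex;
    pthread_cond_t not_empty;
    int items[QUEUE_CAP];
    int head, count;
    int closed;
    long processed_sum;   /* result of the "business logic", for illustration */
};

void queue_init(struct task_queue *q) {
    assert(q != NULL);
    pthread_mutex_init(&q->mutex, NULL);
    pthread_cond_init(&q->not_empty, NULL);
    q->head = q->count = q->closed = 0;
    q->processed_sum = 0;
}

/* Called by the I/O thread: never waits for space. Returns 0 on success,
 * -1 if the queue is full or closed. */
int queue_submit(struct task_queue *q, int item) {
    int rc = -1;
    pthread_mutex_lock(&q->mutex);
    if (!q->closed && q->count < QUEUE_CAP) {
        q->items[(q->head + q->count) % QUEUE_CAP] = item;
        q->count++;
        rc = 0;
        pthread_cond_signal(&q->not_empty);
    }
    pthread_mutex_unlock(&q->mutex);
    return rc;
}

/* Stop accepting work; the business thread exits once the queue is empty. */
void queue_close(struct task_queue *q) {
    pthread_mutex_lock(&q->mutex);
    q->closed = 1;
    pthread_cond_broadcast(&q->not_empty);
    pthread_mutex_unlock(&q->mutex);
}

/* Business thread: takes one task at a time and runs it outside the lock. */
void *business_thread(void *arg) {
    struct task_queue *q = arg;
    for (;;) {
        pthread_mutex_lock(&q->mutex);
        while (q->count == 0 && !q->closed) {
            pthread_cond_wait(&q->not_empty, &q->mutex);
        }
        if (q->count == 0) {              /* closed and fully drained */
            pthread_mutex_unlock(&q->mutex);
            return NULL;
        }
        int item = q->items[q->head];
        q->head = (q->head + 1) % QUEUE_CAP;
        q->count--;
        pthread_mutex_unlock(&q->mutex);

        long result = (long) item * item;  /* stand-in for slow work */

        pthread_mutex_lock(&q->mutex);
        q->processed_sum += result;
        pthread_mutex_unlock(&q->mutex);
    }
}
```

In a server built this way, a callback such as onMessage would only call queue_submit and return, so the event loop thread gets back to dispatch quickly while the business thread does the heavy lifting.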

 

Learn the new by reviewing the past!

 

Origin blog.csdn.net/qq_24436765/article/details/104976313