[Advanced Network] Server Models: Reactor and Proactor


In high-concurrency network programming, handling a message on a connection can usually be divided into two stages: waiting for the message to be ready and processing the message. With the default blocking sockets these two stages are combined (for example, one thread is dedicated to each connection): the thread handling a socket must wait for its message to become ready, so under high concurrency threads sleep and wake up frequently, which hurts CPU efficiency.

To improve efficiency under high concurrency, the two stages can be separated: the code that waits for a message to become ready is split from the code that processes it. This requires non-blocking sockets; otherwise the message-processing code may put the thread into a wait state when its condition is not met. The waiting stage can then be implemented either by having threads actively poll, or by having a single thread wait on all connections. The latter is I/O multiplexing: it monitors many connections at once, and although the waiting thread may still sleep, whenever it is woken up at least one connection is guaranteed to be ready for processing.
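
As a minimal sketch of this separation (assuming a set of already-accepted connection sockets; error handling omitted), the waiting phase below is a single poll() call over all connections, and the processing phase only touches the connections that poll() reports as ready:

#include <fcntl.h>
#include <poll.h>
#include <unistd.h>
#include <vector>

// fds: already-accepted connection sockets
void serve(const std::vector<int>& fds) {
    std::vector<pollfd> pfds;
    for (int fd : fds) {
        fcntl(fd, F_SETFL, fcntl(fd, F_GETFL, 0) | O_NONBLOCK);  // non-blocking sockets
        pollfd p{};
        p.fd = fd;
        p.events = POLLIN;
        pfds.push_back(p);
    }
    while (true) {
        poll(pfds.data(), pfds.size(), -1);   // phase 1: sleep until some connection is ready
        for (pollfd& p : pfds) {
            if (p.revents & POLLIN) {         // phase 2: process only the ready connections
                char buf[1024];
                ssize_t n = read(p.fd, buf, sizeof(buf));
                (void)n;                      // handle the message here
            }
        }
    }
}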

High-performance server programs usually need to handle three types of events: I/O events, timer events, and signals. For efficient event handling there are two main models, Reactor and Proactor, which take different approaches to the waiting and processing stages in order to improve server performance under high concurrency.

1. Reactor model

First, recall the mechanism of an ordinary function call: the program calls a function, the function executes while the program waits, then the function returns its result and control to the program, which continues processing. The Reactor, by contrast, is an event-driven mechanism that inverts this flow: instead of actively calling an API to get work done, the application provides the corresponding interfaces and registers them with the Reactor. When the corresponding event occurs, the Reactor calls the interface the application registered; these interfaces are usually called "callback functions". In this way the Reactor implements event-driven programming and improves the responsiveness and efficiency of the program.
The Reactor pattern is a common pattern for handling concurrent I/O and is suited to synchronous I/O. The core idea is to register all pending I/O events with a central I/O multiplexer and let the main thread/process block on that multiplexer. When an I/O event arrives or becomes ready (for example, a file descriptor or socket becomes readable or writable), the multiplexer returns and dispatches the pre-registered event to the corresponding handler. This allows the Reactor pattern to handle concurrent I/O efficiently and improves performance under high concurrency.

The Reactor model has three important components:

  • Reactor: the core component, responsible for running the event loop and dispatching I/O events. It accepts client connections, listens for I/O events, and dispatches them to the corresponding event handlers. The Reactor can be single-threaded or multi-threaded, depending on the application scenario and performance requirements.
  • Handlers: event handlers are the objects that deal with specific I/O events. Each handler usually corresponds to one client connection or resource (a file, a socket, and so on) and performs the work associated with it, such as reading data, running business logic, and sending responses. Handlers can be synchronous or asynchronous, depending on the implementation.
  • Demultiplexer: the event demultiplexer obtains I/O events from the operating system and passes them to the Reactor. Common implementations include select, poll, epoll, and kqueue. The demultiplexer is what allows the Reactor to watch many I/O events at the same time and thus achieve high concurrency.


The specific process of event handling:

  • The Reactor accepts a client connection and registers the connection's I/O events with the demultiplexer, associating event handlers with those events.
  • The demultiplexer detects and collects I/O events from the operating system. When events occur, it notifies the Reactor.
  • The Reactor matches the reported I/O events with the previously registered handlers and dispatches each event to the matching handler.
  • The handler performs the corresponding work for the event it receives, such as reading data, running business logic, and sending a response.
  • After an event has been handled, the Reactor can update the association between handlers and I/O events as needed, ready to process subsequent events.

Advantages of the Reactor model:

  • High concurrency: the Reactor model achieves high concurrency through its event-driven approach. The demultiplexer can detect many I/O events at once, and the Reactor dispatches them to handlers as they are reported, which raises the server's concurrent processing capacity.
  • Good scalability: scalability can be improved by adding threads or running multiple Reactor instances. On multi-core processors, a multi-threaded Reactor can spread the load and further raise system performance.
  • High resource utilization: handlers only run when an I/O event actually occurs, which avoids useless polling and wasted resources and improves the utilization of system resources.
  • Ease of management and maintenance: separating handlers from I/O events keeps the processing logic clear and easy to manage. Handlers can also be registered and deregistered dynamically, which simplifies maintenance and extension.
  • Adaptability: the model can be tailored to different scenarios and performance requirements, for example by choosing a single-threaded or multi-threaded Reactor and synchronous or asynchronous handlers.

Compared with using I/O multiplexing directly, the Reactor model improves development efficiency. It is usually implemented single-threaded, with the goal of letting one thread fully use the resources of one CPU core. A further advantage is that, in many cases, event handling does not need to worry about mutually exclusive access to shared resources. The model also has a clear drawback, however: as Moore's Law has slowed and CPU frequencies can no longer rise significantly, performance gains come mainly from adding cores, and a single-threaded Reactor cannot by itself exploit multi-core resources.

For programs whose requests are relatively simple and independent of one another, multiple reactors can simply be started, one per CPU core. The requests handled by each reactor do not interact, so multi-core resources are fully used. HTTP static servers such as Nginx use this strategy.
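
A rough sketch of this strategy on Linux (it assumes SO_REUSEPORT support, kernel 3.9+; error handling and the per-connection logic are omitted): each core runs its own reactor thread with its own listening socket and its own epoll instance, so the connections accepted by one reactor never touch another:

#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <thread>
#include <vector>

// One reactor per core: each thread binds its own listening socket to the same
// port via SO_REUSEPORT, and the kernel spreads incoming connections between them.
static void reactor_loop(unsigned short port) {
    int listen_fd = socket(AF_INET, SOCK_STREAM, 0);
    int on = 1;
    setsockopt(listen_fd, SOL_SOCKET, SO_REUSEPORT, &on, sizeof(on));
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = INADDR_ANY;
    bind(listen_fd, (sockaddr *)&addr, sizeof(addr));
    listen(listen_fd, SOMAXCONN);

    int epoll_fd = epoll_create1(0);
    epoll_event ev{};
    ev.events = EPOLLIN;
    ev.data.fd = listen_fd;
    epoll_ctl(epoll_fd, EPOLL_CTL_ADD, listen_fd, &ev);

    epoll_event events[64];
    while (true) {
        int n = epoll_wait(epoll_fd, events, 64, -1);
        for (int i = 0; i < n; ++i) {
            // accept / read / write handling for this reactor's connections only
        }
    }
}

int main() {
    unsigned cores = std::thread::hardware_concurrency();
    std::vector<std::thread> reactors;
    for (unsigned i = 0; i < cores; ++i)
        reactors.emplace_back(reactor_loop, 8080);
    for (std::thread &t : reactors) t.join();
    return 0;
}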


A TCP echo server implemented using the Reactor model in C++:

The system must support epoll, and the compiler must support C++11.

#include <iostream>
#include <cstdlib>
#include <cstring>
#include <cstdio>
#include <vector>
#include <map>
#include <unistd.h>
#include <sys/epoll.h>
#include <arpa/inet.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <sys/socket.h>

#define MAX_EVENTS 10

class Reactor {
public:
    Reactor() {
        _epoll_fd = epoll_create1(0);
        if (_epoll_fd == -1) {
            perror("epoll_create1");
            exit(EXIT_FAILURE);
        }
    }

    virtual ~Reactor() {
        close(_epoll_fd);
    }

    // Register fd with the epoll instance (the demultiplexer) for the given events.
    void add_fd(int fd, uint32_t events) {
        struct epoll_event event;
        event.data.fd = fd;
        event.events = events;
        if (epoll_ctl(_epoll_fd, EPOLL_CTL_ADD, fd, &event) == -1) {
            perror("epoll_ctl");
            exit(EXIT_FAILURE);
        }
    }

    // Remove fd from the epoll instance.
    void del_fd(int fd) {
        if (epoll_ctl(_epoll_fd, EPOLL_CTL_DEL, fd, nullptr) == -1) {
            perror("epoll_ctl");
            exit(EXIT_FAILURE);
        }
    }

    // Event loop: block on the demultiplexer and dispatch ready events to the handlers.
    void run() {
        std::vector<struct epoll_event> events(MAX_EVENTS);
        while (true) {
            int n = epoll_wait(_epoll_fd, events.data(), MAX_EVENTS, -1);
            if (n == -1) {
                perror("epoll_wait");
                exit(EXIT_FAILURE);
            }

            for (int i = 0; i < n; i++) {
                if (events[i].events & EPOLLIN) {
                    handle_input(events[i].data.fd);
                } else if (events[i].events & EPOLLOUT) {
                    handle_output(events[i].data.fd);
                }
            }
        }
    }

    virtual void handle_input(int fd) = 0;
    virtual void handle_output(int fd) = 0;

private:
    int _epoll_fd;
};

class EchoReactor : public Reactor {
public:
    EchoReactor(int listen_fd) : _listen_fd(listen_fd) {
        add_fd(_listen_fd, EPOLLIN);
    }

    void handle_input(int fd) override {
        if (fd == _listen_fd) {
            // New connection: accept it and watch it for input.
            struct sockaddr_in addr;
            socklen_t addrlen = sizeof(addr);
            int conn_fd = accept(_listen_fd, (struct sockaddr *)&addr, &addrlen);
            if (conn_fd == -1) {
                perror("accept");
                exit(EXIT_FAILURE);
            }

            make_socket_non_blocking(conn_fd);
            add_fd(conn_fd, EPOLLIN);
        } else {
            // Existing connection: read the data, then switch to waiting for writability.
            char buf[1024];
            ssize_t n = read(fd, buf, sizeof(buf));
            if (n <= 0) {
                if (n < 0) perror("read");
                del_fd(fd);
                close(fd);
            } else {
                _out_buffers[fd] = std::string(buf, n);
                del_fd(fd);
                add_fd(fd, EPOLLOUT);
            }
        }
    }

    void handle_output(int fd) override {
        auto it = _out_buffers.find(fd);
        if (it != _out_buffers.end()) {
            ssize_t n = write(fd, it->second.c_str(), it->second.size());
            if (n <= 0) {
                if (n < 0) perror("write");
                del_fd(fd);
                close(fd);
            } else {
                it->second.erase(0, n);
                if (it->second.empty()) {
                    // Everything echoed back: go back to waiting for input.
                    del_fd(fd);
                    add_fd(fd, EPOLLIN);
                }
            }
        }
    }

private:
    int make_socket_non_blocking(int sfd) {
        int flags = fcntl(sfd, F_GETFL, 0);
        if (flags == -1) {
            perror("fcntl");
            return -1;
        }

        flags |= O_NONBLOCK;
        if (fcntl(sfd, F_SETFL, flags) == -1) {
            perror("fcntl");
            return -1;
        }

        return 0;
    }

    int _listen_fd;
    std::map<int, std::string> _out_buffers;
};

int main(int argc, char *argv[]) {
    if (argc != 2) {
        std::cerr << "Usage: " << argv[0] << " <port>" << std::endl;
        exit(EXIT_FAILURE);
    }

    int port = std::stoi(argv[1]);

    int listen_fd = socket(AF_INET, SOCK_STREAM, 0);
    if (listen_fd == -1) {
        perror("socket");
        exit(EXIT_FAILURE);
    }

    int optval = 1;
    if (setsockopt(listen_fd, SOL_SOCKET, SO_REUSEADDR, &optval, sizeof(optval)) == -1) {
        perror("setsockopt");
        exit(EXIT_FAILURE);
    }

    struct sockaddr_in addr;
    std::memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = INADDR_ANY;

    if (bind(listen_fd, (struct sockaddr *)&addr, sizeof(addr)) == -1) {
        perror("bind");
        exit(EXIT_FAILURE);
    }

    if (listen(listen_fd, SOMAXCONN) == -1) {
        perror("listen");
        exit(EXIT_FAILURE);
    }

    EchoReactor echo_reactor(listen_fd);
    echo_reactor.run();

    return 0;
}

Test with telnet:

2. Proactor model

Among the many network programming models, the Reactor model has attracted wide attention for its high concurrency and high resource utilization. It is event-driven: through the cooperation of its three core components, the Reactor, the event handlers (Handlers), and the event demultiplexer (Demultiplexer), it can handle large numbers of concurrent connections and I/O operations effectively. In some high-load scenarios, however, the Reactor model can run into an event-processing bottleneck, which pushes us to look for other approaches that improve system performance further.

The Proactor model is a high-performance network programming model that emerged against this background. Like the Reactor model it is event-driven, but it differs in how events are processed: it combines asynchronous I/O operations with event handlers, so that I/O can be carried out in the background, further reducing blocking. In the Proactor model, the operating system takes primary responsibility for performing the I/O, while the application focuses on business logic. This division of labor lets the Proactor model cope better with high-load scenarios and makes it a powerful option for building high-performance network servers.


The Proactor model has four important components:

  • Proactor: the core component, responsible for running the event loop and dispatching I/O events. It accepts client connections, listens for I/O events, and dispatches them to the corresponding event handlers. Compared with the Reactor model, the Proactor hands most of the I/O work over to the operating system, further reducing blocking and improving system performance.
  • Asynchronous Handlers: asynchronous event handlers are the objects that deal with specific I/O events. Each handler usually corresponds to one client connection or resource (a file, a socket, and so on). Unlike the handlers in the Reactor model, they combine asynchronous I/O operations with event handling, so the I/O itself runs in the background.
  • Asynchronous Operation Processor: the component responsible for carrying out the actual asynchronous I/O operations. It cooperates with the event handlers while keeping the execution of I/O decoupled from event handling, which further reduces blocking.
  • Completion Dispatcher: responsible for notifying the Proactor when an asynchronous I/O operation has completed. When the operating system finishes an asynchronous I/O operation, it sends a completion notification to the Completion Dispatcher, which passes it on to the Proactor. The Proactor can then dispatch the event to the corresponding handler to run the business logic.


The specific process of event handling:

  • The Proactor accepts client connections and registers the related I/O events with the asynchronous event handlers (Asynchronous Handlers).
  • The asynchronous event handlers initiate asynchronous I/O operations. From this point the execution of the I/O is taken over by the Asynchronous Operation Processor, decoupled from the event handler, so the I/O runs in the background.
  • The Asynchronous Operation Processor works with the operating system to carry out the actual asynchronous I/O. While waiting for the I/O to complete, the event handlers can work on other tasks, further improving system performance.
  • When the operating system completes an asynchronous I/O operation, it sends a completion notification to the Completion Dispatcher.
  • The Completion Dispatcher passes the notification on to the Proactor, which dispatches the event to the corresponding asynchronous event handler.
  • The asynchronous event handler performs the corresponding work for the completed I/O event, such as consuming the data, running business logic, and sending a response.

From this flow we can see that the defining feature of the Proactor model is its use of asynchronous I/O. All I/O operations are performed through the asynchronous I/O interface provided by the system, and the worker thread only handles business logic. In the Proactor model, a user function initiates an asynchronous file operation and registers it with the multiplexer. The multiplexer cares about the completion of the asynchronous operation, not about whether the file is readable or writable. The operation itself is carried out by the operating system, and the user program does not need to attend to it. When the operating system finishes reading the file, that is, when it has copied the data into the buffer the user provided earlier, it notifies the multiplexer that the operation is complete, and the multiplexer calls the appropriate handler to process the data.
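
As an illustration of what such a completion-style interface looks like, here is a minimal sketch using POSIX aio (the glibc implementation, which performs the operation on internal threads; link with -lrt on older glibc; the file path and buffer size are arbitrary for the example). The callback runs only after the data has already been copied into the caller's buffer:

#include <aio.h>
#include <csignal>
#include <cstdio>
#include <fcntl.h>
#include <unistd.h>

static char buf[4096];

// Invoked by the aio implementation once the read has finished.
static void on_read_complete(sigval sv) {
    aiocb *cb = static_cast<aiocb *>(sv.sival_ptr);
    ssize_t n = aio_return(cb);   // how many bytes were placed into the buffer
    std::printf("read completed: %zd bytes\n", n);
}

int main() {
    int fd = open("/etc/hostname", O_RDONLY);

    aiocb cb{};
    cb.aio_fildes = fd;
    cb.aio_buf = buf;
    cb.aio_nbytes = sizeof(buf);
    cb.aio_offset = 0;
    cb.aio_sigevent.sigev_notify = SIGEV_THREAD;             // notify via a callback thread
    cb.aio_sigevent.sigev_notify_function = on_read_complete;
    cb.aio_sigevent.sigev_value.sival_ptr = &cb;

    aio_read(&cb);   // hand the whole read to the system and return immediately
    sleep(2);        // crude wait for the callback in this demo; real code keeps working
    close(fd);
    return 0;
}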

Although Proactor increases programming complexity, it improves the efficiency of worker threads. Proactor lets the kernel optimize reads and writes and exploit I/O parallelism, which makes a high-performance single-threaded model possible. On Windows, which lacks an epoll-like mechanism, high concurrency is supported through IOCP (I/O completion ports), and servers there more commonly use the completion-port-based Proactor model because the operating system optimizes it well. On Linux, although the 2.6 kernel introduced an aio interface, the results in practice were not ideal: aio appeared mainly to address the poor performance of poll, but tests show that epoll outperforms poll+aio, and aio cannot handle accept. Linux servers are therefore mainly based on the Reactor model.

It is possible to emulate Proactor on top of Reactor without using the asynchronous I/O interface provided by the operating system. The difference is that the real asynchronous interface can exploit the read/write parallelism provided by the system, while the emulation has to do this work in user space. The approach consists of the following steps (a sketch follows the list):

  1. Register a read event and provide a buffer at the same time.
  2. The event demultiplexer waits for the readable event.
  3. When the event arrives, the demultiplexer wakes up, immediately reads the data into the buffer, and then calls the event handler.
  4. The event handler processes the data and removes the event (a new read must be registered before the next operation).
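
A rough sketch of these four steps on top of epoll (the names PendingRead and CompletionHandler are illustrative; registration and error handling are omitted): the demultiplexer reads the data into the user's buffer itself and only then invokes the completion callback, so the handler sees a finished read just as it would with real asynchronous I/O:

#include <sys/epoll.h>
#include <unistd.h>
#include <functional>
#include <map>
#include <vector>

using CompletionHandler = std::function<void(int fd, const char *data, ssize_t n)>;

struct PendingRead {
    std::vector<char> buffer;    // buffer supplied at registration time (step 1)
    CompletionHandler on_done;   // called only after the data is already in the buffer
};

void emulated_proactor_loop(int epoll_fd, std::map<int, PendingRead> &pending) {
    epoll_event events[64];
    while (true) {
        int n = epoll_wait(epoll_fd, events, 64, -1);               // step 2: wait for readiness
        for (int i = 0; i < n; ++i) {
            int fd = events[i].data.fd;
            PendingRead &pr = pending[fd];
            ssize_t len = read(fd, pr.buffer.data(), pr.buffer.size());  // step 3: read now
            pr.on_done(fd, pr.buffer.data(), len);                       // step 3: report completion
            // step 4: the handler processes the data; a new buffer and callback must be
            // registered before the next "asynchronous" read can start.
        }
    }
}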

It is worth noting that the Boost.Asio library follows the Proactor model. On the Linux platform, however, Boost.Asio uses a Reactor implemented with epoll to simulate Proactor, and an additional thread completes the read/write scheduling. This approach combines the advantages of the two models to a certain extent and achieves good performance and flexibility.


A TCP echo server implemented using the Proactor model in C++:

This example is based on the Boost.Asio library, a cross-platform C++ library for network and low-level I/O programming built around asynchronous operations. Before using it, make sure it is installed correctly and that its shared libraries are configured.

#include <cstdlib>
#include <iostream>
#include <memory>
#include <utility>
#include <boost/asio.hpp>

using boost::asio::ip::tcp;

class session : public std::enable_shared_from_this<session> {
public:
    session(tcp::socket socket) : socket_(std::move(socket)) {}

    void start() { read(); }

private:
    void read() {
        auto self(shared_from_this());
        socket_.async_read_some(boost::asio::buffer(data_, max_length),
            [this, self](boost::system::error_code ec, std::size_t length) {
                if (!ec) {
                    // Echo back exactly what was received.
                    write(length);
                }
            });
    }

    void write(std::size_t length) {
        auto self(shared_from_this());
        boost::asio::async_write(socket_, boost::asio::buffer(data_, length),
            [this, self](boost::system::error_code ec, std::size_t /*length*/) {
                if (!ec) {
                    read();
                }
            });
    }

    tcp::socket socket_;
    enum { max_length = 1024 };
    char data_[max_length];
};

class server {
public:
    server(boost::asio::io_context& io_context, short port)
        : acceptor_(io_context, tcp::endpoint(tcp::v4(), port)) {
        accept();
    }

private:
    void accept() {
        acceptor_.async_accept(
            [this](boost::system::error_code ec, tcp::socket socket) {
                if (!ec) {
                    std::make_shared<session>(std::move(socket))->start();
                }
                accept();
            });
    }

    tcp::acceptor acceptor_;
};

int main(int argc, char* argv[]) {
    try {
        if (argc != 2) {
            std::cerr << "Usage: echo_server <port>\n";
            return 1;
        }

        boost::asio::io_context io_context;
        server s(io_context, std::atoi(argv[1]));
        io_context.run();
    } catch (std::exception& e) {
        std::cerr << "Exception: " << e.what() << "\n";
    }

    return 0;
}

Test with telnet:

3. Simulating the Proactor model with synchronous I/O


  1. The main thread registers the read ready event on the socket to the epoll kernel event table.
  2. The main thread calls epoll_wait to wait for the data readable event on the socket.
  3. When there is data readable on the socket, epoll_wait notifies the main thread. The main thread loops to read the data on the socket until there is no more data to read, then encapsulates the read data into a request object and inserts it into the request queue.
  4. A worker thread waiting on the request queue is awakened, obtains the request object and processes the client request, and then registers the write-ready event on the socket in the epoll kernel event table (a sketch of this hand-off follows the list).
  5. The main thread calls epoll_wait to wait for the writable event of the socket.
  6. When the socket is writable, epoll_wait notifies the main thread. The main thread writes the result of the server processing the client request to the socket.
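
A rough sketch of the hand-off in steps 3 and 4 (Request and RequestQueue are illustrative names; the epoll loop itself and error handling are omitted):

#include <condition_variable>
#include <mutex>
#include <queue>
#include <string>
#include <utility>

struct Request {
    int fd;            // connection the data came from
    std::string data;  // everything the main thread read from the socket
};

class RequestQueue {
public:
    void push(Request r) {
        {
            std::lock_guard<std::mutex> lock(mtx_);
            queue_.push(std::move(r));
        }
        cv_.notify_one();  // wake one waiting worker (step 4)
    }

    Request pop() {
        std::unique_lock<std::mutex> lock(mtx_);
        cv_.wait(lock, [this] { return !queue_.empty(); });
        Request r = std::move(queue_.front());
        queue_.pop();
        return r;
    }

private:
    std::mutex mtx_;
    std::condition_variable cv_;
    std::queue<Request> queue_;
};

void worker(RequestQueue& q) {
    while (true) {
        Request req = q.pop();  // a complete request, already read by the main thread
        (void)req;
        // Process the business logic here, prepare the response, and then register
        // the write-ready event for req.fd so the main thread can send the result back.
    }
}

// In the main thread's epoll loop (step 3), after EPOLLIN fires on a connection:
// read in a loop until EAGAIN, append into req.data, then call queue.push(std::move(req)).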

The two modes are similar in that both involve notification of I/O events (that is, notifying some module that an I/O operation can be performed or has completed). Structurally they also have something in common: a demultiplexer is responsible for submitting I/O operations (asynchronous) or querying whether a device is ready to operate (synchronous), and for calling back the registered handler when the condition is met.

The difference between them is: with Proactor (asynchronous), when the registered handler is called back, the I/O operation has already completed; with Reactor (synchronous), when the registered handler is called back, the I/O device is ready for an operation (readable or writable), and it is at this point that the handler issues the actual operation.


In C++, we can use the Boost.Asio library to obtain the same completion-style (Proactor) behavior; as noted above, on Linux Asio builds this on top of synchronous, readiness-based I/O. The steps are as follows:

  1. Install the Boost library: First, make sure you have the Boost library installed and include it in your project. Boost.Asio is part of the Boost library.
  2. Include the required header files:
    #include <boost/asio.hpp>
    #include <boost/bind.hpp>
    #include <iostream>
    #include <vector>
    
  3. Create an asynchronous callback function: in order to simulate the Proactor model, we need a callback function that is called only when the asynchronous operation has completed, i.e. after the data has already been placed in our buffer.
    void handle_read(const boost::system::error_code& error, std::size_t bytes_transferred) {
        if (!error) {
            std::cout << "Read: " << bytes_transferred << " bytes" << std::endl;
        } else {
            std::cerr << "Error: " << error.message() << std::endl;
        }
    }

  4. Create an asynchronous operation using Boost.Asio: connect to the server, send a message, and create an asynchronous read operation associated with the callback function above.
    boost::asio::io_context io_context;
    boost::asio::ip::tcp::socket socket(io_context);
    boost::asio::ip::tcp::endpoint endpoint(boost::asio::ip::address::from_string("127.0.0.1"), 8080);
    std::vector<char> buffer(1024);
    std::string message = "hello proactor\n";

    socket.connect(endpoint);
    boost::asio::write(socket, boost::asio::buffer(message));
    socket.async_read_some(boost::asio::buffer(buffer),
        boost::bind(&handle_read,
                    boost::asio::placeholders::error,
                    boost::asio::placeholders::bytes_transferred));

  5. Run the I/O context: run the I/O context to handle the asynchronous operation; handle_read is invoked once the read completes.
    io_context.run();

The complete code is as follows:

#include <boost/asio.hpp>
#include <boost/bind.hpp>
#include <iostream>
#include <vector>

void handle_read(const boost::system::error_code& error, std::size_t bytes_transferred) {
    if (!error) {
        std::cout << "Read: " << bytes_transferred << " bytes" << std::endl;
    } else {
        std::cerr << "Error: " << error.message() << std::endl;
    }
}

int main() {
    try {
        boost::asio::io_context io_context;
        boost::asio::ip::tcp::socket socket(io_context);
        boost::asio::ip::tcp::endpoint endpoint(boost::asio::ip::address::from_string("127.0.0.1"), 8080);

        // Connect and send a message synchronously.
        socket.connect(endpoint);
        std::string message = "hello proactor\n";
        boost::asio::write(socket, boost::asio::buffer(message));

        // Start an asynchronous read; handle_read is called only after the echoed
        // data has been placed into the buffer (the completion-style interface).
        std::vector<char> buffer(1024);
        socket.async_read_some(boost::asio::buffer(buffer),
            boost::bind(&handle_read,
                        boost::asio::placeholders::error,
                        boost::asio::placeholders::bytes_transferred));

        io_context.run();
    } catch (std::exception& e) {
        std::cerr << "Exception: " << e.what() << std::endl;
    }

    return 0;
}

The TCP server used in the test accepts the connection from the above client and echoes the received data back:

#include <boost/asio.hpp>
#include <iostream>
#include <array>
#include <memory>

class session : public std::enable_shared_from_this<session> {
public:
    session(boost::asio::ip::tcp::socket socket) : socket_(std::move(socket)) {}

    void start() { read(); }

private:
    void read() {
        auto self(shared_from_this());
        socket_.async_read_some(boost::asio::buffer(data_),
            [this, self](boost::system::error_code ec, std::size_t length) {
                if (!ec) {
                    // Echo the received bytes back to the client.
                    write(length);
                }
            });
    }

    void write(std::size_t length) {
        auto self(shared_from_this());
        boost::asio::async_write(socket_, boost::asio::buffer(data_, length),
            [this, self](boost::system::error_code ec, std::size_t /*length*/) {
                if (!ec) {
                    read();
                }
            });
    }

    boost::asio::ip::tcp::socket socket_;
    std::array<char, 1024> data_;
};

class server {
public:
    server(boost::asio::io_context& io_context, short port)
        : acceptor_(io_context, boost::asio::ip::tcp::endpoint(boost::asio::ip::tcp::v4(), port)) {
        accept();
    }

private:
    void accept() {
        acceptor_.async_accept([this](boost::system::error_code ec, boost::asio::ip::tcp::socket socket) {
            if (!ec) {
                std::make_shared<session>(std::move(socket))->start();
            }

            accept();
        });
    }

    boost::asio::ip::tcp::acceptor acceptor_;
};

int main() {
    try {
        boost::asio::io_context io_context;
        server srv(io_context, 8080);
        io_context.run();
    } catch (std::exception& e) {
        std::cerr << "Exception: " << e.what() << std::endl;
    }

    return 0;
}

makefile:

.PHONY:clean
all: client.cc server.cc
        g++ client.cc -o client -I/usr/local/include -L/usr/local/lib -lpthread -std=c++11
        g++ server.cc -o server -I/usr/local/include -L/usr/local/lib -lpthread -std=c++11
clean:
        rm -f client server


Origin blog.csdn.net/weixin_52665939/article/details/130404067