Implement RPC yourself

Author: Zen and the Art of Computer Programming

1. Introduction

In distributed computing, a remote procedure call (RPC) is a way for services to communicate. The core idea is to let processes on the same or different machines call each other over the network without dealing with the underlying network protocol: the framework hides the transmission details so that developers can call a remote service almost as easily as a local function. Compared with programming a client-server exchange directly on sockets, RPC offers a more convenient, higher-level programming interface. Mainstream RPC frameworks today include Apache Thrift and Google gRPC. This article starts from the most basic RPC mechanism and walks step by step through how to implement a simple RPC framework yourself. After reading it, readers should understand how a simple RPC framework is designed, developed and applied.

2. Basic concepts

2.1 Service

A core concept of RPC is the service: an entity that provides some function, such as an algorithm engine or a database service. A service generally consists of two parts: an interface definition file (here a .proto file), which describes the method signatures and request parameters of the service, and a process running on the server, which listens for client requests and handles them. A service usually exposes multiple methods. Each method defines its input parameters and return type and can be called remotely by the client.

2.2 Protocol Buffers

To solve the problem of data exchange, Google created Protocol Buffers (hereinafter Protobuf), a language-neutral, platform-neutral, extensible format for serializing structured data. Messages are described in .proto files, and the compiler (protoc) generates classes for each target language. Programs use the generated classes to serialize their data structures into byte sequences and send them to the other party over the network. The generated code saves development time, the binary encoding is compact, and fields can be added to a message later without breaking existing readers.
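To make the byte-level idea concrete, here is a minimal hand-written sketch of how Protobuf encodes a single integer field: a tag (field number shifted left by three bits, combined with the wire type) followed by the value as a base-128 varint. The function names are hypothetical and only non-negative integers are handled; real code would use the classes generated by protoc instead.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (7 bits per byte)."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)         # high bit clear: last byte
            return bytes(out)


def decode_varint(data: bytes, pos: int = 0) -> tuple[int, int]:
    """Decode a varint starting at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_int_field(field_number: int, value: int) -> bytes:
    """Encode one integer field as tag + value, using wire type 0 (varint)."""
    wire_type_varint = 0
    tag = (field_number << 3) | wire_type_varint
    return encode_varint(tag) + encode_varint(value)


if __name__ == "__main__":
    # Field number 1 with value 150 becomes the three bytes 08 96 01.
    print(encode_int_field(1, 150).hex())
```

Small values take a single byte, which is one reason the encoding stays compact for typical messages.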

2.3 Network transmission (Transport)

After the service is deployed to a server, it can start receiving requests from clients. For that, the service needs an underlying network transport. The two most common choices are raw TCP and HTTP. TCP is a connection-oriented protocol that delivers a reliable, ordered stream of bytes, and one connection can carry many requests. HTTP is a stateless request-response protocol that itself runs on top of TCP; HTTP/1.0 opened a new connection for each request by default, while HTTP/1.1 keeps connections alive and HTTP/2 multiplexes many requests over one connection. The choice of transport protocol therefore matters for both performance and complexity.
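Because TCP delivers an ordered stream of bytes rather than discrete messages, an RPC framework built directly on it needs some way to mark where one message ends and the next begins. A common approach is a length prefix. The sketch below assumes a 4-byte big-endian length header; the helper names are hypothetical, not part of any framework.

```python
import socket
import struct

HEADER = struct.Struct("!I")  # 4-byte big-endian unsigned length (an assumption)


def send_frame(sock: socket.socket, payload: bytes) -> None:
    """Send one message as <length><payload>."""
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_exactly(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes; recv() alone may return fewer than requested."""
    chunks = []
    remaining = n
    while remaining:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed the connection mid-frame")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def recv_frame(sock: socket.socket) -> bytes:
    """Receive one complete message sent with send_frame."""
    (length,) = HEADER.unpack(recv_exactly(sock, HEADER.size))
    return recv_exactly(sock, length)


if __name__ == "__main__":
    left, right = socket.socketpair()  # in-process stand-in for a TCP connection
    with left, right:
        send_frame(left, b"first request")
        send_frame(left, b"second request")
        print(recv_frame(right))
        print(recv_frame(right))
```

Two requests sent back to back on the same connection come out as two separate messages on the other side, in order.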

3. Overview of RPC framework

The RPC framework includes the following main modules:

  • Transport layer: Responsible for network transmission, such as establishing connections, disconnecting, sending and receiving messages, etc.
  • Protocol layer: Responsible for encoding and decoding, such as how to encode a string, an integer, etc.
  • Serialization layer: Responsible for converting objects into byte streams, such as the Serialization API in Java and pickle in Python.
  • Network layer: Above the transport layer, it maintains remote host addresses and is responsible for routing and forwarding.
  • Client Stub: Created when the user makes a call; it sends requests and receives responses.
  • Server Skeleton: Skeleton receives client requests, calls local services, and then encapsulates the results into responses and returns them to the client.

The following figure shows a typical RPC framework process:

4. RPC framework design and implementation

The design and implementation of the RPC framework can be divided into the following eight steps:

  • Determine communication mode
  • Determine transport protocol
  • Design IDL file
  • Implement serialization and deserialization
  • Implement transport protocol
  • Implement network transmission
  • Implement client Stub
  • Implement server Skeleton

These steps are explained in detail below.

4.1 Determine communication mode

RPC is usually a request-response interaction in which the client calls the server's services, but a framework can also be two-way, allowing the server to call services exposed by the client. We must therefore first decide the direction of service calls. In this article, a call from the service to the client is called the "procedure" mode; a call in the other direction is called the "function" mode. Generally speaking, the "procedure" mode is used within the same process, and the "function" mode is used between different processes.

4.2 Determine the transmission protocol

Choosing the transmission protocol usually involves two considerations: first, whether the traffic must be encrypted; second, which protocol carries it. Secure options include TLS (the successor to SSL) and SSH; plain HTTP and raw TCP are not encrypted.
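As a rough illustration of the first consideration, the sketch below opens either a plain TCP connection or one wrapped in TLS, using Python's standard ssl module. The function names and the use_tls switch are hypothetical; a real framework would also need server-side certificates, which are not shown.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Build a TLS client context with the library's secure defaults.

    create_default_context() enables certificate verification and
    hostname checking.
    """
    return ssl.create_default_context()


def open_channel(host: str, port: int, use_tls: bool = True,
                 timeout: float = 5.0) -> socket.socket:
    """Connect to host:port and optionally upgrade the connection to TLS."""
    raw = socket.create_connection((host, port), timeout=timeout)
    if not use_tls:
        return raw  # plain TCP: data travels unencrypted
    try:
        return make_client_context().wrap_socket(raw, server_hostname=host)
    except Exception:
        raw.close()  # do not leak the socket if the handshake fails
        raise
```

Because the returned object is a socket in both cases, the layers above the transport do not need to know whether encryption is in use.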

4.3 Design IDL file

In order to implement RPC, we need to first define the interface of the service, that is, the input and output parameters of the service. We usually use .proto files as interface definition files, which contain service names, method names and parameter type definitions. Here is an example of a .proto file:

// calculator.proto
syntax = "proto3"; // specify the Protobuf syntax version

package example; // package name

message Request {
    int32 a = 1; // input parameters
    int32 b = 2;
}

message Response {
    int32 c = 1; // return value
}

service Calculator { // service definition
    rpc add (Request) returns (Response); // method definition
    rpc sub (Request) returns (Response);
    rpc mul (Request) returns (Response);
    rpc div (Request) returns (Response);
}

The above example defines a calculator service, which contains four methods: add, sub, mul, and div. The input parameters of each method are Request, and the return value is Response.
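For readers who want to see what this interface looks like on the application side, here is a hand-written Python sketch that mirrors it. In practice protoc would generate the message classes; the dataclasses below are stand-ins, and treating div as integer division is an assumption, since the IDL only says that the result is an int32.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """Mirrors `message Request` from calculator.proto."""
    a: int
    b: int


@dataclass(frozen=True)
class Response:
    """Mirrors `message Response` from calculator.proto."""
    c: int


class Calculator:
    """A local implementation of the four methods declared in the service."""

    def add(self, req: Request) -> Response:
        return Response(req.a + req.b)

    def sub(self, req: Request) -> Response:
        return Response(req.a - req.b)

    def mul(self, req: Request) -> Response:
        return Response(req.a * req.b)

    def div(self, req: Request) -> Response:
        # Assumption: integer (floor) division, because Response.c is an int32.
        return Response(req.a // req.b)


if __name__ == "__main__":
    calc = Calculator()
    print(calc.add(Request(a=2, b=3)))
```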

4.4 Implement serialization and deserialization

To invoke a service remotely, the client first serializes the call parameters into a byte stream and sends it to the server over the network; the server deserializes it, and the return value travels back the same way. Serialization is often done with a language-specific mechanism, such as the Serialization API in Java or pickle in Python, or with a cross-language format such as Protobuf. You can also implement the serialization and deserialization logic by hand.
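A hand-rolled version can be quite small. The sketch below uses Python's struct module with a made-up wire layout (one length byte, the method name, then two big-endian 32-bit integers); the layout and function names are assumptions chosen for illustration, not Protobuf's format.

```python
import struct

_ARGS = struct.Struct("!ii")   # two signed 32-bit integers, big-endian
_RESULT = struct.Struct("!i")  # one signed 32-bit integer, big-endian


def encode_request(method: str, a: int, b: int) -> bytes:
    """Serialize a call as <name length><name><a><b>."""
    name = method.encode("utf-8")
    if len(name) > 255:
        raise ValueError("method name too long for a 1-byte length prefix")
    return bytes([len(name)]) + name + _ARGS.pack(a, b)


def decode_request(data: bytes) -> tuple[str, int, int]:
    """Inverse of encode_request."""
    name_len = data[0]
    name = data[1:1 + name_len].decode("utf-8")
    a, b = _ARGS.unpack_from(data, 1 + name_len)
    return name, a, b


def encode_response(c: int) -> bytes:
    """Serialize the return value."""
    return _RESULT.pack(c)


def decode_response(data: bytes) -> int:
    """Inverse of encode_response."""
    (c,) = _RESULT.unpack(data)
    return c


if __name__ == "__main__":
    wire = encode_request("add", 2, 3)
    print(len(wire), decode_request(wire))
```

A fixed layout like this is easy to write but hard to evolve, which is the main argument for a schema-driven format such as Protobuf.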

4.5 Implementing the transmission protocol

According to the selected transmission protocol, we need to implement the corresponding network transmission component, which establishes and closes connections and sends and receives messages. Existing open source projects such as Netty can be used for this.
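Without a library such as Netty, the client side of such a component can be sketched as a small class that owns the connection lifecycle. The Transport class below is hypothetical and assumes each message is preceded by a 4-byte big-endian length; it is not the API of any existing framework.

```python
import socket
import struct


class Transport:
    """Client-side connection with connect, send, receive and close."""

    def __init__(self, host: str, port: int, timeout: float = 5.0) -> None:
        self._addr = (host, port)
        self._timeout = timeout
        self._sock = None
        self._file = None

    def connect(self) -> None:
        self._sock = socket.create_connection(self._addr, timeout=self._timeout)
        self._file = self._sock.makefile("rwb")  # buffered binary read/write

    def _require_open(self):
        if self._file is None:
            raise RuntimeError("transport is not connected")
        return self._file

    def send(self, payload: bytes) -> None:
        stream = self._require_open()
        stream.write(struct.pack("!I", len(payload)) + payload)
        stream.flush()

    def receive(self) -> bytes:
        stream = self._require_open()
        header = stream.read(4)
        if len(header) < 4:
            raise ConnectionError("connection closed before a full header")
        (length,) = struct.unpack("!I", header)
        body = stream.read(length)
        if len(body) < length:
            raise ConnectionError("connection closed before a full message")
        return body

    def close(self) -> None:
        if self._file is not None:
            self._file.close()
            self._file = None
        if self._sock is not None:
            self._sock.close()
            self._sock = None

    def __enter__(self) -> "Transport":
        self.connect()
        return self

    def __exit__(self, *exc_info) -> None:
        self.close()
```

Used as a context manager (`with Transport(host, port) as t:`), the connection is closed even if a call raises.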

4.6 Implement network transmission

After completing the implementation of the network transmission component, the next step is to implement the client Stub and server Skeleton. The client Stub is responsible for sending the local call request to the network transmission component and waiting for the server's response; the server Skeleton is responsible for accepting the call request from the network transmission component, calling the local service to process the request, and finally encapsulating the result into a return response and sending it to the client.

The communication path between the client Stub and the server Skeleton can be divided into three situations:

  1. One-hop call: The client directly calls the server, sends the request to the port of the network transmission component, and blocks waiting for the server's response.
  2. Multi-hop call: The client first calls middleware (such as a registration center) to obtain the server's network address information, and then calls the server. Middleware can cache service address information to avoid repeated queries.
  3. Data transparent transmission: The client sends the request to the middleware through the network transmission component. After receiving the request, the middleware sends the request to the server through the network transmission component and blocks waiting for the server's response.
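The multi-hop case can be sketched as a registry lookup with a client-side cache. Registry and CachingResolver are hypothetical names, the registry is an in-memory stand-in for a real registration center, and the 30-second expiry is an arbitrary assumption.

```python
import time
from typing import Callable, Dict, Tuple

Address = Tuple[str, int]  # (host, port)


class Registry:
    """In-memory stand-in for a registration center."""

    def __init__(self) -> None:
        self._services: Dict[str, Address] = {}
        self.lookups = 0  # counts queries, to make the cache's effect visible

    def register(self, name: str, address: Address) -> None:
        self._services[name] = address

    def lookup(self, name: str) -> Address:
        self.lookups += 1
        return self._services[name]  # raises KeyError for unknown services


class CachingResolver:
    """Client-side cache that avoids asking the registry on every call."""

    def __init__(self, registry: Registry, ttl_seconds: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._registry = registry
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache: Dict[str, Tuple[Address, float]] = {}

    def resolve(self, name: str) -> Address:
        now = self._clock()
        hit = self._cache.get(name)
        if hit is not None and hit[1] > now:
            return hit[0]  # entry is still fresh
        address = self._registry.lookup(name)
        self._cache[name] = (address, now + self._ttl)
        return address


if __name__ == "__main__":
    registry = Registry()
    registry.register("Calculator", ("10.0.0.5", 9000))
    resolver = CachingResolver(registry)
    resolver.resolve("Calculator")
    resolver.resolve("Calculator")
    print(registry.lookups)  # the second call is served from the cache
```

The expiry time is the trade-off: a longer one means fewer registry queries but slower reaction when a server moves.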

4.7 Implement client Stub

The client Stub is responsible for sending the local call request to the network transmission component and waiting for the response from the server. First, Stub needs to parse the IDL file, obtain the service name and method name, and construct the request parameters. Then, the request parameters are serialized into a byte array and sent to the server through the network transmission component. Finally, upon receiving the server's response, the Stub deserializes the byte array and returns the result.
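The flow above can be sketched as a small dynamic stub. To stay self-contained it takes some liberties that should be read as assumptions: JSON stands in for the serialization format, the method name is taken from attribute access rather than from a parsed IDL file, and the transport is any callable that takes request bytes and returns response bytes.

```python
import json
from typing import Any, Callable


class RemoteError(Exception):
    """Raised on the client when the server reports a failure."""


class Stub:
    """Turns attribute access into remote calls, e.g. stub.add(a=2, b=3)."""

    def __init__(self, service: str, send: Callable[[bytes], bytes]) -> None:
        self._service = service
        self._send = send  # hypothetical transport: request bytes -> response bytes

    def __getattr__(self, method: str) -> Callable[..., Any]:
        if method.startswith("_"):
            raise AttributeError(method)

        def remote_call(**params: Any) -> Any:
            # 1. Build and serialize the request.
            request = json.dumps(
                {"service": self._service, "method": method, "params": params}
            ).encode("utf-8")
            # 2. Send it and block until the response arrives.
            raw_reply = self._send(request)
            # 3. Deserialize the response and return the result.
            reply = json.loads(raw_reply.decode("utf-8"))
            if "error" in reply:
                raise RemoteError(reply["error"])
            return reply["result"]

        return remote_call


if __name__ == "__main__":
    def fake_server(raw: bytes) -> bytes:
        # Stand-in for the network and the remote server.
        request = json.loads(raw)
        total = request["params"]["a"] + request["params"]["b"]
        return json.dumps({"result": total}).encode("utf-8")

    calculator = Stub("Calculator", fake_server)
    print(calculator.add(a=2, b=3))
```

From the caller's point of view, `calculator.add(a=2, b=3)` looks like a local call, which is the effect the stub exists to create.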

4.8 Implement server Skeleton

The server Skeleton is responsible for accepting the call request of the network transmission component, calling the local service to process the request, and finally encapsulating the result into a return response and sending it to the client. First, Skeleton needs to parse the requested byte stream and deserialize it to get the calling parameters. Then, call the local service to process the request and get the result. Finally, the result is serialized into a byte array and returned to the client through the network transmission component.
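A matching server-side sketch, under the same assumptions (JSON as a stand-in serialization format, a request carrying service, method and params fields), could look as follows. The Skeleton class and its error format are hypothetical.

```python
import json
from typing import Any, Dict


class Skeleton:
    """Decodes a request, calls the local service and encodes the response."""

    def __init__(self) -> None:
        self._services: Dict[str, Any] = {}

    def register(self, name: str, service: Any) -> None:
        self._services[name] = service

    def handle(self, raw: bytes) -> bytes:
        try:
            # 1. Deserialize the request to get the calling parameters.
            request = json.loads(raw.decode("utf-8"))
            service = self._services[request["service"]]
            method_name = request["method"]
            if method_name.startswith("_"):
                raise AttributeError(f"method {method_name!r} is not exposed")
            # 2. Call the local service.
            result = getattr(service, method_name)(**request["params"])
            reply = {"result": result}
        except Exception as exc:
            # Report failures to the caller instead of crashing the server.
            reply = {"error": f"{type(exc).__name__}: {exc}"}
        # 3. Serialize the result for the trip back to the client.
        return json.dumps(reply).encode("utf-8")


class Calculator:
    def add(self, a: int, b: int) -> int:
        return a + b

    def div(self, a: int, b: int) -> int:
        return a // b  # integer division is an assumption


if __name__ == "__main__":
    skeleton = Skeleton()
    skeleton.register("Calculator", Calculator())
    request = {"service": "Calculator", "method": "add", "params": {"a": 2, "b": 3}}
    print(skeleton.handle(json.dumps(request).encode("utf-8")))
```

Refusing method names that start with an underscore keeps remote callers away from private attributes of the service object.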

4.9 Summary

Through the above steps, we have implemented a simple RPC framework. Because of space constraints, many underlying technical details are omitted here. Interested readers can consult further material on the topic.

Origin blog.csdn.net/universsky2015/article/details/133504729