An In-Depth Look at the gRPC Protocol

gRPC is a high-performance, general-purpose, open-source RPC framework. Google designed it originally for mobile application development, on top of the HTTP/2 standard, with ProtoBuf (Protocol Buffers) as its serialization protocol, and it supports many languages. In this article the author takes a close look at gRPC and deconstructs the protocol itself.

gRPC is built on HTTP/2, so understanding gRPC in depth requires understanding HTTP/2 first. This article briefly introduces the relevant HTTP/2 concepts, then explains how gRPC is built on top of them.

HTTP/1.x

HTTP is arguably the most common protocol on the Web today, and for a long time most applications were built on HTTP/1.x. HTTP/1.x is a text protocol with very good readability, but it is not actually efficient. The author has mainly run into the following problems:

Parser

To parse a complete HTTP request, we first need to read the header correctly. Header fields are separated by \r\n, and the header is separated from the body by \r\n\r\n. After parsing the header, we take the body size from the Content-Length field and read that many bytes.
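To make that flow concrete, here is a minimal Rust sketch of the naive parse described above. It is deliberately a toy: it handles only Content-Length bodies and ignores chunked encoding and incremental reads.

    // A toy parser for the flow above: find the "\r\n\r\n" separator,
    // read Content-Length from the header block, then take exactly
    // that many body bytes.
    fn parse_http1(buf: &[u8]) -> Option<(&str, &[u8])> {
        let sep = buf.windows(4).position(|w| w == b"\r\n\r\n")?;
        let head = std::str::from_utf8(&buf[..sep]).ok()?;
        // Scan the header lines for Content-Length to learn the body size.
        let len: usize = head
            .lines()
            .find_map(|line| {
                let (name, value) = line.split_once(':')?;
                name.eq_ignore_ascii_case("content-length")
                    .then(|| value.trim())
            })?
            .parse()
            .ok()?;
        let body = buf.get(sep + 4..sep + 4 + len)?;
        Some((head, body))
    }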

This process is not efficient: we need multiple reads to parse out a complete HTTP request. Implementations have many ways to optimize this, for example:

  • Read a large block of data into a buffer at a time to avoid multiple I/O reads

  • Match \r\n on the fly while reading, parsing the stream incrementally

Even with these optimizations, the overhead still matters for high-performance services. The root problem is that HTTP/1.x is a text protocol, designed to be read by humans rather than machines; to be machine-friendly, a binary protocol is the better choice.

If you are interested in HTTP/1.x parsing, take a look at http-parser, a very efficient and compact C library that many frameworks integrate to handle HTTP/1.x.

Request/Response

Another problem with HTTP/1.x is its interaction model. On one connection, requests and responses strictly alternate: after the client sends a request, it must wait for the response before sending the next request.

This mechanism is very simple, but it leads to poor connection utilization. If a large number of interactions must happen concurrently, the client has to open multiple connections to the server, and establishing connections has its own cost, so for performance these connections are usually kept alive for a long time. Handling an enormous number of concurrent connections is not a big challenge for a server nowadays, but it is still not efficient.

Push

Anyone who has implemented push on top of HTTP/1.x knows how painful it is, because HTTP/1.x has no push mechanism. There are usually two workarounds:

  • Long polling: hang a request on the server and wait for a period of time (for example, one minute); when the server responds or the request times out, poll again.

  • WebSocket: through the upgrade mechanism, explicitly turn the HTTP connection into a bare TCP connection for two-way interaction.

Compared with long polling, I prefer WebSocket because it is more efficient, but the interaction after the upgrade is no longer HTTP in the traditional sense.

Hello HTTP/2

Although HTTP/1.x may still be the most widely used protocol on the Internet today, it copes worse and worse as Web services keep growing in scale, and we urgently need a better protocol to build our services on. Hence HTTP/2.

HTTP/2 is a binary protocol, which means its readability is almost zero; fortunately, there are many tools, such as Wireshark, that can decode it.

Before understanding HTTP/2, you need to know some general terms:

  • Stream: a bidirectional stream; a connection can carry multiple streams.

  • Message: a complete request or response at the logical level.

  • Frame: the smallest unit of data transmission. Each frame belongs to a particular stream or to the connection as a whole. A message may consist of multiple frames.

Frame Format

A frame is the smallest data transmission unit in HTTP/2 and is laid out as follows (reproduced from the specification):
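    +-----------------------------------------------+
    |                 Length (24)                   |
    +---------------+---------------+---------------+
    |   Type (8)    |   Flags (8)   |
    +-+-------------+---------------+-------------------------------+
    |R|                 Stream Identifier (31)                      |
    +=+=============================================================+
    |                   Frame Payload (0...)                        |
    +---------------------------------------------------------------+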

  • Length: the length of the frame payload.

  • Type: the frame type, such as HEADERS, DATA, SETTINGS or GOAWAY.

  • Flags: frame-type-specific flags such as END_STREAM; R is a reserved bit and can be ignored.

  • Stream Identifier: identifies the stream the frame belongs to. If it is 0, the frame applies to the entire connection.

  • Frame Payload: the format depends on the frame type.

As you can see, the frame format is quite simple; following the specification, a codec for it can be written very easily, as sketched below.
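As a rough illustration, here is a sketch of decoding the fixed nine-byte frame header in Rust (field layout per RFC 7540; the struct and names are our own):

    // Decode the fixed 9-byte HTTP/2 frame header: a 24-bit payload
    // length, an 8-bit type, 8-bit flags, then 1 reserved bit followed
    // by a 31-bit stream identifier.
    struct FrameHeader {
        length: u32,
        frame_type: u8,
        flags: u8,
        stream_id: u32,
    }

    fn parse_frame_header(b: &[u8; 9]) -> FrameHeader {
        FrameHeader {
            length: u32::from_be_bytes([0, b[0], b[1], b[2]]),
            frame_type: b[3],
            flags: b[4],
            // Mask off the reserved high bit (R).
            stream_id: u32::from_be_bytes([b[5], b[6], b[7], b[8]]) & 0x7FFF_FFFF,
        }
    }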

Multiplexing

HTTP/2 multiplexes a connection through streams, improving connection utilization. Streams have several important properties:

  • A connection can contain multiple streams, and the data sent on different streams does not interfere.

  • A stream can be used by one side alone or shared by the client and server.

  • A stream can be closed by either end.

  • A stream determines the order in which frames are sent, and the other end processes them in the order received.

  • A stream is identified by a unique ID.

A few more words about the stream ID: streams created by the client have odd IDs, while streams created by the server have even IDs. IDs 0x00 and 0x01 both have specific uses.

Stream IDs cannot be reused. When the IDs on a connection are exhausted, the client opens a new connection, while the server sends a GOAWAY frame to force the client to create a new connection.

To increase the concurrency of streams on a connection, consider raising SETTINGS_MAX_CONCURRENT_STREAMS. In TiKV, we have run into the problem of overall throughput being capped because this value was too small.

Note also that although one connection can now carry many requests, a single connection is far from enough. A connection is usually handled by a single thread, so it cannot take full advantage of a server's multiple cores. Moreover, encoding and decoding each request still has a cost, so a single connection can still become a bottleneck.

In one version of TiKV, we trusted multiple streams on one connection too much and had the client use a single connection to interact with TiKV. Performance turned out to be hopeless: the CPU of the thread handling the connection was saturated, yet overall throughput did not improve. After we switched to multiple connections, things got much better.

Priority

Because a connection allows multiple streams to send frames on it, in some scenarios we still want streams to have priorities, so that the peer can allocate different resources to different requests. For example, for a Web site, important resources should be loaded with high priority, while less important images get low priority.

We can also set stream dependencies to form a priority tree. Suppose stream A is the parent and streams B and C are its children, with B's weight 4 and C's weight 12. If A can be allocated all the resources, then B is later allocated 4 / (4 + 12) = 1/4 of them and C 3/4; that is, B gets only 1/3 of what C gets.
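The proportional-share rule is easy to express; a minimal sketch:

    // Sibling streams split their parent's resources in proportion
    // to their weights: share(4, 16) = 0.25 and share(12, 16) = 0.75,
    // so B receives a third of what C receives.
    fn share(weight: u32, total_sibling_weight: u32) -> f64 {
        weight as f64 / total_sibling_weight as f64
    }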

Flow Control

HTTP/2 also supports flow control. If the sender sends data too fast, the receiver may be too busy, under too much pressure, or willing to allocate resources only to certain streams, and thus not want to process the data. For example, if the client requests a video from the server but the user pauses playback, the client may tell the server to stop sending data.

TCP has flow control too, but it only works on the whole connection. HTTP/2 runs multiple streams over one connection, and sometimes we only want to control certain streams, so HTTP/2 provides a separate flow-control mechanism with the following features:

  • Flow control is unidirectional. A receiver can choose to set the window size for a stream or for the entire connection.

  • Flow control is based on trust. A receiver merely advertises to the sender its initial window size for the connection and for each stream.

  • Flow control cannot be disabled. When an HTTP/2 connection is established, the client and server exchange SETTINGS frames to set the flow-control window sizes.

  • Flow control is hop-by-hop rather than end-to-end; that is, an intermediary can be used for flow control.

Note that the default window size in HTTP/2 is 64 KB, which in practice is too small. In TiKV, we set it directly to 1 GB.
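The window is grown through the SETTINGS_INITIAL_WINDOW_SIZE setting and per-stream WINDOW_UPDATE frames. As a sketch, building a raw WINDOW_UPDATE frame (layout per RFC 7540) looks like this:

    // Build a WINDOW_UPDATE frame granting the peer `increment` more
    // bytes on `stream_id` (stream 0 means the whole connection).
    fn window_update(stream_id: u32, increment: u32) -> [u8; 13] {
        let mut f = [0u8; 13];
        f[2] = 4; // 24-bit payload length: always 4 for WINDOW_UPDATE
        f[3] = 0x8; // frame type 0x8 = WINDOW_UPDATE
        // The flags byte f[4] stays 0; then the 31-bit stream identifier.
        f[5..9].copy_from_slice(&(stream_id & 0x7FFF_FFFF).to_be_bytes());
        // 31-bit window size increment.
        f[9..13].copy_from_slice(&(increment & 0x7FFF_FFFF).to_be_bytes());
        f
    }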

HPACK

An HTTP request usually carries a lot of metadata in its header, describing the resource being transferred and its attributes. HTTP/1.x sends this as plain text separated by \r\n, so carrying a lot of metadata makes the header very large. Moreover, most requests on a connection have headers that barely differ. For example, the first request may be GET /a.txt and the next GET /b.txt; the only difference between the two is the URL path, yet all the other fields are still sent in full.

HTTP/2 uses HPACK to solve this problem. Although HPACK's RFC document looks scary, the principle is actually very simple and easy to understand.

HPACK provides a static table and a dynamic table. The static table defines common HTTP header fields such as method and path. When sending a request, a field that is in the static table can be represented by just its index, and both sides know which field is meant.
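For example, ":method: GET" is entry 2 of the static table, so the whole header can go on the wire as a single byte. A sketch of the "indexed header field" encoding from RFC 7541:

    // An indexed header field is one bit set to 1 followed by the
    // table index. Indices small enough to fit in the 7-bit prefix
    // need no continuation bytes.
    fn encode_indexed(index: u8) -> u8 {
        assert!(index >= 1 && index < 0x80);
        0x80 | index
    }

    // Static table entry 2 is ":method: GET", so the entire header
    // is transmitted as the single byte 0x82:
    // assert_eq!(encode_indexed(2), 0x82);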

The dynamic table starts out empty. When a new field appears during the interaction, both sides add it to their dynamic tables, so later requests can, just as with the static table, carry only the relevant index.

Additionally, Huffman coding is used to shrink the transmitted data further. The details of HPACK and Huffman encoding are not covered here.

Summary

The above is only a broad list of some HTTP/2 features; others, such as push and the definitions of the individual frame types, are not mentioned. If you are interested, refer to the HTTP/2 RFC.

Hello gRPC

gRPC was built by Google on top of HTTP/2 and protobuf. To understand the gRPC protocol, we only need to know how gRPC is transmitted over HTTP/2.

gRPC has four modes: unary, client streaming, server streaming and bidirectional streaming. To the underlying HTTP/2 they are all streams, and they still follow a request + response model.

Request

A gRPC request usually consists of Request-Headers, 0 or more Length-Prefixed-Messages, and an EOS.

Request-Headers are ordinary HTTP/2 headers, sent in HEADERS and CONTINUATION frames. The defined headers mainly comprise Call-Definition and Custom-Metadata. Call-Definition includes Method (actually an HTTP/2 POST), Content-Type, and so on. Custom-Metadata is any application-defined key-value pair; keys should not start with grpc-, because that prefix is reserved for gRPC itself.
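Modeled on the example in the gRPC specification, a request's HEADERS frame might look like this (the Greeter service and authority here are hypothetical):

    HEADERS (flags = END_HEADERS)
    :method = POST
    :scheme = http
    :path = /helloworld.Greeter/SayHello
    :authority = example.com
    content-type = application/grpc+proto
    grpc-timeout = 1S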

Length-Prefixed-Message is carried in DATA frames. It begins with a one-byte Compressed flag indicating whether the message is compressed; if it is 1, the message is compressed using the scheme given by Message-Encoding in the headers. Then come the four-byte message length and the message itself.
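A sketch of producing one Length-Prefixed-Message (the five-byte prefix is fixed by the gRPC wire format; the length is big-endian):

    // Prefix a serialized message with the 1-byte compressed flag
    // and its 4-byte big-endian length, as carried in DATA frames.
    fn length_prefixed_message(msg: &[u8], compressed: bool) -> Vec<u8> {
        let mut buf = Vec::with_capacity(5 + msg.len());
        buf.push(compressed as u8);
        buf.extend_from_slice(&(msg.len() as u32).to_be_bytes());
        buf.extend_from_slice(msg);
        buf
    }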

EOS (end-of-stream) is signaled by the END_STREAM flag on the last DATA frame. It indicates that the stream will send no more data and can be closed.

Response

A response mainly consists of Response-Headers, 0 or more Length-Prefixed-Messages, and Trailers. If an error occurs, the response can also be just Trailers-Only.

Response-Headers mainly include HTTP-Status, Content-Type, and Custom-Metadata. Trailers-Only likewise has HTTP-Status, Content-Type, and Trailers. Trailers include Status and 0 or more Custom-Metadata.

HTTP-Status is the usual HTTP status, such as 200, 301 or 400, which is very familiar and needs no explanation. Status is the gRPC status code, and Status-Message is the gRPC message. Status-Message uses Percent-Encoding; see the gRPC specification for details.

If the last HEADERS frame received carries Trailers and has the END_STREAM flag set, it marks the EOS of the response.
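For example, a successful response ends with a trailers-carrying HEADERS frame like this:

    HEADERS (flags = END_STREAM, END_HEADERS)
    grpc-status = 0 # OK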

Protobuf

gRPC service interfaces are defined with protobuf, so we can easily map a service onto HTTP/2:

  • Path : /Service-Name/{method name}

  • Service-Name : ?( {proto package name} "." ) {service name}

  • Message-Type : {fully qualified proto message name}

  • Content-Type : "application/grpc+proto"
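For example, assuming a Greeter service with a SayHello method defined in a proto package named helloworld, the request Path would be /helloworld.Greeter/SayHello.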

Postscript

The above is just a brief tour of the gRPC protocol. As you can see, the cornerstone of gRPC is HTTP/2, with the RPC services defined on top of it using protobuf. Although this looks simple, supporting gRPC in a language that lacks support for HTTP/2, protobuf and the rest is very hard.

Sadly, Rust happens to lack HTTP/2 support and has only a usable protobuf implementation. To support gRPC, our team has put in a lot of effort and taken many detours, from the initial pure-Rust rust-grpc project to the current grpc-rs, which wraps the gRPC C core. There is a lot to tell, and I will come back to it later. If you are interested in both gRPC and Rust, you are welcome to join the development.
