46. A first look at gRPC - 20:45:43, April 12, 2020

April 12, 2020 19:38:25

Introduction to gRPC

gRPC (Google Remote Procedure Call) is a high-performance, general-purpose, open-source RPC framework from Google that is carried over HTTP/2. Both parties to the communication build on the framework, so they can focus on business logic and leave the underlying communication to gRPC.

Official documentation:

http://doc.oschina.net/grpc?t=58011

The layering of the GRPC protocol stack is shown below:

Figure 1-1 GRPC protocol stack layering

Table 1-1 GRPC protocol stack layering

Layer | Explanation
TCP layer | The underlying transport is a TCP connection.
TLS layer | Optional. Provides an encrypted channel based on TLS 1.2 with two-way (mutual) certificate authentication.
HTTP2 layer | gRPC is carried over HTTP/2 and uses its features, such as bidirectional streaming, flow control, header compression, and multiplexing of requests on a single connection.
GRPC layer | Defines the interaction format of the remote procedure call protocol.
Data model layer | Both parties need to understand each other's data model in order to interact correctly.

gRPC is based on HTTP/2, so understanding gRPC in depth requires understanding HTTP/2. We first briefly introduce HTTP/2 and then describe how gRPC is built on top of it.

HTTP/1.x

HTTP is arguably the most common protocol on the Web today, and for a long time most applications were built on HTTP/1.x. HTTP/1.x is a text protocol and is very readable, but it is not very efficient. I have run into several problems with it:

Parser

To parse a complete HTTP request, we first need to read the HTTP header correctly. Header fields are separated from one another by \r\n, and the header block is separated from the body by \r\n\r\n. Once the header has been parsed, we can read Content-Length from it to learn the size of the body, and then read the body.

This process is not very efficient, because parsing one complete HTTP request takes several reads. Implementations can apply a number of optimizations, such as:

  • Read a large block of data into a buffer at once to avoid multiple IO reads
  • Parse the stream incrementally, matching \r\n directly as the data is read
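The parsing steps described above can be sketched in a few lines of Python. This is only a minimal illustration, not a full HTTP/1.x parser: the function name `parse_http1_request` is made up for this example, and it assumes the whole input is already in one buffer and that the body length is given by Content-Length (no chunked encoding).

```python
def parse_http1_request(buf: bytes):
    """Parse one HTTP/1.x request from an in-memory buffer.

    Returns (request_line, headers, body, remaining_bytes), or None when the
    buffer does not yet contain a complete request.
    """
    head_end = buf.find(b"\r\n\r\n")      # blank line separates header and body
    if head_end == -1:
        return None                        # header incomplete: need more data
    lines = buf[:head_end].decode("iso-8859-1").split("\r\n")
    request_line = lines[0]
    headers = {}
    for line in lines[1:]:                 # header fields are separated by \r\n
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    body_len = int(headers.get("content-length", "0"))
    body_start = head_end + 4
    if len(buf) < body_start + body_len:
        return None                        # body incomplete: need more data
    body = buf[body_start:body_start + body_len]
    return request_line, headers, body, buf[body_start + body_len:]
```

Note how the parser has to scan for delimiters and only learns the body size after decoding the text header; a binary protocol puts the length in a fixed position instead.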

For a high-performance service, however, these techniques still carry overhead. The root problem is that HTTP/1.x is a text protocol, which is human-friendly. To be machine-friendly, a binary protocol is the better choice.

If you are interested in parsing HTTP/1.x, you can study http-parser, a very efficient and compact C library. I have seen many frameworks integrate this library to handle HTTP/1.x.

Request/Response

Another problem with HTTP/1.x is its interaction model. A connection can carry only one exchange at a time: after the client sends a request, it must wait for the response before it can send the next request.

This mechanism is very simple, but it leads to low utilization of the network connection. If many interactions are needed at the same time, the client has to open multiple connections to the server. Establishing connections has its own overhead, so for performance these connections are usually kept alive as long-lived connections. Handling one million connections is not a big challenge for a server, but it is still not very efficient.

Push

Anyone who has used HTTP/1.x for push knows how painful it is, because HTTP/1.x has no push mechanism. There are usually two approaches:

  • Long polling: hold a connection open to the server and wait for a period of time (such as 1 minute); when the server responds or the wait times out, poll again.
  • WebSocket: use the upgrade mechanism to explicitly turn the HTTP connection into a raw TCP connection for two-way interaction.

Compared with long polling, I prefer WebSocket because it is more efficient, but the interaction over a WebSocket is no longer HTTP in the traditional sense.

Hello HTTP/2

Although HTTP/1.x may still be the most widely used protocol on the Internet today, it is increasingly stretched as Web services keep growing in scale, and we urgently need a better protocol on which to build our services. Hence HTTP/2.

HTTP/2 is a binary protocol, which means its human readability is close to zero. Fortunately, there are many tools, such as Wireshark, that can decode it.

Before looking at HTTP/2, you need to know a few general terms:

  • Stream: a bidirectional stream; one connection can carry multiple streams.
  • Message: a logical request or response.
  • Frame: the smallest unit of data transmission. Each frame belongs to a specific stream or to the connection as a whole. A message may consist of multiple frames.
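To make the frame concept concrete, the following sketch packs and unpacks the fixed 9-byte HTTP/2 frame header: a 24-bit payload length, an 8-bit type, 8 bits of flags, and one reserved bit followed by a 31-bit stream identifier (stream 0 refers to the connection as a whole). The helper names `pack_frame` and `unpack_frame` are made up for this example, and only a few frame types are listed.

```python
import struct

# A few HTTP/2 frame type codes, enough for illustration.
FRAME_TYPES = {0x0: "DATA", 0x1: "HEADERS", 0x3: "RST_STREAM",
               0x4: "SETTINGS", 0x6: "PING", 0x7: "GOAWAY"}

def pack_frame(frame_type: int, flags: int, stream_id: int, payload: bytes = b"") -> bytes:
    """Build a frame: 9-byte header followed by the payload."""
    length = len(payload).to_bytes(3, "big")              # 24-bit length
    rest = struct.pack(">BBI", frame_type, flags,
                       stream_id & 0x7FFFFFFF)            # reserved bit cleared
    return length + rest + payload

def unpack_frame(buf: bytes):
    """Return (type_name, flags, stream_id, payload) for one frame in buf."""
    length = int.from_bytes(buf[0:3], "big")
    frame_type, flags, stream_id = struct.unpack(">BBI", buf[3:9])
    stream_id &= 0x7FFFFFFF                               # ignore the reserved bit
    name = FRAME_TYPES.get(frame_type, hex(frame_type))
    return name, flags, stream_id, buf[9:9 + length]
```

Because the length and stream identifier sit at fixed offsets, a receiver can find frame boundaries without scanning for delimiters.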

gRPC was originally developed by Google and is a language-neutral, platform-neutral, open-source remote procedure call (RPC) system.

In gRPC, a client application can call a method of a server application on a different machine as if it were a method on a local object, which makes it easier to build distributed applications and services. Like many RPC systems, gRPC is based on the idea of defining a service and specifying the methods (with their parameters and return types) that can be called remotely. The server implements this interface and runs a gRPC server to handle client calls. The client holds a stub that exposes the same methods as the server.


Features

  • Based on HTTP/2
    HTTP/2 provides connection multiplexing, bidirectional streaming, server push, request prioritization, header compression, and other mechanisms. These can save bandwidth, reduce the number of TCP connections, save CPU, and help mobile devices extend battery life. The gRPC protocol design reuses the existing semantics of HTTP/2: request and response data is sent in the HTTP body, and other control information is carried in headers.
  • IDL uses ProtoBuf
    gRPC uses ProtoBuf to define services. ProtoBuf is a data serialization protocol developed by Google (comparable to XML, JSON, and Hessian). It serializes data and is widely used in data storage and communication protocols. It offers compact encoding and efficient transmission, simple syntax, and strong expressiveness.
  • Multi-language support (C, C++, Python, PHP, Node.js, C#, Objective-C, Golang, Java)
    gRPC supports multiple languages and can automatically generate client and server libraries for each of them. At the time of writing, a C version (grpc), a Java version (grpc-java), and a Go version (grpc-go) are available, and other language versions are under active development. The grpc implementation supports C, C++, Node.js, Python, Ruby, Objective-C, PHP, and C#; grpc-java already supports Android development.

gRPC is already used in Google's cloud services and externally provided APIs. Its main application scenarios are:

  • Low-latency, highly scalable distributed systems
  • Mobile application clients that communicate with cloud servers
  • Designing a new protocol that is language-independent, efficient, and precise
  • Layered designs that are easy to extend in every respect, such as authentication, load balancing, logging, and monitoring

HTTP/2 features

HTTP/2 is the second major version of the Hypertext Transfer Protocol. Whether version 1 or 2, the basic semantics of HTTP are unchanged: method semantics (GET/POST/PUT/DELETE), status codes (200/404/500, etc.), range requests, caching, authentication, and URL paths. The main differences are the following:

Multiplexing

Under HTTP/1.1, a browser limits the number of simultaneous requests it makes to the same domain name; requests beyond the limit are blocked.

HTTP/2 multiplexing allows multiple request-response messages to be in flight at the same time over a single HTTP/2 connection.
HTTP/2 can therefore run many streams in parallel without relying on multiple TCP connections. It reduces the basic unit of HTTP communication to the frame; each frame belongs to a message in a logical stream, and messages are exchanged in both directions in parallel over the same TCP connection.
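Multiplexing can be pictured as a small simulation: each message is cut into frames tagged with a stream identifier, frames from different streams are interleaved on one connection, and the receiver reassembles each message by stream identifier. This is a toy model only; the function names, the tuple layout `(stream_id, chunk, end_stream)`, and the tiny frame size are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import zip_longest

def split_into_frames(stream_id: int, message: bytes, max_frame: int = 4):
    """Cut a message into (stream_id, chunk, end_stream) frames."""
    chunks = [message[i:i + max_frame] for i in range(0, len(message), max_frame)]
    return [(stream_id, chunk, i == len(chunks) - 1)
            for i, chunk in enumerate(chunks)]

def interleave(*frame_lists):
    """Round-robin frames from several streams onto one 'connection'."""
    wire = []
    for group in zip_longest(*frame_lists):
        wire.extend(frame for frame in group if frame is not None)
    return wire

def reassemble(wire):
    """Rebuild each stream's message from interleaved frames."""
    buffers = defaultdict(bytes)
    complete = {}
    for stream_id, chunk, end_stream in wire:
        buffers[stream_id] += chunk
        if end_stream:                     # last frame of this stream's message
            complete[stream_id] = buffers.pop(stream_id)
    return complete
```

Neither message has to wait for the other to finish, which is the difference from the one-request-at-a-time model of HTTP/1.x.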


Binary frame

HTTP/2 transmits data in binary. Compared with the plain-text data of HTTP/1.1, an obvious advantage of binary data is a smaller transmission volume, which means lower load. Binary frames are also easier to parse and less error-prone: a plain-text parser has to handle whitespace, capitalization, blank lines, and line breaks, whereas binary frames have none of these problems.


Header Compression

HTTP is a stateless protocol. In short, every request must carry all the details the server needs, rather than having the server remember metadata from previous requests. HTTP/2 does not change this paradigm, so it too must carry all the details. Request headers therefore contain identifying data such as cookies, and the amount of such data grows over time. Sending this large amount of duplicate data in every request header is a real burden. Compressing request headers greatly reduces that burden, and the performance improvement is especially noticeable on mobile devices.

The compression method used by HTTP/2 is HPACK. http://http2.github.io/http2-spec/compression.html

HTTP/2 uses a "header table" on both the client and the server to track and store previously sent key-value pairs. Identical data is no longer resent with every request and response; common key-value pairs that hardly change during a connection (user agent, acceptable media types, etc.) only need to be sent once.


In fact, if a request's headers are unchanged from the previous request (such as a polling request for the same resource), the header overhead shrinks to almost nothing: each repeated field is sent as a short reference into the header table instead of in full.

If a header changes, only the changed data needs to be sent in the HEADERS frame, and the newly added or modified header fields are appended to the header table. The header table exists for the lifetime of the HTTP/2 connection and is updated incrementally by both the client and the server.
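The header-table idea can be illustrated with a toy model. This is deliberately not real HPACK, which also has a predefined static table, size-based eviction, Huffman coding, and a different index numbering. The class name `HeaderTable` and the `("index", n)` / `("literal", pair)` representations are invented for this sketch.

```python
class HeaderTable:
    """Toy header table: remembers (name, value) pairs already seen."""

    def __init__(self):
        self.entries = []                      # position -> (name, value)

    def encode(self, headers):
        """Replace already-sent pairs with a short index reference."""
        out = []
        for pair in headers:
            if pair in self.entries:
                out.append(("index", self.entries.index(pair)))
            else:
                self.entries.append(pair)      # remember for later requests
                out.append(("literal", pair))
        return out

    def decode(self, encoded):
        """Mirror of encode: rebuild the full header list."""
        headers = []
        for kind, data in encoded:
            if kind == "index":
                headers.append(self.entries[data])
            else:
                self.entries.append(data)      # keep table in sync with sender
                headers.append(data)
        return headers
```

In a usage such as two consecutive requests that differ only in `:path`, the second request carries two index references and one literal, while the decoder, which keeps its own copy of the table, still recovers the full header list.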

Server Push

With HTTP/2 server push, when the server receives a request for a resource, it predicts which other resources the client will need next and sends them along with the response, even if the client has not yet explicitly asked for them.

The client can put the pushed resources into its cache (which is why this feature is also called cache push), or it can send an RST_STREAM frame to reject any resource it does not want.


Resetting a stream

In HTTP/1.x, once an HTTP message with a Content-Length has started to be sent, it is hard to interrupt. We can usually (though not always) drop the whole TCP connection, but the cost is having to establish a new TCP connection with another three-way handshake.

HTTP/2 introduces the RST_STREAM frame, which lets the client send a reset on an existing connection to interrupt or abandon a response. When the browser navigates to another page or the user cancels a download, this avoids wasting bandwidth without having to tear down the connection and establish a new one.
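As a rough sketch of what such a reset looks like on the wire: an RST_STREAM frame has type 0x3, no flags, and a 4-byte payload holding a 32-bit error code (0x8 is CANCEL). The function name `build_rst_stream` is made up for this example; the layout follows the generic 9-byte HTTP/2 frame header.

```python
import struct

RST_STREAM = 0x3   # frame type code
CANCEL = 0x8       # error code: the stream is no longer needed

def build_rst_stream(stream_id: int, error_code: int = CANCEL) -> bytes:
    """Build an RST_STREAM frame that cancels one stream."""
    if stream_id == 0:
        # Stream 0 is the connection itself; a reset must name a real stream.
        raise ValueError("RST_STREAM must target a specific stream")
    payload = struct.pack(">I", error_code)                # 32-bit error code
    header = len(payload).to_bytes(3, "big") + struct.pack(
        ">BBI", RST_STREAM, 0, stream_id & 0x7FFFFFFF)
    return header + payload
```

The frame is 13 bytes in total and affects only the named stream; other streams on the same connection keep running.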

The advantages and disadvantages of gRPC:

Advantages:

  • Protobuf binary messages give good performance and efficiency, in both space and time
  • Target code is generated from .proto files, which makes it easy to use
  • Serialization and deserialization map directly onto the data classes in the program; no separate mapping step is needed after parsing (as there is with XML or JSON)
  • Supports forward compatibility (newly added fields take default values) and backward compatibility (newly added fields are ignored), which simplifies upgrades
  • Supports multiple languages (a .proto file can be thought of as an IDL file)
  • Integrates with frameworks such as Netty
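The compatibility behaviour mentioned in the list above follows from the Protobuf wire format, in which every field is prefixed with a key made of its field number and wire type. The hand-rolled sketch below is not the protobuf library: the function names are invented, and the message is a hypothetical one whose field 1 is a string. It shows an "old" reader that knows only field 1 skipping a newer field 2, and falling back to the default when field 1 is absent.

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)        # more bytes follow
        else:
            out.append(byte)
            return bytes(out)

def decode_varint(buf: bytes, pos: int):
    """Decode a varint at pos; return (value, next_pos)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7

def encode_string_field(field_no: int, text: str) -> bytes:
    data = text.encode("utf-8")
    return encode_varint(field_no << 3 | 2) + encode_varint(len(data)) + data

def encode_varint_field(field_no: int, value: int) -> bytes:
    return encode_varint(field_no << 3 | 0) + encode_varint(value)

def decode_old_reader(buf: bytes) -> str:
    """Reader that only knows field 1 (string); unknown fields are skipped."""
    greeting = ""                          # default when the field is missing
    pos = 0
    while pos < len(buf):
        key, pos = decode_varint(buf, pos)
        field_no, wire_type = key >> 3, key & 7
        if wire_type == 0:                 # varint
            _, pos = decode_varint(buf, pos)
        elif wire_type == 2:               # length-delimited
            length, pos = decode_varint(buf, pos)
            if field_no == 1:
                greeting = buf[pos:pos + length].decode("utf-8")
            pos += length
        elif wire_type == 1:               # fixed 64-bit
            pos += 8
        elif wire_type == 5:               # fixed 32-bit
            pos += 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
    return greeting
```

Because the key tells the reader how long each field is, a field it has never heard of can be stepped over safely, which is what allows senders and receivers to be upgraded independently.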

Disadvantages:

1) gRPC does not provide a connection pool; you need to implement one yourself.
2) Mechanisms for "service discovery" and "load balancing" are not yet provided.
3) Because gRPC is based on HTTP/2, most HTTP servers, Nginx included, did not yet support it; that is, Nginx could not load-balance gRPC requests as HTTP requests, only as ordinary TCP traffic. (nginx added HTTP/2 support in the 1.9 series; native gRPC proxying arrived later, in 1.13.10.)
4) Protobuf's binary format is not human-readable (although it appears to provide a TextFormat facility). It also has no dynamic features by default (message types can be generated from dynamic definitions, or supported through dynamic compilation).

Example:

Unlike RPC frameworks that define their own protocol directly on top of TCP, gRPC is built directly on HTTP/2. HTTP/2 makes gRPC well suited to communication between mobile clients and servers, and connection multiplexing keeps RPC efficient.

The gRPC protocol design makes good use of the existing semantics of HTTP/2. Request and response data is sent in the HTTP body, and other control information is carried in headers.

First, an example. Suppose the Protobuf definition is as follows:

syntax = "proto3";

package foo.bar;

message HelloRequest {
  string greeting = 1;
}

message HelloResponse {
  string reply = 1;
}

service HelloService {
  rpc SayHello(HelloRequest) returns (HelloResponse);
}

Here we have defined a service, HelloService. The request that gRPC sends for such a call is:

HEADERS (flags = END_HEADERS)
:method = POST
:scheme = http
:path = /foo.bar.HelloService/SayHello
:authority = api.test.com
grpc-timeout = 1S
content-type = application/grpc+proto
grpc-encoding = gzip
authorization = Bearer y235.wef315yfh138vh31hv93hv8h3v

DATA (flags = END_STREAM)
<Delimited Message>

The :path of the HTTP request indicates which method of which service is being called; its format is /{package}.{ServiceName}/{RpcMethodName}.

content-type currently takes the value application/grpc+proto.

In the future, when gRPC supports encodings other than Protobuf, such as JSON, there will be other values.

grpc-encoding can take values such as gzip, deflate, or snappy, indicating the compression method used.

grpc-timeout gives the timeout of the call as an integer followed by a unit: Hour (H), Minute (M), Second (S), Millisecond (m), Microsecond (u), or Nanosecond (n).
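The path format and the timeout units described above can be sketched as two small helpers. The names `grpc_path` and `parse_grpc_timeout` are made up for this example; the limit of at most 8 digits is taken from the gRPC over HTTP/2 wire format description, and the result is returned in seconds purely as a convenience.

```python
import re

# Seconds represented by each grpc-timeout unit suffix.
_UNIT_SECONDS = {"H": 3600.0, "M": 60.0, "S": 1.0,
                 "m": 1e-3, "u": 1e-6, "n": 1e-9}

def grpc_path(package: str, service: str, method: str) -> str:
    """Build the :path pseudo-header: /{package}.{ServiceName}/{RpcMethodName}."""
    return f"/{package}.{service}/{method}"

def parse_grpc_timeout(value: str) -> float:
    """Convert a grpc-timeout header value such as '1S' into seconds."""
    match = re.fullmatch(r"(\d{1,8})([HMSmun])", value)
    if match is None:
        raise ValueError(f"malformed grpc-timeout: {value!r}")
    amount, unit = match.groups()
    return int(amount) * _UNIT_SECONDS[unit]
```

For the example request, `grpc_path("foo.bar", "HelloService", "SayHello")` yields the `:path` shown in the HEADERS frame, and `parse_grpc_timeout("1S")` gives one second. Note that the units are case-sensitive: `M` is minutes while `m` is milliseconds.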

In addition to the standard headers defined by gRPC, you can add your own. If a header is binary, its name must end with -bin, and its value is the binary data encoded as Base64.
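A minimal sketch of that rule, assuming custom headers are handled as simple name/value pairs: values of headers whose names end in -bin are Base64-encoded before sending and decoded on receipt, and all other headers pass through unchanged. The function names are hypothetical, and the decoder re-adds any missing padding on the assumption that peers may omit it.

```python
import base64

def encode_metadata(name: str, value):
    """Prepare one custom header for the wire."""
    if name.endswith("-bin"):
        if not isinstance(value, bytes):
            raise TypeError("binary headers need a bytes value")
        return name, base64.b64encode(value).decode("ascii")
    return name, value

def decode_metadata(name: str, value: str):
    """Reverse of encode_metadata for a received header."""
    if name.endswith("-bin"):
        padded = value + "=" * (-len(value) % 4)   # tolerate missing padding
        return name, base64.b64decode(padded)
    return name, value
```

Base64 keeps arbitrary bytes within the character set that HTTP header values allow, at the cost of roughly one third more data.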

The server returns a Response to this request:

HEADERS (flags = END_HEADERS)
:status = 200
grpc-encoding = gzip

DATA
<Delimited Message>

HEADERS (flags = END_STREAM, END_HEADERS)
grpc-status = 0 # OK
trace-proto-bin = jher831yy13JHy3hc

A grpc-status of 0 means the call completed successfully.
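The `<Delimited Message>` placeholder in the DATA frames above stands for a length-prefixed message: a 1-byte compressed flag, a 4-byte big-endian length, and then the serialized message itself. The sketch below frames and unframes such messages; the function names are made up for this example, and the sample payload used in the note that follows is the Protobuf encoding of a HelloRequest whose greeting is "hi", sent uncompressed for simplicity.

```python
import struct

def frame_message(payload: bytes, compressed: bool = False) -> bytes:
    """Prefix a serialized message with the compressed flag and its length."""
    return struct.pack(">BI", int(compressed), len(payload)) + payload

def unframe_messages(data: bytes):
    """Split a DATA payload into (compressed, message) pairs."""
    messages = []
    pos = 0
    while pos < len(data):
        if pos + 5 > len(data):
            raise ValueError("truncated message prefix")
        compressed, length = struct.unpack_from(">BI", data, pos)
        pos += 5
        if pos + length > len(data):
            raise ValueError("truncated message body")
        messages.append((bool(compressed), data[pos:pos + length]))
        pos += length
    return messages
```

For instance, `frame_message(b"\x0a\x02hi")` produces nine bytes: the five-byte prefix `00 00 00 00 04` followed by the four-byte message. The explicit length is what lets several messages share one stream, which streaming RPCs rely on.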

gRPC also relies on two other HTTP/2 frames. The server sends a GOAWAY frame to the client when it is about to close a connection. A PING frame is answered by returning its data unchanged, and is used for connection liveness checks and latency measurement.

HTTP/2 headers are not a particularly efficient format; storing and parsing them has some cost. If an encrypted connection is enabled, there is additional overhead.
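The PING exchange can be sketched as follows, assuming the standard HTTP/2 layout: frame type 0x6, stream 0, an 8-byte opaque payload, and an ACK flag (0x1) set on the reply. The function names `build_ping` and `answer_ping` are invented for this example.

```python
PING = 0x6   # frame type code
ACK = 0x1    # flag set on the reply

def build_ping(opaque: bytes, ack: bool = False) -> bytes:
    """Build a PING frame carrying 8 opaque bytes on stream 0."""
    if len(opaque) != 8:
        raise ValueError("PING payload must be exactly 8 bytes")
    header = (8).to_bytes(3, "big") + bytes([PING, ACK if ack else 0]) \
        + (0).to_bytes(4, "big")
    return header + opaque

def answer_ping(frame: bytes):
    """Return the ACK for a received PING, or None if it is already an ACK."""
    length = int.from_bytes(frame[:3], "big")
    frame_type, flags = frame[3], frame[4]
    if frame_type != PING or length != 8:
        raise ValueError("not a valid PING frame")
    if flags & ACK:
        return None                        # never answer an ACK
    return build_ping(frame[9:17], ack=True)   # echo the payload unchanged
```

The sender can time the round trip between its PING and the matching ACK to estimate latency, and treat a missing ACK as a sign that the connection is dead.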


Origin www.cnblogs.com/oneapple/p/12687498.html