GRPC Java source code analysis

Introduction

GRPC is a remote-call framework launched by Google with support for multiple languages. The overall architecture of its Java version is shown in the figure below and is divided into four layers. From top to bottom: the first layer is the user code (Server/Client Codes), where GRPC users implement their business functions; the second layer is the generated stub code (GRPC Protobuf Server/Client Stub), which implements the application interfaces defined for the server and the client and is responsible for serializing and deserializing method parameters and for interacting with GRPC (anyone who has used Protobuf should find its role easy to understand); the third layer is the GRPC core framework (GRPC Frame), the functional core of GRPC, composed of several key source packages; the fourth layer is the transport implementation (Transport Impl). Because the core framework abstracts the underlying data transmission behind interfaces, the actual transport can be swapped between different implementations; currently there are a Netty-based network implementation and a streamlined in-process implementation.
[Figure: GRPC Java overall architecture (four layers)]
This article focuses on the implementation of the GRPC core framework (GRPC Frame) and uses the Netty transport (GRPC Netty Impl) as the example transport layer (the in-process implementation, GRPC Inprocess Impl, works similarly). I will first analyze the logical architecture and runtime behavior, and then interpret the key code.
Note 1: The GRPC Java version analyzed here is 1.25.0, and the corresponding Netty version is 4.1.42.
Note 2: When analyzing the logical architecture I use package diagrams to divide the functionality, so the packages in the diagrams are not source packages, only logical packages. In the actual source code, classes from different logical packages may live in the same source package, and classes in the same logical package may be scattered across different source packages. A logical package should be understood simply as a functional unit divided by responsibility; please do not confuse it with a source package.
Note 3: The class diagrams below use three colors: yellow for classes of the GRPC core framework (GRPC Frame), gray for classes of the GRPC Netty implementation, and red for classes in the Netty library.
Note 4: Readers who only want a feel for the principles or concepts will probably not enjoy the detailed class diagrams and runtime interaction diagrams that follow; the purpose of this article is an in-depth analysis of the whole logic and operating principles, not a simple introduction to GRPC. The class diagrams and interaction diagrams also help us organize our thinking and record the general details. For a huge codebase it is impossible to memorize everything after sorting it out once; when we later need to dig into or modify one of the details, these diagrams help us get back up to speed quickly.

1 Server

1.1 Logical architecture

1.1.1 Overview

  • Overall architecture
    [Figure: server overall architecture (logical package diagram)]

  • Service management (Server Pack) is responsible for building and starting the logical server ServerImpl and the listening server NettyServer; it is the functional factory of the whole server side.
  • Transport logic (Transport Pack) is responsible for building the real underlying IO transport and the corresponding event listeners.
  • Network stream (Stream Pack) is the encapsulation of the network session of a method call: one method call corresponds to one stream.
  • Method call (Call Pack) is the real method-call logic; it ultimately invokes the corresponding method of the service interface we implemented.
  • Service registration (Registry Pack) is responsible for registering service descriptions, interface descriptions and other information for the method-call module to query.

  • Core logic

The overall macro-logic diagram of the server is shown below. Seeing this picture, some readers may find it too complicated, yet it contains only the core classes; most classes are not drawn. Others may then think GRPC is over-designed, but it is not: good code needs careful concept extraction and logical abstraction to isolate problems and reduce complexity, which is also what keeps a system stable and extensible. This is the essential difference between architecture-oriented programming and function-oriented programming, and concept creation is the soul of architecture design and the key to innovation.
Note 1: In the figure, concrete classes are shown as aggregated in order to present the whole macro-logical relationship; in the implementation many of these aggregations are actually of abstract interfaces, which will be explained in detail in the interpretation of each sub-module.
Note 2: In the figure, a class name containing $ means the part after $ is an inner class of the part before it; {} indicates an anonymous class defined inside a method of the preceding class, and the name inside {} is the base class or interface of that anonymous class.
[Figure: server core class diagram]
The entire server architecture is divided into six modules. Service management (Server Pack) builds and starts the logical server ServerImpl and the listening server NettyServer. Service registration (Registry Pack) registers service descriptions, interface descriptions and other information for the method-call module to query. Transport logic (Transport Pack) builds the real underlying IO transport and the corresponding event listeners. Network processing (Handler Pack (io.grpc.netty)) implements the IO event handlers on top of the asynchronous callback mechanism provided by Netty, and eventually hands events over to the Transport Pack event listeners or to the Stream Pack. Network stream (Stream Pack) is the encapsulation of the network session of a method call: one method call corresponds to one stream. Method call (Call Pack) is the real method-call logic, which ultimately invokes the corresponding method of the service interface we implemented.

1.1.2 Service Management (Server Pack)

The service management module consists of the logical server builder (ServerBuilder), the logical server (Server), and the listening server (InternalServer). ServerBuilder is the construction factory for Server and InternalServer. Server is the real logical server: a GRPC process can listen on multiple addresses at the same time, and each listening address corresponds to one InternalServer, so Server can be regarded as the manager of the InternalServers. InternalServer is the real listener; one InternalServer handles exactly one address.
[Figure: Server Pack class diagram]
In addition, the Server implementation class ServerImpl holds an instance of the HandlerRegistry interface (InternalHandlerRegistry) to manage the service registration information. ServerImpl constructs a ServerTransportListenerImpl and passes it to the InternalServer; the InternalServer implementation NettyServer sets it as the listener when constructing each NettyServerTransport.
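As a quick orientation before diving into the internals, the sketch below shows the user-facing side of this machinery: NettyServerBuilder (a ServerBuilder) builds a Server (ServerImpl) plus one listening InternalServer (NettyServer) for the chosen address. This is a minimal sketch; GreeterImpl is a hypothetical service implementation backed by protoc-generated stub code, and the port is illustrative, not part of the analyzed sources.

```java
import io.grpc.Server;
import io.grpc.netty.NettyServerBuilder;

public class GrpcServerDemo {
  public static void main(String[] args) throws Exception {
    Server server = NettyServerBuilder.forPort(8080)  // NettyServerBuilder is a ServerBuilder
        .addService(new GreeterImpl())                // registered into the handler registry
        .build();                                     // builds ServerImpl plus a NettyServer
    server.start();                                   // Server.start() starts each InternalServer
    server.awaitTermination();
  }
}
```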

1.1.3 Service Registration (Registry Pack)

Service registration includes the registration manager (HandlerRegistry), the service definition (ServerServiceDefinition), the method definition (ServerMethodDefinition), the service description (ServiceDescriptor) and the method description (MethodDescriptor). The HandlerRegistry implementation InternalHandlerRegistry aggregates multiple ServerServiceDefinitions and ServerMethodDefinitions; a ServerServiceDefinition in turn aggregates multiple ServerMethodDefinitions and one ServiceDescriptor; a ServerMethodDefinition aggregates one ServerCallHandler instance (implemented by UnaryServerCallHandler) and one MethodDescriptor; a ServiceDescriptor aggregates multiple MethodDescriptors; and finally each MethodDescriptor carries an enumerated attribute MethodType.
[Figure: Registry Pack class diagram]
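To make these relationships concrete, here is a minimal sketch of how a ServerServiceDefinition, a MethodDescriptor (with its MethodType) and a ServerCallHandler fit together. It is loosely modeled on what protoc-generated stub code produces via bindService(); the service name "helloworld.Greeter" and the HelloRequest/HelloReply message classes are hypothetical protoc-generated types, not part of the analyzed sources.

```java
import io.grpc.MethodDescriptor;
import io.grpc.ServerServiceDefinition;
import io.grpc.protobuf.ProtoUtils;
import io.grpc.stub.ServerCalls;

public final class GreeterDefinitionSketch {

  // Builds a service definition by hand; generated stubs do the same thing for you.
  public static ServerServiceDefinition buildService() {
    // Method description: full method name, MethodType enum and request/response marshallers.
    MethodDescriptor<HelloRequest, HelloReply> sayHello =
        MethodDescriptor.<HelloRequest, HelloReply>newBuilder()
            .setType(MethodDescriptor.MethodType.UNARY)
            .setFullMethodName(
                MethodDescriptor.generateFullMethodName("helloworld.Greeter", "SayHello"))
            .setRequestMarshaller(ProtoUtils.marshaller(HelloRequest.getDefaultInstance()))
            .setResponseMarshaller(ProtoUtils.marshaller(HelloReply.getDefaultInstance()))
            .build();

    // Service definition: ties the method descriptor to a ServerCallHandler
    // (asyncUnaryCall wraps the lambda into a unary server call handler).
    return ServerServiceDefinition.builder("helloworld.Greeter")
        .addMethod(sayHello,
            ServerCalls.asyncUnaryCall(
                (request, responseObserver) -> {
                  responseObserver.onNext(HelloReply.getDefaultInstance());
                  responseObserver.onCompleted();
                }))
        .build();
  }
}
```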

1.1.4 Transport Logic (Transport Pack)

The transport logic is relatively simple. It consists of the transport channel (ServerTransport) and the channel listener (ServerTransportListener). The implementation class of ServerTransport is NettyServerTransport, which is associated with the ServerTransportListener implementation ServerTransportListenerImpl; ServerTransportListenerImpl is an inner class of ServerImpl.
[Figure: Transport Pack class diagram]

1.1.5 Network Processing (Handler Pack (io.grpc.netty))

Network processing covers Netty's channel-handling logic. There are four handlers in total: WriteBufferingAndExceptionHandler, WaitUntilActiveHandler, GrpcNegotiationHandler and NettyServerHandler. Each aggregates the next, always through an abstract interface type; WaitUntilActiveHandler and GrpcNegotiationHandler are constructed by PlaintextProtocolNegotiator. When NettyServerHandler receives a header, it constructs a NettyServerStream and calls the streamCreated method of the ServerTransportListener interface (implemented by ServerTransportListenerImpl).
[Figure: Handler Pack class diagram]

1.1.5.1 Netty library

Within network processing, the relationships among the relevant interfaces and abstract classes in the Netty library are as follows:
[Figure: Netty library interfaces and abstract classes]

1.1.6 Stream Pack

The network stream part is divided into the stream object (Stream) and the stream listener (StreamListener). The Stream implementation here is NettyServerStream, which aggregates an abstract StreamListener (implemented by JumpToApplicationThreadServerStreamListener). StreamListener has two implementations: JumpToApplicationThreadServerStreamListener (an inner class of ServerImpl) and ServerStreamListenerImpl (an inner class of ServerCallImpl); the former aggregates the latter through its interface type. ServerStreamListenerImpl aggregates a ServerCallImpl and a ServerCall.Listener (implemented by UnaryServerCallListener, which is the wrapper, or entry point, for invoking the real interface implementation method).
[Figure: Stream Pack class diagram]

1.1.7 Method Call (Call Pack)

The method call module is divided into the call handler (ServerCallHandler), the method interface (UnaryRequestMethod), the call listener (Listener), the response handler (StreamObserver) and the call object (ServerCall). ServerCallHandler has two implementations, InterceptCallHandler and UnaryServerCallHandler: the former wraps the method-call interceptors that the service implementer registers on demand and acts as preprocessing for the call, while the latter is the real method-call executor that invokes the interface implementation method. UnaryRequestMethod is the interface that the Protobuf stub code implements to perform the method call. Listener (its implementation is an inner class of UnaryServerCallHandler) is mainly responsible for listening to the method call and executing the response processing. StreamObserver is the interface handed to the service implementer through which it returns the method's return value. ServerCall is the boundary of the method call, responsible for interacting with the data stream, querying method registration information, and so on.
[Figure: Call Pack class diagram]
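From the service implementer's point of view, the two pieces of this module that surface in user code are the StreamObserver used to return results and the interceptors that end up wrapped by InterceptCallHandler. Below is a minimal, hypothetical sketch of both, again assuming protoc-generated GreeterGrpc/HelloRequest/HelloReply classes that are not part of the analyzed sources.

```java
import io.grpc.Metadata;
import io.grpc.ServerCall;
import io.grpc.ServerCallHandler;
import io.grpc.ServerInterceptor;
import io.grpc.stub.StreamObserver;

// Hypothetical service implementation; the generated unary call handler
// eventually invokes sayHello and hands us the response StreamObserver.
class GreeterImpl extends GreeterGrpc.GreeterImplBase {
  @Override
  public void sayHello(HelloRequest request, StreamObserver<HelloReply> responseObserver) {
    HelloReply reply = HelloReply.newBuilder()
        .setMessage("Hello " + request.getName())
        .build();
    responseObserver.onNext(reply);   // the return value flows back through ServerCall
    responseObserver.onCompleted();   // completes the call on the server side
  }
}

// A user-supplied interceptor; GRPC wraps such interceptors around the
// real handler, which matches the preprocessing role described above.
class LoggingInterceptor implements ServerInterceptor {
  @Override
  public <ReqT, RespT> ServerCall.Listener<ReqT> interceptCall(
      ServerCall<ReqT, RespT> call, Metadata headers, ServerCallHandler<ReqT, RespT> next) {
    System.out.println("Incoming call: " + call.getMethodDescriptor().getFullMethodName());
    return next.startCall(call, headers);
  }
}
```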

1.2 Runtime

1.2.1 Service Construction

The service construction function is mainly completed by four classes: NettyServerBuilder, InternalHandlerRegistry, InternalHandlerRegistry$Builder and ServerImpl. There are of course many other details; please refer to the source code.
[Figure: service construction sequence diagram]

1.2.2 Service start

Service startup process:

  • First, the start method of ServerImpl constructs a ServerImpl$ServerListenerImpl, and then calls the start method of each listening InternalServer with it as the parameter;
  • In NettyServer's start method, a ServerBootstrap (Netty's server startup factory) is constructed, and a local anonymous ChannelInitializer is created as the connection handler (see the sketch after the figure below);
  • When a connection arrives, the initChannel method of this local ChannelInitializer is called asynchronously.
    [Figure: service startup sequence diagram]
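For readers less familiar with Netty, the following is a conceptual sketch of what NettyServer.start() sets up with ServerBootstrap and a ChannelInitializer. It is a simplified stand-in, not GRPC's actual code: GRPC installs its own transport and handlers inside initChannel, and the port used here is arbitrary.

```java
import io.netty.bootstrap.ServerBootstrap;
import io.netty.channel.ChannelInitializer;
import io.netty.channel.EventLoopGroup;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.SocketChannel;
import io.netty.channel.socket.nio.NioServerSocketChannel;

public class BootstrapSketch {
  public static void main(String[] args) throws Exception {
    EventLoopGroup boss = new NioEventLoopGroup(1);
    EventLoopGroup worker = new NioEventLoopGroup();
    try {
      ServerBootstrap bootstrap = new ServerBootstrap()
          .group(boss, worker)
          .channel(NioServerSocketChannel.class)
          .childHandler(new ChannelInitializer<SocketChannel>() {
            @Override
            protected void initChannel(SocketChannel ch) {
              // Called asynchronously for each accepted connection; this is where GRPC
              // builds a NettyServerTransport and installs its channel handler.
            }
          });
      // One bind per listening address; GRPC calls this from each NettyServer.start().
      bootstrap.bind(8080).sync().channel().closeFuture().sync();
    } finally {
      boss.shutdownGracefully();
      worker.shutdownGracefully();
    }
  }
}
```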

1.2.3 Connection processing

Connection processing flow:

  • First, a NettyServerTransport is constructed and passed to the transportCreated method of ServerImpl$ServerListenerImpl. Inside transportCreated, a ServerImpl$ServerTransportListenerImpl is constructed and returned as an abstract ServerTransportListener. The start method of the newly constructed NettyServerTransport is then called with this abstract ServerTransportListener as the parameter;
  • In NettyServerTransport's start method, a NettyServerHandler is constructed and passed to the newHandler method of the ProtocolNegotiator; a WriteBufferingAndExceptionHandler is then constructed with the returned negotiation handler as its delegate, and this WriteBufferingAndExceptionHandler is installed as the Netty handler of the connection's channel;
  • Finally, when an HTTP/2 header arrives, NettyServerHandler's onHeadersRead method is eventually called asynchronously; similarly, when an HTTP/2 data frame arrives, NettyServerHandler's onDataRead method is called asynchronously.
    [Figure: connection processing sequence diagram]

1.2.3.1 Receive message header

When the message header is received, a new stream object of type NettyServerStream is created and passed to the streamCreated method of ServerImpl$ServerTransportListenerImpl. Inside streamCreated, a jumpListener of type ServerImpl$JumpToApplicationThreadServerStreamListener is constructed and passed to the stream's setListener method. Finally, a ServerImpl$StreamCreated task is built and executed asynchronously, eventually reaching its runInternal method, where a series of steps construct and wire up the call object and its listeners.

[Figure: message-header processing sequence diagram]

1.2.3.2 Message body processing

  • When a message body is received, the inboundDataReceived method of the stream's NettyServerStream$TransportState object is called. Inside inboundDataReceived, deframe is executed first to process the data; when the endOfStream parameter is true, this is the last frame, and the deframer-close method is then called.
  • Closing the deframer eventually triggers the deframer-closed event, so the deframerClosed method of NettyServerStream$TransportState is called, and then the half-close callbacks of ServerImpl$JumpToApplicationThreadServerStreamListener and ServerCalls$UnaryServerCallHandler$UnaryServerCallListener are called in turn (half-close means only the receiving side is closed; the sending side remains open, and the response is sent to the client after the method call returns). Eventually, the method implementation behind the Protobuf stub code is executed.
  • In addition, when the response data has finally been written to the network, the frame-completion event is triggered and the entire stream is eventually closed.
    [Figure: message-body processing sequence diagram]

1.3 Code interpretation

The preceding logical diagrams and runtime diagrams are fairly detailed, so I will find time later to add an interpretation of the key code.

2 Client

Due to time constraints, for the client part I will only post the logical diagrams that have been sorted out, without textual interpretation. Interested readers can study them together with the source code.

2.1 Logical architecture

2.1.1 Overview

  • Overall architecture
    [Figure: client overall architecture (logical package diagram)]

  • Logical transport (Transport Pack) is the functional unit of data transmission. It is split into a Pending level and a Real level: the Real transport actually interacts with Netty to transmit data, while the Pending transport buffers the call stream.
  • Network stream (Stream Pack) is the encapsulation of the network session of a method call, divided into the logical stream PendingStream and the real stream NettyClientStream. PendingStream first caches the business code's operations on the stream and replays them once the logical transport is ready.
  • Method call (Call Pack) is the logical wrapper, or functional entry point, through which the client calls the remote service.
  • Upper channel (Channel Pack) is the logical connection channel between client and server; it does not itself perform data transmission.

  • Core logic

The macro logic of the whole GRPC client side is shown in the figure below. It is divided into seven logical modules. The upper channel (Channel Pack) is the logical connection channel between client and server and does not itself perform data transmission. The interceptor (Interceptor Pack) wraps the upper channel with interception logic around the entry point through which the client calls the remote service. Logical transport (Transport Pack) is the functional unit of data transmission that actually interacts with Netty. The network channel (SubChannel Pack) is the real network channel rather than the logical one; it is responsible for creating the logical transport layer and handling its transport events. The network stream (Stream Pack) is the encapsulation of the network session of a method call, divided into the logical stream PendingStream and the real stream NettyClientStream (PendingStream first caches the business code's operations on the stream and replays them once the logical transport is ready). Network processing (Handler Pack (io.grpc.netty)) implements the IO event handlers on top of Netty's asynchronous callback mechanism and eventually hands events over to the Transport Pack event listeners or to the Stream Pack. The method call (Call Pack) is the encapsulated object for a method call made by the client.
[Figure: client core class diagram]
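The user-facing counterpart of all this is small: the sketch below builds the upper channel, obtains a stub and makes one call. GreeterGrpc/HelloRequest/HelloReply are the same hypothetical protoc-generated classes used in the server examples, and the address is illustrative.

```java
import io.grpc.ManagedChannel;
import io.grpc.netty.NettyChannelBuilder;

public class GrpcClientDemo {
  public static void main(String[] args) throws Exception {
    ManagedChannel channel = NettyChannelBuilder.forAddress("localhost", 8080)
        .usePlaintext()                     // skip TLS negotiation for the demo
        .build();                           // builds the upper (logical) channel
    GreeterGrpc.GreeterBlockingStub stub = GreeterGrpc.newBlockingStub(channel);
    HelloReply reply = stub.sayHello(       // a ClientCall is created under the hood
        HelloRequest.newBuilder().setName("grpc").build());
    System.out.println(reply.getMessage());
    channel.shutdown();
  }
}
```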

2.1.2 Upper Channel (Channel Pack)

[Figure: Channel Pack class diagram]

2.1.3 Interceptor Pack

[Figure: Interceptor Pack class diagram]
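The original gives only the class diagram for this package. As a hedged illustration of the kind of object that ends up wrapped here, below is a minimal ClientInterceptor and the standard way to attach it to a channel; the interceptor class name is invented for the example.

```java
import io.grpc.CallOptions;
import io.grpc.Channel;
import io.grpc.ClientCall;
import io.grpc.ClientInterceptor;
import io.grpc.ClientInterceptors;
import io.grpc.MethodDescriptor;

// Logs the method name and delegates; real interceptors often add headers,
// deadlines, tracing, etc.
class ClientLoggingInterceptor implements ClientInterceptor {
  @Override
  public <ReqT, RespT> ClientCall<ReqT, RespT> interceptCall(
      MethodDescriptor<ReqT, RespT> method, CallOptions callOptions, Channel next) {
    System.out.println("Calling: " + method.getFullMethodName());
    return next.newCall(method, callOptions);
  }
}

// Usage (assuming 'channel' is an already-built ManagedChannel):
//   Channel intercepted = ClientInterceptors.intercept(channel, new ClientLoggingInterceptor());
```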

2.1.4 Method Call (Call Pack)

[Figure: Call Pack class diagram (client)]

2.1.5 Logical Transport Layer (Transport Pack)

[Figure: Transport Pack class diagram (client)]

2.1.6 Network Stream (Stream Pack)

[Figure: Stream Pack class diagram (client)]

2.1.7 Network Channel (SubChannel Pack)

[Figure: SubChannel Pack class diagram]

2.1.8 Network Processing (Handler Pack (io.grpc.netty))

[Figure: Handler Pack class diagram (client)]

2.2 Runtime

2.2.1 Channel construction

[Figure: channel construction sequence diagram]

2.2.2 Initiate a call

[Figure: call initiation sequence diagram]

2.2.3 Start DNS

[Figure: DNS resolution startup sequence diagram]

2.2.4 Domain name resolution is complete

[Figure: domain name resolution completion sequence diagram]

2.2.5 Building a real stream

[Figure: real stream construction sequence diagram]

2.3 Code interpretation

3 Summary of experience

3.1 Analysis process

The entire GRPC codebase adds up to hundreds of thousands of lines. I spent one week analyzing it, and another week replacing the underlying transport layer with an implementation for our custom protocol format; no bugs appeared during the subsequent development and application. Afterwards I gradually took time to organize and draw the logical and runtime diagrams. I had never used GRPC before, nor had I read any GRPC code, so many people may wonder how I could sort out the entire architecture so quickly. My experience is summarized as follows:

3.1.1 First divide a reasonable boundary

3.1.2 Read the source code from a higher perspective

3.1.3 Analysis and derivation are very important


Source: blog.csdn.net/fs3296/article/details/103608383