While slacking off at work, I accidentally hand-built an RPC framework from scratch, so I quickly wrote it down

Origin

I recently gave a talk at my company on writing an RPC framework by hand, so here is a summary.

Concepts

What is RPC?

RPC stands for Remote Procedure Call. It solves the problem of calling between services in a distributed system. In plain terms, developers can call a remote service just as they would call a local method. The role of RPC therefore comes down to two things:

  • It hides the difference between remote calls and local calls, so a remote call feels like calling a method inside the project;
  • It hides the complexity of the underlying network communication, so we can focus on the business logic.

RPC Framework Basic Architecture

Let's go through the basic architecture of an RPC framework with a diagram.

An RPC framework has three core components: the client, the server, and the registry. During one RPC call, these three components interact as follows:

  • After the server starts, it publishes the list of services it provides to the registry, and the client subscribes to the registry for service addresses;
  • The client calls the server through a local proxy module (Proxy), which is responsible for converting the method, parameters, and other data into a network byte stream;
  • The client selects one service address from the service list and sends the data to the server over the network;
  • After receiving the data, the server decodes it to obtain the request information;
  • The server invokes the corresponding service according to the decoded request information and returns the result to the client.

RPC framework communication process and roles involved

From the diagram above, you can see that an RPC framework generally includes these components: service governance (registration and discovery), load balancing, fault tolerance, serialization/deserialization, codec, network transport, thread pool, dynamic proxy, and so on. Some RPC frameworks also include connection pooling, logging, and security.

The specific calling process

  1. The service consumer (client) calls the service in the local call mode
  2. After the client stub receives the call, it is responsible for encapsulating methods, parameters, etc. into a message body that can be transmitted over the network.
  3. The client stub encodes the message and sends it to the server
  4. The server stub decodes the message after receiving the message
  5. The server stub calls the local service according to the decoding result
  6. The local service executes and returns the result to the server stub
  7. The server stub encodes the returned result and sends it to the consumer
  8. The client stub receives the message and decodes it
  9. The service consumer (client) gets the result

RPC message protocol

During an RPC call, the parameters must be marshalled into a message before being sent, and the receiver must unmarshal the message back into parameters. The result of the call goes through the same marshalling and unmarshalling. Which parts a message consists of, and how the message is represented, together make up the message protocol.
The message protocol used during RPC calls is called the RPC message protocol.

Hands-on practice

From the concepts above, we know which parts an RPC framework consists of, so we need to consider each of them when designing one. By definition, an RPC framework must hide the underlying details and make calling a remote service feel as simple as calling a local method, so these questions need answers:

  • How can users adopt our RPC framework with as little configuration as possible?
  • How do we register services to ZK (Zookeeper is the registry chosen here) without the user noticing?
  • How do we call the service provider transparently, so that the user does not perceive the remote call?
  • How do we load-balance dynamically across multiple service providers?
  • How does the framework let users plug in custom extension components (for example, a custom load balancing strategy)?
  • How do we define the message protocol and the codec?
  • ...and many more

The design of this RPC framework addresses all of the questions above.

Technical selection

  • Registry: mature registries currently include Zookeeper, Nacos, Consul, and Eureka. This project uses ZK as the registry and does not support switching to another registry or plugging in a user-defined one.
  • I/O communication framework: this implementation uses Netty as the underlying communication framework, because Netty is a high-performance, event-driven, non-blocking I/O (NIO) framework. No other implementation is provided, and user-defined communication frameworks are not supported.
  • Message protocol: this implementation uses a custom message protocol, which is explained later.

Project overall structure

From this structure you can see that the modules whose names start with rpc make up the RPC framework itself and are the core of this project. The consumer module is the service consumer, the provider module is the service provider, and provider-api is the exposed service API.

Overall dependencies

Project Implementation Introduction

To keep the configuration required from users to a minimum, the RPC framework is packaged as a starter. Users only need to depend on the starter, and in most cases that is all.

Why design two starters (client-starter/server-starter)?

This better reflects the client and server concepts: consumers depend on the client starter, service providers depend on the server starter, and each side pulls in as few dependencies as possible.

Why design it as a starter?

Spring Boot's auto-configuration mechanism loads the spring.factories file in the starter. The file contains the entry below, which activates the starter's configuration class, and that configuration class declares the beans the framework needs.

org.springframework.boot.autoconfigure.EnableAutoConfiguration=com.rrtv.rpc.client.config.RpcClientAutoConfiguration

Publishing and consuming services

  • Publishing a service
    A service provider adds the @RpcService annotation to each service it exposes. This custom annotation is built on @Service and is a composite annotation, so it also has the effect of @Service. The service interface and service version are specified in the @RpcService annotation, and the service is registered to ZK under these two pieces of metadata.
    • How publishing works:
      After the service provider starts, Spring Boot auto-configuration activates the configuration class of server-starter. A bean post-processor (RpcServerProvider) picks up the beans annotated with @RpcService and registers the annotation metadata with ZK.
  • Consuming a service
    A consumed service is marked with the custom @RpcAutowired annotation, which is a composite annotation based on @Autowired.
    • How consuming works:
      For the client to call the service provider transparently, a dynamic proxy is needed. On the consumer side, HelloWordService has no implementation class, so the field is assigned a proxy object, and the proxy initiates the remote call. When the service consumer starts, Spring Boot auto-configuration kicks in and the bean post-processor RpcClientProcessor goes to work. It traverses all beans and checks whether any field is annotated with @RpcAutowired. If so, it assigns a dynamically generated proxy to that field, and any later call on the field goes to the proxy's invoke method.
    • The proxy's invoke method obtains the server-side metadata through service discovery, builds the request, and initiates the call through Netty.
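To make the proxy idea concrete, here is a minimal sketch using only the JDK's java.lang.reflect.Proxy. It is not the project's code: LocalTransport is a hypothetical in-process stand-in for service discovery, Netty, and the server stub, and the body of ClientStubInvocationHandler is an assumption about how such a handler could look.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Service API shared by consumer and provider (the role of provider-api).
interface HelloWordService {
    String sayHello(String name);
}

// Provider-side implementation; in the real project it would carry @RpcService.
class HelloWordServiceImpl implements HelloWordService {
    @Override
    public String sayHello(String name) {
        return "Hello: " + name;
    }
}

// Hypothetical stand-in for the network and the server stub: it looks up the
// target object by interface name and invokes the method through reflection.
class LocalTransport {
    private final Map<String, Object> services = new ConcurrentHashMap<>();
    private final Map<String, Class<?>> apis = new ConcurrentHashMap<>();

    void publish(Class<?> api, Object impl) {
        services.put(api.getName(), impl);
        apis.put(api.getName(), api);
    }

    Object send(String serviceName, String methodName, Class<?>[] paramTypes, Object[] args)
            throws Exception {
        Object target = services.get(serviceName);
        if (target == null) {
            throw new IllegalStateException("No provider for " + serviceName);
        }
        Method method = apis.get(serviceName).getMethod(methodName, paramTypes);
        return method.invoke(target, args);
    }
}

// Client stub: every call on the proxy becomes a "request" handed to the transport.
class ClientStubInvocationHandler implements InvocationHandler {
    private final Class<?> api;
    private final LocalTransport transport;

    ClientStubInvocationHandler(Class<?> api, LocalTransport transport) {
        this.api = api;
        this.transport = transport;
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        // toString/hashCode/equals are answered locally instead of going remote.
        if (method.getDeclaringClass() == Object.class) {
            return method.invoke(this, args);
        }
        return transport.send(api.getName(), method.getName(), method.getParameterTypes(), args);
    }

    @SuppressWarnings("unchecked")
    static <T> T createProxy(Class<T> api, LocalTransport transport) {
        return (T) Proxy.newProxyInstance(api.getClassLoader(), new Class<?>[] {api},
                new ClientStubInvocationHandler(api, transport));
    }
}

class ProxyDemo {
    public static void main(String[] args) {
        LocalTransport transport = new LocalTransport();
        transport.publish(HelloWordService.class, new HelloWordServiceImpl());
        // The consumer only sees the interface; the proxy does the rest.
        HelloWordService service =
                ClientStubInvocationHandler.createProxy(HelloWordService.class, transport);
        System.out.println(service.sayHello("hello"));
    }
}
```

In the real framework, a post-processor would inject such a proxy into every @RpcAutowired field, and send would encode the request and write it to a Netty channel rather than call the implementation directly.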

Registration Center

This project uses ZK as the registry. Because the registry is used by both service consumers and service providers, the ZK code lives in the rpc-core module.

The rpc-core module is shown in the figure above; all core functionality is in this module. Service registration is under the register package.

The service registration interface is defined there, and its concrete implementation uses ZK.
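As a rough idea of what such an interface could look like, here is a sketch with an in-memory implementation in place of ZK. The names RegistryService, ServiceInfo, and InMemoryRegistryService, along with their fields and methods, are assumptions made for illustration and may not match the project's actual types.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Metadata published for one service instance (field names are assumptions).
class ServiceInfo {
    final String serviceName;
    final String version;
    final String address;
    final int port;

    ServiceInfo(String serviceName, String version, String address, int port) {
        this.serviceName = serviceName;
        this.version = version;
        this.address = address;
        this.port = port;
    }
}

// What both the provider (register) and the consumer (discover) need from a registry.
interface RegistryService {
    void register(ServiceInfo info);

    void unregister(ServiceInfo info);

    List<ServiceInfo> discover(String serviceName, String version);
}

// In-memory stand-in; a ZK-backed version would create and read nodes instead.
class InMemoryRegistryService implements RegistryService {
    private final Map<String, List<ServiceInfo>> instances = new ConcurrentHashMap<>();

    // Services are keyed by interface name plus version, the two pieces of metadata
    // carried by @RpcService.
    private static String key(String serviceName, String version) {
        return serviceName + "-" + version;
    }

    @Override
    public void register(ServiceInfo info) {
        instances.computeIfAbsent(key(info.serviceName, info.version),
                k -> new CopyOnWriteArrayList<>()).add(info);
    }

    @Override
    public void unregister(ServiceInfo info) {
        List<ServiceInfo> list = instances.get(key(info.serviceName, info.version));
        if (list != null) {
            list.remove(info);
        }
    }

    @Override
    public List<ServiceInfo> discover(String serviceName, String version) {
        List<ServiceInfo> list = instances.get(key(serviceName, version));
        if (list == null) {
            return Collections.emptyList();
        }
        return new ArrayList<>(list);
    }
}

class RegistryDemo {
    public static void main(String[] args) {
        RegistryService registry = new InMemoryRegistryService();
        registry.register(new ServiceInfo("HelloWordService", "1.0", "127.0.0.1", 9991));
        registry.register(new ServiceInfo("HelloWordService", "1.0", "127.0.0.1", 9992));
        System.out.println(registry.discover("HelloWordService", "1.0").size() + " instances found");
    }
}
```

Keeping the interface separate from the ZK implementation is also what would make it possible to swap registries later, even though the project does not support that today.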

Load balancing strategy

Load balancing is defined in rpc-core. It currently supports round-robin (FullRoundBalance) and random (RandomBalance), with random as the default. The default is set in rpc-client-spring-boot-starter.

Service discovery through ZK may return multiple instances, and the load balancing strategy then picks one of them.

You can set rpc.client.balance=fullRoundBalance in the consumer to switch strategies, or define your own strategy by implementing the LoadBalance interface and adding the class to the IoC container. Because the default bean is declared with @ConditionalOnMissingBean, a user-defined bean takes precedence.
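The two strategies can be sketched as follows. The class names come from the text, but the method name chooseOne and the use of plain address strings are assumptions; the project's LoadBalance interface may take a different parameter type.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicInteger;

// Strategy interface; a custom strategy only needs to implement this one method.
interface LoadBalance {
    String chooseOne(List<String> addresses);
}

// Random: every instance is equally likely on each call.
class RandomBalance implements LoadBalance {
    @Override
    public String chooseOne(List<String> addresses) {
        if (addresses.isEmpty()) {
            throw new IllegalArgumentException("No service instance available");
        }
        return addresses.get(ThreadLocalRandom.current().nextInt(addresses.size()));
    }
}

// Round-robin: instances are used in turn, driven by a shared counter.
class FullRoundBalance implements LoadBalance {
    private final AtomicInteger next = new AtomicInteger();

    @Override
    public String chooseOne(List<String> addresses) {
        if (addresses.isEmpty()) {
            throw new IllegalArgumentException("No service instance available");
        }
        // floorMod keeps the index non-negative even after the counter overflows.
        int index = Math.floorMod(next.getAndIncrement(), addresses.size());
        return addresses.get(index);
    }
}

class LoadBalanceDemo {
    public static void main(String[] args) {
        List<String> addresses = Arrays.asList("10.0.0.1:9991", "10.0.0.2:9991", "10.0.0.3:9991");
        LoadBalance roundRobin = new FullRoundBalance();
        for (int i = 0; i < 4; i++) {
            System.out.println(roundRobin.chooseOne(addresses));
        }
        System.out.println(new RandomBalance().chooseOne(addresses));
    }
}
```

Using an AtomicInteger keeps the round-robin counter safe when several consumer threads pick an instance at the same time.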

Custom message protocol, codec

A protocol is a set of rules that both parties agree on in advance, so that the server knows how to parse the data it receives.

  • Custom message protocol
    • Magic number: a marker agreed on by both communicating parties, usually a fixed number of bytes. Its purpose is to reject data sent to the server's port by arbitrary senders. For example, a Java class file starts with the magic number 0xCAFEBABE, and the magic number is verified first when the class file is loaded.
    • Protocol version number: as business requirements change, the structure or fields of the protocol may change, and different protocol versions are parsed differently.
    • Serialization algorithm: indicates which method the sender used to convert the request object into binary, and therefore how the receiver converts the binary back into an object. Examples are JSON, Hessian, and Java's built-in serialization.
    • Packet type: different business scenarios may use different packet types. In an RPC framework there are request, response, heartbeat, and other message types.
    • Status: identifies whether the request was processed normally (SUCCESS, FAIL).
    • Message ID: the unique ID of the request. The response is matched to the request through this ID, and it can also be used for tracing.
    • Data length: the length of the data content, used to judge whether a complete packet has arrived.
    • Data content: the request body.
  • Codec
    The codec is implemented in the rpc-core module, under the package com.rrtv.rpc.core.codec.
    • The custom encoder implements message encoding by extending Netty's MessageToByteEncoder<MessageProtocol<T>> class.
    • The custom decoder implements message decoding by extending Netty's ByteToMessageDecoder class.
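The header layout above can be illustrated with a plain java.nio.ByteBuffer, without Netty. The field widths (2-byte magic number, 1 byte each for version, serialization, type, and status, an 8-byte message ID, and a 4-byte length) and the magic value 0x35 are assumptions chosen for this sketch; the project's actual layout may differ.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// One protocol message: header fields plus the body bytes.
class RpcMessage {
    byte version;
    byte serialization;
    byte msgType;
    byte status;
    long requestId;
    byte[] body;
}

class SimpleProtocolCodec {
    static final short MAGIC = 0x35; // hypothetical magic number
    // magic(2) + version(1) + serialization(1) + type(1) + status(1) + id(8) + length(4)
    static final int HEADER_LENGTH = 2 + 1 + 1 + 1 + 1 + 8 + 4;

    static byte[] encode(RpcMessage msg) {
        ByteBuffer buf = ByteBuffer.allocate(HEADER_LENGTH + msg.body.length);
        buf.putShort(MAGIC);
        buf.put(msg.version);
        buf.put(msg.serialization);
        buf.put(msg.msgType);
        buf.put(msg.status);
        buf.putLong(msg.requestId);
        buf.putInt(msg.body.length);
        buf.put(msg.body);
        return buf.array();
    }

    static RpcMessage decode(byte[] bytes) {
        if (bytes.length < HEADER_LENGTH) {
            throw new IllegalArgumentException("Incomplete header");
        }
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        // Reject data that does not start with the agreed magic number.
        if (buf.getShort() != MAGIC) {
            throw new IllegalArgumentException("Unexpected magic number");
        }
        RpcMessage msg = new RpcMessage();
        msg.version = buf.get();
        msg.serialization = buf.get();
        msg.msgType = buf.get();
        msg.status = buf.get();
        msg.requestId = buf.getLong();
        int length = buf.getInt();
        if (length < 0 || length > buf.remaining()) {
            throw new IllegalArgumentException("Body shorter than declared length");
        }
        msg.body = new byte[length];
        buf.get(msg.body);
        return msg;
    }
}

class ProtocolDemo {
    public static void main(String[] args) {
        RpcMessage request = new RpcMessage();
        request.version = 1;
        request.msgType = 1;
        request.requestId = 42L;
        request.body = "sayHello(hello)".getBytes(StandardCharsets.UTF_8);

        RpcMessage decoded = SimpleProtocolCodec.decode(SimpleProtocolCodec.encode(request));
        System.out.println(decoded.requestId + " " + new String(decoded.body, StandardCharsets.UTF_8));
    }
}
```

A Netty encoder and decoder would write and read the same fields on a ByteBuf; the order and width of the fields is what both sides must agree on.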

When decoding, you need to handle TCP sticking and unpacking, that is, several messages arriving merged in one read, or one message arriving split across several reads.

What is TCP sticking and unpacking

TCP is a stream-oriented transport protocol with no packet boundaries, which means messages have no boundaries either. When the client sends data to the server, one complete message may be split into several smaller packets, or several messages may be combined into one larger packet. This is where unpacking and sticking come from.

In network communication, the amount of data that can be sent at one time is limited by several factors, such as the MTU (maximum transmission unit) and the sliding window.
If the data sent in one go exceeds the transmission unit size, it may be split into multiple packets. If each request is very small, say 10,000 tiny requests in total, TCP will not necessarily send 10,000 separate packets: Nagle's algorithm, which batches small writes to reduce the congestion caused by frequently sending tiny packets, may merge them.

So on the receiving side, any of the following can happen:

  1. The server happens to read two complete packets A and B separately, with no unpacking or sticking problem;
  2. The server receives A and B stuck together, and has to separate them into A and B;
  3. The server receives the complete A plus B-1, the first part of B. It must parse the complete A and wait until the rest of B arrives;
  4. The server receives only A-1, the first part of A, and must wait until the complete A has arrived;
  5. Packet A is large, and the server needs several reads to receive all of it.

How to solve the problem of TCP sticking and unpacking

The fundamental way to solve the problem is to find the boundary of each message:

  • Fixed message length
    Every message has the same fixed length. Once the receiver has accumulated that many bytes, it treats them as one complete message. When the sender's data is shorter than the fixed length, it must be padded.
    The fixed-length approach is very simple to use, but its drawbacks are just as obvious: there is no good value to pick for the length. If the length is too large, bytes are wasted; if it is too small, messages do not fit. For this reason the fixed-length approach is rarely used.
  • Specific delimiter
    A specific delimiter is appended to the end of every message, and the receiver splits the stream on that delimiter. The delimiter must not occur in the message body, otherwise messages will be split incorrectly. A recommended practice is to encode the message, for example with base64, and then choose a character outside the 64 encoding characters as the delimiter.
  • Message length + message content
    Message length + message content is the most commonly used approach in project development. The receiver reads the message content according to the message length.

This project uses the "message length + message content" approach to solve TCP sticking and unpacking. When decoding, it first checks whether enough bytes have arrived to read a full message. If not, the data is not ready yet, and the decoder waits for more data before decoding. This way it always obtains one complete packet at a time.
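The "message length + message content" idea can be sketched without Netty as a small accumulator that buffers incoming chunks and only emits complete frames. This is an illustration, not the project's RpcDecoder; the class name LengthFieldFrameDecoder and the bare 4-byte length prefix (with no other header fields) are assumptions.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Reassembles [4-byte length][payload] frames from arbitrarily split or merged chunks.
class LengthFieldFrameDecoder {
    private byte[] pending = new byte[0]; // bytes received but not yet decoded

    List<byte[]> feed(byte[] chunk) {
        // Append the new chunk to whatever was left over from earlier reads.
        byte[] merged = new byte[pending.length + chunk.length];
        System.arraycopy(pending, 0, merged, 0, pending.length);
        System.arraycopy(chunk, 0, merged, pending.length, chunk.length);

        List<byte[]> frames = new ArrayList<>();
        ByteBuffer buf = ByteBuffer.wrap(merged);
        while (buf.remaining() >= 4) {
            buf.mark();
            int length = buf.getInt();
            if (length < 0) {
                throw new IllegalStateException("Negative frame length: " + length);
            }
            if (buf.remaining() < length) {
                // Unpacking case: the body has not fully arrived, so rewind and wait.
                buf.reset();
                break;
            }
            byte[] frame = new byte[length];
            buf.get(frame);
            frames.add(frame); // Sticking case: the loop continues with the next frame.
        }
        pending = new byte[buf.remaining()];
        buf.get(pending);
        return frames;
    }

    // Helper that builds one frame for a text payload.
    static byte[] frame(String text) {
        byte[] body = text.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(4 + body.length).putInt(body.length).put(body).array();
    }
}

class FrameDecoderDemo {
    public static void main(String[] args) {
        LengthFieldFrameDecoder decoder = new LengthFieldFrameDecoder();
        byte[] a = LengthFieldFrameDecoder.frame("A");
        // Feed A one byte at a time; the frame only appears once it is complete.
        for (byte b : a) {
            for (byte[] frame : decoder.feed(new byte[] {b})) {
                System.out.println("decoded " + new String(frame, StandardCharsets.UTF_8));
            }
        }
    }
}
```

In Netty the same effect comes from checking readableBytes() against the length field inside a ByteToMessageDecoder and resetting the reader index when the body is incomplete.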

Serialize and deserialize

Serialization and deserialization live in the com.rrtv.rpc.core.serialization package of the rpc-core module, which provides HessianSerialization and JsonSerialization.
The default is HessianSerialization. Users cannot plug in a custom serializer.

Serialization performance:

  • Space (size of the serialized output)

  • Time (serialization and deserialization speed)

Network transmission, using netty

The Netty code is mostly boilerplate. The one thing to note is that the order of the handlers matters. Taking the server as an example: encoding is an outbound operation, so the encoder can be placed after the inbound handlers, while decoding and handling the received message are both inbound operations, and the decoder must come first.

Client RPC call method

Mature RPC frameworks generally provide four calling modes: synchronous (Sync), asynchronous (Future), callback (Callback), and one-way (Oneway).

  • Sync: synchronous call
    After the client thread initiates the RPC call, the current thread blocks until the server returns the result or a timeout exception is raised.
  • Future: asynchronous call
    After the client initiates the call, it does not block and wait. Instead it gets a Future object from the RPC framework. The call result will be cached by the server, and the client decides when to fetch it later. When the client does fetch the result, that step blocks and waits.
  • Callback: callback call
    When the client initiates the call, it passes a Callback object to the RPC framework and returns immediately without waiting for the result. When the server's response arrives or a timeout occurs, the Callback registered by the user is executed.
  • Oneway: one-way call
    The client returns immediately after sending the request and ignores the result.

This project uses the first mode, client-side synchronous calls; the others are not implemented. The logic is in RpcFuture, which uses a CountDownLatch to block while waiting, with a timeout.
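A blocking wait with a timeout built on CountDownLatch can be sketched like this. It is a guess at the general shape of such a class rather than the project's RpcFuture; RpcTimeoutException and the method names are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical unchecked exception raised when no response arrives in time.
class RpcTimeoutException extends RuntimeException {
    RpcTimeoutException(String message) {
        super(message);
    }
}

class RpcFuture<T> {
    private final CountDownLatch latch = new CountDownLatch(1);
    private volatile T response;

    // Called by the I/O thread once the response for this request has been decoded.
    void setResponse(T response) {
        this.response = response;
        latch.countDown(); // releases the caller blocked in get()
    }

    // Called by the business thread; blocks until the response arrives or the timeout elapses.
    T get(long timeout, TimeUnit unit) {
        try {
            if (!latch.await(timeout, unit)) {
                throw new RpcTimeoutException("No response within " + timeout + " " + unit);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for the response", e);
        }
        return response;
    }
}

class RpcFutureDemo {
    public static void main(String[] args) {
        RpcFuture<String> future = new RpcFuture<>();
        // Simulates the response handler completing the future on another thread.
        new Thread(() -> future.setResponse("Hello: hello")).start();
        System.out.println(future.get(1, TimeUnit.SECONDS));
    }
}
```

In a full client, each pending RpcFuture would be stored under its message ID so the response handler can find and complete the right one when a response arrives.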

Overall Architecture and Process

The process is divided into three parts: service provider startup, service consumer startup, and the calling process.

  • Service provider startup
    1. The service provider depends on rpc-server-spring-boot-starter.
    2. ProviderApplication starts, and through the Spring Boot auto-configuration mechanism, RpcServerAutoConfiguration takes effect.
    3. RpcServerProvider is a bean post-processor that publishes services and registers the service metadata to ZK.
    4. The RpcServerProvider.run method starts a Netty server.
  • Service consumer startup
    1. The service consumer depends on rpc-client-spring-boot-starter.
    2. ConsumerApplication starts, and through the Spring Boot auto-configuration mechanism, RpcClientAutoConfiguration takes effect.
    3. The service discovery, load balancing, proxy, and other beans are added to the IoC container.
    4. The post-processor RpcClientProcessor scans the beans and assigns proxy objects to the fields annotated with @RpcAutowired.
  • Calling process
    1. The service consumer receives a request: http://localhost:9090/hello/world?name=hello
    2. The service consumer calls helloWordService.sayHello(), which is delegated to ClientStubInvocationHandler.invoke().
    3. The service consumer obtains the service metadata through ZK service discovery; if no provider is found, it reports a 404 error.
    4. The service consumer builds the request header and request body according to the custom protocol.
    5. The service consumer encodes the message with the custom encoder RpcEncoder.
    6. The service consumer takes the service provider's IP and port from service discovery and initiates the call through the Netty transport layer.
    7. The service consumer waits for the result through RpcFuture, with a timeout.
    8. The service provider receives the consumer's request.
    9. The service provider decodes the message with the custom decoder RpcDecoder.
    10. The service provider handles the request in RpcRequestHandler, invoking the local method through reflection and obtaining the result.
    11. The service provider encodes the result with the encoder RpcEncoder. (Because requests and responses use the same protocol, one encoder/decoder pair serves both sides.)
    12. The service consumer decodes the message with the custom decoder RpcDecoder.
    13. Through RpcResponseHandler, the service consumer writes the message into the request-response pool and sets the response on the RpcFuture.
    14. The service consumer gets the result.

The process above is best read alongside the code, which is linked at the end of this post.

Environment construction

  • Operating System: Windows
  • Integrated development tools: IntelliJ IDEA
  • Project technology stack: SpringBoot 2.5.2 + JDK 1.8 + Netty 4.1.42.Final
  • Project dependency management tool: Maven 4.0.0
  • Registration Center: Zookeeper 3.7.0

Project test

  • Start the Zookeeper server: bin/zkServer.cmd
  • Start ProviderApplication in the provider module
  • Start ConsumerApplication in the consumer module
  • Test: open http://localhost:9090/hello/world?name=hello in a browser. If it returns Hello: hello, the RPC call succeeded.

Project code address

gitee.com/listen_w/rp…

Origin blog.csdn.net/wdjnb/article/details/124449335