How to implement a distributed RPC framework

Remote Procedure Call (RPC) is a computer communication protocol. It allows a program running on one computer to call a subroutine on another computer without the programmer writing extra code for that interaction. The main goal of RPC is to make building distributed applications easier: it keeps the simplicity of local call semantics while providing powerful remote call capability.
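
To make "local call semantics" concrete, here is a minimal sketch from the caller's point of view using a JDK dynamic proxy, which is a common way for RPC frameworks to hand out client stubs. HelloService and the faked result are illustrative only, not buddha's actual API.

```java
import java.lang.reflect.Proxy;

interface HelloService {
    String hello(String name);
}

public class RpcProxyDemo {
    public static void main(String[] args) {
        // A real framework would serialize the method name and arguments here
        // and send them to a remote server; this stub just fabricates a result.
        HelloService service = (HelloService) Proxy.newProxyInstance(
                HelloService.class.getClassLoader(),
                new Class<?>[]{HelloService.class},
                (proxy, method, methodArgs) -> "remote result for " + methodArgs[0]);

        // To the caller this reads exactly like a local method call.
        System.out.println(service.hello("world"));
    }
}
```

A real client stub would forward the invocation over the network and wait for (or asynchronously receive) the response instead of fabricating it.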

Taking advantage of the free time before my internship, I implemented a lightweight distributed RPC framework called buddha. The amount of code is small, but the sparrow, though small, has all the vital organs. This article takes buddha apart step by step and clarifies the factors considered in the design of each component of the framework.

Serialization and deserialization

On the network, all data is transmitted as bytes, so at the code level an RPC framework needs to implement conversion between data in a specified format and byte arrays. Java already provides default serialization, but in high-concurrency scenarios native Java serialization can become a performance bottleneck. For this reason, many efficient open-source serialization frameworks have appeared: Kryo, fastjson, Protobuf, and so on. buddha currently supports both the Kryo and fastjson serialization frameworks.
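
As a small example of what serialization means for the framework, here is a minimal sketch using Kryo's Output/Input API to turn a request object into bytes and back. The RpcRequest fields are illustrative only, not buddha's actual message format.

```java
import com.esotericsoftware.kryo.Kryo;
import com.esotericsoftware.kryo.io.Input;
import com.esotericsoftware.kryo.io.Output;

public class KryoDemo {
    // A hypothetical request object; the fields are illustrative only.
    public static class RpcRequest {
        public String className;
        public String methodName;
        public String argument; // simplified to a single String argument
    }

    public static void main(String[] args) {
        Kryo kryo = new Kryo();
        kryo.register(RpcRequest.class);

        RpcRequest request = new RpcRequest();
        request.className = "HelloService";
        request.methodName = "hello";
        request.argument = "world";

        // Serialize: object -> byte array.
        Output output = new Output(256, -1); // buffer grows as needed
        kryo.writeObject(output, request);
        byte[] bytes = output.toBytes();
        output.close();

        // Deserialize: byte array -> object.
        Input input = new Input(bytes);
        RpcRequest decoded = kryo.readObject(input, RpcRequest.class);
        input.close();
        System.out.println(decoded.methodName); // hello
    }
}
```

The byte array produced here is exactly what gets written to the TCP connection, which is why the framing problem discussed next matters.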

TCP packet sticking and splitting

TCP only cares about the byte stream; it knows nothing about the format of the upper-layer data. If the application-layer data the client wants to send is too large, TCP will split it across several segments, so the server has to reassemble one logical packet from multiple reads (packet splitting; TCP itself guarantees ordering). If the client sends only small amounts of data, TCP may not send them out immediately but buffer them and flush the buffer once a certain threshold is reached, so several logical packets can arrive in a single read and the server has to separate them (packet sticking).

From the above analysis we understand why TCP packets stick together or get split. The key to solving the problem is to add boundary information to the packets; the three commonly used approaches are listed below.

  • The sender adds a header to every packet; the header contains at least the length of the packet, so the receiver can read the length field from the header and then read exactly that many bytes of payload.
  • The sender encapsulates every packet at a fixed length (padding the remainder with zeros), so the receiver simply reads the agreed fixed number of bytes for each packet.
  • A special delimiter separates the packets, and the receiver splits the byte stream into packets at each occurrence of the delimiter.

buddha uses the first method to solve the TCP sticking and splitting problem.
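
As an illustration of the first approach, the following is a minimal sketch using Netty's built-in LengthFieldPrepender and LengthFieldBasedFrameDecoder; buddha's actual encoder and decoder may differ, and the 4-byte length field and 1 MB frame limit are assumptions.

```java
import io.netty.channel.ChannelInitializer;
import io.netty.channel.socket.SocketChannel;
import io.netty.handler.codec.LengthFieldBasedFrameDecoder;
import io.netty.handler.codec.LengthFieldPrepender;

// Every outgoing frame is prefixed with a 4-byte length field; the decoder on
// the receiving side waits until a whole frame has arrived before passing it
// on, which hides packet sticking and splitting from the upper layers.
public class LengthFieldInitializer extends ChannelInitializer<SocketChannel> {
    private static final int MAX_FRAME_LENGTH = 1024 * 1024; // assumed limit

    @Override
    protected void initChannel(SocketChannel ch) {
        ch.pipeline()
          // Inbound: read the 4-byte length header, strip it, emit the payload.
          .addLast(new LengthFieldBasedFrameDecoder(MAX_FRAME_LENGTH, 0, 4, 0, 4))
          // Outbound: prepend the payload length as a 4-byte header.
          .addLast(new LengthFieldPrepender(4));
          // ... followed by the serialization codec and the business handler.
    }
}
```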

BIO versus NIO

BIO is classically used with a one-thread-per-connection model. Multiple threads are needed because functions such as accept(), read(), and write() block synchronously: if the application were single-threaded, a blocked IO call would stall the whole application even though the CPU is actually idle. Opening more threads lets the CPU serve other connections and raises CPU utilization. But when the number of active threads grows large, the multi-threading model brings the following problems.

  • Thread creation and destruction are expensive. On Linux a thread is essentially a light-weight process, and creating or destroying one is a heavyweight operation.
  • In the JVM, every thread stack occupies a fixed amount of memory, and the JVM's memory is limited, so too many threads consume too many resources by themselves.
  • Thread switching is costly. Every context switch involves saving and restoring thread state and switching between user mode and kernel mode. With too many threads, a large share of CPU time is spent on switching.

Using a thread pool solves the first two problems, but the overhead of thread switching remains. So in highly concurrent scenarios, traditional BIO is powerless. The important feature of NIO is that read, write, register, and accept operations are non-blocking: instead of waiting for data to become ready, they return immediately, which lets us make full use of the CPU without resorting to a large number of threads. If a connection cannot be read or written right now, that is recorded as an event, and we switch to another connection whose data is ready. In buddha, Netty is used to write NIO programs with a clearer structure.
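
As a minimal sketch of this model, the following Netty server bootstrap uses a small number of event-loop threads to serve many connections. The port number and thread counts are assumptions; buddha's actual bootstrap code may differ.

```java
import io.netty.bootstrap.ServerBootstrap;
import io.netty.channel.ChannelFuture;
import io.netty.channel.ChannelInitializer;
import io.netty.channel.EventLoopGroup;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.SocketChannel;
import io.netty.channel.socket.nio.NioServerSocketChannel;

public class NettyServerSketch {
    public static void main(String[] args) throws InterruptedException {
        // bossGroup accepts connections; workerGroup handles the IO events of
        // established connections. A handful of threads serves many connections.
        EventLoopGroup bossGroup = new NioEventLoopGroup(1);
        EventLoopGroup workerGroup = new NioEventLoopGroup();
        try {
            ServerBootstrap bootstrap = new ServerBootstrap();
            bootstrap.group(bossGroup, workerGroup)
                     .channel(NioServerSocketChannel.class)
                     .childHandler(new ChannelInitializer<SocketChannel>() {
                         @Override
                         protected void initChannel(SocketChannel ch) {
                             // Framing, codec, and RPC handlers would be added here.
                         }
                     });
            ChannelFuture future = bootstrap.bind(8888).sync(); // port 8888 is assumed
            future.channel().closeFuture().sync();
        } finally {
            bossGroup.shutdownGracefully();
            workerGroup.shutdownGracefully();
        }
    }
}
```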

Service registration and discovery

In real applications, RPC service providers usually run as a cluster to guarantee stability and reliability. A service registry is therefore needed: a service provider registers its currently available service address with the registry, and when a client makes a remote call it first obtains the list of currently available services from the registry, then picks a concrete provider's address (load balancing can happen at this stage) and issues the call to that provider. The client can cache the list of available services; when the service list in the registry changes, the client must be notified. Likewise, when a provider becomes unavailable, the registry must be notified that the service is no longer available. buddha uses ZooKeeper to implement service registration and discovery.
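
The following is a minimal sketch of this idea using the plain ZooKeeper client API: the provider registers its address as an ephemeral node, and the client lists the children and sets a watch so its cache can be refreshed. The /registry path, address format, and connection string are assumptions for illustration, not buddha's actual layout.

```java
import java.util.List;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class ZkRegistrySketch {
    // Assumed root path; the persistent /registry node is assumed to exist.
    private static final String REGISTRY_PATH = "/registry";

    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("127.0.0.1:2181", 5000, event -> { });

        // Provider side: register the service address as an EPHEMERAL node so
        // it disappears automatically when the provider's session is lost.
        String address = "192.168.0.10:8888";
        zk.create(REGISTRY_PATH + "/provider-", address.getBytes(),
                  ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);

        // Client side: fetch the current providers and watch for changes so the
        // cached service list can be refreshed when the children change.
        List<String> providers = zk.getChildren(REGISTRY_PATH, watchedEvent -> {
            // Re-read the children here to refresh the local cache.
        });
        System.out.println("available providers: " + providers);
    }
}
```

Using ephemeral nodes means the "provider became unavailable" notification comes for free: when the provider's session expires, ZooKeeper removes the node and the client's watch fires.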
