Understanding RPC technology and frameworks

RPC (Remote Procedure Call) is a way of requesting a service from a program on a remote computer over the network, without needing to understand the underlying network technology.

RPC is a technical concept rather than a specification or protocol. Common technologies used to build RPC systems include:

  • Application-level service frameworks: Alibaba's Dubbo / Dubbox, Google's gRPC, Spring Boot / Spring Cloud.
  • Remote communication protocols: RMI, Socket, SOAP (XML over HTTP), REST (JSON over HTTP).
  • Communication frameworks: MINA and Netty.

Popular open-source RPC frameworks include Alibaba's Dubbo, Facebook's Thrift, Google's gRPC, and Twitter's Finagle.

We focus on the following three:

  • gRPC: Google's open-source RPC framework. It is based on the HTTP/2 protocol and supports many common programming languages. Because the framework is built on HTTP, its underlying transport support is provided by Netty (in the Java implementation).
  • Thrift: Facebook's open-source RPC framework, mainly a framework for cross-language service development.

Users only need to do secondary development on top of it; the underlying RPC communication is transparent to the application. However, users have to learn a domain-specific language, which carries a certain cost.

  • Dubbo: Alibaba Group's well-known open-source RPC framework, widely used by many Internet companies and in enterprise applications. Its pluggable protocol and serialization frameworks are a highly distinctive feature.

A complete RPC framework

A typical RPC usage scenario includes components such as service discovery, load balancing, fault tolerance, network transport, and serialization. Among these, the "RPC protocol" specifies how a program is serialized and transmitted over the network.

Figure 1: A complete RPC framework

The following is Dubbo's architecture diagram; the layering is clear and the functionality is complex:

Figure 2: Dubbo Chart

RPC core functionality

The core functionality of RPC is the most important functional module needed to implement an RPC, which is the "RPC protocol" part of the diagram above:

Figure 3: RPC core functionality

The core functionality of an RPC consists of five main parts: the client, the client stub, the network transport module, the server stub, and the server.

Figure 4: RPC core function chart

The essential parts of an RPC framework are described below:

  • Client: the caller of the service.
  • Client stub: stores the address information of the server, packs the client's request parameters into a network message, and sends it to the server over the network.
  • Server stub: receives the request message sent by the client, unpacks it, and then calls the local service to process it.
  • Server: the actual provider of the service.
  • Network service: the underlying transport, which can be TCP or HTTP.

Implementing the core functionality of RPC

The core functionality of RPC consists of five parts. If you want to implement an RPC of your own, the simplest approach involves three technical points:

  • Addressing
  • Serializing and deserializing the data stream
  • Network transmission

Addressing

Addressing relies on Call ID mapping. In a local call, the function body is located directly through a function pointer. In a remote call a function pointer does not work, because the address spaces of the two processes are completely different.

So in RPC, every function must have its own ID, and this ID must be unique across all processes.

When making a remote procedure call, the client must include this ID. The client and the server each also need to maintain a table that maps functions to Call IDs.

When the client needs to make a remote call, it looks up this table to find the corresponding Call ID and passes it to the server. The server also looks up its table to determine which function the client wants to call, and then executes that function's code.
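
The table lookup on both sides can be sketched as follows. This is only a minimal in-process illustration: the functions `add` and `greet`, the ID values, and the table names are assumptions made for the example, not part of any particular framework.

```python
# Hypothetical functions exposed by the server.
def add(a, b):
    return a + b

def greet(name):
    return "hello, " + name

# Server side: Call ID -> function. Client side: function name -> Call ID.
# Both tables must agree on the IDs.
SERVER_TABLE = {1: add, 2: greet}
CLIENT_TABLE = {"add": 1, "greet": 2}

def client_lookup(name):
    """Return the Call ID the client should put in the request."""
    return CLIENT_TABLE[name]

def server_dispatch(call_id, args):
    """Find the function registered for call_id and run it."""
    func = SERVER_TABLE.get(call_id)
    if func is None:
        raise KeyError(f"unknown Call ID: {call_id}")
    return func(*args)

call_id = client_lookup("add")
result = server_dispatch(call_id, (2, 3))
```

In a real system only the Call ID and the arguments would cross the network; here the two halves simply run in the same process.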

Implementation: the service registry.

To call a service, you first need to query a service registry to find out which service instances the other side has. Dubbo's service registry is configurable; the officially recommended one is ZooKeeper.

Implementation example: RMI (Remote Method Invocation) is itself an implementation of RPC.

Figure 9: RMI Chart

Registry (service discovery): RMI services are published and invoked with the help of JNDI. In effect, JNDI is a registry: the server places the service object in the registry, and the client obtains the service object from the registry.

After the server implements an RMI service, it must register it with the RMI server. The client then looks up the service at the specified RMI address and calls the service's methods to complete the remote method invocation.

The registry is a very important function. When the server has finished developing a service and wants to expose it externally, the client cannot call it unless the service is registered, even though the service exists on the server.
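
The register-then-lookup behaviour can be illustrated with a small in-memory sketch. The `Registry` class, the service name, and the address below are hypothetical and do not reflect the JNDI, RMI, Dubbo, or ZooKeeper APIs.

```python
class Registry:
    """A toy in-memory service registry: service name -> list of addresses."""

    def __init__(self):
        self._services = {}

    def register(self, name, address):
        """Called by a server to expose a service instance."""
        self._services.setdefault(name, []).append(address)

    def lookup(self, name):
        """Called by a client; fails if nothing was registered under name."""
        addresses = self._services.get(name)
        if not addresses:
            raise LookupError(f"service not registered: {name}")
        return list(addresses)

registry = Registry()
registry.register("WageService", "10.0.0.5:20880")  # assumed example address

found = registry.lookup("WageService")

# A service that exists but was never registered cannot be found by clients.
try:
    registry.lookup("UnregisteredService")
    lookup_failed = False
except LookupError:
    lookup_failed = True
```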

Serialization and deserialization

How does the client pass parameter values to the remote function? In a local call, we only need to push the arguments onto the stack and let the function read them from the stack.

But in a remote procedure call, the client and the server are different processes, so parameters cannot be passed through memory.

In this case the client must first convert the arguments into a byte stream and send it to the server, which then converts the byte stream back into a format it can read.

Only binary data can be transmitted over the network. Serialization and deserialization are defined as follows:

  • Converting an object into a binary stream is called serialization.
  • Converting a binary stream back into an object is called deserialization.

Similarly, the value returned from the server also has to go through serialization and deserialization.
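
As a rough illustration, the round trip can be sketched with JSON as the wire format. JSON is chosen here only for readability; the message fields `call_id` and `args` are assumptions of this example, and a real framework might use Protobuf or another binary format instead.

```python
import json

def serialize(obj):
    """Object -> bytes: what the sender does before writing to the network."""
    return json.dumps(obj).encode("utf-8")

def deserialize(data):
    """Bytes -> object: what the receiver does after reading from the network."""
    return json.loads(data.decode("utf-8"))

request = {"call_id": 1, "args": [2, 3]}
wire_bytes = serialize(request)      # this is what travels over the network
restored = deserialize(wire_bytes)   # the server sees the same structure again
```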

Network transmission

Remote calls usually take place over a network, with the client and server connected through it.

All data has to be transmitted over the network, so a network transport layer is needed. The transport layer must send the Call ID and the serialized parameter bytes to the server, and then send the serialized call result back to the client.

Anything that can do both of these things can serve as the transport layer. The protocol used is therefore not restricted, as long as the transmission can be completed.

Although most RPC frameworks use TCP, UDP also works, and gRPC simply uses HTTP/2.

TCP connections are the most common, so here is a brief analysis of TCP-based connections. A TCP connection can be established on demand (opened when a call is needed and closed immediately after the call ends), or it can be a long-lived connection (once established, the client and server keep it open for a long time regardless of whether data is being sent, optionally with a periodic heartbeat to check that the connection is still alive). Multiple remote procedure calls can share the same connection.

Therefore, to implement an RPC framework, it is basically enough to implement the following three points:

  • Call ID mapping: function name strings can be used directly, or integer IDs can be used. The mapping table is generally a hash table.
  • Serialization and deserialization: you can write your own, or use something like Protobuf or FlatBuffers.
  • Network transport library: you can write your own on top of sockets, or use Asio, ZeroMQ, Netty, and the like.
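
The three points above can be combined into a very small end-to-end sketch. It assumes JSON for serialization, a 4-byte length prefix for framing, and a `socket.socketpair()` in place of a real client/server connection so that it runs in a single process; the function table and helper names are made up for the example.

```python
import json
import socket
import struct
import threading

def send_msg(sock, obj):
    """Serialize obj as JSON and send it with a 4-byte length prefix."""
    payload = json.dumps(obj).encode("utf-8")
    sock.sendall(struct.pack("!I", len(payload)) + payload)

def recv_exact(sock, n):
    """Read exactly n bytes, or raise if the peer closed the connection."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("connection closed")
        buf += chunk
    return buf

def recv_msg(sock):
    """Read one length-prefixed message and deserialize it."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return json.loads(recv_exact(sock, length).decode("utf-8"))

# Call ID mapping (hash table): ID -> function.
FUNCTIONS = {1: lambda a, b: a + b, 2: lambda s: s.upper()}

def serve_one(sock):
    """Server stub: unpack one request, call the function, send the result."""
    request = recv_msg(sock)
    result = FUNCTIONS[request["call_id"]](*request["args"])
    send_msg(sock, {"result": result})

def rpc_call(sock, call_id, *args):
    """Client stub: pack the Call ID and arguments, wait for the result."""
    send_msg(sock, {"call_id": call_id, "args": list(args)})
    return recv_msg(sock)["result"]

client_sock, server_sock = socket.socketpair()
worker = threading.Thread(target=serve_one, args=(server_sock,))
worker.start()
answer = rpc_call(client_sock, 1, 20, 22)
worker.join()
client_sock.close()
server_sock.close()
```

The length prefix is needed because TCP is a byte stream with no message boundaries; without it the receiver could not tell where one request ends.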

Network transport protocols at the core of RPC

As described above, the third point in implementing an RPC is choosing the network transport.

Figure 10: Network transport

RPC can choose among several network transports: TCP, UDP, or HTTP.

Each protocol affects overall performance and efficiency differently. How do you choose the right network transport protocol? First you need to understand how each transport protocol works in RPC.

RPC calls based on TCP

The service caller establishes a Socket connection with the service provider. Through the Socket, the caller serializes the interface name, method name, and parameters of the call and sends them to the service provider; the provider deserializes them and then uses reflection to invoke the corresponding method.

The result is then returned to the service caller. RPC calls based on TCP all work in much the same way.

In practice, however, a series of wrappers is applied. RMI, for example, transmits serializable Java objects over TCP.
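
The reflection step on the provider side can be sketched as follows. The network is left out so that the dispatch logic stays visible; `CalculatorService`, the request fields, and `handle_request` are hypothetical names, and Python's `getattr` stands in for Java-style reflection.

```python
import json

class CalculatorService:
    """Hypothetical service implementation living on the provider side."""

    def multiply(self, a, b):
        return a * b

# Interface name -> service instance, as known to the provider.
SERVICES = {"CalculatorService": CalculatorService()}

def handle_request(raw):
    """Deserialize a request, find the method by name, and invoke it."""
    request = json.loads(raw.decode("utf-8"))
    service = SERVICES[request["interface"]]
    method_name = request["method"]
    if method_name.startswith("_"):
        # Do not let callers reach private or special attributes.
        raise PermissionError("method not exposed: " + method_name)
    method = getattr(service, method_name)  # reflection-style lookup
    result = method(*request["args"])
    return json.dumps({"result": result}).encode("utf-8")

raw_request = json.dumps(
    {"interface": "CalculatorService", "method": "multiply", "args": [6, 7]}
).encode("utf-8")
reply = json.loads(handle_request(raw_request).decode("utf-8"))
```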

RPC calls based on HTTP

This approach is more like accessing a web page, except that what it returns is simpler and more uniform.

The general process is as follows: the service caller sends a request to the service provider. The request may use GET, POST, PUT, DELETE, and so on. The service provider may handle the request differently depending on the request method, or may allow only certain request methods for a given method.

The specific method to call is determined by the URL. The parameters the method needs may be obtained by parsing XML or JSON data sent by the caller, and the result is likewise returned as JSON or XML.

Because there are many open-source web servers, such as Tomcat, this approach is easier to implement; it is just like building a web project.
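
A minimal sketch of this style, using only Python's standard library rather than Tomcat, might look like the following. The `/add` path, the JSON field names, and the `METHODS` table are assumptions for the example; the server binds to an ephemeral local port so that the snippet is self-contained.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# URL path -> function to call with the parsed JSON parameters.
METHODS = {"/add": lambda params: params["a"] + params["b"]}

class RpcHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        func = METHODS.get(self.path)
        if func is None:
            self.send_error(404, "unknown method")
            return
        length = int(self.headers.get("Content-Length", 0))
        params = json.loads(self.rfile.read(length).decode("utf-8"))
        body = json.dumps({"result": func(params)}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet

server = HTTPServer(("127.0.0.1", 0), RpcHandler)  # port 0: pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()

# Caller side: the method is chosen by the URL, the parameters travel as JSON.
conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
conn.request(
    "POST",
    "/add",
    body=json.dumps({"a": 2, "b": 3}).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
http_result = json.loads(conn.getresponse().read().decode("utf-8"))
conn.close()
server.shutdown()
server.server_close()
```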

Comparing the two approaches

RPC based on TCP sits lower in the protocol stack, so protocol fields can be customized more flexibly, which reduces network overhead, improves performance, and allows greater throughput and concurrency.

But it requires more attention to low-level details, and it is more complex and costly to implement. In addition, different platforms such as Android and iOS each need their own toolkits to send requests and parse responses, which means a large workload and makes it hard to respond quickly to user needs.

An HTTP-based RPC implementation can use JSON or XML for request and response data.

JSON and XML are universal standard formats, and open-source parsing tools for them are quite mature, so secondary development on top of them is convenient and simple. (HTTP also requires serialization and deserialization, but this is not something the protocol itself is concerned with; mature web frameworks already handle serialization.)

However, because HTTP is an upper-layer protocol, sending the same content over HTTP takes more bytes than sending it over a custom TCP-based protocol.

Therefore, on the same network and for the same content, transmission over HTTP is less efficient than over TCP and takes longer. Compressing the data can, of course, narrow the gap.
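
The byte overhead can be made concrete with a small comparison. The binary frame layout (a 2-byte Call ID followed by two 4-byte integers) and the HTTP headers below are assumptions chosen for illustration; real protocols carry other fields, so the exact numbers will differ.

```python
import json
import struct

a, b = 2, 3

# Custom binary frame over TCP: 2-byte Call ID + two 4-byte signed integers.
tcp_frame = struct.pack("!Hii", 1, a, b)

# The same call expressed as a minimal HTTP/1.1 request with a JSON body.
body = json.dumps({"a": a, "b": b}).encode("utf-8")
http_request = (
    b"POST /add HTTP/1.1\r\n"
    b"Host: example.internal\r\n"
    b"Content-Type: application/json\r\n"
    b"Content-Length: " + str(len(body)).encode("ascii") + b"\r\n"
    b"\r\n" + body
)

tcp_size = len(tcp_frame)      # 10 bytes with this layout
http_size = len(http_request)  # request line and headers dominate the size
```

For such a tiny payload the HTTP headers account for most of the bytes; the relative overhead shrinks as the payload grows.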

RPC architecture using RabbitMQ

OpenStack uses RESTful API calls between services, and uses RPC calls between the functional modules inside each service.

Because RPC decouples the functional modules inside a service, OpenStack services are scalable and loosely coupled.

OpenStack's RPC architecture adds the RabbitMQ message queue in order to ensure the safety and stability of RPC message delivery.

The following analyzes how OpenStack uses RabbitMQ to implement RPC calls.

RabbitMQ overview

The following is excerpted from Zhihu:

For beginners, a restaurant can be used to explain what these three things are. The analogy may not be entirely apt, but it should be enough to explain the differences among the three.

RPC: suppose you are a waiter in a restaurant. A customer orders from you, but you cannot cook, so you collect the order and tell the kitchen what the customer ordered. This is called RPC (remote procedure call), because relative to the waiter, the chef in the kitchen is another person (in the computer world, a process on a remote machine). The dish the chef prepares is the return value of the RPC.

Task queue and message queue: both are essentially queues, so only a task queue example is given. Suppose that at peak time the restaurant has many customers but only a few chefs. The waiters have to place the order slips on the kitchen counter in order, for the chefs to work through one by one. This stack of order slips is the task queue. Each time a chef finishes a dish, he takes the next slip from the counter and continues cooking.

The division of roles is shown in the following figure:

Figure 11: RabbitMQ role in the RPC

The benefits of using RabbitMQ:

  • Synchronous to asynchronous: a thread pool can be used to turn synchronous processing into asynchronous processing, but the drawback is that you have to implement the thread pool yourself, and the coupling is strong. A message queue makes it easy to turn synchronous requests into asynchronous ones.
  • High cohesion and low coupling: decoupling, reducing strong dependencies.
  • Traffic peak clipping: a request limit can be set on the message queue, and requests beyond the threshold are discarded or redirected to an error page.
  • Improved network communication performance: creating and destroying TCP connections is expensive (a three-way handshake to open, a four-way handshake to close). At peak times, thousands of connections cause a huge waste of resources, and the number of TCP connections the operating system can handle per second is also limited, which creates a performance bottleneck.

RabbitMQ communicates over channels rather than directly over TCP. One thread uses one channel, and multiple threads use multiple channels, all sharing a single TCP connection.

One TCP connection can accommodate a large number of channels (provided hard disk capacity is sufficient) without causing a performance bottleneck.
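
The traffic peak clipping benefit listed above can be sketched with a bounded in-memory queue. This uses Python's standard `queue` module rather than RabbitMQ itself, and the threshold `MAX_PENDING` and the request names are assumptions for the example.

```python
import queue

MAX_PENDING = 3  # assumed threshold for this sketch
pending = queue.Queue(maxsize=MAX_PENDING)

def submit(request):
    """Enqueue a request, or reject it when the queue is already full."""
    try:
        pending.put_nowait(request)
        return True
    except queue.Full:
        return False  # a real system might return an error page here

# Five requests arrive at once; only the first three fit under the threshold.
outcomes = [submit(f"order-{i}") for i in range(5)]
```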

Three types of RabbitMQ exchanges

RabbitMQ uses exchanges and queues to implement message queuing.

RabbitMQ has three commonly used exchange types, each with very distinct characteristics.

Based on these three exchange types, OpenStack performs RPC calls in two ways. First, a brief description of the three exchanges.

Figure 12: RabbitMQ Chart

① Fanout exchange (broadcast)

This type of exchange does not examine the Routing Key of a received message; by default it forwards the message to all queues bound to the exchange.

Figure 13: Fanout exchange

② Direct exchange

This type of exchange requires an exact match between the Routing Key and the Binding Key. For example, a message with Routing Key = Cloud is forwarded only to queues bound with Binding Key = Cloud.

Figure 14: Direct exchange
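
The routing rules of these two exchange types can be modelled in a few lines. This is a simplified in-memory model, not the RabbitMQ client API; the queue names and the `bindings` list are assumptions for the example.

```python
def route_fanout(bindings, routing_key):
    """Fanout: ignore the routing key and deliver to every bound queue."""
    return sorted(queue_name for queue_name, _ in bindings)

def route_direct(bindings, routing_key):
    """Direct: deliver only where the binding key equals the routing key."""
    return sorted(
        queue_name
        for queue_name, binding_key in bindings
        if binding_key == routing_key
    )

# Each binding is (queue name, binding key).
bindings = [("queue1", "Cloud"), ("queue2", "Storage"), ("queue3", "Cloud")]

fanout_targets = route_fanout(bindings, "Cloud")
direct_targets = route_direct(bindings, "Cloud")
```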

③ Topic exchange

This type of exchange pattern-matches the message's Routing Key against the Binding Key and forwards the message to all queues whose bindings match.

Binding Keys support wildcards: "*" matches exactly one word, and "#" matches zero or more words.

Figure 15: Topic exchange
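
The wildcard rules can be sketched as a small matching function. It assumes that words in a key are separated by dots, as in RabbitMQ; the function name `topic_matches` and the sample keys are made up for the example, and this is not how the broker implements matching internally.

```python
def topic_matches(binding_key, routing_key):
    """Return True if routing_key matches the binding_key pattern."""
    pattern = binding_key.split(".")
    words = routing_key.split(".")

    def match(i, j):
        if i == len(pattern):
            return j == len(words)
        if pattern[i] == "#":
            # "#" may absorb zero or more words.
            return any(match(i + 1, k) for k in range(j, len(words) + 1))
        if j == len(words):
            return False
        if pattern[i] == "*" or pattern[i] == words[j]:
            # "*" stands for exactly one word.
            return match(i + 1, j + 1)
        return False

    return match(0, 0)

one_word = topic_matches("compute.*", "compute.host1")
zero_words = topic_matches("compute.#", "compute")
many_words = topic_matches("compute.#", "compute.host1.disk")
star_needs_a_word = topic_matches("compute.*", "compute")
```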

Note: the four pictures above are from cnblogs; in case of infringement, please contact the author: https://www.cnblogs.com/dwlsxj/p/RabbitMQ.html.

When the producer sends a message with Routing Key = FCE, only Queue1 matches, so the message is routed to Queue1.

With Routing Key = ACE, the message is routed to both Queue1 and Queue2. With Routing Key = AFB, the message is sent only to Queue2.

Nova implements two kinds of RabbitMQ-based RPC calls:

  • RPC.CALL (call)
  • RPC.CAST (notification)

RPC.CALL is based on request and response, while RPC.CAST provides only a one-way request. Each of the two has its typical usage scenarios in Nova.

RPC.CALL

RPC.CALL is a two-way communication flow: the RabbitMQ message system receives the request message generated by the message producer, and after the message consumer processes it, the result is fed back to the calling program through the message system.

Figure 16: RPC.CALL schematic

A user creates a virtual machine through the Dashboard. The interface wraps the request into a message and sends it to Nova-API.

Nova-API, as the message producer, sends the message to the message queue in RPC.CALL mode through the Topic exchange.

Nova-Compute, as the message consumer, receives the message and starts the virtual machine through the underlying virtualization software.

After the virtual machine has started successfully, Nova-Compute, now acting as the message producer, sends a success response back to Nova-API through the Direct exchange and the response message queue.

Nova-API, now acting as the message consumer, receives the message and notifies the user that the virtual machine has started successfully.

The details of how RPC.CALL works are shown in the following figure:

Figure 17: RPC.CALL implementation details

Workflow:

  • When creating a message, the client specifies the reply_to queue name and tags the call with a correlation_id.
  • The server receives the message through the queue, calls the processing function, and returns the result.
  • The result is returned to the queue specified by reply_to, carrying the correlation_id.
  • When the return message reaches the client, the client uses the correlation_id to determine which function call the result belongs to.
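
The workflow above can be modelled with in-memory queues. This sketch uses Python's `queue` module in place of RabbitMQ; the queue name `reply.client1`, the message layout, and the summing "service" are assumptions for the example, while `reply_to` and `correlation_id` mirror the properties named above.

```python
import queue
import threading
import uuid

request_queue = queue.Queue()
reply_queues = {"reply.client1": queue.Queue()}  # stands in for the broker

def server_loop():
    """Consume one request, process it, and reply to the reply_to queue."""
    message = request_queue.get()
    result = sum(message["body"])
    reply_queues[message["reply_to"]].put(
        {"correlation_id": message["correlation_id"], "body": result}
    )

def call(numbers):
    """Send a request tagged with a fresh correlation_id and wait for its reply."""
    correlation_id = str(uuid.uuid4())
    request_queue.put(
        {
            "reply_to": "reply.client1",
            "correlation_id": correlation_id,
            "body": numbers,
        }
    )
    reply = reply_queues["reply.client1"].get(timeout=5)
    if reply["correlation_id"] != correlation_id:
        raise RuntimeError("reply does not belong to this call")
    return reply["body"]

worker = threading.Thread(target=server_loop)
worker.start()
total = call([1, 2, 3])
worker.join()
```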

If multiple threads make remote method calls at the same time, many messages travel over the Socket connection established between client and server, and their order may be arbitrary.

After the server finishes processing, it sends the result message to the client. The client receives many messages, so how does it know which message is the result of which thread's call?

Before calling the remote interface through the Socket, each client thread generates a unique ID, the Request ID (which must be unique within that Socket connection). An AtomicLong that counts up from 0 is commonly used to generate the unique ID.
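
A rough Python counterpart of this idea is shown below: a locked counter plays the role of AtomicLong, and a table of pending futures matches each response to its caller. The class `PendingCalls` and its method names are hypothetical.

```python
import itertools
import threading
from concurrent.futures import Future

class PendingCalls:
    """Tracks in-flight requests on one connection by Request ID."""

    def __init__(self):
        self._next_id = itertools.count(0)  # counts up from 0
        self._lock = threading.Lock()
        self._pending = {}

    def new_request(self):
        """Allocate a unique Request ID and a future for its result."""
        with self._lock:
            request_id = next(self._next_id)
            future = Future()
            self._pending[request_id] = future
        return request_id, future

    def on_response(self, request_id, result):
        """Deliver a response to whichever caller issued request_id."""
        with self._lock:
            future = self._pending.pop(request_id)
        future.set_result(result)

calls = PendingCalls()
id_a, future_a = calls.new_request()
id_b, future_b = calls.new_request()

# Responses may arrive in a different order than the requests were sent.
calls.on_response(id_b, "result for b")
calls.on_response(id_a, "result for a")
```

Each calling thread would block on its own future, so out-of-order responses still reach the right caller.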

RPC.CAST

An RPC.CAST remote call is similar to RPC.CALL, except that it lacks the message response process.

A Topic message producer sends a request message to the Topic exchange, and the Topic exchange forwards the message to the shared message queue according to the message's Routing Key.

All Topic consumers connected to the shared message queue receive the request message and pass it to the server for processing.

The call flow is shown below:

Figure 18: RPC.CAST Schematic

Connection design

The general design idea for RabbitMQ-based RPC networking is that consumers use long-lived connections and senders use short-lived connections, although whether a connection is long or short can be controlled freely.

Consumers typically use long-lived connections so that they are always ready to receive and process messages. Features such as auto-deleted queues and exchanges are also involved, so short connections should not be used unless there is a special need. Senders can use short connections so that they do not occupy port numbers for long periods, which saves port resources.

RPC code design in Nova:

A simple comparison of RESTful API and RPC

RESTful API architecture

The main features of REST are: resources, a uniform interface, URIs, and statelessness.

① Resources

A "resource" is an entity on the network, or a specific piece of information on the network. It can be a piece of text, a picture, a song, or a service; in short, it is something concrete.

② Uniform interface

The RESTful architectural style specifies that data operations, i.e. CRUD (create, read, update, delete), correspond to HTTP methods: GET retrieves a resource, POST creates a resource (and may also be used to update one), PUT updates a resource, and DELETE deletes a resource. This unifies the interface for data operations: all create, read, update, and delete work can be done through HTTP methods alone.

③ URI

A URI (Uniform Resource Identifier) can be used to point to a resource; that is, each URI corresponds to a particular resource.

To obtain the resource, you access its URI, so the URI becomes the address or identifier of each resource.

④ Stateless

Stateless means that every resource can be located through a URI, and that locating it is independent of other resources and does not change when other resources change. The difference between stateful and stateless can be explained with a simple example.

Take querying an employee's wages. If you have to log in to the system, go to the wage query page, and perform the relevant steps before you obtain the wage, then this is stateful.

This is because every step of the wage query depends on the previous step; if a preceding step fails, the subsequent steps cannot be executed.

If entering a URI is enough to obtain the specified employee's wage, then this is stateless, because obtaining the wage does not depend on other resources or state.

In this case the employee's wage is a resource that corresponds one-to-one with a URI and can be obtained with the HTTP GET method. This is typical RESTful style.
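
The wage example can be sketched as a single stateless handler. The URI layout `/employees/<id>/wage`, the sample data, and the `handle` function are assumptions for illustration; no web framework is involved, and each call carries everything needed to serve it.

```python
WAGES = {"42": 5000}  # hypothetical data: employee ID -> wage

def handle(method, uri, body=None):
    """Return (status_code, payload) for a request on /employees/<id>/wage."""
    parts = uri.strip("/").split("/")
    if len(parts) != 3 or parts[0] != "employees" or parts[2] != "wage":
        return 404, None
    employee_id = parts[1]
    if method == "GET":      # read the resource
        if employee_id not in WAGES:
            return 404, None
        return 200, WAGES[employee_id]
    if method == "PUT":      # update (or set) the resource
        WAGES[employee_id] = body
        return 200, body
    if method == "DELETE":   # delete the resource
        if employee_id not in WAGES:
            return 404, None
        del WAGES[employee_id]
        return 204, None
    return 405, None         # method not allowed on this resource

status, wage = handle("GET", "/employees/42/wage")
```

The GET call does not depend on any earlier request such as a login page visit, which is the stateless property described above.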

 

Summary

RPC is mainly used for service calls inside a company. It has low performance overhead and high transmission efficiency, at the cost of implementation complexity.

HTTP is mainly used for external, heterogeneous environments: browser interface calls, app interface calls, third-party interface calls, and so on.

RPC usage scenarios (large websites with many internal subsystems and a great many interfaces are well suited to RPC):

  • Long-lived connections. A three-way handshake is not needed for every exchange, as it is with short-lived HTTP connections, which reduces network overhead.
  • Registration and publishing mechanism. RPC frameworks generally have a registry with rich monitoring and management; publishing interfaces, taking them offline, and dynamic scaling are transparent to the caller and managed in a unified way.
  • Security: resource operations are not exposed.
  • Microservice support. Service-oriented architecture and service governance have become popular recently, and RPC frameworks provide strong support for them.

 

 

 


Origin blog.csdn.net/weixin_42073629/article/details/104567846