A serialization solution means using a binary communication protocol (data packet format)

Foreword

A so-called serialization solution is a scheme for converting objects to binary data and back.

Why binary? A few points need to be clear.
1. Inside the application, data lives in objects.
2. Network transmission is always binary.
With or without an explicit object-to-binary conversion step, what finally travels over the network is binary data.
3. Why, then, talk about object-to-binary conversion at all?
The essence is not the conversion itself. The point is that the protocol both sides agree on (i.e., the data packet format) should be a binary packet protocol, because that is what keeps data packets as small as possible.

In summary, a serialization solution means using a binary protocol (data packet format).
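
The packet-size argument can be made concrete with a small sketch. It encodes the same record once as JSON text and once in a fixed binary layout. The record, its field names, and the layout `>Id?` are assumptions chosen only for illustration; they are not taken from any real protocol.

```python
import json
import struct

# Hypothetical record used only for illustration.
record = {"user_id": 1234567, "score": 98.5, "active": True}

# Text protocol: field names and punctuation travel with every packet.
json_bytes = json.dumps(record).encode("utf-8")

# Binary protocol: both sides agree on the layout in advance
# (big-endian unsigned 32-bit int, 64-bit float, 1-byte bool).
PACKET_FORMAT = ">Id?"
binary_bytes = struct.pack(
    PACKET_FORMAT, record["user_id"], record["score"], record["active"]
)

# Decoding the binary packet requires knowing the same layout.
user_id, score, active = struct.unpack(PACKET_FORMAT, binary_bytes)
decoded = {"user_id": user_id, "score": score, "active": active}

print(len(json_bytes), len(binary_bytes))
```

The binary packet is smaller because the field names never leave the application; the price is that both sides must share the layout.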

Comparing different serialization solutions

What is the difference between JSON and Google's Protocol Buffers?

Protocol Buffers is a lightweight serialization/deserialization tool, so why is JSON still so widely used?
JSON's format is clear and readable, but its payload is noticeably larger than that of Protocol Buffers.


In most cases JSON is not the bottleneck, so there is no need to replace or optimize it.

Small and medium-sized companies should not blindly copy Google's approach.


When the payload is mostly text, protobuf leaves little room for efficiency gains; what it does bring is lower debugging efficiency.


Cross-language serialization solutions
In practice there are only three cross-language serialization solutions: protobuf, thrift, and JSON.

JSON is bulky and lacks type information. In practice it is used only for RESTful interfaces; no RPC framework seems to choose JSON as its default serialization. Usage at some large companies:

protobuf: Tencent, Baidu, and others

thrift: Xiaomi, Meituan, and others

hessian: Alibaba uses a version it maintains itself, and js/cpp implementations exist. Because Alibaba is mainly a Java shop, the choice is mostly for historical reasons.


Reference
www.zhihu.com/question/29...

Type information in serialization

Serialization converts an object into binary data; deserialization turns that binary data back into an object.

Serialization libraries come in endless variety, but one important difference among them is where the type information is stored.

They can be divided into three kinds:
1. Type information is not saved

Typical examples are the various JSON serialization libraries. The advantage is flexibility; the disadvantage is that both sides must already know which type is in use. Some JSON libraries do provide extensions that quietly put type information into the JSON.

2. Type information is stored in the serialized result

Examples are Java's built-in serialization and hessian. The disadvantage is that the type information is redundant: every RPC request, for example, has to carry the types. A common optimization is for the two RPC ends to negotiate once, after which subsequent requests no longer need to carry type information.

3. Type information is carried by generated code

The package and class names are usually written in an IDL file, and the generated code carries the type information directly. Examples are protobuf and thrift. The disadvantage is that code must be generated and both sides need to know the IDL file.

Type information may look like a trivial matter, but it can become a big security problem, which will be discussed later.
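
The difference between the first two kinds can be sketched in a few lines. The `User` class, the `@type` field name, and the `TYPE_REGISTRY` lookup are all hypothetical names introduced for this illustration; real libraries each have their own conventions.

```python
import json


class User:
    def __init__(self, name, age):
        self.name = name
        self.age = age


# Hypothetical registry: only the types listed here can be rebuilt,
# so the payload cannot name an arbitrary class.
TYPE_REGISTRY = {"User": User}


def dumps_untyped(obj):
    """Kind 1: no type information; the reader must already know the type."""
    return json.dumps(dict(vars(obj)))


def dumps_typed(obj):
    """Kind 2: the type name travels inside every payload."""
    payload = dict(vars(obj))
    payload["@type"] = type(obj).__name__
    return json.dumps(payload)


def loads_typed(text):
    """Rebuild the object named by the payload, via the registry."""
    payload = json.loads(text)
    cls = TYPE_REGISTRY[payload.pop("@type")]
    return cls(**payload)


user = User("alice", 30)
untyped = json.loads(dumps_untyped(user))  # a plain dict; the type is lost
restored = loads_typed(dumps_typed(user))  # a User instance again
```

The typed payload is longer on every request, which is the redundancy described for the second kind. Restricting reconstruction to a registry is one common precaution when the payload itself names the type to build.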


Reference
hengyunabc.github.io/thinking-ab...

hessian

What it is
A binary RPC data solution.


What it does
1. Data type conversion
object <-> binary
2. RPC communication
It can be regarded as a communication framework, much like netty.
However, as a communication layer it is certainly not as good as netty; it is merely usable for communication.


The biggest advantage
Cross-language support


Design goals

1. Introduction

This document describes the portions of the Hessian 2.0 protocol concerning web services. The document is intended to be read by implementors of Hessian 2.0. Hessian 2.0 supports two types of network communication: remote procedure call (RPC) and message-based. These may also be viewed as synchronous and asynchronous communication.

RPC communication is based on "methods" invoked on a server. The method being invoked is specified in a Hessian 2.0 "call". Arguments to these methods are passed in the call and are serialized using the Hessian 2.0 serialization protocol. If the method was successfully invoked, the return value of the method is also serialized using Hessian 2.0 and sent to the client. If the method was not successfully invoked, a "fault" is returned to the client. RPC communication can use any underlying network protocol for transport such as HTTP or TCP.

Message-based communication is asynchronous and does not necessarily involve the use of methods, clients, or servers. Messages may or may not receive a response message. Messages simply contain other Hessian 2.0 objects. These may be simple types, aggregates like a list or map, or an "envelope". Envelopes may have headers that specify routing or other special processing information. They may also contain encrypted, signed, and/or compressed data. Thus using messages with envelopes can be useful in cases where end-to-end security is necessary. Message-based communication can also use any underlying network protocol such as HTTP or TCP and may be especially appropriate in queued message systems.

1. Design Goals

Unlike older binary protocols, Hessian is both self-describing, compact, and portable across languages. The wire protocol for web services should be invisible to application writers, it should not require external schema or IDL.

The Hessian protocol has the following design goals:

It must not require external IDL or schema definitions, i.e. the protocol should be invisible to application code.
It must be language-independent.
It must be simple so it can be effectively tested and implemented.
It must be as fast as possible.
It must be as compact as possible.
It must support Unicode strings.
It must support 8-bit binary data (i.e. without encoding or using attachments.)
It must support encryption, compression, signature, and transaction context envelopes.


demo
www.cnblogs.com/xdp-gacl/p/…


Communication
Based on the HTTP protocol.

Hessian is a commonly used binary RPC with relatively high performance. In Internet applications it is mainly used for ordinary web-service method calls where the amount of data exchanged is small. Hessian exchanges data over HTTP, so the server side generally has to be deployed in a web container (such as a servlet container). Any Java class can be exposed through HessianServlet and published as a hessian service; a hessian client can then call that servlet to obtain the output of the remote method.

shift-alt-ctrl.iteye.com/blog/189709…


Data format conversion
A Java-based serialization API.


Using hessian as the communication protocol in dubbo
shiyanjun.cn/archives/34...


Reference
hessian.caucho.com/ // official website
hessian.caucho.com/doc/hessian...

shift-alt-ctrl.iteye.com/blog/189709... // good introductory article

Dubbo: serialization solutions

Where does serialization sit in dubbo?
1. The sending end
object -> binary
2. The receiving end
binary -> object

Serialization is data type conversion: it converts object data into binary data.

Why convert to binary data? Because transmission efficiency is high.

What remains is the question of which object-to-binary solution is the fastest.
1. Google's open-source solution (protobuf)
2. Hessian, which dubbo uses
3. Others

4. In fact, Java itself has the official JDK serialization.
However, it has two drawbacks: 1. its efficiency is not high; 2. it only supports Java.


Where does the netty communication framework sit in dubbo?
It is the communication framework layer, sitting on top of the TCP/IP protocol.
Its job is to make the data flow, so that two processes can exchange data through a socket.


Where does hessian sit in dubbo?
First, dubbo has two aspects worth noting:
1. Communication protocol
2. Data format conversion

For the communication protocol there are several options:
1. dubbo's custom protocol
2. hessian // also a custom protocol; as mentioned repeatedly above, hessian is a communication solution as well, a communication framework like netty

Data format conversion
A communication protocol is just a communication protocol: it only defines the data packet format.
Before communicating, objects still have to be converted to binary data, and this calls for an object-to-binary solution. hessian is one such solution, and it is the default serialization of the dubbo communication protocol. Besides hessian, Google's protobuf solves the same problem.

Just as Google's RPC uses Google's binary serialization by default, dubbo uses hessian binary serialization by default.

What is the difference between dubbo's dubbo communication protocol and its hessian communication protocol?

The biggest difference is the usage scenario:
1. dubbo
small data
2. hessian
large data


Why small data for one and large data for the other?
1. Amount of data
The dubbo protocol is TCP-based and serves RPC API calls with small amounts of data.

The hessian protocol is HTTP-based and is equivalent to a web server (embedded jetty). It is better suited to large payloads such as page requests.

2. A single long connection, or multiple short connections?
The dubbo protocol uses a single long connection: each consumer holds only one connection, which spares the provider from having to bear too many connections (and the resources they consume). After all, provider machines are relatively few and consumer machines are many.

The hessian protocol uses multiple connections: since it is a web server (jetty), every request creates a connection. HTTP suits large data; the smallest request is a page, and other multimedia data is larger still.

3. Asynchronous non-blocking, or synchronous blocking?
The dubbo protocol is based on netty and Java NIO: asynchronous and non-blocking.

The hessian protocol is based on an HTTP server (jetty): synchronous and blocking.


Specific figures
See the official documentation:
dubbo.apache.org/zh-cn/docs/...

Explanation
Frequently Asked Questions
1. Why should there be more consumers than providers?
Because the dubbo protocol uses a single long connection. Assuming the network is Gigabit Ethernet, each connection can push at most about 7 MByte (based on test experience; the figure may differ by environment and is for reference only), so in theory one service provider needs 20 service consumers to saturate its network card.

Explanation:

  1. A single TCP connection between one consumer and one provider carries at most about 7 MByte.

  2. Gigabit Ethernet is nominally 1000 Mbit/s, which is about 125 MByte/s; the official figure used in the next question is 128 MByte (1024 Mbit / 8).

128 MByte / 7 MByte ≈ 18, i.e. roughly 20: it takes about 20 consumer connections to saturate the network card of a single provider.

The figures below are calculated in a similar way.
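
As a rough check of that estimate, the division can be written out. This sketch uses the 128 MByte network-card figure that appears in the large-packet question of the same FAQ and the 7 MByte per-connection figure from the text; both are reference values, and the variable names are made up for the illustration.

```python
# Reference figures from the text; real throughput depends on the environment.
NIC_MBYTE_PER_S = 128           # Gigabit Ethernet, taken as 1024 Mbit / 8
PER_CONNECTION_MBYTE_PER_S = 7  # one long-lived dubbo connection

consumers_to_saturate = NIC_MBYTE_PER_S / PER_CONNECTION_MBYTE_PER_S
print(round(consumers_to_saturate, 1))  # 18.3
```

The result is a little over 18, which the FAQ rounds to about 20 consumers per provider.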

2. Why can't large packets be sent?
Because the dubbo protocol uses a single long connection. If each request packet is 500 KByte and the network is Gigabit Ethernet, with each connection topping out at about 7 MByte (may differ by environment; for reference only), then the maximum TPS (transactions per second) of a single service provider is 128 MByte / 500 KByte = 262, and the maximum TPS of a single consumer calling a single service provider is 7 MByte / 500 KByte = 14. If that is acceptable, large packets can be considered; otherwise the network becomes the bottleneck.
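
The two TPS ceilings follow directly from dividing bandwidth by packet size. A minimal sketch, assuming 1 MByte = 1024 KByte (which is what reproduces the figures 262 and 14); the function name `max_tps` is hypothetical.

```python
KBYTE_PER_MBYTE = 1024


def max_tps(bandwidth_mbyte_per_s, packet_kbyte):
    """Upper bound on requests per second if bandwidth is the only limit."""
    return bandwidth_mbyte_per_s * KBYTE_PER_MBYTE // packet_kbyte


provider_tps = max_tps(128, 500)  # whole network card of one provider
consumer_tps = max_tps(7, 500)    # one long-lived connection
print(provider_tps, consumer_tps)  # 262 14
```

Halving the packet size doubles both ceilings, which is why the single-connection dubbo protocol favors small requests.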

3. Why use an asynchronous single long connection?
Because in most cases service providers are few, usually only a handful of machines, while service consumers are many; possibly the entire site accesses the service. For example, Morgan has only 6 provider machines but hundreds of consumers and 150 million calls every day. With a conventional hessian service the provider would easily be overwhelmed. A single connection ensures that consumers cannot overwhelm a single provider, a long connection reduces connection handshakes and verification, and asynchronous IO with a reused thread pool guards against the C10K problem.


Reference
dubbo.apache.org/zh-cn/docs/...
dubbo.apache.org/zh-cn/docs/...

dubbo.apache.org/zh-cn/blog/…

Dubbo: communication protocol

There is actually nothing special about it. The dubbo communication protocol is simply a custom protocol layered on TCP and implemented with netty.

In other words, dubbo defines its own set of protocol rules (that is, its own data packet format).
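
What "defining your own data packet format" amounts to can be sketched with a tiny framing scheme. The header below (2-byte magic number, 8-byte request id, 4-byte body length) is a hypothetical layout invented for this example; it is not dubbo's actual wire format.

```python
import struct

# Hypothetical header layout, NOT dubbo's real one:
# 2-byte magic, 8-byte request id, 4-byte body length (all big-endian).
HEADER = struct.Struct(">HQI")
MAGIC = 0xCAFE


def encode_frame(request_id, body):
    """Prefix the already-serialized body with a fixed-size header."""
    return HEADER.pack(MAGIC, request_id, len(body)) + body


def decode_frame(data):
    """Parse one frame and return (request_id, body)."""
    magic, request_id, length = HEADER.unpack_from(data, 0)
    if magic != MAGIC:
        raise ValueError("unexpected magic number")
    body = data[HEADER.size:HEADER.size + length]
    if len(body) != length:
        raise ValueError("incomplete frame")
    return request_id, body


frame = encode_frame(42, b"serialized-object-bytes")
request_id, body = decode_frame(frame)
```

The body is whatever the serialization layer (hessian, protobuf, and so on) produced; the protocol only says how to find it on the wire. A request id in the header is what lets many in-flight calls share one long connection.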

Beyond that, there is nothing different about it.

Reproduced in: https://juejin.im/post/5d05b48e6fb9a07eac05d1aa

Origin blog.csdn.net/weixin_33991418/article/details/93177782