Comprehensive analysis of high-level core knowledge in Java-Dubbo (concept, Dubbo architecture & load balancing)

One, Important concepts

1. What is Dubbo?

Apache Dubbo (incubating) |ˈdʌbəʊ| is a high-performance, lightweight, open source Java RPC framework that provides three core capabilities: interface-oriented remote method invocation, intelligent fault tolerance and load balancing, and automatic service registration and discovery. Put simply, Dubbo is a distributed service framework dedicated to providing a high-performance, transparent RPC remote invocation solution and an SOA service governance solution.

In addition, in the 2018 annual selection of the most popular Chinese open source software held by Open Source China (OSChina), Dubbo, by virtue of its high popularity, took third place behind vue.js and ECharts.

Dubbo was open sourced by Alibaba and later joined Apache. It is precisely because of the emergence of Dubbo that more and more companies began to use and accept distributed architecture.

We mentioned above that Dubbo is actually an RPC framework, so what is RPC?

2. What is RPC? What is the principle of RPC?

What is RPC?

RPC (Remote Procedure Call) is a protocol for requesting a service from a program on a remote computer over a network, without needing to understand the underlying network technology. For example, if two different services A and B are deployed on two different machines, what should service A do if it wants to call a method in service B? It could of course use HTTP requests, but that can be cumbersome. RPC exists so that calling a remote method is as easy as calling a local one.

What is the principle of RPC?

  1. The service consumer (client) calls the service as if it were making a local call;
  2. On receiving the call, the client stub assembles the method and parameters into a message body that can be transmitted over the network;
  3. The client stub finds the service address and sends the message to the server;
  4. The server stub decodes the message after receiving it;
  5. The server stub calls the local service according to the decoding result;
  6. The local service executes and returns the result to the server stub;
  7. The server stub packages the returned result into a message and sends it to the consumer;
  8. The client stub receives the message and decodes it;
  9. The service consumer gets the final result.
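
The nine steps above can be sketched in a few lines of Java. This is only a minimal in-process illustration, not how Dubbo implements them: the names RpcSketch, HelloService, HelloServiceImpl and Transport are hypothetical, JDK serialization stands in for a real wire format, and the "network" is just a function call.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

public class RpcSketch {
    /** Hypothetical service interface shared by consumer and provider. */
    public interface HelloService {
        String sayHello(String name);
    }

    /** Provider-side implementation: the "local service" of steps 5 and 6. */
    public static class HelloServiceImpl implements HelloService {
        @Override
        public String sayHello(String name) {
            return "Hello, " + name;
        }
    }

    /** Stand-in for the network: takes request bytes, returns response bytes. */
    public interface Transport {
        byte[] send(byte[] request) throws Exception;
    }

    // Steps 2 and 7: turn a value into bytes that could travel over a network.
    public static byte[] encode(Object value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        return bytes.toByteArray();
    }

    // Steps 4 and 8: turn received bytes back into a value.
    public static Object decode(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    /** Server stub (steps 4 to 7): decode, call the local service, encode the result. */
    public static byte[] serverStub(Object service, byte[] request) throws Exception {
        Object[] message = (Object[]) decode(request);
        String methodName = (String) message[0];
        Class<?>[] paramTypes = (Class<?>[]) message[1];
        Object[] args = (Object[]) message[2];
        Method method = service.getClass().getMethod(methodName, paramTypes);
        return encode(method.invoke(service, args));
    }

    /** Client stub (steps 2, 3 and 8): a dynamic proxy that hides the remote call. */
    @SuppressWarnings("unchecked")
    public static <T> T clientStub(Class<T> iface, Transport transport) {
        return (T) Proxy.newProxyInstance(
                iface.getClassLoader(),
                new Class<?>[] {iface},
                (proxy, method, args) -> {
                    Object[] message = {
                        method.getName(),
                        method.getParameterTypes(),
                        args == null ? new Object[0] : args
                    };
                    byte[] response = transport.send(encode(message));
                    return decode(response);
                });
    }

    public static void main(String[] args) {
        HelloService provider = new HelloServiceImpl();
        // Step 1: the consumer calls the proxy exactly like a local object.
        HelloService consumer =
                clientStub(HelloService.class, request -> serverStub(provider, request));
        System.out.println(consumer.sayHello("Dubbo")); // prints "Hello, Dubbo"
    }
}
```

In a real framework the Transport would be a socket connection and the service address would come from a registry; only the shape of the stubs is the point here.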

Here is a sequence diagram of this process found online:

(Figure: RPC call sequence diagram)

Having said so much, why should we use Dubbo?

3. Why use Dubbo?

The birth of Dubbo has a lot to do with the popularity of SOA distributed architecture. Under SOA (Service Oriented Architecture), a project is split according to business logic into two parts: a service layer and a presentation layer. The service layer contains the business logic and only needs to expose services. The presentation layer only handles interaction with the page and implements business logic by calling the service layer's services. There are two main roles in an SOA architecture: the service provider (Provider) and the service consumer (Consumer).

If you want to develop distributed programs, you can also communicate directly based on the HTTP interface, but why use Dubbo?

I think we can mainly explain why Dubbo is used from the following four features provided by Dubbo:

  1. Load balancing: when the same service is deployed on different machines, decide which machine's instance should be called.
  2. Service call link generation: as the system grows, there are more and more services, and the dependencies between them become intricate. It can even be unclear which application must be started before which, and the architect can no longer describe the application's architecture completely. Dubbo helps us work out how services call each other.
  3. Service access pressure and duration statistics, resource scheduling and governance: cluster capacity is managed in real time based on access pressure, to improve cluster utilization.
  4. Service degradation: when a service goes down, a backup service is called instead.

In addition, Dubbo can be used not only in distributed systems but also in the now-popular microservice systems. However, because Spring Cloud is more widely used for microservices, I think that when Dubbo is mentioned, it is mostly in the context of distributed systems.

We have just mentioned the concept of "distributed", so let me introduce what distributed means and why we go distributed.

4. What is distributed?

Distributed, or SOA-style distributed, architecture is service-oriented. Put simply, distributed means that we split the whole system into different services and put these services on different servers, to reduce the pressure on any single service and improve concurrency and performance. For example, an e-commerce system can be split into an order system, a commodity system, a login system, and so on. After the split, each service can be deployed on a different machine. If a certain service receives a relatively large number of visits, that service can also be deployed on multiple machines at the same time.

5. Why distributed?

From a development perspective, the code of a single application is concentrated, while the code of a distributed system is split according to business. Therefore, each team can be responsible for the development of a service, which improves development efficiency. In addition, the code is easier to maintain and expand after it is split according to the business.

In addition, I think splitting the system into a distributed system not only makes it easier to extend and maintain, but can also improve the performance of the whole system. Think about it: if the whole system is split into different services/systems and each one is deployed separately on its own server, doesn't overall system performance improve considerably?



Two, Dubbo's architecture

1. Dubbo's architecture diagram


A brief description of the above nodes:

  • Provider : The service provider that exposes the service
  • Consumer : The service consumer who calls the remote service
  • Registry : The registry for service registration and discovery
  • Monitor : A monitoring center that counts the number of service calls and call time
  • Container : Service running container

Calling relationship description:

  1. The service container is responsible for starting, loading, and running the service provider.
  2. When the service provider starts, it registers the services it provides with the registry.
  3. When a service consumer starts, it subscribes to the registry for the services it needs.
  4. The registry returns the list of service provider addresses to the consumer. If there is a change, the registry pushes the changed data to the consumer over a persistent (long-lived) connection.
  5. The service consumer uses a soft load balancing algorithm to select one provider from the provider address list to call, and if the call fails, selects another one.
  6. Service consumers and providers accumulate the number of calls and the call time in memory, and send the statistics to the monitoring center once a minute.
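
Steps 2 to 4 (register, subscribe, push on change) can be sketched as a tiny in-memory registry. This is a minimal illustration under assumptions, not Dubbo's or ZooKeeper's API: the class RegistrySketch, its method names, and the addresses are all hypothetical, and a callback stands in for the push over a persistent connection.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

public class RegistrySketch {
    // service name -> provider addresses currently registered
    private final Map<String, List<String>> providers = new ConcurrentHashMap<>();
    // service name -> callbacks of subscribed consumers
    private final Map<String, List<Consumer<List<String>>>> listeners = new ConcurrentHashMap<>();

    /** Step 2: a provider registers its address at startup. */
    public void register(String service, String address) {
        providers.computeIfAbsent(service, k -> new CopyOnWriteArrayList<>()).add(address);
        notifyListeners(service);
    }

    /** A provider going away triggers a push to subscribers as well. */
    public void unregister(String service, String address) {
        List<String> list = providers.get(service);
        if (list != null && list.remove(address)) {
            notifyListeners(service);
        }
    }

    /** Steps 3 and 4: subscribe and immediately receive the current address list. */
    public void subscribe(String service, Consumer<List<String>> listener) {
        listeners.computeIfAbsent(service, k -> new CopyOnWriteArrayList<>()).add(listener);
        listener.accept(snapshot(service));
    }

    private List<String> snapshot(String service) {
        return new ArrayList<>(providers.getOrDefault(service, Collections.emptyList()));
    }

    private void notifyListeners(String service) {
        List<String> current = snapshot(service);
        for (Consumer<List<String>> listener
                : listeners.getOrDefault(service, Collections.emptyList())) {
            listener.accept(current);
        }
    }

    public static void main(String[] args) {
        RegistrySketch registry = new RegistrySketch();
        // The consumer keeps the last pushed list as its local cache.
        AtomicReference<List<String>> localCache = new AtomicReference<>();

        registry.register("HelloService", "10.0.0.1:20880");
        registry.subscribe("HelloService", localCache::set);
        registry.register("HelloService", "10.0.0.2:20880");
        System.out.println(localCache.get()); // [10.0.0.1:20880, 10.0.0.2:20880]

        registry.unregister("HelloService", "10.0.0.1:20880");
        System.out.println(localCache.get()); // [10.0.0.2:20880]
    }
}
```

Because the consumer holds its own copy of the list, it can keep calling providers even when the registry itself is unreachable.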

Summary of important knowledge points:

  • The registry is responsible for registering and looking up service addresses, which makes it equivalent to a directory service. Service providers and consumers only interact with the registry at startup, and the registry does not forward requests, so the load on it is light
  • The monitoring center is responsible for counting how many times each service is called, how long the calls take, and so on. The statistics are first aggregated in memory, then sent to the monitoring center server once a minute and displayed as reports
  • The registry, service providers, and service consumers are connected to one another by persistent connections; the monitoring center is the exception
  • The registry perceives the existence of service providers through the persistent connection; when a service provider goes down, the registry immediately pushes an event to notify consumers
  • Even if the registry and the monitoring center are all down, providers and consumers that are already running are not affected, because consumers cache the provider list locally
  • Both the registry and the monitoring center are optional; service consumers can connect directly to service providers
  • Service providers are stateless; if any one of them goes down, use of the service is not affected
  • If all service providers go down, the service consumer application cannot use the service and will retry the connection indefinitely, waiting for a provider to recover

2. Dubbo working principle


The figure is divided into ten layers from bottom to top. Dependencies between layers are one-way; the black arrows on the right represent the dependencies between layers. Each layer can be reused with the layers above it stripped away. Of these, the Service and Config layers are API; all other layers are SPI.

Description of each layer:

  • The first layer: service layer, the interface layer, for service providers and consumers to implement
  • The second layer: config layer, the configuration layer, mainly for Dubbo's various configurations
  • The third layer: proxy layer, transparent proxy of the service interface, generating the client-side Stub and the server-side Skeleton
  • The fourth layer: registry layer, the service registration layer, responsible for service registration and discovery
  • The fifth layer: cluster layer, the cluster layer, encapsulating routing and load balancing across multiple service providers and combining multiple instances into one service
  • The sixth layer: monitor layer, the monitoring layer, monitoring the number of calls and the call time of RPC interfaces
  • The seventh layer: protocol layer, the remote call layer, encapsulating RPC calls
  • The eighth layer: exchange layer, the information exchange layer, encapsulating the request-response pattern and converting synchronous calls to asynchronous ones
  • The ninth layer: transport layer, the network transport layer, abstracting mina and netty behind a unified interface
  • The tenth layer: serialize layer, the data serialization layer, required for network transmission

Three, Dubbo's load balancing strategy

1. First explain what load balancing is

First, an official explanation.

Wikipedia's definition of load balancing: Load balancing improves the distribution of workloads across multiple computing resources (such as computers, computer clusters, network links, central processing units, or disk drives). Load balancing aims to optimize resource usage, maximize throughput, minimize response time, and avoid overloading any single resource. Using multiple components with load balancing instead of a single component can increase reliability and availability through redundancy. Load balancing usually involves dedicated software or hardware.

What I said above may not be easy for you to understand, so let me tell you in plain language.

For example, a certain service in our system receives very high traffic, so we deploy it on multiple servers. When a client initiates a request, any of these servers can handle it, so choosing which server processes the request becomes critical. If a single server ends up handling all requests for the service, there is no longer any point in deploying it on multiple servers. Load balancing exists to avoid having one server absorb all the requests, which could easily lead to downtime, crashes, and other problems. The term "load balancing" itself makes this meaning clear.

2. Let's take a look at the load balancing strategy provided by Dubbo

For cluster load balancing, Dubbo offers several strategies; the default is random (random call). You can also extend the load balancing strategy yourself.

1) Random LoadBalance (default, random load balancing mechanism based on weight)

  • Random: the random probability is set by weight.
  • At any single moment the probability of collisions is high, but the larger the call volume, the more even the distribution. Weights applied by probability also even out, which makes it convenient to adjust provider weights dynamically.
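
Weight-based random selection can be sketched as follows. This is a minimal illustration of the idea, not Dubbo's RandomLoadBalance source; the class name WeightedRandomSketch and the example weights are assumptions.

```java
import java.util.Arrays;
import java.util.Random;

public class WeightedRandomSketch {
    /** Picks an index with probability proportional to its (non-negative) weight. */
    public static int select(int[] weights, Random random) {
        int total = 0;
        for (int weight : weights) {
            total += weight;
        }
        if (total <= 0) {
            // No usable weights: fall back to a uniform random choice.
            return random.nextInt(weights.length);
        }
        // Draw a point in [0, total) and find the weight interval it falls into.
        int offset = random.nextInt(total);
        for (int i = 0; i < weights.length; i++) {
            offset -= weights[i];
            if (offset < 0) {
                return i;
            }
        }
        throw new IllegalStateException("unreachable for non-negative weights");
    }

    public static void main(String[] args) {
        int[] weights = {5, 3, 2}; // hypothetical provider weights
        int[] hits = new int[weights.length];
        Random random = new Random();
        for (int i = 0; i < 100_000; i++) {
            hits[select(weights, random)]++;
        }
        // With many calls the hit counts approach the 5:3:2 ratio.
        System.out.println(Arrays.toString(hits));
    }
}
```

A short run can look lopsided, which is the "collision" effect described above; the ratio only settles as the call volume grows.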

2) RoundRobin LoadBalance (not recommended, weight-based round-robin load balancing mechanism)

  • Round robin: the round-robin ratio is set according to the agreed weights.
  • Slow providers can accumulate requests. For example, the second machine is very slow but has not gone down. Each request routed to it gets stuck there, and over time requests pile up on the second machine.
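
One way to realize weight-based round robin is the "smooth weighted round robin" technique sketched below. It is shown only as an illustration of rotating by weight and is not claimed to match the source of the Dubbo version discussed here; the class name is hypothetical.

```java
public class WeightedRoundRobinSketch {
    private final int[] weights;
    private final int[] current; // running score per provider
    private final int total;

    public WeightedRoundRobinSketch(int... weights) {
        this.weights = weights.clone();
        this.current = new int[weights.length];
        int sum = 0;
        for (int weight : weights) {
            sum += weight;
        }
        this.total = sum;
    }

    /** Returns the index of the next provider to call. */
    public synchronized int next() {
        int best = 0;
        for (int i = 0; i < weights.length; i++) {
            current[i] += weights[i];        // every provider earns its weight
            if (current[i] > current[best]) {
                best = i;                    // highest running score wins
            }
        }
        current[best] -= total;              // the winner pays the total back
        return best;
    }

    public static void main(String[] args) {
        WeightedRoundRobinSketch lb = new WeightedRoundRobinSketch(5, 1, 1);
        StringBuilder order = new StringBuilder();
        for (int i = 0; i < 7; i++) {
            order.append(lb.next()).append(' ');
        }
        System.out.println(order.toString().trim()); // 0 0 1 0 2 0 0
    }
}
```

Note that the rotation ignores how busy each provider is, which is exactly why a slow machine keeps receiving its share and requests pile up on it.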

3) LeastActive LoadBalance

  • Least active calls: the provider with the fewest active calls is chosen, and providers with the same active count are chosen among at random. The active count is the difference between the counts taken before and after calls, that is, the number of calls still in flight.
  • Slower providers receive fewer requests, because the slower a provider is, the larger the difference between its before and after counts.
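
The active-count idea can be sketched like this. It is a minimal illustration that assumes at least one provider and ignores weights; LeastActiveSketch and its begin/end methods are hypothetical names, not Dubbo's API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.concurrent.atomic.AtomicInteger;

public class LeastActiveSketch {
    private final AtomicInteger[] active; // calls currently in flight per provider
    private final Random random = new Random();

    public LeastActiveSketch(int providerCount) {
        active = new AtomicInteger[providerCount];
        for (int i = 0; i < providerCount; i++) {
            active[i] = new AtomicInteger();
        }
    }

    /** Picks a provider with the fewest in-flight calls; ties are broken randomly. */
    public int select() {
        int min = Integer.MAX_VALUE;
        List<Integer> candidates = new ArrayList<>();
        for (int i = 0; i < active.length; i++) {
            int count = active[i].get();
            if (count < min) {
                min = count;
                candidates.clear();
            }
            if (count == min) {
                candidates.add(i);
            }
        }
        return candidates.get(random.nextInt(candidates.size()));
    }

    /** Count before the call. */
    public void begin(int provider) {
        active[provider].incrementAndGet();
    }

    /** Count after the call; begin minus end is the active number. */
    public void end(int provider) {
        active[provider].decrementAndGet();
    }

    public static void main(String[] args) {
        LeastActiveSketch lb = new LeastActiveSketch(3);
        lb.begin(0); // provider 0 is slow: two calls still running
        lb.begin(0);
        lb.begin(1); // provider 1 has one call running
        System.out.println(lb.select()); // prints 2, the only idle provider
    }
}
```

A caller would wrap each invocation in begin and end (ideally with try/finally), so a provider that responds slowly keeps a high count and is picked less often.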

4) ConsistentHash LoadBalance

  • Consistent hash: requests with the same parameters are always sent to the same provider. (If what you need is not random load balancing but routing one type of request to one node, use this consistent hash strategy.)
  • When a provider goes down, the requests originally sent to it are spread across the other providers via virtual nodes, without causing drastic changes.
  • By default only the first parameter is hashed; to change this, configure <dubbo:parameter key="hash.arguments" value="0,1" />
  • 160 virtual nodes are used by default; to change this, configure <dubbo:parameter key="hash.nodes" value="320" />
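
A hash ring with virtual nodes can be sketched with a sorted map. This is a minimal illustration of the technique, not Dubbo's ConsistentHashLoadBalance source: the class name, the "#i" virtual-node naming, and the use of the first four MD5 bytes as the ring position are assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.Collection;
import java.util.Map;
import java.util.TreeMap;

public class ConsistentHashSketch {
    // ring position -> provider address
    private final TreeMap<Long, String> ring = new TreeMap<>();

    public ConsistentHashSketch(Collection<String> providers, int virtualNodes) {
        for (String provider : providers) {
            for (int i = 0; i < virtualNodes; i++) {
                // Each provider appears on the ring many times (virtual nodes).
                ring.put(hash(provider + "#" + i), provider);
            }
        }
    }

    /** Maps a request key (for example the first argument) to a provider. */
    public String select(String key) {
        // First virtual node clockwise from the key, wrapping around at the end.
        Map.Entry<Long, String> entry = ring.ceilingEntry(hash(key));
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    /** Ring position: the first four bytes of the MD5 digest as an unsigned value. */
    static long hash(String text) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(text.getBytes(StandardCharsets.UTF_8));
            long value = 0;
            for (int i = 0; i < 4; i++) {
                value = (value << 8) | (digest[i] & 0xFF);
            }
            return value;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        ConsistentHashSketch lb = new ConsistentHashSketch(
                Arrays.asList("10.0.0.1:20880", "10.0.0.2:20880", "10.0.0.3:20880"), 160);
        // The same key always lands on the same provider.
        System.out.println(lb.select("user-42").equals(lb.select("user-42"))); // true
    }
}
```

When a provider is removed, only the keys that pointed at its virtual nodes move to the next node on the ring; keys owned by the remaining providers stay where they were.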

3. Configuration method

xml configuration method

Server service level

<dubbo:service interface="..." loadbalance="roundrobin" />

Client service level

<dubbo:reference interface="..." loadbalance="roundrobin" />

Server method level

<dubbo:service interface="..."> 
	<dubbo:method name="..." loadbalance="roundrobin"/> 
</dubbo:service>

Client method level

<dubbo:reference interface="..."> 
	<dubbo:method name="..." loadbalance="roundrobin"/> 
</dubbo:reference>

Annotation configuration method:

Consumer-side, service-level configuration using annotations:

@Reference(loadbalance = "roundrobin") 
HelloService helloService;

4. ZooKeeper outage and Dubbo direct connection

What happens when ZooKeeper goes down, and how Dubbo direct connection works, are frequently asked about in interviews, so pay attention to them.

In actual production, if the ZooKeeper registry goes down, service consumers can still call the providers' services for a period of time. They do so using the local cache, which is just one manifestation of Dubbo's robustness.

How Dubbo's robustness shows:

  1. If the monitoring center goes down, use is not affected; only some sampling data is lost
  2. If the database goes down, the registry can still serve service list queries from its cache, but cannot register new services
  3. The registry runs as a peer-to-peer cluster; if any node goes down, clients automatically switch to another
  4. If the registry is completely down, service providers and service consumers can still communicate using the local cache
  5. Service providers are stateless; if any one of them goes down, use of the service is not affected
  6. If all service providers go down, the service consumer application cannot use the service and will retry the connection indefinitely, waiting for a provider to recover
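
Points 4 and 5 can be pictured with a small sketch: the consumer only needs its locally cached provider list, and if one provider fails it tries the next. This is a simplified illustration, not Dubbo's cluster code; FailoverSketch, the addresses, and the simulated failure are assumptions.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

public class FailoverSketch {
    /**
     * Tries each cached provider address in turn and returns the first successful
     * result. The registry is not consulted at all during the call.
     */
    public static <R> R invoke(List<String> cachedProviders, Function<String, R> call) {
        RuntimeException lastFailure = null;
        for (String address : cachedProviders) {
            try {
                return call.apply(address);
            } catch (RuntimeException e) {
                lastFailure = e; // this provider is down, try the next one
            }
        }
        throw new IllegalStateException("no provider available", lastFailure);
    }

    public static void main(String[] args) {
        // Hypothetical addresses that were cached before the registry went down.
        List<String> cached = Arrays.asList("10.0.0.1:20880", "10.0.0.2:20880");
        String result = invoke(cached, address -> {
            if (address.startsWith("10.0.0.1")) {
                throw new RuntimeException("connection refused"); // simulated outage
            }
            return "reply from " + address;
        });
        System.out.println(result); // reply from 10.0.0.2:20880
    }
}
```

Only when every cached provider fails does the call itself fail, which corresponds to point 6.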

As mentioned earlier, the registry is responsible for registering and looking up service addresses, which makes it equivalent to a directory service. Service providers and consumers only interact with the registry at startup, and the registry does not forward requests, so the load on it is light. Therefore we can bypass the registry entirely and use Dubbo direct connection, that is, configure the service provider's address on the service consumer.

xml configuration method:

<dubbo:reference id="userService" interface="com.zang.gmall.service.UserService" url="dubbo://localhost:20880" />

Annotation method:

@Reference(url = "dubbo://127.0.0.1:20880")
HelloService helloService;


Origin blog.csdn.net/Java_Caiyo/article/details/112168553