Distributed topic | dubbo quick start + six pictures + interview key points

Two common registries in dubbo

dubbo currently supports registries such as zookeeper, redis, consul, etcd3, and eureka. This section covers the two most common ones: redis and zookeeper.

Redis Registry

With redis as the registry, dubbo mainly relies on redis's map (hash) data structure and its publish/subscribe feature.

  • What exactly does dubbo store in the redis map?

    • The redis key is the service name plus the type, for example: "/dubbo/lezai.dubbo.server.UserService/providers"
    • Each key inside the map is the URL address of a dubbo service
    • Each value inside the map is the expiration time of that URL, and the entry is deleted after it expires. A heartbeat normally refreshes this time, so stale (dirty) data left behind by dead services gets cleaned up
  • How does dubbo use redis publish/subscribe to achieve service registration and deregistration?

    • Event types are distinguished by the event value: register, unregister
    • Consumers subscribe to the key that the provider registered under, using that key as the channel (topic). When a provider goes down or comes back, an unregister or register event is published on this channel, and the consumer re-fetches the provider list and subscribes again
  • Call process

    1. When the service provider starts, it adds its own address under Key:/dubbo/com.lezai.userService/providers
    2. It then sends a register event to Channel:/dubbo/com.lezai.userService/providers
    3. When the service consumer starts, it subscribes to register and unregister events on Channel:/dubbo/com.lezai.userService/providers
    4. It then adds its own address under Key:/dubbo/com.lezai.userService/consumers
    5. After the service consumer receives a register or unregister event, it fetches the provider address list from Key:/dubbo/com.lezai.userService/providers
    6. When the service monitoring center starts, it subscribes to register and unregister events, as well as subscribe and unsubscribe events, on Channel:/dubbo/*
    7. After receiving a register or unregister event, the service monitoring center fetches the provider address list from Key:/dubbo/com.lezai.userService/providers
    8. After receiving a subscribe or unsubscribe event, the service monitoring center fetches the consumer address list from Key:/dubbo/com.lezai.userService/consumers
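  • When the service provider suddenly goes down, can the status be changed immediately?
    Not immediately. dubbo maintains a heartbeat between the registry, consumers and providers, and the validity period is refreshed every 30 seconds, so a provider that crashes without sending unregister is only noticed once its entry stops being refreshed and expires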
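
The map layout and expiry handling described in this section can be sketched with plain Java collections standing in for redis. This is a hedged illustration only: RedisRegistrySketch, its method names, and the time-to-live parameter are assumptions made for the example, and no real redis client or dubbo class is involved.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory stand-in for the redis layout described above (hypothetical, not dubbo code).
class RedisRegistrySketch {
    // Outer key: "/dubbo/<service>/providers". Inner map: provider URL -> expiry time in millis.
    private final Map<String, Map<String, Long>> hashes = new HashMap<>();

    // Register a provider URL, or refresh its expiry when a heartbeat arrives.
    void register(String key, String url, long nowMillis, long ttlMillis) {
        hashes.computeIfAbsent(key, k -> new HashMap<>()).put(url, nowMillis + ttlMillis);
    }

    // Remove a provider URL explicitly (what an "unregister" event announces).
    void unregister(String key, String url) {
        Map<String, Long> urls = hashes.get(key);
        if (urls != null) {
            urls.remove(url);
        }
    }

    // Return the URLs that are still valid, dropping expired (dirty) entries on the way.
    List<String> lookup(String key, long nowMillis) {
        Map<String, Long> urls = hashes.getOrDefault(key, new HashMap<>());
        urls.values().removeIf(expiry -> expiry <= nowMillis);
        return new ArrayList<>(urls.keySet());
    }
}
```

For example, a provider that registers at time 0 with a 30 000 ms time-to-live and never sends a heartbeat no longer appears in lookup(...) at time 40 000, while a provider that refreshed itself at time 20 000 still does.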

Zookeeper Registry

Compared with redis, zookeeper is more flexible as a registry. It can implement all the required functions using zookeeper's watch mechanism, temporary (ephemeral) nodes, and tree structure. Let's take a look at the node tree that dubbo creates in zookeeper:
[Figure: node tree created by dubbo in zookeeper]

  • Service provider (provider)
    When it starts, the provider creates a child node under the providers node of the service node (com.lezau.UserServie) under the dubbo node in zookeeper, and writes its own URL address into it. The path (directory) is /dubbo/com.lezau.UserServie/providers/, and every child node under this path is a service provider. These nodes are all temporary nodes. Because the life cycle of a temporary node is tied to the client session, once the machine hosting the provider fails and can no longer serve requests, its temporary node is deleted from zookeeper.

  • Service consumer (consumer)
    When it starts, the consumer subscribes to the provider URL addresses under the /dubbo/com.lezau.UserServie/providers/ path (directory). It also creates a temporary child node under the /dubbo/com.lezau.UserServie/consumers/ path (directory) and writes its own URL address into it. Every child node under this path is a service consumer.

  • Registry
    Because the registry keeps long-lived connections with service providers and consumers, it can sense when a provider goes down and notify the consumers. The monitoring center is an important part of the dubbo service governance system and needs to know about every change to providers and consumers. When it starts, it registers a watcher on the service node (com.lezau.UserServie), i.e. on the /dubbo/com.lezau.UserServie/ path (directory), to watch for changes to its child nodes. In other words, it subscribes to all provider and consumer URL addresses under that path, so it can also detect when a provider goes down.

  • Characteristics
    A notable feature of dubbo's zookeeper registry is the node structure it uses: the service name and type, i.e. /dubbo/com.lezau.UserServie/type, form the node path (directory). This matches dubbo's subscription and notification needs and keeps change notifications scoped to a single service, so the notification range is easy to control. Even if providers and consumers change frequently, the impact on zookeeper's performance stays small.

Load balancing algorithms supported by the invocation module

The load balancing algorithms supported by dubbo include random, round robin (polling), least active, and consistent hash. This section covers two of them, random and consistent hash, which are also the two that interviewers almost always ask about:

random

Definition: Pick a provider at random, with the probability set by weight. Dubbo uses this algorithm by default.
Implementation idea:
If the weights of a group of service providers are 1, 10, and 6, how do we ensure that the second machine has the highest hit probability?
[Figure: weighted random selection]
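
The weight question above can be sketched in Java as follows. This is a minimal illustration rather than dubbo's source: the class and method names (WeightedRandomSketch, select, indexFor) are made up for this example. It lays the weights 1, 10 and 6 end to end on a line of length 17, draws a uniform offset on that line, and returns the provider whose segment contains the offset, so the second provider is hit with probability 10/17.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical sketch of weight-based random selection.
class WeightedRandomSketch {
    // Returns the index of the chosen provider; index i is picked with probability weights[i] / total.
    static int select(int[] weights, Random random) {
        int total = 0;
        for (int w : weights) {
            total += w;
        }
        int offset = random.nextInt(total); // uniform in [0, total)
        return indexFor(weights, offset);
    }

    // Maps an offset in [0, total) to the provider whose weight segment contains it.
    static int indexFor(int[] weights, int offset) {
        for (int i = 0; i < weights.length; i++) {
            offset -= weights[i];
            if (offset < 0) {
                return i;
            }
        }
        throw new IllegalArgumentException("offset out of range");
    }

    public static void main(String[] args) {
        int[] weights = {1, 10, 6};
        int[] hits = new int[weights.length];
        Random random = new Random();
        for (int i = 0; i < 17000; i++) {
            hits[select(weights, random)]++;
        }
        // The counts should come out roughly in the ratio 1 : 10 : 6.
        System.out.println(Arrays.toString(hits));
    }
}
```

With these weights, offset 0 belongs to the first provider, offsets 1 to 10 to the second, and offsets 11 to 16 to the third.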

What is a consistent hash?

  • What problems does consistent hashing solve?
    • Data distribution: spreading data (requests) across machines
    • A better way to handle machine expansion or downtime, so that not everything has to be re-hashed
  • Principle of the consistent hash algorithm
    [Figure: consistent hash ring]
  1. First set up 2^31 nodes (positions on a ring)
  2. Hash the identifier of each machine and take the result modulo 2^31 to get the machine's position among these 2^31 nodes
  3. When a user initiates a request, hash the request identifier, also take it modulo 2^31, and get a position
  4. Search clockwise from this position; the first machine node found is the hit and handles the request
  5. As shown in the figure, the request is handed to NodeA for processing

[Figure: consistent hash ring after a machine is added or goes down]

  1. If a machine is added or goes down at this point, only part of the data is affected. As shown in the figure above, the request is now handed to NodeB for processing. Only the requests of the D and A nodes are affected

Summary: this approach avoids the global re-hash that a machine failure or a newly added machine would otherwise cause

At this point you may wonder: if I have only a few machines but as many as 2^31 nodes, won't a lot of data land on the same machine? This is the data skew problem. Interviewers are sure to ask about it, so let's see how dubbo's author solved it:

  • How to solve the problem of data skew:
    • Users can set the total number of nodes themselves instead of using the default 2^31, depending on the business situation
    • Virtual node mapping:
      If there are 100 real machines and the total number of nodes is set to 2^31, the consistent hash algorithm creates 100 or more virtual machines (virtual nodes), scatters them across different positions, and points each of them back to one of the 100 real machines.
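
The ring lookup and the virtual node mapping described above can be sketched in Java with a sorted map. This is an illustrative sketch under stated assumptions, not dubbo's implementation: the class name ConsistentHashSketch, the "#i" suffix used to name virtual nodes, and the use of CRC32 as the hash function are all choices made for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.CRC32;

// Hypothetical consistent hash ring with virtual nodes.
class ConsistentHashSketch {
    private static final long RING_SIZE = 1L << 31;               // 2^31 positions, as in the text
    private final TreeMap<Long, String> ring = new TreeMap<>();   // position -> real machine
    private final int virtualNodes;                               // virtual nodes per real machine

    ConsistentHashSketch(int virtualNodes) {
        this.virtualNodes = virtualNodes;
    }

    // CRC32 is only a stand-in hash for this sketch.
    static long position(String id) {
        CRC32 crc = new CRC32();
        crc.update(id.getBytes(StandardCharsets.UTF_8));
        return crc.getValue() % RING_SIZE;
    }

    // Place the machine's virtual nodes on the ring; each one points back to the real machine.
    void addNode(String node) {
        for (int i = 0; i < virtualNodes; i++) {
            ring.putIfAbsent(position(node + "#" + i), node);
        }
    }

    // Remove only the positions that this machine actually owns.
    void removeNode(String node) {
        for (int i = 0; i < virtualNodes; i++) {
            ring.remove(position(node + "#" + i), node);
        }
    }

    // Walk clockwise: take the first position at or after the request, wrapping to the ring start.
    String select(String requestId) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("no nodes on the ring");
        }
        Map.Entry<Long, String> entry = ring.ceilingEntry(position(requestId));
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }
}
```

Removing a machine only reassigns the requests that used to land on it. Requests that mapped to the other machines keep their assignment, which is the property summarized above.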

Fault tolerance strategies supported by the invocation module

  1. Automatic switching on failure (failover): after a call fails, retry on other servers, as configured by the retries="2" attribute. This is the default fault tolerance strategy
  2. Fast failure (failfast): make only one call and report an error immediately if it fails
  3. Ignore failure (failsafe): ignore the failure and do not throw an exception to the client
  4. Failure retry (failback): recover from failure automatically by recording failed requests in the background and resending them periodically. Usually used for message notification operations
  5. Parallel call (forking): call the specified number of machines in parallel and return as soon as one succeeds. The maximum parallelism is set by forks="2"
  6. Broadcast call (broadcast): call all providers one by one and report an error if any one of them fails
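
The first strategy in the list can be sketched as a small retry loop. This is a simplified illustration, not dubbo's cluster code: FailoverSketch and its invoke method are hypothetical, providers are modeled as plain functions, and the next provider is taken in list order rather than chosen by a load balancer.

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch of "automatic switching on failure".
class FailoverSketch {
    // Makes the first call plus up to `retries` retries, each on the next provider in the list.
    static <T> T invoke(List<Function<String, T>> providers, String request, int retries) {
        RuntimeException last = null;
        int attempts = Math.min(retries + 1, providers.size());
        for (int i = 0; i < attempts; i++) {
            try {
                return providers.get(i).apply(request);
            } catch (RuntimeException e) {
                last = e; // remember the failure and switch to the next provider
            }
        }
        throw new IllegalStateException("all " + attempts + " attempts failed", last);
    }
}
```

Calling invoke with retries set to 0 makes a single attempt and fails at once, which is close in spirit to the fast failure strategy.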

How does Dubbo prevent service links from being stolen?

A consumer gets the address link of the target service from zookeeper (dubbo://xxx) and then calls the target service directly. If an unauthorized user obtains this address, couldn't they call the service directly too? How can the link be protected from theft?
dubbo provides a token mechanism to protect the link from being stolen:

[Figure: dubbo token mechanism]
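
A token check of this kind can be sketched in plain Java with no dubbo dependency. All names here (TokenSketch, exportToRegistry, handle) are hypothetical, and the flow is an assumption for illustration: the provider generates a random token and publishes it to the registry together with its address, a consumer that goes through the registry receives the token and attaches it to each call, and the provider rejects calls whose token does not match.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical sketch of a provider that only accepts calls carrying its token.
class TokenSketch {
    // Random token generated by the provider when it exports the service.
    private final String token = UUID.randomUUID().toString();

    // What the provider publishes to the registry: its address plus the token.
    Map<String, String> exportToRegistry(String address) {
        Map<String, String> entry = new HashMap<>();
        entry.put("address", address);
        entry.put("token", token);
        return entry;
    }

    // Provider-side check: reject calls whose attached token is missing or wrong.
    String handle(Map<String, String> attachments, String request) {
        if (!token.equals(attachments.get("token"))) {
            throw new SecurityException("invalid token");
        }
        return "handled:" + request;
    }
}
```

A caller that only knows the dubbo://xxx address, but never read the registry entry, has no token to attach and is rejected.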

Do you know what dubbo's generalized (generic) implementation and generalized reference are?

Many candidates are asked in interviews what generalization is, and many of them first think the question is about generics!

  • Generalized implementation (provision)
    On the service provider side, this means exposing a service directly without a dedicated interface. It is usually used to implement Mock frameworks or service degradation frameworks. GenericService can be used in the code for this
  • Generalized reference
    Usually used on the consumer side, this means referencing a service without the conventional interface. It is typically used in testing frameworks. The consumer does not need to depend on the interface provided by the server, and calls the service directly by its full class name
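    // "reference" is assumed to be a ReferenceConfig<GenericService>
    // weakly typed interface name
    reference.setInterface("com.tuling.teach.service.DemoService");
    // declare the reference as a generic (generalized) interface
    reference.setGeneric(true);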

How to implement dubbo call chain tracking?

Dubbo provides implicit parameters to support call chain tracking. They are passed from the consumer to the provider along with the call. They are set and read as follows:

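// consumer side: implicit parameter passing, carried by the subsequent remote call
RpcContext.getContext().setAttachment("index", "1");
// provider side: read the implicit parameter
String index = RpcContext.getContext().getAttachment("index");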

During a call, dubbo maintains a thread-local map to store these parameters.
However, if you do not call the target service directly but go through an intermediate service that then calls the target service, the target service cannot get the parameter. For example:
A sets a parameter and calls C, and then C calls B. C can get the parameter, but B cannot, unless C sets it again before calling B.
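
The A, C, B example can be reproduced with a toy model. This sketch only models the behavior as described in this article and is not dubbo's RpcContext: the class AttachmentSketch, the two thread-local maps, and the call helper that simulates one remote hop on the same thread are all assumptions for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Toy model of implicit parameters that are consumed by a single remote call.
class AttachmentSketch {
    // Attachments the current caller has set for its next remote call.
    private static final ThreadLocal<Map<String, String>> OUTGOING =
            ThreadLocal.withInitial(HashMap::new);
    // Attachments received by the service that is currently handling a call.
    private static final ThreadLocal<Map<String, String>> INCOMING =
            ThreadLocal.withInitial(HashMap::new);

    static void setAttachment(String key, String value) {
        OUTGOING.get().put(key, value);
    }

    static String getAttachment(String key) {
        return INCOMING.get().get(key);
    }

    // Simulates one remote hop: hand the outgoing attachments to the callee, then clear them.
    static <T> T call(Supplier<T> service) {
        Map<String, String> callerIncoming = INCOMING.get();
        INCOMING.set(new HashMap<>(OUTGOING.get()));
        OUTGOING.get().clear(); // attachments are used up by this one call
        try {
            return service.get();
        } finally {
            INCOMING.set(callerIncoming); // back in the caller's context
        }
    }
}
```

In this model, if A sets "index" and calls C, C reads "1". When C then calls B without setting anything, B reads null. If C first calls setAttachment("index", getAttachment("index")), B reads "1" as well.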

dubbo call process

[Figure: dubbo call process]


Origin blog.csdn.net/weixin_34311210/article/details/111053457