Summary of microservice interview knowledge (Nacos, MQ)



1. Microservices: Nacos


1.1 Common components of Spring Cloud

Spring Cloud contains many components, some with overlapping functionality. The most commonly used include:

  • Registry components: Eureka, Nacos, etc.
  • Load balancing component: Ribbon
  • Remote call component: OpenFeign
  • Gateway components: Zuul, Gateway
  • Service protection components: Hystrix, Sentinel
  • Service configuration management components: Spring Cloud Config, Nacos

1.2 Nacos service registry structure

  • Nacos adopts a hierarchical data storage model. The outermost layer is the Namespace, used to isolate environments; next is the Group, used to group services; then comes the Service. A service contains multiple instances that may sit in different data centers, so a Service contains multiple Clusters, and each Cluster contains the Instances.

  • In the Java code, Nacos represents this with nested Maps. The structure is Map<String, Map<String, Service>>: the key of the outer Map is the namespaceId and its value is another Map, whose key is the group concatenated with the serviceName and whose value is the Service object. Inside the Service object is another Map whose key is the cluster name and whose value is the Cluster object; the Cluster object maintains the collection of Instances.
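The layered structure described above can be sketched in plain Java. This is an illustrative reconstruction, not the actual Nacos source; the class names mirror the concepts, and the `group + "@@" + serviceName` key concatenation is an assumption about the inner-map key format.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Illustrative model of Nacos's registry layers: Namespace -> Group/Service
// -> Cluster -> Instance. Not the real Nacos classes.
class Instance {
    final String ip;
    final int port;
    Instance(String ip, int port) { this.ip = ip; this.port = port; }
}

class Cluster {
    // the instances registered under this cluster
    final List<Instance> instances = new CopyOnWriteArrayList<>();
}

class Service {
    // key: cluster name -> Cluster
    final Map<String, Cluster> clusterMap = new ConcurrentHashMap<>();
}

class Registry {
    // outer key: namespaceId; inner key: group + "@@" + serviceName
    final Map<String, Map<String, Service>> serviceMap = new ConcurrentHashMap<>();

    void register(String namespaceId, String group, String serviceName,
                  String clusterName, Instance instance) {
        serviceMap.computeIfAbsent(namespaceId, ns -> new ConcurrentHashMap<>())
                  .computeIfAbsent(group + "@@" + serviceName, k -> new Service())
                  .clusterMap
                  .computeIfAbsent(clusterName, c -> new Cluster())
                  .instances.add(instance);
    }
}
```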


1.3 How does Nacos handle the pressure of hundreds of thousands of service registrations

  • When Nacos receives a registration request, it does not write the data immediately. Instead, it puts the registration task into a blocking queue and responds to the client right away; a thread pool then reads tasks from the queue and completes the instance updates asynchronously, which improves concurrent write throughput.
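The queue-then-apply pattern above can be sketched with a blocking queue and a single worker thread. The names are illustrative, not Nacos code; the CountDownLatch exists only so the demo can wait for the asynchronous updates to finish.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch: registration requests are acknowledged immediately and applied
// later by a background worker draining the queue.
class AsyncRegistry {
    private final BlockingQueue<String> tasks = new LinkedBlockingQueue<>();
    private final List<String> instances = new CopyOnWriteArrayList<>();
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final CountDownLatch done;

    AsyncRegistry(int expectedUpdates) {
        done = new CountDownLatch(expectedUpdates);
        worker.submit(() -> {
            try {
                while (true) {
                    instances.add(tasks.take()); // blocks until a task arrives
                    done.countDown();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // shutdown requested
            }
        });
    }

    // the request thread: enqueue the task and return immediately
    void register(String instance) { tasks.offer(instance); }

    List<String> snapshot() {
        try { done.await(); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        worker.shutdownNow();
        return instances;
    }
}
```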

1.4 How Nacos avoids concurrent read-write conflicts

  • When Nacos updates an instance list, it uses the CopyOnWrite technique: it first copies the old instance list, updates the copy, and then replaces the old list with the updated copy. During the update, requests reading the instance list are unaffected and no dirty reads occur.
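The three steps above can be sketched as follows (illustrative, not Nacos source). Readers always see either the complete old list or the complete new one, never a half-updated list.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Copy-on-write instance list: lock-free reads, serialized writes.
class InstanceList {
    // volatile so readers immediately see the swapped-in reference
    private volatile List<String> instances = Collections.emptyList();

    List<String> read() { return instances; } // no locking needed

    synchronized void addInstance(String instance) {
        List<String> copy = new ArrayList<>(instances);  // 1. copy the old list
        copy.add(instance);                              // 2. update the copy
        instances = Collections.unmodifiableList(copy);  // 3. swap the reference
    }
}
```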

1.5 The difference between Nacos and Eureka

Nacos has similarities and differences with Eureka:

  • Interface style : Both Nacos and Eureka expose REST-style APIs for service registration, discovery, and related functions
  • Instance types : Nacos distinguishes permanent and temporary instances; Eureka only supports temporary instances
  • Health checks : Nacos uses heartbeat detection for temporary instances and actively probes permanent instances; Eureka only supports the heartbeat mode
  • Service discovery : Nacos supports both periodic pull and subscription push; Eureka only supports periodic pull

1.6 The difference between Sentinel's rate limiting and Gateway's rate limiting

  • There are three common rate-limiting algorithms: sliding time window, token bucket, and leaky bucket. Gateway uses a Redis-based token bucket algorithm, while Sentinel is more sophisticated internally:
    • The default rate-limiting mode is based on the sliding time window algorithm
    • The queue-waiting rate-limiting mode is based on the leaky bucket algorithm
    • Hotspot-parameter rate limiting is based on the token bucket algorithm

1.6.1 Fixed Window Counter Algorithm

  • Time is divided into windows; the window's time span is called the Interval, 1000 ms in this example;
  • Each window maintains a counter that is incremented on every request. Rate limiting means setting a threshold on this counter, 3 in this example.
  • Once the counter exceeds the threshold, requests beyond it are discarded.
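A minimal fixed-window counter matching the example's numbers (Interval = 1000 ms, threshold = 3). The timestamp is passed in explicitly so the logic can be exercised deterministically; this is a sketch, not any framework's implementation.

```java
// Fixed-window counter: the counter resets whenever a request falls into a
// new window; requests beyond the threshold in one window are rejected.
class FixedWindowLimiter {
    private final long intervalMs;
    private final int threshold;
    private long windowStart = 0;
    private int count = 0;

    FixedWindowLimiter(long intervalMs, int threshold) {
        this.intervalMs = intervalMs;
        this.threshold = threshold;
    }

    synchronized boolean tryPass(long nowMs) {
        if (nowMs - windowStart >= intervalMs) {      // entered a new window
            windowStart = nowMs - nowMs % intervalMs; // align to window boundary
            count = 0;
        }
        if (count < threshold) { count++; return true; }
        return false; // over the threshold: discard the request
    }
}
```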

1.6.2 Sliding window counter algorithm

The sliding window counter algorithm divides a window into n smaller intervals, for example:

  • The window time span Interval is 1 second; with n = 2 intervals, each small interval spans 500 ms
  • The threshold is still 3; when requests within the time window (1 second) exceed the threshold, the excess requests are limited
  • The window moves with the current request time (currentTime): it starts at the first small interval after (currentTime - Interval) and ends at the interval containing currentTime.
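A sketch of the sliding-window variant with the example's parameters (Interval = 1000 ms, n = 2, threshold = 3). Cells that slide out of the window are evicted before counting; this is illustrative logic, not Sentinel's implementation.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sliding-window counter: per-cell counters, summed over the window
// (now - Interval, now]; old cells are evicted as the window slides.
class SlidingWindowLimiter {
    private final long intervalMs, cellMs;
    private final int threshold;
    // each element: {cellStartMs, count}
    private final Deque<long[]> cells = new ArrayDeque<>();

    SlidingWindowLimiter(long intervalMs, int n, int threshold) {
        this.intervalMs = intervalMs;
        this.cellMs = intervalMs / n;
        this.threshold = threshold;
    }

    synchronized boolean tryPass(long nowMs) {
        long cellStart = nowMs - nowMs % cellMs;
        // evict cells that slid out of the window
        while (!cells.isEmpty() && cells.peekFirst()[0] <= nowMs - intervalMs) {
            cells.pollFirst();
        }
        long total = cells.stream().mapToLong(c -> c[1]).sum();
        if (total >= threshold) return false; // window is full: limit
        if (cells.isEmpty() || cells.peekLast()[0] != cellStart) {
            cells.addLast(new long[]{cellStart, 0});
        }
        cells.peekLast()[1]++;
        return true;
    }
}
```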

1.6.3 Token Bucket Algorithm

  • Tokens are generated at a fixed rate and stored in a token bucket; when the bucket is full, excess tokens are discarded
  • An incoming request must first obtain a token from the bucket, and it can only be processed once it has one
  • If the bucket has no tokens, the request either waits or is discarded.
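The token bucket can be sketched as below: tokens accrue lazily based on elapsed time, capped at the bucket capacity. The clock is passed in for deterministic testing; this is a sketch, not Gateway's Redis-based implementation.

```java
// Token bucket: refill at a fixed rate up to capacity; a request passes
// only if it can take a token.
class TokenBucket {
    private final long capacity;
    private final double tokensPerMs;
    private double tokens;
    private long lastRefillMs;

    TokenBucket(long capacity, double tokensPerSecond, long nowMs) {
        this.capacity = capacity;
        this.tokensPerMs = tokensPerSecond / 1000.0;
        this.tokens = capacity; // start with a full bucket
        this.lastRefillMs = nowMs;
    }

    synchronized boolean tryAcquire(long nowMs) {
        // refill according to elapsed time; overflow tokens are discarded
        tokens = Math.min(capacity, tokens + (nowMs - lastRefillMs) * tokensPerMs);
        lastRefillMs = nowMs;
        if (tokens >= 1) { tokens -= 1; return true; }
        return false; // no token: the request waits or is dropped
    }
}
```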
1.6.4 Leaky Bucket Algorithm

Leaky bucket algorithm description:

  • Treat each request as a "drop of water" stored in a "leaky bucket";
  • The "leaky bucket" "leaks" requests out for execution at a fixed rate; if the bucket is empty, the leaking stops;
  • If the "leaky bucket" is full, excess "drops" are discarded directly.
  • In effect, requests queue up inside the bucket.

When implementing the leaky bucket, Sentinel adopts a queue-waiting mode:

  • All requests enter a queue and are executed in order at the interval allowed by the threshold. Concurrent requests must wait, and a request's expected wait time = the expected pass time of the most recently queued request + the allowed interval. If the expected wait exceeds the maximum allowed time, the request is rejected.
  • For example: QPS = 5 means one queued request is processed every 200 ms; timeout = 2000 means a request expected to wait more than 2000 ms is rejected and an exception is thrown.
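The queue-waiting calculation can be sketched as below, using the example's numbers (QPS = 5, so one request every 200 ms; timeout = 2000 ms). This is illustrative logic, not Sentinel's actual RateLimiterController.

```java
// Queue-waiting leaky bucket: each request is scheduled one interval after
// the previously queued request; requests that would wait too long are rejected.
class QueueWaitLimiter {
    private final long intervalMs; // 1000 / QPS
    private final long timeoutMs;
    private long latestPassTime;   // expected pass time of the last queued request

    QueueWaitLimiter(int qps, long timeoutMs) {
        this.intervalMs = 1000L / qps;
        this.timeoutMs = timeoutMs;
        this.latestPassTime = -this.intervalMs; // so the first request passes at once
    }

    /** Returns the expected wait in ms, or -1 if the request is rejected. */
    synchronized long tryPass(long nowMs) {
        long expected = Math.max(latestPassTime + intervalMs, nowMs);
        if (expected - nowMs > timeoutMs) return -1; // would wait too long
        latestPassTime = expected;
        return expected - nowMs; // caller sleeps this long, then proceeds
    }
}
```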

1.7 The difference between Sentinel's thread isolation and Hystrix's thread isolation

  • By default, Hystrix implements isolation with thread pools: each isolated business needs its own thread pool, and too many threads bring extra CPU overhead. Performance is average, but the isolation is stronger.
  • Sentinel implements isolation with a semaphore (counter): no thread pool is needed, so performance is better, but the isolation is weaker.
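The semaphore approach can be sketched in a few lines: at most N threads may enter the protected call at once, and extra calls fail fast to a fallback instead of queueing on a dedicated thread pool. Illustrative code, not Sentinel's source.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Semaphore-based isolation: limit concurrent entries into the protected
// business call; reject (fall back) when no permit is available.
class SemaphoreIsolation {
    private final Semaphore permits;

    SemaphoreIsolation(int maxConcurrency) { permits = new Semaphore(maxConcurrency); }

    <T> T call(Supplier<T> business, Supplier<T> fallback) {
        if (!permits.tryAcquire()) {
            return fallback.get(); // isolated resource exhausted: fail fast
        }
        try {
            return business.get();
        } finally {
            permits.release();
        }
    }
}
```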

2. MQ knowledge

2.1 Reason for choosing RabbitMQ

  • Comparison of various message queue technologies:
    insert image description here
  • Kafka is famous for high throughput, but its data reliability is average, and it only guarantees message order within a partition, not globally.
  • Alibaba's RocketMQ builds on Kafka's design, addressing some of Kafka's shortcomings while inheriting its high throughput; its client support is currently mainly Java.
  • RabbitMQ is built on Erlang, a language designed for concurrency. Its throughput is lower than Kafka's but sufficient for our company. It offers good message reliability, extremely low message latency, and easier cluster setup; it supports multiple protocols and has clients in many languages, which makes it flexible. Spring's support for RabbitMQ is also good, making it convenient to use.

2.2 How does RabbitMQ ensure that messages are not lost

RabbitMQ provides targeted solutions for various places where problems may occur during message delivery:

  • When the producer sends a message, it may fail to reach the exchange due to network problems :
    • RabbitMQ provides the publisher confirm mechanism
      • The producer can register a ConfirmCallback function when sending a message
      • When the message successfully reaches the exchange, RabbitMQ calls the ConfirmCallback to notify the sender, returning an ACK
      • If the message does not reach the exchange, RabbitMQ also calls the ConfirmCallback, returning a NACK
      • If the message is still unconfirmed after the timeout, an exception is thrown
  • After the message reaches the exchange, it may still fail to be routed to a queue, in which case the message is also lost :
    • RabbitMQ provides the publisher return mechanism
      • The producer can define a ReturnCallback function
      • When a message reaches the exchange but cannot be routed to any queue, RabbitMQ calls the ReturnCallback to notify the sender of the failure reason
  • After the message arrives in the queue, MQ downtime may also cause the message to be lost :
    • RabbitMQ provides persistence and cluster master-slave backup
      • Message persistence: RabbitMQ persists exchanges, queues, and messages to disk, so messages can be recovered after a restart
      • Both mirrored clusters and quorum queues provide master-slave backup; when the master node goes down, a slave node is automatically promoted to master, and the data remains intact
  • After the message is delivered to the consumer, improper handling by the consumer may also cause message loss :
    • On top of RabbitMQ, SpringAMQP provides a consumer confirmation mechanism, a consumer retry mechanism, and consumer failure handling strategies:
      • Consumer confirmation mechanism:
        • If the consumer processes the message successfully with no exception, Spring returns an ACK to RabbitMQ and the message is removed
        • If the consumer fails to process the message, throws an exception, or crashes, Spring returns a NACK (or returns nothing), and the message is not removed and will be redelivered
      • Consumer retry mechanism:
        • By default, when a consumer fails, the message is requeued in MQ and then delivered again, possibly to another consumer. With Spring's consumer retry mechanism, a failure does not return a NACK; instead the message is retried locally on the consumer. After multiple retries fail, the message is handled according to the consumer failure handling strategy. This avoids the extra pressure of messages frequently re-entering the queue.
      • Consumer failure strategies:
        • When a consumer's local retries are exhausted, the message is discarded by default.
        • Spring also provides a republish strategy: after the retries are exhausted, the message is redelivered to a designated error exchange, carrying the exception stack trace to help locate the problem.
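Assuming Spring Boot with SpringAMQP, the local retry behavior described above is typically switched on through configuration like the following (the values are illustrative; the republish strategy additionally requires registering a RepublishMessageRecoverer bean that targets the error exchange):

```yaml
spring:
  rabbitmq:
    listener:
      simple:
        acknowledge-mode: auto    # ACK on success, NACK on exception
        retry:
          enabled: true           # retry locally instead of requeueing
          max-attempts: 3         # initial attempt plus two retries
          initial-interval: 1000ms
          multiplier: 2           # double the wait between attempts
```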

2.3 How RabbitMQ avoids message accumulation

The root cause of message accumulation is usually that messages are produced faster than consumers can process them. The solutions therefore come down to the following three points:

  • Improve consumer processing speed
  • Add more consumers
  • Increase the upper limit of queue message storage
  1. Improve consumer processing speed
  • The processing speed of consumers is determined by the business code, so what we can do includes:
    • Optimize the business code as much as possible to improve performance
    • After receiving a message, use a thread pool to process multiple messages concurrently
  • Advantages: low cost, only code changes are needed
  • Disadvantages: enabling a thread pool brings extra performance overhead, so it is not suitable for high-frequency, low-latency tasks; it is recommended for tasks with a long execution time
  2. Add more consumers
  • Binding multiple consumers to one queue lets them compete for tasks, which naturally increases message processing speed.
  • Advantages: simple and effective; a problem that can be solved with money is not a real problem
  • Disadvantages: the cost is high
  3. Increase the upper limit of queue message storage
  • Since version 3.6, RabbitMQ offers a new queue mode: the Lazy Queue. Instead of keeping messages in memory, it writes them directly to disk on receipt, so there is theoretically no storage limit, which solves the message accumulation problem.
  • Advantages: safer disk storage; practically unlimited capacity; avoids the Page Out problems caused by in-memory storage, so performance is more stable
  • Disadvantages: disk storage is limited by IO performance, and message latency is higher than in memory mode, though the impact is usually small.

2.4 RabbitMQ guarantees the order of messages

  • RabbitMQ queues are naturally first-in-first-out, so as long as messages are sent in order, they are theoretically received in order. However, when multiple consumers are bound to one queue, messages may be delivered to them round-robin, and the processing order across consumers cannot be guaranteed. Therefore, to ensure message ordering:
    • Guarantee the order in which messages are sent
    • Ensure that a set of ordered messages is sent to the same queue
    • Ensure that the queue has only one consumer

2.5 Prevent MQ messages from being repeatedly consumed

  • The causes of repeated message consumption are varied and unavoidable, so we can only address it on the consumer side: as long as message processing is idempotent, duplicate deliveries do no harm. There are several ways to guarantee idempotence:
    • Attach a unique id to each message, record it in a local message table together with a status, and check the table by the unique id when processing a message
    • Likewise record a message table, and use the message status field with optimistic locking to guarantee idempotence
    • Rely on the idempotence of the business itself. For example, delete-by-id and query operations are inherently idempotent; insert and update operations can rely on database unique constraints or optimistic locking to guarantee idempotence, which is essentially similar to the message-table scheme.
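The message-table idea above can be sketched with an in-memory map standing in for the database table (in a real system this would be a table with a unique index on the message id, so the check survives restarts and works across consumer instances):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Idempotent consumer sketch: each message carries a unique id, and a
// message is processed only if its id has not been recorded yet.
class IdempotentConsumer {
    private final Map<String, String> processed = new ConcurrentHashMap<>();

    /** Returns true if the message was processed, false if it was a duplicate. */
    boolean handle(String messageId, Runnable business) {
        // putIfAbsent is atomic: only the first delivery of an id wins
        if (processed.putIfAbsent(messageId, "DONE") != null) {
            return false; // duplicate delivery: skip the business logic
        }
        business.run();
        return true;
    }
}
```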

2.6 Ensure high availability of RabbitMQ

  • High availability of RabbitMQ comes down to the following two points:
    • Persist exchanges, queues, and messages properly
    • Build a mirrored RabbitMQ cluster with master-slave backup; alternatively, use quorum queues instead of a mirrored cluster.

2.7 What problems can be solved by using MQ?

  • Problems that RabbitMQ can solve:

  • Decoupling: Replacing synchronous calls between business-related microservices with MQ-based asynchronous notifications removes the coupling between them and also improves performance.

  • Traffic peak shaving: Sudden bursts of requests are buffered in MQ; back-end services consume messages at their own pace and process tasks one by one, so the load curve becomes much smoother.

  • Delayed messaging: Based on RabbitMQ's dead-letter queues or the DelayExchange plugin, a message's delivery can be delayed after it is sent.



Origin blog.csdn.net/yang2330648064/article/details/130319517