Service discovery mechanism and service registration center

Author: Zen and the Art of Computer Programming

1. Introduction

Service discovery and the service registry are important parts of a microservice architecture. By centrally managing service information and providing a service catalog, they let microservices discover and call each other, enabling cross-service communication. This article first briefly introduces the service discovery mechanism and its basic concepts, then describes two ways of implementing service discovery: static configuration and dynamic monitoring. It then covers the functions and role of the service registration center and its key operations, including service registration, logout (deregistration), subscription and feedback. Finally, it explains some challenges the service registration center faces and how to address them. The hope is that readers will better understand the role of service discovery and the service registration center in a microservice architecture, and learn how to choose a suitable solution in practice to achieve service governance and highly available service discovery.

2. Explanation of basic concepts and terms

2.1 Service discovery mechanism

A service discovery mechanism is a technique used in distributed systems to find the location of network services. It lets client applications find a specific service endpoint from a service name or other identifying information.

2.1.1 Service discovery mode

  • Client/Server mode: The service discovery mechanism relies on a registration server or central node that holds the complete service list. When a client needs a service, it sends a request to this node, gets the service list, and then selects a service instance to call. The advantage of this mode is that it is simple and easy to use; the disadvantage is that when the central node goes down, the entire system may become unavailable.
  • DNS mode: The service discovery mechanism uses the DNS protocol to resolve domain names into service addresses. Resolution is usually done by the local DNS server, so service addresses can be recorded there and resolved directly by client applications. The advantage of this mode is a low degree of centralization, and the service provider can choose its IP address independently; the disadvantage is that it depends on the DNS server, resolution can be slow, and cached records can lag behind changes.
  • API mode: This mode adopts the publish/subscribe model. The client application subscribes to the services it is interested in at the service registration center. After receiving the subscription, the registration center notifies the subscribed client applications, which can then connect directly to the service provider to make service calls. The advantage of this mode is that it does not rely on DNS resolution and is convenient and fast to use; the disadvantage is that there is no centralized control and you need to implement the notification mechanism yourself.
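The publish/subscribe flow of the API mode can be sketched roughly as follows. This is a minimal in-memory illustration, not a real registry: the class `PushRegistry`, its `subscribe` and `publish` methods and the callback signature are all hypothetical names chosen for this example.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Hypothetical types for this sketch: an address is "host:port", and a
# callback receives the service name plus the current provider list.
Address = str
Callback = Callable[[str, List[Address]], None]


class PushRegistry:
    """In-memory registry that pushes provider lists to subscribers."""

    def __init__(self) -> None:
        self._providers: Dict[str, List[Address]] = defaultdict(list)
        self._subscribers: Dict[str, List[Callback]] = defaultdict(list)

    def subscribe(self, service_name: str, callback: Callback) -> None:
        """Remember the callback and send it the current provider list."""
        self._subscribers[service_name].append(callback)
        callback(service_name, list(self._providers[service_name]))

    def publish(self, service_name: str, address: Address) -> None:
        """Add a provider and notify every subscriber of that service."""
        self._providers[service_name].append(address)
        for callback in self._subscribers[service_name]:
            callback(service_name, list(self._providers[service_name]))


# Usage: the client records every list the registry pushes to it.
seen: List[List[Address]] = []
registry = PushRegistry()
registry.subscribe("foo", lambda name, addrs: seen.append(addrs))
registry.publish("foo", "127.0.0.1:8080")
```

The client never polls: it receives an initial (possibly empty) list on subscription and a fresh list whenever a provider is published.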

2.2 Service Registration Center

The service registry, also known as the service directory, is an independent component in a distributed system. It stores and manages service information and provides service lookup by region, environment, organization or even business line. The service registration center belongs to the infrastructure layer of a microservice architecture: every microservice should register its own service information with the center when it starts.

2.3 CAP principle

The CAP principle is a theorem about distributed computing systems. It states that consistency (C), availability (A) and partition tolerance (P) cannot all be guaranteed at the same time: when a network partition occurs, a system has to give up either consistency or availability.

C: Consistency

Consistency means that after data is updated successfully, every process or thread reading it at the same time sees the same value. Consistency usually needs to be considered from the start of system design. Strong consistency is the easiest model to reason about, but it costs performance, so many systems relax it to a weaker model such as eventual consistency.

A: Availability

Availability means that the system keeps responding to requests. In practice it is often measured as the proportion of time the system provides service. When part of the system fails, only a small number of users should be affected. Availability usually requires redundancy and fault tolerance, so that the system still works when some nodes fail.

P: Partition tolerance

Partition tolerance describes how a distributed system behaves when a network partition occurs: the system keeps running and serving requests even when communication between partitions is lost or abnormal. To achieve partition tolerance, data generally needs to be replicated to different machines, which introduces additional complexity.
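The trade-off between consistency and availability during a partition can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the class `TwoReplicaStore`, its `mode` flag and the `partitioned` switch are hypothetical, and real systems are far more nuanced.

```python
class TwoReplicaStore:
    """Two replicas, a and b, with a switchable network partition between them."""

    def __init__(self, mode: str) -> None:
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.a = None  # value held by replica a
        self.b = None  # value held by replica b
        self.partitioned = False

    def write(self, value) -> bool:
        """Write through replica a; return True if the write was accepted."""
        if not self.partitioned:
            self.a = self.b = value  # normal case: both replicas updated
            return True
        if self.mode == "CP":
            return False  # refuse the write: consistent but unavailable
        self.a = value  # AP: accept locally, replicas now disagree
        return True


cp = TwoReplicaStore("CP")
cp.write("v1")
cp.partitioned = True
cp_ok = cp.write("v2")  # rejected, both replicas still hold "v1"

ap = TwoReplicaStore("AP")
ap.write("v1")
ap.partitioned = True
ap_ok = ap.write("v2")  # accepted, but replica b still holds "v1"
```

In the CP case the system stays consistent by refusing work; in the AP case it keeps answering but the replicas diverge until the partition heals.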

3. Static configuration and dynamic monitoring

3.1 Static configuration

Static configuration means writing service addresses into a configuration file. This method is simple and crude. It works well for simple scenarios, but with a large number of microservices the configuration becomes cumbersome and inflexible to modify, so it does not suit larger or more dynamic scenarios.
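As a rough illustration of static configuration, the sketch below parses a small JSON document and resolves a service name to a fixed address. The file layout, the service names and the `resolve` helper are assumptions made for this example; a real project would read the text from a file on disk.

```python
import json

# Hypothetical configuration content; normally this would live in a file.
CONFIG_TEXT = """
{
  "foo": ["127.0.0.1:8080"],
  "bar": ["127.0.0.1:9090", "127.0.0.1:9091"]
}
"""

SERVICES = json.loads(CONFIG_TEXT)


def resolve(service_name: str) -> str:
    """Return the first configured address for the service."""
    addresses = SERVICES.get(service_name)
    if not addresses:
        raise LookupError(f"no address configured for {service_name!r}")
    return addresses[0]
```

Any change to an address means editing the configuration and reloading it, which is the inflexibility described above.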

3.2 Dynamic monitoring

Dynamic monitoring means the client automatically scans the specified registry at startup and dynamically adds service addresses to its local routing table. This method is more flexible: services can be added, removed and modified at any time during operation without restarting the system.
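A minimal sketch of dynamic monitoring might look like the following. It assumes the registry can be read through a `fetch` function that returns a snapshot of service names and addresses; `RoutingTable`, `fetch` and `refresh` are hypothetical names, and a real client would call `refresh` from a timer or a watch callback.

```python
from typing import Callable, Dict, List


class RoutingTable:
    """Local routing table that is rebuilt from registry snapshots."""

    def __init__(self, fetch: Callable[[], Dict[str, List[str]]]) -> None:
        self._fetch = fetch
        self.routes: Dict[str, List[str]] = {}

    def refresh(self) -> None:
        """Replace the local routes with a copy of the current registry snapshot."""
        snapshot = self._fetch()
        self.routes = {name: list(addrs) for name, addrs in snapshot.items()}


# Simulated registry state that changes while the client keeps running.
registry_state = {"foo": ["127.0.0.1:8080"]}
table = RoutingTable(lambda: registry_state)

table.refresh()  # scan at startup
routes_at_startup = dict(table.routes)

registry_state["bar"] = ["127.0.0.1:9090"]  # a service is added
del registry_state["foo"]                   # a service is removed
table.refresh()  # later scan picks up both changes, no restart needed
```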

4. Service Registration Center

The service registration center is responsible for service registration, subscription and query. In the registration process, the service provider writes the information about the service it provides into the service registration center. The service provider identifier is used as the key, and the metadata of the service instance (such as IP address, port number and service name) is stored as the value.

In the subscription process, the client application subscribes to a specified service, the registration center returns a list of service instances, and the client application selects one of them to call.

In the query process, the client application queries a service by name, the registration center returns the metadata of the matching service, and the client application uses this metadata to make service calls.

In addition to managing the life cycle of services, the service registration center should also have good scalability, robustness and high availability, making service discovery more convenient, rapid and reliable.

Let's take a look at the functions of the service registration center with examples.

4.1 Service registration

When a new service goes online, it must first report its metadata to the service registration center through the service registration interface. This interface usually supports protocols such as HTTP or gRPC. After receiving a registration request, the service registration center first checks whether the client has already registered another service. If so, it clears the previous service information before storing the new information.

For example, client A registers service foo, and the request parameters are as follows:

{
    "serviceName": "foo",
    "host": "localhost",
    "port": 8080
}

After receiving the request, the service registration center checks whether client A has already registered another service and, if so, clears the previous service information. It then writes the foo service registered by client A into its storage. The metadata of the foo service includes serviceName, host, port and other information.
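The registration flow described above can be sketched as an in-memory store keyed by client. This is an illustrative assumption about how such a center might behave, not a specific product's API: `Registry`, `register`, `instances` and the client identifier `"A"` are hypothetical names.

```python
from typing import Dict, List, Optional


class Registry:
    """Keeps at most one service record per client, as described in the text."""

    REQUIRED_FIELDS = ("serviceName", "host", "port")

    def __init__(self) -> None:
        self._by_client: Dict[str, dict] = {}

    def register(self, client_id: str, request: dict) -> Optional[dict]:
        """Store the request for client_id and return the record it replaced, if any."""
        for field in self.REQUIRED_FIELDS:
            if field not in request:
                raise ValueError(f"missing field: {field}")
        previous = self._by_client.pop(client_id, None)  # clear old registration
        self._by_client[client_id] = dict(request)
        return previous

    def instances(self, service_name: str) -> List[dict]:
        """Return the stored records for one service name."""
        return [r for r in self._by_client.values() if r["serviceName"] == service_name]


registry = Registry()
first = registry.register("A", {"serviceName": "foo", "host": "localhost", "port": 8080})
second = registry.register("A", {"serviceName": "bar", "host": "localhost", "port": 9090})
```

The second call returns the earlier foo record, showing that the previous registration of client A was cleared before the new one was stored.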

4.2 Service logout

When a service goes offline, it must first log out (deregister) from the service registration center. The client actively sends a logout request whose parameters include serviceName. After receiving the request, the service registration center deletes the metadata stored for that service.
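A logout handler along these lines might be sketched as follows. The `Registry` class and its `deregister` method are hypothetical; the only detail taken from the text is that the request carries serviceName and that the stored metadata is deleted.

```python
from typing import Dict, List, Tuple


class Registry:
    """Maps a service name to the (host, port) pairs registered for it."""

    def __init__(self) -> None:
        self._services: Dict[str, List[Tuple[str, int]]] = {}

    def register(self, service_name: str, host: str, port: int) -> None:
        self._services.setdefault(service_name, []).append((host, port))

    def deregister(self, service_name: str) -> bool:
        """Delete the metadata stored for service_name; True if anything was removed."""
        return self._services.pop(service_name, None) is not None

    def lookup(self, service_name: str) -> List[Tuple[str, int]]:
        return list(self._services.get(service_name, []))


registry = Registry()
registry.register("foo", "localhost", 8080)
first = registry.deregister("foo")   # metadata existed and is removed
second = registry.deregister("foo")  # nothing left to remove
```

Returning a boolean makes a repeated logout harmless, which is convenient when a client retries the request.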

4.3 Service Subscription

The client subscribes to the service through the service subscription interface. The service subscription interface supports protocols such as HTTP or gRPC and accepts subscription requests from clients. When the client initiates a subscription request, it will pass in the service name to be subscribed, such as:

["foo"]

After the service registration center receives the subscription request, it will return the metadata of all instances of the service, such as:

[
  {
    "id": "foo@127.0.0.1:8080",
    "serviceName": "foo",
    "host": "127.0.0.1",
    "port": 8080
  }
]

After client A receives the subscription response, it can select one of the returned instances to call.
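How a client might pick an instance from the subscription response can be sketched as below. The in-memory `REGISTRY`, the second instance on port 8081 and the round-robin policy are assumptions for illustration; the text does not prescribe a selection strategy.

```python
import itertools
from typing import Dict, Iterator, List

# Hypothetical registry content with two instances of the foo service.
REGISTRY: Dict[str, List[dict]] = {
    "foo": [
        {"serviceName": "foo", "host": "127.0.0.1", "port": 8080},
        {"serviceName": "foo", "host": "127.0.0.1", "port": 8081},
    ]
}


def subscribe(service_names: List[str]) -> List[dict]:
    """Return the metadata of all instances of the requested services."""
    result: List[dict] = []
    for name in service_names:
        result.extend(REGISTRY.get(name, []))
    return result


def round_robin(instances: List[dict]) -> Iterator[dict]:
    """Cycle through the instances in order, one per call."""
    if not instances:
        raise LookupError("no instances available")
    return itertools.cycle(instances)


chooser = round_robin(subscribe(["foo"]))
ports = [next(chooser)["port"] for _ in range(3)]
```

Round-robin spreads calls evenly; a random or weighted choice would fit the same interface.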

4.4 Service feedback

The client can report service availability status to the service registration center through the service feedback interface. After receiving the reported status, the service registration center updates the availability status in the corresponding metadata.
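The feedback interface can be sketched as a status field that clients are allowed to update. The `status` values "UP" and "DOWN", the instance identifiers and the method names are hypothetical choices for this example.

```python
from typing import Dict, List


class Registry:
    """Stores instance metadata together with a client-reported status."""

    def __init__(self) -> None:
        self._instances: Dict[str, dict] = {}

    def register(self, instance_id: str, host: str, port: int) -> None:
        # New instances are assumed to be available until reported otherwise.
        self._instances[instance_id] = {"host": host, "port": port, "status": "UP"}

    def report(self, instance_id: str, available: bool) -> None:
        """Record the availability status reported by a client."""
        if instance_id not in self._instances:
            raise KeyError(f"unknown instance: {instance_id}")
        self._instances[instance_id]["status"] = "UP" if available else "DOWN"

    def available(self) -> List[str]:
        """Return the ids of instances currently marked as available."""
        return sorted(i for i, meta in self._instances.items() if meta["status"] == "UP")


registry = Registry()
registry.register("foo-1", "127.0.0.1", 8080)
registry.register("foo-2", "127.0.0.1", 8081)
registry.report("foo-2", available=False)  # a client could not reach foo-2
```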

5. Some challenges and countermeasures in the service registration center

5.1 Data consistency

When service metadata is stored in the service registry, data consistency needs to be considered. Service registration centers often rely on distributed transactions or similar techniques to keep service metadata strongly consistent. However, instability caused by network problems, hardware failures and software errors cannot be completely avoided, so data inconsistencies between nodes of the service registration center may still occur.

To address the data consistency problem, the service registration center should be designed with clear constraints on service metadata, for example by writing it through database transactions or an ordered message queue. In addition, the service registration center can introduce a cache layer to reduce how often the database is accessed and improve overall performance, at the cost of possibly serving slightly stale data.
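The cache layer mentioned above can be sketched as a small read-through cache with a time-to-live. This is one possible design, not the only one: `CachedLookup`, `ttl_seconds` and the injectable `clock` parameter are hypothetical, and `load` stands in for whatever database query the registry really uses.

```python
import time
from typing import Callable, Dict, List, Tuple


class CachedLookup:
    """Read-through cache in front of a slower metadata store."""

    def __init__(
        self,
        load: Callable[[str], List[str]],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._load = load
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache: Dict[str, Tuple[float, List[str]]] = {}

    def get(self, service_name: str) -> List[str]:
        """Return cached addresses while fresh, otherwise reload from the store."""
        now = self._clock()
        entry = self._cache.get(service_name)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        value = self._load(service_name)
        self._cache[service_name] = (now, value)
        return value


calls: List[str] = []

def load_from_database(name: str) -> List[str]:
    calls.append(name)  # count how often the backing store is hit
    return ["127.0.0.1:8080"]

fake_now = [0.0]  # controllable clock so the example is deterministic
lookup = CachedLookup(load_from_database, ttl_seconds=5.0, clock=lambda: fake_now[0])
lookup.get("foo")
lookup.get("foo")    # served from the cache, no database access
fake_now[0] = 6.0
lookup.get("foo")    # entry expired, database is queried again
```

A shorter TTL narrows the window in which stale metadata can be served, at the price of more database reads.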

5.2 Availability

Since the service registry is an independent component, its availability directly determines the overall availability of the microservice architecture. Therefore, the service registration center must ensure high availability, otherwise it will lead to the paralysis of the microservice architecture.

For service registration centers, the following methods are generally needed to ensure high availability:

  1. Deploy the service registration center as a cluster. Run multiple registration center nodes that share capacity and traffic, so that the registry stays available when a single node fails.
  2. Use a message queue or a MySQL database cluster to store service metadata. These clusters are maintained by dedicated operations teams to ensure their high availability.
  3. Use asynchronous, non-blocking network transmission between the service registration center and clients. This helps prevent service discovery timeouts caused by slow clients or network congestion.
  4. Implement a service health detection mechanism in the service registration center. When a service fails, the registration center promptly updates the corresponding service information so that other services can quickly identify the failure and reallocate resources.
  5. Provide monitoring for the service registration center so that administrators can understand the operating status of the system, anticipate and avoid abnormalities, and improve the reliability of the system.

The above methods can effectively improve the availability of the service registration center.
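The health detection mechanism from the list above is often built on heartbeats. The sketch below assumes instances send periodic heartbeats and are treated as failed after a timeout; `HeartbeatTracker`, `timeout_seconds` and the injectable `clock` are hypothetical names for this example.

```python
import time
from typing import Callable, Dict, List


class HeartbeatTracker:
    """Marks an instance unhealthy when its last heartbeat is too old."""

    def __init__(
        self,
        timeout_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._timeout = timeout_seconds
        self._clock = clock
        self._last_seen: Dict[str, float] = {}

    def heartbeat(self, instance_id: str) -> None:
        """Record that the instance was alive just now."""
        self._last_seen[instance_id] = self._clock()

    def healthy(self) -> List[str]:
        """Return instances whose last heartbeat is within the timeout."""
        now = self._clock()
        return sorted(
            instance_id
            for instance_id, seen in self._last_seen.items()
            if now - seen <= self._timeout
        )


fake_now = [0.0]  # controllable clock so the example is deterministic
tracker = HeartbeatTracker(timeout_seconds=10.0, clock=lambda: fake_now[0])
tracker.heartbeat("foo-1")
tracker.heartbeat("foo-2")
fake_now[0] = 8.0
tracker.heartbeat("foo-1")  # foo-1 keeps reporting, foo-2 goes silent
fake_now[0] = 15.0
alive = tracker.healthy()
```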

5.3 Distributed Consensus Algorithm

When implementing a service registry, the distributed consistency of service metadata needs to be considered. The most commonly used distributed consensus algorithms are Paxos algorithm and Raft algorithm.

The Paxos algorithm is a message-passing consensus algorithm for coordination and consistency in distributed systems. It solves the problem of getting multiple nodes to agree on a value or on a log of values. In practice, Paxos comes in two forms: single-decree Paxos (Basic Paxos), which decides one value, and multi-decree Paxos (Multi-Paxos), which decides a sequence of values.

The single-decision form can be used in election scenarios in the service registration center, such as electing a master node; a typical related problem is distributed locking. Multi-value decision scenarios include distributed log replication, which is also what the Raft algorithm provides, and shared leases.

Both distributed consensus algorithms have pros and cons. The Paxos algorithm has been proven correct and ensures strong consistency of data. However, implementing Paxos is complicated and requires many details to be worked out, especially in terms of scalability and safety.

The Raft algorithm is relatively simple and easy to implement. It was designed for understandability and is used in many open source frameworks and software, such as etcd and Consul. Raft offers the same kind of consistency guarantees as multi-decree Paxos; neither is an atomic commit protocol, and both need a majority of nodes to be reachable to make progress. Therefore, in actual engineering applications, the choice usually comes down to implementation complexity and available tooling, and should be made based on actual needs.
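Both Paxos and Raft rely on majority quorums, and the arithmetic behind that can be sketched in a few lines. The helper names below are hypothetical, and the sketch covers only the counting rule, not message exchange, terms or log comparison.

```python
def quorum_size(cluster_size: int) -> int:
    """Smallest number of nodes that forms a majority of the cluster."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    return cluster_size // 2 + 1


def wins_election(votes_granted: int, cluster_size: int) -> bool:
    """A candidate becomes leader only with votes from a majority."""
    return votes_granted >= quorum_size(cluster_size)


def tolerated_failures(cluster_size: int) -> int:
    """Number of nodes that can fail while a majority is still reachable."""
    return cluster_size - quorum_size(cluster_size)
```

Because any two majorities overlap in at least one node, two different leaders or values cannot both be accepted for the same decision. It also explains why clusters usually have an odd number of nodes: going from 3 to 4 nodes raises the quorum from 2 to 3 without tolerating any additional failure.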

Origin blog.csdn.net/universsky2015/article/details/133502524