Basics: selecting among Zookeeper, Eureka, Nacos, Consul, and Etcd

Foreword

What is a service registry?

Service registration is the process by which a service registers its location information with a "central registry node". A service generally registers its host IP address and port number, and sometimes also authentication information for accessing it, the protocol it uses, a version number, and some details about its environment.

What is service discovery

Service discovery means that a service provider publishes or updates its address with a service intermediary, and a service consumer obtains the address of the service it needs from that intermediary.

But this raises two problems:

  • The first problem: if a service provider goes down, the intermediary's key/value store still holds an unreachable address. What should be done?
  • Answer: a heartbeat mechanism. The service provider reports that it is alive to the intermediary every 5 seconds or so, and the intermediary records the service address and the report time as the member and score of a zset (sorted set). Every 10 seconds or so, the intermediary scans the zset and evicts any address whose last report is badly out of date. This keeps the addresses in the service list valid (see the Go sketch after this list).
  • The second problem: how to notify consumers when a service address changes. There are two solutions.
    • The first is polling: consumers query the service list every few seconds for changes. If there are many service addresses, the query becomes slow, so a service version number can be introduced: each service carries a version number that is incremented whenever the service changes. Consumers then only need to poll this version number to learn whether the service list has changed.
    • The second is pub/sub, whose timeliness is clearly better than polling's. To avoid wasting threads and connections, a single pub/sub channel broadcasts changes to a global version number; "global" means any change to any service list increments it. A consumer that receives a version change then checks whether the version numbers of the service lists it depends on have changed. This global version number also works with the first (polling) scheme.
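
Here is a minimal Go sketch of the zset-based heartbeat described above, using Redis sorted sets via github.com/redis/go-redis/v9. The key name, the intervals, and the 15-second staleness cutoff are illustrative assumptions, not something the text specifies.

```go
package main

import (
	"context"
	"strconv"
	"time"

	"github.com/redis/go-redis/v9"
)

const servicesKey = "services:web" // hypothetical zset holding provider addresses

// reportAlive is called by a provider every ~5s: the address is the zset
// member, the current unix time is the score.
func reportAlive(ctx context.Context, rdb *redis.Client, addr string) error {
	return rdb.ZAdd(ctx, servicesKey, redis.Z{
		Score:  float64(time.Now().Unix()),
		Member: addr,
	}).Err()
}

// evictStale is run by the intermediary every ~10s: members whose last
// report is older than the (assumed) 15s cutoff are removed.
func evictStale(ctx context.Context, rdb *redis.Client) error {
	cutoff := time.Now().Add(-15 * time.Second).Unix()
	return rdb.ZRemRangeByScore(ctx, servicesKey,
		"0", strconv.FormatInt(cutoff, 10)).Err()
}

func main() {
	rdb := redis.NewClient(&redis.Options{Addr: "localhost:6379"})
	ctx := context.Background()
	_ = reportAlive(ctx, rdb, "10.0.0.1:8080")
	_ = evictStale(ctx, rdb)
}
```

The same zset doubles as the service list: a ZRANGE over it returns the currently live addresses.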

CAP theory

CAP theory is an important theory in distributed architecture

  • Consistency (all nodes see the same data at the same time)
  • Availability (every request receives a response, whether it succeeded or failed)
  • Partition tolerance (the system keeps operating even when messages are lost or part of the system fails)

My understanding of P is this: some part of the whole system can hang or go down without affecting the operation or use of the system as a whole, whereas availability means that even when some node of the system is down, the system can still accept and answer requests. It is impossible to have all of C, A, and P; at most two can be taken, because:

(1) If C is the first priority, A suffers: data must be synchronized across nodes, or different requests would return different results, and synchronization takes time, during which availability drops.

(2) If A is the first priority, then as long as any server is alive, requests are accepted normally, but the returned result cannot be guaranteed to be current, because in a distributed deployment data synchronization can never be as fast as one imagines.

(3) If consistency and availability are both satisfied at the same time, partition tolerance is hard to guarantee; in effect you are left with a single point, whereas partitioning is the basic premise of a distributed system. With these theories understood, you can choose a service registration and discovery solution for the corresponding scenario.

Consul, Zookeeper, etcd, Eureka: comparison conclusions

| Option | Development language | Interface (multi-language capability) | Consensus algorithm | CAP theorem | Service health check | Multi data center | KV storage | Service watch support |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Zookeeper | Java | SDK client | ZAB (Paxos-derived) | CP | (weak) long connection, keepalive | not supported | supported | supported |
| Consul | Go | HTTP/DNS | Raft | CP | service status, memory, disk, etc. | supported | supported | full sync / long polling |
| etcd | Go | HTTP/gRPC | Raft | CP | connection heartbeat | not supported | supported | long polling |
| Doozer | Go | - | - | - | - | - | - | - |

Since Eureka announced in 2018 that it would give up maintenance, it is no longer recommended here.

Development language:

  • Zookeeper is developed in Java, so a Java environment must be deployed at installation time.
  • Consul and etcd are developed in Go, with all dependencies compiled into a single executable: plug and play.

zookeeper

Zookeeper is an open-source distributed coordination service currently maintained by Apache. It can be used to implement functions common in distributed systems, such as publish/subscribe, load balancing, naming service, distributed coordination/notification, cluster management, master election, distributed locks, and distributed queues. It has the following properties:

  • Sequential consistency: Transaction requests initiated from a client will eventually be applied to Zookeeper in strict accordance with the order in which they were initiated;
  • Atomicity: The processing results of all transaction requests are consistent on all machines in the entire cluster; there is no situation where some machines apply the transaction while others do not;
  • Single view: the server-side data model seen by all clients is consistent;
  • Reliability: Once the server successfully applies a transaction, the changes caused by it will remain until it is changed by another transaction;
  • Real-time: Once a transaction is successfully applied, Zookeeper can ensure that the client can immediately read the latest state data after the transaction change.

The functions of Zookeeper

  • Central server for storing configuration information
  • Naming service
  • Distributed synchronization
  • Group service

As you can see, Zookeeper is not only a service discovery framework; its scope is much larger.

If you only plan to use Zookeeper as a service discovery tool, you need its configuration storage and distributed synchronization functions. The former can be understood as a consistent KV store, and the latter is Zookeeper's distinctive watcher registration and asynchronous notification mechanism, through which Zookeeper can notify clients of node status changes asynchronously and in near real time.

The Zookeeper usage flow

  • Make sure an SDK exists for the language you are using; in practice there are third-party libraries on GitHub that are usable after careful vetting.
  • Call the Zookeeper interface to connect to the Zookeeper server.
  • Register your own service.
  • Watch the status of the services you depend on through watchers (see the Go sketch after this list).
  • The service provider needs to maintain the heartbeat with the Zookeeper server itself.
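
A minimal Go sketch of this flow, using the community client github.com/go-zookeeper/zk. The paths, addresses, and service name are illustrative assumptions, and the parent path /services/web is assumed to exist already.

```go
package main

import (
	"fmt"
	"time"

	"github.com/go-zookeeper/zk"
)

func main() {
	// Connect to the Zookeeper ensemble; the client library keeps the
	// session (and thus its heartbeat) alive in the background.
	conn, _, err := zk.Connect([]string{"127.0.0.1:2181"}, 5*time.Second)
	if err != nil {
		panic(err)
	}
	defer conn.Close()

	// Register this provider as an ephemeral node: it disappears
	// automatically when the session dies, which is how liveness is modeled.
	_, err = conn.Create("/services/web/10.0.0.1:8080", []byte("up"),
		zk.FlagEphemeral, zk.WorldACL(zk.PermAll))
	if err != nil {
		panic(err)
	}

	// Watch the children of the service path; ChildrenW returns the current
	// list plus a channel that fires once on the next change.
	for {
		children, _, events, err := conn.ChildrenW("/services/web")
		if err != nil {
			panic(err)
		}
		fmt.Println("live providers:", children)
		<-events // block until the provider list changes, then re-watch
	}
}
```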

Advantages

  • Powerful: not just service discovery
  • Provides a watcher mechanism for obtaining service provider status in real time
  • Supported by frameworks such as Dubbo

Why not use zookeeper?

  • Deploying and maintaining Zookeeper is complicated; administrators must master a range of knowledge and skills. The ZAB/Paxos-style strong consensus algorithm it uses has always been notorious for complexity and difficulty, and using Zookeeper is itself fairly involved.
  • It is written in Java, which leans toward heavyweight applications and pulls in many dependencies, whereas operators want a strongly consistent, highly available cluster to be as simple as possible to maintain and hard to get wrong.
  • Development is slow: the Apache Foundation's large structure and loose management hold back the project's pace.

etcd

etcd is a distributed key-value store accessed over HTTP/gRPC. Because it is simple and easy to use, many systems adopt or support etcd as part of their service discovery, kubernetes among them. But since it is only a storage system, building complete service discovery on top of it requires third-party tools.
For example, combining etcd, Registrator, and confd yields a simple yet powerful service discovery framework. This kind of assembly is somewhat troublesome, though, especially compared with consul. So in most scenarios etcd is used for KV storage, as in kubernetes.
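
To show what the raw building blocks look like, here is a minimal sketch of TTL-based registration and watching on etcd, using the official Go client go.etcd.io/etcd/client/v3. The keys, endpoints, and the 10-second lease TTL are illustrative assumptions.

```go
package main

import (
	"context"
	"fmt"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
)

func main() {
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"127.0.0.1:2379"},
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		panic(err)
	}
	defer cli.Close()
	ctx := context.Background()

	// Register under a lease: if keep-alives stop, the key expires and the
	// provider drops out of the list automatically.
	lease, err := cli.Grant(ctx, 10) // 10s TTL
	if err != nil {
		panic(err)
	}
	_, err = cli.Put(ctx, "/services/web/10.0.0.1:8080", "up",
		clientv3.WithLease(lease.ID))
	if err != nil {
		panic(err)
	}
	ch, err := cli.KeepAlive(ctx, lease.ID) // heartbeat in the background
	if err != nil {
		panic(err)
	}
	go func() {
		for range ch { // drain keep-alive acknowledgements
		}
	}()

	// A consumer watches the prefix and is pushed every add/remove.
	for resp := range cli.Watch(ctx, "/services/web/", clientv3.WithPrefix()) {
		for _, ev := range resp.Events {
			fmt.Printf("%s %s\n", ev.Type, ev.Kv.Key)
		}
	}
}
```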

consul

overview

Consul is an open-source service discovery and configuration management service developed by HashiCorp in the Go language. It has a built-in service registration and discovery framework, a distributed consensus protocol implementation, health checking, Key/Value storage, and a multi-data-center solution, so it does not need to rely on other tools (such as ZooKeeper). Deployment is simple: a single executable binary. Every node runs an agent, which has two operating modes, server and client. Officially, 3 or 5 server nodes per data center are recommended to keep data safe and to ensure that the server-leader election works correctly.

Consul consists of multiple components, but as a whole it provides service discovery and service configuration tooling for your infrastructure. It offers the following key features:

  • Service Discovery:
    • Consul provides a way to register and discover services through DNS or HTTP interfaces, so external services can easily find the services they depend on through Consul.
    • Service discovery works like this: a service provider registers with Consul (in HTTP or DNS mode), and a service consumer then obtains the services it needs from Consul; both registration and lookup go through the API Consul provides (HTTP).
    • Membership management and message broadcasting use the gossip protocol, and ACL access control is supported.
      • ACL technology is widely used in routers; it is a flow-control technique based on packet filtering. An access control list uses the source address, destination address, and port number as the basic elements of packet inspection and can specify whether matching packets are allowed to pass.
      • Gossip is a p2p protocol whose main purpose is decentralization. It is modeled on how rumors spread among people: there must be seed nodes, and every second a seed node randomly sends the node list it holds, together with the messages to be propagated, to other nodes. Any newly joined node becomes known to the whole network quickly through this kind of propagation.
  • Health Checking:
    • Consul's Client can provide any number of health checks, either associated with a given service ("is the webserver returning 200 OK") or the local node ("is the memory utilization below 90%"). Operators can use this information to monitor the health of the cluster, and service discovery components can use this information to route traffic away from unhealthy hosts.
    • Consul has its own health check mechanism and a web UI interface (consul_IP:8500/ui)
  • Key/Value storage:
    • Applications can use the Key/Value storage Consul provides according to their needs.
    • Consul exposes it through a simple, easy-to-use HTTP interface which, combined with other tools, enables dynamic configuration, feature flagging, leader election, and more (see the KV sketch after this list).
  • Secure service communication:
    • Consul can generate and distribute TLS certificates for services to establish mutual TLS connections.
    • Intentions can be used to define which services are allowed to communicate. Service segmentation becomes easy to manage, and intentions can be changed in real time instead of relying on complex network topologies and static firewall rules.
  • Multiple data centers:
    • Consul supports multi-data centers out of the box. This means that users don't need to worry about the need to build additional abstraction layers to allow services to expand to multiple regions.
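
As a concrete illustration of the Key/Value interface mentioned above, here is a minimal sketch using the official Go client github.com/hashicorp/consul/api. The key name is an illustrative assumption, and the agent is assumed to be local on the default port 8500.

```go
package main

import (
	"fmt"

	"github.com/hashicorp/consul/api"
)

func main() {
	client, err := api.NewClient(api.DefaultConfig()) // talks to 127.0.0.1:8500
	if err != nil {
		panic(err)
	}
	kv := client.KV()

	// Write a config value, then read it back.
	_, err = kv.Put(&api.KVPair{Key: "config/web/timeout", Value: []byte("5s")}, nil)
	if err != nil {
		panic(err)
	}
	pair, _, err := kv.Get("config/web/timeout", nil)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s = %s\n", pair.Key, pair.Value)
}
```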

Consul is also relatively simple to use. It is written in Go, so it is naturally portable (supporting Linux, Windows, and Mac OS X); the installation package contains just one executable file, making deployment easy, and it integrates seamlessly with lightweight containers such as Docker.

client

  • CLIENT denotes Consul's client mode, one of the two modes a Consul node can run in. In this mode, all services registered with the current node are forwarded to a SERVER; the node itself does not persist this information.

server

  • SERVER denotes Consul's server mode, indicating that this Consul node is a server. In this mode it functions the same as CLIENT, with one difference: it persists all information locally, so the information is retained in case of failure.

server-leader

  • The server marked LEADER is the chief among the servers. It differs from the other servers in that it is responsible for synchronizing registration information to them and for monitoring the health of each node.

agent

  • Every member of a Consul cluster must run an agent, which is started with the consul agent command. An agent runs in either server or client state; naturally, a node running in server state is called a server node, and a node running in client state is called a client node.

The functions of Consul

  • Applications can easily find the systems they depend on via DNS or HTTP
  • Provides a variety of health check methods: HTTP return code 200, whether memory is over its limit, whether a TCP connection succeeds
  • KV storage, exposed through an HTTP API
  • Multiple data centers, which Zookeeper does not have

Compared with Zookeeper-style service discovery, Consul does not require a special SDK to be embedded in the service, so it places no restriction on the language used.

Consul usage process

  • A Consul agent must be installed on every server.
  • The agent supports registering services either through configuration files or through the HTTP interface called from within the service.
  • After a service is registered, the agent periodically checks whether it is alive using the specified health check method.
  • When a service wants to query the liveness of other services, it simply issues an HTTP or DNS request to its local Consul agent.

To put it simply, using Consul does not depend on any SDK: a simple HTTP request covers all the logic of service discovery.
However, a service fetches the liveness of other services from its local Consul agent on each query, so compared with Zookeeper's watcher mechanism the real-timeliness is slightly worse. As long as you consider how to improve it as far as possible (one option is sketched below), this is not a big problem.
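
One way to narrow that gap, offered here as a sketch rather than something the text prescribes, is Consul's blocking (long-polling) queries: once a non-zero index is passed, the HTTP call parks on the server until the result changes. A minimal version with the official Go client github.com/hashicorp/consul/api, where the service name "web" is an illustrative assumption:

```go
package main

import (
	"fmt"
	"time"

	"github.com/hashicorp/consul/api"
)

func main() {
	client, err := api.NewClient(api.DefaultConfig())
	if err != nil {
		panic(err)
	}

	var lastIndex uint64
	for {
		// The first call (WaitIndex 0) returns immediately; subsequent calls
		// block on the server until the result changes or WaitTime elapses,
		// so the consumer learns of changes without tight polling.
		entries, meta, err := client.Health().Service("web", "", true,
			&api.QueryOptions{WaitIndex: lastIndex, WaitTime: 5 * time.Minute})
		if err != nil {
			time.Sleep(time.Second) // back off briefly on error, then retry
			continue
		}
		lastIndex = meta.LastIndex
		for _, e := range entries {
			fmt.Printf("healthy: %s:%d\n", e.Service.Address, e.Service.Port)
		}
	}
}
```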

service management

service registration

Consul supports two ways to register a service. One is the HTTP API the Consul agent provides, which the service calls to register itself. The other is registration through a JSON configuration file: the services to be registered are described in a JSON-format file that the agent loads. Consul officially recommends the second method.
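
For the first (HTTP API) path, here is a minimal sketch using the official Go client github.com/hashicorp/consul/api, which wraps the agent's /v1/agent/service/register endpoint. The service name, address, port, and check URL are illustrative assumptions; the JSON-file path expresses the same fields in the agent's service definition file instead.

```go
package main

import "github.com/hashicorp/consul/api"

func main() {
	client, err := api.NewClient(api.DefaultConfig()) // local agent, port 8500
	if err != nil {
		panic(err)
	}

	// Register the service together with an HTTP health check; the agent
	// calls the URL every 10s and marks the service unhealthy on failure.
	err = client.Agent().ServiceRegister(&api.AgentServiceRegistration{
		ID:      "web-1",
		Name:    "web",
		Address: "10.0.0.1",
		Port:    8080,
		Check: &api.AgentServiceCheck{
			HTTP:     "http://10.0.0.1:8080/health",
			Interval: "10s",
			Timeout:  "1s",
		},
	})
	if err != nil {
		panic(err)
	}
}
```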

service discovery

Consul supports two ways to discover services. One is to query which services are available through the HTTP API; the other is to use the DNS server built into the Consul agent (port 8600). Domain names take the form NAME.service.consul, where NAME is the service name from the service definition file, and the DNS answers are filtered by the services' health checks.
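
The DNS path can be exercised with dig @127.0.0.1 -p 8600 web.service.consul SRV, or programmatically. Below is a minimal Go sketch that points the standard resolver at the agent's DNS port; the service name "web" and the agent address are illustrative assumptions.

```go
package main

import (
	"context"
	"fmt"
	"net"
)

func main() {
	// Point the Go resolver at the Consul agent instead of the system DNS.
	r := &net.Resolver{
		PreferGo: true,
		Dial: func(ctx context.Context, network, address string) (net.Conn, error) {
			var d net.Dialer
			return d.DialContext(ctx, "udp", "127.0.0.1:8600")
		},
	}

	// SRV records carry both host and port for each passing instance.
	_, srvs, err := r.LookupSRV(context.Background(), "", "", "web.service.consul")
	if err != nil {
		panic(err)
	}
	for _, s := range srvs {
		fmt.Printf("%s:%d\n", s.Target, s.Port)
	}
}
```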

communication protocol between services

Consul uses the gossip protocol to manage membership and broadcast messages to the whole cluster. It maintains two gossip pools: a LAN pool for communication within a single data center and a WAN pool for communication across data centers. There can be multiple LAN pools, but only one WAN pool.

Usage scenarios

Consul's application scenarios include service discovery, service isolation, and service configuration:

  • In the service discovery scenario, Consul acts as the registry. After service addresses are registered in Consul, they can be queried through the DNS and HTTP interfaces Consul provides; Consul also supports health checks.
  • In the service isolation scenario, Consul supports setting access policies service by service, supports both classic and emerging platforms, and supports TLS certificate distribution and service-to-service encryption.
  • In the service configuration scenario, Consul provides key-value storage and can propagate change notifications quickly; the consul-template tool makes it convenient to render configuration files in real time.

Each node registered with Consul runs a Consul agent, which monitors and health-checks local services and forwards query requests to the Consul servers. The Consul servers store and replicate the data (using the Raft protocol to guarantee consistency); usually several servers form a cluster and elect a leader. When querying a service address, you can query a Consul server directly, or query through a Consul agent, which forwards the request to a server. With multiple data centers, each data center deploys its own set of Consul servers, and cross-data-center queries go through the local data center's servers. Note: when there are multiple data centers, key-value data is not synchronized between the Consul servers of different data centers.


Comparison between Zookeeper and Consul

  • In terms of development language, zookeeper is developed in java, and a java environment needs to be deployed during installation; consul is developed in golang, and all dependencies are compiled into executable programs, which is plug-and-play.

  • In terms of deployment, zookeeper generally deploys an odd number of nodes to facilitate a simple majority election mechanism. When consul is deployed, it is divided into server nodes and client nodes (distinguished by different startup parameters). The server node performs leader election and data consistency maintenance. The client node is deployed on the service machine and serves as the interface for the service program to access consul.

  • Zookeeper does not support multiple data centers. Consul can support multiple data center deployments across computer rooms, effectively avoiding the situation that a single data center cannot be accessed due to failure.

  • In terms of the connection model, the Zookeeper client API keeps a long connection to the server, and the service program must manage and maintain the link's validity itself: it registers callback functions to handle Zookeeper events and maintains the validity of the directory structure it builds on Zookeeper (e.g., keeping ephemeral nodes alive). Consul serves service information over DNS or HTTP; there is no active notification, so consumers must poll for changes.

  • In terms of tooling, Zookeeper ships with a cli_mt tool that can log into the Zookeeper server from the command line and manage the directory structure manually. Consul ships with a Web UI management console, which can be enabled with a startup parameter and viewed directly in the browser.

  • https://www.jianshu.com/p/9bcbb7c26539

  • https://www.cnblogs.com/traditional/p/9445930.html
