Zookeeper - What Can Zookeeper Do?

Zookeeper's official website sums it up in one sentence: ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services.

This roughly describes what Zookeeper can do: configuration management, name service, distributed synchronization, and cluster management. So what are these services? Why do we need them? Why use Zookeeper to implement them, and what advantages does it offer? Below, I introduce each service in turn and mention which open source systems use it.

Configuration management

Besides code, an application also carries various configuration, such as database connections. We usually keep this configuration in files that the code reads. When there are only a few settings, a single server, and infrequent changes, configuration files work well. But when there are many settings, many servers need them, and they may change dynamically, configuration files stop being a good idea. We then need to manage configuration centrally: we modify the configuration in one place, and every service interested in it is notified of the change.

For example, we could put the configuration in a database and have every service that needs it read from that database. However, many services depend on this configuration to run normally, so the service that provides it must be highly reliable. We can run it as a cluster to improve reliability, but then how do we keep the configuration consistent across the cluster? This calls for a service that implements a consensus protocol. Zookeeper is such a service: it uses the Zab consensus protocol to provide consistency.

Many open source projects use Zookeeper to maintain configuration. In HBase, the client first connects to Zookeeper to obtain the necessary HBase cluster configuration information before performing further operations. The open source message queue Kafka uses Zookeeper to maintain broker information. Alibaba's open source SOA framework Dubbo also uses Zookeeper widely to manage configuration for service governance.
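
The idea of a central place that stores configuration and notifies interested services can be sketched in a few lines of Python. This is only a toy in-memory model, not the Zookeeper client API: the names ConfigStore, Service, and the path /config/db_url are made up for illustration. The watchers here fire once and must be re-registered, which mirrors how standard Zookeeper watches behave.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional

Watcher = Callable[[str, str], None]  # called with (path, new_value)


class ConfigStore:
    """Toy in-memory stand-in for a central config service (not the Zookeeper API)."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}
        self._watchers: Dict[str, List[Watcher]] = defaultdict(list)

    def get(self, path: str, watcher: Optional[Watcher] = None) -> Optional[str]:
        """Read a value and optionally register a one-shot watcher on it."""
        if watcher is not None:
            self._watchers[path].append(watcher)
        return self._data.get(path)

    def set(self, path: str, value: str) -> None:
        """Write a value, then fire and clear the watchers registered for it."""
        self._data[path] = value
        fired, self._watchers[path] = self._watchers[path], []
        for watcher in fired:
            watcher(path, value)


class Service:
    """Hypothetical service that keeps its database URL in sync with the store."""

    def __init__(self, store: ConfigStore, path: str) -> None:
        self._store = store
        self._path = path
        self.db_url = store.get(path, self._on_change)

    def _on_change(self, path: str, value: str) -> None:
        # Watches are one-shot, so re-read and re-register after every event.
        self.db_url = self._store.get(path, self._on_change)


store = ConfigStore()
store.set("/config/db_url", "jdbc:mysql://db1:3306/app")
services = [Service(store, "/config/db_url") for _ in range(3)]

# One change in the central place reaches every interested service.
store.set("/config/db_url", "jdbc:mysql://db2:3306/app")
print([s.db_url for s in services])
```

A single in-memory store like this is of course itself a single point of failure; the consistency across a cluster of such stores is exactly the part Zookeeper provides.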

Name service

The name service is easy to understand. To access a system over the network, we need to know its IP address, but IP addresses are unfriendly to people, so we use domain names instead. A computer, however, cannot use a domain name directly to reach another machine. What can we do? Keeping a mapping of domain names to IP addresses on every machine solves part of the problem, but what if the IP address behind a domain name changes? That is why DNS exists: we only need to query a well-known point, and it tells us which IP address corresponds to the domain name. Applications face the same problem, especially when there are many services. Storing each service's address locally is very inconvenient, but if we only need to query one well-known access point that provides a unified entry, maintenance becomes much easier.
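
As a rough illustration of the "one well-known access point" idea, here is a minimal in-memory name registry in Python. It is a sketch under assumptions, not Zookeeper's API: the class NameRegistry, the name /services/orders, and the addresses are all invented for the example.

```python
from typing import Dict, List


class NameRegistry:
    """Toy name service: maps a service name to its current addresses."""

    def __init__(self) -> None:
        self._entries: Dict[str, List[str]] = {}

    def register(self, name: str, address: str) -> None:
        """Publish an address under a name (ignored if already present)."""
        addresses = self._entries.setdefault(name, [])
        if address not in addresses:
            addresses.append(address)

    def unregister(self, name: str, address: str) -> None:
        """Withdraw an address; unknown names or addresses are ignored."""
        addresses = self._entries.get(name, [])
        if address in addresses:
            addresses.remove(address)

    def lookup(self, name: str) -> List[str]:
        """Return a copy of the addresses currently published under a name."""
        return list(self._entries.get(name, []))


registry = NameRegistry()
registry.register("/services/orders", "10.0.0.5:8080")
before = registry.lookup("/services/orders")

# The service moves to another machine. Callers keep asking for the same
# name and never have to update a locally stored address.
registry.unregister("/services/orders", "10.0.0.5:8080")
registry.register("/services/orders", "10.0.0.9:8080")
after = registry.lookup("/services/orders")
print(before, after)
```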

Distributed locks

The first article already introduced Zookeeper as a distributed coordination service, so we can use it to coordinate activities among multiple distributed processes. For example, in a distributed environment we often deploy the same service on every server in a cluster to improve reliability. However, if every server performs the same work, the servers must coordinate with one another, which makes programming very complicated. If we instead let only one instance do the work, that instance becomes a single point of failure. A common alternative is a distributed lock: only one instance is allowed to work at any given time, and when it fails, the lock is released and the work immediately fails over to another instance. Many distributed systems do this, and the design has a better-known name: Leader Election. The HBase Master, for example, uses this mechanism. Note, however, that distributed locks differ from locks within a single process, so they must be used with more caution.
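
The failover behaviour described above can be sketched with a small simulation. It loosely follows the well-known Zookeeper election recipe in which each candidate creates an ephemeral sequential node and the lowest sequence number wins, but everything here runs in memory: ElectionRoom and the server names are hypothetical, and no real Zookeeper calls are made.

```python
from typing import Dict, Optional


class ElectionRoom:
    """Toy in-memory model of an election path (not the Zookeeper API)."""

    def __init__(self) -> None:
        self._next_seq = 0
        self._candidates: Dict[int, str] = {}  # sequence number -> server name

    def join(self, server: str) -> int:
        """Model creating an ephemeral sequential node; return its sequence number."""
        seq = self._next_seq
        self._next_seq += 1
        self._candidates[seq] = server
        return seq

    def leave(self, seq: int) -> None:
        """Model the node vanishing when its owner's session ends."""
        self._candidates.pop(seq, None)

    def leader(self) -> Optional[str]:
        """The live candidate with the lowest sequence number holds the lock."""
        if not self._candidates:
            return None
        return self._candidates[min(self._candidates)]


room = ElectionRoom()
seats = {name: room.join(name) for name in ("server-a", "server-b", "server-c")}
first_leader = room.leader()

# server-a fails: its node disappears and leadership moves to the next in line.
room.leave(seats["server-a"])
second_leader = room.leader()
print(first_leader, second_leader)
```

In a real deployment the "leave" step happens automatically when a failed server's session expires, which is what makes the lock release without any action from the failed process.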

Cluster management

In a distributed cluster, nodes often come and go for various reasons, such as hardware failures, software failures, and network problems: new nodes join, and old nodes leave. Other machines in the cluster need to detect these changes and make decisions based on them. Consider a distributed storage system with a central control node responsible for storage allocation. When new data arrives to be stored, the control node must allocate storage nodes according to the current state of the cluster, so it needs to track that state dynamically. Similarly, in a distributed SOA architecture, services are provided by a cluster. When consumers access a service, they need some mechanism to discover which nodes can provide it. This is called service discovery; Alibaba's open source SOA framework Dubbo, for example, uses Zookeeper as its underlying service discovery mechanism. The open source message queue Kafka also uses Zookeeper to manage consumers coming online and going offline.
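
The storage example can be sketched as a controller that reacts to membership changes. This is a simplified in-memory model, not the Zookeeper API: Membership, Controller, the node names, and the modulo placement rule are all assumptions made for the example. The listener here stays registered permanently, which is simpler than Zookeeper's one-shot watches.

```python
from typing import Callable, List, Set

Listener = Callable[[List[str]], None]  # called with the sorted live nodes


class Membership:
    """Toy model of a group whose members join and leave over time."""

    def __init__(self) -> None:
        self._members: Set[str] = set()
        self._listeners: List[Listener] = []

    def watch(self, listener: Listener) -> None:
        """Register a listener and give it the current membership right away."""
        self._listeners.append(listener)
        listener(sorted(self._members))

    def join(self, node: str) -> None:
        self._members.add(node)
        self._notify()

    def leave(self, node: str) -> None:
        self._members.discard(node)
        self._notify()

    def _notify(self) -> None:
        snapshot = sorted(self._members)
        for listener in self._listeners:
            listener(list(snapshot))


class Controller:
    """Hypothetical control node that places data on the nodes currently alive."""

    def __init__(self, membership: Membership) -> None:
        self.live_nodes: List[str] = []
        membership.watch(self._on_change)

    def _on_change(self, nodes: List[str]) -> None:
        self.live_nodes = nodes

    def place(self, block_id: int) -> str:
        """Pick a storage node for a block with a simple modulo rule."""
        if not self.live_nodes:
            raise RuntimeError("no storage nodes available")
        return self.live_nodes[block_id % len(self.live_nodes)]


membership = Membership()
controller = Controller(membership)
for node in ("node-1", "node-2", "node-3"):
    membership.join(node)
placed_before = controller.place(4)

membership.leave("node-2")  # node-2 drops out of the cluster
placed_after = controller.place(4)
print(placed_before, placed_after)
```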

Postscript

This article listed some of the services Zookeeper can provide and gave examples from open source systems. Later articles will start with the installation and configuration of Zookeeper and then show how to use it with examples.
