1 ZooKeeper Introduction
ZooKeeper is a highly available distributed data-management and coordination framework. Based on its implementation of the Paxos algorithm, the framework guarantees strong data consistency in a distributed environment, and it is on this property that ZooKeeper's solutions to many distributed problems rest. There are many reports online about ZK application scenarios; this article classifies and introduces them systematically. It is worth noting that ZK was not originally designed for these scenarios; many developers later worked out these typical usages based on the characteristics of the framework and the API it provides.
1.1 Data Publish/Subscribe (Configuration Center)
The publish/subscribe model, also known as a configuration center: as the name suggests, a publisher writes data to a ZK node for subscribers to fetch dynamically, enabling centralized management and dynamic updating of configuration information such as global configuration and service address lists. It is very well suited to service frameworks.
- Some configuration information used by applications is centrally managed in ZK. The usual pattern is: on startup, the application actively fetches its configuration and at the same time registers a Watcher on the node, so that every subsequent configuration update is pushed to the subscribed client in real time, keeping it supplied with the latest configuration.
- Distributed search service: the index meta-information and the node state of the machines in the server cluster are stored on designated ZK nodes for each client to subscribe to.
- Distributed log-collection system: the core job of this system is to collect logs scattered across different machines. Collectors are usually assigned collection tasks per application, so ZK is used to create a node P named after the application, and every machine of that application registers its IP as a child node of P. When the machine set changes, the collectors are notified in real time and adjust the task assignment accordingly.
- Some information in a system needs to be fetched dynamically, and there is a requirement to modify it manually. This is usually done by exposing an interface, such as a JMX interface, to obtain runtime information. With ZK, there is no need to implement such a scheme yourself; simply store the information on a designated ZK node.
Note: the scenarios above share a default assumption: the amount of data is small, but the data may be updated frequently.
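The startup-fetch-plus-watch pattern above can be sketched with a minimal in-memory model. `ConfigCenter`, `watch`, and `set_data` are illustrative names, not the real ZooKeeper client API; the one-shot, re-register semantics of the watchers, however, mirror how ZK watches behave.

```python
# Minimal in-memory sketch of the ZK-style configuration-center pattern.
# Names (ConfigCenter, watch, set_data) are illustrative, not a real
# ZooKeeper client API; like real ZK watches, these fire only once.

class ConfigCenter:
    def __init__(self):
        self._nodes = {}      # path -> config value
        self._watchers = {}   # path -> list of one-shot callbacks

    def watch(self, path, callback):
        """Register a one-shot watcher; it must be re-registered after firing."""
        self._watchers.setdefault(path, []).append(callback)

    def get(self, path):
        return self._nodes.get(path)

    def set_data(self, path, value):
        """Publisher updates the node; all pending watchers fire once."""
        self._nodes[path] = value
        for cb in self._watchers.pop(path, []):
            cb(path, value)

# A client fetches the config on startup and re-registers its watcher
# each time it is notified, mirroring ZK's one-shot watch semantics.
cc = ConfigCenter()
seen = []

def on_change(path, value):
    seen.append(value)
    cc.watch(path, on_change)   # re-register, as ZK clients must

cc.set_data("/app/config", "v1")      # initial publish
print(cc.get("/app/config"))          # startup fetch -> v1
cc.watch("/app/config", on_change)
cc.set_data("/app/config", "v2")      # update notifies the subscriber
cc.set_data("/app/config", "v3")
print(seen)                           # ['v2', 'v3']
```

This mirrors the "small data, frequent updates" assumption: the node holds one value, and subscribers only react to change notifications.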
1.2 Load Balancing
The load balancing here refers to soft load balancing. In a distributed environment, to ensure high availability, multiple instances of the same application or the same service provider are usually deployed as peers. Consumers then need to choose one of these peer servers to execute the relevant business logic; a typical case is load balancing among message-middleware producers and consumers.
For load balancing between publishers and subscribers in message middleware, both LinkedIn's open-source KafkaMQ and Alibaba's open-source metaq use zookeeper to balance producers and consumers. Take metaq as an example:
- Producer load balancing: when metaq sends a message, the producer must choose a partition on some broker to send it to. During operation, metaq registers all brokers and all partition information on designated ZK nodes. The default policy is round-robin: after obtaining the partition list from ZK, the producer organizes it into an ordered list arranged by brokerId and partition, then iterates over it from head to tail, picking a partition for each message it sends.
- Consumer load balancing: during consumption, a consumer consumes messages from one or more partitions, but each partition is consumed by only one consumer. MetaQ's consumer policy is:
  - Within the same group, each partition is mounted to only one consumer.
  - If the number of consumers in a group is greater than the number of partitions, the surplus consumers do not participate in consumption.
  - If the number of consumers in a group is less than the number of partitions, some consumers take on additional consumption tasks.
When a consumer fails or restarts, the other consumers perceive the change (by watching the consumer list through zookeeper) and re-run load balancing, ensuring that every partition still has a consumer.
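The producer and consumer balancing rules described above can be modeled as pure functions over a partition list. This is a sketch of the logic only; the function names are invented and no real ZK or MetaQ API is involved.

```python
# Pure-logic sketch of metaq-style load balancing, per the rules above.
# Function names are illustrative; no real ZK/MetaQ API is used.
import itertools

def ordered_partitions(broker_partitions):
    """Sort (brokerId, partition) pairs into the ordered list producers iterate."""
    return sorted(broker_partitions)

def round_robin(partitions):
    """Producer side: cycle through partitions head-to-tail, one per message."""
    return itertools.cycle(partitions)

def assign_partitions(partitions, consumers):
    """Consumer side: each partition goes to exactly one consumer in the group;
    surplus consumers get nothing, scarce consumers take extra partitions."""
    consumers = sorted(consumers)
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(sorted(partitions)):
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment

parts = ordered_partitions([(2, 0), (1, 1), (1, 0)])
picker = round_robin(parts)
print([next(picker) for _ in range(4)])   # wraps around after 3 sends

# 3 partitions, 2 consumers: one consumer takes an extra partition
print(assign_partitions(parts, ["c2", "c1"]))
# 3 partitions, 4 consumers: the surplus consumer gets nothing
print(assign_partitions(parts, ["c1", "c2", "c3", "c4"]))
```

The rebalance on consumer failure amounts to re-running `assign_partitions` with the surviving consumer list.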
1.3 Naming Services (Naming Service)
Naming services are a relatively common kind of scenario in distributed systems. In a distributed system, by using a naming service a client application can obtain the address, provider, or other information of a resource or service by its name. The named entities can be machines in a cluster, addresses of provided services, remote objects, and so on; collectively we can refer to them as names (Name). A fairly common case is the service address list of a distributed service framework. By calling the node-creation API that ZK provides, it is easy to create a globally unique path, and this path can be used as a name.
Alibaba's open-source distributed service framework Dubbo uses ZooKeeper as its naming service, maintaining a global service address list (see the Dubbo open-source project). In Dubbo's implementation:
Service providers: at startup, a provider writes its own URL address under the designated ZK node /dubbo/${serviceName}/providers; completing this operation publishes the service.
Service consumers: at startup, a consumer subscribes to the provider URL addresses under /dubbo/${serviceName}/providers, and writes its own URL address under the /dubbo/${serviceName}/consumers directory.
Note: all addresses registered with ZK are ephemeral nodes, which guarantees that service providers and consumers automatically sense changes in resources. In addition, Dubbo also monitors at service granularity by subscribing to the information of all providers and consumers under the /dubbo/${serviceName} directory.
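The Dubbo node layout described above can be sketched with a small in-memory registry. `Registry`, `register`, `lookup`, and `session_closed` are illustrative names, not Dubbo or ZooKeeper APIs; `session_closed` imitates the ephemeral-node semantics that make providers disappear when they crash.

```python
# Minimal in-memory model of the Dubbo registration layout described above.
# Registry/register/lookup/session_closed are illustrative names only,
# not Dubbo or ZooKeeper APIs.

class Registry:
    def __init__(self):
        self._tree = {}   # path -> set of ephemeral children (URLs)

    def register(self, service, role, url):
        """role is 'providers' or 'consumers', matching the Dubbo node layout."""
        path = f"/dubbo/{service}/{role}"
        self._tree.setdefault(path, set()).add(url)
        return path

    def lookup(self, service, role="providers"):
        return sorted(self._tree.get(f"/dubbo/{service}/{role}", set()))

    def session_closed(self, url):
        """Ephemeral semantics: all of this client's registrations vanish."""
        for children in self._tree.values():
            children.discard(url)

reg = Registry()
reg.register("com.foo.BarService", "providers", "dubbo://10.0.0.1:20880")
reg.register("com.foo.BarService", "providers", "dubbo://10.0.0.2:20880")
reg.register("com.foo.BarService", "consumers", "consumer://10.0.0.9")

print(reg.lookup("com.foo.BarService"))          # both provider URLs
reg.session_closed("dubbo://10.0.0.1:20880")     # provider crashes
print(reg.lookup("com.foo.BarService"))          # only 10.0.0.2 remains
```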
1.4 Distributed notification / coordination
ZooKeeper has a unique watcher registration and asynchronous notification mechanism, which is a good fit for notification and coordination between different systems in a distributed environment and allows real-time handling of data changes. The usual approach is for the different systems to register on the same znode in ZK and monitor changes to that znode (including changes to the znode's own content and to its children); when one system updates the znode, the other systems are notified and take the appropriate action.
- Another form of heartbeat detection: the detecting system and the detected system are not directly associated but are linked through a shared node in zk, which greatly reduces system coupling.
- Another form of system scheduling: a system consists of a console and a push subsystem, where actions on the console correspond to work performed by the push subsystem. Some of the operations an administrator performs on the console in fact change the state of certain ZK nodes, and ZK notifies the clients that registered Watchers on those nodes, namely the push subsystem, which then carries out the corresponding push tasks.
- Another form of work reporting: similar to a task-distribution system. After a subtask starts, it registers a temporary node in zk and periodically reports its progress (writing the progress back to that temporary node), so the task manager can learn the task progress in real time.
In short, using zookeeper for distributed notification and coordination can greatly reduce the coupling between systems.
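The work-report mode above can be sketched as follows; `TaskBoard` and its methods are illustrative stand-ins for writing progress into per-subtask temporary nodes and reading it back, not any real ZK API.

```python
# Sketch of the work-report mode: each subtask owns a temporary node and
# writes its progress there; the manager reads the children for an overview.
# TaskBoard/report/overview are invented names, not a ZooKeeper API.

class TaskBoard:
    def __init__(self):
        self.progress = {}   # temp node per subtask -> percent complete

    def report(self, task_id, percent):
        """A subtask writes its progress back to its own temporary node."""
        self.progress[f"/tasks/{task_id}"] = percent

    def overview(self):
        """The task manager reads all children to see progress in real time."""
        return dict(sorted(self.progress.items()))

board = TaskBoard()
board.report("sub-1", 40)
board.report("sub-2", 100)
board.report("sub-1", 75)    # a later report overwrites earlier progress
print(board.overview())
```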
1.5 Cluster Management and Master election
- Cluster machine monitoring: this is usually applied in scenarios with high requirements on the state of machines in a cluster and their online rate, where changes in the cluster's machines must be responded to quickly. In such a scenario there is often a monitoring system that detects in real time whether cluster machines are alive. The traditional practice is for the monitoring system to probe every machine periodically by some means (such as ping), or for each machine to report "I'm alive" to the monitoring system periodically. This approach works, but has two obvious problems:
  - When the machines in the cluster change, many related things have to be modified.
  - There is a certain delay.
Using two properties of ZooKeeper, you can implement a different kind of real-time cluster-machine liveness monitoring system:
- A client registers a Watcher on node x; if the children of x change, the client is notified.
- A client creates a node of EPHEMERAL type; once the session between the client and the server ends or expires, the node disappears.
For example, the monitoring system registers a Watcher on /clusterServers, and every time a machine is added dynamically, that machine creates a node of EPHEMERAL type under /clusterServers: /clusterServers/{hostname}. In this way, the monitoring system learns of machine additions and removals in real time; what it does with that information is the monitoring system's own business.
- Master election is the most classic zookeeper scenario.
In a distributed environment, the same business application is deployed on multiple machines, and some business logic (such as a time-consuming computation or network I/O operation) often only needs to be executed by one machine in the whole cluster, with the rest of the machines sharing the result. This greatly reduces duplicated effort and improves performance, so master election is the main problem encountered in this scenario.
Using ZooKeeper's strong consistency, node creation can be guaranteed to be globally unique even under highly concurrent distributed access: when multiple clients request to create the /currentMaster node, only one of them ultimately succeeds. Using this feature, it is very easy to carry out a cluster-wide election in a distributed environment.
An evolution of this scenario is dynamic Master election, which uses nodes of the EPHEMERAL_SEQUENTIAL type.
As mentioned above, of all the client create requests only one can succeed. With a slight variation, all requests are allowed to create nodes, but in creation order, so the nodes ZK ends up creating might look like: /currentMaster/{sessionId}-1, /currentMaster/{sessionId}-2, /currentMaster/{sessionId}-3, and so on.
Each time, the machine holding the smallest sequence number is selected as Master. If that machine goes down, the node it created disappears immediately, and the machine with the next smallest number becomes the new Master.
- In a search system, if every machine in the cluster generated the full index, it would be time-consuming and it would be hard to guarantee index-data consistency across machines. So the cluster's Master is made to generate the full index and then sync it to the other machines in the cluster. Master election also provides a disaster-recovery measure: you can always specify the master manually, that is, when the application cannot obtain master information from zk, it can fetch the master from somewhere else, for example via http.
- In HBase, ZooKeeper is also used to implement dynamic HMaster election. In the HBase implementation, the ROOT table's address and the HMaster's address are stored in ZK, and each HRegionServer registers itself in Zookeeper as a temporary (Ephemeral) node, so that the HMaster can perceive the liveness of every HRegionServer at any time. At the same time, if the HMaster fails, a new HMaster is elected to run, avoiding an HMaster single point of failure.
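The dynamic Master election described above (smallest EPHEMERAL_SEQUENTIAL child wins, successor takes over when the holder's session dies) can be modeled in pure logic. `Election` and its methods are invented names; a real implementation would use a ZK client with watches.

```python
# Pure-logic sketch of dynamic Master election: each client creates an
# EPHEMERAL_SEQUENTIAL child of /currentMaster and the smallest sequence
# number wins. Election/join/master are invented names, not a ZK API.

class Election:
    def __init__(self):
        self._seq = 0
        self.nodes = {}   # node name -> session id

    def join(self, session_id):
        """Mimics create(EPHEMERAL_SEQUENTIAL): every request succeeds,
        each getting the next sequence number."""
        self._seq += 1
        name = f"/currentMaster/{session_id}-{self._seq}"
        self.nodes[name] = session_id
        return name

    def master(self):
        """The session owning the lowest-sequence node is the master."""
        if not self.nodes:
            return None
        lowest = min(self.nodes, key=lambda n: int(n.rsplit("-", 1)[1]))
        return self.nodes[lowest]

    def session_expired(self, session_id):
        """Ephemeral semantics: a dead session's nodes disappear at once."""
        self.nodes = {n: s for n, s in self.nodes.items() if s != session_id}

e = Election()
for sid in ("s-a", "s-b", "s-c"):
    e.join(sid)
print(e.master())          # s-a holds the smallest sequence number
e.session_expired("s-a")   # master dies; its node vanishes
print(e.master())          # s-b takes over automatically
```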
1.6 Distributed Lock
Distributed locks mainly rely on ZooKeeper's guarantee of strong data consistency. Lock services can be divided into two categories: one maintains exclusivity, the other controls ordering.
- Maintaining exclusivity means that, of all the clients trying to acquire the lock, only one can ultimately obtain it. The usual practice is to treat a znode on zk as the lock and implement acquisition via create: all clients try to create the /distribute_lock node, and the client whose create ultimately succeeds owns the lock.
- Controlling ordering means that all the clients trying to acquire the lock will eventually be scheduled to execute, but in a global order. The basic approach is similar to the above, except that /distribute_lock already exists in advance and clients create temporary sequential nodes under it (controlled by a node attribute: specifying CreateMode.EPHEMERAL_SEQUENTIAL). The parent node (/distribute_lock) in Zk maintains a sequence guaranteeing the temporal order of child-node creation, thereby forming a global ordering over all clients.
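The exclusive-lock variant above can be sketched as a create-if-absent check; `LockService`, `try_lock`, and `unlock` are illustrative in-memory stand-ins for creating and deleting the /distribute_lock znode, not a real ZK client API.

```python
# Sketch of the exclusive lock: whoever "creates /distribute_lock" first
# holds the lock; everyone else's create fails. In-memory model only;
# LockService/try_lock/unlock are invented names.

class LockService:
    def __init__(self):
        self.owner = None   # stands in for the /distribute_lock znode

    def try_lock(self, client):
        """Mimics ZK create(): succeeds only if the znode does not exist."""
        if self.owner is None:
            self.owner = client
            return True
        return False

    def unlock(self, client):
        """Deleting the znode releases the lock."""
        if self.owner == client:
            self.owner = None

svc = LockService()
print(svc.try_lock("client-1"))   # True: node created, lock held
print(svc.try_lock("client-2"))   # False: node already exists
svc.unlock("client-1")
print(svc.try_lock("client-2"))   # True: lock re-acquired after release
```

The ordering variant would instead hand out sequential nodes, with each client waiting until its number is the smallest, as in the master-election sketch above.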
1.7 Distributed Queue
Simply put, there are two kinds of queues: one is the conventional first-in-first-out queue; the other waits until all queue members have gathered and then executes them in order. The first kind, a FIFO queue with ordering control, works on the same basic principle as the ordering-control distributed lock service above, so it is not repeated here.
The second kind of queue is an enhancement built on the FIFO queue. Usually a /queue/num node is pre-created under the /queue znode and assigned the value n (or n is assigned to /queue directly) to indicate the queue size; then, every time a member joins the queue, a check determines whether the queue size has been reached and thus whether execution can start. A typical scenario for this usage: in a distributed environment, a large task, Task A, can only proceed once many subtasks are completed (or their conditions are ready). At that point, whenever one of the subtasks completes (becomes ready), it creates its own temporary sequential node (CreateMode.EPHEMERAL_SEQUENTIAL) under /taskList; when the number of child nodes under /taskList is found to have reached the specified count, the next step of the process can proceed in order.
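The gather-then-execute queue can be modeled like this; `GatherQueue` is an invented name, with `n` standing in for the value stored at /queue/num and `members` for the sequential children under /taskList.

```python
# Sketch of the gather-then-execute queue: subtasks add sequential children
# under /taskList and execution starts only once the count configured at
# /queue/num is reached. Pure-logic model; GatherQueue is an invented name.

class GatherQueue:
    def __init__(self, n):
        self.n = n          # stands in for the value stored at /queue/num
        self._seq = 0
        self.members = []   # sequential child nodes under /taskList

    def join(self, name):
        """A subtask registers its temporary sequential node on completion."""
        self._seq += 1
        self.members.append(f"/taskList/{name}-{self._seq:010d}")
        return self.ready()

    def ready(self):
        """Task A may proceed once all expected members have gathered."""
        return len(self.members) >= self.n

q = GatherQueue(3)
print(q.join("sub-a"))   # False: 1 of 3 gathered
print(q.join("sub-b"))   # False: 2 of 3
print(q.join("sub-c"))   # True: all members present, execution may start
print(sorted(q.members)) # zero-padded sequences sort into creation order
```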