Eight Typical ZooKeeper Application Scenarios

1. Data Publish/Subscribe (Configuration Center)

1.1 What is a configuration center and what is it for

A data publish/subscribe (pub/sub) system, the so-called configuration center, is one in which publishers publish data to one or a series of ZooKeeper nodes for subscribers to consume. It enables dynamic data acquisition, dynamic updates of configuration information, and centralized data management.

1.2 Design patterns for a configuration center

Publish/subscribe systems generally follow one of two design patterns: push or pull. In push mode, the server actively sends data updates to all subscribed clients; in pull mode, clients initiate requests for the latest data, typically by polling on a timer. ZooKeeper combines push and pull: a client registers a watch on the node it cares about, and once that node's data changes, the server sends a Watcher event notification to the client; upon receiving the notification, the client then actively fetches the latest data from the server.

1.3 How a configuration center works

If configuration information is stored centrally on ZooKeeper, an application typically fetches it from the ZooKeeper server at startup and, at the same time, registers a Watcher on the relevant node. Whenever the configuration changes, the server notifies all subscribed clients in real time, so each client can retrieve the latest configuration.

In everyday application development we often need shared system configuration: machine list information, runtime switch settings, database connection details, and so on. Such global configuration typically has the following three characteristics.

1.4 Characteristics of global configuration information

  • The amount of data is usually relatively small.
  • The content changes dynamically at runtime.
  • Every machine in the cluster shares the same configuration.

For this kind of configuration, the usual practice is to store it in a local configuration file or in an in-memory variable.

1.5 Why use a configuration center

Either approach can implement simple configuration management. With a local configuration file, the system typically reads the file from disk at startup to initialize itself, then re-reads it periodically to detect changes to its contents. When we need to update the configuration, we simply modify the file on the running system; the next time the system re-reads the file, it picks up the latest settings and applies them. The in-memory-variable approach is equally simple: in a Java system, for example, JMX is commonly used to update in-memory variables at runtime.

When the cluster is small and configuration changes are infrequent, either of these approaches solves the configuration-management problem easily. But once the cluster grows and configuration changes become frequent, both approaches become increasingly hard to rely on. We want not only to change global configuration quickly, but also to keep the cost of changes low; for that, we must look for a more distributed solution.

1.6 Implementing a configuration center

Step 1: Store the configuration
Before we can manage configuration, we first need to initialize it onto ZooKeeper. In general, we choose a ZooKeeper node to store the configuration data, e.g. /app/database_config (the "configuration node").

Step 2: Fetch the configuration
Each machine in the cluster reads the database configuration from the node above at startup. At the same time, each client registers a Watcher for data changes on the configuration node, so that whenever the node's data changes, all subscribed clients receive a change notification.

Step 3: Change the configuration
While the system is running, we may need to switch databases, which requires a configuration change. With ZooKeeper, we only need to update the content of the configuration node; ZooKeeper then delivers a data-change notification to every client, and each client, on receiving it, re-fetches the latest data.
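The three steps above can be sketched in miniature. The snippet below is an in-memory stand-in for a ZooKeeper configuration node, not the real client API; the class names and the config values are made up for illustration. It only shows the push-pull pattern: a one-shot watcher "pushes" a notification, and each client then "pulls" the new value (real ZooKeeper watches are likewise one-shot, so re-registering after each event mirrors actual usage).

```python
# A minimal in-memory sketch of ZooKeeper's push-pull configuration pattern.
# The node name /app/database_config and the values are illustrative.

class FakeConfigNode:
    """Stands in for a znode with one-shot watchers."""
    def __init__(self, data):
        self.data = data
        self._watchers = []

    def get(self, watcher=None):
        # Like getData(): optionally register a one-shot watcher.
        if watcher is not None:
            self._watchers.append(watcher)
        return self.data

    def set(self, data):
        # Like setData(): update, then "push" an event to each watcher.
        self.data = data
        watchers, self._watchers = self._watchers, []
        for w in watchers:
            w()  # the event carries no data; clients must pull


class Client:
    def __init__(self, node):
        self.node = node
        self.config = None
        self._subscribe()

    def _subscribe(self):
        # Pull the current value and re-register the watch.
        self.config = self.node.get(watcher=self._on_change)

    def _on_change(self):
        self._subscribe()


node = FakeConfigNode("jdbc://db-primary")       # step 1: store
clients = [Client(node) for _ in range(3)]       # step 2: fetch + watch
node.set("jdbc://db-standby")                    # step 3: change
print([c.config for c in clients])               # every client re-fetched the new value
```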

2. Load Balancing

2.1 What is load balancing

Load balancing (Load Balance) is a common computer-networking technique for distributing load across multiple computers (a cluster), network links, CPUs, disks, or other resources, with the goals of optimizing resource utilization, maximizing throughput, minimizing response time, and avoiding overload. Load balancing is generally divided into hardware and software varieties; this section focuses on ZooKeeper's role in "soft" load-balancing scenarios.

In a distributed system, load balancing is a ubiquitous technique; essentially every distributed system needs it. Distributed systems are peer-to-peer in nature: to ensure high availability, data and services are usually deployed as replicas. Consumers then need a way to choose one of these peer services to execute the relevant business logic, the classic example being DNS.

2.2 How ZooKeeper implements load balancing

ZooKeeper can implement load balancing with a dynamic-DNS scheme. In practice, though, ordinary business systems rarely use ZooKeeper for load balancing; dedicated tools such as Nginx and Ribbon already do this job well.

3. Naming Service

3.1 What is a naming service

A naming service (Name Service) is a fairly common scenario in distributed systems and one of their most basic shared services. It helps application systems locate and use resources by name rather than by resource reference.
In a distributed system, the named entities are usually cluster machines, service addresses, or remote objects; we refer to them collectively by name (Name). A common example is the service address list in distributed service frameworks such as RPC or RMI.

3.2 What is a naming service for

With a naming service, a client application can obtain information such as a resource's entity, a service address, or a provider's details from a specified name. A classic naming service in the Java language is JNDI. JNDI is short for Java Naming and Directory Interface, one of the important specifications of the J2EE platform, and standard J2EE containers provide an implementation of it. In actual development, developers therefore often use the application server's built-in JNDI implementation to configure and manage data sources; with JNDI, a developer need not be concerned with any database details such as its type, JDBC driver, or account credentials.

The naming service provided by ZooKeeper is somewhat similar to JNDI: both help application systems locate and use resources by name. In the broader sense, though, what a naming service locates need not be a physical resource at all; in a distributed environment, the upper-layer application may only need a globally unique name, much like a unique primary key in a database.

3.3 Implementing distributed unique IDs with ZooKeeper

A so-called ID is an identifier that uniquely identifies an object. In the relational databases we know well, every table needs a primary key to uniquely identify each record; the primary key is exactly such a unique ID. In the old single-application, single-database world, the database's auto_increment attribute was generally used to generate a unique ID for each record automatically, and the database guaranteed that the generated IDs were globally unique.
But as data volumes grew and databases were split into multiple shards and tables, auto_increment could only generate unique IDs within a single table, so we could no longer rely on it to uniquely identify a record. We therefore need a method that can generate globally unique IDs in a distributed environment.
Speaking of globally unique IDs, UUIDs inevitably come to mind. UUID is short for Universally Unique Identifier, a standard widely used in distributed systems to identify elements; its best-known implementation is the GUID (Globally Unique Identifier), and mainstream ORM frameworks such as Hibernate support UUIDs directly. UUID is indeed an excellent way to generate globally unique IDs and makes uniqueness in a distributed environment easy to guarantee. A standard UUID is a string of 32 characters plus four hyphens, e.g. "e70f1357-f260-46ff-a32d-53a086c57ade". UUIDs have drawbacks, however: they are long and carry no meaning.

ZooKeeper can implement distributed unique IDs simply by creating sequential nodes.
In ZooKeeper, every data node can maintain a sequence counter for its children: when a client creates a sequential child node, ZooKeeper automatically appends an increasing number as a suffix to the node's name. This scenario exploits exactly that feature.
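As a rough illustration of how sequential-node suffixes yield unique IDs, here is a minimal in-memory sketch rather than the real ZooKeeper client; the prefix /ids/id- is made up, while the 10-digit zero-padded suffix mirrors ZooKeeper's sequential-node naming convention.

```python
# ID generation via ZooKeeper-style sequential nodes, simulated in memory.

class FakeSequentialParent:
    """Stands in for a znode whose children are created with a SEQUENTIAL flag."""
    def __init__(self):
        self._counter = 0

    def create_sequential(self, prefix):
        # ZooKeeper appends a monotonically increasing, zero-padded counter.
        name = f"{prefix}{self._counter:010d}"
        self._counter += 1
        return name


parent = FakeSequentialParent()
ids = [parent.create_sequential("/ids/id-") for _ in range(3)]
print(ids)  # ['/ids/id-0000000000', '/ids/id-0000000001', '/ids/id-0000000002']
```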

4. Distributed Coordination/Notification

4.1 What is distributed coordination/notification

A distributed coordination/notification service is an indispensable part of a distributed system; it is the glue that binds different distributed components together.

4.2 What distributed coordination/notification is for

An application deployed across multiple machines usually needs a coordinator (Coordinator) to control the overall flow of the system, for example to coordinate distributed transaction processing among machines. Introducing a coordinator also makes it easy to separate coordination duties from the application itself, which greatly reduces coupling between systems and significantly improves scalability.

ZooKeeper's distinctive Watcher registration and asynchronous notification mechanism is well suited to coordination and notification between different machines, or even different systems, in a distributed environment, enabling real-time handling of data changes. The usual approach is for different clients to register Watchers on the same ZooKeeper node, monitoring changes to that data node (including the node's own data and its children). When the node changes, all subscribed clients receive the corresponding Watcher notification and react accordingly.

4.3 Inter-machine communication in distributed systems

In most distributed systems, communication between machines boils down to three kinds: heartbeats, progress reports, and system scheduling. Next we look at how each of the three can be implemented with ZooKeeper.

4.3.1 Heartbeat detection

Heartbeat detection means that, in a distributed environment, machines need to detect whether other machines are running normally; for example, machine A needs to know whether machine B is up. Traditionally this is done by PINGing between hosts, or, slightly more elaborately, by establishing long-lived connections between machines and using TCP's built-in heartbeat mechanism. These are all very common heartbeat techniques.

How does ZooKeeper implement heartbeat detection between machines?

Based on ZooKeeper's ephemeral-node feature, different machines each create an ephemeral child node under a designated node, and whether a machine is alive can then be judged by whether its ephemeral node still exists. This way, the detecting system and the detected system need no direct connection to each other; they are associated only through a node on ZooKeeper, which greatly reduces coupling.

4.3.2 Progress reporting

In a typical task-distribution system, tasks are distributed to different machines for execution, and each machine must report the progress of its tasks back to the distribution system in real time. This can be done with ZooKeeper: choose a node, have each task client create an ephemeral child node under it, and we gain two capabilities:

  • Whether a task machine is alive can be judged by whether its ephemeral node exists.
  • Each machine writes its task progress into its ephemeral node in real time, so the central system can obtain each task's execution progress as it happens.

4.3.3 System scheduling

With ZooKeeper we can also implement another mode of system scheduling. Consider a distributed system consisting of a console and a number of client systems: the console needs to send command information to all clients to control their business logic. Some of the operations an administrator performs on the console are in fact modifications to the data of certain ZooKeeper nodes; ZooKeeper then delivers these data changes, in the form of event notifications, to the corresponding subscribed clients.

What are the benefits of using ZooKeeper for inter-machine communication between distributed systems?

It not only saves a great deal of repetitive work designing bottom-layer network communication protocols; more importantly, it greatly reduces coupling between systems and makes flexible communication between heterogeneous systems easy.

5. Cluster Management

5.1 What is cluster management

Cluster management comprises two parts: cluster monitoring and cluster control. The former focuses on collecting the cluster's runtime state; the latter on operating on and controlling the cluster. In daily operations and development we often have requirements like the following.

  • We want to know how many machines in the cluster are currently working.
  • We want to collect runtime state data from every machine in the cluster.
  • We want to bring cluster machines online and take them offline.
    In traditional Agent-based distributed cluster management, an Agent is deployed on every machine in the cluster, and this Agent is responsible for actively reporting its machine's state to a central monitoring system (designated by the system's control center, it aggregates all the data, produces reports, and handles real-time alerting; hereafter the "monitoring center"). For moderately sized clusters this is indeed a widely used solution in production practice, providing quick and efficient monitoring of a distributed cluster. But once business scenarios multiply and the cluster grows large, the drawbacks of this scheme surface.

5.2 Drawbacks of Agent-based cluster management

5.2.1 Large-scale upgrades are difficult

Because the client exists in the form of an Agent, once it is deployed at scale, any large-scale upgrade becomes very troublesome, posing enormous challenges for controlling upgrade cost and schedule.

5.2.2 A unified Agent cannot meet diverse needs

For basic physical machine state such as CPU usage, load (Load), memory usage, network throughput, and disk capacity, a unified Agent may suffice. But if we need to go deep inside an application and monitor business state, for example each consumer's consumption status in a distributed message-oriented middleware, or the execution of tasks on each machine in a distributed task-scheduling system, then a unified Agent is clearly not suitable for such tightly business-coupled monitoring requirements.

5.2.3 Programming-language diversity

As more and more programming languages appear, heterogeneous systems keep multiplying. With the traditional Agent approach, Agent clients must be provided in many languages. On the other hand, the "monitoring center" faces enormous challenges in integrating data from heterogeneous systems.

5.3 Two key ZooKeeper characteristics

  • If a client registers a Watcher on a ZooKeeper data node, then when that node's content or its child list changes, the ZooKeeper server sends a change notification to the subscribed client.
  • Once the session between a client and the server fails, the ephemeral nodes created by that client on ZooKeeper are removed automatically.

Using these two characteristics, we can implement another style of cluster machine liveness monitoring. For example, the monitoring system registers a Watcher on the node /clusterServers; then whenever a machine is dynamically added, it creates an ephemeral child node under /clusterServers: /clusterServers/[Hostname]. In this way the monitoring system detects machine changes in real time; what it does with that information is up to its own business logic.
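The combination of these two characteristics can be sketched with an in-memory stand-in for ZooKeeper (not the real client API); the path /clusterServers follows the example in the text, and the host and session names are made up. Ephemeral children disappear when their owning session closes, which is exactly what the monitoring system relies on.

```python
# Ephemeral-node liveness monitoring, simulated in memory.

class FakeZk:
    """Minimal stand-in: ephemeral children tied to client sessions."""
    def __init__(self):
        self._children = {}   # path -> owning session id

    def create_ephemeral(self, parent, name, session_id):
        self._children[f"{parent}/{name}"] = session_id

    def close_session(self, session_id):
        # ZooKeeper removes all ephemeral nodes of an expired session.
        self._children = {p: s for p, s in self._children.items()
                          if s != session_id}

    def get_children(self, parent):
        prefix = parent + "/"
        return sorted(p[len(prefix):] for p in self._children if p.startswith(prefix))


zk = FakeZk()
zk.create_ephemeral("/clusterServers", "host-1", session_id=1)
zk.create_ephemeral("/clusterServers", "host-2", session_id=2)
print(zk.get_children("/clusterServers"))  # ['host-1', 'host-2']

zk.close_session(1)                        # host-1 crashes or disconnects
print(zk.get_children("/clusterServers"))  # ['host-2']
```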

6. Master Election

6.1 Why Master election matters

Master election is a very common scenario in distributed systems. A core characteristic of a distributed system is that system units with independent computing capability can be deployed on different machines and together constitute a complete system. In practice we often need to pick a "leader" from these standalone units scattered across different machines; in computer science we call it the Master. In a distributed system, the Master typically coordinates the other units in the cluster and has the authority to decide on changes to the system's state. For example, in some read/write-splitting scenarios, clients' write requests are usually handled by the Master; in other scenarios, the Master executes some complex logic and synchronizes the results to the other cluster units. Master election can fairly be called ZooKeeper's most classic application scenario.

In a distributed environment we often face this situation: all system units in the cluster serve the same data to the front end, for example a product ID, or the ID of a website's carousel advertisement (common in ad systems). Obtaining such an ID often requires a computation over massive amounts of data, usually an extremely I/O- and CPU-intensive process. Given the computation's cost, having every machine in the cluster run it would waste enormous resources. A better approach is to let only some of the cluster, or even just one machine, perform the computation; once the result is ready it can be shared with every other client machine in the cluster, greatly reducing duplicated work and improving performance.

6.2 Implementing Master election the ordinary way

First, let us state the Master-election requirement: elect one machine as Master from all the machines in a cluster.
For this requirement, we could ordinarily use the primary-key feature of a relational database: all machines in the cluster insert a record with the same primary key ID into the database; the database performs the primary-key conflict check for us, so among all the inserting client machines only one insert can succeed, and we consider the machine whose insert succeeded to have become the Master.

6.2.1 Drawbacks of the ordinary approach

At first glance this scheme works: the primary-key constraint of a relational database guarantees that only one Master is elected in the cluster. But there is another issue to consider: if the elected Master dies, what then? Who tells us the Master is down? A relational database obviously cannot notify us of that event.

6.3 Implementing Master election with ZooKeeper

ZooKeeper's strong consistency guarantees the global uniqueness of node creation even under high concurrency in a distributed setting: ZooKeeper ensures that clients cannot create a data node that already exists. In other words, if multiple client requests try to create the same node at the same time, only one of them can succeed. Using this feature, Master election in a distributed environment becomes easy.

For example:
the machines in a client cluster each try, on a daily schedule, to create an ephemeral node on ZooKeeper, for example /master_election/2020-1-26/binding. In this process only one client can succeed in creating the node, and that client's machine becomes the Master. Meanwhile, each client that failed to create the node registers a child-node-change Watcher on the node /master_election/2020-1-26 to monitor whether the current Master machine is alive; once the current Master dies, the remaining clients re-elect a Master.
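The election race can be sketched with create-if-absent semantics. The stand-in class below is illustrative, not the real ZooKeeper client (which would raise NodeExistsException instead of returning False); the path follows the example in the text and the host names are made up.

```python
# Master election via create-if-absent, simulated in memory.

class FakeZk:
    """Only the first create() of a given path succeeds, as in ZooKeeper."""
    def __init__(self):
        self._nodes = {}

    def create(self, path, data):
        if path in self._nodes:
            return False            # NodeExistsException in the real client
        self._nodes[path] = data
        return True

    def delete(self, path):
        self._nodes.pop(path, None)


zk = FakeZk()
path = "/master_election/2020-1-26/binding"
candidates = ["host-1", "host-2", "host-3"]

# Every candidate races to create the same node; exactly one wins.
masters = [h for h in candidates if zk.create(path, h)]
print(masters)  # ['host-1']

zk.delete(path)  # the Master dies: its ephemeral node disappears
masters = [h for h in candidates[1:] if zk.create(path, h)]  # survivors race again
print(masters)  # ['host-2']
```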

7. Distributed Locks

7.1 What is a distributed lock

A distributed lock is a way to control synchronized access to shared resources between distributed systems. If different systems, or different hosts of one system, share a resource or a set of resources, then when accessing those resources we often need mutual exclusion to prevent interference and guarantee consistency. In such cases a distributed lock is needed.

In everyday project development we tend to pay little attention to distributed locks, relying instead on the inherent exclusivity of relational databases to achieve mutual exclusion between processes. That is indeed a simple and widely used distributed-lock implementation. It is, however, an indisputable fact that most performance bottlenecks in today's large distributed systems are concentrated on database operations. If the upper-layer business adds extra locks to the database, such as row locks, table locks, or even heavyweight transaction processing, the database becomes even more overwhelmed. Hence distributed locks are generally implemented without the database.

7.2 Implementing distributed locks with ZooKeeper

7.2.1 Exclusive locks

7.2.1.1 What is an exclusive lock

An exclusive lock (Exclusive Lock, or X lock), also known as a write lock, is a basic lock type.
If transaction T holds an exclusive lock on data object O, then for the duration of the lock only T may read and update O, and no other transaction may perform any operation on O until T releases the lock. The core of an exclusive lock is ensuring that exactly one transaction holds the lock at any time, and that when the lock is released, all transactions waiting to acquire it are notified.

7.2.1.2 Implementing an exclusive lock with ZooKeeper

Defining the lock
In everyday Java programming there are two common ways to define a lock: synchronized and the ReentrantLock introduced in JDK 5. ZooKeeper has no such directly usable API; instead, a data node on ZooKeeper represents the lock. For example, the node /exclusive_lock/lock can be defined as a lock.

Acquiring the lock
When an exclusive lock is needed, all clients call the create() interface, attempting to create the ephemeral child node /exclusive_lock/lock under /exclusive_lock. As introduced in earlier sections, ZooKeeper guarantees that among all clients only one can create it successfully; that client is considered to have acquired the lock. Meanwhile, every client that did not acquire the lock registers a child-node-change Watcher on the /exclusive_lock node to monitor the lock node's changes in real time.

Releasing the lock
As mentioned under "Defining the lock", /exclusive_lock/lock is an ephemeral node, so the lock may be released in either of two situations.

  • The machine of the client currently holding the lock goes down, in which case ZooKeeper removes the ephemeral node.
  • After finishing its business logic normally, the client actively deletes the ephemeral node it created.
    Whichever way the lock node is removed, ZooKeeper notifies every client that registered a child-node-change Watcher on the /exclusive_lock node. On receiving the notification, these clients initiate distributed-lock acquisition again, i.e. repeat the "acquiring the lock" step. This is the entire acquisition and release flow of the exclusive lock.
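The acquire/watch/retry loop can be sketched with an in-memory stand-in (not the real client API); the client names are made up, and waiter notification is simulated synchronously, whereas real Watcher events arrive asynchronously.

```python
# The exclusive-lock flow (create /exclusive_lock/lock, watch, retry), simulated.

class FakeExclusiveLock:
    def __init__(self):
        self._holder = None
        self._waiters = []

    def try_acquire(self, client):
        # Like create("/exclusive_lock/lock", EPHEMERAL): one winner.
        if self._holder is None:
            self._holder = client
            return True
        self._waiters.append(client)  # register a child-change Watcher
        return False

    def release(self):
        # Delete the ephemeral node (or the holder's session expires);
        # every watcher is notified and retries acquisition.
        self._holder = None
        waiters, self._waiters = self._waiters, []
        return [c for c in waiters if self.try_acquire(c)]


lock = FakeExclusiveLock()
order = []
for c in ["A", "B", "C"]:
    if lock.try_acquire(c):
        order.append(c)           # only A wins the first race
while lock._holder is not None:
    order.extend(lock.release())  # each release wakes the waiters; one wins
print(order)  # ['A', 'B', 'C']
```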

7.2.2 Shared locks

7.2.2.1 What is a shared lock

A shared lock (Shared Lock, or S lock), also known as a read lock, is likewise a basic lock type. If transaction T holds a shared lock on data object O1, then the current transaction may only read O1, and other transactions may only add shared locks to it, until all shared locks on the object have been released. The most fundamental difference between shared and exclusive locks is that under an exclusive lock the data object is visible to only one transaction, whereas under a shared lock the data is visible to all transactions.

7.2.2.2 Implementing a shared lock with ZooKeeper

Defining the lock

As with the exclusive lock, a data node on ZooKeeper represents the lock; here it is an ephemeral sequential node of the form "/shared_lock/[Hostname]-request type-sequence number", for example /shared_lock/192.168.0.1-R-0000000001. Such a node, then, represents a shared lock.

Acquiring the lock
When a shared lock is needed, every client creates an ephemeral sequential node under the /shared_lock node: for a read request, a node such as /shared_lock/192.168.0.1-R-0000000001; for a write request, a node such as /shared_lock/192.168.0.1-W-0000000001.

Determining read/write order
By the definition of a shared lock, different transactions may all read the same data object concurrently, while an update may proceed only when no transaction is currently reading or writing it. Based on this principle, the distributed read/write order is determined via ZooKeeper nodes in roughly the following four steps.

  1. After creating its node, fetch all children of the /shared_lock node and register a child-node-change Watcher on it.
  2. Determine the position of your own node's sequence number among all the children.
  3. For a read request:
    if no child has a smaller sequence number, or every child with a smaller sequence number is a read request, you have successfully acquired the shared lock and may begin the read logic. If any child with a smaller sequence number is a write request, you must wait.
    For a write request:
    if your node is not the one with the smallest sequence number, you must wait.
  4. On receiving the Watcher notification, repeat step 1.
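Step 3's decision rule can be written down directly. The helper below is a sketch assuming the child names follow the [Hostname]-type-sequence format from the text and are already sorted by sequence number; the sample addresses are illustrative.

```python
# Step 3 of the shared lock: may this request proceed, given the child list?

def can_acquire(children, my_node):
    """children: child names sorted by sequence number; my_node is one of them."""
    idx = children.index(my_node)
    kind = my_node.split("-")[1]           # 'R' or 'W'
    earlier = children[:idx]
    if kind == "R":
        # A reader proceeds unless an earlier writer exists.
        return all(c.split("-")[1] == "R" for c in earlier)
    # A writer proceeds only when first in line.
    return idx == 0


children = [
    "192.168.0.1-R-0000000001",
    "192.168.0.2-R-0000000002",
    "192.168.0.3-W-0000000003",
    "192.168.0.4-R-0000000004",
]
print([can_acquire(children, c) for c in children])  # [True, True, False, False]
```

Both leading readers may read concurrently; the writer waits for them, and the trailing reader waits for the writer.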

Releasing the lock

Same as for the exclusive lock.

This is the complete shared-lock flow.

7.2.2.3 The problem with shared locks: the herd effect

The shared-lock implementation above can, broadly speaking, satisfy ordinary distributed-cluster lock-contention needs, and its performance is acceptable. "Ordinary" here means a cluster of modest size, generally within about ten machines.

7.2.2.3.1 The main steps of a shared lock in actual operation
  1. Machine 192.168.0.1 performs its read first; when done, it deletes its node /192.168.0.1-R-0000000001.
  2. The remaining four machines all receive the notification that this node was removed, and each fetches a fresh child list from the /shared_lock node.
  3. Each machine determines its read/write order. Machine 192.168.0.2 detects that it is now the machine with the smallest sequence number and begins its write operation, while the remaining machines find it is not yet their turn to read or update, and keep waiting.
  4. And so on.
    These are the main steps of a shared lock in actual operation. Look closely at the point made in step 3: "the remaining machines find it is not yet their turn to read or update, and keep waiting." Clearly, after client 192.168.0.1 removed its shared lock, ZooKeeper sent the child-node-change Watcher notification to all machines, yet apart from its actual effect on machine 192.168.0.2, the notification did nothing for any other machine.
7.2.2.3.2 The herd effect

Throughout this distributed lock contention, the two operations "Watcher notification" and "child-list fetch" run over and over, and the vast majority of runs merely conclude that the client is not the node with the smallest sequence number, so it continues waiting for the next notification. This is clearly not very scientific: clients needlessly receive large numbers of event notifications that have nothing to do with them. When the cluster is large, this not only puts enormous performance pressure and network load on the ZooKeeper servers; worse, if the clients of several nodes complete their transactions at the same moment, or transactions abort and nodes vanish, the ZooKeeper servers will send a flood of event notifications to the remaining clients within a short time. This is the so-called herd effect.
The root cause of the herd effect in this ZooKeeper shared-lock implementation is that it misses what each client truly cares about. Reviewing the contention process above, its core logic is: judge whether your own node has the smallest sequence number among all children. It is then easy to see that each client only needs to watch changes to the one relevant node whose sequence number precedes its own, rather than the entire child list.

7.2.3 An improved distributed-lock implementation

Now let us see how to improve the implementation above. First, one thing must be affirmed: the shared-lock implementation described earlier is entirely correct in its overall approach. The main change is this: **each lock contender only needs to watch whether the node under /shared_lock whose sequence number immediately precedes its own still exists.** Concretely:

  1. The client calls create() to create an ephemeral sequential node of the form "/shared_lock/[Hostname]-request type-sequence number".
  2. The client calls getChildren() to fetch the list of all children already created. Note that no Watcher is registered here.
  3. If the shared lock cannot be acquired, the client calls exist() to register a Watcher on the preceding node.
    Note that "the preceding node" is only a loose description; it differs for read and write requests.
    Read request: register a Watcher on the last write-request node with a smaller sequence number.
    Write request: register a Watcher on the last node with a smaller sequence number.
  4. Wait for the Watcher notification, then return to step 2.
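Step 3's choice of which single node to watch can be sketched as a pure function. This assumes the same [Hostname]-type-sequence naming convention as above, with the child list sorted by sequence number; the sample addresses are illustrative.

```python
# Step 3 of the improved lock: which one node should this contender watch?

def node_to_watch(children, my_node):
    """children: child names sorted by sequence number. Returns the node this
    contender registers exist()/Watcher on, or None if nothing blocks it."""
    idx = children.index(my_node)
    kind = my_node.split("-")[1]           # 'R' or 'W'
    earlier = children[:idx]
    if kind == "R":
        writers = [c for c in earlier if c.split("-")[1] == "W"]
        return writers[-1] if writers else None   # last earlier writer
    return earlier[-1] if earlier else None       # last earlier node


children = [
    "192.168.0.1-R-0000000001",
    "192.168.0.2-W-0000000002",
    "192.168.0.3-R-0000000003",
    "192.168.0.4-W-0000000004",
]
print(node_to_watch(children, "192.168.0.3-R-0000000003"))  # the earlier writer
print(node_to_watch(children, "192.168.0.4-W-0000000004"))  # its immediate predecessor
```

Each contender now receives at most one notification per lock transition, instead of the whole cluster being woken every time.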

7.2.4 Recommendations

In multithreaded concurrent programming practice we try to minimize the scope of a lock; the improvement to the distributed lock follows exactly the same idea. So must developers design their own distributed locks along the improved lines? The answer is no. In actual development, it is advisable to choose the distributed-lock implementation that suits your specific business scenario and cluster scale.
When the cluster is small and network resources are plentiful, the first distributed-lock implementation is the simple, practical choice.
If the cluster reaches a certain scale and you want fine-grained control over the distributed-lock mechanism, try the improved implementation.

8. Distributed Queues

Distributed queues, simply put, fall into two broad categories: the conventional first-in-first-out queue, and the Barrier model, in which execution is arranged only after the queue's elements have all gathered.

8.1 FIFO: first-in-first-out queues

The FIFO (First In, First Out) idea, simple and clear, is applied widely throughout computer science, and the FIFO queue is a very typical and widely used in-order execution model.

Queue model: a request that entered the queue earlier completes first, and only then does processing of later requests begin.
Implementing a FIFO queue with ZooKeeper is very similar to the shared-lock implementation. A FIFO queue resembles an all-write shared-lock model, and the overall design idea is quite simple: every client creates an ephemeral sequential node under the /queue_fifo node, for example /queue_fifo/192.168.0.1-0000000001.

After creating its node, each client determines its execution order in the following four steps.

  1. Call getChildren() to fetch all children of the /queue_fifo node, i.e. all elements in the queue.
  2. Determine the position of your own node's sequence number among all the children.
  3. If your node does not have the smallest sequence number, wait, and register a Watcher on the last node with a smaller sequence number.
  4. On receiving the Watcher notification, repeat step 1.
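The four steps above amount to executing elements in sequence-number order. Here is a compact in-memory sketch of that resulting behavior (the watch/notify plumbing is collapsed into sorting); the addresses are illustrative.

```python
# The net effect of the FIFO steps: elements run in sequence-number order.

def process_fifo(children):
    """children: names like '192.168.0.1-0000000001', in arbitrary arrival order.
    Returns the order in which elements get to execute."""
    pending = sorted(children, key=lambda c: int(c.rsplit("-", 1)[1]))
    executed = []
    while pending:
        head = pending.pop(0)     # the smallest sequence number runs...
        executed.append(head)     # ...then deletes its node, notifying its watcher
    return executed


children = ["192.168.0.3-0000000003", "192.168.0.1-0000000001", "192.168.0.2-0000000002"]
print(process_fifo(children))
# ['192.168.0.1-0000000001', '192.168.0.2-0000000002', '192.168.0.3-0000000003']
```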

8.2 Barrier: distributed barriers

"Barrier" originally means an obstacle or blockade; in distributed systems it refers specifically to a coordination condition between systems, stipulating that the elements of a queue must all gather before they are processed together, otherwise everyone keeps waiting. This often appears in large-scale distributed parallel computation, where the final merge computation must be based on the sub-results of many parallel computations.

These queues are essentially an enhancement of the FIFO queue. The rough design idea is as follows:
at the start, /queue_barrier is an already-existing default node whose data content is set to a number n representing the Barrier value; for example, n = 10 means the Barrier opens only once the number of children under /queue_barrier reaches 10. After that, every client creates an ephemeral node under the /queue_barrier node, for example /queue_barrier/192.168.0.1.

After creating its node, each client determines the execution order in the following five steps.

  1. Call getData() to read the data content of the /queue_barrier node: 10.
  2. Call getChildren() to fetch all children of the /queue_barrier node, i.e. all elements in the queue, and register a Watcher for child-list changes.
  3. Count the children.
  4. If there are fewer than 10 children, wait.
  5. On receiving the Watcher notification, repeat step 2.
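The barrier condition itself (steps 3 and 4) is a simple count check. The sketch below simulates ten clients joining under /queue_barrier one by one; the paths are illustrative and the watch/notify plumbing is omitted.

```python
# The barrier check: it opens only once the child count reaches the stored n.

def barrier_open(n, children):
    """Steps 3-4: the barrier opens when len(children) >= n."""
    return len(children) >= n


n = 10                       # data content of /queue_barrier
children = []
states = []
for i in range(1, 11):
    children.append(f"/queue_barrier/192.168.0.{i}")
    states.append(barrier_open(n, children))

print(states)  # closed for the first nine joins, open on the tenth
```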

Origin blog.csdn.net/weiwei_six/article/details/104084711