[Message Queues: Disque]

Disque is an ongoing experiment to build a distributed, in-memory message broker. Its goal is to capture the essence of the "Redis as a jobs queue" use case, which is usually implemented using blocking list operations, and move it into an ad hoc, self-contained, scalable, and fault-tolerant design, with simple-to-understand properties and guarantees, while still resembling Redis in terms of simplicity, performance, and implementation as a non-blocking networked server written in C.

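Disque is a distributed, in-memory message broker newly open-sourced by Salvatore Sanfilippo, the creator of Redis. It targets the "Redis as a job queue" use case, but with a dedicated, self-contained, scalable, and fault-tolerant design that keeps the simplicity and performance of Redis and is implemented in C as a non-blocking networked server. Note that in the Disque project documentation and in this article, "message" and "job" are used interchangeably.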

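Disque is a new project independent of Redis, but it reuses a large part of the Redis networking source code, node message bus, libraries, and client protocol. Because Disque uses the same protocol as Redis, a Redis client can connect directly to a Disque cluster; just note that the default Disque port is 7711, not 6379.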
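Because Disque speaks the Redis wire protocol, a command is just an array of bulk strings written to a TCP socket. The following is a minimal sketch of that encoding, not an official client: `encode_command` and `send_command` are hypothetical helper names, and the commented-out call assumes a Disque node listening locally on the default port 7711.

```python
import socket


def encode_command(*args):
    """Encode a command as a RESP array of bulk strings (the Redis wire format)."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg if isinstance(arg, bytes) else str(arg).encode("utf-8")
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


def send_command(host, port, *args):
    """Send one command and return the raw reply bytes (no reply parsing)."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(encode_command(*args))
        return sock.recv(4096)


# Assumes a Disque node is running locally on the default port:
# print(send_command("127.0.0.1", 7711, "HELLO"))
```

In practice an existing Redis client library would handle the encoding and reply parsing; the sketch only shows why pointing such a client at port 7711 is enough.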

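As a message broker, Disque acts as a middle layer between processes that need to exchange messages: producers add messages for consumers to use. This producer-consumer queue model is very common, so the main differences lie in the details: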

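  • Disque is a synchronously replicated job queue. By default, a new job is replicated to W nodes, so W-1 nodes can fail without affecting message delivery.
  • Disque supports both at-least-once and at-most-once delivery semantics. The former is the focus of the design and implementation, while the latter is achieved by setting the retry time to 0. Delivery semantics are set per message, so messages with different semantics can coexist in the same queue.
  • By design, Disque's at-least-once delivery approximates single delivery: it tries hard to avoid delivering a message more than once.
  • All nodes in a Disque cluster have the same role, that is, the cluster is multi-master. Producers and consumers can connect to different queues or nodes, and nodes automatically exchange messages based on load and client requests.
  • Disque supports optional asynchronous commands. In this mode, after adding a job to a queue with a replication factor other than 1, a producer can move on to other work without waiting for replication to finish; the node completes the replication in the background.
  • Once the specified retry time has elapsed, Disque automatically re-queues messages that have not been acknowledged.
  • In Disque, consumers use explicit acknowledgments to mark a message as delivered.
  • Disque provides only best-effort ordering. A queue sorts messages by creation time, which is taken from the clock of the local node. Messages created on the same node are therefore normally delivered in creation order, but Disque does not guarantee strict FIFO semantics. For example, delivery order cannot be guaranteed when messages are re-queued or moved to other nodes for load balancing. For this reason, Salvatore points out that, technically speaking, Disque is not strictly a queue and is better described as a message broker.
  • Disque provides fine-grained job control through four parameters: the replication factor (the number of copies of the message), the delay time (the minimum wait before the message is put in the queue), the retry time (when the message is re-queued), and the expire time (when the message is deleted).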

Disque is a distributed and fault-tolerant message broker, so it works as a middle layer among processes that want to exchange messages.

Producers add messages that are served to consumers. Since message queues are often used to process delayed jobs, Disque often uses the term "job" in the API and in the documentation. However, jobs are actually just messages in the form of strings, so Disque can be used for other use cases. In this documentation "jobs" and "messages" are used interchangeably.

Job queues with a producer-consumer model are pretty common, so the devil is in the details. A few details about Disque are:

Disque is a synchronously replicated job queue. By default, when a new job is added, it is replicated to W nodes before the client gets an acknowledgment that the job was added. W-1 nodes can fail and the message will still be delivered.

Disque supports both at-least-once and at-most-once delivery semantics. At-least-once delivery is where most of the effort was spent in the design and implementation, while at-most-once delivery is a trivial result of using a retry time set to 0 (which means never re-queue the message) and a replication factor of 1 for the message (not strictly needed, but it is useless to have multiple copies of a message around if it will be delivered at most one time). You can have both at-least-once and at-most-once jobs in the same queues and nodes at the same time, since this is a per-message setting.

Disque at-least-once delivery is designed to approximate single delivery when possible, even during certain kinds of failures. This means that while Disque can only guarantee a number of deliveries equal to or greater than one, it will try hard to avoid multiple deliveries whenever possible.

Disque is a distributed system where all nodes have the same role (that is, it is multi-master). Producers and consumers can attach to whatever node they like, and there is no need for producers and consumers of the same queue to stay connected to the same node. Nodes will automatically exchange messages based on load and client requests.

Disque is Available (it is an eventually consistent AP system in CAP terms): producers and consumers can make progress as long as a single node is reachable.

Disque supports optional asynchronous commands that are low latency for the client but provide fewer guarantees. For example, a producer can add a job to a queue with a replication factor of 3, but may want to move on before knowing whether the contacted node was really able to replicate it to the specified number of nodes. The node will replicate the message in the background on a best-effort basis.

Disque automatically re-queues messages that are not acknowledged as already processed by consumers, after a message-specific retry time. There is no need for consumers to re-queue a message if it was not processed.

Disque uses explicit acknowledgments for a consumer to signal a message as delivered (or, using a different terminology, to signal a job as already processed).
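The interplay between retry time and explicit acknowledgment can be illustrated with a toy in-memory model. This is only a sketch under simplifying assumptions, not how Disque is implemented: `ToyQueue` and its methods are hypothetical names, time is passed in explicitly as a number of seconds, there is a single node, and the retry timer is measured from the last delivery.

```python
class ToyQueue:
    """Single-node toy model of at-least-once delivery (not Disque's code)."""

    def __init__(self):
        # job_id -> {"body", "retry", "next_delivery"}
        self.jobs = {}

    def add(self, job_id, body, retry, now):
        """Queue a job that becomes deliverable immediately."""
        self.jobs[job_id] = {"body": body, "retry": retry, "next_delivery": now}

    def get(self, now):
        """Return (job_id, body) for the oldest deliverable job, or None."""
        ready = [
            (job["next_delivery"], job_id)
            for job_id, job in self.jobs.items()
            if job["next_delivery"] is not None and job["next_delivery"] <= now
        ]
        if not ready:
            return None
        _, job_id = min(ready)
        job = self.jobs[job_id]
        # retry == 0 means "never re-queue", which gives at-most-once delivery.
        job["next_delivery"] = now + job["retry"] if job["retry"] > 0 else None
        return job_id, job["body"]

    def ack(self, job_id):
        """Acknowledge a job as processed so it is never delivered again."""
        self.jobs.pop(job_id, None)


queue = ToyQueue()
queue.add("j1", "resize image", retry=30, now=0)
first = queue.get(now=0)      # delivered to a consumer that then crashes
nothing = queue.get(now=10)   # retry time has not elapsed yet
again = queue.get(now=30)     # no ack arrived, so the job is delivered again
queue.ack("j1")               # from now on the job is gone
```

The consumer never re-queues anything itself: a missing acknowledgment is enough for the job to come back after its retry time.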

Disque queues only provide best-effort ordering. Each queue sorts messages based on the job creation time, which is obtained using the wall clock of the local node where the message was created (plus an incremental counter for messages created in the same millisecond), so messages created in the same node are normally delivered in the same order they were created. This is not causal ordering, since correct ordering is violated in several cases: when messages are re-issued because they were not acknowledged, when the local clocks of nodes drift, and when messages are moved to other nodes for load balancing and federation (in this case you end up with queues holding jobs that originated in different nodes with different wall clocks). However, all this also means that messages are normally not delivered in random order, and usually messages created first are delivered first.
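A small sketch can make the ordering caveat concrete. It assumes, purely for illustration, that each job carries a sort key of (local wall-clock milliseconds, per-node counter); `ToyNode`, `create`, and `delivery_order` are hypothetical names, and the counter here simply increments per node rather than mirroring Disque's actual job ID layout.

```python
import itertools


class ToyNode:
    """Stamps jobs with (local wall-clock ms, per-node counter)."""

    def __init__(self, name, clock_offset_ms=0):
        self.name = name
        self.clock_offset_ms = clock_offset_ms  # models local clock drift
        self._counter = itertools.count()

    def create(self, body, true_time_ms):
        local_ms = true_time_ms + self.clock_offset_ms
        return {"key": (local_ms, next(self._counter)), "body": body}


def delivery_order(jobs):
    """Order jobs the way a queue sorted by creation key would serve them."""
    return [job["body"] for job in sorted(jobs, key=lambda job: job["key"])]


node_a = ToyNode("a")
node_b = ToyNode("b", clock_offset_ms=-50)  # b's clock runs 50 ms behind

first = node_a.create("first", true_time_ms=1000)
second = node_a.create("second", true_time_ms=1000)  # same ms, counter breaks the tie
third = node_b.create("third", true_time_ms=1020)    # created last, stamped 970

same_node = delivery_order([second, first])       # ['first', 'second']
merged = delivery_order([first, second, third])   # ['third', 'first', 'second']
```

Jobs from a single node keep their creation order, while merging jobs from a node with a lagging clock lets the most recently created job jump ahead.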

Note that since Disque does not provide strict FIFO semantics, technically speaking it should not be called a message queue, and it could be better identified as a message broker. However, I believe that at this point in the IT industry the term "message queue" is often used more loosely to identify a generic broker that may or may not be able to guarantee order in all cases. Given that we document the semantics very clearly, I grant myself the right to call Disque a message queue anyway.

Disque provides the user with fine-grained control for each job using three time related parameters, and one replication parameter. For each job, the user can control:

  1. The replication factor (how many nodes have a copy).
  2. The delay time (the minimum time Disque will wait before putting the message in a queue, making the message deliverable).
  3. The retry time (how much time should elapse, since the last time the job was queued and without an acknowledgment of the job delivery, before the job is re-queued for delivery).
  4. The expire time (how much time should elapse for the job to be deleted regardless of the fact it was successfully delivered, i.e. acknowledged, or not).
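The four parameters above map onto options of the `ADDJOB` command. The sketch below builds the argument list following the syntax given in the Disque README, `ADDJOB queue_name job <ms-timeout> [REPLICATE <count>] [DELAY <sec>] [RETRY <sec>] [TTL <sec>] [MAXLEN <count>] [ASYNC]`; the helper name `addjob_args` and its keyword names are hypothetical, and the exact option set should be checked against the Disque version in use.

```python
def addjob_args(queue, body, timeout_ms=0, replicate=None, delay=None,
                retry=None, ttl=None, asynchronous=False):
    """Build the argument list for an ADDJOB command.

    Options left as None are omitted so the server applies its defaults.
    """
    args = ["ADDJOB", queue, body, str(timeout_ms)]
    options = (
        ("REPLICATE", replicate),  # replication factor (number of copies)
        ("DELAY", delay),          # seconds before the job becomes deliverable
        ("RETRY", retry),          # seconds before an unacknowledged job is re-queued
        ("TTL", ttl),              # seconds before the job expires
    )
    for keyword, value in options:
        if value is not None:
            args += [keyword, str(value)]
    if asynchronous:
        args.append("ASYNC")       # do not wait for replication to complete
    return args


# At-most-once job: a single copy, never re-queued.
at_most_once = addjob_args("emails", "hello", replicate=1, retry=0)

# Delayed job that expires after an hour, added without waiting for replication.
delayed = addjob_args("reports", "build", timeout_ms=100, delay=5, ttl=3600,
                      asynchronous=True)
```

The resulting list can be handed to any Redis client that accepts raw command arguments.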

Finally, Disque supports optional disk persistence, which is not enabled by default but can be handy in single data center setups and during restarts.


Reposted from gaojingsong.iteye.com/blog/2392419