Comparison of message queue features

This article comprehensively compares Kafka, RabbitMQ, ZeroMQ, RocketMQ, and ActiveMQ across 17 aspects of message queue usage.

First, documentation

Kafka: moderate. There is a book written by Kafka's own authors, plus some material online.

RabbitMQ: plentiful. There are several good books, and plenty of material online.

ZeroMQ: scarce. Few books are dedicated to ZeroMQ, and most online material consists of code samples and brief introductions.

RocketMQ: scarce. Few books are dedicated to RocketMQ, online material is of mixed quality, and the official documentation is brief with little description of technical details.

ActiveMQ: plentiful. Few books are dedicated to ActiveMQ, but there is plenty of material online.

Second, implementation language

Kafka: Scala

RabbitMQ: Erlang

ZeroMQ: C++ (exposing a C API)

RocketMQ: Java

ActiveMQ: Java

Third, supported protocols

Kafka: a custom protocol defined by the project itself (over TCP)

RabbitMQ: AMQP

ZeroMQ: TCP, UDP

RocketMQ: a custom protocol defined by the project itself

ActiveMQ: OpenWire, STOMP, REST, XMPP, AMQP

Fourth, message storage

Kafka: memory, disk, database. Supports a large message backlog.

The partition is Kafka's smallest storage unit, and a topic consists of multiple partitions. When a topic is created, its partitions are assigned across multiple servers; one server is typically one broker. Partition leaders are spread evenly across the servers, and so are partition replicas, which provides load balancing and high availability. When a new broker is added to the cluster, some replicas can be moved to it. Given the list of log directories in the configuration file, Kafka assigns a new partition to the directory that currently holds the fewest partitions. By default, the partitioner uses round-robin to spread messages evenly across the partitions of a topic; when a message carries a key, the partition is chosen by taking the key's hash code modulo the number of partitions.
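The partition choice described above can be sketched in a few lines. This is only an illustration under stated assumptions: `SimplePartitioner` is a hypothetical name, and `zlib.crc32` stands in for the hash function, which is not the one the real Kafka client uses.

```python
import itertools
import zlib
from typing import Optional


class SimplePartitioner:
    """Toy partitioner: round-robin without a key, hash modulo with a key."""

    def __init__(self, num_partitions: int) -> None:
        if num_partitions <= 0:
            raise ValueError("num_partitions must be positive")
        self.num_partitions = num_partitions
        self._counter = itertools.count()  # drives the round-robin choice

    def partition(self, key: Optional[bytes]) -> int:
        if key is None:
            # No key: cycle through the partitions in order.
            return next(self._counter) % self.num_partitions
        # Key present: a stable hash modulo the partition count, so the
        # same key always lands in the same partition (CRC32 is a stand-in).
        return zlib.crc32(key) % self.num_partitions


partitioner = SimplePartitioner(num_partitions=3)
keyless = [partitioner.partition(None) for _ in range(6)]       # [0, 1, 2, 0, 1, 2]
keyed = {partitioner.partition(b"order-42") for _ in range(5)}  # a single partition
```

Keyless messages cycle through all partitions, while every message with the key `order-42` maps to one partition, which is what makes per-key ordering possible.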

RabbitMQ: memory, disk. Supports a small message backlog.

RabbitMQ messages are either persistent or non-persistent, and both kinds can be written to disk. A persistent message is written to disk when it reaches the queue; where possible a copy is also kept in memory to improve performance, and that copy is evicted when memory runs low. A non-persistent message normally lives only in memory and is paged out to disk when memory runs low, to free memory.

The mirrored queue mechanism can replicate important queues to other brokers in the cluster so that their messages are not lost. A mirrored queue consists of one master and multiple slaves. If the master fails, the slave that joined earliest is promoted to be the new master. Every operation except message publishing is sent only to the master, which then broadcasts the result of the command to each slave. RabbitMQ spreads masters evenly across servers, and also spreads the slaves of the same queue evenly across servers, to provide load balancing and high availability.

ZeroMQ: memory or disk on the sending side. Persistence is not supported.

RocketMQ: disk. Supports a large message backlog.

The CommitLog files store the actual message data. Each CommitLog file is capped at 1 GB; when one fills up, a new file is created automatically. A ConsumeQueue stores only the offset, size, and tag code of each message, so it is very small, and ConsumeQueues are distributed across multiple brokers. The ConsumeQueue acts as an index into the CommitLog: when a consumer consumes, it first reads the message's CommitLog offset from the ConsumeQueue and then fetches the message itself from the CommitLog.

This storage layout keeps disk writes sequential (writes go to the CommitLog file): large volumes of data are appended to one CommitLog, and a new file is started once the current one reaches 1 GB. In addition, RocketMQ forces a flush from the PageCache to disk only after 4 KB has accumulated, so write performance under high concurrency is excellent.
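The two-step lookup (index first, then data) can be illustrated with a toy in-memory model. Everything here is an assumption made for illustration: `ToyStore` and `IndexEntry` are hypothetical names, there is one log and one queue, nothing is written to disk, and CRC32 stands in for the tag code.

```python
import zlib
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class IndexEntry:
    offset: int    # byte position of the message in the commit log
    size: int      # message length in bytes
    tag_code: int  # hash of the tag, usable for filtering


class ToyStore:
    def __init__(self) -> None:
        self.commit_log = bytearray()              # append-only log holding all bodies
        self.consume_queue: List[IndexEntry] = []  # small index into the log

    def append(self, body: bytes, tag: str) -> None:
        entry = IndexEntry(len(self.commit_log), len(body), zlib.crc32(tag.encode()))
        self.commit_log += body  # sequential append, never an in-place update
        self.consume_queue.append(entry)

    def read(self, queue_index: int) -> bytes:
        entry = self.consume_queue[queue_index]  # step 1: look up the index entry
        end = entry.offset + entry.size
        return bytes(self.commit_log[entry.offset:end])  # step 2: read the body


store = ToyStore()
store.append(b"hello", "TagA")
store.append(b"world!", "TagB")
first = store.read(0)
second = store.read(1)
```

Because the index entries are small and fixed in shape, many queues can share one sequentially written log without copying message bodies.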

ActiveMQ: memory, disk, database. Supports a small message backlog.

Fifth, message transactions

Kafka: supported.

RabbitMQ: supported. The client puts the channel into transaction mode; the transaction commits successfully only once RabbitMQ has received the message, otherwise the client catches the exception and rolls back. Using transactions reduces performance.

ZeroMQ: not supported.

RocketMQ: supported.

ActiveMQ: supported.

Sixth, load balancing

Kafka: supports load balancing.

1> A broker is usually one server node. Kafka tries to place the different partitions of a topic on different broker servers; ZooKeeper stores the metadata for brokers, topics, and partitions. The partition leader handles produce requests from clients, and Kafka assigns partition leaders to different broker servers so that the brokers share the work.

Every broker caches the metadata, so a client can fetch the metadata from any broker, cache it, and use it to decide where to send each request.

2> When the consumers in a Kafka consumer group subscribe to the same topic, Kafka tries to assign each consumer the same number of partitions so that they share the load.

3> When a consumer joins or leaves a consumer group, a rebalance is triggered and partitions are reassigned among the consumers to share the load.

Most of Kafka's load balancing is automatic, and partition assignment is also handled by Kafka. This hides many details and avoids tedious configuration and load problems caused by human error.

4> The sender determines the partition from the topic and the message key. If the key is null, a round-robin algorithm spreads messages evenly across the partitions of the topic. If the key is not null, the message is sent to the partition given by the key's hash code modulo the number of partitions.

RabbitMQ: load balancing support is limited.

1> Which queue a message is delivered to is determined by the exchange and the routing key; exchanges, routing keys, and queues must be created manually.

Before sending, a RabbitMQ client that connects to a broker needs to know which exchanges and queues exist on that broker. Usually the client declares the target queue: if the queue does not exist it is created on that broker, and if it exists nothing happens; the client then sends messages to the queue. If most of the heavily loaded queues happen to be created on the same broker, that broker becomes overloaded. (Queues can be created ahead of go-live so that senders need not declare them, but a sender that does not declare the queue may then find that the queue does not exist; RabbitMQ's alternate exchange can store messages that match no queue in a dedicated queue for later inspection.)

The mirrored queue mechanism of a RabbitMQ cluster can solve this problem. It forms a master-slave architecture in which masters are spread evenly across servers so that each server shares the load. Slaves are only responsible for forwarding; when the master fails, the slave that joined earliest becomes the new master.

When a new node joins a mirrored queue, the messages already in the queue are not synchronized to the new slave unless the synchronization command is invoked. The queue blocks while that command runs, so the command should not be invoked in a production environment.

2> When a RabbitMQ queue has multiple consumers, the messages it receives are dispatched to them in round-robin fashion. Each message is sent to only one consumer in the subscription list and is not delivered twice.

This approach scales well and is designed for concurrent programs.

If some consumers have heavier workloads, basicQos can limit the maximum number of unacknowledged messages a consumer may hold on a channel; once the limit is reached, RabbitMQ sends that consumer no more messages.

3> A RabbitMQ client does not open TCP connections to every node in the cluster; it picks one node and connects to it.

A RabbitMQ cluster can, however, use HAProxy, LVS, or a client-side algorithm for load balancing. With load balancing in place, client connections are spread across the nodes of the cluster.

Client-side load-balancing algorithms:

1) Round robin. Return the next server address in order.

2) Weighted round robin. Give high-spec, lightly loaded machines a higher weight so that they handle more requests, and give low-spec, heavily loaded machines a lower weight to reduce their load.

3) Random. Connect to a randomly chosen server address.

4) Weighted random. Choose the address at random with probability proportional to its weight.

5) Source address hashing. Hash the client's address and take the result modulo the size of the server list.

6) Least connections. Dynamically choose the server address that currently has the fewest connections.
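The six strategies listed above are small enough to sketch directly. This is a minimal illustration, not code from any RabbitMQ client: the function names and server names are hypothetical, weights are assumed to be positive integers, and CRC32 is used as the hash for source address hashing.

```python
import itertools
import random
import zlib
from typing import Dict, Iterator, List


def round_robin(servers: List[str]) -> Iterator[str]:
    return itertools.cycle(servers)  # next address in order, wrapping around


def weighted_round_robin(weights: Dict[str, int]) -> Iterator[str]:
    # Repeat each server as many times as its weight, then cycle.
    expanded = [s for s, w in weights.items() for _ in range(w)]
    return itertools.cycle(expanded)


def pick_random(servers: List[str], rng: random.Random) -> str:
    return rng.choice(servers)


def weighted_random(weights: Dict[str, int], rng: random.Random) -> str:
    servers = list(weights)
    return rng.choices(servers, weights=[weights[s] for s in servers], k=1)[0]


def source_hash(client_ip: str, servers: List[str]) -> str:
    # The same client address always maps to the same server.
    return servers[zlib.crc32(client_ip.encode()) % len(servers)]


def least_connections(active: Dict[str, int]) -> str:
    return min(active, key=lambda s: active[s])


servers = ["mq1", "mq2", "mq3"]
rr = round_robin(servers)
rr_picks = [next(rr) for _ in range(4)]
wrr = weighted_round_robin({"mq1": 2, "mq2": 1})
wrr_picks = [next(wrr) for _ in range(6)]
idle = least_connections({"mq1": 5, "mq2": 2, "mq3": 7})
```

Here `rr_picks` wraps back to `mq1` on the fourth pick, `wrr_picks` selects `mq1` twice as often as `mq2`, and `idle` is `mq2`, the node with the fewest active connections.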

ZeroMQ: decentralized; load balancing is not supported. It is just a multithreaded networking library.

RocketMQ: supports load balancing.

A broker is usually one server node. Brokers are divided into masters and slaves; master and slave store the same data, and the slave synchronizes data from the master.

1> The NameServer keeps a heartbeat with every cluster member and holds the topic-to-broker routing information; the queues of a topic are distributed across different servers.

2> Messages are sent to queues in round-robin fashion, so each queue receives about the same number of messages. A sent message specifies a topic, tags, and keys; it is not directed to a particular queue (doing so would be pointless, since neither clustering consumption nor broadcasting consumption depends on which queue holds the message).

tags: optional. Similar to the labels Gmail attaches to each message, tags let the server filter messages. Currently only one tag can be set per message, so a tag is comparable to the MessageType concept in Notify.

keys: optional. Keys are the business keywords of the message. The server builds a hash index from the keys; once they are set, messages can be queried in the Console by topic and keys. Because it is a hash index, keep keys as unique as possible, for example an order number or a product ID.

3> RocketMQ's load-balancing policy requires that the number of consumers be less than or equal to the number of queues; if there are more consumers than queues, the excess consumers cannot consume any messages. This matches Kafka. RocketMQ tries to assign each consumer the same number of queues to share the load.
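A small sketch shows why excess consumers stay idle under an even split. It is one possible policy written for illustration, not RocketMQ's actual allocation code; `allocate_evenly` and the consumer names are hypothetical.

```python
from typing import Dict, List


def allocate_evenly(queues: List[int], consumers: List[str]) -> Dict[str, List[int]]:
    """Give each consumer a contiguous slice; earlier consumers absorb the remainder."""
    if not consumers:
        raise ValueError("at least one consumer is required")
    base, extra = divmod(len(queues), len(consumers))
    result: Dict[str, List[int]] = {}
    start = 0
    for i, consumer in enumerate(sorted(consumers)):
        count = base + (1 if i < extra else 0)
        result[consumer] = queues[start:start + count]
        start += count
    return result


balanced = allocate_evenly([0, 1, 2, 3], ["c1", "c2", "c3"])
too_many = allocate_evenly([0, 1], ["c1", "c2", "c3"])
```

With four queues and three consumers, `c1` gets two queues and the others one each. With two queues and three consumers, `c3` receives an empty list and therefore has nothing to consume.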

ActiveMQ: supports load balancing, which can be implemented on top of ZooKeeper.

Seventh, clustering

Kafka: a natural, stateless 'Leader-Slave' cluster in which every server is both a master and a slave.

Partition leaders are spread evenly across the Kafka servers, and so are partition replicas. Each server therefore holds both partition leaders and partition replicas: every Kafka server is a slave of some Kafka server and the leader for another.

A Kafka cluster depends on ZooKeeper, which supports online scaling: brokers, consumers, and partitions can all be added or removed dynamically without shutting down the service. Compared with message queues that do not rely on a ZooKeeper cluster, this is its biggest advantage.

RabbitMQ: supports simple clustering in 'replication' mode; support for advanced clustering modes is poor.

Every RabbitMQ node, whether standalone or part of a cluster, is either a RAM node or a disk node; a cluster must contain at least one disk node.

When a queue is created in a RabbitMQ cluster, the queue process and the complete queue information (metadata, state, contents) are created on a single node only, not on every node.

Mirrored queues avoid a single point of failure and keep the service available, but mirroring must be configured manually for the important queues.

ZeroMQ: decentralized; clustering is not supported.

RocketMQ: commonly deployed as multiple 'Master-Slave' pairs; in the open-source version, a slave must be switched to master manually.

The Name Server is an almost stateless node. It can be deployed as a cluster, and the nodes do not synchronize any information with each other.

Broker deployment is more complex. Brokers are divided into masters and slaves. One master can have multiple slaves, but a slave belongs to only one master. The master-slave relationship is defined by specifying the same BrokerName with different BrokerIds: BrokerId 0 denotes the master, and a nonzero value denotes a slave. Multiple masters can also be deployed. Each broker keeps a long-lived connection to every node in the Name Server cluster and periodically registers its topic information with all Name Servers.

A producer keeps a long-lived connection to one randomly chosen node of the Name Server cluster and periodically fetches topic routing information from it. It also keeps a long-lived connection to the master that serves the topic and periodically sends heartbeats to that master. Producers are completely stateless and can be deployed as a cluster.

A consumer keeps a long-lived connection to one randomly chosen node of the Name Server cluster and periodically fetches topic routing information from it. It also keeps long-lived connections to the master and slaves that serve the topic and periodically sends heartbeats to them. A consumer can subscribe to messages from either the master or a slave; the subscription rule is determined by the broker configuration.

A client first locates the NameServer and then locates the broker through the NameServer.

A topic has multiple queues, and those queues are spread evenly across different broker servers. The RocketMQ queue is essentially the same concept as the Kafka partition: Kafka spreads the partitions of a topic across different brokers as far as possible, and also spreads partition replicas across different brokers.

In a RocketMQ cluster, slaves pull data from their master as a backup, and masters are distributed across different brokers.

ActiveMQ: supports simple cluster modes such as 'master-standby'; support for advanced cluster modes is poor.

Eighth, management interface

Kafka: average

RabbitMQ: good

ZeroMQ: none

RocketMQ: none

ActiveMQ: average

Ninth, availability

Kafka: very high (distributed)

RabbitMQ: high (master-slave)

ZeroMQ: high

RocketMQ: very high (distributed)

ActiveMQ: high (master-slave)

Tenth, duplicate messages

Kafka: supports at least once and at most once

RabbitMQ: supports at least once and at most once

ZeroMQ: has only a retransmission mechanism, and without persistence, retransmission cannot recover messages that are already lost. Delivery is neither at least once nor at most once, and certainly not exactly once.

RocketMQ: supports at least once

ActiveMQ: supports at least once

Eleventh, throughput (TPS)

Kafka: very high. Kafka sends and consumes messages in batches. The sender combines multiple small messages and sends them to the broker as a batch; the consumer fetches a batch of messages at a time and processes them together.

RabbitMQ: high.

ZeroMQ: very high.

RocketMQ: high. A RocketMQ consumer can consume messages in batches, and the number of messages consumed per batch is configurable, but the sender does not send in batches.

ActiveMQ: high.

Twelfth, subscription model and message distribution

Kafka: a publish-subscribe model based on topics, with regular-expression matching on topic names.

【send】

The sender determines the partition from the topic and the message key. If the key is null, a round-robin algorithm spreads messages evenly across the partitions of the topic. If the key is not null, the message is sent to the partition given by the key's hash code modulo the number of partitions.

【receive】

1> A consumer sends heartbeats to the broker acting as group coordinator to maintain its membership in the group and its ownership of partitions. Once assigned, ownership does not change unless a rebalance occurs (for example, when a consumer joins or leaves the group). A consumer reads messages only from the partitions assigned to it.

2> Kafka requires the number of consumers in a group to be no greater than the number of partitions. Each message is consumed by only one consumer within a consumer group (not broadcast).

3> When the consumers of a group subscribe to the same topic, Kafka tries to assign each consumer the same number of partitions. Different consumer groups that subscribe to the same topic are independent of each other, so the same message is processed by each consumer group.

RabbitMQ: provides four exchange types: direct, topic, headers, and fanout.

【send】

The sender must first declare a queue, which creates the queue if it does not already exist. The queue is the basic storage unit.

The exchange and the routing key together determine which queue a message is stored in.

direct> sent to the queue whose binding key exactly matches the routing key.

topic> the routing key is a string of words separated by "."; the message is sent to the queues whose binding key, which may contain the wildcards "*" and "#", matches it.

fanout> regardless of the routing key, sent to every queue bound to the exchange.

headers> regardless of the routing key, sent to a queue when the headers attribute of the message (key-value pairs) exactly matches the key-value pairs of the binding. This type performs poorly and is rarely used.
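The wildcard matching used by the topic type can be sketched as follows, assuming the usual AMQP semantics in which "*" stands for exactly one word and "#" for zero or more words. `topic_matches` is a hypothetical helper written for illustration, not part of any RabbitMQ client library.

```python
from typing import List


def topic_matches(binding_key: str, routing_key: str) -> bool:
    """Return True if an AMQP-style topic binding key matches the routing key."""
    return _match(binding_key.split("."), routing_key.split("."))


def _match(pattern: List[str], words: List[str]) -> bool:
    if not pattern:
        return not words  # both exhausted means a full match
    head, rest = pattern[0], pattern[1:]
    if head == "#":
        # "#" may absorb zero or more words; try every possible split.
        return any(_match(rest, words[i:]) for i in range(len(words) + 1))
    if not words:
        return False
    if head == "*" or head == words[0]:
        # "*" absorbs exactly one word; a literal must equal the word.
        return _match(rest, words[1:])
    return False


one_word = topic_matches("order.*", "order.created")
too_deep = topic_matches("order.*", "order.created.eu")
any_depth = topic_matches("order.#", "order.created.eu")
zero_words = topic_matches("order.#", "order")
```

In this sketch `one_word`, `any_depth`, and `zero_words` are true, while `too_deep` is false because "*" cannot span two words.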

【receive】

The queue is RabbitMQ's basic storage unit and is not further divided into partitions or shards. For a queue that has already been created, the consumer specifies which queue to receive messages from.

When a RabbitMQ queue has multiple consumers, the messages it receives are dispatched to them in round-robin fashion. Each message is sent to only one consumer in the subscription list and is not delivered twice.

This approach scales well and is designed for concurrent programs.

If some consumers have heavier workloads, basicQos can limit the maximum number of unacknowledged messages a consumer may hold on a channel; once the limit is reached, RabbitMQ sends that consumer no more messages.
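The interaction between round-robin dispatch and the unacknowledged-message limit can be modelled with a short simulation. It is a simplified model rather than broker code: `dispatch` is a hypothetical function, no consumer ever acknowledges, and `prefetch` plays the role of the limit set through basicQos.

```python
from collections import deque
from typing import Deque, Dict, List


def dispatch(messages: List[str], consumers: List[str], prefetch: int) -> Dict[str, List[str]]:
    """Round-robin dispatch that skips consumers holding `prefetch` unacked messages."""
    unacked: Dict[str, List[str]] = {c: [] for c in consumers}
    pending: Deque[str] = deque(messages)
    turn = 0
    while pending:
        # Try each consumer once, starting from the current turn.
        for step in range(len(consumers)):
            consumer = consumers[(turn + step) % len(consumers)]
            if len(unacked[consumer]) < prefetch:
                unacked[consumer].append(pending.popleft())
                turn = (turn + step + 1) % len(consumers)
                break
        else:
            break  # every consumer is at its limit; the rest stays queued
    return unacked


assigned = dispatch([f"m{i}" for i in range(7)], ["c1", "c2"], prefetch=2)
```

With seven messages, two consumers, and a limit of two, each consumer ends up holding two messages and the remaining three wait in the queue until something is acknowledged.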

ZeroMQ: point-to-point (P2P)

RocketMQ: a publish-subscribe model based on topic and message tag, with regular-expression matching on message type and properties.

【send】

Messages are sent to queues in round-robin fashion, so each queue receives about the same number of messages. A sent message specifies a topic, tags, and keys; it is not directed to a particular queue (doing so would be pointless, since neither clustering consumption nor broadcasting consumption depends on which queue holds the message).

tags: optional. Similar to the labels Gmail attaches to each message, tags let the server filter messages. Currently only one tag can be set per message, so a tag is comparable to the MessageType concept in Notify.

keys: optional. Keys are the business keywords of the message. The server builds a hash index from the keys; once they are set, messages can be queried in the Console by topic and keys. Because it is a hash index, keep keys as unique as possible, for example an order number or a product ID.

【receive】

1> Broadcasting consumption. A message is consumed by multiple consumers; even if the consumers belong to the same consumer group, every consumer in the group consumes the message once.

2> Clustering consumption. The consumer instances of a consumer group share the messages evenly. For example, if a topic has nine messages and a consumer group has three instances, each instance consumes only three of the messages. In other words, the queues are distributed in turn among the consumers.

ActiveMQ: point-to-point (P2P) and broadcast (publish-subscribe)

In point-to-point mode, each message has only one consumer.

In publish-subscribe mode, each message can have multiple consumers.

【send】

Point-to-point mode: first specify a queue, which is created if it does not already exist.

Publish-subscribe mode: first specify a topic, which is created if it does not already exist.

【receive】

Point-to-point mode: for a queue that has already been created, the consumer specifies which queue to receive messages from.

Publish-subscribe mode: for a topic that has already been created, the consumer specifies which topic to subscribe to.

Thirteenth, ordered messages

Kafka: supported.

Setting the producer's max.in.flight.requests.per.connection to 1 ensures that messages are written to the server in the order they were sent, even when a retry occurs.

Kafka guarantees that messages within a single partition are ordered, but there are two cases:

1> If the key is null, messages are written to different partitions one after another; each partition on its own is still ordered.

2> If the key is not null, messages with the same key are written to the same partition, and the messages in that partition are ordered.

RabbitMQ: not supported

ZeroMQ: not supported

RocketMQ: supported

ActiveMQ: not supported

Fourteenth, message acknowledgment

Kafka: supported.

1> Sender-side acknowledgment

acks=0: the producer does not care whether the message was successfully written to the partition.

acks=1: success is returned once the message has been written to the leader partition.

acks=all: success is returned once the message has been written to all in-sync replicas of the partition.

2> Consumer-side acknowledgment

Partition offsets are committed automatically or manually. Earlier versions of Kafka committed offsets to ZooKeeper, which put considerable pressure on ZooKeeper. Newer versions commit offsets to the Kafka brokers, so offset handling no longer depends on the ZooKeeper ensemble and the cluster performs more stably.

RabbitMQ: supported.

1> Sender-side acknowledgment: success is returned once the message has been delivered to all matching queues. If both the message and the queue are persistent, success is returned after the message has been written to disk. Batch confirmation and asynchronous confirmation are supported.

2> Consumer-side acknowledgment: when autoAck is false, explicit acknowledgment is required; when autoAck is true, messages are acknowledged automatically.

When autoAck is false, the messages in a RabbitMQ queue fall into two groups: those waiting to be delivered to a consumer, and those already delivered but not yet acknowledged. If no acknowledgment arrives and the consumer has disconnected, RabbitMQ requeues the message and delivers it to the original consumer or to the next one.

An unacknowledged message has no expiry time: if there is neither an acknowledgment nor a disconnect, RabbitMQ keeps waiting. This allows a consumer to take a very long time to process a message.

ZeroMQ: supported.

RocketMQ: supported.

ActiveMQ: supported.

Fifteenth, message backtracking

Kafka: supported; can seek back to a specified offset in a partition.

RabbitMQ: not supported

ZeroMQ: not supported

RocketMQ: supported; can seek back to a specified point in time.

ActiveMQ: not supported

Sixteenth, message retry

Kafka: not supported directly, but it can be implemented.

Kafka supports seeking back to a given offset in a partition, which can be used to implement message retry.

RabbitMQ: not supported directly, but it can be implemented with the message acknowledgment mechanism.

Set autoAck to false on the RabbitMQ consumer.

When autoAck is false, the messages in a RabbitMQ queue fall into two groups: those waiting to be delivered to a consumer, and those already delivered but not yet acknowledged. If no acknowledgment arrives and the consumer has disconnected, RabbitMQ requeues the message and delivers it to the original consumer or to the next one.

ZeroMQ: not supported.

RocketMQ: supported.

In most scenarios, retrying immediately after a consumption failure fails again 99% of the time, so RocketMQ's strategy is to retry on a timer after a consumption failure, with the same interval each time.

1> The send method on the sending side retries internally, with the following logic:

a) Retry at most 3 times;

b) If sending fails, rotate to the next broker;

c) The total time spent in the method does not exceed the configured sendMsgTimeout (10 s by default); once that time is exceeded, no further retry is made.
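Rules a) to c) can be combined into one loop, sketched below. This is not RocketMQ's client code: `send_with_retry`, `send_once`, and the broker names are hypothetical, and the sketch assumes that "at most 3" means three attempts in total.

```python
import time
from typing import Callable, List


def send_with_retry(
    brokers: List[str],
    send_once: Callable[[str], bool],
    max_attempts: int = 3,
    timeout_s: float = 10.0,
    clock: Callable[[], float] = time.monotonic,
) -> List[str]:
    """Try brokers in rotation and return the brokers attempted, in order.

    Raises RuntimeError if every attempt fails or the time budget runs out.
    """
    deadline = clock() + timeout_s
    attempted: List[str] = []
    for attempt in range(max_attempts):
        if clock() >= deadline:
            break  # rule c): budget exhausted, so no further retry
        broker = brokers[attempt % len(brokers)]  # rule b): rotate brokers
        attempted.append(broker)
        if send_once(broker):
            return attempted
    raise RuntimeError(f"send failed after trying {attempted}")


def flaky(broker: str) -> bool:
    # Hypothetical transport for the example: only broker-b accepts messages.
    return broker == "broker-b"


tried = send_with_retry(["broker-a", "broker-b"], flaky)
```

The first attempt against `broker-a` fails, the loop rotates to `broker-b`, and the call returns after two attempts. Passing a custom `clock` makes the time budget easy to exercise in tests.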

2> Receiving side.

When a consumer fails to consume a message, a retry mechanism is needed so that the message is consumed again. Consumer failures usually fall into the following two cases:

  1. The message itself is at fault, for example deserialization fails or the message data cannot be processed (for example a phone top-up where the phone number in the message has been cancelled and cannot be topped up). Use a timed retry for this case, for example retry after 10 s.

  2. A downstream service that the application depends on is unavailable, for example the database connection is down or an external system is unreachable over the network. Even if the failed message is skipped, consuming other messages will also fail. In this case the consumer can sleep for 30 s before consuming the next message, which reduces the retry pressure on the broker.

ActiveMQ: not supported

Seventeenth, concurrency

Kafka: high

Each consumer runs on one thread. Kafka limits the number of consumers to at most the number of partitions; to increase parallelism, start multiple threads inside a consumer or add more consumer instances.

RabbitMQ: high

RabbitMQ is written in Erlang and performs well under high concurrency.

Consumers can be multithreaded. The most common approach is one channel per consumer, with each thread holding its own channel and multiple threads multiplexing a single TCP connection, which reduces overhead.

When a RabbitMQ queue has multiple consumers, the messages it receives are dispatched to them in round-robin fashion. Each message is sent to only one consumer in the subscription list and is not delivered twice.

This approach scales well and is designed for concurrent programs.

If some consumers have heavier workloads, basicQos can limit the maximum number of unacknowledged messages a consumer may hold on a channel; once the limit is reached, RabbitMQ sends that consumer no more messages.

ZeroMQ: high

RocketMQ: high

1> RocketMQ limits the number of consumers to at most the number of queues, but each consumer can run multiple threads. This matches Kafka, and parallelism is increased in the same way.

Ways to change consumption parallelism:

a) Within the same ConsumerGroup, increase parallelism by adding Consumer instances; instances beyond the number of subscribed queues have no effect.

b) Increase the number of consumption threads within a single Consumer by changing the consumeThreadMin and consumeThreadMax parameters.

2> Over a single network connection, a client can send requests from multiple threads simultaneously; the connection is multiplexed, which reduces overhead.

ActiveMQ: high

A single ActiveMQ instance receives and consumes messages at roughly 10,000 messages per second (typically 10,000 to 20,000 with persistence, and 20,000 or more without). Deploying 10 ActiveMQ instances in production can reach 100,000 or more messages per second. The more ActiveMQ brokers are deployed, the lower the latency in the MQ and the higher the system throughput.


Origin blog.csdn.net/A_BlackMoon/article/details/104382559