Comparison between RocketMQ and Kafka

Taobao's internal transaction system uses Notify, a message middleware developed in-house at Taobao. Notify uses MySQL as its message store and scales out horizontally. To reduce costs further, we believed the storage layer could be optimized. In early 2011 LinkedIn open-sourced Kafka, an excellent message middleware. After the Taobao middleware team reviewed Kafka thoroughly, its unbounded message accumulation and efficient persistence attracted us. At the same time, we found that Kafka is positioned mainly for log transport and lacks many features needed in Taobao scenarios such as transactions, orders, and recharges. We therefore wrote RocketMQ from scratch in Java, positioning it for reliable message delivery in non-log scenarios (log scenarios work too). RocketMQ is now widely used within Alibaba Group for orders, transactions, recharges, stream computing, message push, log stream processing, binlog distribution, and other scenarios.

Data reliability

  • RocketMQ supports asynchronous real-time flushing, synchronous flushing, synchronous replication, and asynchronous replication
  • Kafka uses asynchronous flushing and asynchronous replication

Summary: RocketMQ's synchronous flushing makes a single machine more reliable than Kafka, because an operating system crash does not cause data loss. Likewise, RocketMQ's synchronous replication is more reliable than Kafka's asynchronous replication, and the data has no single point of failure. Kafka's replication works per topic and supports automatic failover to a standby when the leader host goes down. There is a problem here, however: because replication is asynchronous, data is lost after the switch, and if the old leader restarts, its data conflicts with that of the new leader. The open source version of RocketMQ does not support automatic promotion of a Slave to Master when the Master goes down. The Alibaba Cloud version of RocketMQ does support automatic switching.

Performance comparison

  • Kafka's single-machine write TPS is about one million messages/second, with a message size of 10 bytes
  • RocketMQ's single-instance write TPS is about 70,000 messages/second. With 3 brokers deployed on one machine it reaches up to 120,000 messages/second, with a message size of 10 bytes.

Summary: Kafka reaches one million TPS on a single machine mainly because the Producer merges multiple small messages and sends them to the Broker in batches.

Why didn't RocketMQ do this?

  1. The Producer is usually written in Java. If it caches too many messages, GC becomes a serious problem.
  2. The Producer calls the send interface and reports success to the business before the message has reached the Broker. If the Producer goes down at that point, the message is lost and the business ends up in an incorrect state.
  3. The Producer is usually a distributed system in which every machine sends from multiple threads. We believe the amount of data a single Producer generates per second in an online system is limited and cannot reach tens of thousands of messages.
  4. Caching can be handled entirely by the upper-layer business.
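
Point 4 can be illustrated with a minimal sketch of batching done in the business layer instead of inside the client. `BatchingSender` and the `send_batch` callback are hypothetical names, not part of the RocketMQ or Kafka client APIs. The callback stands in for whatever real send call the application uses.

```python
from typing import Callable, List


class BatchingSender:
    """Buffers messages in the application and hands them to send_batch in groups.

    Hypothetical helper: send_batch stands in for a real client call.
    """

    def __init__(self, send_batch: Callable[[List[bytes]], None], max_batch: int = 3):
        self._send_batch = send_batch
        self._max_batch = max_batch
        self._buffer: List[bytes] = []

    def send(self, message: bytes) -> None:
        # The message is only buffered here; it is NOT yet on the broker.
        self._buffer.append(message)
        if len(self._buffer) >= self._max_batch:
            self.flush()

    def flush(self) -> None:
        # Deliver whatever is buffered; the caller decides when this is safe.
        if self._buffer:
            batch, self._buffer = self._buffer, []
            self._send_batch(batch)


sent_batches: List[List[bytes]] = []
sender = BatchingSender(sent_batches.append, max_batch=3)
for i in range(7):
    sender.send(f"msg-{i}".encode())
sender.flush()  # push out the final partial batch
```

Because the business code owns the buffer, it also owns the risk described in point 2: anything still buffered is lost if the process dies before `flush` runs.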

Number of queues supported by a single machine

  • When a single Kafka machine holds more than 64 queues/partitions, its load rises noticeably. The more queues, the higher the load and the longer the response time for sending messages.
  • A single RocketMQ machine supports up to 50,000 queues without a significant change in load

What are the benefits of having more queues?

  1. A single machine can create more topics, because each topic is composed of a batch of queues
  2. The cluster size of the Consumer is proportional to the number of queues. The more queues, the larger the Consumer cluster can be.
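
To see why the number of queues bounds the size of the Consumer cluster, here is a minimal sketch that spreads queues over consumers. It assumes each queue is owned by exactly one consumer in a group. The function `allocate` is hypothetical and only approximates an averaging strategy. It is not taken from either system's code.

```python
from typing import Dict, List


def allocate(queue_ids: List[int], consumers: List[str]) -> Dict[str, List[int]]:
    """Assign each queue to exactly one consumer, round-robin."""
    assignment: Dict[str, List[int]] = {c: [] for c in consumers}
    for index, queue_id in enumerate(queue_ids):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(queue_id)
    return assignment


# 4 queues shared by 6 consumers: two consumers end up with nothing to do.
result = allocate(list(range(4)), [f"c{i}" for i in range(6)])
idle = [c for c, queues in result.items() if not queues]
```

Under this model, adding consumers beyond the queue count only adds idle members, which is why more queues allow a larger Consumer cluster.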

Real-time message delivery

  • Kafka uses short polling, so its real-time performance depends on the polling interval
  • RocketMQ uses long polling, which matches the real-time performance of push. Message delivery latency is usually a few milliseconds.

Consumption failure retry

  • Kafka does not support retry on consumption failure
  • RocketMQ supports scheduled retry on consumption failure, with the interval growing after each retry

Summary: Consider a recharge application that calls the carrier's gateway and fails, perhaps because the gateway is overloaded at that moment. A call made a little later will succeed. Alipay debiting a bank account has a similar requirement.

The retry here must be reliable: a message awaiting retry must not be lost because the Consumer goes down.
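
A minimal sketch of scheduled retry with growing intervals follows. The delay values and the `handle` callback are illustrative assumptions, not RocketMQ's actual configuration or API, and time is simulated instead of slept.

```python
from typing import Callable, List, Optional

# Hypothetical retry delays in seconds; each retry waits longer than the last.
RETRY_DELAYS: List[int] = [10, 30, 60, 120]


def consume_with_retry(handle: Callable[[str], bool], message: str,
                       start: float = 0.0) -> Optional[float]:
    """Return the simulated time at which the message was consumed, or None.

    The first attempt happens at `start`; after each failure the next
    attempt is scheduled RETRY_DELAYS[i] seconds later.
    """
    now = start
    if handle(message):
        return now
    for delay in RETRY_DELAYS:
        now += delay  # a real broker would redeliver at this time
        if handle(message):
            return now
    return None  # retries exhausted


attempts: List[str] = []


def flaky_gateway(message: str) -> bool:
    # Fails on the first two calls, then succeeds (simulated overload).
    attempts.append(message)
    return len(attempts) >= 3


finished_at = consume_with_retry(flaky_gateway, "recharge-42")
```

The sketch keeps the schedule in memory. The reliability requirement above means a real system has to persist the pending retry outside the Consumer so that it survives a Consumer crash.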

Strict message order

  • Kafka supports message ordering, but when a Broker goes down, messages can arrive out of order
  • RocketMQ supports strict message ordering. In the ordered-message scenario, sending fails after a Broker goes down, but messages never arrive out of order.

MySQL binlog distribution, for example, requires strict message ordering

Scheduled messages

  • Kafka does not support scheduled messages
  • RocketMQ supports two types of scheduled messages
    • The open source version of RocketMQ supports only fixed scheduling levels
    • Alibaba Cloud ONS supports scheduling levels as well as a specified delay with millisecond precision

Distributed transactional messages

  • Kafka does not support distributed transactional messages
  • Alibaba Cloud ONS supports distributed transactional messages, and the open source version of RocketMQ also plans to support them in the future

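Message query

  • Kafka does not support message query
  • RocketMQ supports querying messages by Message Id, and also by message content (specify a Message Key when sending; it can be any string, for example the order Id)

Summary: Message query is very helpful for tracking down lost messages. For example, when an order fails to process, it shows whether the message was never received or was received and then processed incorrectly.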
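
The two lookup paths can be sketched with a toy in-memory store. `MessageStore`, its methods, and the id scheme are hypothetical and only illustrate the idea of indexing by id and by key. They do not reflect RocketMQ's storage format or client API.

```python
from collections import defaultdict
from typing import Dict, List, Optional


class MessageStore:
    """Toy in-memory store with lookups by message id and by message key."""

    def __init__(self) -> None:
        self._by_id: Dict[str, bytes] = {}
        self._ids_by_key: Dict[str, List[str]] = defaultdict(list)

    def put(self, body: bytes, key: Optional[str] = None) -> str:
        message_id = f"id-{len(self._by_id)}"  # hypothetical id scheme
        self._by_id[message_id] = body
        if key is not None:
            self._ids_by_key[key].append(message_id)
        return message_id

    def query_by_id(self, message_id: str) -> Optional[bytes]:
        return self._by_id.get(message_id)

    def query_by_key(self, key: str) -> List[bytes]:
        # .get avoids creating an empty index entry for unknown keys.
        return [self._by_id[i] for i in self._ids_by_key.get(key, [])]


store = MessageStore()
first_id = store.put(b"order created", key="order-1001")
store.put(b"order paid", key="order-1001")
store.put(b"unrelated")
```

Using the order Id as the key means one query returns every message that belongs to a given order, which is the check needed when an order fails to process.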

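Message replay

  • Kafka can in theory replay messages by Offset
  • RocketMQ supports replaying messages by time, with millisecond precision, for example re-consuming messages starting from a specific hour, minute, and second one day earlier

Summary: A typical business scenario is a consumer that performs order analysis. If a bug in the program logic or a failure in a dependent system invalidates every message consumed today, consumption has to restart from midnight yesterday. Replay from a point in time is very helpful to the business in that case.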
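
One way to think about time-based replay is as a translation from a timestamp to an offset, followed by an ordinary offset-based read. The sketch below assumes messages are stored in non-decreasing timestamp order. The names `offset_for_time` and `replay_from` are hypothetical and do not come from either system.

```python
from bisect import bisect_left
from typing import List, Tuple

# (store timestamp in ms, payload), appended in non-decreasing timestamp order.
Log = List[Tuple[int, str]]


def offset_for_time(log: Log, timestamp_ms: int) -> int:
    """Return the offset of the first message stored at or after timestamp_ms."""
    timestamps = [ts for ts, _ in log]
    return bisect_left(timestamps, timestamp_ms)


def replay_from(log: Log, timestamp_ms: int) -> List[str]:
    """Re-read every payload from the given point in time onwards."""
    return [payload for _, payload in log[offset_for_time(log, timestamp_ms):]]


log: Log = [(1000, "a"), (2000, "b"), (2000, "c"), (3500, "d")]
replayed = replay_from(log, 2000)
```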

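Consumption parallelism

  • Kafka's consumption parallelism depends on the number of partitions configured for the Topic. With 10 partitions, at most 10 machines can consume in parallel (each running only one thread), or one machine can consume with 10 parallel threads. That is, consumption parallelism equals the number of partitions.

  • RocketMQ's consumption parallelism falls into two cases

    • In ordered consumption mode, parallelism is exactly the same as Kafka's
    • In unordered mode, parallelism depends on the number of Consumer threads. For example, with a Topic configured with 10 queues, 10 consuming machines, and 100 threads per machine, the parallelism is 1000.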
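
The arithmetic in these two cases can be written down as a small function. The model is an assumption read off the example in the text: in ordered mode at most one thread works per queue, and in unordered mode every machine that owns at least one queue uses all of its threads. `max_parallelism` is a hypothetical helper, not an API of either system.

```python
def max_parallelism(queues: int, machines: int, threads_per_machine: int,
                    ordered: bool) -> int:
    """Upper bound on concurrently processed messages under the model above."""
    total_threads = machines * threads_per_machine
    if ordered:
        # One thread per queue at most, so the queue count is the ceiling.
        return min(queues, total_threads)
    # Unordered: only machines that own at least one queue do work,
    # but each of them can use all of its threads.
    return min(queues, machines) * threads_per_machine


ordered_bound = max_parallelism(queues=10, machines=10,
                                threads_per_machine=100, ordered=True)
unordered_bound = max_parallelism(queues=10, machines=10,
                                  threads_per_machine=100, ordered=False)
```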

Message trace

  • Kafka does not support message trace
  • Alibaba Cloud ONS supports message trace

Development language friendliness

  • Kafka is written in Scala
  • RocketMQ is written in Java

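Broker-side message filtering

  • Kafka does not support Broker-side message filtering
  • RocketMQ supports two kinds of Broker-side message filtering
    • Filtering by Message Tag, which is similar to a sub-topic
    • Uploading a piece of Java code to the server, which can filter messages in any way and can even filter and split the Message Body.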
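
Tag-based filtering can be sketched as a selection step that runs before delivery, so unwanted messages never cross the network. The `Message` type and `filter_by_tag` function are hypothetical and only model the idea. They are not RocketMQ's implementation.

```python
from typing import Iterable, List, NamedTuple, Set


class Message(NamedTuple):
    topic: str
    tag: str
    body: bytes


def filter_by_tag(messages: Iterable[Message], subscribed_tags: Set[str]) -> List[Message]:
    """Broker-side step: drop messages whose tag the consumer did not subscribe to."""
    return [m for m in messages if m.tag in subscribed_tags]


stored = [
    Message("trade", "created", b"order 1"),
    Message("trade", "paid", b"order 1"),
    Message("trade", "refunded", b"order 2"),
]
delivered = filter_by_tag(stored, {"paid", "refunded"})
```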

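Message accumulation capacity

In theory Kafka can accumulate more messages than RocketMQ. However, a single RocketMQ machine can also accumulate messages on the order of hundreds of millions, and we believe this capacity fully meets business needs.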

Open source community activity

  • The Kafka community updates slowly
  • RocketMQ's GitHub community is comparatively mature and active

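Commercial support

  • Kafka's original development team has founded a new company, but no related product has been seen so far
  • RocketMQ has been in open public beta on Alibaba Cloud for nearly half a year. It is currently offered as a cloud service that is free for commercial use, with a promised 99.99% reliability, and it removes the operational complexity of running a self-hosted MQ product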

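Maturity

  • Kafka is relatively mature in the log domain
  • RocketMQ is used by a large number of applications inside Alibaba Group and generates massive volumes of messages every day. It has carried the message peaks of Tmall Double 11 several times and is a powerful tool for peak shaving and valley filling.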

Reposted from: http://www.cnblogs.com/zhaoyan001/p/8435203.html

 
