Kafka vs ActiveMQ

Kafka is a high-performance, distributed messaging system developed at LinkedIn. It is widely used for log collection, stream processing, and online and offline message distribution. Although it was not designed as a traditional MQ, in most cases Kafka can replace traditional messaging systems such as ActiveMQ.

Kafka organizes message streams by topic. The servers that store messages are called brokers, and consumers can subscribe to one or more topics. To balance load, the messages of a topic can be divided into multiple partitions. More partitions generally allow more parallelism and therefore higher throughput.
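
As a rough illustration of how a topic's messages can be spread over partitions, the sketch below assigns each message to a partition by hashing its key. The function names, the sample keys, and the use of CRC32 are assumptions made for this example; it is not Kafka's own partitioner.

```python
import zlib
from collections import defaultdict


def choose_partition(key: str, num_partitions: int) -> int:
    """Map a message key to a partition index with a stable hash."""
    # CRC32 is used only because it is deterministic across runs.
    # It is an assumption of this sketch, not Kafka's partitioner.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def distribute(messages, num_partitions):
    """Group (key, value) pairs into per-partition lists, keeping arrival order."""
    partitions = defaultdict(list)
    for key, value in messages:
        partitions[choose_partition(key, num_partitions)].append((key, value))
    return partitions


# Hypothetical sample data: events keyed by user.
messages = [
    ("user-1", "login"),
    ("user-2", "login"),
    ("user-1", "click"),
    ("user-1", "logout"),
]
partitions = distribute(messages, num_partitions=3)
```

Because the same key always hashes to the same partition, all events for `user-1` stay in one partition and keep their relative order, while different keys can be processed in parallel on different partitions.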

The Kafka versions described here rely on ZooKeeper for cluster coordination, and the Kafka distribution already bundles ZooKeeper. When deploying, you can start a ZooKeeper server and a Kafka server on the same machine, or you can use an existing ZooKeeper cluster.

Unlike a traditional MQ, Kafka requires consumers to track an offset themselves. When fetching messages from Kafka, a consumer pulls only the messages after its current offset. Kafka's Scala/Java client already implements this logic and saves the offset to ZooKeeper. Each consumer can choose an id, and each message is delivered only once among the consumers that share the same id. If all consumers of a topic use the same id, the topic behaves like a traditional queue; if each consumer uses a different id, it behaves like traditional pub-sub.
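
The offset and id behaviour described above can be modelled in a few lines. This is a minimal in-memory sketch, not the Kafka client: `TopicLog`, `OffsetStore`, and `Consumer` are hypothetical names, the topic has a single partition, and consumers sharing an id simply share one stored offset. Real Kafka instead assigns each partition to one consumer within the group.

```python
from collections import defaultdict


class TopicLog:
    """A single-partition topic modelled as an append-only list."""

    def __init__(self):
        self.messages = []

    def append(self, message):
        self.messages.append(message)

    def read_from(self, offset, max_messages):
        """Pull model: return only messages at positions >= offset."""
        return self.messages[offset:offset + max_messages]


class OffsetStore:
    """Stands in for the place offsets are saved (ZooKeeper in the text)."""

    def __init__(self):
        self._offsets = defaultdict(int)  # consumer id -> next offset to read

    def get(self, group_id):
        return self._offsets[group_id]

    def commit(self, group_id, offset):
        self._offsets[group_id] = offset


class Consumer:
    def __init__(self, group_id, log, offsets):
        self.group_id = group_id
        self.log = log
        self.offsets = offsets
        self.received = []

    def poll(self, max_messages=1):
        """Fetch messages after the stored offset, then advance it."""
        start = self.offsets.get(self.group_id)
        batch = self.log.read_from(start, max_messages)
        self.received.extend(batch)
        self.offsets.commit(self.group_id, start + len(batch))
        return batch


log = TopicLog()
for i in range(4):
    log.append(f"m{i}")
offsets = OffsetStore()

# Same id: the two consumers split the messages, like a queue.
a = Consumer("workers", log, offsets)
b = Consumer("workers", log, offsets)
for _ in range(2):
    a.poll()
    b.poll()

# Different id: this consumer sees every message, like pub-sub.
c = Consumer("audit", log, offsets)
c.poll(max_messages=4)
```

Here `a` and `b` each end up with half of the messages, while `c` receives all four because its id has its own offset.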

For the MQ use case, Kafka compares with ActiveMQ as follows:

Kafka Advantages

Distributed and highly scalable. A Kafka cluster can be scaled transparently by adding new servers to it.

High performance. Kafka's performance is much higher than that of traditional MQ implementations such as ActiveMQ and RabbitMQ, in part because Kafka supports batch operations. The following figure is the result of LinkedIn's consumer performance stress test:

[Figure: LinkedIn consumer performance stress test results]

Fault tolerance. The data of each partition in Kafka is replicated to several servers. When a broker fails, the ZooKeeper service notifies producers and consumers, which then switch to other brokers.
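
The failover idea can be sketched as follows. This is a deliberately simplified model under stated assumptions: `ClusterView` is a hypothetical class, the broker ids are invented, and the "leader" is just the first live broker in a partition's replica list. Real Kafka elects leaders among in-sync replicas, which this example does not attempt to reproduce.

```python
class ClusterView:
    """A client-side view of which brokers hold each partition."""

    def __init__(self, replicas):
        # partition id -> ordered list of broker ids holding a copy
        self.replicas = replicas
        self.alive = {b for brokers in replicas.values() for b in brokers}

    def mark_failed(self, broker_id):
        """Called when the coordination service reports a broker failure."""
        self.alive.discard(broker_id)

    def leader_for(self, partition):
        """Return the first live broker that holds the partition."""
        for broker in self.replicas[partition]:
            if broker in self.alive:
                return broker
        raise RuntimeError(f"no live replica for partition {partition}")


# Hypothetical layout: two partitions, each replicated to three brokers.
view = ClusterView({0: [1, 2, 3], 1: [2, 3, 1]})
before = view.leader_for(0)
view.mark_failed(1)  # broker 1 goes down
after = view.leader_for(0)
```

After broker 1 is marked as failed, requests for partition 0 go to the next broker that holds a replica, so the data stays reachable as long as one replica survives.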


Kafka Disadvantages

Duplicate messages. Kafka only guarantees that each message is delivered at least once. Although the probability is small, a message may be delivered more than once.

Out-of-order messages. Messages within a partition are guaranteed to be ordered, but if a topic has multiple partitions, ordering across partitions is not guaranteed.

Complexity. Kafka needs the support of a ZooKeeper cluster. Topics usually have to be created manually, and deployment and maintenance cost more than for a typical message queue.
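
One common way to live with at-least-once delivery is to make the consumer skip messages it has already processed. The sketch below does this by remembering the highest offset handled per partition, which relies on the per-partition ordering noted above. `IdempotentHandler` and the sample deliveries are hypothetical, and a real consumer would also need to persist this state.

```python
class IdempotentHandler:
    """Wraps a message handler so redelivered messages are ignored."""

    def __init__(self, handle):
        self.handle = handle
        self.last_offset = {}  # partition -> highest offset processed so far

    def on_message(self, partition, offset, value):
        """Process the message once; return False if it was a duplicate."""
        if offset <= self.last_offset.get(partition, -1):
            return False  # already processed: a redelivery
        self.handle(value)
        self.last_offset[partition] = offset
        return True


processed = []
handler = IdempotentHandler(processed.append)

# (partition, offset, value); offset 1 of partition 0 arrives twice.
deliveries = [(0, 0, "a"), (0, 1, "b"), (0, 1, "b"), (1, 0, "c"), (0, 2, "d")]
results = [handler.on_message(*d) for d in deliveries]
```

The repeated delivery of offset 1 on partition 0 is dropped, so each value reaches the wrapped handler exactly once.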

Original link: http://www.dongliu.net/post/622449
