Message Queue Learning: Understanding Kafka Concepts

Disclaimer: This is an original article by the blogger, released under the CC 4.0 BY-SA license. When reproducing it, please attach the original source link and this statement.
Original link: https://blog.csdn.net/ydm19891101/article/details/90759929

Earlier we studied the RabbitMQ and ActiveMQ middleware in turn. Today we move on to Kafka.

I. Overview

Kafka is a distributed publish/subscribe messaging system with the following characteristics:

  • High throughput for both publishing and subscribing. Kafka's design goal is to provide message persistence with O(1) time complexity.
  • Message persistence. Messages can be persisted to disk.
  • Distributed. Messages are partitioned across servers and consumed in a distributed fashion, while ordered delivery is guaranteed within each partition. Producers, Brokers, and Consumers are all distributed, which makes horizontal scaling easier.
  • Messages are consumed in pull mode. Producers push message data to Kafka, and consumers pull message data from Kafka: one side pushes, the other pulls.
  • Support for both offline and online scenarios. Kafka supports offline data processing and real-time data processing at the same time.
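The push/pull split can be sketched with a small in-memory model. This is only an illustration under simplifying assumptions: `PartitionLog` is a hypothetical class, not part of any Kafka client, and a Python list stands in for the on-disk log. Appending to the end of the list mirrors the idea of a cheap, append-only write, and the consumer decides when and how much to pull.

```python
class PartitionLog:
    """In-memory stand-in for one partition: an append-only list of messages."""

    def __init__(self):
        self._messages = []

    def append(self, message):
        """Producer side: push a message to the end of the log and return its offset."""
        self._messages.append(message)
        return len(self._messages) - 1

    def read(self, offset, max_messages):
        """Consumer side: pull up to max_messages starting at the given offset."""
        return self._messages[offset:offset + max_messages]


log = PartitionLog()

# The producer pushes three messages; the log keeps them in arrival order.
for text in ["a", "b", "c"]:
    log.append(text)

# The consumer pulls at its own pace and remembers where it stopped.
position = 0
first_batch = log.read(position, max_messages=2)
position += len(first_batch)
second_batch = log.read(position, max_messages=2)
position += len(second_batch)

print(first_batch, second_batch, position)
```

Because the consumer drives the reads, a slow consumer simply falls behind in the log instead of being flooded by the server.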

II. Basic Concepts

Before going further, let's go over some basic concepts (many of them are common to all message queues).

Broker: A Kafka cluster consists of one or more servers, each of which is called a Broker. Brokers are stateless in the sense that they do not track what each consumer has consumed. They support horizontal scaling: the more servers in the cluster, the greater the throughput.

Topic: Every message published to Kafka belongs to a category, and that category is called a Topic.

Partition: A physical division of a Topic. A Topic can be divided into multiple Partitions, and each Partition is an ordered queue.

Producer: The message producer, which can be understood as a client that publishes messages.

Consumer: The message consumer. A consumer can be assigned to a consumer group; if no group is specified, it belongs to the default group. In addition, the state of message processing is maintained on the Consumer side rather than on the server: the Broker is stateless, and each Consumer saves its own offset.

Consumer Group: Every consumer belongs to a specific consumer group, and a Topic can be consumed by multiple consumer groups.
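To make the relationship between Topic, Partition, and the consumers in one group more concrete, here is a minimal sketch. It is not how a real Kafka client works internally: `choose_partition` and `assign_partitions` are hypothetical helpers, `zlib.crc32` is only a stand-in hash, and the round-robin assignment is an assumed strategy chosen for simplicity.

```python
import zlib


def choose_partition(key, num_partitions):
    """Map a message key to a partition; the same key always lands in the same partition."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def assign_partitions(num_partitions, consumers):
    """Spread the partitions over the consumers of one group, round-robin."""
    assignment = {consumer: [] for consumer in consumers}
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return assignment


NUM_PARTITIONS = 3
topic = [[] for _ in range(NUM_PARTITIONS)]  # one ordered list per partition

# Messages with the same key stay in order because they share a partition.
for key, value in [("user-1", "login"), ("user-2", "login"), ("user-1", "logout")]:
    topic[choose_partition(key, NUM_PARTITIONS)].append((key, value))

group_assignment = assign_partitions(NUM_PARTITIONS, ["consumer-a", "consumer-b"])
print(group_assignment)
```

In this toy model each partition has exactly one owner inside the group, so the work is split without any message being read twice by the same group.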

III. Architecture

With the concepts covered, let's look at Kafka's architecture from a macro perspective.

 

Figure 1

 

The following table describes each of the components shown in the image.

No. / Component and description

1. Broker

A Kafka cluster usually consists of multiple Brokers to ensure high availability and increase throughput. Brokers are stateless, which makes horizontal scaling easier, and they use a ZooKeeper cluster to maintain their cluster state. A single Broker instance can handle thousands of reads and writes per second, and each Broker can handle terabytes of messages without a performance impact. Broker leader election can be done by ZooKeeper.

2. ZooKeeper

ZooKeeper is used to manage and coordinate the Brokers. The ZooKeeper service is mainly used to notify producers and consumers when a new Broker appears in the Kafka system or when a Broker fails. Based on these notifications about the presence or failure of a Broker, producers and consumers make their decisions and start coordinating their work with some other Broker.

3. Producers

Producers push data to Brokers. When a new Broker starts, all producers discover it and automatically begin sending messages to it. In the design described here, the producer does not wait for acknowledgments from the Broker and sends messages as fast as the Broker can handle them. (In current Kafka clients this behavior depends on the producer's `acks` setting.)

4. Consumers

Because Kafka Brokers are stateless, consumers must use the partition offset to keep track of how many messages they have consumed. If a consumer acknowledges a particular message offset, it means the consumer has consumed all earlier messages. The consumer issues an asynchronous pull request to the Broker so that a buffer of bytes is ready to consume. Consumers can rewind or skip to any point in a partition simply by supplying an offset value. The consumer offset value is notified by ZooKeeper.
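The offset behavior described for consumers in the table can be sketched as follows. This is a toy model under stated assumptions: `OffsetConsumer` and its `poll`, `commit`, and `seek` methods are hypothetical names for this illustration only, and a plain list stands in for a partition. It shows two points from the text: committing an offset implies that everything before it has been consumed, and supplying an offset is enough to rewind or skip.

```python
class OffsetConsumer:
    """Toy consumer that tracks its own position in one partition."""

    def __init__(self, partition):
        self._partition = partition  # a plain list standing in for a partition
        self.position = 0            # offset of the next message to read
        self.committed = 0           # everything before this offset counts as consumed

    def poll(self, max_messages=1):
        """Pull the next batch and advance the position."""
        batch = self._partition[self.position:self.position + max_messages]
        self.position += len(batch)
        return batch

    def commit(self):
        """Acknowledge the current position; all earlier offsets are implied."""
        self.committed = self.position

    def seek(self, offset):
        """Rewind or skip by supplying an offset."""
        if not 0 <= offset <= len(self._partition):
            raise ValueError("offset out of range")
        self.position = offset


partition = ["m0", "m1", "m2", "m3"]
consumer = OffsetConsumer(partition)

consumer.poll(max_messages=3)             # reads m0, m1, m2
consumer.commit()                         # the committed offset is now 3
consumer.seek(1)                          # rewind to offset 1
replayed = consumer.poll(max_messages=2)  # reads m1, m2 again
consumer.seek(len(partition))             # skip to the end of the partition
nothing_left = consumer.poll()            # nothing to read there

print(consumer.committed, replayed, nothing_left)
```

Since the position lives in the consumer, replaying old messages costs the Broker nothing extra: it just serves another read at a smaller offset.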

IV. Application Scenarios

At the beginning of the article we mentioned that Kafka supports both offline and online scenarios. Let's look at these scenarios in more detail.

Both scenarios explained below rely on Kafka's high throughput.

1. User behavior data collection (online)

User behavior data collection means gathering complete user behavior information from the front end, as required by data analysis and other services.

Kafka's original use case was rebuilding a user activity tracking pipeline: a user's web browsing activity is published as messages to central topics through Kafka, usually with one topic per type of activity. These messages can then be processed in real time, used for real-time monitoring, or loaded into a Hadoop cluster or an offline data warehouse.

Every user's browsing activity generates a lot of messages, so the amount of data is very large. Kafka's high throughput and pull-based consumption are well suited to this kind of scenario.
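The "one topic per type of activity" idea can be illustrated with a few lines of code. This is a sketch only: the `activity.` topic prefix, the event fields, and the `publish_activity` helper are all assumptions made for the example, and a dictionary of lists stands in for the Kafka topics.

```python
from collections import defaultdict

topics = defaultdict(list)  # topic name -> messages, a stand-in for Kafka topics


def publish_activity(event):
    """Route an event to the topic that matches its activity type."""
    topic_name = "activity." + event["type"]
    topics[topic_name].append(event)
    return topic_name


events = [
    {"user": "u1", "type": "page_view", "url": "/home"},
    {"user": "u2", "type": "search", "query": "kafka"},
    {"user": "u1", "type": "page_view", "url": "/news"},
]

for event in events:
    publish_activity(event)

print(sorted(topics))
```

Keeping each activity type in its own topic lets a downstream job subscribe only to the behavior it cares about.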

2. Log collection (offline)

Within a company, there should be a unified platform to manage the log files generated by its projects.

Thanks to Kafka's high throughput, logs can be written to Kafka in a unified way, and Kafka then exposes them to the various consumers through a unified interface.

At present, many companies build their unified logging platform by collecting important system logs into Kafka and then importing them into ES, HDFS, Storm, and other systems that consume specific log data, for real-time monitoring, real-time search and analysis, offline statistics, big data analytics, and data mining.
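A rough sketch of this fan-out is shown below. Everything here is a stand-in: `log_topic` is a plain list instead of a Kafka topic, and `LogSink` is a hypothetical class representing a downstream system such as a search index or an alerting job. The point it illustrates is that each downstream system keeps its own offset, so all of them can read the same collected logs independently.

```python
log_topic = []  # stand-in for one Kafka topic that holds all collected logs


def collect(service, line):
    """A project writes one log line to the shared topic."""
    log_topic.append({"service": service, "line": line})


class LogSink:
    """A downstream system that pulls from the topic and keeps its own offset."""

    def __init__(self, name, wanted):
        self.name = name
        self.wanted = wanted  # predicate: which records this sink cares about
        self.offset = 0       # next position to read in log_topic
        self.received = []

    def pull(self):
        """Read everything new since the last pull and keep the wanted records."""
        new_records = log_topic[self.offset:]
        self.offset += len(new_records)
        self.received.extend(r for r in new_records if self.wanted(r))


collect("order-service", "INFO order created")
collect("order-service", "ERROR payment failed")
collect("user-service", "INFO user logged in")

search_sink = LogSink("search", wanted=lambda r: True)
alert_sink = LogSink("alerts", wanted=lambda r: r["line"].startswith("ERROR"))

for sink in (search_sink, alert_sink):
    sink.pull()

print(len(search_sink.received), len(alert_sink.received))
```

Adding another downstream system only means adding another reader with its own offset; the projects that write the logs do not need to change.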

 

V. Other

Here are two questions I still have:

  • Does Kafka guarantee reliable delivery of messages?
  • In publish/subscribe mode, how is reliable delivery guaranteed? For example, when a topic is subscribed to by multiple consumers, does each subscriber need to send an ack message to the Broker?
