Kafka interview questions: this article is all you need

1. Introduction to Kafka

Reasons for high throughput:
1. Consumers read data in batches.
2. Consumers in a consumer group can consume different partitions at the same time.
3. The producer places messages in a buffer and sends them in batches. The drawback is that buffered messages are easy to lose.
4. Zero copy, which Java NIO supports directly, reduces the number of data copies.
5. Data compression reduces the bandwidth needed for transmission.

Without zero copy, sending a disk file over the network takes four copies:
1. The first copy: read the disk file into the kernel buffer of the operating system.
2. The second copy: copy the data from the kernel buffer into the application's buffer.
3. The third copy: copy the data from the application's buffer into the socket send buffer (a buffer that belongs to the operating system kernel).
4. The fourth copy: copy the data from the socket buffer to the network card, which performs the network transmission.
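The difference between the four-copy path and zero copy can be sketched in a few lines. This is only an illustration in Python, not Kafka's own code (Kafka is written in Java and Scala). The function names and the temporary file are made up for the example. socket.sendfile() asks the kernel to move the file to the socket without passing through an application buffer where the platform supports it, and falls back to an ordinary send otherwise.

```python
import os
import socket
import tempfile


def send_with_copy(path: str, sock: socket.socket) -> int:
    """Traditional path: read() into an application buffer, then send()."""
    sent = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(4096)      # kernel buffer -> application buffer
            if not chunk:
                break
            sock.sendall(chunk)       # application buffer -> socket buffer
            sent += len(chunk)
    return sent


def send_zero_copy(path: str, sock: socket.socket) -> int:
    """Let the kernel move the file to the socket directly."""
    with open(path, "rb") as f:
        # Uses os.sendfile() when available; otherwise falls back to send().
        return sock.sendfile(f)


def demo(payload: bytes, sender=send_zero_copy) -> bytes:
    """Write payload to a temp file, send it with `sender`, return what arrived."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
        path = tmp.name
    left, right = socket.socketpair()
    try:
        sender(path, left)
        left.close()                  # end of stream for the reader
        received = bytearray()
        while True:
            chunk = right.recv(4096)
            if not chunk:
                break
            received.extend(chunk)
        return bytes(received)
    finally:
        left.close()
        right.close()
        os.unlink(path)
```

Both senders deliver the same bytes. The zero-copy version simply never holds the file contents in the application's memory, which removes the second and third copies from the list above. The payload in the demo is kept small so that it fits in the socket buffer of a single-threaded example.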

Kafka moves data along two paths:
1. Persisting network data to disk (producer to broker)
2. Sending disk files over the network (broker to consumer)

The role of broker.id: it tells the brokers in a cluster apart.
Kafka does not follow the JMS (Java Message Service) specification; it offers only publish/subscribe.
It is fast and scalable.
It partitions topics.

There are multiple producers, multiple brokers, and multiple consumers in a Kafka architecture. Each producer can correspond to multiple topics, and each consumer can only correspond to one consumer group.

Author: Big Data Division Chief Data
Link: https://www.jianshu.com/p/4bf007885116
Source: Jianshu
Copyright belongs to the author. For commercial reproduction, please contact the author for authorization; for non-commercial reproduction, please indicate the source.

2. Why is RabbitMQ not clustered?

A sharded cluster, similar to Redis Cluster with its 16384 hash slots, is the best design because no redundant data is stored.
What about downtime? Store replicas.

3. Do queues and exchanges persist messages?

4. Terminology

Broker: A broker is one MQ server. Multiple brokers are multiple MQ servers that together form a cluster.

Topic: One MQ server can store multiple different topics. Each topic is a category of messages.
Message: The data passed in asynchronous communication.

Partition: Partitioning splits a large data set into smaller groups, much as a sharded database divides 10 million rows across 10 tables (slots).

How Kafka implements partitioning: a topic is split into partitions, and the partitions are distributed across the brokers.

Producer: Delivers messages to MQ.
Consumer: Receives messages from MQ. Some MQ servers push messages to consumers; in Kafka, consumers pull messages from the broker.
Consumer Group: A group of consumers.
Offset: The index position of a message within its partition.
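To make the idea of an offset concrete, here is a toy model of one partition as an append-only list, with one remembered position per consumer group. The class and function names (PartitionLog, GroupOffsets, poll) are invented for this sketch and are not Kafka APIs. Real Kafka stores records in segment files on disk and keeps committed offsets in an internal topic.

```python
from typing import Dict, List, Tuple


class PartitionLog:
    """Toy append-only log standing in for one partition."""

    def __init__(self) -> None:
        self._records: List[bytes] = []

    def append(self, record: bytes) -> int:
        """Append at the end of the log and return the record's offset."""
        self._records.append(record)
        return len(self._records) - 1

    def read(self, offset: int, max_records: int = 10) -> List[Tuple[int, bytes]]:
        """Return up to max_records (offset, record) pairs starting at offset."""
        batch = self._records[offset:offset + max_records]
        return [(offset + i, rec) for i, rec in enumerate(batch)]


class GroupOffsets:
    """Remembers, per consumer group, the next offset to read."""

    def __init__(self) -> None:
        self._next: Dict[str, int] = {}

    def position(self, group: str) -> int:
        return self._next.get(group, 0)

    def commit(self, group: str, next_offset: int) -> None:
        self._next[group] = next_offset


def poll(log: PartitionLog, offsets: GroupOffsets, group: str,
         max_records: int = 10) -> List[bytes]:
    """Read the next batch for a group and advance its committed offset."""
    batch = log.read(offsets.position(group), max_records)
    if batch:
        last_offset = batch[-1][0]
        offsets.commit(group, last_offset + 1)
    return [rec for _, rec in batch]
```

Because each group keeps its own position, two groups can read the same partition independently, while a single group never sees the same record twice once its offset has moved forward.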

Kafka has partitions and topics but no queues; it offers only topics for publish/subscribe.

5. Distributed transactions

Solving distributed transactions is an approach that does not depend on any particular framework.
Common solutions are 2PC, 3PC, and MQ.

6. Why Kafka relies on ZooKeeper

All brokers register with ZooKeeper, which is what turns them into a cluster. Node event notifications tell each broker which brokers belong to the cluster.

Because:
1. Kafka stores MQ information in ZooKeeper, and consumers also register in ZooKeeper. Consumers do not need to know how many brokers there are; they get that information directly from ZooKeeper.
2. To make the whole cluster easy to expand, brokers use ZooKeeper event notifications to detect one another.

To tell the brokers apart, Kafka only needs a different broker.id on each one.

7. Does cluster registration go to every node or to just one node?

In a master/slave cluster, registration goes to a single node;
in a cluster such as Eureka, it goes to every node.

8. The difference between Kafka and RabbitMQ

9. How does Kafka guarantee message order

10. The difference between a queue and a topic

Queue: First-in-first-out
Topic: Encapsulation of queues

11. Why does MQ have a message ordering problem?

Background:
1. Consumer cluster
2. MQ server cluster

When message order is not disrupted:
1. The messages delivered by the producer are in the same broker and are consumed by the same consumer.

How to solve it:
1. Messages with the same key have the same hash, so they are delivered to the same partition, and therefore the same broker. This works provided there is only one consumer.
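The "same key, same hash, same destination" rule can be sketched as follows. This is a simplified stand-in, not Kafka's partitioner: it uses CRC32 from the Python standard library, whereas the Kafka Java client's default partitioner uses a murmur2 hash of the key bytes. The function names and the sample order keys are assumptions for the example.

```python
import zlib
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def partition_for(key: bytes, num_partitions: int) -> int:
    """Map a key to a partition. CRC32 is a stand-in for Kafka's murmur2."""
    return zlib.crc32(key) % num_partitions


def route(messages: Iterable[Tuple[bytes, str]],
          num_partitions: int) -> Dict[int, List[Tuple[bytes, str]]]:
    """Append each (key, value) message to the partition chosen by its key."""
    partitions: Dict[int, List[Tuple[bytes, str]]] = defaultdict(list)
    for key, value in messages:
        partitions[partition_for(key, num_partitions)].append((key, value))
    return dict(partitions)
```

Every message for a given key lands in one partition in the order it was sent, so a single consumer reading that partition sees the key's messages in order. Note that changing the number of partitions changes the mapping, so ordering for a key only holds while the partition count stays the same.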

Why does MQ cause a message ordering problem?
1. If all the messages a producer sends to MQ represent the same operation, there is no need to consider order. When the operations differ, the order of the messages must be considered.
2. MQ stores messages in order by default, following the first-in, first-out principle (in the case of a single MQ server).
3. If multiple consumers subscribe to the same queue, the order in which messages are consumed may be disrupted.
4. If the broker is a cluster, different messages may be stored in different brokers, so a single consumer may receive them out of order.

How to solve it?
Ensure that the messages delivered by the producer are placed in the same broker and are consumed by only one consumer. (In Kafka, setting the same key on the messages guarantees delivery to the same partition, and therefore the same broker.)
However, the throughput of a single consumer is very low. To compensate, the consumer can fetch messages in batches and store them in in-memory queues (the queue is also chosen by key), with each in-memory queue processed by exactly one thread.
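The in-memory queue idea can be sketched like this. It is a minimal illustration under assumptions, not a production consumer: the batch is a plain list of (key, value) pairs rather than records from a Kafka client, the function name and worker count are made up, and "processing" just records the order in which each worker handled its values.

```python
import queue
import threading
import zlib
from typing import Dict, List, Tuple


def process_in_key_order(batch: List[Tuple[bytes, int]],
                         num_workers: int = 4) -> Dict[bytes, List[int]]:
    """Dispatch a fetched batch to per-thread queues chosen by key hash.

    Returns, for each key, the values in the order they were processed.
    """
    queues = [queue.Queue() for _ in range(num_workers)]
    processed: Dict[bytes, List[int]] = {}
    lock = threading.Lock()

    def worker(q: queue.Queue) -> None:
        while True:
            item = q.get()
            if item is None:          # sentinel: no more work for this thread
                return
            key, value = item
            with lock:                # protects the shared result dict only
                processed.setdefault(key, []).append(value)

    threads = [threading.Thread(target=worker, args=(q,)) for q in queues]
    for t in threads:
        t.start()
    for key, value in batch:
        # Same key -> same queue -> same thread, so per-key order is kept.
        queues[zlib.crc32(key) % num_workers].put((key, value))
    for q in queues:
        q.put(None)
    for t in threads:
        t.join()
    return processed
```

Different keys are processed in parallel, while all messages for one key go through a single FIFO queue and a single thread. Order is therefore preserved per key, not across keys.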

Within a consumer group, each partition in Kafka is assigned to one consumer.
In the same group, a given message is consumed by only one consumer.
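A simplified sketch of how partitions might be spread over the consumers of one group is shown below. Kafka ships several assignment strategies; this round-robin function is a hypothetical illustration of the principle and not the client's actual assignor code.

```python
from typing import Dict, List


def assign_round_robin(partitions: List[int],
                       consumers: List[str]) -> Dict[str, List[int]]:
    """Give every partition to exactly one consumer in the group."""
    if not consumers:
        raise ValueError("a consumer group needs at least one consumer")
    assignment: Dict[str, List[int]] = {c: [] for c in consumers}
    for i, partition in enumerate(sorted(partitions)):
        owner = consumers[i % len(consumers)]
        assignment[owner].append(partition)
    return assignment
```

Since a partition has a single owner inside the group, each message is handled by one consumer of that group. If there are more consumers than partitions, the extra consumers receive nothing and sit idle.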

12. Reasons for Kafka's high throughput

1. Data is stored with sequential writes.
2. Both producers and consumers support batch processing. Producers deliver messages asynchronously through a buffer. Disadvantage: data may be lost.
3. Zero copy of data, which NIO supports directly.
4. Topics are stored in partitions.
5. Data is compressed to reduce the bandwidth needed for transmission.
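Points 2 and 5 can be illustrated together with a toy producer that buffers records and sends each full batch as one compressed payload. The BatchingProducer class, its batch_size parameter, and the send callback are assumptions made for this sketch. A real Kafka producer exposes comparable behaviour through configuration such as batch.size, linger.ms and compression.type.

```python
import gzip
import json
from typing import Callable, List


class BatchingProducer:
    """Toy producer: buffer records, then send each batch gzip-compressed."""

    def __init__(self, send: Callable[[bytes], None], batch_size: int = 100) -> None:
        self._send = send
        self._batch_size = batch_size
        self._buffer: List[dict] = []

    def produce(self, record: dict) -> None:
        """Buffer the record; nothing reaches the network until a flush."""
        self._buffer.append(record)
        if len(self._buffer) >= self._batch_size:
            self.flush()

    def pending(self) -> int:
        """Records that would be lost if the process died right now."""
        return len(self._buffer)

    def flush(self) -> None:
        """Serialize the buffered records, compress them, send one payload."""
        if not self._buffer:
            return
        payload = "\n".join(json.dumps(r) for r in self._buffer).encode("utf-8")
        self._send(gzip.compress(payload))
        self._buffer = []
```

Compressing a whole batch works better than compressing single messages because similar records share repeated field names and values. The pending() count shows the drawback named in point 2: whatever is still in the buffer has not been sent and is lost if the producer crashes before the next flush.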

Origin blog.csdn.net/qq_42972645/article/details/104766567