20,000+ words to fully understand Kafka!

1. Why we need a messaging system

1. Decoupling
2. Asynchronous processing

Take flash-sale (seckill) events on e-commerce platforms as an example.

The general process is divided into:

  1. Risk control
  2. Inventory locking
  3. Order generation
  4. SMS notification
  5. Data update

A messaging system splits the flash-sale business: the steps that do not need to be handled urgently are moved behind the queue and processed slowly.

The process is changed to:

  1. Risk control
  2. Inventory locking
  3. Message system
  4. Order generation
  5. SMS notification
  6. Data update
3. Flow control

3.1 After the gateway receives a request, it puts the request into the message queue.

3.2 The back-end service pulls the request from the message queue, completes the subsequent flash-sale steps, and then returns the result to the user.

  • Pros: the flow is controlled
  • Cons: the overall process becomes slower


2. Core concepts of Kafka

  • Producer: produces data and writes it to the Kafka cluster
  • Consumer: fetches data from Kafka, processes it, and thereby consumes it

Kafka uses a pull model: consumers pull the data from Kafka themselves.

  • Topic: a logical category of messages
  • Partition: a subdivision of a topic

By default a topic has one partition; you can configure multiple partitions, and the partitions are stored on different broker nodes.

This solves the problem of storing massive amounts of data.

For example: suppose there are 2 TB of data and each server has only 1 TB of disk. A topic can be split into multiple partitions stored on different servers, which solves the massive-data storage problem.
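To make the topic/partition relationship concrete, here is a minimal sketch using Kafka's Java AdminClient to create a topic with multiple partitions; the broker address and topic name are made-up examples, not from the original article:

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;
import java.util.Collections;
import java.util.Properties;

public class CreateTopicDemo {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "hadoop1:9092"); // hypothetical broker
        try (AdminClient admin = AdminClient.create(props)) {
            // 3 partitions spread the topic's data across brokers; replication factor 2
            NewTopic topic = new NewTopic("topic_a", 3, (short) 2);
            admin.createTopics(Collections.singleton(topic)).all().get();
        }
    }
}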

3. Kafka's cluster architecture

In a Kafka cluster, a Kafka server is a broker, a topic is just a logical concept, and a partition is represented as a directory on the disk.

Consumer group: when consuming data, a consumer must specify a group id.

Assuming programs A and B specify the same group id, the two programs belong to the same consumer group.

Special cases:

  • For example, there is a topic topicA; if program A is consuming topicA, program B can no longer consume it (program A and program B belong to the same consumer group)
  • Another example: program A has already consumed topicA's data; it is not allowed to consume the same data of topicA again, but after specifying a new group id it can consume it again

Different consumer groups do not affect each other. The consumer group name must be specified by you, while the consumer (client) name is generated automatically and is unique.

Controller: the master node among the Kafka brokers, elected with the help of ZooKeeper.

4. Kafka disk sequential write guarantees data write performance

How Kafka writes data:

Sequential write: when writing data to disk, Kafka appends data; there are no random write operations.

Rule of thumb:

For a server with a given number of disks at a given rotation speed, sequentially writing (appending) data to disk is close in speed to writing to memory.

Producers produce messages, which the Kafka service first writes into the OS cache (memory); the OS then syncs them to disk sequentially.

5. Kafka's zero-copy mechanism ensures high performance in reading data

The consumer read path without zero copy:

  1. The consumer sends a request to the Kafka service
  2. The Kafka service reads data from the OS cache (if it is not cached, it reads from disk)
  3. Data is read from disk into the OS cache
  4. The OS cache copies the data into the Kafka application (user space)
  5. Kafka copies the data into the socket cache
  6. The socket cache sends the data to the consumer through the network card

With Linux's sendfile mechanism (zero copy), the path becomes:

  1. The consumer sends a request to the Kafka service
  2. The Kafka service reads data from the OS cache (if it is not cached, it reads from disk)
  3. Data is read from disk into the OS cache
  4. The OS cache sends the data directly to the network card
  5. The network card transmits the data to the consumer
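On the JVM, sendfile is exposed through FileChannel.transferTo, which is what Kafka relies on for zero-copy transfers. Below is a minimal sketch; the log file name and socket address are made up for illustration:

import java.io.FileInputStream;
import java.net.InetSocketAddress;
import java.nio.channels.FileChannel;
import java.nio.channels.SocketChannel;

public class ZeroCopyDemo {
    public static void main(String[] args) throws Exception {
        try (FileChannel file = new FileInputStream("00000000000000000000.log").getChannel();
             SocketChannel socket = SocketChannel.open(new InetSocketAddress("localhost", 9092))) {
            long position = 0, count = file.size();
            // transferTo hands the copy to the kernel (sendfile): data goes
            // OS cache -> network card without passing through user space
            while (position < count) {
                position += file.transferTo(position, count - position, socket);
            }
        }
    }
}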

6. Kafka logs are stored in segments

A topic in Kafka is usually given several partitions. Suppose topic_a is created and three partitions are specified at creation time.

Then, on three servers, three directories will be created.

Server 1 (kafka1):

  • Creates directory topic_a-0
  • Under this directory live the data files; Kafka's data is messages, and they are stored in log files
  • Files ending in .log are the log files; in Kafka, data files are called log files

By default a partition holds n log files (segmented storage), and one log file defaults to 1 GB.

Server 2 (kafka2):

  • Creates directory topic_a-1

Server 3 (kafka3):

  • Creates directory topic_a-2

7. Kafka locates data with binary search

Each message in Kafka has its own offset (a relative offset) and is stored on the physical disk at some position.

Position: the physical location on the disk.

That is, a message has two coordinates:

  • offset: the relative offset (logical position)
  • position: the physical location on disk

Sparse index:

  • Kafka uses a sparse index to locate data: every time roughly 4 KB of log (.log) data is written, one entry is written into the index file.

Lookups over this index use binary search.
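A minimal sketch of the idea, with made-up index arrays: find the last index entry whose offset is at most the target, then scan the log forward from its position.

public class SparseIndexLookup {
    // parallel arrays: one entry per ~4 KB of log data (made-up sample values)
    static final long[] OFFSETS   = {0, 45, 90, 134, 180};          // relative offsets
    static final long[] POSITIONS = {0, 4096, 8192, 12288, 16384};  // byte positions in the .log file

    // binary search: largest index entry with offset <= target
    static long positionFor(long targetOffset) {
        int lo = 0, hi = OFFSETS.length - 1, best = 0;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (OFFSETS[mid] <= targetOffset) { best = mid; lo = mid + 1; }
            else { hi = mid - 1; }
        }
        return POSITIONS[best]; // start scanning the .log sequentially from here
    }

    public static void main(String[] args) {
        System.out.println(positionFor(100)); // prints 8192: scan from there to offset 100
    }
}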

8. High concurrency network design (understand NIO first)

The network design is the best-designed part of Kafka, and it is what guarantees Kafka's high concurrency and high performance.

To tune Kafka, you must understand its principles well, especially the network design.

Reactor network design pattern 1:

Reactor network design pattern 2:

Reactor network design pattern 3:

Kafka ultra-high concurrent network design:

9. Kafka's redundant replicas ensure high availability

Partitions in Kafka have replicas (note: before 0.8 there was no replica mechanism). When creating a topic, you can specify the number of partitions and the number of replicas. Replicas have roles:

Leader partition:

  • All writes and reads go through the leader partition.
  • It maintains an ISR (in-sync replica) list; replicas are removed from the ISR list according to certain rules (e.g., falling too far behind the leader).

The producer sends a message, which is first written to the leader partition.

After that, the message is written to the other replicas in the ISR list; only once that is done is the message considered committed.

Follower partition: synchronizes data from the leader partition.

10. Excellent architectural thinking - summary

Kafka: high concurrency, high availability, high performance

  • High availability: multi-replica mechanism
  • High concurrency: three-tier network architecture: multi-selector -> multi-threading -> queue design (NIO)
  • High performance:

Write path:

  1. Data is written to the OS cache first
  2. Disk writes are sequential, which gives high performance

Read data:

  1. According to the sparse index, quickly locate the data to be consumed

  2. Zero-copy mechanism

    • Less copying of data
    • Fewer context switches between the application and the operating system

11. Kafka production environment construction

11.1 Demand Scenario Analysis

The e-commerce platform needs to send 1 billion requests to the Kafka cluster every day. Applying the 80/20 rule, a rough estimate is good enough.

1 billion requests over 24 hours: under normal circumstances there is little traffic between 12 midnight and 8 am, so roughly 80% of the requests (800 million) are handled in the remaining 16 hours, and 80% of those 800 million arrive within 20% of that time (16 * 0.2 ≈ 3 hours).

In other words, about 600 million requests are processed in 3 hours. A simple peak-QPS estimate:

600 million / 3 hours ≈ 55,000/s, so peak QPS ≈ 55,000

1 billion requests * 50 KB = 46 TB: 46 TB of data needs to be stored every day.

Under normal circumstances we keep two replicas: 46 TB * 2 = 92 TB. Kafka retains data for a limited period; keeping the last 3 days:

92 TB * 3 days = 276 TB

A note on the 50 KB: it does not mean a single message is 50 KB (logs are merged, with multiple log lines combined into one message); normally a message is a few bytes to a few hundred bytes.

11.2 Evaluation of the number of physical machines

1) First analyze whether you need a virtual machine or a physical machine

When building clusters such as Kafka, MySQL, or Hadoop, we use physical machines in production.

2) The total peak load is 55,000 requests per second. In fact, one or two physical machines could certainly withstand it, but normally we size a cluster at 4 times the peak.

At 4 times, the cluster should sustain about 200,000 QPS, which makes it a reasonably safe cluster. That takes about 5 physical machines, each handling 40,000 requests.

Scenario summary:

  • To handle 1 billion requests, with a 55,000 QPS peak and 276 TB of data, 5 physical machines are required.

11.3 Disk selection

To handle 1 billion requests, with a 55,000 QPS peak and 276 TB of data, 5 physical machines are required.

1) SSD solid-state drives, or ordinary mechanical hard drives?

  • SSD: better performance, but expensive
  • SAS disk: weaker performance in some respects, but relatively cheap

The SSD's better performance mainly means better random read/write performance, which suits a cluster like MySQL.

But its sequential write performance is similar to a SAS disk's.

Kafka writes sequentially (appends), so we can use ordinary mechanical hard disks.

2) How many disks does each server need?

5 servers must hold 276 TB in total, so each server stores about 60 TB of data. With our company's server configuration, 11 hard drives of 7 TB each: 11 * 7 TB = 77 TB.

77 TB * 5 servers = 385 TB

Scenario summary:

  • To handle 1 billion requests, 5 physical machines are required, each with 11 (SAS) * 7 TB disks

11.4 Memory evaluation

To handle 1 billion requests: 5 physical machines, 11 (SAS) * 7 TB disks.

We have seen that Kafka reads and writes through the OS cache. In other words, if the OS cache were unlimited, Kafka would effectively be operating on memory, and performance would be excellent. But memory is limited.

  • Give as much memory as possible to the OS cache
  • Kafka's core code is written in Scala and the client in Java; both run on the JVM, so part of the memory must go to the JVM

Kafka's design does not keep many data structures inside the JVM, so the JVM does not need much memory; from experience, 10 GB is enough.

(Contrast the NameNode, which keeps metadata, tens of gigabytes, in the JVM and therefore needs far more, e.g. 100 GB.)

Suppose our 1-billion-request deployment has 100 topics in total: 100 topics * 5 partitions * 2 replicas = 1,000 partitions.

A partition is a directory on a physical machine, and under this directory there are many .log files.

  • .log files store the data. By default, one .log file is 1 GB in size.

If we want the latest .log file of each of the 1,000 partitions to sit entirely in memory, performance is at its best: 1,000 * 1 GB = 1,000 GB of memory.

In practice it is enough to keep the latest 25% of the current log segment in memory: 25% of 1 GB is 250 MB, and 250 MB * 1,000 = 250 GB of memory.

  • 250 GB / 5 servers = 50 GB of memory per server
  • 50 GB + 10 GB (JVM) = 60 GB of memory per server

That suggests 64 GB of memory, with the remaining 4 GB for the operating system, which needs memory too. In fact, Kafka's JVM doesn't strictly need a full 10 GB, so 64 GB is a workable estimate. Of course, servers with 128 GB of memory would be best.

This estimate assumed topics with 5 partitions; a topic with a large data volume may have 10 partitions.

Summarize:

  • To handle 1 billion requests: 5 physical machines, 11 (SAS) * 7 TB disks, and 64 GB of memory (128 GB is better)

11.5 CPU Pressure Evaluation

Estimate how many CPU cores each server needs (resources are limited).

We estimate the CPU count from how many threads our service runs; threads run on CPU cores. If there are many threads but few cores, the machine load will be very high and performance will suffer.

Estimate: how many threads does a Kafka server run once started?

  • Acceptor thread: 1
  • Processor threads: 3 by default, usually tuned to 6-9
  • Request-handler threads: 8 by default, usually tuned to 32
  • Plus threads for periodic cleanup, data fetching, periodic ISR checks, and so on

So after a Kafka service starts, there are more than a hundred threads.

  • With 4 CPU cores, dozens of busy threads would saturate the CPU.
  • With 8 CPU cores, dozens of threads are comfortable.

If there are 100, or nearly 200, threads, then 8 cores cannot keep up.

So here we suggest:

  • CPU cores = 16. If possible, 32 CPU cores would be best.

In conclusion:

  • The Kafka cluster needs a minimum of 16 CPU cores per machine, and 32 CPU cores would be even better.
  • 2 CPUs * 8 cores = 16 cores
  • 4 CPUs * 8 cores = 32 cores

Summarize:

  • To handle 1 billion requests: 5 physical machines, 11 (SAS) * 7 TB disks, 64 GB of memory (128 GB is better), and 16 CPU cores (32 is better)

11.6 Network needs assessment

What kind of network card do we need?

Generally either a Gigabit NIC (1 Gb/s) or a 10-Gigabit NIC (10 Gb/s).

During the peak, 55,000 requests per second flow in: 55,000 / 5 ≈ 11,000, call it roughly 10,000 requests per server. As computed before, 10,000 * 50 KB ≈ 488 MB: each server receives about 488 MB of data per second. The data also has replicas, and replica synchronization is network traffic too: 488 MB * 2 = 976 MB/s.

Explain:

  • At many companies a single request is nowhere near 50 KB; our company batches data on the producer side, merging multiple records into one request, which is why a request is this large.
  • Under normal circumstances a NIC's bandwidth cannot be fully saturated anyway; on a Gigabit card roughly 700M is usable in practice. Ideally, use 10-Gigabit NICs.
  • With 10 Gigabit, it is very comfortable.

11.7 Cluster Planning

  • Request volume
  • Plan the number of physical machines
  • Analyze the number of disks and choose what kind of disk to use
  • Memory
  • cpu core
  • network card

The point is the train of thought: when the company needs resources or servers evaluated in the future, evaluate them along these lines.

Message sizes differ per scenario: here 50 KB; elsewhere it may be 1 KB, 500 bytes, or even 1 MB.

IP and hostname planning:

  • 192.168.0.100 hadoop1
  • 192.168.0.101 hadoop2
  • 192.168.0.102 hadoop3

Host planning. Kafka cluster architecture: master-slave architecture:

  • controller -> manages the metadata of the entire cluster through the ZooKeeper cluster

zookeeper cluster

  • hadoop1
  • hadoop2
  • hadoop3

kafka cluster

  • In theory, the Kafka services should not be installed together with the ZooKeeper services.
  • But our servers are limited, so our Kafka cluster is also installed on hadoop1, hadoop2, and hadoop3.

11.8 Zookeeper cluster construction

11.9 Detailed Explanation of Core Parameters

11.10 Cluster stress test

12. Kafka operation and maintenance

12.1 Introduction to common operation and maintenance tools

KafkaManager: a web-based management tool

12.2 Common operation and maintenance commands

Scenario 1: The topic's data volume has grown too large, and the number of partitions needs to be increased

When the topic was first created, the data volume was small, so only a few partitions were specified.

kafka-topics.sh --create --zookeeper hadoop1:2181,hadoop2:2181,hadoop3:2181 --replication-factor 1 --partitions 1 --topic test6

kafka-topics.sh --alter --zookeeper hadoop1:2181,hadoop2:2181,hadoop3:2181 --partitions 3 --topic test6

broker id:

  • hadoop1:0
  • hadoop2:1
  • hadoop3:2

Suppose partition0 has three replicas: a, b, c

  • a: leader partition
  • b, c: follower partitions
  • ISR: {a, b, c}

If a follower partition does not pull data from the leader partition for more than 10 seconds, it is removed from the ISR list.

Scenario 2: The core topic increases the replication factor

If you need to increase the replication factor for core business data

vim test.json, and save the following JSON:

{"version":1,"partitions":[{"topic":"test6","partition":0,"replicas":[0,1,2]},{"topic":"test6","partition":1,"replicas":[0,1,2]},{"topic":"test6","partition":2,"replicas":[0,1,2]}]}

Execute with the above JSON file:

kafka-reassign-partitions.sh --zookeeper hadoop1:2181,hadoop2:2181,hadoop3:2181 --reassignment-json-file test.json --execute

Scenario 3: topic with unbalanced load, manual migration

vi topics-to-move.json

{"topics": [{"topic": "test01"}, {"topic": "test02"}], "version": 1}
// list all of your topics here

kafka-reassign-partitions.sh --zookeeper hadoop1:2181,hadoop2:2181,hadoop3:2181 --topics-to-move-json-file topics-to-move.json --broker-list "5,6" --generate
// list all brokers here, including the newly added ones; all partitions will then be spread evenly across the brokers, including the new ones

At this point, a migration plan will be generated, which can be saved in a file: expand-cluster-reassignment.json

kafka-reassign-partitions.sh --zookeeper hadoop01:2181,hadoop02:2181,hadoop03:2181 --reassignment-json-file expand-cluster-reassignment.json --execute

kafka-reassign-partitions.sh --zookeeper hadoop01:2181,hadoop02:2181,hadoop03:2181 --reassignment-json-file expand-cluster-reassignment.json --verify

This kind of data migration must be done at night during off-peak hours, because it moves data between machines and consumes a lot of bandwidth.

  • generate: produce a migration plan from the given topic list and broker list. generate does not actually migrate messages; it computes the plan for the execute command to use.
  • execute: migrate according to the given migration plan.
  • verify: check whether the migration has completed.

Scenario 4: A broker has too many leader partitions

Normally, leader partitions are load-balanced across servers. But suppose they end up distributed like this:

  • hadoop1 4
  • hadoop2 1
  • hadoop3 1

Now every business team can create topics themselves; partition counts are allocated automatically and adjusted dynamically afterwards. Kafka itself spreads leader partitions evenly across machines, which keeps each machine's read/write throughput even.

But there are exceptions: if some brokers go down, leader partitions concentrate on the few remaining brokers, putting high read/write pressure on those brokers; and after the downed brokers restart, they hold only follower partitions and serve very few read/write requests.

This imbalance is governed by the parameter auto.leader.rebalance.enable, default true: every 300 seconds (leader.imbalance.check.interval.seconds) Kafka checks whether the leader load is balanced.

If a broker's leader imbalance exceeds 10% (leader.imbalance.per.broker.percentage), leader re-election is triggered for that broker.

Configuration parameters:

  • auto.leader.rebalance.enable: default true
  • leader.imbalance.per.broker.percentage: the ratio of unbalanced leaders allowed per broker; if a broker exceeds this value, the controller triggers leader balancing. The value is a percentage; default 10%.
  • leader.imbalance.check.interval.seconds: default 300 seconds

13. Kafka producer

13.1 How the producer sends messages

13.2 How the producer sends messages: basic demo
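The original demo is not reproduced here; as a stand-in, a minimal producer sketch (the broker addresses and topic name are made-up examples):

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import java.util.Properties;

public class ProducerDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "hadoop1:9092,hadoop2:9092,hadoop3:9092"); // hypothetical
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            ProducerRecord<String, String> record = new ProducerRecord<>("test6", "key1", "hello kafka");
            // send is asynchronous; the callback reports success or failure
            producer.send(record, (metadata, exception) -> {
                if (exception != null) exception.printStackTrace();
                else System.out.println("partition=" + metadata.partition() + ", offset=" + metadata.offset());
            });
        }
    }
}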

13.3 How to improve throughput

How to improve throughput. Parameter 1: buffer.memory:

The buffer for messages waiting to be sent; the default value is 33554432, i.e. 32 MB.

Parameter two: compression.type:

The default is none (no compression); lz4 compression works well: after compression the data volume shrinks and throughput rises, at the cost of extra CPU on the producer side.

Parameter three: batch.size:

  • The size of a batch. If batches are too small, frequent network requests lower throughput.
  • If batches are too large, a message may wait a long time before it is sent, and the memory buffer comes under pressure from too much buffered data. The default value is 16384, i.e. 16 KB: a batch is sent once it fills 16 KB. In production this value is usually increased to improve throughput, but an oversized batch adds latency, so set it according to your typical message size.
  • If messages are sparse, pair it with linger.ms, which defaults to 0 (send immediately). That default wastes batching, so it is usually set to something like 100 milliseconds: a message enters a batch when sent; if the batch fills 16 KB within 100 ms it goes out right away, otherwise it goes out when the 100 ms expire. A config sketch follows below.
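Putting these together, a hedged producer tuning sketch; the exact values are illustrative, not recommendations:

Properties props = new Properties();
// ...bootstrap.servers and serializers as in the producer demo above...
props.put("buffer.memory", 67108864);   // 64 MB send buffer (default 32 MB)
props.put("compression.type", "lz4");   // trade producer CPU for smaller payloads
props.put("batch.size", 65536);         // 64 KB batches (default 16 KB)
props.put("linger.ms", 100);            // wait up to 100 ms for a batch to fill
KafkaProducer<String, String> producer = new KafkaProducer<>(props);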

13.4 How to handle exceptions

1) LeaderNotAvailableException:

This happens when a machine goes down and its leader replica is unavailable; writes fail until a follower replica is promoted to leader, so you can retry the send. Similarly, restarting a Kafka broker process triggers leader switches, which will certainly cause LeaderNotAvailableException write errors.

2) NotControllerException:

Same idea: if the broker hosting the Controller is down, you must wait for a Controller to be re-elected; again, simply retry.

3) NetworkException: a network exception or timeout

  • Configure the retries parameter and the producer retries automatically
  • If it still fails after several retries, the exception is handed to us to deal with: after catching it, we process the message separately through a backup path, writing unsuccessful messages to Redis or to the file system, or even discarding them

13.5 Retry mechanism

Retries can cause some problems:

Message duplication

Some problems, like leader switching, need retries, which is just a matter of setting retries; but retrying can duplicate messages, for example when network jitter makes the producer believe a send failed, so it retries even though the first send actually succeeded.

Message reordering: retries can also reorder messages, because messages queued behind a failed one may already have gone out. To prevent this, set the max.in.flight.requests.per.connection parameter to 1, which guarantees the producer sends only one request at a time.

The default interval between two retries is 100 milliseconds, set via retry.backoff.ms. In development, the retry mechanism covers about 95% of exception cases. A config sketch follows below.
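A hedged fragment with the retry-related settings discussed above, extending the producer properties from the sketch earlier:

props.put("retries", 3);                               // retry failed sends a few times
props.put("retry.backoff.ms", 100);                    // wait 100 ms between retries
props.put("max.in.flight.requests.per.connection", 1); // avoid reordering when retrying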

13.6 Detailed explanation of ACK parameters

On the producer side:

request.required.acks=0
  • As long as the request is sent out, the producer does not care whether the write succeeded.
  • Performance is excellent. If you are analyzing logs and can tolerate data loss, this setting gives very good performance.

request.required.acks=1
  • A message counts as written once the leader partition has written it.
  • This can still lose data.

request.required.acks=-1
  • A message counts as written only after all replicas in the ISR list have written it.
  • Caution: the ISR list may hold as few as 1 replica (just the leader partition; normally 1 leader partition plus followers).

On the Kafka server side:

min.insync.replicas: 1

If unset, the default is 1. Each leader partition maintains its ISR list, and this value constrains that list: for example, with min.insync.replicas=2, an insert into the partition fails with an error whenever the ISR list holds only one replica.

Designing a setup that does not lose data (see the sketch below):

  • Partition replicas >= 2
  • acks = -1
  • min.insync.replicas >= 2

Sending may still throw an exception: handle the exception as described above.
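A hedged sketch of this no-loss combination; the topic name is made up, and min.insync.replicas is a topic- or broker-level setting:

// Producer side: require all ISR replicas to acknowledge
props.put("acks", "all");   // "all" is equivalent to -1
props.put("retries", 3);

// Topic side: create the topic with >= 2 replicas and min.insync.replicas >= 2, e.g.:
// kafka-topics.sh --create --zookeeper hadoop1:2181 --replication-factor 2 \
//   --partitions 3 --topic orders --config min.insync.replicas=2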

13.7 Custom partitioner

Partitioning:

  • No key set

Messages are sent to the partitions in round-robin fashion.

  • Key set

Kafka's built-in partitioner computes a hash from the key, and the hash maps to a specific partition.

If the keys are the same, the hashes are the same, so values with the same key are always sent to the same partition.

But in some special cases we need a custom partitioner, for example to route "hot data" to a dedicated partition:

import org.apache.kafka.clients.producer.Partitioner;
import org.apache.kafka.common.Cluster;
import org.apache.kafka.common.PartitionInfo;

import java.util.List;
import java.util.Map;
import java.util.Random;

public class HotDataPartitioner implements Partitioner {

    private Random random;

    @Override
    public void configure(Map<String, ?> configs) {
        random = new Random();
    }

    @Override
    public int partition(String topic, Object keyObj, byte[] keyBytes, Object value, byte[] valueBytes, Cluster cluster) {
        String key = (String) keyObj;
        List<PartitionInfo> partitionInfoList = cluster.availablePartitionsForTopic(topic);
        // number of partitions, e.g. 0, 1, 2
        int partitionCount = partitionInfoList.size();
        // reserve the last partition for hot data
        int hotDataPartition = partitionCount - 1;
        // non-hot keys are spread randomly over partitions 0..n-2
        return !key.contains("hot_data") ? random.nextInt(partitionCount - 1) : hotDataPartition;
    }

    @Override
    public void close() {
    }
}

How to use it:

Just configure this class: props.put("partitioner.class", "com.zhss.HotDataPartitioner");

13.8 Comprehensive Case Demonstration

Requirement analysis:

E-commerce background: a second-hand e-commerce platform ("Happy Send").

After a user buys an item, they receive "stars", and stars can be exchanged for items: one yuan spent, one star.

The order system (message producer) sends a message (pay order, cancel order) -> Kafka <- the membership system consumes data from Kafka, finds the amount the user spent, and updates the user's star count.

Analysis:

When sending a message, you can either specify a key or not.

1) If no key is specified:

  • zhangsan -> places an order -> 100 -> +100 stars
  • zhangsan -> cancels the order -> -100 -> -100 stars
  • The membership system might consume the cancel-order message first.

2) If a key is specified: key -> hash (number) -> corresponding partition number -> sent to that partition.

  • Same key -> the data is guaranteed to go to the same partition (ordered)

So this project should specify a key, using the user's id as the key.

14. Kafka consumer

14.1 Consumer group concept

Consumers with the same group id belong to the same consumer group.

1) Each consumer must belong to a consumer group. A partition of a topic is assigned to only one consumer within a consumer group; a consumer may be assigned several partitions, or none at all.

2) To achieve a broadcast effect, just consume with different group ids.

topicA:

  • partition0, partition1

groupA:

  • consumer1: consumes partition0
  • consumer2: consumes partition1
  • consumer3: cannot consume any data

groupB:

  • consumer4: consumes both partition0 and partition1

3) If a consumer in the group dies, its partitions are automatically handed over to the other consumers; if it comes back, some partitions are handed back to it.

14.2 Base Case Demonstration
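The original demo is not reproduced here; as a stand-in, a minimal consumer sketch (broker addresses, group id, and topic name are made-up examples):

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class ConsumerDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "hadoop1:9092,hadoop2:9092,hadoop3:9092"); // hypothetical
        props.put("group.id", "star_member_group");                                // hypothetical group
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("test6"));
            while (true) {
                // poll pulls a batch of records; offsets are auto-committed by default
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(1000));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("partition=%d, offset=%d, key=%s, value=%s%n",
                            record.partition(), record.offset(), record.key(), record.value());
                }
            }
        }
    }
}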

14.3 Offset Management

Each consumer keeps, in memory, the consumption offset for each partition of each topic, and commits the offset periodically. Old versions wrote it to ZK, but sending such high-concurrency requests to ZK is an unreasonable design: ZK does distributed coordination and lightweight metadata storage, and cannot serve as a data store under high-concurrency reads and writes.

New versions commit the offset to an internal Kafka topic: __consumer_offsets. The commit key is group.id+topic+partition number, and the value is the current offset. Periodically, Kafka compacts (merges) this topic internally, keeping only the latest record for each group.id+topic+partition key.

__consumer_offsets may receive highly concurrent requests, so it defaults to 50 partitions (50 leader partitions spread over the Kafka cluster); if you deploy a large cluster, say 50 machines, all 50 can share the offset-commit pressure.

  • The consumer reads data from the broker side
  • Messages go to disk, and offsets increase sequentially
  • Where should consumption resume? -> from the committed offset
  • Each consumer tracks its own offset (a manual-commit sketch follows below)
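For tighter control than the periodic auto-commit, offsets can be committed manually after processing. A hedged fragment extending the consumer demo above; process() is a hypothetical business handler:

props.put("enable.auto.commit", "false");  // turn off periodic auto-commit

while (true) {
    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(1000));
    for (ConsumerRecord<String, String> record : records) {
        process(record);                   // hypothetical business handler
    }
    consumer.commitSync();                 // commit to __consumer_offsets only after processing
}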

14.4 Introduction of Offset Monitoring Tool

A web-based management tool (Kafka Manager):

  • Modify the bin/kafka-run-class.sh script: add JMX_PORT=9988 as the first line
  • Restart the Kafka process

Another tool mainly monitors consumer offsets: KafkaOffsetMonitor.

It is a jar package, started like this:

java -cp KafkaOffsetMonitor-assembly-0.3.0-SNAPSHOT.jar \
  com.quantifind.kafka.offsetapp.OffsetGetterWeb \
  --offsetStorage kafka \
  --zk hadoop1:2181 \
  --port 9004 \
  --refresh 15.seconds \
  --retain 2.days

(For --offsetStorage, fill in kafka if offsets are stored in Kafka, zookeeper if they are stored in ZooKeeper, depending on the version.)

Suppose we wrote a program that consumes data from Kafka (a consumer processing data with business code). How do we judge whether it really consumes in real time? By monitoring its lag: if it falls hundreds of millions of records behind, or crosses a threshold (say, alert at 200,000), an alarm is raised.

14.5 Detecting consumer failures

heartbeat.interval.ms

  • The consumer's heartbeat interval. A consumer must keep a heartbeat with the coordinator so the coordinator knows whether the consumer has failed.
  • If one has failed, the coordinator uses the heartbeat responses to tell the other consumers to perform a rebalance.

session.timeout.ms

  • How long Kafka waits without hearing from a consumer before considering it failed; the default is 10 seconds.

max.poll.interval.ms

  • If the time between two poll calls exceeds this value, the consumer is considered too weak to keep up; it is kicked out of the consumer group, and its partitions are given to other consumers. Set it according to the performance of your business processing.

14.6 Explanation of core parameters

fetch.max.bytes

The maximum bytes fetched in one request; it is generally recommended to set it larger; the default is 1 MB. We have met similar parameters in several places; what is the maximum size of a single message?

  1. Producer side: the maximum size of a message that can be sent -> e.g. 10 MB
  2. Broker side: the maximum message size the broker accepts and stores -> e.g. 10 MB
  3. Consumer side: fetch.max.bytes must be at least as large -> e.g. 10 MB

max.poll.records:

The maximum number of records returned by one poll; the default is 500.

connections.max.idle.ms

If a socket connection between consumer and broker stays idle longer than this, the connection is reclaimed, and the next fetch must re-establish it. The suggestion is to set it to -1: do not reclaim connections.

enable.auto.commit:

Enables auto-commit of offsets.

auto.commit.interval.ms:

How often to commit offsets; the default value is 5000 milliseconds.

auto.offset.reset

  • earliest: when a partition has a committed offset, consume from it; when there is none, consume from the beginning
  • latest: when a partition has a committed offset, consume from it; when there is none, consume only data newly produced to the partition
  • none: when every partition of the topic has a committed offset, consume from after those offsets; if any partition lacks a committed offset, throw an exception
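Collected in one place, a hedged consumer configuration sketch covering the parameters from 14.5 and 14.6; the values follow the defaults discussed above, except where noted:

Properties props = new Properties();
// ...bootstrap.servers, group.id and deserializers as in the consumer demo above...
props.put("heartbeat.interval.ms", 3000);   // heartbeat to the group coordinator
props.put("session.timeout.ms", 10000);     // declared dead after 10 s of silence
props.put("max.poll.interval.ms", 300000);  // max allowed gap between two poll() calls
props.put("fetch.max.bytes", 10485760);     // allow fetches up to 10 MB
props.put("max.poll.records", 500);         // records returned per poll
props.put("connections.max.idle.ms", -1);   // never reclaim idle connections
props.put("enable.auto.commit", "true");    // commit offsets automatically
props.put("auto.commit.interval.ms", 5000); // every 5 s
props.put("auto.offset.reset", "latest");   // where to start with no committed offset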

14.7 Comprehensive Case Demonstration

Running case: the second-hand e-commerce platform (Happy Send) accumulates stars for users according to the amount they spend.

  • Order system (producer) -> sends messages to the Kafka cluster.
  • Membership system (consumer) -> consumes messages from the Kafka cluster and processes them.

14.8 Principle of group coordinator

Interview question: how do consumers implement rebalance? Via the coordinator.

What is a coordinator?

Each consumer group chooses a broker as its coordinator, which monitors the heartbeats of all consumers in the group, judges whether any are down, and then triggers a rebalance.

How the coordinator machine is chosen:

First hash the groupId (to a number), then take it modulo the number of partitions of __consumer_offsets (default 50, configurable via offsets.topic.num.partitions). The broker hosting the resulting partition is the coordinator machine, and that partition is also the one to which the group's consumers commit their offsets.

For example: groupId "myconsumer_group" -> hash value (a number) -> modulo 50 -> say 8; whichever broker hosts partition 8 of the __consumer_offsets topic is the coordinator, and it knows all the consumers in this consumer group.
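A sketch of the modulo computation; Kafka's internal hashing differs slightly in detail, so this is only illustrative:

int offsetsTopicPartitions = 50;             // offsets.topic.num.partitions
String groupId = "myconsumer_group";
int partition = Math.abs(groupId.hashCode()) % offsetsTopicPartitions;
// the leader broker of that __consumer_offsets partition acts as this group's coordinator
System.out.println("coordinator partition: " + partition);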

Run process:

  • Each consumer sends a JoinGroup request to the coordinator;
  • the coordinator picks one consumer in the group as leader;
  • it sends the group membership to that leader;
  • the leader draws up the consumption plan (partition assignment);
  • the leader sends the plan to the coordinator via SyncGroup;
  • the coordinator forwards the plan to each consumer, and each consumer opens a socket connection to the leader broker of its assigned partitions and starts consuming messages.

14.9 Rebalance strategies

The consumer group implements rebalance through the coordinator.

There are three rebalance strategies here: range, round-robin, sticky

For example, a topic we consume has 12 partitions:

p0,p1,p2,p3,p4,p5,p6,p7,p8,p9,p10,p11

Suppose there are three consumers in our consumer group

range strategy

  • The range strategy assigns by partition number ranges
  • p0~3 -> consumer1
  • p4~7 -> consumer2
  • p8~11 -> consumer3
  • This is the default strategy

round-robin strategy

  • Round-robin allocation
  • consumer1: 0, 3, 6, 9
  • consumer2: 1, 4, 7, 10
  • consumer3: 2, 5, 8, 11

But the two schemes above share a problem: suppose a consumer dies, leaving 2 consumers for the 12 partitions, each consuming 6.

Say consumer1 is down: p0-5 are assigned to consumer2 and p6-11 to consumer3. Then p6 and p7, which were originally on consumer2, get moved to consumer3 even though consumer2 never failed.

sticky strategy

The newest sticky strategy ensures, as far as possible, that during a rebalance the partitions that originally belonged to a consumer stay with it, and only the orphaned partitions are spread evenly, preserving the previous assignment as much as possible.

  • consumer1: 0-3
  • consumer2: 4-7
  • consumer3: 8-11

Suppose consumer3 hangs up:

  • consumer1: 0-3, plus 8, 9
  • consumer2: 4-7, plus 10, 11

15. Broker management

15.1 Meaning of LEO and HW

Recap of what we have covered, before diving into broker internals:

  • Kafka's core principles
  • How to evaluate cluster resources
  • Building a Kafka cluster -> some simple operation and maintenance operations
  • Producer (usage, core parameters)
  • Consumer (principles, usage, core parameters)
  • Now, some internals of the broker; core concepts: LEO, HW

LEO: related to the message offset.

LEO:

In Kafka, both the leader partition and the follower partitions are called replicas.

Every time a replica receives a message, it updates its own LEO (log end offset); the LEO is in fact the latest offset + 1.

HW: high water mark

An important function of LEO is updating the HW: when the follower's LEO and the leader's LEO are in sync, the HW can be advanced.

Data before the HW is visible to consumers and is in the committed state; messages after the HW cannot be consumed.

15.2 LEO updates

15.3 HW updates

15.4 How the controller manages the entire cluster

1: Controller election:

  • via the ZooKeeper node /controller/id

2: Directories the controller service watches:

  • /broker/ids/ to detect brokers going online and offline
  • /broker/topics/ for topic creation (the create-topic command we used earlier took parameters and the ZK address)
  • /admin/reassign_partitions for partition reassignment

15.5 Delayed tasks

Kafka's delayed scheduling mechanism (extended knowledge)

Let's first look at where Kafka needs delayed tasks.

The first kind of delayed task:

If the producer's acks=-1, the response can return only after both the leader and the followers finish writing.

There is a timeout, default 30 seconds (request.timeout.ms).

So after a piece of data is written to the leader's disk, a delayed task with a 30-second timeout is created and placed into the DelayedOperationPurgatory (delay manager).

If all followers replicate to their local disks before the 30 seconds are up, the task is woken up automatically and the response is returned to the client; otherwise the delayed task expires at its 30-second limit and a timeout exception is returned.

The second kind of delayed task:

When a follower pulls messages from the leader and finds nothing new, a delayed pull task is created.

When the delay (say, 100 ms) is up, an empty response is returned to the follower, which then sends another fetch request; but if the leader receives a write during the delay (before the 100 ms is up), the task is woken up automatically and the pull executes immediately.

So a large number of delayed tasks need to be scheduled.

15.6 Time Wheel Mechanism

1. Why design a time wheel?

Kafka has many internal delayed tasks. They are not implemented on the JDK Timer, whose task insertion and deletion cost O(log n); they are built on a hand-written time wheel, where inserting and deleting delayed tasks is O(1). The delayed-task machinery relies on this time wheel mechanism.

2. What is the time wheel?

The time wheel is essentially an array.

  • tickMs: the tick interval of the wheel, e.g. 1 ms

  • wheelSize: the size of the wheel, e.g. 20

  • interval: tickMs * wheelSize, the total time span of one wheel, here 20 ms

  • currentTime: a pointer to the current time

    • a: Because the time wheel is an array, fetching a bucket by index is O(1)
    • b: Each array slot stores its tasks in a doubly linked list, and inserting into or deleting from a doubly linked list is also O(1)

3. Multi-level time wheels

For example, to insert a task that should run after 110 ms:

  • tickMs: time wheel interval, 20 ms

  • wheelSize: time wheel size, 20

  • interval: tickMs * wheelSize, the total span of one wheel, here 400 ms

  • currentTime: a pointer to the current time

    • First-level time wheel: 1 ms * 20
    • Second-level time wheel: 20 ms * 20
    • Third-level time wheel: 400 ms * 20

A 110 ms task does not fit in the first-level wheel (span 20 ms), so it is parked in the second-level wheel's bucket covering 100-120 ms; as time advances, it is demoted back into the first-level wheel for precise expiry. A minimal sketch follows below.
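To make the O(1) insertion concrete, here is a minimal single-level time wheel sketch. Kafka's real implementation adds overflow (higher-level) wheels for longer delays; the names and structure here are illustrative, not Kafka's actual code:

import java.util.LinkedList;

public class SimpleTimeWheel {
    private final long tickMs;                      // span of one bucket
    private final int wheelSize;                    // number of buckets
    private final LinkedList<Runnable>[] buckets;   // linked task lists per bucket
    private long currentTick = 0;                   // pointer to the current time

    @SuppressWarnings("unchecked")
    public SimpleTimeWheel(long tickMs, int wheelSize) {
        this.tickMs = tickMs;
        this.wheelSize = wheelSize;
        this.buckets = new LinkedList[wheelSize];
        for (int i = 0; i < wheelSize; i++) buckets[i] = new LinkedList<>();
    }

    // O(1): index into the array and append to the bucket's linked list
    public void add(Runnable task, long delayMs) {
        long ticks = delayMs / tickMs;
        if (ticks >= wheelSize) throw new IllegalArgumentException("needs an overflow wheel");
        buckets[(int) ((currentTick + ticks) % wheelSize)].add(task);
    }

    // Called once per tick: advance the pointer and run the expired bucket
    public void advance() {
        currentTick++;
        LinkedList<Runnable> expired = buckets[(int) (currentTick % wheelSize)];
        while (!expired.isEmpty()) expired.poll().run();
    }
}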

Author: erainm
Source: blog.csdn.net/eraining/article/details/115860664

