[Kafka] Usage analysis of the Kafka script kafka-configs.sh


1. Overview

This article is adapted from a blog by Li Zhitao: https://www.cnblogs.com/lizherui/p/12275193.html

2. Introduction

There are many articles on the Internet about the usage of the kafka-configs.sh script, but they tend to be unsystematic and incomplete. Key content is missing, which makes them look approachable yet hard to apply in practice: for example, the internal relationships of dynamic configuration are left unclear, and some key parameters of leader-follower replication quota throttling are not explained at all unless you read the source code. I hope that by reading this article in depth, readers can use this script to solve problems encountered in day-to-day operations and development more conveniently, and save some learning time along the way.

3. Script syntax analysis

kafka-configs.sh parameter analysis

[Figure: kafka-configs.sh parameter overview]

4. Syntax format

4.1 Add configuration items

Configuration object for a single topic

bin/kafka-configs.sh --zookeeper localhost:2181/kafkacluster --alter --entity-type topics --entity-name topicName  --add-config 'k1=v1, k2=v2, k3=v3' 

Default configuration object for all clientIds

bin/kafka-configs.sh --zookeeper localhost:2181/kafkacluster --alter --entity-type clients --entity-default --add-config 'k1=v1, k2=v2, k3=v3' 

Example

bin/kafka-configs.sh --zookeeper localhost:2181/kafkacluster --alter --entity-type topics --entity-name topicName  --add-config 'max.message.bytes=50000000, flush.messages=50000, flush.ms=5000'

bin/kafka-configs.sh --zookeeper localhost:2181/kafkacluster --alter --entity-type topics --entity-name topicName  --add-config 'max.message.bytes=50000000' --add-config 'flush.messages=50000'

4.2 Delete configuration items

bin/kafka-configs.sh --zookeeper localhost:2181/kafkacluster --alter --entity-type topics --entity-name topicName --delete-config 'k1,k2,k3'

bin/kafka-configs.sh --zookeeper localhost:2181/kafkacluster --alter --entity-type clients --entity-name clientId --delete-config 'k1,k2,k3'

bin/kafka-configs.sh --bootstrap-server localhost:9092 --alter --entity-type brokers --entity-name $brokerId --delete-config 'k1,k2,k3'

bin/kafka-configs.sh --bootstrap-server localhost:9092 --alter --entity-type brokers --entity-default --delete-config 'k1,k2,k3'

Example

bin/kafka-configs.sh --zookeeper localhost:2181/kafkacluster --alter --entity-type topics --entity-name test-cqy --delete-config 'segment.bytes'

4.3 Modify configuration items

Modifying a configuration item uses the same syntax as adding one; an existing key is simply overwritten with the new value on the server side.
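
For example (a hypothetical run, assuming max.message.bytes was previously set to 50000000 on topicName), re-issuing --add-config with a new value overwrites the old one in place:

bin/kafka-configs.sh --zookeeper localhost:2181/kafkacluster --alter --entity-type topics --entity-name topicName --add-config 'max.message.bytes=60000000'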

Describe entity configurations

bin/kafka-configs.sh --zookeeper localhost:2181/kafkacluster --entity-type topics --entity-name topicName --describe

bin/kafka-configs.sh --bootstrap-server localhost:9092 --entity-type brokers --entity-name $brokerId --describe

bin/kafka-configs.sh --bootstrap-server localhost:9092 --entity-type brokers --entity-default --describe

bin/kafka-configs.sh  --zookeeper localhost:2181/kafkacluster --entity-type users --entity-name user1 --entity-type clients --entity-name clientA --describe

The remaining entity types follow the same pattern and are not listed one by one.

5. Configuration Management Usage

5.1 Client quota throttling

Kafka supports quota management, which can throttle the produce and fetch traffic of Producers and Consumers to prevent an individual workload from overwhelming the brokers. This section mainly introduces how to use Kafka's quota management feature.

Introduction to quota throttling

Kafka quota throttling can be configured at three granularities:

users + clients
users
clients

All three are ways of identifying the client that accesses the cluster. The clientId is an identity tag for each client connecting to the Kafka cluster and must be carried in every ProduceRequest and FetchRequest; users exist only in Kafka clusters with authentication enabled. The default clientId values for producer and consumer are, respectively:

producer: producer-<auto-increment number>; consumer: derived from its groupid
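
Because the defaults are hard to predict, it is common to pin client.id explicitly so that quotas can target it. A minimal sketch using the built-in console tools (clientA is an assumed name, not from the original article):

bin/kafka-console-producer.sh --broker-list localhost:9092 --topic topicName --producer-property client.id=clientA

bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic topicName --consumer-property client.id=clientA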

Configuration priority

The above three granularities combine into 8 configuration objects. When the same configuration item is set at different scopes, a higher-priority setting overrides a lower-priority one. The priority order is as follows (highest first):

  1. /config/users/<user>/clients/<client-id>
  2. /config/users/<user>/clients/<default>
  3. /config/users/<user>
  4. /config/users/<default>/clients/<client-id>
  5. /config/users/<default>/clients/<default>
  6. /config/users/<default>
  7. /config/clients/<client-id>
  8. /config/clients/<default>

List of configuration items

  1. producer_byte_rate: upper bound on produce throughput, in bytes/sec
  2. consumer_byte_rate: upper bound on fetch throughput, in bytes/sec
  3. request_percentage: upper bound on request-handler thread utilization, in percent

Configuration examples

1. Configure users + clients

bin/kafka-configs.sh  --zookeeper localhost:2181/kafkacluster --alter  --entity-type users --entity-name user1 --entity-type clients --entity-name clientA --add-config 'producer_byte_rate=20971520,consumer_byte_rate=20971520' 

bin/kafka-configs.sh  --zookeeper localhost:2181/kafkacluster --alter  --entity-type users --entity-name user1 --entity-type clients --entity-default --add-config 'producer_byte_rate=20971520,consumer_byte_rate=20971520' 

bin/kafka-configs.sh  --zookeeper localhost:2181/kafkacluster --alter  --entity-type users --entity-default --entity-type clients --entity-default --add-config 'producer_byte_rate=20971520,consumer_byte_rate=20971520' 

2. Configure users

Set the default quota for users in the broker; every user without a more specific override is limited to a maximum produce & fetch rate of 20MB/sec

bin/kafka-configs.sh  --zookeeper localhost:2181/kafkacluster --entity-type users --entity-default --alter --add-config 'producer_byte_rate=20971520,consumer_byte_rate=20971520'

The maximum produce & fetch rate for userA within the broker is 20MB/sec

bin/kafka-configs.sh  --zookeeper localhost:2181/kafkacluster --entity-type users --entity-name userA --alter --add-config 'producer_byte_rate=20971520,consumer_byte_rate=20971520'

3. Configure clients

Set the default quota for clientIds in the broker; every clientId without a specific override is limited to a maximum produce rate of 20MB/sec

bin/kafka-configs.sh --zookeeper localhost:2181/kafkacluster --alter --entity-type clients --entity-default  --add-config 'producer_byte_rate=20971520'

The maximum produce rate for clientA within the broker is 20MB/sec

bin/kafka-configs.sh --zookeeper localhost:2181/kafkacluster --alter --entity-type clients  --entity-name clientA  --add-config 'producer_byte_rate=20971520'
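
To confirm that a quota took effect, the same script can describe the entity (assuming the clientA configuration above was applied):

bin/kafka-configs.sh --zookeeper localhost:2181/kafkacluster --describe --entity-type clients --entity-name clientA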

Problems when the quota is exceeded

How will Kafka handle a Producer or Consumer that exceeds its traffic quota?

  1. For the Producer: if the Producer exceeds the quota, Kafka first appends the data to the log file, then calculates the delay and responds to the Producer only after the ThrottleTime has elapsed. There is no feedback mechanism to the client, so if the throttle delay exceeds the producer's request timeout, the producer re-sends the timed-out write and the messages are written in duplicate (a mitigation sketch follows this list).
  2. For the Consumer: if the Consumer exceeds the quota, Kafka first calculates the delay, waits for the ThrottleTime, and only then reads the data from the log and responds to the Consumer. If the consumer's RequestTimeout < ThrottleTime, the consumer keeps re-sending fetch requests during the ThrottleTime window, and Kafka accumulates a large number of invalid requests that occupy resources.
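
A practical mitigation (a sketch, not from the original article) is to make the client's request timeout comfortably larger than the worst expected ThrottleTime, so throttled requests are not retried as duplicates; request.timeout.ms is the standard client setting for this:

bin/kafka-console-producer.sh --broker-list localhost:9092 --topic topicName --producer-property client.id=clientA --producer-property request.timeout.ms=120000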

5.2 Brokers type configuration


Broker configuration is complicated and has many configuration items. Internally, Kafka divides dynamic broker configuration into 7 modules, as shown in the following table:

[Table: the seven dynamic broker configuration modules and their configuration items]

The brokers type does not support every configuration item dynamically. For example, broker upgrade protocol settings and the group, zookeeper, built-in transaction, controlled-shutdown, and built-in offsets settings cannot be changed dynamically. Broker configs can only be altered via --bootstrap-server; the --zookeeper option is not supported.

5.3 Add configuration items

bin/kafka-configs.sh --bootstrap-server localhost:9092 --alter --entity-type brokers --entity-default  --add-config 'max.connections.per.ip=200,max.connections.per.ip.overrides=[ip1:100,ip2:120]' 

bin/kafka-configs.sh --bootstrap-server localhost:9092 --alter --entity-type brokers --entity-name $brokerId --add-config 'max.connections.per.ip=200,max.connections.per.ip.overrides=[ip1:100]' 

5.4 Delete configuration items

bin/kafka-configs.sh --bootstrap-server localhost:9092 --alter --entity-type brokers --entity-default  --delete-config  'max.connections.per.ip,max.connections.per.ip.overrides' 

bin/kafka-configs.sh --bootstrap-server localhost:9092 --alter --entity-type brokers --entity-name  $brokerId --delete-config  'max.connections.per.ip,max.connections.per.ip.overrides' 

5.5 Describe configurations

bin/kafka-configs.sh --bootstrap-server localhost:9092 --entity-type brokers --entity-default --describe

bin/kafka-configs.sh --bootstrap-server localhost:9092 --entity-type brokers --entity-name $brokerId --describe

5.6 Topics type configuration

The Topics type configuration is a subset of the Brokers type configuration: the Brokers type contains all configurations of the Topics type, with the broker-side key adding only a prefix to the topic-side configuration name. One special-case difference is that message.format.version is temporarily not supported in dynamic broker configuration.
[Table: topic-level configuration items and their broker-level counterparts]
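
For example (assuming a version where log.segment.bytes is dynamically updatable), the topic-level segment.bytes corresponds to the broker-level log.segment.bytes, which can be set cluster-wide:

bin/kafka-configs.sh --bootstrap-server localhost:9092 --alter --entity-type brokers --entity-default --add-config 'log.segment.bytes=1073741824'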

Add configuration items

bin/kafka-configs.sh --zookeeper localhost:2181/kafkacluster --alter --entity-type topics --entity-name test-cqy --add-config 'max.message.bytes=50000000,flush.messages=50000,flush.ms=5000'

Delete configuration item

bin/kafka-configs.sh --zookeeper localhost:2181/kafkacluster --alter --entity-type topics --entity-name test-cqy --delete-config 'max.message.bytes,flush.messages,flush.ms'

List configuration description

bin/kafka-configs.sh --zookeeper localhost:2181/kafkacluster --entity-type topics --entity-name test-cqy --describe

6. Broker quota throttling

Replication quotas between brokers

[Figure: inter-broker replication throttling]

Kafka provides a throttling feature for replication traffic between brokers, capping the bandwidth used to copy partition data from one broker node to another. It is useful when rebalancing the cluster, bringing newly added brokers up to speed, or decommissioning old ones. The configuration notes are as follows:

  1. The topic is the carrier of the replication throttle (DynamicReplicationQuota); the throttle takes effect only when it is applied to a specific topic.
  2. Both leader|follower.replication.throttled.replicas and leader|follower.replication.throttled.rate must be configured for the throttle to take effect.
  3. The throttle applies only to the replicas listed in xxx.throttled.replicas; other topics are not throttled.

Kafka's replication throttle is quite flexible: the two parameters, rate and replicas, act as preconditions that ensure the limit applies only to the configured topics. For example, when expanding a cluster by adding brokers, the IO pressure of migrating the designated topics needs to be bounded; throttling all topics instead would affect normally running business traffic.

Syntax for setting xxx.throttled.rate (only a specific brokerId can be set; setting --entity-default has no effect)

bin/kafka-configs.sh --zookeeper localhost:2181/kafkacluster --alter --entity-type brokers --entity-name $brokerId  --add-config  'leader.replication.throttled.rate=10485760' 

Two ways to configure the throttle

Way 1

bin/kafka-configs.sh --bootstrap-server localhost:9092 --entity-type brokers --entity-name 105 --alter --add-config 'leader.replication.throttled.rate=10485760,follower.replication.throttled.rate=10485760'

bin/kafka-configs.sh --zookeeper localhost:2181/kafkacluster  --entity-type topics --entity-name topicA --alter --add-config 'leader.replication.throttled.replicas=*,follower.replication.throttled.replicas=*'
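
Besides the * wildcard, throttled.replicas also accepts an explicit partitionId:brokerId list, which narrows the throttle to specific replicas (the partition and broker ids below are illustrative assumptions):

bin/kafka-configs.sh --zookeeper localhost:2181/kafkacluster --entity-type topics --entity-name topicA --alter --add-config 'leader.replication.throttled.replicas=[0:105,1:105],follower.replication.throttled.replicas=[0:106,1:106]'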

Way 2

Use the reassignment script to set leader & follower.replication.throttled.rate at the same time as the partition migration, so the throttle is in place from the start and the IO and network cards are not saturated. The --throttle below effectively generates leader & follower.replication.throttled.rate=31457280 on every broker covered by move.json, configured one by one; under the hood, the reassign tool calls the same configuration API as kafka-configs.sh.

bin/kafka-reassign-partitions.sh --zookeeper localhost:2181/kafkacluster --reassignment-json-file move.json --throttle 31457280 --execute

When the partition data migration above has completed, execute the following script to delete the --throttle configuration

bin/kafka-reassign-partitions.sh --zookeeper localhost:2181/kafkacluster --reassignment-json-file move.json --verify
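
If --verify is skipped and a throttle is left behind, the same configs can be removed by hand with kafka-configs.sh (a sketch; the broker id and topic name are assumptions):

bin/kafka-configs.sh --bootstrap-server localhost:9092 --alter --entity-type brokers --entity-name 105 --delete-config 'leader.replication.throttled.rate,follower.replication.throttled.rate'

bin/kafka-configs.sh --zookeeper localhost:2181/kafkacluster --alter --entity-type topics --entity-name topicA --delete-config 'leader.replication.throttled.replicas,follower.replication.throttled.replicas'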

The throttle acts on both the FetchRequest and the FetchResponse:

  1. The FetchRequest, which the follower sends to the leader to replicate data, is governed by follower.replication.throttled.rate. When the follower's request traffic exceeds the threshold, throttled topics are excluded from the FetchRequest; if the rate has fallen back below the threshold, the next request is sent with them included again.
  2. The FetchResponse, with which the leader answers the follower, is governed by leader.replication.throttled.rate. When the leader's response traffic exceeds the threshold, throttled topics are excluded from the FetchResponse; if the rate has fallen back below the threshold, the next response includes them again.

Quota for partition-directory data migration within a broker

Why does directory migration exist? With the rapid development of hardware, CPU performance has improved greatly, a single physical machine can mount many disks, and brokers added during cluster expansion may be different models with different mount counts and disk performance. Kafka therefore provides a throttle on data migration between log directories inside a broker, capping the bandwidth used to copy data from one directory to another. It is commonly used to balance partition counts across mount points within a broker and to reduce IO pressure.

[Figure: intra-broker log-directory migration throttling]

When migrating data between directories, the specific partitions to move are specified, and these partitions are the carriers of the throttle. The concrete steps are as follows:

For the detailed usage of the intra-broker partition migration script, see the documentation on partition directory data migration

bin/kafka-configs.sh --bootstrap-server localhost:9092 --entity-type brokers --entity-name 105 --alter --add-config 'replica.alter.log.dirs.io.max.bytes.per.second=104857600'

During directory migration, the destination replica acts as an independent FutureLocalReplica role, so it is not affected by the inter-broker replication throttle
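
The same limit can also be applied at migration time through the reassignment script's --replica-alter-log-dirs-throttle option (a sketch; move.json is assumed to contain log_dirs entries for the destination directories):

bin/kafka-reassign-partitions.sh --zookeeper localhost:2181/kafkacluster --reassignment-json-file move.json --replica-alter-log-dirs-throttle 104857600 --execute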

7. Summary

  1. The detailed syntax of kafka-configs.sh is listed at the beginning of the article for readers to consult and use.
  2. The three granularities of client quota throttling combine into 8 priority levels of configuration objects.
  3. The 7 configuration modules of the Brokers type have 2 priority levels; except for DynamicReplicationQuota, which is scoped to a single brokerId, the other modules can be applied cluster-wide.
  4. Replication throttling between brokers requires configuring two entity types in combination, namely brokers and topics.
  5. Throttling partition-directory migration within a broker additionally requires the kafka-reassign-partitions.sh script to configure the replicas and migration directories of the specific partitions.

