Kafka parameter parsing and startup parameter analysis

Kafka parameters in detail

Each Kafka broker's default server.properties configuration file must define the following properties:

broker.id=0  
num.network.threads=2  
num.io.threads=8  
socket.send.buffer.bytes=1048576  
socket.receive.buffer.bytes=1048576  
socket.request.max.bytes=104857600  
log.dirs=/tmp/kafka-logs  
num.partitions=2  
log.retention.hours=168  
  
log.segment.bytes=536870912  
log.retention.check.interval.ms=60000  
log.cleaner.enable=false  
  
zookeeper.connect=localhost:2181  

System parameters

# Uniquely identifies the broker in the cluster; must be a non-negative integer.
broker.id = 0
# Service port, default 9092
port = 9092
# Listen address
host.name = debugo01

# Maximum number of threads for handling network requests
num.network.threads = 2
# Number of threads for handling disk I/O
num.io.threads = 8
# Number of background threads
background.threads = 4
# Maximum number of requests that may queue while waiting for the I/O threads
queued.max.requests = 500

# Socket send buffer (SO_SNDBUF)
socket.send.buffer.bytes = 1048576
# Socket receive buffer (SO_RCVBUF)
socket.receive.buffer.bytes = 1048576
# Maximum number of bytes in a socket request. To prevent memory overflow, message.max.bytes must be smaller than this value
socket.request.max.bytes = 104857600

Topic parameters

# Default number of partitions for each topic; more partitions mean more segment files
num.partitions = 2
# Whether topics may be created automatically; if false, topics must be created via the command line
auto.create.topics.enable = true
# Default replication factor for a topic; cannot be larger than the number of brokers in the cluster
default.replication.factor = 1
# Maximum size of a message body, in bytes
message.max.bytes = 1000000
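
As a hedged illustration (the topic name and size are made up for the example), the per-topic counterpart of message.max.bytes can be set when a topic is created, overriding the broker default:

# Create a topic whose per-topic max.message.bytes overrides the broker-level message.max.bytes
./kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 2 --topic demo-topic --config max.message.bytes=2000000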

ZooKeeper parameters

# ZooKeeper quorum settings. Multiple hosts are separated by commas
zookeeper.connect = debugo01:2181,debugo02,debugo03
# ZooKeeper connection timeout
zookeeper.connection.timeout.ms = 1000000
# Synchronization interval between the ZooKeeper leader and followers
zookeeper.sync.time.ms = 2000
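
A small sketch (hostnames are hypothetical): zookeeper.connect also accepts an optional chroot path appended after the host list, so that several Kafka clusters can share one ZooKeeper ensemble:

# All of this cluster's metadata is kept under the /kafka znode
zookeeper.connect = zk1:2181,zk2:2181,zk3:2181/kafka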

Log parameters

# Log storage directories; multiple directories are separated by commas
log.dirs = /var/log/kafka

# Log cleanup policy (delete | compact)
log.cleanup.policy = delete
# How long to retain a log (hours | minutes); the default is 7 days (168 hours). Older data is handled according to the policy. Whichever of the bytes and time limits is reached first triggers cleanup.
log.retention.hours = 168
# Maximum number of bytes of log data to retain. Data beyond this is handled according to the policy.
# log.retention.bytes = 1073741824

# Controls the size of a log segment file; once exceeded, a new segment is rolled (-1 means no limit)
log.segment.bytes = 536870912
# When this time is reached, a new segment is forced regardless of size
log.roll.hours = 24 * 7
# How often to check whether log segments have reached the deletion policy (log.retention.hours or log.retention.bytes)
log.retention.check.interval.ms = 60000

# Whether to enable log compaction
log.cleaner.enable = false
# Maximum time to retain entries in compacted logs
log.cleaner.delete.retention.ms = 1 day

# Size limit of the index file for a log segment
log.index.size.max.bytes = 10 * 1024 * 1024
# A buffer for index computation; generally does not need to be set
log.index.interval.bytes = 4096
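
A hedged sketch of how these settings interact (the values are illustrative, not recommendations): retention is evaluated per partition at each check interval, old segments are deleted as whole files, and whichever of the time or size limit is reached first wins:

# Keep at most 3 days of data OR ~1 GB per partition, whichever limit is hit first
log.retention.hours = 72
log.retention.bytes = 1073741824
# 256 MB segments, so expired data can be dropped in reasonably small chunks
log.segment.bytes = 268435456
log.retention.check.interval.ms = 300000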

Replication parameters

# Whether to automatically rebalance leaders among brokers
auto.leader.rebalance.enable = false
# Leader imbalance ratio; if exceeded, partitions will be rebalanced
leader.imbalance.per.broker.percentage = 10
# Interval for checking whether leaders are imbalanced
leader.imbalance.check.interval.seconds = 300
# Maximum space clients may use to retain offset metadata
offset.metadata.max.bytes = 1024

Consumer parameters

# The core consumer settings are group.id and zookeeper.connect
# Uniquely identifies the consumer group a consumer belongs to; setting the same group ID indicates that multiple processes all belong to the same consumer group.
group.id
# ID of the consumer; if not set, one is generated automatically
consumer.id
# An ID used for tracking; preferably the same as group.id
client.id = <group_id>

# Socket timeout; the effective timeout is socket.timeout.ms + max.fetch.wait.
socket.timeout.ms = 30 * 1000
# Socket receive buffer size
socket.receive.buffer.bytes = 64 * 1024
# Limits the size of messages fetched from each partition
fetch.message.max.bytes = 1024 * 1024

# When true, the consumer syncs offsets to ZooKeeper after consuming messages; if the consumer fails, a new consumer can pick up the latest offset from ZooKeeper
auto.commit.enable = true
# Interval at which offsets are committed automatically
auto.commit.interval.ms = 60 * 1000

# Maximum number of message chunks buffered for consumption; each chunk can be up to fetch.message.max.bytes
queued.max.message.chunks = 10

# When a new consumer joins the group, a rebalance is attempted to migrate partitions to the new consumer; this sets the number of attempts
rebalance.max.retries = 4
# Interval between rebalance attempts
rebalance.backoff.ms = 2000
# Backoff time before refreshing the leader after each leader re-election
refresh.leader.backoff.ms

# Minimum amount of data the server sends to the consumer; if not satisfied, the server waits until the specified size is reached. The default of 1 means respond immediately.
fetch.min.bytes = 1
# Maximum time the server keeps the consumer request waiting when fetch.min.bytes is not satisfied
fetch.wait.max.ms = 100
# If no new message is available for consumption within the specified time, throw an exception; the default of -1 means wait indefinitely
consumer.timeout.ms = -1
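
A minimal consumer.properties sketch pulling the core settings together (the host and group names are hypothetical, the values purely illustrative):

# consumer.properties - minimal configuration for the old ZooKeeper-based consumer
zookeeper.connect=debugo01:2181
group.id=demo-group
auto.commit.enable=true
auto.commit.interval.ms=60000
fetch.message.max.bytes=1048576

Depending on the version, it can be passed to the console consumer with --consumer.config consumer.properties.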

Producer parameters

# Address the producer uses to fetch message metadata (topics, partitions and replicas); format: host1:port1,host2:port2. A VIP in front of the brokers can also be used
metadata.broker.list

# Message acknowledgment mode
# 0: does not guarantee that messages arrive; just send. Lowest latency, but messages may be lost if a server fails, a bit like TCP
# 1: send the message and wait for the leader to acknowledge it; some reliability
# -1: after sending the message, wait for the leader to acknowledge both receipt and replication before returning; maximum reliability
request.required.acks = 0

# Maximum time to buffer data in async mode. For example, a setting of 100 batches messages within 100 ms; this increases throughput but adds latency to message delivery
queue.buffering.max.ms = 5000
# Maximum number of messages buffered in async mode, as above
queue.buffering.max.messages = 10000
# In async mode, how long a message waits to enter the queue. If set to 0, messages do not wait; if they cannot enter the queue they are dropped immediately
queue.enqueue.timeout.ms = -1
# In async mode, the number of messages sent per batch; a send is also triggered when either queue.buffering.max.messages or queue.buffering.max.ms is reached
batch.num.messages = 200
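
A minimal producer.properties sketch for the old producer (the broker host is hypothetical, the values purely illustrative), showing the async settings described above:

# producer.properties - minimal configuration for the old producer
metadata.broker.list=debugo01:9092
producer.type=async
request.required.acks=1
queue.buffering.max.ms=5000
queue.buffering.max.messages=10000
batch.num.messages=200

Depending on the version, the console producer can load it via --producer.config producer.properties.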



All configuration parameters in server.properties are listed below, with explanations:

 

Parameter / Description

broker.id = 0
Uniquely identifies each broker in the cluster; must be a non-negative integer. As long as broker.id is not changed when the server's IP address changes, consumers are unaffected.

log.dirs = /data/kafka-logs
Directories where Kafka stores its data; multiple directories are separated by commas, e.g. /data/kafka-logs-1,/data/kafka-logs-2

port = 9092
Service port of the broker

message.max.bytes = 6525000
Maximum size of a message body, in bytes

num.network.threads = 4
Maximum number of threads the broker uses to process network requests; generally does not need to be changed

num.io.threads = 8
Number of threads the broker uses for disk I/O; the value should be greater than the number of disks

background.threads = 4
Number of threads for background tasks such as deleting expired message files; normally does not need to be changed

queued.max.requests = 500
Maximum number of requests allowed to queue while waiting for the I/O threads; if exceeded, the broker stops accepting external requests. This acts as a self-protection mechanism.

host.name
Broker host address. If set, the broker binds to this address; if not, it binds to all interfaces and publishes one of them to ZooKeeper. Generally not set.

socket.send.buffer.bytes = 100*1024
Socket send buffer; the SO_SNDBUF socket tuning parameter

socket.receive.buffer.bytes = 100*1024
Socket receive buffer; the SO_RCVBUF socket tuning parameter

socket.request.max.bytes = 100*1024*1024
Maximum size of a socket request, to protect the server against OOM; message.max.bytes must be smaller than socket.request.max.bytes. Can be overridden by a parameter specified when the topic is created.

log.segment.bytes = 1024*1024*1024
A topic partition is stored as a set of segment files; this controls the size of each segment. Can be overridden by a parameter specified when the topic is created.

log.roll.hours = 24*7
Even if a log segment has not reached the size set by log.segment.bytes, a new segment is forced once this time is reached. Can be overridden by a parameter specified when the topic is created.

log.cleanup.policy = delete
Log cleanup policy: delete or compact. Applied to data that has expired or to log files that have reached the size limit. Can be overridden by a parameter specified when the topic is created.

log.retention.minutes = 3days
Maximum time to retain data; older data is handled according to log.cleanup.policy. In other words, this is how long data remains available to consumers. Deletion is performed when either log.retention.bytes or log.retention.minutes is satisfied. Can be overridden by a parameter specified when the topic is created.

log.retention.bytes = -1
Maximum size of each partition of a topic; a topic's size limit = number of partitions * log.retention.bytes. -1 means no size limit. Deletion is performed when either log.retention.bytes or log.retention.minutes is satisfied. Can be overridden by a parameter specified when the topic is created.

log.retention.check.interval.ms = 5minutes
How often file sizes are checked to see whether the policy set in log.cleanup.policy should be triggered

log.cleaner.enable = false
Whether log compaction is enabled

log.cleaner.threads = 2
Number of threads running log compaction

log.cleaner.io.max.bytes.per.second = None
Maximum I/O throughput while performing log compaction

log.cleaner.dedupe.buffer.size = 500*1024*1024
Cache size used for deduplication during log compaction; the larger the better, space permitting

log.cleaner.io.buffer.size = 512*1024
General I/O block size used during log cleanup; does not need to be changed

log.cleaner.io.buffer.load.factor = 0.9
Load factor of the hash table used during log cleanup; generally does not need to be changed

log.cleaner.backoff.ms = 15000
Interval for checking whether log cleanup should be triggered

log.cleaner.min.cleanable.ratio = 0.5
Controls how frequently logs are compacted; a larger value means more efficient cleaning but also some wasted space. Can be overridden by a parameter specified when the topic is created.

log.cleaner.delete.retention.ms = 1day
Maximum time to retain entries in compacted logs, which is also the maximum time clients have to consume those messages. The difference from log.retention.minutes is that one controls uncompacted data and the other compacted data. Can be overridden by a parameter specified when the topic is created.

log.index.size.max.bytes = 10*1024*1024
Size limit of the index file of a log segment. Can be overridden by a parameter specified when the topic is created.

log.index.interval.bytes = 4096
After a fetch operation, a certain amount of space is needed to scan for the most recent offset; the larger the setting, the faster the scan, but the more memory is used. Normally this parameter does not need to be touched.

log.flush.interval.messages = None
Number of messages accumulated before the log file is synced to disk. Disk I/O is slow, but syncing is necessary for data reliability, so this setting is a trade-off between reliability and performance. If the value is too large, each fsync takes longer (blocking I/O); if too small, fsync happens more often, which adds some latency to client requests overall. If the physical server fails, messages that have not been fsynced are lost.

log.flush.scheduler.interval.ms = 3000
Interval for checking whether data needs to be flushed to disk

log.flush.interval.ms = None
Controlling disk writes only by message count is not enough. This parameter controls the fsync time interval: even if the message count has not reached the threshold, an fsync is triggered once the time since the last flush reaches this threshold.

log.delete.delay.ms = 60000
How long a file is retained after being removed from the index; generally does not need to be changed

log.flush.offset.checkpoint.interval.ms = 60000
Controls how often the point of the last flush to disk is recorded, to aid data recovery; generally does not need to be changed

auto.create.topics.enable = true
Whether topics may be created automatically; if false, topics must be created via the command line

default.replication.factor = 1
Default number of replicas for a topic

num.partitions = 1
Number of partitions for each topic if not specified at creation; can be overridden by a parameter specified when the topic is created

 

 

The following are the leader and replica configuration parameters in Kafka

 

controller.socket.timeout.ms = 30000
Socket timeout for communication between the partition leader and replicas

controller.message.queue.size = 10
Message queue size for data synchronization between the partition leader and replicas

replica.lag.time.max.ms = 10000
Maximum time to wait for a replica to respond to the partition leader; if exceeded, the replica is removed from the ISR (in-sync replicas), considered dead, and no longer managed

replica.lag.max.messages = 4000
If a follower falls too far behind the leader, the follower (that is, the partition replica) is considered failed.
## Usually, when a follower communicates with the leader, network latency or broken connections cause the replica's message synchronization to lag.
## If it lags by too many messages, the leader concludes that this follower has high network latency or limited message throughput and removes the replica,
## moving it to another follower.
## In environments with few brokers or limited network capacity, it is recommended to increase this value.

replica.socket.timeout.ms = 30*1000
Socket timeout between follower and leader

replica.socket.receive.buffer.bytes = 64*1024
Socket buffer size used for replication from the leader

replica.fetch.max.bytes = 1024*1024
Maximum amount of data replicas fetch at a time

replica.fetch.wait.max.ms = 500
Maximum wait time for communication between replicas and the leader; failed requests are retried

replica.fetch.min.bytes = 1
Minimum data size for a fetch; if the leader has less unsynchronized data than this, the request blocks until the condition is met

num.replica.fetchers = 1
Number of threads used for replication from the leader; increasing this value increases follower I/O

replica.high.watermark.checkpoint.interval.ms = 5000
How often each replica checkpoints its high watermark to disk

controlled.shutdown.enable = false
Whether controlled shutdown of a broker is allowed; if true, all leaders on this broker are closed and transferred to other brokers before shutdown

controlled.shutdown.max.retries = 3
Number of attempts for a controlled shutdown

controlled.shutdown.retry.backoff.ms = 5000
Interval between shutdown attempts

leader.imbalance.per.broker.percentage = 10
Leader imbalance ratio; if exceeded, partitions are rebalanced

leader.imbalance.check.interval.seconds = 300
Interval for checking whether leaders are imbalanced

offset.metadata.max.bytes
Maximum amount of offset metadata a client may retain

ZooKeeper parameter configuration in Kafka

 

zookeeper.connect = localhost:2181
Address of the ZooKeeper cluster; multiple addresses are separated by commas: hostname1:port1,hostname2:port2,hostname3:port3

zookeeper.session.timeout.ms = 6000
Maximum ZooKeeper session timeout, i.e. the heartbeat interval; if there is no response within this time, the broker is considered dead. Should not be too large.

zookeeper.connection.timeout.ms = 6000
ZooKeeper connection timeout

zookeeper.sync.time.ms = 2000
Synchronization interval between the ZooKeeper leader and followers

 

 

Kafka commands and parameters

1. View detailed information about a topic
./kafka-topics.sh -zookeeper 127.0.0.1:2181 -describe -topic testKJ1

2. Add replicas to a topic
./kafka-reassign-partitions.sh -zookeeper 127.0.0.1:2181 -reassignment-json-file json/partitions-to-move.json -execute

3. Create a topic
./kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic testKJ1

4. Add partitions to a topic
./bin/kafka-topics.sh --zookeeper 127.0.0.1:2181 --alter --partitions 20 --topic testKJ1

5. Kafka console producer command
./kafka-console-producer.sh --broker-list localhost:9092 --topic testKJ1

6. Kafka console consumer command
./kafka-console-consumer.sh -zookeeper localhost:2181 --from-beginning --topic testKJ1

7. Start the Kafka server
./kafka-server-start.sh -daemon ../config/server.properties

8. Take a broker offline
./kafka-run-class.sh kafka.admin.ShutdownBroker --zookeeper 127.0.0.1:2181 --broker #brokerId# --num.retries 3 --retry.interval.ms 60
shutdown broker

9. Delete a topic
./kafka-run-class.sh kafka.admin.DeleteTopicCommand --topic testKJ1 --zookeeper 127.0.0.1:2181
./kafka-topics.sh --zookeeper localhost:2181 --delete --topic testKJ1

10. View the offsets consumed within a consumer group
./kafka-run-class.sh kafka.tools.ConsumerOffsetChecker --zookeeper localhost:2181 --group test --topic testKJ1

 


1. Delete operation:

$bin/kafka-topics.sh --zookeeper hadoop108:2181 --delete --topic first

  By default the topic is only marked for deletion; to actually delete it, set delete.topic.enable=true

2. Create a topic with three partitions and two replicas

$bin/kafka-topics.sh --zookeeper hadoop108:2181 --create --topic first --partitions 3 --replication-factor 2


3. List the current topics:

$ bin/kafka-topics.sh --zookeeper hadoop108:2181 --list 
first

  Because topic-related metadata is stored in ZooKeeper, we need to connect to the ZooKeeper cluster to retrieve the topic data

4. Show the details of a topic:

$ bin/kafka-topics.sh --zookeeper hadoop108:2181 --describe --topic first
Topic:first    PartitionCount:3    ReplicationFactor:2    Configs:
Topic: first    Partition: 0    Leader: 9    Replicas: 9,10    Isr: 9,10
Topic: first    Partition: 1    Leader: 10    Replicas: 10,8    Isr: 10,8
Topic: first    Partition: 2    Leader: 8    Replicas: 8,9    Isr: 8,9

  Here the topic name is first. The leader of partition 0 is on broker 9 with replicas on 9 and 10; the leader of partition 1 is on 10 with replicas on 10 and 8; the leader of partition 2 is on 8 with replicas on 8 and 9. ISR stands for in-sync replicas, the replicas currently in sync.

  Now suppose we kill broker 10:

bin/kafka-topics.sh --zookeeper hadoop108:2181 --describe --topic first
Topic:first    PartitionCount:3    ReplicationFactor:2    Configs:
Topic: first    Partition: 0    Leader: 9    Replicas: 9,10    Isr: 9
Topic: first    Partition: 1    Leader: 8    Replicas: 10,8    Isr: 8
Topic: first    Partition: 2    Leader: 8    Replicas: 8,9    Isr: 8,9

  At this point the leader of partition 1 becomes 8. After a while the partitions self-balance: self-balancing means that leaders are spread evenly, and in this example the leader of partition 1 is restored to 10 once broker 10 comes back.
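
  As a hedged sketch (same ZooKeeper address as above), this self-balancing corresponds to preferred-replica leader election, which older Kafka versions also let you trigger manually:

$ bin/kafka-preferred-replica-election.sh --zookeeper hadoop108:2181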


5. Produce data: producing data does not involve ZooKeeper

$ bin/kafka-console-producer.sh --broker-list hadoop108:9092 --topic first
>hello
>kafka

  When producing data, the producer needs the brokers of the message-queue cluster so it knows where the produced data goes, so the Kafka cluster's brokers and the port used for cluster communication must be specified. Besides that, the partition each message goes to must also be determined.
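
  A hedged sketch of influencing which partition a record lands in from the console producer (parse.key and key.separator are standard console-producer properties; the keys here are made up): records with a key are assigned to a partition by hashing the key:

$ bin/kafka-console-producer.sh --broker-list hadoop108:9092 --topic first --property parse.key=true --property key.separator=:
>user1:hello
>user2:kafka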

6. Consuming data requires connecting to ZooKeeper so that the last consumed offset can be retrieved, which determines where consumption resumes; the topic to consume must also be specified

[isea@hadoop108 kafka]$ bin/kafka-console-consumer.sh --zookeeper hadoop108:2181 --from-beginning -topic first
hello
kafka

  After the data is produced on the producer side, it is received on the consumer side

  Let's look at the data again:

[isea@hadoop108 kafka]$ bin/kafka-topics.sh --zookeeper hadoop108:2181 --describe --topic first
Topic:first    PartitionCount:3    ReplicationFactor:2    Configs:
Topic: first    Partition: 0    Leader: 9    Replicas: 9,10    Isr: 9,10
Topic: first    Partition: 1    Leader: 10    Replicas: 10,8    Isr: 8,10
Topic: first    Partition: 2    Leader: 8    Replicas: 8,9    Isr: 8,9

 

  The current machine is broker 8. Partition 2's data is stored in two copies, on machines 8 and 9, with the leader on 8, so we look at first-2 under the logs directory; it contains data, namely kafka:

[isea@hadoop108 first-2]$ strings 00000000000000000000.log 
kafka

 

  For partition 0 there is data on machines 9 and 10, with the leader on 9, so look at first-0 under the logs directory; it contains hello:

[isea@hadoop109 first-0]$ strings 00000000000000000000.log 
hello

 

  Analysis: the messages we produced are hello and kafka, one complete set of data labeled with the topic first. The topic first has three partitions, and every record labeled first is stored in one of these three partitions when it enters the message queue. Here hello was stored in partition 0, which lives on machine 9 with a backup of partition 0 on machine 10, while kafka was stored in partition 1 on machine 10 with a backup of partition 1 on machine 8.


  In Kafka 0.9 and later, the consumer changed accordingly: the offset data that used to live in ZooKeeper is now stored in the Kafka cluster itself, so we can also consume with the following client command:

[isea@hadoop101 kafka]$ bin/kafka-console-consumer.sh --bootstrap-server hadoop101:9092 --topic first 
hello
kafka

 

  Now let's look at the log directory again:

[isea@hadoop101 logs]$ tree
.
├── cleaner-offset-checkpoint
├── __consumer_offsets-0
│   ├── 00000000000000000000.index
│   ├── 00000000000000000000.log
│   ├── 00000000000000000000.timeindex
│   └── leader-epoch-checkpoint


  We can see that the offset information is stored in the logs and the offsets are kept in the cluster itself, i.e. spread across all the machines; taken together, the whole cluster holds the complete offset information.
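
  A hedged sketch (the group name is a placeholder; the console consumer generates one automatically): with offsets stored in Kafka itself, they can be inspected with kafka-consumer-groups.sh against the brokers instead of ZooKeeper:

[isea@hadoop101 kafka]$ bin/kafka-consumer-groups.sh --bootstrap-server hadoop101:9092 --list
[isea@hadoop101 kafka]$ bin/kafka-consumer-groups.sh --bootstrap-server hadoop101:9092 --describe --group console-consumer-12345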



References:

https://www.jianshu.com/p/49f23183a6a3
https://www.jb51.net/article/99923.htm

https://blog.csdn.net/qq_31807385/article/details/84948701
