Kafka broker main configuration


General Configuration

broker.id

Each broker needs an identifier, which is set with broker.id. The default value is 0, but it can be set to any other integer. This value must be unique within the Kafka cluster.

port

By default, Kafka listens on port 9092. The port parameter can be changed to any available port.

Note that ports below 1024 require root privileges to start Kafka, so using ports below 1024 is not recommended.

zookeeper.connect

The ZooKeeper address used to store broker metadata.

It is typically set to localhost:2181; multiple addresses can be given as a comma-separated list. Each entry has the format hostname:port/path, where:

hostname is the host name or IP address of the ZooKeeper server

port is the ZooKeeper client connection port

/path is an optional ZooKeeper path used as the chroot environment for the Kafka cluster. If it is not specified, the root path is used. If the specified chroot path does not exist, the broker creates it at startup.
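Taken together, the settings above form the core of a broker's server.properties file. A minimal sketch, assuming a three-node ZooKeeper ensemble and a /kafka chroot (the host names and chroot are illustrative):

```properties
# Unique identifier for this broker within the cluster
broker.id=0

# Port the broker listens on (defaults to 9092)
port=9092

# ZooKeeper ensemble storing broker metadata, with an optional chroot path;
# /kafka is created at startup if it does not already exist
zookeeper.connect=zk1:2181,zk2:2181,zk3:2181/kafka
```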

log.dirs

Kafka stores all messages on disk, and the directories that hold these log segments are specified by log.dirs. It is a comma-separated list of local file system paths. If multiple paths are specified, the broker distributes partitions across them on a "least used" basis, keeping all log segments of the same partition in the same path.

Note that the broker places a new partition in the path that currently holds the fewest partitions, not the path with the most free disk space.
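For example, a broker with two data disks might specify (the paths are illustrative):

```properties
# Comma-separated local file system paths for log segments; a new partition
# goes to the path that currently holds the fewest partitions
log.dirs=/disk1/kafka-logs,/disk2/kafka-logs
```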

num.recovery.threads.per.data.dir

Kafka uses a configurable pool of threads to handle log segments in the following three cases:

when the server starts normally, to open the log segments of each partition

when the server restarts after a crash, to check and truncate the log segments of each partition

when the server shuts down normally, to close the log segments

By default, each log directory uses only one thread. These threads are used only at server startup and shutdown, so it is reasonable to configure a large number of threads to parallelize the work. Especially for servers with a large number of partitions, parallel recovery after a crash can save hours.

Note that the configured number applies to a single log directory in log.dirs. For example, if this value is set to 8 and log.dirs specifies 3 paths, there will be 24 threads in total.
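The thread-count arithmetic can be sketched as follows (the values mirror the example above):

```python
# Total recovery threads = threads per data directory * number of log.dirs paths
num_recovery_threads_per_data_dir = 8
log_dirs = ["/disk1/kafka-logs", "/disk2/kafka-logs", "/disk3/kafka-logs"]

total_threads = num_recovery_threads_per_data_dir * len(log_dirs)
print(total_threads)  # 24
```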

auto.create.topics.enable

In the following three cases, Kafka will automatically create a topic:

when a producer starts writing messages to the topic

when a consumer starts reading messages from the topic

when any client requests metadata for the topic
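Automatic topic creation is often undesirable in production, where any of the three requests above would silently create a topic; it can be disabled explicitly:

```properties
# Do not create topics implicitly on produce, fetch, or metadata requests
auto.create.topics.enable=false
```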


Default topic configuration

num.partitions

Specifies how many partitions a newly created topic contains. When setting this value, consider the expected throughput of the topic and the maximum throughput of reading data from a single partition.

If messages are written to partitions according to different keys, adding partitions to an existing topic is very difficult.

The number of partitions on a single broker is limited, because the more partitions there are, the more memory is used and the longer leader elections take.

Number of partitions = topic throughput / consumer throughput. If you write and read 1GB of data per second on a topic, and each consumer can process 50MB per second, you need at least 20 partitions.

Keeping the size of each partition below 25GB generally gives good results.
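The sizing rule above can be sketched as a small calculation (the throughput figures are the ones from the example; 1GB is treated as 1000MB):

```python
import math

# Expected write/read throughput for the topic and per-consumer throughput
topic_throughput_mb_s = 1000   # 1GB of data per second
consumer_throughput_mb_s = 50  # each consumer handles 50MB per second

# Partitions needed so that consumers can keep up with the topic
partitions = math.ceil(topic_throughput_mb_s / consumer_throughput_mb_s)
print(partitions)  # 20
```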

log.retention.ms

Determines how long data is retained. By default the retention time is set with log.retention.hours, whose default value is 168 hours (one week). log.retention.minutes and log.retention.ms also exist; if more than one is specified, Kafka uses the one with the smallest unit, so log.retention.ms takes precedence.

Retention by time is implemented by checking the last modified time of each log segment file on disk. Normally, the last modified time is the time the log segment was closed, which is also the timestamp of the last message in the file.

If partitions are moved between servers with administrative tools, however, the last modified time is no longer accurate, and the resulting error can cause those partitions to retain data for too long.
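A sketch of a retention-by-time setting; 604800000 ms is 168 hours:

```properties
# Retain log segments for one week; if several retention-time parameters
# are set, log.retention.ms (the smallest unit) takes precedence
log.retention.ms=604800000
```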

log.retention.bytes

Expires messages based on the number of bytes retained. This value applies to each partition.

If a topic has 8 partitions and this value is set to 1GB, the topic can retain at most 8GB of data. So when the number of partitions in a topic increases, the amount of data that can be retained for the whole topic increases as well.

If both log.retention.ms and log.retention.bytes are specified, messages are deleted as soon as either condition is met.
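Because the limit is per partition, total retention scales with the partition count, as in the 8-partition example above:

```python
# Per-partition size limit and partition count from the example
partitions = 8
log_retention_bytes = 1 * 1024**3  # 1GB per partition

# Maximum data retained for the whole topic
topic_max_bytes = partitions * log_retention_bytes
print(topic_max_bytes // 1024**3)  # 8 (GB)
```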

log.segment.bytes

When messages arrive at the broker, they are appended to the current log segment of the partition. When the log segment reaches the size specified by log.segment.bytes (1GB by default), the current log segment is closed and a new one is opened. A log segment only starts waiting to expire after it has been closed.

The smaller this value, the more frequently files are closed and allocated, which reduces overall disk write efficiency.

If a topic receives only 100MB of messages per day, then with the default log.segment.bytes of 1GB it takes 10 days to fill a log segment. Because messages do not expire before the log segment is closed, if log.retention.ms is set to 168 hours (one week) a log segment may need up to 17 days before it expires: 10 days to fill and close the segment, plus 7 days of retention, since the segment cannot be deleted until its last message has expired.

The size of log segments also affects fetching offsets by timestamp. When fetching the offset for a partition at a specific timestamp, Kafka looks for the log segment file that was last modified after the specified timestamp while the previous (already closed) log segment was last modified before it. Kafka then returns the offset at the beginning of that log segment, which is also its file name.

For timestamp-based offset lookups, smaller log segments give more accurate results.
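The worst-case expiry timeline from the example can be written out explicitly (1GB treated as 1000MB):

```python
# Figures from the example above
daily_ingest_mb = 100     # topic receives 100MB of messages per day
segment_size_mb = 1000    # log.segment.bytes default of 1GB
retention_days = 7        # log.retention.ms set to 168 hours

# Days until the current segment fills up and is closed
days_to_fill = segment_size_mb // daily_ingest_mb
# Expiry only starts once the segment is closed
worst_case_days = days_to_fill + retention_days
print(days_to_fill, worst_case_days)  # 10 17
```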

log.segment.ms

This parameter provides a time-based way to close log segments: it specifies after how long a log segment is closed.

log.segment.bytes and log.segment.ms are not mutually exclusive. A log segment is closed when either limit is reached, whichever comes first, by size or by time.

log.segment.ms has no default value, so by default log segments are closed based only on their size.

Note: when using a time-based limit, consider the impact on disk performance of closing many log segments in parallel. This can happen when many partitions never reach the segment size limit: the clock for the time limit starts when the broker starts, so for low-volume partitions the close operations always happen at the same time.
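A sketch combining both limits; the segment closes on whichever limit is hit first:

```properties
# Close a log segment after 7 days even if it has not reached the size limit
log.segment.ms=604800000
# ... or as soon as it reaches 1GB, whichever comes first
log.segment.bytes=1073741824
```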

message.max.bytes

The broker limits the size of a single message with the message.max.bytes parameter; the default value is 1000000, i.e. about 1MB.

If a producer tries to send a message larger than this value, the message is not accepted and the producer receives an error returned by the broker.

This value refers to the size of the compressed message, so the actual uncompressed message can be much larger than this value.

This value has a significant impact on performance. The larger it is, the more time the threads handling network connections and requests spend on those requests. It also increases the size of disk write blocks, which affects IO throughput.

Note: the consumer client setting fetch.message.max.bytes must be coordinated with this value. If fetch.message.max.bytes is smaller than message.max.bytes, the consumer cannot read the larger messages and may appear to be blocked.
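A sketch of coordinating the two sides; message.max.bytes is a broker setting, while fetch.message.max.bytes belongs to the (old) consumer client's own configuration:

```properties
# Broker: largest accepted (compressed) message, in bytes
message.max.bytes=1000000

# Consumer (separate client configuration): must be at least as large,
# otherwise oversized messages can never be fetched and the consumer stalls
fetch.message.max.bytes=1000000
```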


Origin: www.cnblogs.com/EnzoDin/p/12578067.html