Kafka key parameter settings

  In production Kafka environments, parameter tuning is very important. Kafka has many parameters; in our Java configuration code, the producer parameters are typically set as follows:

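import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;

Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
props.put("buffer.memory", 67108864);     // 64MB client-side buffer
props.put("batch.size", 131072);          // 128KB per Batch
props.put("linger.ms", 100);              // wait at most 100ms for a Batch to fill
props.put("max.request.size", 10485760);  // 10MB maximum request size
props.put("retries", 10);                 // retry a failed request up to 10 times
props.put("retry.backoff.ms", 500);       // wait 500ms between retries
props.put("acks", "1");                   // wait for the Leader's acknowledgement only

KafkaProducer<String, String> producer = new KafkaProducer<>(props);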

 

  • buffer.memory

  The Kafka client does not send messages to the server one at a time; it sends them through a buffer. Messages sent by KafkaProducer first enter a memory buffer local to the client, many messages are then collected into a Batch, and the Batch is sent to the Broker. This is what makes high throughput possible.

  buffer.memory limits how much memory KafkaProducer may use for this buffer. The default value is 32MB.

  If buffer.memory is set too small, the following problem can occur: messages are written into the memory buffer quickly, but the Sender thread cannot send Requests to the Kafka server fast enough, so the buffer soon fills up. Once it is full, the user thread is blocked and cannot write any more messages to Kafka.

  Therefore, buffer.memory has to be load tested against the actual traffic: measure the rate, in messages per second, at which user threads write into the memory buffer in production, then use the load test results to tune it to a reasonable value.
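As a starting point for such a load test, the required buffer can be roughly estimated from the write rate. The sketch below is a back-of-the-envelope calculation, not part of the Kafka API. The class name, method name, and sample workload are hypothetical, and it assumes the buffer only has to absorb what the user threads produce while the Sender thread is stalled for a given time.

```java
// Rough estimate only: how much buffer.memory is needed to absorb writes
// while the Sender thread is temporarily unable to drain the buffer.
class BufferMemoryEstimate {

    /**
     * @param threads           number of user threads calling send() (assumed)
     * @param messagesPerSecond messages written per thread per second (assumed)
     * @param avgMessageBytes   average serialized message size in bytes (assumed)
     * @param stallMillis       longest Sender stall the buffer should absorb (assumed)
     */
    static long requiredBytes(int threads, long messagesPerSecond,
                              long avgMessageBytes, long stallMillis) {
        long bytesPerSecond = threads * messagesPerSecond * avgMessageBytes;
        return bytesPerSecond * stallMillis / 1000;
    }

    public static void main(String[] args) {
        // Hypothetical workload: 8 threads, 2000 msg/s each, 1KB messages, 2s stall.
        long needed = requiredBytes(8, 2000, 1024, 2000);
        long defaultBuffer = 32L * 1024 * 1024; // buffer.memory default of 32MB
        System.out.println("estimated bytes needed: " + needed);
        System.out.println("fits in default buffer: " + (needed <= defaultBuffer));
    }
}
```

An estimate like this only narrows the range to try; the value still has to be confirmed under a real load test.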

 

  • batch.size

  A Batch can be sent once it has accumulated batch.size bytes of data. For example, the default batch.size is 16KB, so a Batch is sent once it has collected 16KB of data.

  In theory, increasing batch.size lets each Batch buffer more data, so each Request carries more data and throughput may improve.

  But batch.size cannot be too large: if data always sits in a Batch waiting to be sent, message latency will be high.

  In general, try increasing this parameter somewhat, and load test it with messages like those in the production environment.

 

  • linger.ms

  linger.ms is the longest a Batch may wait after it is created. Once that time has passed, the Batch must be sent whether or not it is full.

  For example, batch.size is 16KB, but during an off-peak period very few messages are being sent. A Batch may be created and receive a few messages, yet never reach 16KB. Should it keep waiting indefinitely?

  Of course not. Suppose linger.ms is set to 50ms: once 50ms have passed since the Batch was created, it is sent even if it has not reached 16KB.

  So linger.ms determines that once a message has been written into a Batch, it waits at most this long before it is sent along with the Batch.

  Setting linger.ms together with batch.size prevents a Batch that never fills up from keeping messages backlogged in memory, unable to be sent.
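The interplay of the two settings can be sketched as a simple rule: a Batch is sent when it is full or when it has waited long enough. This is a simplified model for illustration, not Kafka's internal code; the class and method names are hypothetical.

```java
// Simplified model of when a Batch is sent; not Kafka's internal implementation.
class BatchReadiness {

    /** A Batch is sent when it has reached batch.size, or has waited linger.ms. */
    static boolean isReady(int batchBytes, int batchSize, long ageMillis, long lingerMillis) {
        boolean full = batchBytes >= batchSize;
        boolean lingered = ageMillis >= lingerMillis;
        return full || lingered;
    }

    public static void main(String[] args) {
        int batchSize = 16 * 1024; // batch.size = 16KB
        long lingerMs = 50;        // linger.ms = 50

        System.out.println(isReady(16 * 1024, batchSize, 5, lingerMs)); // full: true
        System.out.println(isReady(2 * 1024, batchSize, 20, lingerMs)); // small and young: false
        System.out.println(isReady(2 * 1024, batchSize, 50, lingerMs)); // small but 50ms old: true
    }
}
```

The third case is the off-peak scenario from the text: the Batch holds only 2KB, but it goes out anyway because it has waited 50ms.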

  

  • max.request.size

  It determines the maximum size of each request sent to the Kafka server.

  If the messages being sent are large, for example 20KB each, batch.size needs to be increased, for example to 512KB, and buffer.memory needs to be increased too, for example to 128MB.

  Only then can the Batch mechanism still pack multiple messages together in a large-message scenario.

  In that case max.request.size also has to be increased at the same time.
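The arithmetic behind this advice can be checked with a few lines. The sketch below uses hypothetical helper names and ignores per-record and request overhead, so the counts are upper bounds. The consistency check is a rule of thumb implied by the text, not a validation that Kafka itself performs.

```java
// Rule-of-thumb checks for large messages; record and request overhead is ignored.
class LargeMessageSettings {

    /** How many messages of the given size fit into one batch.size worth of data. */
    static long messagesPerBatch(long batchSizeBytes, long messageBytes) {
        return batchSizeBytes / messageBytes;
    }

    /** True if a full Batch fits both in one request and in the memory buffer. */
    static boolean fitsTogether(long batchSizeBytes, long maxRequestSizeBytes,
                                long bufferMemoryBytes) {
        return batchSizeBytes <= maxRequestSizeBytes && batchSizeBytes <= bufferMemoryBytes;
    }

    public static void main(String[] args) {
        long kb = 1024;
        long mb = 1024 * kb;
        long message = 20 * kb; // 20KB messages, as in the example above

        System.out.println(messagesPerBatch(16 * kb, message));        // default 16KB: 0
        System.out.println(messagesPerBatch(512 * kb, message));       // 512KB: 25
        System.out.println(fitsTogether(512 * kb, 10 * mb, 128 * mb)); // true
    }
}
```

With the default 16KB batch.size, not even one 20KB message fits, so batching brings no benefit; at 512KB, about 25 messages can share a Batch.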

 

  • retries and retry.backoff.ms

  This is the retry mechanism: if a request fails, it can be retried up to a given number of times (retries), waiting a given number of milliseconds between attempts (retry.backoff.ms). Set both according to the business scenario.
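What the two settings mean can be illustrated with a generic retry loop. This is only a sketch of the idea, assuming a fixed wait between attempts and treating every exception as retryable; it is not how the Kafka producer is implemented, and the class and method names are hypothetical.

```java
import java.util.concurrent.Callable;

// Generic retry loop illustrating "retries" and "retry.backoff.ms"; not Kafka code.
class RetrySketch {

    static <T> T callWithRetries(Callable<T> request, int retries, long retryBackoffMs)
            throws Exception {
        if (retries < 0) {
            throw new IllegalArgumentException("retries must be >= 0");
        }
        Exception last = null;
        // One initial attempt plus up to `retries` further attempts.
        for (int attempt = 0; attempt <= retries; attempt++) {
            try {
                return request.call();
            } catch (Exception e) {
                last = e;
                if (attempt < retries) {
                    Thread.sleep(retryBackoffMs); // wait before the next attempt
                }
            }
        }
        throw last; // every attempt failed
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Hypothetical request that fails twice, then succeeds.
        String result = callWithRetries(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("transient failure");
            }
            return "sent";
        }, 10, 500);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

With retries set to 10 and retry.backoff.ms set to 500, as in the configuration above, this model spends at most 10 × 500ms = 5 seconds waiting between attempts before giving up.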

 

  • acks

  acks        Meaning
  0           The Producer sends data to the cluster without waiting for any reply, so there is no guarantee that the message was sent successfully. Highest efficiency, lowest safety.
  1           The send is treated as successful as soon as the Leader replies; only receipt by the Leader is guaranteed.
  -1 or all   The send is treated as successful only after all ISR Followers have finished syncing from the Leader, which ensures that the Leader and all in-sync replicas have received the message. Highest safety, lowest efficiency.
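Since acks is passed as a string, a typo is easy to make. A small helper along the following lines can reject values outside the table above before the producer is created. It uses only java.util, and the class and method names are hypothetical.

```java
import java.util.Properties;
import java.util.Set;

// Hypothetical helper: copy a producer configuration and set a validated acks value.
class AcksConfig {

    // The values listed in the table above.
    private static final Set<String> VALID_ACKS = Set.of("0", "1", "-1", "all");

    static Properties withAcks(Properties base, String acks) {
        if (!VALID_ACKS.contains(acks)) {
            throw new IllegalArgumentException("acks must be one of " + VALID_ACKS);
        }
        Properties copy = new Properties();
        copy.putAll(base);      // keep all other settings unchanged
        copy.put("acks", acks); // override only acks
        return copy;
    }

    public static void main(String[] args) {
        Properties base = new Properties();
        base.put("bootstrap.servers", "localhost:9092");
        base.put("acks", "1");

        // Trade some efficiency for the highest safety level.
        Properties safest = withAcks(base, "all");
        System.out.println(safest.getProperty("acks")); // all
    }
}
```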

 

 


Origin www.cnblogs.com/wwcom123/p/11181680.html