Production experience: improving producer throughput
Core parameters
Code
package com.artisan.pc;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;

/**
 * @author 小工匠
 * @version 1.0
 * @mark: show me the code , change the world
 */
public class CustomProducerParameters {

    public static void main(String[] args) throws InterruptedException {
        // 1. Create the Kafka producer configuration object
        Properties properties = new Properties();
        // 2. Add configuration info: bootstrap.servers
        properties.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "192.168.126.171:9092");
        // Key/value serializers (required): key.serializer, value.serializer
        properties.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        properties.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        // batch.size: batch size, default 16 KB
        properties.put(ProducerConfig.BATCH_SIZE_CONFIG, 16384);
        // linger.ms: how long to wait for a batch to fill, default 0
        properties.put(ProducerConfig.LINGER_MS_CONFIG, 1);
        // buffer.memory: RecordAccumulator buffer size, default 32 MB
        properties.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 33554432);
        // compression.type: compression codec, default none; valid values: gzip, snappy, lz4, zstd
        properties.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "snappy");
        // 3. Create the Kafka producer object
        KafkaProducer<String, String> kafkaProducer = new KafkaProducer<>(properties);
        // 4. Call the send method to send messages
        for (int i = 0; i < 10; i++) {
            kafkaProducer.send(new ProducerRecord<>("artisan", "art-msg-" + i));
        }
        // 5. Release resources
        kafkaProducer.close();
    }
}
Production experience: data reliability
Message sending process
To recap, the message sending process is as follows:
ACK response mechanism
Background | Solution provided by Kafka
---|---
The Leader receives data and all Followers begin synchronizing it, but one Follower cannot synchronize due to a failure. Without further measures, the Leader would wait for that Follower before sending an ACK. | The Leader maintains a dynamic In-Sync Replica set (ISR): the set of Followers that are in sync with the Leader. Once the Followers in the ISR finish synchronizing the data, the Leader sends an ACK to the Producer. If a Follower fails to synchronize with the Leader for too long (controlled by replica.lag.time.max.ms), it is removed from the ISR. If the Leader fails, a new Leader is elected from the ISR.
ACK response levels
For some less important data, the reliability requirements are not strict and a small amount of loss can be tolerated, so there is no need to wait for every Follower in the ISR to acknowledge.
Kafka therefore provides three reliability levels. Users can choose among the following configurations according to their trade-off between reliability and latency.
acks | Description
---|---
0 | Lowest latency: the producer does not wait for any acknowledgment; the message may not even have been persisted by the Leader yet. Data loss is possible, especially if the Leader fails.
1 | The Leader returns an ack after writing the message to its local log. If the Leader fails before the Followers have synchronized the data, the data may be lost.
-1 (all) | The Leader returns an ack only after the Leader and all in-sync Followers have written the message. If the Leader fails after the Followers finish synchronizing but before the ack is sent, the producer retries and the data may be duplicated.
Summary of response mechanisms
Code
package com.artisan.pc;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;

/**
 * @author 小工匠
 * @version 1.0
 * @mark: show me the code , change the world
 */
public class CustomProducerAck {

    public static void main(String[] args) throws InterruptedException {
        // 1. Create the Kafka producer configuration object
        Properties properties = new Properties();
        // 2. Add configuration info: bootstrap.servers
        properties.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "192.168.126.171:9092");
        // Key/value serializers (required): key.serializer, value.serializer
        properties.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        properties.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        // Set acks
        properties.put(ProducerConfig.ACKS_CONFIG, "all");
        // retries: number of retries, default Integer.MAX_VALUE (2147483647)
        properties.put(ProducerConfig.RETRIES_CONFIG, 3);
        // 3. Create the Kafka producer object
        KafkaProducer<String, String> kafkaProducer = new KafkaProducer<>(properties);
        // 4. Call the send method to send messages
        for (int i = 0; i < 10; i++) {
            kafkaProducer.send(new ProducerRecord<>("artisan", "art-msg-ack" + i));
        }
        // 5. Release resources
        kafkaProducer.close();
    }
}
Production experience: data deduplication
Data transfer semantics
Idempotence
Idempotence principle
Enabling idempotence (enabled by default)
In the producer configuration object, idempotence is controlled by the enable.idempotence parameter. It defaults to true; setting it to false disables idempotence.
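As a minimal sketch, the switch can be set explicitly in the producer Properties. Literal config keys are used here so the fragment stands alone, and the broker address is the placeholder from the earlier examples:

```java
import java.util.Properties;

public class IdempotentProducerConfig {

    // Build a producer config with idempotence explicitly enabled.
    static Properties idempotentProps() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "192.168.126.171:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // enable.idempotence defaults to true; set explicitly here for clarity
        props.put("enable.idempotence", "true");
        // Idempotence requires acks=all (which is also the default with idempotence on)
        props.put("acks", "all");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(idempotentProps().getProperty("enable.idempotence")); // prints "true"
    }
}
```

Note that idempotence only deduplicates retries within a single producer session and partition; cross-session guarantees require transactions, covered next.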
Producer transactions
Kafka transaction principle
Transaction API
// 1. Initialize the transaction
void initTransactions();

// 2. Begin the transaction
void beginTransaction() throws ProducerFencedException;

// 3. Commit already-consumed offsets within the transaction (mainly for consumers)
void sendOffsetsToTransaction(Map<TopicPartition, OffsetAndMetadata> offsets,
                              String consumerGroupId) throws ProducerFencedException;

// 4. Commit the transaction
void commitTransaction() throws ProducerFencedException;

// 5. Abort the transaction (similar to rolling back)
void abortTransaction() throws ProducerFencedException;