Kafka producer ack modes
• 0: The producer does not wait for any acknowledgment; the send is considered successful before the broker has written the data to disk.
• 1: The producer waits for the leader to acknowledge after the leader has received the data.
Possible problem: after acknowledging, the leader crashes and a new leader is elected. The acknowledged data that was never replicated is lost, and the producer will not resend it.
• -1 (all): The producer waits until the leader and every follower in the ISR have received the data. -1 is equivalent to all.
Possible problem: a crashed follower would delay the acknowledgment indefinitely, so Kafka maintains a dynamic ISR internally. A replica that fails to fetch or respond within 30 s (the default) is removed from the ISR.
Guarantee data reliability
If the partition has only 1 replica, or the minimum number of in-sync replicas (min.insync.replicas, default 1) is set to 1, the effect is the same as acks=1 and data can still be lost (the leader is then the only ISR member).
• Condition for fully reliable data = acks set to -1 + replication factor >= 2 + min.insync.replicas >= 2
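The producer side of the reliability condition above can be sketched as plain configuration properties. This is a minimal sketch using the literal config-key strings rather than the `ProducerConfig` constants; `ReliableProducerConfig` is a hypothetical helper name, and note that the other two conditions (replication factor and min.insync.replicas) are topic/broker-level settings, not producer ones.

```java
import java.util.Properties;

public class ReliableProducerConfig {
    public static Properties build() {
        Properties props = new Properties();
        // acks=all (-1): wait until the leader and all ISR replicas have the record
        props.put("acks", "all");
        // Resend on transient failures such as a leader re-election
        props.put("retries", "3");
        // Reminder: full reliability also requires topic/broker-side settings
        // (replication factor >= 2 and min.insync.replicas >= 2), which cannot
        // be set from the producer.
        return props;
    }
}
```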
Code
import org.apache.kafka.clients.producer.*;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;
import java.util.concurrent.ExecutionException;

public class CustomProducer {
    public static void main(String[] args) throws ExecutionException, InterruptedException {
        // Producer configuration
        Properties properties = new Properties();
        // Connect to the cluster
        properties.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "192.168.6.101:9092");
        // Key/value serializer classes
        properties.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        properties.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        // Set acks
        properties.put(ProducerConfig.ACKS_CONFIG, "1");
        // Retry count; the default is Integer.MAX_VALUE (2147483647)
        properties.put(ProducerConfig.RETRIES_CONFIG, 3);
        // Create the producer
        KafkaProducer<String, String> kafkaProducer = new KafkaProducer<>(properties);
        // Send data to the "first" topic (partition 1, empty key)
        kafkaProducer.send(new ProducerRecord<>("first", 1, "", "lzq"), new Callback() {
            @Override
            public void onCompletion(RecordMetadata recordMetadata, Exception e) {
                if (e == null) {
                    System.out.println("Send succeeded, topic=" + recordMetadata.topic()
                            + " partition=" + recordMetadata.partition());
                }
            }
        }).get();
        // Release resources
        kafkaProducer.close();
    }
}
Data duplication problem
With acks=-1, the leader receives the data and the followers have already synchronized it, but the leader crashes just before sending the acknowledgment. The producer gets no response, retries, and the same record is written again, producing a duplicate.
idempotency
Idempotence means that no matter how many times the producer sends the same record to the broker, the broker persists it only once, guaranteeing no duplication. Exactly once = idempotence + at least once (acks=-1 + replication factor >= 2 + min.insync.replicas >= 2).
Criterion for detecting a duplicate: a record with the same <PID, Partition, SeqNumber> primary key is persisted only once. The PID is a producer ID newly assigned every time the producer restarts; Partition is the partition number; SeqNumber is a monotonically increasing sequence number. Idempotence therefore only prevents duplicates within a single partition and a single session.
Idempotence is controlled by the parameter enable.idempotence, which defaults to true; set it to false to disable it.
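The broker-side duplicate check can be illustrated with a small simulation. This is a hypothetical, simplified model (class name `IdempotenceSketch` is invented, not a Kafka API): per <PID, Partition> pair, only the next expected sequence number is accepted, so a retried record is dropped, while a producer restart (new PID) resets the deduplication state.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified sketch of the broker-side idempotence check: a record is
// persisted only if its sequence number is the next expected one for
// its <PID, partition> pair.
public class IdempotenceSketch {
    private final Map<String, Integer> nextSeq = new HashMap<>();

    /** Returns true if the record is persisted, false if it is rejected. */
    public boolean append(long pid, int partition, int seq) {
        String key = pid + "-" + partition;
        int expected = nextSeq.getOrDefault(key, 0);
        if (seq == expected) {
            nextSeq.put(key, expected + 1); // accept, advance expected sequence
            return true;
        }
        // seq < expected: duplicate retry; seq > expected: out-of-order gap
        return false;
    }
}
```

Because the map is keyed by PID and partition, the sketch also shows why idempotence is limited to a single partition and a single session: a restarted producer gets a fresh PID and its earlier records can no longer be recognized as duplicates.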
Working process
Code
import org.apache.kafka.clients.producer.*;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;
import java.util.concurrent.ExecutionException;

public class CustomProducer {
    public static void main(String[] args) throws ExecutionException, InterruptedException {
        // Producer configuration
        Properties properties = new Properties();
        // Connect to the cluster
        properties.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "192.168.6.101:9092");
        // Key/value serializer classes
        properties.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        properties.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        // Register the custom partitioner
        properties.put(ProducerConfig.PARTITIONER_CLASS_CONFIG, "com.lzq.producer.MyPartitioner");
        // Set acks
        properties.put(ProducerConfig.ACKS_CONFIG, "all");
        // Retry count; the default is Integer.MAX_VALUE (2147483647)
        properties.put(ProducerConfig.RETRIES_CONFIG, 3);
        // Specify the transactional id
        properties.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "t1");
        // Create the producer
        KafkaProducer<String, String> kafkaProducer = new KafkaProducer<>(properties);
        // Initialize and begin the transaction
        kafkaProducer.initTransactions();
        kafkaProducer.beginTransaction();
        try {
            // Send data to the "first" topic
            kafkaProducer.send(new ProducerRecord<>("first", 1, "", "lzq"), new Callback() {
                @Override
                public void onCompletion(RecordMetadata recordMetadata, Exception e) {
                    if (e == null) {
                        System.out.println("Send succeeded, topic=" + recordMetadata.topic()
                                + " partition=" + recordMetadata.partition());
                    }
                }
            });
            // Commit the transaction
            kafkaProducer.commitTransaction();
        } catch (Exception e) {
            // Abort (roll back) the transaction
            kafkaProducer.abortTransaction();
        } finally {
            // Release resources
            kafkaProducer.close();
        }
    }
}
Data out of order
Kafka 1.x and later guarantee that data within a single partition is ordered, under the following conditions:
(1) If idempotence is not enabled, max.in.flight.requests.per.connection must be set to 1.
(2) If idempotence is enabled, max.in.flight.requests.per.connection must be set to at most 5.
Reason: since Kafka 1.x, with idempotence enabled the broker caches the metadata of the last five requests from each producer, so ordering is guaranteed for up to five in-flight requests in every case: a request that arrives out of order is held and reordered on the broker before being written.
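Condition (2) translates into producer settings like the following. This is a minimal sketch using the literal config-key strings (`OrderedProducerConfig` is a hypothetical helper name); the values match the constraints stated above.

```java
import java.util.Properties;

public class OrderedProducerConfig {
    public static Properties build() {
        Properties props = new Properties();
        // Enable idempotence so the broker can deduplicate and reorder
        props.put("enable.idempotence", "true");
        // With idempotence on, ordering holds for up to 5 in-flight requests:
        // the broker caches the last 5 request sequence numbers per producer
        props.put("max.in.flight.requests.per.connection", "5");
        // Idempotence requires acks=all
        props.put("acks", "all");
        return props;
    }
}
```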
Load balancing
Create a json file: vim topics-to-move.json
{
"topics": [
{"topic": "first"}
],
"version": 1
}
Generate a reassignment plan
bin/kafka-reassign-partitions.sh --bootstrap-server 192.168.6.100:9092 --topics-to-move-json-file topics-to-move.json --broker-list "0,1,2,3" --generate
A balanced reassignment plan is printed automatically.
Create a json file and copy the generated plan into it:
vim increase-replication-factor.json
Execute the plan
bin/kafka-reassign-partitions.sh --bootstrap-server 192.168.6.100:9092 --reassignment-json-file increase-replication-factor.json --execute
Verify the plan
bin/kafka-reassign-partitions.sh --bootstrap-server 192.168.6.100:9092 --reassignment-json-file increase-replication-factor.json --verify
Decommission old nodes
Regenerate the plan with the node to retire removed from --broker-list, then execute and verify it as above
bin/kafka-reassign-partitions.sh --bootstrap-server 192.168.6.100:9092 --topics-to-move-json-file topics-to-move.json --broker-list "0,1,2" --generate
Startup script
#!/bin/bash
case $1 in
"start")
    for i in ip1 ip2        # replace with the broker IP addresses
    do
        ssh $i "absolute path of the start command"
    done
;;
"stop")
    for i in ip1 ip2
    do
        ssh $i "absolute path of the stop command"
    done
;;
esac