[Kafka] Java client and Spring Boot integration cases

(1) Testing Kafka with the Java client

【1】Producer

(1) Create a Maven project and add the dependencies

<dependencies>
  <dependency>
    <groupId>org.apache.kafka</groupId>
    <artifactId>kafka-clients</artifactId>
    <version>3.1.0</version>
  </dependency>
  <dependency>
    <groupId>com.alibaba</groupId>
    <artifactId>fastjson</artifactId>
    <version>1.2.54</version>
  </dependency>
  <dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>slf4j-simple</artifactId>
    <version>1.7.25</version>
  </dependency>
  <dependency>
    <groupId>junit</groupId>
    <artifactId>junit</artifactId>
    <version>4.11</version>
    <scope>test</scope>
  </dependency>
</dependencies>

(2) Create an entity class

public class User {

    private Integer id;

    private String nickname;

    private String password;

    private Integer sex;

    private String birthday;

    public Integer getId() {
        return id;
    }

    public void setId(Integer id) {
        this.id = id;
    }

    public String getNickname() {
        return nickname;
    }

    public void setNickname(String nickname) {
        this.nickname = nickname;
    }

    public String getPassword() {
        return password;
    }

    public void setPassword(String password) {
        this.password = password;
    }

    public Integer getSex() {
        return sex;
    }

    public void setSex(Integer sex) {
        this.sex = sex;
    }

    public String getBirthday() {
        return birthday;
    }

    public void setBirthday(String birthday) {
        this.birthday = birthday;
    }
}

(3) Basic implementation of sending messages from a producer

package com.allen.kafka;

import com.allen.entity.User;
import com.alibaba.fastjson.JSON;
import org.apache.kafka.clients.producer.*;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;
import java.util.concurrent.ExecutionException;

/**
 * @ClassName: MySimpleProducer
 * @Author: AllenSun
 * @Date: 2022/12/23 1:01 PM
 */
public class MySimpleProducer {

    private final static String TOPIC_NAME = "replicatedTopic";

    public static void main(String[] args) throws ExecutionException, InterruptedException {
        // (1) Set the client parameters
        Properties properties = new Properties();
        properties.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG,
                "192.168.19.11:9092,192.168.19.11:9093,192.168.19.11:9094");

        // Serialize the record key from String to a byte array
        properties.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        // Serialize the record value from String to a byte array
        properties.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        // (2) Create the producer client with these parameters
        Producer<String, String> producer = new KafkaProducer<String, String>(properties);

        User user = new User();
        user.setId(1);
        user.setNickname("zhangsan");
        user.setPassword("admin");
        user.setSex(1);
        user.setBirthday("2018-11-11");

        // (3) Create the record
        // key: decides which partition the record goes to; value: the message payload
        // If no partition is specified, the partition is derived from a hash of the key
        ProducerRecord<String, String> producerRecord = new ProducerRecord<String, String>(TOPIC_NAME,
                user.getId().toString(), JSON.toJSONString(user));

        // (4) Send the record and print the returned metadata
        // get() blocks until the send is acknowledged (synchronous send)
        RecordMetadata metadata = producer.send(producerRecord).get();
        System.out.println("Synchronous send result: " + "topic->" + metadata.topic()
                + "|partition->" + metadata.partition() + "|offset->" + metadata.offset());
    }
}

The code above does not specify a partition, so the target partition is calculated from a hash of the record key.
If we change the key to an arbitrary string

ProducerRecord<String,String> producerRecord=new ProducerRecord<String,String>(TOPIC_NAME,
                "keyString", "valueString");

Then execute the test again, and you can see that all the messages have been sent to partition 0 this time.

(4) Sending messages to a specified partition

Without specifying a partition:

ProducerRecord<String,String> producerRecord=new ProducerRecord<String,String>(TOPIC_NAME,
                "keyString", "valueString");

Specifying the partition explicitly:

ProducerRecord<String,String> producerRecord=new ProducerRecord<String,String>(TOPIC_NAME,0,
                "keyString", "valueString");

With this constructor, all messages are sent to partition 0, and the default key-hash partitioning is no longer used.
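As an illustration only of the key-hash idea (this is not Kafka's exact algorithm: the built-in DefaultPartitioner hashes the serialized key with murmur2), assuming a topic with 2 partitions as in this example:

// Rough sketch of key-based partition selection; Kafka's real partitioner uses a murmur2
// hash of the serialized key, not String.hashCode().
int numPartitions = 2;                       // assumed partition count of replicatedTopic
String key = "1";                            // the key used above: user.getId().toString()
int partition = Math.abs(key.hashCode()) % numPartitions;
System.out.println("key " + key + " maps to partition " + partition);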

(5) Send synchronously

With synchronous sending, the producer blocks until it receives the ack from Kafka confirming the send succeeded. If no ack arrives within about 3 seconds, the producer retries the send, up to 3 times.

package com.allen.kafka;

import com.alibaba.fastjson.JSON;
import com.allen.entity.User;
import org.apache.kafka.clients.producer.*;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;
import java.util.concurrent.ExecutionException;

/**
 * @ClassName: MySimpleProducer
 * @Author: AllenSun
 * @Date: 2022/12/23 1:01 PM
 */
public class MySimpleProducer02 {

    private final static String TOPIC_NAME = "replicatedTopic";

    public static void main(String[] args) throws ExecutionException, InterruptedException {
        // (1) Set the client parameters
        Properties properties = new Properties();
        properties.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG,
                "192.168.19.11:9092,192.168.19.11:9093,192.168.19.11:9094");

        // Serialize the record key from String to a byte array
        properties.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        // Serialize the record value from String to a byte array
        properties.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        // (2) Create the producer client with these parameters
        Producer<String, String> producer = new KafkaProducer<String, String>(properties);

        User user = new User();
        user.setId(1);
        user.setNickname("zhangsan");
        user.setPassword("admin");
        user.setSex(1);
        user.setBirthday("2018-11-11");

        // (3) Create the record
        // key: decides which partition the record goes to; value: the message payload
        // If no partition is specified, the partition is derived from a hash of the key
        ProducerRecord<String, String> producerRecord = new ProducerRecord<String, String>(TOPIC_NAME,
                user.getId().toString(), JSON.toJSONString(user));

        // (4) Send the record synchronously and print the returned metadata
        try {
            // get() blocks until the send is acknowledged
            RecordMetadata metadata = producer.send(producerRecord).get();
            System.out.println("Synchronous send result: " + "topic->" + metadata.topic()
                    + "|partition->" + metadata.partition() + "|offset->" + metadata.offset());
        } catch (InterruptedException e) {
            e.printStackTrace();
            // 1. Log the failure and raise an alert
            // 2. Wait 1 s and send synchronously once more; if it still fails, alert and hand over to a human
            Thread.sleep(1000);
            try {
                // Blocking retry
                RecordMetadata metadata = producer.send(producerRecord).get();
            } catch (Exception e1) {
                // Manual intervention needed
            }
        } catch (ExecutionException e) {
            e.printStackTrace();
        }
    }
}

(6) Send asynchronously

With asynchronous sending, the producer continues with its own business logic right after handing the message over; once the broker has received the message, it invokes the callback supplied by the producer asynchronously.

package com.allen.kafka;

import com.alibaba.fastjson.JSON;
import com.allen.entity.User;
import org.apache.kafka.clients.producer.*;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;
import java.util.concurrent.ExecutionException;

/**
 * @ClassName: MySimpleProducer
 * @Author: AllenSun
 * @Date: 2022/12/23 1:01 PM
 */
public class MySimpleProducer02 {

    private final static String TOPIC_NAME = "replicatedTopic";

    public static void main(String[] args) throws ExecutionException, InterruptedException {
        // (1) Set the client parameters
        Properties properties = new Properties();
        properties.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG,
                "192.168.19.11:9092,192.168.19.11:9093,192.168.19.11:9094");

        // Serialize the record key from String to a byte array
        properties.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        // Serialize the record value from String to a byte array
        properties.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        // (2) Create the producer client with these parameters
        Producer<String, String> producer = new KafkaProducer<String, String>(properties);

        User user = new User();
        user.setId(1);
        user.setNickname("zhangsan");
        user.setPassword("admin");
        user.setSex(1);
        user.setBirthday("2018-11-11");

        // (3) Create the record
        // key: decides which partition the record goes to; value: the message payload
        // If no partition is specified, the partition is derived from a hash of the key
        ProducerRecord<String, String> producerRecord = new ProducerRecord<String, String>(TOPIC_NAME,
                user.getId().toString(), JSON.toJSONString(user));

        // (5) Send the record asynchronously
        producer.send(producerRecord, new Callback() {
            @Override
            public void onCompletion(RecordMetadata recordMetadata, Exception e) {
                if (e != null) {
                    System.err.println("Send failed: " + e.getMessage());
                }
                if (recordMetadata != null) {
                    System.out.println("Asynchronous send result: " + "topic->" + recordMetadata.topic()
                            + "|partition->" + recordMetadata.partition() + "|offset->" + recordMetadata.offset());
                }
            }
        });
        Thread.sleep(1000000000L);
    }
}

In practice, synchronous sending is used more often. With asynchronous sending, the code moves on without waiting for the ack, so messages may be lost.

(7) Producer ack configuration

If the producer receives no ack after sending a message, it blocks for 3 seconds; if the ack still has not arrived it retries, up to 3 times. There are three possible values for the acks parameter:
(1) acks=0: the producer considers the send complete without waiting for any broker to persist the message. This is the fastest option but also the easiest way to lose messages.
(2) acks=1: the ack is returned once one replica (the leader among the replicas) has received the message and written it to its local log. This is the best balance between performance and safety.
(3) acks=-1 / all: the ack is returned only after the leader has written the message to its local log and all in-sync followers have replicated it to theirs. This works together with min.insync.replicas (default 1, recommended >= 2). This mode is the safest but also the slowest.


// Message durability (acks) setting
properties.put(ProducerConfig.ACKS_CONFIG,"1");
// Retry on send failure, up to 3 times
properties.put(ProducerConfig.RETRIES_CONFIG,"3");
// Retry backoff interval; the default is 100 ms
properties.put(ProducerConfig.RETRY_BACKOFF_MS_CONFIG,300);

(8) Producer local buffer configuration


// Local send buffer: messages are first written to this buffer, which improves send performance.
// The default is 33554432 bytes, i.e. 32 MB.
properties.put(ProducerConfig.BUFFER_MEMORY_CONFIG,33554432);
// A local sender thread pulls data from the buffer and sends it to the broker in batches.
// The default batch size is 16384 bytes (16 KB); a batch is sent as soon as it is full.
properties.put(ProducerConfig.BATCH_SIZE_CONFIG,16384);
// The default is 0: messages must be sent immediately, which hurts throughput.
// A typical value is about 10 ms: a message first goes into a local batch;
// if the batch fills up to 16 KB within 10 ms it is sent together with the batch,
// otherwise the batch is sent anyway after 10 ms so messages are not delayed too long.
properties.put(ProducerConfig.LINGER_MS_CONFIG,10);

(9) Complete asynchronous sending example

package com.allen.kafka;

import com.alibaba.fastjson.JSON;
import com.allen.entity.User;
import org.apache.kafka.clients.producer.*;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;

/**
 * @ClassName: MySimpleProducer
 * @Author: AllenSun
 * @Date: 2022/12/23 1:01 PM
 */
public class MySimpleProducer03 {

    private final static String TOPIC_NAME = "replicatedTopic";

    public static void main(String[] args) throws ExecutionException, InterruptedException {
        // (1) Set the client parameters
        Properties properties = new Properties();
        properties.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG,
                "192.168.19.11:9092,192.168.19.11:9093,192.168.19.11:9094");

        // Message durability (acks) setting
        properties.put(ProducerConfig.ACKS_CONFIG, "1");
        // Retry on send failure, up to 3 times
        properties.put(ProducerConfig.RETRIES_CONFIG, "3");
        // Retry backoff interval (default 100 ms)
        properties.put(ProducerConfig.RETRY_BACKOFF_MS_CONFIG, 300);

        // Local send buffer: messages go to this buffer first; default 33554432 bytes (32 MB)
        properties.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 33554432);
        // A sender thread pulls from the buffer and sends in batches; default batch size 16384 bytes (16 KB)
        properties.put(ProducerConfig.BATCH_SIZE_CONFIG, 16384);
        // Default is 0 (send immediately); 10 ms lets messages be batched without too much latency
        properties.put(ProducerConfig.LINGER_MS_CONFIG, 10);

        // Serialize the record key from String to a byte array
        properties.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        // Serialize the record value from String to a byte array
        properties.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        // (2) Create the producer client with these parameters
        Producer<String, String> producer = new KafkaProducer<String, String>(properties);

        int msgNum = 5;
        final CountDownLatch countDownLatch = new CountDownLatch(msgNum);
        for (int i = 1; i <= msgNum; i++) {
            User user = new User();
            user.setId(1);
            user.setNickname("zhangsan");
            user.setPassword("admin");
            user.setSex(1);
            user.setBirthday("2018-11-11");

            // (3) Create the record
            // key: decides which partition the record goes to; value: the message payload
            // If no partition is specified, the partition is derived from a hash of the key
            ProducerRecord<String, String> producerRecord = new ProducerRecord<String, String>(TOPIC_NAME,
                    user.getId().toString(), JSON.toJSONString(user));

            // (5) Send the record asynchronously
            producer.send(producerRecord, new Callback() {
                @Override
                public void onCompletion(RecordMetadata recordMetadata, Exception e) {
                    if (e != null) {
                        System.err.println("Send failed: " + e.getMessage());
                    }
                    if (recordMetadata != null) {
                        System.out.println("Asynchronous send result: " + "topic->" + recordMetadata.topic()
                                + "|partition->" + recordMetadata.partition() + "|offset->" + recordMetadata.offset());
                    }
                    countDownLatch.countDown();
                }
            });
        }

        countDownLatch.await(5, TimeUnit.SECONDS);
        producer.close();
    }
}


【2】Consumer

(1) Basic implementation of consuming messages

package com.allen.kafka;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

import java.time.Duration;
import java.util.Arrays;
import java.util.Properties;

/**
 * @ClassName: MyConsumer
 * @Author: AllenSun
 * @Date: 2022/12/23 8:52 PM
 */
public class MySimpleConsumer01 {

    private final static String TOPIC_NAME = "replicatedTopic";
    private final static String CONSUMER_GROUP_NAME = "replicatedGroup";

    public static void main(String[] args) {
        Properties properties = new Properties();
        properties.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG,
                "192.168.19.11:9092,192.168.19.11:9093,192.168.19.11:9094");

        // Consumer group name
        properties.put(ConsumerConfig.GROUP_ID_CONFIG, CONSUMER_GROUP_NAME);
        properties.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        properties.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        KafkaConsumer<String, String> consumer = new KafkaConsumer<String, String>(properties);
        // Subscribe to the list of topics
        consumer.subscribe(Arrays.asList(TOPIC_NAME));

        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(1000));
            for (ConsumerRecord<String, String> record : records) {
                System.out.printf("Received message: partition=%d, offset=%d, key=%s, value=%s%n",
                        record.partition(), record.offset(), record.key(), record.value());
            }
        }
    }
}

Start the consumer first, then start the producer and send 5 messages to Kafka.

You can see that the producer sent 5 messages: 3 went to partition 1 and 2 went to partition 0.

(2) Automatically committing offsets

Set the auto-commit parameters:

// Whether to auto-commit offsets; the default is true
properties.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG,"true");
// Interval between automatic offset commits
properties.put(ConsumerConfig.AUTO_COMMIT_INTERVAL_MS_CONFIG,"1000");

After the consumer polls messages, by default it automatically commits the consumed offset of each topic-partition to the broker's __consumer_offsets topic.

Auto-commit can lose messages: the offset may be committed before the polled messages have actually been processed. If the consumer crashes at that point, the next consumer starts from the position after the committed offset, and the messages that were polled but never processed are lost.

(3) Committing offsets manually

(1) Set the manual-commit parameter (disable auto-commit)

properties.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG,"false");

(2) Commit manually after the messages have been consumed
1 - Manual synchronous commit
After consuming the messages, call the synchronous commit method. The call blocks until the cluster returns the ack; once the ack is returned the commit has succeeded and the code continues with whatever logic follows.

while (true) {
    ConsumerRecords<String,String> records = consumer.poll(Duration.ofMillis(1000));
    for (ConsumerRecord<String,String> record:records) {
        System.out.printf("Received message: partition=%d, offset=%d, key=%s, value=%s%n",record.partition(),
                record.offset(),record.key(),record.value());
    }

    // All polled messages have been consumed
    if(records.count()>0){
        // Manual synchronous commit: the current thread blocks until the offset commit succeeds
        // Synchronous commit is the usual choice, because there is normally no further logic after the commit
        consumer.commitSync(); // blocks until the commit succeeds
    }
}

2 - Manual asynchronous commit
Commit after the messages have been consumed without waiting for the cluster's ack; the subsequent logic runs immediately, and a callback can be registered for the cluster to invoke.

while (true) {
    ConsumerRecords<String,String> records = consumer.poll(Duration.ofMillis(1000));
    for (ConsumerRecord<String,String> record:records) {
        System.out.printf("Received message: partition=%d, offset=%d, key=%s, value=%s%n",record.partition(),
                record.offset(),record.key(),record.value());
    }

    // All polled messages have been consumed
    if(records.count()>0){
        // Manual asynchronous commit: the current thread does not block and can continue with later logic
        consumer.commitAsync(new OffsetCommitCallback() {
            @Override
            public void onComplete(Map<TopicPartition, OffsetAndMetadata> map, Exception e) {
                if(e!=null){
                    System.err.println("Commit failed for "+ map);
                    System.err.println("Commit failed exception: "+ e);
                }
            }
        });
    }
}

(4) Long polling for messages

(1) By default, the consumer polls at most 500 messages at a time

// The consumer keeps a long-lived connection to the broker and polls messages; by default one poll fetches up to 500 records
properties.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG,500);

(2) The long-poll timeout set in the code is 1000 milliseconds

while (true) {
    // If no messages have been polled yet, poll() keeps waiting, up to the 1 s timeout
    // Once 1 s has elapsed, this long poll ends and returns whatever was fetched
    ConsumerRecords<String,String> records = consumer.poll(Duration.ofMillis(1000));
    for (ConsumerRecord<String,String> record:records) {
        System.out.printf("Received message: partition=%d, offset=%d, key=%s, value=%s%n",record.partition(),
                record.offset(),record.key(),record.value());
    }
}

(3) Process description
1 - If a single poll returns 500 messages, the for loop runs over all of them immediately.
2 - If fewer than 500 messages have been fetched and the 1 s timeout has not elapsed, the long poll keeps fetching until either 500 messages are collected or 1 s is up.
3 - If 500 messages are still not reached when the 1 s timeout expires, poll returns whatever was fetched and the for loop runs.

(4) Kicking a slow consumer out of the consumer group
If the interval between two polls exceeds 30 s, the cluster considers the consumer's processing ability too weak and removes it from the group, triggering a rebalance, which is expensive. To avoid this, either reduce the number of records fetched per poll (max.poll.records) or tune the interval with the setting below.

// Tune this according to how fast the consumer processes records: if the interval between
// two polls exceeds 30 s, Kafka considers the consumer too weak and kicks it out of the group
properties.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG,30*1000);

(5) Consumer health checks
The consumer sends a heartbeat to the Kafka cluster every second. If the cluster sees no heartbeat from a consumer for more than 10 s, that consumer is removed from the group and a rebalance is triggered, handing its partitions to the other consumers in the group.

// Interval between consumer heartbeats
properties.put(ConsumerConfig.HEARTBEAT_INTERVAL_MS_CONFIG,1000);
// If Kafka receives no heartbeat from a consumer for more than 10 s, it removes the consumer from
// the group and rebalances, assigning its partitions to the other consumers
properties.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG,10*1000);

(5) Consuming from a specified partition

Consumption starts from the most recent offset of the assigned partition.

// Assign a specific partition to the consumer
consumer.assign(Arrays.asList(new TopicPartition(TOPIC_NAME,0)));


(6) Backtracking consumption (from the beginning)

// Assign a specific partition to the consumer
consumer.assign(Arrays.asList(new TopicPartition(TOPIC_NAME,0)));
// Assign the partition and consume it from the beginning
consumer.seekToBeginning(Arrays.asList(new TopicPartition(TOPIC_NAME,0)));

(7) Consuming from a specified offset

// Assign a specific partition to the consumer
consumer.assign(Arrays.asList(new TopicPartition(TOPIC_NAME,0)));
// Consume from the specified offset
consumer.seek(new TopicPartition(TOPIC_NAME,0),10);

(8) Consuming from a specified point in time

The snippet below again seeks to a fixed offset; seeking by timestamp itself is shown in the sketch that follows.

// Assign a specific partition to the consumer
consumer.assign(Arrays.asList(new TopicPartition(TOPIC_NAME,0)));
// Consume from the specified offset
consumer.seek(new TopicPartition(TOPIC_NAME,0),10);
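A minimal sketch of seeking by timestamp with the plain Kafka client, assuming the same TOPIC_NAME and consumer as above (extra imports needed: java.util.HashMap, java.util.Map, org.apache.kafka.common.TopicPartition, org.apache.kafka.clients.consumer.OffsetAndTimestamp); the one-hour window is just an example value:

// Find the earliest offset whose timestamp is >= the target time, then seek to it
long oneHourAgo = System.currentTimeMillis() - 1000 * 60 * 60;   // example: one hour ago
TopicPartition partition0 = new TopicPartition(TOPIC_NAME, 0);

consumer.assign(Arrays.asList(partition0));

Map<TopicPartition, Long> timestampsToSearch = new HashMap<>();
timestampsToSearch.put(partition0, oneHourAgo);

Map<TopicPartition, OffsetAndTimestamp> offsets = consumer.offsetsForTimes(timestampsToSearch);
OffsetAndTimestamp offsetAndTimestamp = offsets.get(partition0);
if (offsetAndTimestamp != null) {
    // Seek to the first offset at or after the target timestamp
    consumer.seek(partition0, offsetAndTimestamp.offset());
}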

(9) Offset rules for a new consumer group

By default, when a consumer in a new consumer group starts, it begins consuming from offset+1 of the last message already in the partition, i.e. it only consumes new messages. The setting below makes a new group consume from the beginning the first time it starts; after that it consumes new messages (from the last committed offset + 1).

properties.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG,"earliest");

(1) latest: the default; only new messages are consumed
(2) earliest: consume from the beginning the first time, then consume new messages afterwards (last committed offset + 1)

(2) Spring Boot integration with Kafka

【1】Basic implementation case

(1) Add the dependencies

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>2.7.7</version>
        <relativePath/> <!-- lookup parent from repository -->
    </parent>
    <groupId>com.allen</groupId>
    <artifactId>kafka_springboot</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <name>kafka_springboot</name>
    <description>Demo project for Spring Boot</description>
    <properties>
        <java.version>1.8</java.version>
    </properties>
    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter</artifactId>
        </dependency>

        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>

        <dependency>
            <groupId>org.apache.kafka</groupId>
            <artifactId>kafka-clients</artifactId>
            <version>3.1.0</version>
        </dependency>
        <dependency>
            <groupId>org.springframework.kafka</groupId>
            <artifactId>spring-kafka</artifactId>
        </dependency>
        <dependency>
            <groupId>com.alibaba</groupId>
            <artifactId>fastjson</artifactId>
            <version>1.2.54</version>
        </dependency>
        <dependency>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-simple</artifactId>
            <version>1.7.25</version>
        </dependency>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>4.11</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>

    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
            </plugin>
        </plugins>
    </build>

</project>

(2) YAML configuration file

server:
  port: 8088
spring:
  kafka:
    bootstrap-servers: 192.168.19.11:9092,192.168.19.11:9093,192.168.19.11:9094
    producer:
      retries: 3 # with a value greater than 0, the client re-sends records whose send failed
      batch-size: 16384
      buffer-memory: 33554432
      acks: 1
      key-serializer: org.apache.kafka.common.serialization.StringSerializer
      value-serializer: org.apache.kafka.common.serialization.StringSerializer
    consumer:
      group-id: default-group
      enable-auto-commit: false # whether offsets are committed automatically
      auto-offset-reset: earliest
      key-deserializer: org.apache.kafka.common.serialization.StringDeserializer
      value-deserializer: org.apache.kafka.common.serialization.StringDeserializer
      max-poll-records: 500
    listener:
      ack-mode: manual_immediate # the listener acks; the offset is committed immediately on each call
#  redis:
#    host: 192.168.19.11

(3) Producer Controller

package com.allen.kafka_springboot.controller;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

/**
 * @ClassName: MyKafkaController
 * @Author: AllenSun
 * @Date: 2022/12/24 12:42 AM
 */
@RestController
@RequestMapping("/msg")
public class MyKafkaController {

    private final static String TOPIC_NAME = "springbootTopic";

    @Autowired
    private KafkaTemplate<String, String> kafkaTemplate;

    @RequestMapping("/send")
    public String sendMessage() {
        kafkaTemplate.send(TOPIC_NAME, 0, "key", "this is message");
        return "send success!";
    }
}
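kafkaTemplate.send also returns a future, so the send result can be handled asynchronously instead of fire-and-forget. A sketch assuming the Spring Kafka version pulled in by this Spring Boot 2.7 parent, where send returns a ListenableFuture (extra imports: org.springframework.kafka.support.SendResult, org.springframework.util.concurrent.ListenableFuture); the /sendAsync mapping is illustrative:

// Inside the same controller: send asynchronously and log the result in callbacks
@RequestMapping("/sendAsync")
public String sendMessageAsync() {
    ListenableFuture<SendResult<String, String>> future =
            kafkaTemplate.send(TOPIC_NAME, 0, "key", "this is an async message");
    future.addCallback(
            result -> System.out.println("sent to partition " + result.getRecordMetadata().partition()
                    + ", offset " + result.getRecordMetadata().offset()),
            ex -> System.err.println("send failed: " + ex.getMessage()));
    return "send submitted!";
}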

(4) Consumer

package com.allen.kafka_springboot.consumer;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.kafka.support.Acknowledgment;
import org.springframework.stereotype.Component;

/**
 * @ClassName: MyConsumer
 * @Author: AllenSun
 * @Date: 2022/12/24 12:48 AM
 */
@Component
public class MyConsumer {

    @KafkaListener(topics = "javaTopic", groupId = "replicatedGroup")
    public void listenGroup(ConsumerRecord<String, String> record, Acknowledgment ack) {
        String value = record.value();
        System.out.println(value);
        System.out.println(record);
        // Manually commit the offset
        ack.acknowledge();
    }
}

【2】Consumer configuration details

package com.allen.kafka_springboot.consumer;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.kafka.annotation.PartitionOffset;
import org.springframework.kafka.annotation.TopicPartition;
import org.springframework.kafka.support.Acknowledgment;
import org.springframework.stereotype.Component;

/**
 * @ClassName: MyConsumer
 * @Author: AllenSun
 * @Date: 2022/12/24 12:48 AM
 */
@Component
public class MyConsumer {

    @KafkaListener(groupId = "replicatedGroup", topicPartitions = {
            @TopicPartition(topic = "javaTopic", partitions = {"0", "1"}),
            @TopicPartition(topic = "replicatedTopic", partitions = "0",
                    partitionOffsets = @PartitionOffset(partition = "1", initialOffset = "100"))
    }, concurrency = "3") // concurrency is the number of consumers in this group (concurrent consumers); keep it <= the number of partitions
    public void listenGroup01(ConsumerRecord<String, String> record, Acknowledgment ack) {
        String value = record.value();
        System.out.println(value);
        System.out.println(record);
        // Manually commit the offset
        ack.acknowledge();
    }
}

(3) Kafka cluster

【1】Controller

Each broker in a Kafka cluster creates an ephemeral sequential node in ZooKeeper, and the broker whose node has the smallest sequence number (the first one created) becomes the cluster controller. The controller manages the state of all partitions and replicas in the cluster:
(1) When the leader replica of a partition fails, the controller elects a new leader replica for that partition.
(2) When it detects a change in a partition's ISR set, the controller notifies all brokers to update their metadata.
(3) When the kafka-topics.sh script is used to increase the number of partitions of a topic, the controller makes the other nodes aware of the new partitions.

【2】Rebalance

The precondition: consumers are not assigned partitions explicitly. When the relationship between the consumers in a group and the partitions changes, the rebalance mechanism is triggered and re-decides which consumer consumes which partitions.

Before a rebalance is triggered, there are three strategies for deciding which partitions a consumer consumes (a configuration sketch follows this list):
(1) range: the partitions for each consumer are computed by a formula, giving each consumer a contiguous range
(2) round-robin: partitions are handed out to the consumers in turn
(3) sticky: after a rebalance, the assignment is adjusted while keeping each consumer's existing partitions unchanged as far as possible
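The strategy is chosen on the consumer side through the partition.assignment.strategy setting. A sketch using the plain-client properties from earlier; RangeAssignor is the default, and the class names below are the assignors shipped with the Kafka client:

// Choose the partition assignment strategy used during rebalances (RangeAssignor is the default).
// Other built-in assignors: RoundRobinAssignor, StickyAssignor, CooperativeStickyAssignor.
properties.put(ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG,
        "org.apache.kafka.clients.consumer.StickyAssignor");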

【3】HW and LEO mechanism

HW stands for HighWatermark, commonly called the high water mark. It is the smallest LEO (log-end-offset) among the replicas in a partition's ISR, and a consumer can consume at most up to the HW position. Every replica has its own HW, and the leader and the followers each update their own HW. A message newly written by the leader cannot be consumed until it has been replicated by the ISR replicas and the HW has advanced past it.

(4) Optimizing Kafka problems in production

【1】How to prevent message loss

(1) Sender: set acks to 1 or -1/all to prevent message loss. For 99.9999% durability, set acks to all and configure min.insync.replicas to match the number of partition replicas (a combined sketch follows this list).
(2) Consumer: switch offset commits from automatic to manual.
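Put together, the relevant settings might look like the sketch below, reusing the producer and consumer Properties objects from the earlier examples (named producerProps and consumerProps here only for clarity); min.insync.replicas itself is a broker- or topic-level setting rather than a client property:

// Producer properties (the Properties object passed to KafkaProducer)
producerProps.put(ProducerConfig.ACKS_CONFIG, "all");
producerProps.put(ProducerConfig.RETRIES_CONFIG, "3");

// Consumer properties (the Properties object passed to KafkaConsumer): turn off auto-commit
consumerProps.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");
// ... after the polled records have been processed:
// consumer.commitSync();

// Broker/topic level (not a client property): set min.insync.replicas to 2 or to the replica count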

【2】How to prevent repeated consumption of messages

A message can be consumed more than once. If, to avoid duplicate consumption, the producer's retry mechanism were turned off and the consumer's manual commits changed back to automatic commits, messages would be lost instead. The better approach is to keep the loss-prevention measures and make the message-handling logic idempotent, which solves duplicate consumption.

How to ensure idempotency (a sketch of option 1 follows):
(1) Insert into MySQL with the business id as the primary key; since the primary key is unique, the same record can only be inserted once.
(2) Use a redis or zk distributed lock (the mainstream solution).
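A minimal sketch of idea (1), using a table whose primary key is the business id; the table name, column, and helper method are illustrative, not from the original examples (imports: java.sql.Connection, java.sql.PreparedStatement, java.sql.SQLException, java.sql.SQLIntegrityConstraintViolationException):

// Insert the business id into a table whose primary key is that id: the first insert succeeds,
// any duplicate violates the primary key, so the caller can skip the business logic.
boolean tryMarkProcessed(Connection conn, String businessId) throws SQLException {
    String sql = "INSERT INTO t_processed_message (business_id) VALUES (?)";
    try (PreparedStatement ps = conn.prepareStatement(sql)) {
        ps.setString(1, businessId);
        ps.executeUpdate();
        return true;     // first time this id is seen: process the message
    } catch (SQLIntegrityConstraintViolationException e) {
        return false;    // duplicate key: the message was already processed, skip it
    }
}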

【3】How to achieve sequential consumption

(1) Sender: set acks to 0, turn off retries, send synchronously, and wait for each send to succeed before sending the next one, so that messages are sent in order (a short sketch follows).
(2) Receiver: the messages are sent to a single partition, and only one consumer in the group consumes that partition. Kafka's ordered consumption therefore sacrifices some performance.
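On the sending side, a short sketch of keeping one business key in order, reusing the producer created earlier: every record uses the same key, so all of them land on the same partition, and get() forces each send to finish before the next one starts (the orderId key is illustrative; the enclosing method must declare throws InterruptedException, ExecutionException):

// All records that share a key go to the same partition, and get() forces each send
// to complete before the next one, so they are appended in order.
String orderId = "order-1001";    // illustrative business key
for (int step = 1; step <= 3; step++) {
    ProducerRecord<String, String> record =
            new ProducerRecord<>(TOPIC_NAME, orderId, "order " + orderId + " step " + step);
    producer.send(record).get();  // synchronous: wait for the ack before sending the next record
}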

【4】Solving a message backlog

A message backlog can cause many problems: the disk fills up because messages arrive faster than they are consumed, Kafka slows down, and service avalanches become likely. Possible countermeasures:
(1) Solution 1: start multiple threads so that several threads consume at the same time, which increases the capacity of a single consumer (a sketch follows this list).
(2) Solution 2: if solution 1 is not enough, start multiple consumers and deploy them on different servers. Deploying several consumers on the same server also helps, since it makes full use of the server's CPU.
(3) Solution 3: have one consumer forward the received messages to another topic with many partitions and many consumers that do the actual business processing.
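A rough sketch of solution 1 with the consumer from earlier: the poll loop stays on one thread and hands each record to a thread pool. This is only an illustration; offsets are committed right after dispatching, so records still being processed can be lost on a crash, and a real implementation must track completion before committing. handle() stands in for the business logic (imports: java.util.concurrent.ExecutorService, java.util.concurrent.Executors):

// Simplified sketch: one polling thread, a pool of worker threads doing the actual processing
ExecutorService workers = Executors.newFixedThreadPool(8);
while (true) {
    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(1000));
    for (ConsumerRecord<String, String> record : records) {
        workers.submit(() -> handle(record));   // handle() is the business logic (illustrative)
    }
    if (records.count() > 0) {
        consumer.commitSync();                  // commits before the workers finish; see the caveat above
    }
}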

【5】Delay Queue

Use case for a delay queue: if an order has not been paid within 30 minutes of being created, the order must be cancelled. A delay queue can implement this (a consumer sketch follows this list):
(1) Create multiple topics, one per delay interval:
1 - topic_5s: queue whose messages are processed after a 5 s delay
2 - topic_1m: queue whose messages are processed after a 1 minute delay
3 - topic_30m: queue whose messages are processed after a 30 minute delay
(2) The sender writes the message to the topic for the desired delay, together with the message's send time.
(3) The consumer subscribes to the corresponding topic and polls its messages:
1 - if the difference between the message's send time and the current time exceeds the preset value (e.g. 30 minutes), the message is processed;
2 - if it does not exceed the preset value, the message at the current offset and all later offsets are not consumed;
3 - the next poll starts from this offset again and re-checks whether the delay has elapsed.
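A rough sketch of the consumer side for one delay topic (topic_30m here), assuming a single assigned partition and using record.timestamp() as the send time; cancelUnpaidOrder() and the constants are illustrative (extra imports: java.util.Collections, org.apache.kafka.common.TopicPartition, org.apache.kafka.clients.consumer.OffsetAndMetadata):

// Delay-queue consumer sketch for topic_30m: process only messages older than 30 minutes,
// otherwise rewind to the first not-yet-due offset and check again on the next poll.
long delayMs = 30L * 60 * 1000;
TopicPartition tp = new TopicPartition("topic_30m", 0);
consumer.assign(Arrays.asList(tp));

while (true) {
    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(1000));
    for (ConsumerRecord<String, String> record : records) {
        if (System.currentTimeMillis() - record.timestamp() >= delayMs) {
            cancelUnpaidOrder(record.value());   // due: run the business logic (illustrative)
            // Commit exactly up to this record (offset + 1 is the next record to read)
            consumer.commitSync(Collections.singletonMap(tp, new OffsetAndMetadata(record.offset() + 1)));
        } else {
            // Not due yet: rewind to this offset and stop handling the rest of this batch
            consumer.seek(tp, record.offset());
            break;
        }
    }
}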

(5) Setting up and using the kafka-eagle monitoring platform
