How Spring-Kafka implements batch consumption of messages without losing data

The answer first:

		// Batch consumption settings: 1) batch listener, 2) manual offset commit
		factory.setBatchListener(true);
		factory.getContainerProperties().setAckMode(AbstractMessageListenerContainer.AckMode.MANUAL_IMMEDIATE);

		// Increase the fetch-related parameters to raise throughput, at the cost of higher latency
		// Maximum number of records returned by a single poll
		propsMap.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, maxPollRecords); // max.poll.records, default 500
		// Minimum number of bytes for a single fetch; if fewer bytes are available, the broker waits
		// until the timeout before responding to the consumer. 100 kB here; the default is 1 B
		propsMap.put(ConsumerConfig.FETCH_MIN_BYTES_CONFIG, 1024 * 100); // fetch.min.bytes
		// Maximum wait time for a single fetch; the broker responds as soon as either the
		// "maximum wait time" or the "minimum bytes" condition is met
		// Note: the "maximum wait time" must not exceed session.timeout.ms or request.timeout.ms
		propsMap.put(ConsumerConfig.FETCH_MAX_WAIT_MS_CONFIG, 10000); // fetch.max.wait.ms, default 500

		// Inject Acknowledgment into the consumer method and call acknowledge() manually
		// after the business logic has finished
		acknowledgment.acknowledge();

1. Background:

A certain business object spans multiple tables, so creating the object requires inserting rows into several tables. canal-based monitoring therefore produces multiple change records for the same object, and the Kafka consumer ends up handling the same change several times. For one object (different tables, but different parts of the same object), the original consumer processed one record at a time, which caused the same object to be processed repeatedly. In fact the object only needs to be processed once, after all of its tables have been inserted.

2. Existing technical architecture:

mysql --> canal --> Kafka --> Spring-Kafka consumer --> downstream interface

3. Solution:

Optimize the Spring-Kafka consumer: switch from processing records one at a time to processing them in batches, then merge records belonging to the same object inside the consumer, so that each object is processed only once (at most twice). Why possibly twice? Because with batching, messages for the same object can easily be split across two adjacent batches, in which case that object is processed twice. Why not three times? Three or more passes can happen as well if the batch size is small, so we normally set the batch size larger than the number of tables involved.

For example: creating an object produces insert messages for 10 tables, and the batch size is set to 4. The 10 messages are then split into 3 batches of at most 4, the object is processed once per batch, and so the object ends up being processed 3 times. The batch size is therefore an important parameter and is usually set to a fairly large value; but no matter how large it is, a set of messages can still straddle two batches. A sketch of the merge step is shown below.
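A minimal sketch of such a merge step inside a batch listener, assuming each record carries a key that identifies the business object; extractObjectId() and processObject() are hypothetical helpers standing in for your own business logic:

	public void mergeAndProcess(List<ConsumerRecord<String, String>> batch) {
	    // Group every change record of the same object so each object is handled once per batch
	    Map<String, List<ConsumerRecord<String, String>>> byObject = batch.stream()
	            .collect(Collectors.groupingBy(record -> extractObjectId(record)));
	    // Call the downstream interface once per distinct object
	    byObject.keySet().forEach(objectId -> processObject(objectId));
	}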

4. Implementation steps

Spring-Kafka has supported batch consumption since version 1.1. You need to set batchListener=true on the container factory,
and set the consumer parameter max.poll.records to control the maximum number of records in a batch. The default value of this parameter is 500.

The relevant source of the AbstractKafkaListenerContainerFactory class is as follows:

	/**
	 * Set to true if this endpoint should create a batch listener.
	 * @param batchListener true for a batch listener.
	 * @since 1.1
	 */
	public void setBatchListener(Boolean batchListener) {
		this.batchListener = batchListener;
	}

You can tell from the Javadoc @since tag that this capability has existed since version 1.1, so any Spring-Kafka version above 1.1 supports batch consumption. This setting is used in combination with @KafkaListener. The single-record consumption style looks like this:

@KafkaListener(id = "", groupId = "", topics = {})
public void listen(ConsumerRecord<String, String> data) {
}

The above can only process one message at a time and cannot treat multiple messages as a whole, so the batch consumption style below is used instead:

@KafkaListener(id = "", groupId = "", topics = {}, containerFactory = "")
public void listen(List<ConsumerRecord<String, String>> datas) {
}

As the code above shows, switching to batch consumption only requires changing the parameter type from ConsumerRecord to List<ConsumerRecord>.
Is that enough? With a larger batch, processing the batch takes longer, which makes data loss easy. The scenario is as follows:
1. Automatic offset committing is enabled.
2. The auto-commit interval is 1 s.
3. Processing the batch takes 2 s.
For messages to actually be lost, the following must also happen:
1. The auto-commit timer fires and the offsets are committed successfully.
2. At almost the same moment the program hits a serious error and the process exits, before the consumer's business logic has finished.
The messages are then lost: Kafka has received the offset commit, so it considers this batch of messages successfully processed, while the program in fact has not finished processing them. When the program is restarted it resumes from the offset recorded in Kafka, and that "recorded offset" is the offset committed just before the abnormal exit. The batch that was in flight at the time of the crash has therefore been skipped and will never be consumed again.

Example:
Suppose the offsets of the messages in the current batch are 5, 6 and 7. When the offsets are auto-committed, Kafka records that consumption should continue from 8. If the consumer process is killed before messages 5, 6 and 7 have been processed, then after the fault is fixed and the program restarted, consumption resumes from offset 8, because Kafka believes 5, 6 and 7 were already handled. In reality they were not processed successfully, so those three messages are lost.
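For reference, the at-risk auto-commit setup described above corresponds roughly to the following consumer properties (the values are illustrative, matching the 1 s interval in the scenario):

	// Illustrative only: the auto-commit setup that can lose messages when a batch
	// takes longer to process than the commit interval.
	propsMap.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, true);      // enable.auto.commit
	propsMap.put(ConsumerConfig.AUTO_COMMIT_INTERVAL_MS_CONFIG, 1000); // auto.commit.interval.ms = 1 s
	// If processing a batch takes ~2 s, offsets for records that are still being
	// processed may already have been committed when the process crashes.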

Going further, how do we prevent message loss? The answer is to commit offsets manually, and Spring-Kafka supports this as well. Spring-Kafka is essentially a wrapper around the native Kafka client; at the core, it is the native client that provides the ability to commit offsets manually.
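For context, with the plain Kafka client (outside of Spring, and assuming enable.auto.commit=false) the same idea looks roughly like this: poll a batch, process it, and only then commit the offsets explicitly. handle() below is a hypothetical stand-in for the business logic.

	// Rough sketch with the native KafkaConsumer (not Spring-Kafka):
	// offsets are committed only after the whole batch has been processed.
	ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(1000));
	for (ConsumerRecord<String, String> record : records) {
	    handle(record); // hypothetical business logic
	}
	consumer.commitSync(); // manual offset commit: the capability that Spring-Kafka wraps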

Now for the practical part. Spring-Kafka has a very useful interface, AcknowledgingMessageListener, which supports manually acknowledging messages, i.e. manually committing offsets; Spring simply packages the notion of "manually committing offsets" as "acknowledging messages". It declares the following method:

	/**
	 * Invoked with data from kafka.
	 * @param data the data to be processed.
	 * @param acknowledgment the acknowledgment.
	 */
	@Override
	void onMessage(ConsumerRecord<K, V> data, Acknowledgment acknowledgment);

This interface is meant to be implemented in order to gain manual offset committing, but we can simplify things. Combined with the @KafkaListener approach above, we simply add an Acknowledgment acknowledgment parameter to the @KafkaListener method, and Spring will pass the Acknowledgment object in, so we can decide ourselves when to "acknowledge the message". This step alone is not enough, however: you also have to tell Spring-Kafka that you want to commit offsets manually, via a simple setting:

    @Bean(name = "batch_and_manual_ack_ContainerFactory")
    public KafkaListenerContainerFactory<ConcurrentMessageListenerContainer<String, String>> batch_and_manual_ack_ContainerFactory() {
        ConcurrentKafkaListenerContainerFactory<String, String> factory = new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerFactory());
        // Batch consumption settings: 1) batch listener, 2) manual offset commit
        factory.setBatchListener(true);
        factory.getContainerProperties().setAckMode(AbstractMessageListenerContainer.AckMode.MANUAL_IMMEDIATE);
        return factory;
    }

The key line is .setAckMode(AckMode.MANUAL_IMMEDIATE); the default for this setting is AckMode.BATCH, as can be seen in the source of the ContainerProperties class:

	/**
	 * The ack mode to use when auto ack (in the configuration properties) is false.
	 * <ul>
	 * <li>RECORD: Ack after each record has been passed to the listener.</li>
	 * <li>BATCH: Ack after each batch of records received from the consumer has been
	 * passed to the listener</li>
	 * <li>TIME: Ack after this number of milliseconds; (should be greater than
	 * {@code #setPollTimeout(long) pollTimeout}.</li>
	 * <li>COUNT: Ack after at least this number of records have been received</li>
	 * <li>MANUAL: Listener is responsible for acking - use a
	 * {@link org.springframework.kafka.listener.AcknowledgingMessageListener}.
	 * </ul>
	 */
	private AbstractMessageListenerContainer.AckMode ackMode = AckMode.BATCH;

To improve throughput, you need to set several parameters:

  • max.poll.records
    The maximum number of records returned by a single poll. The default is 500. The larger the value, the greater the throughput, but the consumer must be able to process all of the records without timing out.

  • fetch.min.bytes
    The minimum number of bytes for a single fetch. If fewer bytes are available, the broker waits until the fetch timeout before responding to the consumer. The default is 1 B.

  • fetch.max.wait.ms
    The maximum wait time for a single fetch. The broker responds as soon as either the "maximum wait time" or the "minimum bytes" condition is satisfied. The default is 500.
    Note: the "maximum wait time" must not exceed session.timeout.ms or request.timeout.ms.

5. Complete code:


    @Bean(name = "batch_and_manual_ack_ContainerFactory")
    public KafkaListenerContainerFactory<ConcurrentMessageListenerContainer<String, String>> batch_and_manual_ack_ContainerFactory() {
        ConcurrentKafkaListenerContainerFactory<String, String> factory = new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerFactory());
        factory.setConcurrency(concurrency);
        factory.getContainerProperties().setPollTimeout(1500);
        // Batch consumption settings: 1) batch listener, 2) manual offset commit
        factory.setBatchListener(true);
        factory.getContainerProperties().setAckMode(AbstractMessageListenerContainer.AckMode.MANUAL_IMMEDIATE);
        return factory;
    }

    public ConsumerFactory<String, String> consumerFactory() {
        return new DefaultKafkaConsumerFactory<String, String>(consumerConfigs());
    }

    public Map<String, Object> consumerConfigs() {
        Map<String, Object> propsMap = new HashMap<String, Object>();
        propsMap.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, servers);
        propsMap.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, enableAutoCommit);
        propsMap.put(ConsumerConfig.AUTO_COMMIT_INTERVAL_MS_CONFIG, autoCommitInterval);
        propsMap.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, sessionTimeout);
        propsMap.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        propsMap.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        propsMap.put(ConsumerConfig.GROUP_ID_CONFIG, groupId);
        propsMap.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, autoOffsetReset);

        // Increase the fetch-related parameters to raise throughput, at the cost of higher latency
        // Maximum number of records returned by a single poll
        propsMap.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, maxPollRecords); // max.poll.records, default 500
        // Minimum number of bytes for a single fetch; if fewer bytes are available, the broker waits
        // until the timeout before responding. 100 kB here; the default is 1 B
        propsMap.put(ConsumerConfig.FETCH_MIN_BYTES_CONFIG, 1024 * 100); // fetch.min.bytes
        // Maximum wait time for a single fetch; the broker responds as soon as either the
        // "maximum wait time" or the "minimum bytes" condition is met
        // Note: the "maximum wait time" must not exceed session.timeout.ms or request.timeout.ms
        propsMap.put(ConsumerConfig.FETCH_MAX_WAIT_MS_CONFIG, 10000); // fetch.max.wait.ms, default 500

        return propsMap;
    }

    @KafkaListener(groupId = "junit-test-group", containerFactory = "batch_and_manual_ack_ContainerFactory", topics = {"test"})
    public void test_batchConsume(List<ConsumerRecord<String, String>> datas, Acknowledgment acknowledgment) {
        System.out.println(new Date() + " datas = " + datas.size());
        System.out.println(new Date() + " collect = " + datas.stream().map(t -> t.offset()).collect(Collectors.toList()));
        // Always acknowledge at the end (this persists the offsets to Kafka)
        acknowledgment.acknowledge();
    }

Place the code above in a Spring configuration class, adjust the configuration values, and it is ready to use.
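A minimal sketch of what that Spring class might look like; the property keys in @Value are assumptions about your own application configuration, not something prescribed by Spring-Kafka:

	// A hedged sketch of the surrounding configuration class.
	@Configuration
	@EnableKafka
	public class BatchKafkaConsumerConfig {

	    // Property keys below are assumed examples; use whatever your application defines.
	    @Value("${kafka.bootstrap-servers}")
	    private String servers;

	    @Value("${kafka.consumer.group-id}")
	    private String groupId;

	    // ... inject enableAutoCommit, autoCommitInterval, sessionTimeout,
	    //     autoOffsetReset, maxPollRecords and concurrency the same way

	    // Place the batch_and_manual_ack_ContainerFactory bean, consumerFactory(),
	    // consumerConfigs() and the @KafkaListener method from section 5 here.
	}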

Origin blog.csdn.net/booynal/article/details/131664724