Apache Kafka + Spring's message listening container

1. Reception of messages

Messages can be received by configuring a MessageListenerContainer and providing a message listener, or by using the @KafkaListener annotation. In this chapter, we mainly explain how to receive messages by configuring a MessageListenerContainer and providing a message listener.

1.1. Message listener

When using a message listening container, a listener must be provided to receive the data. There are currently eight interfaces that support message listeners:

public interface MessageListener<K, V> {

    // Use this interface to process individual ConsumerRecord instances received from the
    // Kafka consumer poll() operation when using auto-commit or one of the
    // container-managed commit methods.
    void onMessage(ConsumerRecord<K, V> data);
}

public interface AcknowledgingMessageListener<K, V> {

    // Use this interface to process individual ConsumerRecord instances received from the
    // Kafka consumer poll() operation when using one of the manual commit methods.
    void onMessage(ConsumerRecord<K, V> data, Acknowledgment acknowledgment);
}

public interface ConsumerAwareMessageListener<K, V> extends MessageListener<K, V> {

    // Use this interface to process individual ConsumerRecord instances received from the
    // Kafka consumer poll() operation when using auto-commit or one of the
    // container-managed commit methods. Access to the Consumer object is provided.
    void onMessage(ConsumerRecord<K, V> data, Consumer<?, ?> consumer);

}

public interface AcknowledgingConsumerAwareMessageListener<K, V> extends MessageListener<K, V> {

    // Use this interface to process individual ConsumerRecord instances received from the
    // Kafka consumer poll() operation when using one of the manual commit methods.
    // Access to the Consumer object is provided.
    void onMessage(ConsumerRecord<K, V> data, Acknowledgment acknowledgment, Consumer<?, ?> consumer);

}

public interface BatchMessageListener<K, V> {

    // Use this interface to process all ConsumerRecord instances received from the Kafka
    // consumer poll() operation when using auto-commit or one of the container-managed
    // commit methods. AckMode.RECORD is not supported with this interface, since the
    // listener is given the complete batch.
    void onMessage(List<ConsumerRecord<K, V>> data);

}

public interface BatchAcknowledgingMessageListener<K, V> {

    // Use this interface to process all ConsumerRecord instances received from the Kafka
    // consumer poll() operation when using one of the manual commit methods.
    void onMessage(List<ConsumerRecord<K, V>> data, Acknowledgment acknowledgment);

}

public interface BatchConsumerAwareMessageListener<K, V> extends BatchMessageListener<K, V> {

    // Use this interface to process all ConsumerRecord instances received from the Kafka
    // consumer poll() operation when using auto-commit or one of the container-managed
    // commit methods. AckMode.RECORD is not supported with this interface, since the
    // listener is given the complete batch. Access to the Consumer object is provided.
    void onMessage(List<ConsumerRecord<K, V>> data, Consumer<?, ?> consumer);

}

public interface BatchAcknowledgingConsumerAwareMessageListener<K, V> extends BatchMessageListener<K, V> {

    // Use this interface to process all ConsumerRecord instances received from the Kafka
    // consumer poll() operation when using one of the manual commit methods.
    // Access to the Consumer object is provided.
    void onMessage(List<ConsumerRecord<K, V>> data, Acknowledgment acknowledgment, Consumer<?, ?> consumer);

}

Note: 1. The Consumer object is not thread-safe. 2. You should not call any Consumer<?, ?> methods that affect the consumer's position and/or committed offsets from within the listener; the container needs to manage this information.
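
For example, a minimal record listener with manual acknowledgment (for use with AckMode.MANUAL or MANUAL_IMMEDIATE, covered in the Offset section below) might look like the following sketch; the class name and processing logic are placeholders:

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.springframework.kafka.listener.AcknowledgingMessageListener;
import org.springframework.kafka.support.Acknowledgment;

public class OrderListener implements AcknowledgingMessageListener<Integer, String> {

    @Override
    public void onMessage(ConsumerRecord<Integer, String> record, Acknowledgment acknowledgment) {
        // Process the record; key/value types must match the configured deserializers.
        System.out.printf("topic=%s partition=%d offset=%d value=%s%n",
                record.topic(), record.partition(), record.offset(), record.value());
        // With AckMode.MANUAL or MANUAL_IMMEDIATE, the offset is committed only
        // after acknowledge() is called.
        acknowledgment.acknowledge();
    }
}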

2. Message listener container

2.1. Implementation method

MessageListenerContainer provides two implementations:
1. KafkaMessageListenerContainer
2. ConcurrentMessageListenerContainer

2.1.1. KafkaMessageListenerContainer

2.1.1.1. Basic concepts

KafkaMessageListenerContainer receives all messages from all topics or partitions on a single thread. ConcurrentMessageListenerContainer delegates to one or more KafkaMessageListenerContainer instances to provide multi-threaded consumption.

  • Starting with version 2.2.7, a RecordInterceptor can be added to the listener container; it is called before the listener is invoked, allowing the record to be inspected or modified. If the interceptor returns null, the listener is not called.
  • Starting with version 2.7, the interceptor has additional methods that are called after the listener exits (normally, or by throwing an exception).
  • Batch interceptors provide similar functionality for batch listeners.
  • In addition, ConsumerAwareRecordInterceptor (and BatchInterceptor) provide access to the Consumer<?, ?>. This can be used, for example, to access consumer metrics in an interceptor.
  • CompositeRecordInterceptor and CompositeBatchInterceptor can be used to invoke multiple interceptors.
  • By default, when using transactions, the interceptor is invoked after the transaction has started. Starting with version 2.3.4, the listener container's interceptBeforeTx property can be set to invoke the interceptor before the transaction starts instead.
  • Starting with versions 2.3.8 and 2.4.6, ConcurrentMessageListenerContainer supports static membership when the concurrency is greater than 1. The group.instance.id is suffixed with -n, with n starting at 1. Together with an increased session.timeout.ms, this can be used to reduce rebalance events, for example when application instances are restarted.
  • Static membership is meant to improve the availability of stream applications, consumer groups, and other applications built on top of the group rebalance protocol. The rebalance protocol relies on the group coordinator to assign entity IDs to group members. These generated IDs are short-lived and change when members restart and rejoin. For consumer-based applications, this "dynamic membership" can cause a large portion of tasks to be reassigned to different instances during administrative operations such as code deployments, configuration updates, and periodic restarts. For large stateful applications, shuffled tasks can take a long time to restore their local state before processing resumes, causing the application to be partially or fully unavailable. Motivated by this observation, Kafka's group management protocol allows group members to provide persistent entity IDs. Group membership remains unchanged based on these IDs, so no rebalance is triggered.

Like listeners, interceptors must not invoke any methods that affect the consumer's position and/or committed offsets; the container needs to manage this information.

If the interceptor mutates the record (by creating a new one), the topic, partition, and offset must remain unchanged to avoid unintended side effects such as record loss.
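
As a sketch, assuming a Spring Kafka 2.x version where RecordInterceptor exposes a single-argument intercept(ConsumerRecord) method (newer versions add a variant that also receives the Consumer), an interceptor that skips records with empty values might look like this; returning null means the listener is not invoked for that record:

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.springframework.kafka.listener.RecordInterceptor;

public class EmptyValueFilter implements RecordInterceptor<Integer, String> {

    @Override
    public ConsumerRecord<Integer, String> intercept(ConsumerRecord<Integer, String> record) {
        // Returning null tells the container not to invoke the listener for this record.
        if (record.value() == null || record.value().isEmpty()) {
            return null;
        }
        return record;
    }
}

It can then be registered on the container with container.setRecordInterceptor(new EmptyValueFilter()).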

2.1.1.2. How to use KafkaMessageListenerContainer

  • KafkaMessageListenerContainer constructor

    public KafkaMessageListenerContainer(ConsumerFactory<K, V> consumerFactory,
                      ContainerProperties containerProperties)
    

    The constructor receives a ConsumerFactory and information about topics and partitions, along with other configuration, in a ContainerProperties object.

  • The container properties class (ContainerProperties) has three constructors, which we introduce one by one below.
    1. Take TopicPartitionOffset as a parameter

    public ContainerProperties(TopicPartitionOffset... topicPartitions)
    

    The constructor takes an array of TopicPartitionOffset arguments to explicitly indicate which partitions the container is to use (using the consumer assign() method), with an optional initial offset. By default, positive values are absolute offsets and negative values are relative to the current last offset within the partition. TopicPartitionOffset provides a constructor with an additional boolean parameter; if it is true, the initial offsets (positive or negative) are relative to the current position of the consumer when the container starts. (A usage sketch appears at the end of this subsection.)
    2. Take String as a parameter

    public ContainerProperties(String... topics)
    

    The constructor takes an array of topic names; Kafka distributes the partitions across the group based on the group.id property.
    3. Take Pattern as a parameter

    public ContainerProperties(Pattern topicPattern)
    

    This constructor uses a regular expression Pattern to select topics.

  • How to assign the listener to the container
    Now that we have both a listener and a container, how do we assign the listener to the container? To assign a MessageListener to a container, use the ContainerProperties.setMessageListener method when creating the container:

    ContainerProperties containerProps = new ContainerProperties("topic1", "topic2");
    containerProps.setMessageListener(new MessageListener<Integer, String>() {

        @Override
        public void onMessage(ConsumerRecord<Integer, String> record) {
            ...
        }
    });
    DefaultKafkaConsumerFactory<Integer, String> cf =
                          new DefaultKafkaConsumerFactory<>(consumerProps());
    KafkaMessageListenerContainer<Integer, String> container =
                          new KafkaMessageListenerContainer<>(cf, containerProps);
    return container;
    

    Note that when creating the DefaultKafkaConsumerFactory, using the constructor that accepts only the properties map means that the key and value deserializer classes are picked up from the configuration. Alternatively, deserializer instances can be passed to the DefaultKafkaConsumerFactory constructor for keys and/or values, in which case all consumers share the same instances. Another option is to provide a Supplier (since version 2.3), which is used to obtain a separate Deserializer instance for each consumer:

    DefaultKafkaConsumerFactory<Integer, CustomValue> cf =
                          new DefaultKafkaConsumerFactory<>(consumerProps(), null, () -> new CustomValueDeserializer());
    KafkaMessageListenerContainer<Integer, CustomValue> container =
                          new KafkaMessageListenerContainer<>(cf, containerProps);
    return container;
    

Starting with version 2.3.5, a new container property called authorizationExceptionRetryInterval has been introduced. It causes the container to retry fetching messages after any AuthorizationException is thrown by the KafkaConsumer. This can happen, for example, when the configured user is denied access to read a particular topic. Defining authorizationExceptionRetryInterval lets the application recover as soon as the appropriate permissions are granted.
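
Putting the pieces together, a container that consumes explicitly assigned partitions might be wired up roughly as follows. This is only a sketch: the topic name, partition numbers, and offsets are illustrative, OrderListener is the hypothetical acknowledging listener sketched earlier, and the setAuthorizationExceptionRetryInterval setter name assumes version 2.3.5 or later.

import java.time.Duration;

import org.springframework.kafka.core.DefaultKafkaConsumerFactory;
import org.springframework.kafka.listener.ContainerProperties;
import org.springframework.kafka.listener.KafkaMessageListenerContainer;
import org.springframework.kafka.support.TopicPartitionOffset;

// Explicit assignment (consumer assign()): partition 0 from offset 0,
// partition 1 starting 10 records before the current end of the partition.
ContainerProperties containerProps = new ContainerProperties(
        new TopicPartitionOffset("topic1", 0, 0L),
        new TopicPartitionOffset("topic1", 1, -10L));
containerProps.setAckMode(ContainerProperties.AckMode.MANUAL_IMMEDIATE); // matches the acknowledging listener
containerProps.setMessageListener(new OrderListener());
// Assumed setter for the authorizationExceptionRetryInterval property (2.3.5+).
containerProps.setAuthorizationExceptionRetryInterval(Duration.ofSeconds(30));

DefaultKafkaConsumerFactory<Integer, String> cf =
        new DefaultKafkaConsumerFactory<>(consumerProps());
KafkaMessageListenerContainer<Integer, String> container =
        new KafkaMessageListenerContainer<>(cf, containerProps);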

2.1.2. ConcurrentMessageListenerContainer

ConcurrentMessageListenerContainer has a single constructor, similar to the KafkaMessageListenerContainer constructor.

public ConcurrentMessageListenerContainer(ConsumerFactory<K, V> consumerFactory,
                            ContainerProperties containerProperties)

It has a concurrency property that controls how many KafkaMessageListenerContainer instances are created. For example, container.setConcurrency(3) creates three KafkaMessageListenerContainer instances.
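
For example, a minimal sketch of a concurrent container (consumerProps() is the same properties method used in the earlier examples):

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.springframework.kafka.core.DefaultKafkaConsumerFactory;
import org.springframework.kafka.listener.ConcurrentMessageListenerContainer;
import org.springframework.kafka.listener.ContainerProperties;
import org.springframework.kafka.listener.MessageListener;

ContainerProperties containerProps = new ContainerProperties("topic1");
containerProps.setMessageListener(new MessageListener<Integer, String>() {

    @Override
    public void onMessage(ConsumerRecord<Integer, String> record) {
        System.out.println(record.value());
    }
});

DefaultKafkaConsumerFactory<Integer, String> cf =
        new DefaultKafkaConsumerFactory<>(consumerProps());
ConcurrentMessageListenerContainer<Integer, String> container =
        new ConcurrentMessageListenerContainer<>(cf, containerProps);
container.setConcurrency(3); // three KafkaMessageListenerContainer delegates, one consumer thread each
container.start();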

When listening to multiple topics, the default partition distribution may not be what we expect. For example, if there are 3 topics with 5 partitions each and we want to use concurrency=15, we will see only 5 active consumers, each assigned one partition from each topic, while the other 10 consumers are idle. This is because the default Kafka PartitionAssignor is the RangeAssignor. For this case, consider using the RoundRobinAssignor instead, which distributes the partitions across all of the consumers; each consumer is then assigned one topic or partition. To change the PartitionAssignor, set the partition.assignment.strategy consumer property (ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG) in the properties provided to the DefaultKafkaConsumerFactory.
In Spring Boot, this can be done with:

spring.kafka.consumer.properties.partition.assignment.strategy=org.apache.kafka.clients.consumer.RoundRobinAssignor
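
Outside of Spring Boot, the same assignor can be set programmatically in the consumer properties passed to the DefaultKafkaConsumerFactory; the bootstrap servers and group id below are placeholders:

import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.RoundRobinAssignor;

Map<String, Object> props = new HashMap<>();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
props.put(ConsumerConfig.GROUP_ID_CONFIG, "my-group");
// Distribute partitions across all consumers instead of the default RangeAssignor.
props.put(ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG,
        RoundRobinAssignor.class.getName());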

When the container properties are configured with TopicPartitionOffset instances, ConcurrentMessageListenerContainer distributes the TopicPartitionOffset instances among the delegate KafkaMessageListenerContainer instances.

Assume that six TopicPartitionOffset instances are provided with a concurrency of 3; each container then gets two partitions. With five TopicPartitionOffset instances, two containers get two partitions and the third gets one. If the concurrency is greater than the number of TopicPartitions, the concurrency is adjusted down so that each container gets one partition.

3. Offset

Spring offers several options for committing offsets. If the enable.auto.commit consumer property is true, Kafka auto-commits the offsets according to its own configuration. If it is false, the container supports several AckMode settings. The default AckMode is BATCH.

Starting with version 2.3, the framework sets enable.auto.commit to false unless explicitly set in configuration. Previously, if this property was not set, the Kafka default (true) was used.

The consumer poll() method returns one or more ConsumerRecords, and the MessageListener is called for each record. The following list describes the action the container takes for each AckMode (when transactions are not being used):

  • RECORD: Commits the offset when the listener returns after processing the record.

  • BATCH: Commit offsets when all records returned by poll() have been processed.

  • TIME: Commit the offsets when all the records returned by poll() have been processed, as long as the ackTime since the last commit has been exceeded.

  • COUNT: Commit the offset when all records returned by poll() have been processed, as long as ackCount records have been received since the last commit.

  • COUNT_TIME: Like TIME and COUNT, but does a commit if either condition is true.

  • MANUAL: The message listener is responsible for calling acknowledge() on the Acknowledgment; afterwards, the same semantics as BATCH apply.

  • MANUAL_IMMEDIATE: The offset is committed immediately when the listener calls the Acknowledgment.acknowledge() method.

When using transactions, offsets are sent to the transaction, semantically equivalent to RECORD or BATCH, depending on the listener type (record or batch). MANUAL and MANUAL_IMMEDIATE require the listener to be an AcknowledgingMessageListener or BatchAcknowledgingMessageListener.

Use the commitSync() or commitAsync() method on the consumer, depending on the syncCommits container property. By default, syncCommits is true.

The author's personal suggestion: set ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG to false.
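
A minimal sketch of the corresponding configuration, combining disabled auto-commit with manual per-record acknowledgment (OrderListener is the hypothetical listener from the earlier sketch):

import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.springframework.kafka.listener.ContainerProperties;

Map<String, Object> props = new HashMap<>(); // plus bootstrap servers, group id, deserializers, etc.
props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false); // let the container manage commits

ContainerProperties containerProps = new ContainerProperties("topic1");
containerProps.setAckMode(ContainerProperties.AckMode.MANUAL_IMMEDIATE);
containerProps.setMessageListener(new OrderListener()); // calls acknowledge() per record
containerProps.setSyncCommits(true); // use commitSync() rather than commitAsync(); true is the default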

Starting with version 2.3, the Acknowledgment interface has two additional methods, nack(long sleep) and nack(int index, long sleep). The first is for use with record listeners, the second with batch listeners. Calling the wrong method for your listener type throws an IllegalStateException. Before that, the interface looked like this:

public interface Acknowledgment {

    void acknowledge();

}
  • If you want to commit a partial batch, use nack().
  • When using transactions, set the AckMode to MANUAL.
  • Invoking nack() sends the offsets of the successfully processed records to the transaction.
  • nack() can only be called on the consumer thread that invoked the listener.
  • When nack() is called, any pending offsets are committed, the remaining records from the last poll are discarded, and seeks are performed on their partitions so that the failed record and unprocessed records are redelivered on the next poll().
  • The consumer thread can be paused before redelivery by setting the sleep argument. This is similar to throwing an exception when the container is configured with a SeekToCurrentErrorHandler.

When using partition assignment via group management, it is important to ensure that the sleep argument (plus the time spent processing the records from the previous poll) is less than the consumer max.poll.interval.ms property.
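
A sketch of a batch listener that uses nack(index, sleep) to commit the successfully processed prefix of a batch and have the rest redelivered; this assumes a 2.x version where nack takes the sleep in milliseconds, and the processing logic is a placeholder:

import java.util.List;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.springframework.kafka.listener.BatchAcknowledgingMessageListener;
import org.springframework.kafka.support.Acknowledgment;

public class OrderBatchListener implements BatchAcknowledgingMessageListener<Integer, String> {

    @Override
    public void onMessage(List<ConsumerRecord<Integer, String>> records, Acknowledgment ack) {
        for (int i = 0; i < records.size(); i++) {
            try {
                process(records.get(i));
            }
            catch (Exception e) {
                // Commit offsets for records 0..i-1, discard the rest of the batch,
                // and redeliver record i (and the following ones) after a 1 second pause.
                ack.nack(i, 1000);
                return;
            }
        }
        ack.acknowledge(); // the whole batch was processed successfully
    }

    private void process(ConsumerRecord<Integer, String> record) {
        // placeholder for the real processing logic
    }
}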

4. Listener container auto startup

The listener containers implement SmartLifecycle, and autoStartup is true by default. The containers are started in a late phase (Integer.MAX_VALUE - 100). Other components that implement SmartLifecycle, to handle data from listeners, should be started in an earlier phase. The - 100 leaves room for later phases, enabling components to be auto-started after the container.
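
If automatic startup is not desired, it can be turned off and the container started and stopped manually (container here is any of the containers built in the earlier sketches):

container.setAutoStartup(false);
// ... later, once dependent components are ready:
container.start();
// and on shutdown:
container.stop();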
