Kafka error: org.apache.kafka.clients.consumer.CommitFailedException

The error looks like this:

org.apache.kafka.clients.consumer.CommitFailedException: Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. This means that the time between subsequent calls to poll() was longer than the configured max.poll.interval.ms, which typically implies that the poll loop is spending too much time message processing. You can address this either by increasing the session timeout or by reducing the maximum size of batches returned in poll() with max.poll.records.
    at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.sendOffsetCommitRequest(ConsumerCoordinator.java:600)
    at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.commitOffsetsSync(ConsumerCoordinator.java:498)
    at org.apache.kafka.clients.consumer.KafkaConsumer.commitSync(KafkaConsumer.java:1104)
    at org.apache.kafka.clients.consumer.KafkaConsumer.commitSync(KafkaConsumer.java:1072)
    at com.tanjie.kafka.ConsumerDemo.main(ConsumerDemo.java:29)

Cause analysis:

The Kafka consumer's poll() method takes a timeout in milliseconds, e.g. poll(2000). This does not mean "keep fetching messages for 2 seconds": the call blocks for at most 2 seconds while waiting for records and returns as soon as data is available (or when the timeout expires).
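A minimal sketch of that behaviour (assuming a consumer that is already constructed and subscribed; newer clients take a java.time.Duration, older ones accept poll(2000) with a long):

```java
import java.time.Duration;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

class PollTimeoutSketch {
    // poll() waits for at most 2000 ms; it returns as soon as records arrive,
    // so the elapsed time printed here is usually far below the timeout.
    static int pollOnce(KafkaConsumer<String, String> consumer) {
        long start = System.currentTimeMillis();
        ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(2000));
        System.out.printf("poll returned %d records after %d ms%n",
                records.count(), System.currentTimeMillis() - start);
        return records.count();
    }
}
```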

max.poll.interval.ms is the maximum delay allowed between two consecutive calls to poll(). If processing the records returned by one poll() takes longer than this, the group coordinator considers the consumer failed and rebalances the group. (Since Kafka 0.10.1 heartbeats are sent from a background thread, so it is this setting, rather than the heartbeat itself, that limits slow processing; the default is 300000 ms.) For illustration, assume max.poll.interval.ms=2000.

Let's put the two together and look at how this setting interacts with the poll loop.

Continuing the example: suppose one poll() returns 10 records and processing those 10 records takes about 1000 ms. Since 1000 ms < max.poll.interval.ms = 2000 ms, the next poll() happens in time, everything is normal, and no error is thrown.

If instead, because of network issues or slow downstream work, processing those 10 records takes 3000 ms while max.poll.interval.ms = 2000 ms, the gap between two poll() calls exceeds the allowed maximum. The coordinator assumes the consumer is dead, rebalances the group, and assigns its partitions to another member. The "dead" consumer is in fact still alive and still processing; when it finishes and tries to commit, the commit is rejected with the CommitFailedException shown above, and the consumer that took over the partitions re-reads the same records. The end result is that two consumers end up consuming the same messages.
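A small sketch that reproduces the failure under those assumptions (broker address, topic, group id, and the deliberately tiny max.poll.interval.ms are all illustrative values, not recommendations):

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.CommitFailedException;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class SlowConsumerDemo {
    public static void main(String[] args) throws InterruptedException {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");  // assumed broker address
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "slow-group");               // assumed group id
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");          // commit manually, like the failing code
        props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, "10");               // at most 10 records per poll
        props.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, "2000");         // unrealistically small, only to reproduce the error
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("demo-topic"));       // assumed topic
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(2000));
                for (ConsumerRecord<String, String> record : records) {
                    Thread.sleep(300); // simulate slow processing: 10 * 300 ms = 3000 ms > max.poll.interval.ms
                    System.out.println(record.value());
                }
                try {
                    // By the time the batch is done, the group may already have rebalanced,
                    // so this commit fails with CommitFailedException.
                    consumer.commitSync();
                } catch (CommitFailedException e) {
                    e.printStackTrace();
                }
            }
        }
    }
}
```

With 10 records at roughly 300 ms each, one iteration takes about 3000 ms, so the commit at the end of the batch fails exactly as in the stack trace above.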

Solutions: either fetch fewer records per poll by lowering max.poll.records (the argument passed to poll() is only a wait timeout and does not limit the batch size), or raise max.poll.interval.ms so that there is enough time to process one batch.
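As a sketch, both knobs are plain consumer properties; the concrete values below are assumptions and need to be tuned to how long one batch actually takes to process:

```java
// Same Properties object used when constructing the consumer in the sketch above.
props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, "100");        // fewer records per poll (default 500)
props.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, "600000"); // more time between polls (default 300000 ms)
```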

For reference only.

Reference: https://www.cnblogs.com/syp172654682/p/9723108.html

Reposted from blog.csdn.net/John_Kry/article/details/90376131