[Troubleshooting] A record of kafka.common.ConsumerRebalanceFailedException

While using Kafka recently, I ran into many exceptions when sending and receiving data. One of them was particularly strange:

 

Exception in thread "main" kafka.common.ConsumerRebalanceFailedException: groupB_ip-10-38-19-230-1414174925481-97fa3f2a can't rebalance after 4 retries
        at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.syncedRebalance(ZookeeperConsumerConnector.scala:432)
        at kafka.consumer.ZookeeperConsumerConnector.kafka$consumer$ZookeeperConsumerConnector$$reinitializeConsumer(ZookeeperConsumerConnector.scala:722)
        at kafka.consumer.ZookeeperConsumerConnector.consume(ZookeeperConsumerConnector.scala:212)
        at kafka.javaapi.consumer.Zookeeper……

 

Debugging showed that, on the consumer side, the code ran to

Map<String, List<KafkaStream<byte[], byte[]>>> consumerMap =
        this.consumer.createMessageStreams(topicCountMap);

This line simply "stuck" and then threw the exception above. I searched the Internet for related solutions; one suggestion was to set the zookeeper.sync.time.ms property on the consumer side to a larger value, but after trying it the problem remained.
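For context, here is a minimal sketch of the surrounding setup using the old high-level consumer API (Kafka 0.8.x). The ZooKeeper address and topic name are hypothetical placeholders; the group id "groupB" comes from the exception above:

import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;

public class ConsumerDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("zookeeper.connect", "10.38.19.230:2181"); // hypothetical ZK address
        props.put("group.id", "groupB");                     // group id from the exception above
        props.put("zookeeper.session.timeout.ms", "5000");
        props.put("zookeeper.sync.time.ms", "2000");

        ConsumerConnector consumer =
                Consumer.createJavaConsumerConnector(new ConsumerConfig(props));

        // Ask for one stream for the topic; createMessageStreams is the call
        // that triggers the consumer group rebalance, which is where the
        // ConsumerRebalanceFailedException above was thrown.
        Map<String, Integer> topicCountMap = new HashMap<String, Integer>();
        topicCountMap.put("myTopic", 1); // hypothetical topic name

        Map<String, List<KafkaStream<byte[], byte[]>>> consumerMap =
                consumer.createMessageStreams(topicCountMap);
    }
}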

 

Eventually I found a more reliable solution at the following address:

 

https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Myconsumerseemstohavestopped,why?

 

Quoting the English original:

  • consumer rebalancing fails (you will see ConsumerRebalanceFailedException): This is due to conflicts when two consumers are trying to own the same topic partition. The log will show you what caused the conflict (search for "conflict in ").
    • If your consumer subscribes to many topics and your ZK server is busy, this could be caused by consumers not having enough time to see a consistent view of all consumers in the same group. If this is the case, try increasing rebalance.max.retries and rebalance.backoff.ms.
    • Another reason could be that one of the consumers is hard killed. Other consumers during rebalancing won't realize that consumer is gone after zookeeper.session.timeout.ms time. In this case, make sure that rebalance.max.retries * rebalance.backoff.ms > zookeeper.session.timeout.ms.

 

Then I tried the solution in the last bullet, setting two properties on the consumer side as follows:

props.put("rebalance.max.retries", "5");
props.put("rebalance.backoff.ms", "1200");

This makes sure that rebalance.max.retries * rebalance.backoff.ms = 5 * 1200 = 6000 is greater than the value of the zookeeper.session.timeout.ms property (5000 in my case). After starting the producer side and the consumer side separately again, the problem was solved.
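Put together, a minimal sketch of the consumer-side properties in my case (the ZooKeeper address is a hypothetical placeholder):

Properties props = new Properties();
props.put("zookeeper.connect", "10.38.19.230:2181"); // hypothetical ZK address
props.put("group.id", "groupB");
props.put("zookeeper.session.timeout.ms", "5000");
// 5 retries * 1200 ms backoff = 6000 ms > the 5000 ms session timeout, so a
// hard-killed consumer's ZooKeeper session expires before the surviving
// consumers exhaust their rebalance retries.
props.put("rebalance.max.retries", "5");
props.put("rebalance.backoff.ms", "1200");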

 

Note: it is best to list more than one broker in the producer-side metadata.broker.list property, so the client can balance load across brokers and still fetch metadata if one broker goes down.
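For example, a producer-side configuration with more than one broker might look like the following sketch (broker addresses and the topic name are hypothetical, using the same 0.8-era kafka.javaapi API):

import java.util.Properties;

import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

public class ProducerDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Listing several brokers lets the client fetch metadata even if one
        // broker is down, and spread produce requests across partitions.
        props.put("metadata.broker.list",
                "10.38.19.230:9092,10.38.19.231:9092"); // hypothetical brokers
        props.put("serializer.class", "kafka.serializer.StringEncoder");

        Producer<String, String> producer =
                new Producer<String, String>(new ProducerConfig(props));
        producer.send(new KeyedMessage<String, String>("myTopic", "hello")); // hypothetical topic
        producer.close();
    }
}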

PS: Properly handling some Kafka exceptions requires a clear understanding of how it works internally, but I don't have that much time, so I just learned enough in a hurry to solve the problem.
