I. Background
Our project has a requirement to process Kafka message data in a consumer, but with a delay before each message is actually consumed. My idea was to manage Kafka commits myself by setting `enable.auto.commit` to False: if the consumer polls a message but does not commit it, Kafka would (I assumed) put the message back on the queue and deliver it to the consumer again, repeating until the configured delay had elapsed and the message was really processed and committed.
So I wrote a demo to verify this, and found that the configuration does not behave the way I wanted.
II. Producer
The producer sends one message per second to a Kafka topic.
```python
#!/usr/bin/env python
# -*- coding:utf-8 -*-
import time

from confluent_kafka import Producer

p = Producer({'bootstrap.servers': 'localhost:9092,localhost:9093,localhost:9094'})

topic = 'nico-test'
msg_tpl = 'hello kafka:{0}'

# Produce one timestamped message per second.
while True:
    msg = msg_tpl.format(time.time())
    p.produce(topic, msg)
    print('Produce msg:{0}'.format(msg))
    time.sleep(1)
    p.flush()
```
III. Consumer
The consumer sets the configuration item `enable.auto.commit` to False.
```python
#!/usr/bin/env python
# -*- coding:utf-8 -*-
from confluent_kafka import Consumer

c = Consumer({
    'bootstrap.servers': 'localhost:9092,localhost:9093,localhost:9094',
    'group.id': 'nico-test',
    'auto.offset.reset': 'earliest',
    'enable.auto.commit': False,
})

topic = 'nico-test'
c.subscribe([topic])

# Dump some cluster metadata for reference.
cd = c.list_topics()
print(cd.cluster_id)
print(cd.controller_id)
print(cd.brokers)
print(cd.topics)
print(cd.orig_broker_id)
print(cd.orig_broker_name)

# Poll messages but never commit offsets.
while True:
    msg = c.poll(1.0)
    if msg is None:
        continue
    print('topic:{topic}, partition:{partition}, offset:{offset}, '
          'headers:{headers}, key:{key}, msg:{msg}, timestamp:{timestamp}'.format(
              topic=msg.topic(), partition=msg.partition(), offset=msg.offset(),
              headers=msg.headers(), key=msg.key(), msg=msg.value(),
              timestamp=msg.timestamp()))
```
IV. Results
The result: the consumer read the messages in order from the beginning, and no message was ever put back on the queue. But every time the consumer was killed and restarted, it started consuming from the very beginning again. To sum up: when `enable.auto.commit` is true, the consumer automatically commits the offset of each fetched message (stored in Kafka's internal `__consumer_offsets` topic; very old versions kept it in ZooKeeper). With it set to false, the committed offset never advances, so a restart falls back to `auto.offset.reset` and replays everything. Within a single running session, however, the consumer tracks its position in memory and keeps moving forward whether or not it commits, which is why uncommitted messages were not redelivered.
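The observed behaviour can be captured in a toy model (plain Python on purpose, not the confluent_kafka API; all class and method names here are illustrative): a live session advances an in-memory position on every poll, commit or no commit, while a restarted session falls back to the last committed offset.

```python
class FakePartition:
    """Toy model of one partition: an immutable log plus a committed offset."""
    def __init__(self, log):
        self.log = log            # messages stored on "disk", never re-queued
        self.committed = 0        # last committed offset for the consumer group

    def new_session(self):
        return FakeConsumer(self)

class FakeConsumer:
    """Toy model of a consumer session."""
    def __init__(self, part):
        self.part = part
        self.position = part.committed  # a restart resumes from the committed offset

    def poll(self):
        if self.position >= len(self.part.log):
            return None
        msg = self.part.log[self.position]
        self.position += 1        # in-memory position advances even without a commit
        return msg

    def commit(self):
        self.part.committed = self.position

part = FakePartition(['m0', 'm1', 'm2'])

c1 = part.new_session()
first_run = [c1.poll() for _ in range(3)]    # consume everything, never commit()

c2 = part.new_session()                      # "kill and restart" the consumer
second_run = [c2.poll() for _ in range(3)]   # starts from the beginning again

print(first_run)   # ['m0', 'm1', 'm2']
print(second_run)  # ['m0', 'm1', 'm2']
```

With `enable.auto.commit` off and no explicit `commit()`, `part.committed` stays at 0, which is exactly why the restarted demo consumer replayed the topic from the start.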
Finally, after thinking about it for a moment, the reason this kind of delayed consumption is not supported is closely tied to how Kafka is implemented. Kafka writes messages straight into log files on disk, and a consumer merely advances an offset through that immutable log; not committing simply leaves the stored offset where it was, it does not push messages back. To get requeue-style replay, Kafka would have to delete a message from the log and re-insert it, which runs directly against its design.
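Since Kafka will not re-queue uncommitted messages, a more workable pattern (my own sketch, not part of the original demo; `consume_delayed` and its parameters are illustrative names) is to commit as usual and instead delay processing on the consumer side: because a partition is ordered by production time, sleeping until the head message is old enough also delays everything behind it.

```python
import time

def consume_delayed(messages, delay_s, process, now=time.time, sleep=time.sleep):
    """Process each (produce_timestamp, value) pair only once it is at least
    delay_s seconds old. The clock functions are injectable for testing."""
    for ts, value in messages:
        wait = (ts + delay_s) - now()
        if wait > 0:
            sleep(wait)          # message is not old enough yet: hold off
        process(value)

# Demonstration with a fake clock so the example runs instantly.
clock = [100.0]
slept = []
out = []

def fake_sleep(s):
    slept.append(s)
    clock[0] += s                # sleeping advances the fake clock

consume_delayed(
    [(90.0, 'a'), (99.0, 'b')],  # at t=100, 'a' is 10s old, 'b' only 1s old
    delay_s=5,
    process=out.append,
    now=lambda: clock[0],
    sleep=fake_sleep,
)
print(out)    # ['a', 'b']
print(slept)  # [4.0] -- waited 4s so that 'b' was 5s old before processing
```

In a real consumer, `messages` would come from `poll()` (using the message's produce timestamp), and the offset would be committed only after `process` succeeds.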