kafka:enable.auto.commit

I. Background

There was a requirement in a project: a consumer processes Kafka message data, but we wanted the effect of delayed consumption. That led to the idea of managing Kafka commits ourselves, i.e. setting `enable.auto.commit` to False: if a message is polled but not committed, Kafka would (so the theory went) put the message back on the queue and deliver it to the consumer again, until the configured delay had passed and we actually consumed and committed it.

So I wrote a demo to verify this, and found that the configuration does not behave the way I wanted.

II. Producer

The producer sends one message per second to a Kafka topic.

#!/usr/bin/env python
# -*- coding:utf-8 -*-

import time

from confluent_kafka import Producer

p = Producer({'bootstrap.servers': 'localhost:9092,localhost:9093,localhost:9094'})

topic = 'nico-test'
msg_tpl = 'hello kafka:{0}'

try:
    while True:
        msg = msg_tpl.format(time.time())
        p.produce(topic, msg)
        print('Produce msg:{0}'.format(msg))
        time.sleep(1)
finally:
    # flush() was unreachable after the infinite loop in the original;
    # flush on exit so buffered messages are actually delivered.
    p.flush()
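
As a side note (my addition, not part of the original demo): produce() is asynchronous, so without flush() or a delivery callback you never learn whether the broker actually accepted a message. A minimal variant of the producer with a delivery report, assuming the same topic and brokers:

import time

from confluent_kafka import Producer

p = Producer({'bootstrap.servers': 'localhost:9092,localhost:9093,localhost:9094'})

def delivery_report(err, msg):
    # Called once per message to report delivery success or failure.
    if err is not None:
        print('Delivery failed: {0}'.format(err))
    else:
        print('Delivered to {0} [{1}] @ offset {2}'.format(
            msg.topic(), msg.partition(), msg.offset()))

for _ in range(10):
    p.produce('nico-test', 'hello kafka:{0}'.format(time.time()),
              callback=delivery_report)
    p.poll(0)  # serve delivery callbacks
    time.sleep(1)

p.flush()      # wait for outstanding messages before exiting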

III. Consumer

The consumer sets the configuration item `enable.auto.commit` to False.

#!/usr/bin/env python
# -*- coding:utf-8 -*-

from confluent_kafka import Consumer

c = Consumer({
    'bootstrap.servers': 'localhost:9092,localhost:9093,localhost:9094',
    'group.id': 'nico-test',
    'auto.offset.reset': 'earliest',
    'enable.auto.commit': False
})

topic = 'nico-test'

c.subscribe([topic])

# Print some cluster metadata for reference.
cd = c.list_topics()
print(cd.cluster_id)
print(cd.controller_id)
print(cd.brokers)
print(cd.topics)
print(cd.orig_broker_id)
print(cd.orig_broker_name)

while True:
    msg = c.poll(1.0)
    if msg is None:
        continue
    if msg.error():
        print('Consumer error: {0}'.format(msg.error()))
        continue

    print('topic:{topic}, partition:{partition}, offset:{offset}, '
          'headers:{headers}, key:{key}, msg:{msg}, timestamp:{timestamp}'.format(
              topic=msg.topic(), partition=msg.partition(), offset=msg.offset(),
              headers=msg.headers(), key=msg.key(), msg=msg.value(),
              timestamp=msg.timestamp()))

IV. Results

The result: the consumer reads the messages in order from the beginning, and uncommitted messages are not put back on the queue. However, when the consumer is killed and restarted, it starts consuming from the very beginning every time. In summary, when `enable.auto.commit` is true, the offset of each consumed message is committed automatically and stored on the broker side (older Kafka clients kept it in ZooKeeper); when it is false, the committed offset never advances, so a restarted consumer re-reads everything.
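
For reference, here is a minimal sketch (my addition, not part of the original experiment) of how offsets are normally managed with `enable.auto.commit: False`: commit explicitly after processing, so a restart resumes from the last committed offset instead of the beginning.

from confluent_kafka import Consumer

c = Consumer({
    'bootstrap.servers': 'localhost:9092,localhost:9093,localhost:9094',
    'group.id': 'nico-test',
    'auto.offset.reset': 'earliest',
    'enable.auto.commit': False
})
c.subscribe(['nico-test'])

while True:
    msg = c.poll(1.0)
    if msg is None or msg.error():
        continue
    # ... process the message here ...
    # Commit the offset synchronously only after processing succeeds;
    # a restarted consumer then resumes from this point.
    c.commit(message=msg, asynchronous=False)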

Finally, after thinking about it for a while, the reason delayed consumption is not supported this way has a lot to do with how Kafka is implemented. Kafka writes messages directly into an append-only log file on disk; replaying a single message (to support delayed consumption) would require deleting it from the queue and re-inserting it, which is exactly contrary to Kafka's design.
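
If delayed consumption is really needed, one common workaround (my own sketch, not from the original post) is to keep auto commit off, check each message's timestamp, and seek back to the same offset whenever the message is still too young. The 60-second delay and the group id below are arbitrary example values.

import time

from confluent_kafka import Consumer, TopicPartition

DELAY_SECONDS = 60  # hypothetical delay before a message may be consumed

c = Consumer({
    'bootstrap.servers': 'localhost:9092,localhost:9093,localhost:9094',
    'group.id': 'nico-test-delayed',
    'auto.offset.reset': 'earliest',
    'enable.auto.commit': False
})
c.subscribe(['nico-test'])

while True:
    msg = c.poll(1.0)
    if msg is None or msg.error():
        continue
    ts_type, ts_ms = msg.timestamp()  # producer/broker timestamp in ms
    age = time.time() - ts_ms / 1000.0
    if age < DELAY_SECONDS:
        # Too young: rewind to this offset so the next poll sees it again.
        c.seek(TopicPartition(msg.topic(), msg.partition(), msg.offset()))
        time.sleep(min(DELAY_SECONDS - age, 1.0))
        continue
    print('Consumed after {0:.0f}s delay: {1}'.format(age, msg.value()))
    c.commit(message=msg, asynchronous=False)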

 


Origin www.cnblogs.com/lit10050528/p/12105297.html