[03] Kafka consumers


1. Consumers and consumer groups

  • Within a consumer group, each partition can be consumed by only one consumer.
  • Adding consumers to a group raises overall consumption throughput, but the number of consumers should not exceed the number of partitions; any extra consumers receive nothing.
  • Messaging middleware supports two modes: the point-to-point (queue) model and the publish/subscribe model.
    point-to-point: all consumers belong to the same consumer group
    publish/subscribe: each consumer belongs to a different consumer group
  • The consumer group is configured via the group.id parameter.

2. Client Development

2.1 Consumer steps:

(1) Configure the consumer client parameters and create a consumer instance
(2) Subscribe to topics
(3) Pull messages and consume them
(4) Commit consumption offsets
(5) Close the consumer instance
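
Putting the five steps together, a minimal runnable sketch (the broker address, topic name, and group name below are placeholder assumptions):

import java.time.Duration;
import java.util.Arrays;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class ConsumerQuickStart {
    public static void main(String[] args) {
        // (1) configure client parameters and create the consumer instance
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // assumption: local broker
        props.put("group.id", "demo-group");                // assumption: group name
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        try {
            // (2) subscribe to topics
            consumer.subscribe(Arrays.asList("demo-topic"));
            while (true) {
                // (3) pull messages and consume them
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(1000));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.println(record.value());
                }
                // (4) offsets are committed automatically here (enable.auto.commit defaults to true)
            }
        } finally {
            // (5) close the consumer instance
            consumer.close();
        }
    }
}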

2.2 Client parameters:

group.id defaults to "".
client.id defaults to ""; if left empty, KafkaConsumer automatically generates one of the form "consumer-" + a number.

2.3 Subscribing to topics and partitions

A consumer can subscribe to one or more topics:
public void subscribe(Collection<String> topics, ConsumerRebalanceListener listener);   // ConsumerRebalanceListener sets a rebalance listener
public void subscribe(Collection<String> topics);
public void subscribe(Pattern pattern, ConsumerRebalanceListener listener); // a regular expression matches multiple topics, useful for replicating data to other systems
public void subscribe(Pattern pattern);
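
For example, a sketch of a pattern subscription with a rebalance listener (the pattern "metrics-.*" and the consumer variable are assumptions):

import java.util.Collection;
import java.util.regex.Pattern;
import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.common.TopicPartition;

// assumes `consumer` is an existing KafkaConsumer<String, String>
consumer.subscribe(Pattern.compile("metrics-.*"), new ConsumerRebalanceListener() {
    @Override
    public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
        // commit offsets for the revoked partitions before reassignment
    }
    @Override
    public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
        // optionally seek to externally stored offsets for the new partitions
    }
});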

Subscribing to specified partitions:

public void assign(Collection<TopicPartition> partitions);

public class TopicPartition{
    int partition;
    String topic;
}

If you do not know the partitions of a topic, you can obtain them with KafkaConsumer.partitionsFor(String topic):

public List<PartitionInfo> partitionsFor(String topic);
    // returns the list of partition metadata for the topic
public class PartitionInfo {
    String topic;
    int partition;
    Node leader;
    ...
}
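
For example, a sketch that assigns all partitions of a topic to the consumer (the topic name "demo-topic" is an assumption):

import java.util.ArrayList;
import java.util.List;
import org.apache.kafka.common.PartitionInfo;
import org.apache.kafka.common.TopicPartition;

// assumes `consumer` is an existing KafkaConsumer<String, String>
List<TopicPartition> partitions = new ArrayList<>();
List<PartitionInfo> partitionInfos = consumer.partitionsFor("demo-topic");
if (partitionInfos != null) {
    for (PartitionInfo info : partitionInfos) {
        partitions.add(new TopicPartition(info.topic(), info.partition()));
    }
}
consumer.assign(partitions);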

Three different subscription states:
AUTO_TOPICS ----> subscribe(Collection)
AUTO_PATTERN ----> subscribe(Pattern)
USER_ASSIGNED ----> assign(Collection)

2.4 Consuming messages

Pulling messages

public ConsumerRecords<K, V> poll(final Duration timeout);

The poll timeout controls how long the call blocks when no data is available, i.e. how long until control is handed back to the polling application.
If set to 0, poll() returns immediately whether or not any messages were fetched.
If the application thread's only job is to pull and consume messages, the timeout can be set to the maximum (Long.MAX_VALUE).

Consuming messages from specified partitions:

ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(1000));
for (TopicPartition tp : records.partitions()) {
    for (ConsumerRecord<String, String> record : records.records(tp)) {
        System.out.println(record.value());
    }
}

2.5 Consumer internal logic

Offset commits

  • Consumer offsets are stored in an internal Kafka topic: __consumer_offsets.
  • The offset to commit is the position of the next message to pull, i.e. the offset of the last consumed message + 1.
  • Mistimed offset commits cause two kinds of anomalies: repeated consumption and message loss.
    Repeated consumption: the offset is committed only after all pulled messages are consumed; if an exception occurs partway through consumption, the consumer restarts from the previously committed offset and reprocesses messages.
    Message loss: the offset is committed right after pulling a batch; if an exception occurs while consuming, the messages not yet consumed are skipped and lost.
  • By default, offsets are committed automatically at a fixed interval (5s by default). Automatic commits can cause both problems above.
  • Manual commit (see the sketch after this list):
    Synchronous commit:
    commitSync(); commits once per batch, or once per partition per batch. Blocking.
    Asynchronous commit:
    commitAsync(); non-blocking. Commits once per batch, or once per partition per batch. A callback can be added; retries are implemented in the callback by saving the offset at commit time and comparing it with the latest saved offset to decide whether to re-commit.
    If the consumer exits abnormally, some repeated consumption is unavoidable.
    If the consumer exits normally, or a rebalance occurs, a final synchronous commit should be made.
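
A minimal sketch of manual per-partition commits (assumptions: enable.auto.commit=false, an already-subscribed consumer, and a volatile boolean running flag controlled elsewhere):

import java.time.Duration;
import java.util.Collections;
import java.util.List;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

try {
    while (running) {
        ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(1000));
        for (TopicPartition tp : records.partitions()) {
            List<ConsumerRecord<String, String>> partitionRecords = records.records(tp);
            for (ConsumerRecord<String, String> record : partitionRecords) {
                // process record
            }
            long lastOffset = partitionRecords.get(partitionRecords.size() - 1).offset();
            // non-blocking commit for this partition; the committed offset
            // is the next position to pull (last consumed offset + 1)
            consumer.commitAsync(
                    Collections.singletonMap(tp, new OffsetAndMetadata(lastOffset + 1)),
                    (offsets, exception) -> {
                        if (exception != null) {
                            // decide in the callback whether to retry, e.g. by
                            // comparing the failed offset with the latest saved one
                        }
                    });
        }
    }
} finally {
    try {
        consumer.commitSync();   // final blocking commit before exiting
    } finally {
        consumer.close();
    }
}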

2.6 Closing the consumer gracefully

    public static final AtomicBoolean isRunning = new AtomicBoolean(true);

    consumer.subscribe(Arrays.asList(topic));
    try {
        while (isRunning.get()) {
            // ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(1000));
            // process records and commit offsets
        }
    } catch (WakeupException e) {
        // ignore: wakeup() was called to break out of poll()
    } catch (Exception e) {
        // handle the error
    } finally {
        consumer.close();
    }

To exit the loop, call isRunning.set(false), or call consumer.wakeup() from another thread, which makes a blocked poll() throw the WakeupException caught above.
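
For instance, a sketch of triggering the exit from another thread via a JVM shutdown hook (the hook itself is an illustrative assumption, not part of the original example):

Runtime.getRuntime().addShutdownHook(new Thread(() -> {
    isRunning.set(false);
    consumer.wakeup();   // a blocked poll() will throw WakeupException
}));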


Origin www.cnblogs.com/suyeSean/p/11241901.html