Kafka - specify offset for consumption

After searching online, I found a way to make a consumer read from the beginning of the message queue. The main code is as follows:

String topicName = "A25";
// Assign the topic and partition to this consumer
consumer.assign(Arrays.asList(new TopicPartition(topicName, 0)));
// Does not change the current offset; fetch from the beginning of this topic and partition.
consumer.seekToBeginning(Arrays.asList(new TopicPartition(topicName, 0)));

Poll 20 times to view the data consumption results:

for (int i = 0; i < 20; i++) {
    ConsumerRecords<String, String> records = consumer.poll(100);

    logger.info("records length = {}", records.count());

    for (ConsumerRecord<String, String> record : records) {
        logger.info("topic = {}, partition = {}, offset = {}, key = {}, value = {}\n",
                record.topic(), record.partition(), record.offset(),
                record.key(), record.value());
    }
}

[Figure: console output of Kafka consuming from the beginning]

As the console output shows, the consumer read the data from the beginning. It also shows that many polls come back without any records, so unless polling runs in an infinite loop, the 20 polls in this example might not consume a single record.
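The point about empty polls can be sketched as a small retry helper. This is only a minimal sketch, not Kafka client code: PollRetry and pollUntilData are hypothetical names, and the poll call is abstracted as a Supplier so the example runs without a broker. With a real consumer, the supplier would wrap the consumer.poll call and copy the returned records into a list.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical helper for illustration; not part of the Kafka client library.
class PollRetry {

    // Calls the poller until it returns a non-empty batch or maxAttempts is used up.
    // Returns an empty list if every attempt came back empty.
    static <T> List<T> pollUntilData(Supplier<List<T>> poller, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            List<T> batch = poller.get();
            if (!batch.isEmpty()) {
                return batch;
            }
        }
        return Collections.emptyList();
    }

    public static void main(String[] args) {
        // Simulated consumer: the first two polls are empty, the third returns data.
        int[] calls = {0};
        Supplier<List<String>> fakePoll = () ->
                (++calls[0] < 3) ? Collections.<String>emptyList() : Arrays.asList("m0", "m1");

        List<String> records = pollUntilData(fakePoll, 20);
        // prints: polls = 3, records = [m0, m1]
        System.out.println("polls = " + calls[0] + ", records = " + records);
    }
}
```

Capping the number of attempts keeps the helper from spinning forever when the partition really is empty; a long-running consumer would normally keep polling in its main loop instead.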

After finding the seekToBeginning method in other people's blog posts, I used Maven to download the sources for the corresponding jar and read the source code (I have to say, it pays to make good use of your tools):

[Figure: downloading the source code with Maven]

I found that there are two other methods:

/**
 * Overrides the fetch offsets that the consumer will use on the next {@link #poll(long) poll(timeout)}. If this API
 * is invoked for the same partition more than once, the latest offset will be used on the next poll(). Note that
 * you may lose data if this API is arbitrarily used in the middle of consumption, to reset the fetch offsets
 *
 * @throws IllegalArgumentException if the provided TopicPartition is not assigned to this consumer
 *                                  or if provided offset is negative
 */
@Override
public void seek(TopicPartition partition, long offset)

seek starts consumption at the specified offset.

/**
 * Seek to the last offset for each of the given partitions. This function evaluates lazily, seeking to the
 * final offset in all partitions only when {@link #poll(long)} or {@link #position(TopicPartition)} are called.
 * If no partitions are provided, seek to the final offset for all of the currently assigned partitions.
 * <p>
 * If {@code isolation.level=read_committed}, the end offset will be the Last Stable Offset, i.e., the offset
 * of the first message with an open transaction.
 *
 * @throws IllegalArgumentException if {@code partitions} is {@code null} or the provided TopicPartition is not assigned to this consumer
 */
public void seekToEnd(Collection<TopicPartition> partitions)

seekToEnd jumps directly to the end of the partition.

The method parameters also show a difference:

  1. seek(TopicPartition partition, long offset) accepts only a single topic partition
  2. seekToBeginning and seekToEnd accept a collection of topic partitions
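The difference between the three methods can be illustrated with a toy in-memory model. This is a sketch under loud assumptions: ToySeekDemo is a hypothetical class, not the Kafka client. Partitions are plain strings, each log is a list, and poll simply returns everything from the current position onward. It only mirrors the shape of the calls: seek takes one partition plus an offset, while seekToBeginning and seekToEnd take a collection.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy stand-in for a consumer (hypothetical, NOT the Kafka API).
class ToySeekDemo {
    private final Map<String, List<String>> logs = new HashMap<>();
    private final Map<String, Integer> positions = new HashMap<>();

    void addPartition(String partition, List<String> records) {
        logs.put(partition, records);
        positions.put(partition, 0);
    }

    // Like seek: exactly one partition and one explicit offset per call.
    void seek(String partition, int offset) {
        if (!logs.containsKey(partition) || offset < 0) {
            throw new IllegalArgumentException("unassigned partition or negative offset");
        }
        positions.put(partition, offset);
    }

    // Like seekToBeginning: a whole collection of partitions in one call.
    void seekToBeginning(Collection<String> partitions) {
        for (String p : partitions) {
            seek(p, 0);
        }
    }

    // Like seekToEnd: move every given partition to the end of its log.
    void seekToEnd(Collection<String> partitions) {
        for (String p : partitions) {
            if (!logs.containsKey(p)) {
                throw new IllegalArgumentException("unassigned partition");
            }
            seek(p, logs.get(p).size());
        }
    }

    // Returns the records from the current position onward and advances the position.
    // Simplification: a position past the end is treated as the end of the log.
    List<String> poll(String partition) {
        List<String> log = logs.get(partition);
        int from = Math.min(positions.get(partition), log.size());
        positions.put(partition, log.size());
        return new ArrayList<>(log.subList(from, log.size()));
    }

    public static void main(String[] args) {
        ToySeekDemo consumer = new ToySeekDemo();
        consumer.addPartition("A25-0", Arrays.asList("a", "b", "c", "d"));
        consumer.addPartition("A25-1", Arrays.asList("x", "y"));

        consumer.seek("A25-0", 2);
        System.out.println(consumer.poll("A25-0")); // prints: [c, d]

        consumer.seekToBeginning(Arrays.asList("A25-0", "A25-1"));
        System.out.println(consumer.poll("A25-1")); // prints: [x, y]

        consumer.seekToEnd(Arrays.asList("A25-0", "A25-1"));
        System.out.println(consumer.poll("A25-0")); // prints: []
    }
}
```

To move several partitions to specific offsets, seek has to be called once per partition, whereas the other two methods handle the whole collection in a single call.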
