Kafka: Detailed explanation, usage tutorial and examples

What is Kafka?

Kafka is a distributed stream processing platform originally developed by LinkedIn that has since become a top-level project of the Apache Software Foundation. Known for its high throughput, reliability, and scalability, it is widely used in scenarios such as real-time data transmission, log collection, event processing, and streaming analytics. Kafka is designed to handle large-scale data streams, making it well suited for building modern distributed applications.

Core concepts of Kafka

Before diving into the tutorial, let us first introduce some core concepts that form the basis for understanding Kafka:

  • Broker: Each server node in the Kafka cluster is called a Broker, and they are responsible for storing and processing data.

  • Topic: A named category to which messages are published. Producers publish messages to Topics and consumers subscribe to them.

  • Partition: Each Topic can be divided into multiple Partitions, and each Partition is an ordered message queue. Partitioning allows data to be distributed horizontally and processed in parallel.

  • Producer: The publisher of data, which sends messages to one or more Topics.

  • Consumer: A data subscriber who consumes messages from one or more Topics.

  • Consumer Group: A group of consumers that jointly consume the messages of a Topic. Within a consumer group, each partition is consumed by exactly one consumer.

  • Offset: The position of each message within a Partition. Consumers use offsets to keep track of which messages they have already consumed (a command sketch for inspecting offsets follows this list).
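
As an illustration of how consumer groups and offsets fit together, the command below (shipped in Kafka's bin directory) reports, for each partition assigned to a group, the committed offset, the log end offset, and the resulting lag. It assumes the group already exists; my-group is simply the group id used by the consumer example later in this tutorial.

bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --describe --group my-group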

How to use Kafka?

The following is a detailed tutorial that walks through using Kafka, from installation to practical examples:

1. Install and start Kafka

First, you need to install Kafka. You can download the latest release from the official website (https://kafka.apache.org/downloads) and follow the guide to install it. After the installation is complete, start ZooKeeper and then the Kafka server.

Start ZooKeeper (classic Kafka deployments depend on ZooKeeper; recent releases can also run without it in KRaft mode):

bin/zookeeper-server-start.sh config/zookeeper.properties

Then, start the Kafka server:

bin/kafka-server-start.sh config/server.properties
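
Once both processes are up, you can verify that the broker is reachable by listing the topics it knows about (the list will be empty on a fresh installation):

bin/kafka-topics.sh --list --bootstrap-server localhost:9092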

2. Create a Topic

In Kafka, you need to create one or more Topics to store messages. Create a Topic named my-topic with the following command:

bin/kafka-topics.sh --create --topic my-topic --bootstrap-server localhost:9092 --partitions 3 --replication-factor 1

This will create a Topic named my-topic with 3 partitions and a replication factor of 1.
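
To confirm the Topic was created as expected, you can describe it; the output shows the partition count, the replication factor, and which broker leads each partition:

bin/kafka-topics.sh --describe --topic my-topic --bootstrap-server localhost:9092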

3. Using the Kafka producer

Kafka producers are used to publish messages to the specified Topic. Here is a simple Java producer example:

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;

public class KafkaProducerExample {

    public static void main(String[] args) {
        // Basic producer configuration: broker address and key/value serializers.
        Properties properties = new Properties();
        properties.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        properties.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        properties.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        KafkaProducer<String, String> producer = new KafkaProducer<>(properties);
        String topic = "my-topic";

        // Send 10 messages; send() is asynchronous and batches records internally.
        for (int i = 0; i < 10; i++) {
            String message = "Message " + i;
            producer.send(new ProducerRecord<>(topic, message));
            System.out.println("Sent: " + message);
        }

        // close() flushes any buffered records before shutting down.
        producer.close();
    }
}
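
If you just want to try the Topic without writing any Java, the console producer that ships with Kafka does the same job: each line typed at its prompt is sent as a message to my-topic.

bin/kafka-console-producer.sh --topic my-topic --bootstrap-server localhost:9092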

4. Using Kafka Consumers

Kafka consumers subscribe to Topics and process the messages they receive. Here is a simple Java consumer example:

import org.apache.kafka.clients.consumer.Consumer;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class KafkaConsumerExample {

    public static void main(String[] args) {
        // Basic consumer configuration: broker address, consumer group, and deserializers.
        Properties properties = new Properties();
        properties.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        properties.put(ConsumerConfig.GROUP_ID_CONFIG, "my-group");
        properties.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        properties.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        // Start from the earliest offset when the group has no committed offset yet,
        // so messages produced before this consumer first starts are still read.
        properties.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");

        Consumer<String, String> consumer = new KafkaConsumer<>(properties);
        String topic = "my-topic";

        consumer.subscribe(Collections.singletonList(topic));

        // Poll in a loop until the process is stopped.
        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
            records.forEach(record -> {
                System.out.println("Received: " + record.value());
            });
        }
    }
}
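
Similarly, the console consumer that ships with Kafka lets you inspect the Topic from the command line; --from-beginning makes it print all existing messages rather than only new ones.

bin/kafka-console-consumer.sh --topic my-topic --from-beginning --bootstrap-server localhost:9092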

5. Run the example

First, open a terminal window and run the Kafka producer example (both classes must have been compiled with the Kafka client library, kafka-clients, on the classpath; a sketch of the commands follows at the end of this section):

java KafkaProducerExample

Then, open another terminal window and run the Kafka consumer example:

java KafkaConsumerExample

You will see that the messages sent by the producer are received and processed by the consumer.
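
The exact compile and run commands depend on how you manage dependencies. As a minimal sketch, assuming the two .java files sit in the Kafka installation directory so that its libs/ folder provides the client JARs, it could look like the following (paths are illustrative; on Windows, use ; instead of : as the classpath separator):

javac -cp "libs/*" KafkaProducerExample.java KafkaConsumerExample.java

java -cp ".:libs/*" KafkaProducerExample

java -cp ".:libs/*" KafkaConsumerExample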

Summary

Kafka is a powerful distributed stream processing platform for real-time data transmission and processing. Through the introduction, tutorial, and examples in this article, you have seen Kafka's core concepts and learned how to install it, create a Topic, and use producers and consumers, laying a solid foundation for building modern distributed applications. Whether you are building a real-time data streaming platform, a log collection system, or an event-driven architecture, Kafka is a reliable and efficient solution.
