Kafka Basics 2: The Kafka Producer API

1. Kafka producer API

1. Send message

To use the Kafka producer API from Go, you first need a Kafka client library for Go.

Commonly used libraries include sarama and confluent-kafka-go.

sarama is a Kafka client library written in Go; it integrates with Kafka and implements both the producer and the consumer side.

We'll use sarama here. Let's walk through a simple example. The steps are as follows:

Step 1: Install the Sarama library

go get github.com/Shopify/sarama

Step 2: Write producer code

package main

import (
	"fmt"
	"log"
	"os"
	"os/signal"

	"github.com/Shopify/sarama"
)

func main() {
	// Set the Kafka broker addresses and create the configuration.
	config := sarama.NewConfig()
	config.Producer.RequiredAcks = sarama.WaitForAll
	config.Producer.Retry.Max = 5
	config.Producer.Return.Successes = true

	// Create the producer.
	producer, err := sarama.NewSyncProducer([]string{"kafka-broker1:9092", "kafka-broker2:9092"}, config)
	if err != nil {
		log.Fatalf("Error creating producer: %s", err.Error())
	}
	defer func() {
		if err := producer.Close(); err != nil {
			log.Fatalf("Error closing producer: %s", err.Error())
		}
	}()

	// Build the message.
	message := &sarama.ProducerMessage{
		Topic: "your_topic",                          // topic the message is sent to
		Value: sarama.StringEncoder("Hello, Kafka!"), // message payload
	}

	// Send the message.
	partition, offset, err := producer.SendMessage(message)
	if err != nil {
		log.Fatalf("Failed to send message: %s", err.Error())
	}

	fmt.Printf("Message sent to partition %d at offset %d\n", partition, offset)

	// Handle the exit signal.
	sigchan := make(chan os.Signal, 1)
	signal.Notify(sigchan, os.Interrupt)
	<-sigchan
}

Step 3: Explain the code

  1. Import the sarama library and set up the Kafka connection configuration.
  2. Create a producer instance.
  3. Construct the message to be sent to the Kafka topic.
  4. Send the message using the producer's SendMessage method.
  5. Print the partition and offset returned after a successful send.
  6. Set up signal handling and wait for an interrupt signal before exiting.
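
One note on the configuration above: config.Producer.RequiredAcks controls how much broker acknowledgement the producer waits for before a send counts as successful. For reference, these are the three levels sarama exposes (shown as alternatives, pick one):

config.Producer.RequiredAcks = sarama.NoResponse   // fire-and-forget: do not wait for any acknowledgement
config.Producer.RequiredAcks = sarama.WaitForLocal // wait until the partition leader has written the message
config.Producer.RequiredAcks = sarama.WaitForAll   // wait until all in-sync replicas have the message

WaitForAll gives the strongest durability guarantee at some cost in latency, which is why the example pairs it with retries.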

2. Message compression

Message compression in the Kafka producer API means compressing messages before sending them to Kafka, which reduces the amount of data transmitted over the network and improves transmission efficiency.

Kafka provides multiple message compression algorithms, including GZIP, Snappy, and LZ4.

Step 1: Install the Sarama library

First, make sure the Sarama library is installed; it can be installed with the following command:

go get github.com/Shopify/sarama

Step 2: Write message compression code

package main

import (
	"fmt"
	"log"
	"os"
	"os/signal"

	"github.com/Shopify/sarama"
)

func main() {
	// Set the Kafka broker addresses and create the configuration.
	config := sarama.NewConfig()
	config.Producer.RequiredAcks = sarama.WaitForAll
	config.Producer.Retry.Max = 5
	config.Producer.Return.Successes = true
	config.Producer.Compression = sarama.CompressionGZIP // set the message compression algorithm

	// Create the producer.
	producer, err := sarama.NewSyncProducer([]string{"kafka-broker1:9092", "kafka-broker2:9092"}, config)
	if err != nil {
		log.Fatalf("Error creating producer: %s", err.Error())
	}
	defer func() {
		if err := producer.Close(); err != nil {
			log.Fatalf("Error closing producer: %s", err.Error())
		}
	}()

	// Build the message.
	message := &sarama.ProducerMessage{
		Topic: "your_topic",                          // topic the message is sent to
		Value: sarama.StringEncoder("Hello, Kafka!"), // message payload
	}

	// Send the message.
	partition, offset, err := producer.SendMessage(message)
	if err != nil {
		log.Fatalf("Failed to send message: %s", err.Error())
	}

	fmt.Printf("Message sent to partition %d at offset %d\n", partition, offset)

	// Handle the exit signal.
	sigchan := make(chan os.Signal, 1)
	signal.Notify(sigchan, os.Interrupt)
	<-sigchan
}

Step 3: Explain the code

In this example, we set the message compression algorithm when creating the Kafka producer:

config.Producer.Compression = sarama.CompressionGZIP // set the message compression algorithm

Here, we chose the GZIP compression algorithm, but you can also choose other algorithms, such as Snappy or LZ4.
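
For reference, the corresponding sarama constants for those other codecs are (again alternatives, pick one):

config.Producer.Compression = sarama.CompressionSnappy // Snappy: low CPU cost, moderate compression ratio
config.Producer.Compression = sarama.CompressionLZ4    // LZ4: very fast compression and decompression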

Once the compression algorithm is set, messages sent by the producer will be compressed before being sent to reduce the amount of data transmitted over the network.

Kafka consumers automatically decompress messages after receiving them, so compression is transparent to the consumer.

3. Producer configuration

Step 1: Install the Sarama library

First, make sure the Sarama library is installed; it can be installed with the following command:

go get github.com/Shopify/sarama

Step 2: Write producer configuration code

package main

import (
	"fmt"
	"log"
	"os"
	"os/signal"

	"github.com/Shopify/sarama"
)

func main() {
	// Create the configuration.
	config := sarama.NewConfig()

	// Set producer parameters.
	config.Producer.RequiredAcks = sarama.WaitForAll          // wait for all in-sync replicas to acknowledge each message
	config.Producer.Retry.Max = 5                             // maximum number of retries
	config.Producer.Return.Successes = true                   // report successful deliveries
	config.Producer.Compression = sarama.CompressionGZIP      // message compression algorithm
	config.Producer.Partitioner = sarama.NewRandomPartitioner // random partitioning strategy

	// Create the producer.
	producer, err := sarama.NewSyncProducer([]string{"kafka-broker1:9092", "kafka-broker2:9092"}, config)
	if err != nil {
		log.Fatalf("Error creating producer: %s", err.Error())
	}
	defer func() {
		if err := producer.Close(); err != nil {
			log.Fatalf("Error closing producer: %s", err.Error())
		}
	}()

	// Build the message.
	message := &sarama.ProducerMessage{
		Topic: "your_topic",                          // topic the message is sent to
		Value: sarama.StringEncoder("Hello, Kafka!"), // message payload
	}

	// Send the message.
	partition, offset, err := producer.SendMessage(message)
	if err != nil {
		log.Fatalf("Failed to send message: %s", err.Error())
	}

	fmt.Printf("Message sent to partition %d at offset %d\n", partition, offset)

	// Handle the exit signal.
	sigchan := make(chan os.Signal, 1)
	signal.Notify(sigchan, os.Interrupt)
	<-sigchan
}

Step 3: Explain the code

In this example, we create a sarama.Config object and then set some common producer configuration parameters:

  1. config.Producer.RequiredAcks: wait for all in-sync replicas to acknowledge the message; the producer only considers a message successfully sent once every replica has written it.

  2. config.Producer.Retry.Max: the maximum number of retries. When a send fails, the producer retries up to this many times.

  3. config.Producer.Return.Successes: whether to report successful deliveries. If set to true, the producer returns a confirmation when a message is sent successfully (the synchronous producer requires this to be true).

  4. config.Producer.Compression: the message compression algorithm. GZIP is selected here; you can choose other algorithms as needed.

  5. config.Producer.Partitioner: the partitioning strategy. The random partitioning strategy is selected here, meaning each message is sent to a random partition.

In this example, we are using a synchronous producer (NewSyncProducer), but you can also choose an asynchronous producer (NewAsyncProducer) if needed.
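
To make that concrete, here is a minimal sketch of the asynchronous variant, reusing the hypothetical broker address and topic from above. With AsyncProducer, messages are queued on the Input() channel and delivery results come back on the Successes() and Errors() channels rather than as return values:

package main

import (
	"fmt"
	"log"

	"github.com/Shopify/sarama"
)

func main() {
	config := sarama.NewConfig()
	config.Producer.Return.Successes = true // deliver success results on Successes()
	config.Producer.Return.Errors = true    // deliver failures on Errors()

	// NewAsyncProducer returns immediately; sends happen in the background.
	producer, err := sarama.NewAsyncProducer([]string{"kafka-broker1:9092"}, config)
	if err != nil {
		log.Fatalf("Error creating async producer: %s", err.Error())
	}

	// Queue a message on the Input() channel instead of making a blocking call.
	producer.Input() <- &sarama.ProducerMessage{
		Topic: "your_topic",
		Value: sarama.StringEncoder("Hello, async Kafka!"),
	}

	// Delivery results arrive on channels. With Return.Successes enabled,
	// every result must be read or the producer will eventually block.
	select {
	case msg := <-producer.Successes():
		fmt.Printf("Message sent to partition %d at offset %d\n", msg.Partition, msg.Offset)
	case prodErr := <-producer.Errors():
		log.Printf("Failed to send message: %s", prodErr.Err)
	}

	if err := producer.Close(); err != nil {
		log.Printf("Error closing async producer: %s", err)
	}
}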

4. Partitioner

When we talk about partitioners in the Kafka producer API, we are referring to the mechanism that determines which partition a message is sent to.

Partitioners help producers determine which partition of a Kafka topic a message is written to, which is a key design decision because it affects message ordering, load balancing, and parallelism.

4.1 The role of the partitioner

Kafka topics are usually divided into multiple partitions, and each partition is an ordered log. Messages are sent to specific partitions, which helps maintain message order. The task of the partitioner is to decide which partition the message belongs to based on certain rules.
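
Before writing a custom partitioner, note that sarama already ships several built-in strategies. These constructors can be assigned directly to config.Producer.Partitioner (shown as alternatives, pick one):

config.Producer.Partitioner = sarama.NewHashPartitioner       // hash the message key (sarama's default for keyed messages)
config.Producer.Partitioner = sarama.NewRandomPartitioner     // pick a random partition
config.Producer.Partitioner = sarama.NewRoundRobinPartitioner // walk through the partitions in order
config.Producer.Partitioner = sarama.NewManualPartitioner     // use the Partition field set on the message itself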

4.2 Examples and detailed explanations

package main

import (
	"fmt"
	"log"
	"os"
	"os/signal"

	"github.com/Shopify/sarama"
)

// CustomPartitioner is a custom partitioner that delegates to sarama's
// built-in hash and random partitioners.
type CustomPartitioner struct {
	hash   sarama.Partitioner
	random sarama.Partitioner
}

// NewCustomPartitioner is a sarama.PartitionerConstructor; sarama calls it
// once per topic to obtain a partitioner instance.
func NewCustomPartitioner(topic string) sarama.Partitioner {
	return &CustomPartitioner{
		hash:   sarama.NewHashPartitioner(topic),
		random: sarama.NewRandomPartitioner(topic),
	}
}

// Partition implements the sarama.Partitioner interface.
func (p *CustomPartitioner) Partition(message *sarama.ProducerMessage, numPartitions int32) (int32, error) {
	// Custom partitioning logic goes here. This example hashes the message
	// key modulo the partition count, via the built-in hash partitioner.
	if message.Key == nil {
		// If the message has no key, fall back to a random partition.
		return p.random.Partition(message, numPartitions)
	}
	return p.hash.Partition(message, numPartitions)
}

// RequiresConsistency implements the sarama.Partitioner interface: messages
// with the same key must always be routed to the same partition.
func (p *CustomPartitioner) RequiresConsistency() bool {
	return true
}

func main() {
	// Create the configuration.
	config := sarama.NewConfig()

	// Register the custom partitioner (the field takes a constructor function).
	config.Producer.Partitioner = NewCustomPartitioner

	// Create the producer.
	producer, err := sarama.NewSyncProducer([]string{"kafka-broker1:9092", "kafka-broker2:9092"}, config)
	if err != nil {
		log.Fatalf("Error creating producer: %s", err.Error())
	}
	defer func() {
		if err := producer.Close(); err != nil {
			log.Fatalf("Error closing producer: %s", err.Error())
		}
	}()

	// Build the message.
	message := &sarama.ProducerMessage{
		Topic: "your_topic",                          // topic the message is sent to
		Value: sarama.StringEncoder("Hello, Kafka!"), // message payload
		Key:   sarama.StringEncoder("some_key"),      // message key used for partitioning
	}

	// Send the message.
	partition, offset, err := producer.SendMessage(message)
	if err != nil {
		log.Fatalf("Failed to send message: %s", err.Error())
	}

	fmt.Printf("Message sent to partition %d at offset %d\n", partition, offset)

	// Handle the exit signal.
	sigchan := make(chan os.Signal, 1)
	signal.Notify(sigchan, os.Interrupt)
	<-sigchan
}

4.3 Explain the code

In this example, we first create a custom partitioner, CustomPartitioner, that implements the sarama.Partitioner interface (its Partition and RequiresConsistency methods). In the Partition method we define our own partitioning logic: here we simply hash the message's key modulo the number of partitions, by delegating to sarama's built-in hash partitioner. If the message has no key, we fall back to a random partition.

func (p *CustomPartitioner) Partition(message *sarama.ProducerMessage, numPartitions int32) (int32, error) {
	if message.Key == nil {
		return p.random.Partition(message, numPartitions)
	}
	return p.hash.Partition(message, numPartitions)
}

Next, we register the custom partitioner in the producer's configuration. Note that config.Producer.Partitioner expects a constructor function (a sarama.PartitionerConstructor), not a partitioner instance, which is why we pass NewCustomPartitioner rather than &CustomPartitioner{}:

config.Producer.Partitioner = NewCustomPartitioner

This way, when sending a message, the producer will use our custom partitioning logic to determine which partition the message should be sent to.
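
As a quick usage sketch (reusing the hypothetical topic from above, with "user-42" as an illustrative key): because the hash logic is deterministic, messages that share a key always land in the same partition, which is what preserves per-key ordering:

// Messages that share a key are routed by the hash logic above to the same partition.
msgA := &sarama.ProducerMessage{Topic: "your_topic", Key: sarama.StringEncoder("user-42"), Value: sarama.StringEncoder("first")}
msgB := &sarama.ProducerMessage{Topic: "your_topic", Key: sarama.StringEncoder("user-42"), Value: sarama.StringEncoder("second")}
// SendMessage returns the same partition number for both messages.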

5. Serializer

When we talk about serializers in the Kafka producer API, we are referring to the process of converting messages into a stream of bytes so that they can be transmitted over the network.

In Kafka, messages need to be serialized into a byte stream before they can be sent to the Kafka cluster by the producer. The serializer is responsible for converting the message object into a byte stream and deserializing the byte stream into the original message object on the consumer side.

This process is to ensure that messages can be efficiently transmitted and stored in the network.

5.1 Why is serialization needed?

Kafka is a distributed messaging system, and messages need to be delivered between different nodes. In order to achieve this cross-network transmission, the message must be represented in the form of a byte stream. Serialization is the process of converting a message object into a byte stream.

5.2 Examples and detailed explanations

Let us illustrate the serialization process with a simple example.

Suppose we have a message structure:

type MyMessage struct {
	ID   int
	Body string
}

First, we need to define a serializer to convert this structure into a byte stream:

package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
	"log"
)

// MyMessage is the message struct.
type MyMessage struct {
	ID   int
	Body string
}

// Serializer serializes and deserializes MyMessage values.
type Serializer struct{}

// Serialize converts a message object into a byte stream.
func (s *Serializer) Serialize(message *MyMessage) ([]byte, error) {
	var buffer bytes.Buffer
	encoder := gob.NewEncoder(&buffer)
	err := encoder.Encode(message)
	if err != nil {
		return nil, err
	}
	return buffer.Bytes(), nil
}

// Deserialize converts a byte stream back into a message object.
func (s *Serializer) Deserialize(data []byte) (*MyMessage, error) {
	var message MyMessage
	decoder := gob.NewDecoder(bytes.NewReader(data))
	err := decoder.Decode(&message)
	if err != nil {
		return nil, err
	}
	return &message, nil
}

func main() {
	// Create a message object.
	message := &MyMessage{
		ID:   1,
		Body: "Hello, Kafka!",
	}

	// Create the serializer.
	serializer := &Serializer{}

	// Serialize the message.
	serializedData, err := serializer.Serialize(message)
	if err != nil {
		log.Fatalf("Error serializing message: %s", err.Error())
	}

	// Print the serialized byte stream.
	fmt.Printf("Serialized data: %v\n", serializedData)

	// Deserialize the message.
	deserializedMessage, err := serializer.Deserialize(serializedData)
	if err != nil {
		log.Fatalf("Error deserializing message: %s", err.Error())
	}

	// Print the deserialized message object.
	fmt.Printf("Deserialized message: %+v\n", deserializedMessage)
}

5.3 Explain the code

In this example, we first define a simple message structure, MyMessage, and then create a serializer, Serializer, that implements the Serialize and Deserialize methods. The Serialize method converts a message object into a byte stream, while the Deserialize method turns a byte stream back into a message object.

We use the encoding/gob package for serialization and deserialization. In actual applications you may choose other serialization methods, such as JSON or Avro, depending on your needs and system compatibility.
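
As a sketch of one such alternative (the json struct tags are illustrative additions), here is the same round trip using encoding/json, plus how the resulting bytes would be attached to a producer message via sarama.ByteEncoder:

package main

import (
	"encoding/json"
	"fmt"
	"log"

	"github.com/Shopify/sarama"
)

// MyMessage mirrors the struct from the gob example above.
type MyMessage struct {
	ID   int    `json:"id"`
	Body string `json:"body"`
}

func main() {
	// Serialize with encoding/json instead of encoding/gob.
	data, err := json.Marshal(&MyMessage{ID: 1, Body: "Hello, Kafka!"})
	if err != nil {
		log.Fatalf("Error serializing message: %s", err.Error())
	}
	fmt.Printf("Serialized data: %s\n", data)

	// Raw serialized bytes are attached to a producer message via ByteEncoder.
	msg := &sarama.ProducerMessage{
		Topic: "your_topic",
		Value: sarama.ByteEncoder(data),
	}
	_ = msg // hand msg to SendMessage as in the earlier examples

	// Deserialize back into the struct (what a consumer would do).
	var decoded MyMessage
	if err := json.Unmarshal(data, &decoded); err != nil {
		log.Fatalf("Error deserializing message: %s", err.Error())
	}
	fmt.Printf("Deserialized message: %+v\n", decoded)
}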

Finally, we create a message object, serialize it to a byte stream using a serializer, and print the result. Then, we deserialize the byte stream into a message object and print the result. This demonstrates the basic use of serialization in a practical application.
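
One more note connecting this back to the producer API: the gob example above never touches Kafka. In sarama, the ProducerMessage.Value field accepts any type implementing the sarama.Encoder interface (Encode() ([]byte, error) and Length() int), so a custom serializer can plug in directly. A minimal sketch, assuming the same MyMessage type:

package main

import (
	"bytes"
	"encoding/gob"

	"github.com/Shopify/sarama"
)

// MyMessage is the same struct as in the example above.
type MyMessage struct {
	ID   int
	Body string
}

// Encode implements sarama.Encoder by gob-encoding the struct.
func (m *MyMessage) Encode() ([]byte, error) {
	var buffer bytes.Buffer
	if err := gob.NewEncoder(&buffer).Encode(m); err != nil {
		return nil, err
	}
	return buffer.Bytes(), nil
}

// Length implements sarama.Encoder; sarama uses it for buffer sizing.
// A production implementation would cache the encoded bytes rather than
// encoding twice.
func (m *MyMessage) Length() int {
	data, err := m.Encode()
	if err != nil {
		return 0
	}
	return len(data)
}

func main() {
	// The custom type can now be used directly as a message value and will
	// be serialized by sarama when the message is sent.
	msg := &sarama.ProducerMessage{
		Topic: "your_topic",
		Value: &MyMessage{ID: 1, Body: "Hello, Kafka!"},
	}
	_ = msg // pass msg to a producer's SendMessage as in the earlier examples
}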
