An in-depth yet simple analysis of Kafka client programming ----- Producer chapter

The earlier articles on understanding Kafka in depth covered only the theoretical design principles.

This article is about programming against the Kafka client library librdkafka, using its C++ API.

To write producer code, you first need to know how the producer's logic is reflected in code.

1. The logic of the Kafka producer

How should we understand it?

Before instantiating the producer object, we must configure some parameters, for example the conf introduced below.

Once the parameters are configured, we create the producer instance. With the producer instantiated, it is ready to produce messages. To produce a message, we first initialize and build the message and then send it. Managing messages as objects makes them easier to extend, maintain, and manage later, and also easier for consumers to read, so mistakes are less likely.

After constructing the message, we hand it to the producer, which produces it to the message queue of the specified Kafka topic (that is, to a partition within the topic, since each partition is an independent queue). Once the message reaches the message queue, the send is complete and the message waits there for a consumer to consume it.

When the producer is no longer needed, it can be closed.
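
A minimal sketch of this whole flow, assuming a local broker at localhost:9092 and a topic named test_topic (both placeholders):

#include <iostream>
#include <string>
#include <librdkafka/rdkafkacpp.h>

int main() {
    std::string errstr;

    // 1. Configure parameters
    RdKafka::Conf *conf = RdKafka::Conf::create(RdKafka::Conf::CONF_GLOBAL);
    conf->set("bootstrap.servers", "localhost:9092", errstr);  // placeholder address

    // 2. Instantiate the producer (conf is not owned by the producer; we delete it ourselves)
    RdKafka::Producer *producer = RdKafka::Producer::create(conf, errstr);
    if (!producer) { std::cerr << errstr << std::endl; return 1; }

    // 3. Build the "message": topic handle plus payload
    RdKafka::Topic *topic = RdKafka::Topic::create(producer, "test_topic", nullptr, errstr);
    std::string payload = "hello kafka";

    // 4. Hand the message to the producer (it goes to a partition queue of the topic)
    producer->produce(topic, RdKafka::Topic::PARTITION_UA,
                      RdKafka::Producer::RK_MSG_COPY,
                      const_cast<char *>(payload.data()), payload.size(),
                      nullptr /*key*/, nullptr /*msg_opaque*/);
    producer->poll(0);  // serve callbacks

    // 5. Close the producer when it is no longer needed
    producer->flush(10000);
    delete topic;
    delete producer;
    delete conf;
    return 0;
}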

 

Configuration parameters:

  • Before instantiating the producer object, you need to configure the producer's parameters. This is typically done by creating a RdKafka::Conf object and setting configuration options on it with its set methods. Options include the address of the Kafka cluster, messaging semantics (e.g., at-least-once or exactly-once delivery), serializers, partitioners, and so on. The Conf object is then passed to the constructor when instantiating the producer.

 Example:

std::string errstr;
RdKafka::Conf *conf = RdKafka::Conf::create(RdKafka::Conf::CONF_GLOBAL);
conf->set("bootstrap.servers", "your_kafka_broker", errstr);

// Set other configuration options...

RdKafka::Producer *producer = RdKafka::Producer::create(conf, errstr);

Instantiate the producer:

  • Use the configured parameters to create a Kafka producer instance. This is achieved by calling the RdKafka::Producer::create() function. Passing a configuration object as a parameter ensures that the producer has the required configuration when it is created.

Build the message object:

  • After the producer is ready, you can construct the message. Conceptually this means specifying the topic, key, payload, and other properties of the message. Note that in librdkafka's C++ API the producer side has no public constructor for RdKafka::Message; instead, the topic, partition, payload, and key are passed directly to the produce() call, and RdKafka::Message objects are what you receive back in delivery reports and on the consumer side.

RdKafka::Producer *producer = RdKafka::Producer::create(conf, errstr);

// The "message object" is assembled by produce() itself: the topic handle
// (created as in section 2.7), partition, payload, and key are passed as
// arguments; RK_MSG_COPY copies the payload into librdkafka.
std::string payload = "Your message payload";
producer->produce(topic, RdKafka::Topic::PARTITION_UA,
                  RdKafka::Producer::RK_MSG_COPY,
                  const_cast<char *>(payload.data()), payload.size(),
                  nullptr /*key*/, nullptr /*msg_opaque*/);

Produce the message:

  • Call the producer's produce() method to send the message to the Kafka cluster. In this step the message is placed into a buffer inside the producer and then sent asynchronously to the Kafka server. produce() returns an error code that you can check to learn the status of the send, as sketched below.
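
A sketch of checking that error code, assuming producer, topic, payload, and key were created as in the surrounding examples:

RdKafka::ErrorCode err = producer->produce(
    topic, RdKafka::Topic::PARTITION_UA, RdKafka::Producer::RK_MSG_COPY,
    const_cast<char *>(payload.data()), payload.size(),
    &key,       // std::string key, also used by the partitioner
    nullptr);   // msg_opaque, passed through to the delivery report
if (err == RdKafka::ERR__QUEUE_FULL) {
    producer->poll(100);  // serve callbacks to drain the queue, then the application may retry
} else if (err != RdKafka::ERR_NO_ERROR) {
    std::cerr << "produce failed: " << RdKafka::err2str(err) << std::endl;
}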

Polling:

  • To ensure that the message delivery report callback (RdKafka::DeliveryReportCb) is invoked, you need to call producer->poll() regularly. This is usually done on a separate thread so that delivery reports are processed in a timely manner.

producer->poll(0);  // a timeout of 0 means a non-blocking poll


Close the producer:

  • When the producer is no longer needed, release its resources with delete. Before releasing them, you may need to call flush() to ensure all pending messages have been sent.

producer->flush(10000);  // wait at most 10 seconds
delete producer;

2. Kafka's C++ API

2.1 RdKafka::Conf: this corresponds to the "configure client parameters" step in the logic described above

enum ConfType{ 
	CONF_GLOBAL, 	// global configuration 
	CONF_TOPIC 		// topic configuration 
};
enum ConfResult{ 
	CONF_UNKNOWN = -2, 
	CONF_INVALID = -1, 
	CONF_OK = 0 
};
CONF_UNKNOWN: the configuration is unknown, possibly because no related validation or check was performed.
CONF_INVALID: the configuration is invalid, e.g. the value is outside the expected range or format.
CONF_OK: the configuration is valid and passed validation.
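
For example, checking the ConfResult returned by set() (a minimal sketch; the broker address is a placeholder):

std::string errstr;
RdKafka::Conf *conf = RdKafka::Conf::create(RdKafka::Conf::CONF_GLOBAL);
// set() returns CONF_OK on success, CONF_UNKNOWN / CONF_INVALID otherwise
if (conf->set("bootstrap.servers", "localhost:9092", errstr) != RdKafka::Conf::CONF_OK)
    std::cerr << "Conf set failed: " << errstr << std::endl;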

You don't need to memorize all of these interfaces; just keep them for reference and look them up when you forget. Remember the main points.

static Conf * create(ConfType type);
//Create a configuration object.

Conf::ConfResult set(const std::string &name, const std::string &value, std::string &errstr);
//Set a property on the configuration object; returns CONF_OK on success, and writes an error message to errstr on failure.

Conf::ConfResult set(const std::string &name, DeliveryReportCb *dr_cb, std::string &errstr);
//Set the dr_cb property.

Conf::ConfResult set(const std::string &name, EventCb *event_cb, std::string &errstr);
//Set the event_cb property.

Conf::ConfResult set(const std::string &name, const Conf *topic_conf, std::string &errstr);
//Set the default topic configuration used when auto-subscribing to topics.

Conf::ConfResult set(const std::string &name, PartitionerCb *partitioner_cb, std::string &errstr);
//Set the partitioner_cb property; the configuration object must be of type CONF_TOPIC.

Conf::ConfResult set(const std::string &name, PartitionerKeyPointerCb *partitioner_kp_cb,std::string &errstr);
//Set the partitioner_key_pointer_cb property.

Conf::ConfResult set(const std::string &name, SocketCb *socket_cb, std::string &errstr);
//Set the socket_cb property.

Conf::ConfResult set(const std::string &name, OpenCb *open_cb, std::string &errstr);
//Set the open_cb property.

Conf::ConfResult set(const std::string &name, RebalanceCb *rebalance_cb, std::string &errstr);
//Set the rebalance_cb property.

Conf::ConfResult set(const std::string &name, OffsetCommitCb *offset_commit_cb, std::string &errstr);
//Set the offset_commit_cb property.

Conf::ConfResult get(const std::string &name, std::string &value) const;
//Query the value of a single configuration property.

2.2 RdKafka::Message

Message represents a consumed or produced message, or an event. It corresponds to the "build the message object" step in the production logic above.

The following interfaces are provided on the Message object (they expose the content encapsulated in the message):

std::string errstr() const;
//If the message is an error event, returns the error string; otherwise returns an empty string.

ErrorCode err() const;
//If the message is an error event, returns the error code; otherwise returns 0.

Topic * topic() const;
//Returns the message's Topic object. If the Topic object was not explicitly created with RdKafka::Topic::create(), use topic_name() instead.

std::string topic_name() const;
//Returns the message's topic name.

int32_t partition() const;
//Returns the partition number, if the partition is available.

void * payload() const;
//Returns the message data.

size_t len() const;
//Returns the length of the message data.

const std::string * key() const;
//Returns the message key as a string.

const void * key_pointer() const;
//Returns the message key as a void pointer.

size_t key_len() const;
//Returns the binary length of the message key.

int64_t offset () const;
//Returns the offset of the message or error.

void * msg_opaque() const;
//Returns the msg_opaque that was supplied to RdKafka::Producer::produce().

virtual MessageTimestamp timestamp() const = 0;
//Returns the message timestamp.

virtual int64_t latency() const = 0;
//Returns the microsecond-level latency since the message was produced by the produce call, or -1 if the latency is not available.

virtual struct rd_kafka_message_s *c_ptr () = 0;
//Returns the underlying C rd_kafka_message_t handle.

virtual Status status () const = 0;
//Returns the persistence status of the message in the topic log.

virtual RdKafka::Headers *headers () = 0;
//Returns the message headers.

virtual RdKafka::Headers *headers (RdKafka::ErrorCode *err) = 0;
//Returns the message headers; error information is written to err.

Key points:

In RdKafka::Message, the most important and commonly used member functions and properties include:

  1. err(): This function can be used to obtain the error code of the message, which is used to check whether an error occurred during the production or consumption of the message.

  2. len(): Returns the length of the message, indicating the number of bytes in the message body.

  3. payload(): Provides access to the actual content (payload) of the message.

  4. topic_name(): Returns the topic name to which the message belongs.

  5. partition(): Returns the partition number where the message is located.

  6. offset(): Returns the offset of the message, indicating the position of the message in the partition.

These member functions and properties carry the most important information when processing Kafka messages. With them, you can check the status of a message, understand its source and content, and track its position on the consumer side. Other accessors, such as key() for retrieving the message key, are optional and depend on whether the message carries a key.
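
As a sketch, a hypothetical helper that reads these fields from a message (for example inside a delivery report or after a consume call) could look like this:

#include <iostream>
#include <string>
#include <librdkafka/rdkafkacpp.h>

// Hypothetical helper: print the commonly used fields of a message.
void inspect(RdKafka::Message &msg) {
    if (msg.err()) {  // always check the error code first
        std::cerr << "message error: " << msg.errstr() << std::endl;
        return;
    }
    std::cout << "topic=" << msg.topic_name()
              << " partition=" << msg.partition()
              << " offset=" << msg.offset()
              << " len=" << msg.len() << std::endl;
    if (msg.key())  // the key is optional
        std::cout << "key=" << *msg.key() << std::endl;
    // payload() is a raw pointer; pair it with len() to copy the body
    std::string value(static_cast<const char *>(msg.payload()), msg.len());
}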

2.3 RdKafka::DeliveryReportCb


Each time a message produced by the RdKafka::Producer::produce() function is delivered (or fails permanently), the delivery report callback function is called, and RdKafka::Message::err() identifies the result of the produce request.
For the queued delivery report callbacks to be served, the RdKafka::poll() function must be called.

The delivery report callback (RdKafka::DeliveryReportCb) receives notifications of message delivery results after the Kafka producer sends a message. Its main purpose is to confirm whether the message was successfully delivered to the Kafka server and what the final result of processing was.

The delivery report function serves the following purposes:

  1. Confirm whether the message was successfully sent: Once the message is sent to the Kafka server by the producer, the delivery report function is called. This allows you to know if the message has successfully reached the server.

  2. Track message delivery status: The delivery reporting function provides information about the message delivery status. By checking the error code of the message (obtained through RdKafka::Message::err()), you can understand whether the message was successfully delivered to the partition, and possible reasons for the error, such as message sending timeout, partition does not exist, etc.

  3. Ensure message processing: This callback function can help you ensure that the message has been processed, whether it was successfully sent or some error occurred. Through error codes, you can appropriately handle problems during message sending, such as retrying, recording error logs, or performing other remedial measures.

Throughout the process, the delivery reporting function is used to provide the status and results of message delivery. It allows you to track the processing of messages, ensure that messages are successfully sent to the Kafka server, and be notified and handled in a timely manner when problems occur. Therefore, in an actual production environment, it is very important to handle this callback function in a timely manner to ensure reliable delivery of messages.

virtual void dr_cb(Message &message)=0;

This is a pure virtual function; subclass it and override the method.

When a message is successfully produced or rdkafka encounters a permanent failure or the number of retries is exhausted, the delivery report callback function will be called.

 C++ encapsulation example:

class ProducerDeliveryReportCb : public RdKafka::DeliveryReportCb
{
public:
	void dr_cb(RdKafka::Message &message)
	{
		if(message.err())
			std::cerr << "Message delivery failed: " << message.errstr() << std::endl;
		else
		{
			// Message delivered to topic test [0] at offset 135000
			std::cerr << "Message delivered to topic " << message.topic_name()
				<< " [" << message.partition() << "] at offset "
				<< message.offset() << std::endl;
		}
	}
};

2.4 RdKafka::Event

The Event object.

RdKafka::Event is a class representing Kafka events; it encapsulates event-related information. The event types are EVENT_ERROR, EVENT_STATS, EVENT_LOG, and EVENT_THROTTLE, and each event exposes properties and methods to obtain the event type, error code, log information, and so on.

enum Type{ 
	EVENT_ERROR,    // error condition event 
	EVENT_STATS,    // JSON document statistics event 
	EVENT_LOG,      // log message event 
	EVENT_THROTTLE  // throttle-level signal event from the broker 
};
virtual Type type() const =0;
//Returns the event type.
virtual ErrorCode err() const =0;
//Returns the event error code.
virtual Severity severity() const =0;
//Returns the log severity level.
virtual std::string fac() const =0;
//Returns the log facility string.
virtual std::string str () const =0;
//Returns the log message string.
virtual int throttle_time() const =0;
//Returns the throttle time.
virtual std::string broker_name() const =0;
//Returns the broker name.
virtual int broker_id() const =0;
//Returns the broker ID.
  1. type(): Returns the event type. Types include error condition events (EVENT_ERROR), JSON document statistics events (EVENT_STATS), log message events (EVENT_LOG), and throttle-level signal events from the broker (EVENT_THROTTLE).

  2. err(): Returns the error code of the event if the event type is an error condition event.

  3. severity(): Returns the severity level of the log message.

  4. fac(): Returns the underlying string of the log message.

  5. str(): Returns the string of the log message.

  6. throttle_time(): If the event type is a throttle-level signal event, return the throttle time.

  7. broker_name(): Returns the name of the Broker associated with the event.

  8. broker_id(): Returns the ID of the Broker associated with the event.

2.5 RdKafka::EventCb

The event callback.

An abstract base class that defines the event callback function for processing RdKafka::Event.

Events are a generic interface for passing errors, statistics, logs, and other messages from RdKafka to the application.

virtual void event_cb(Event &event)=0; // event callback function

This is a pure virtual function; subclass it and override the method.

C++ encapsulation example:

class ProducerEventCb : public RdKafka::EventCb
{
public:
    void event_cb(RdKafka::Event &event)
    {
        switch(event.type())
        {
        case RdKafka::Event::EVENT_ERROR:
            std::cout << "RdKafka::Event::EVENT_ERROR: " << RdKafka::err2str(event.err()) << std::endl;
            break;
        case RdKafka::Event::EVENT_STATS:
            std::cout << "RdKafka::Event::EVENT_STATS: " << event.str() << std::endl;
            break;
        case RdKafka::Event::EVENT_LOG:
            std::cout << "RdKafka::Event::EVENT_LOG " << event.fac() << std::endl;
            break;
        case RdKafka::Event::EVENT_THROTTLE:
            std::cout << "RdKafka::Event::EVENT_THROTTLE " << event.broker_name() << std::endl;
            break;
        }
    }
};

This callback is invoked when a Kafka event occurs, and the corresponding RdKafka::Event object is passed to the application. The application implements its own RdKafka::EventCb subclass and overrides event_cb to handle specific events, so it is notified of errors, statistics, logs, or throttle-level signals from the broker. With event handling in place, the producer logic becomes the following:

  1. Configuration parameters: You first configure the parameters of the producer, including the address of the Kafka cluster, topic configuration, etc.

  2. Create a producer instance: Create a producer instance using the configured parameters.

  3. Prepare producers: Before producing messages, you may need to do some preparation work, such as initializing and building message objects.

  4. Production message: Hand the constructed message object to the producer, and let the producer send the message to the specified Kafka topic.

  5. Handling events: This is the role of the above RdKafka::Event and RdKafka::EventCb. During the life cycle of the producer, some asynchronous events may occur, such as errors, log information, etc. By setting RdKafka::EventCb, you can be notified when the corresponding event occurs and execute your own processing logic.

  6. Close the producer: If the producer is no longer needed, remember to close it to free up resources.

Here is an example:

class MyEventCallback : public RdKafka::EventCb {
public:
    void event_cb(RdKafka::Event &event) override {
        // Event-handling logic
        switch (event.type()) {
            case RdKafka::Event::EVENT_ERROR:
                // handle error events
                break;
            case RdKafka::Event::EVENT_STATS:
                // handle statistics events
                break;
            // other event types can be handled here
            default:
                break;
        }
    }
};

int main() {
    std::string errstr;

    // Configure producer parameters
    RdKafka::Conf *conf = RdKafka::Conf::create(RdKafka::Conf::CONF_GLOBAL);
    // Set the event callback
    MyEventCallback eventCallback;
    conf->set("event_cb", &eventCallback, errstr);

    // Create the producer instance
    RdKafka::Producer *producer = RdKafka::Producer::create(conf, errstr);
    if (!producer) {
        // handle producer creation failure
        return 1;
    }

    // Prepare the producer...

    // Produce messages...

    // Handle events...

    // Close the producer
    delete producer;
    delete conf;

    return 0;
}

2.6 RdKafka::PartitionerCb

partitioner_cb is a callback used to customize the message partitioning strategy. It is called when a message is produced, to decide which partition of the Kafka topic the message should be sent to. When producing messages, you may want your own logic to pick the partition instead of the default partitioning strategy.

PartitionerCb is used to implement a custom partitioning strategy, and you need to use RdKafka::Conf::set() to set the partitioner_cb attribute.

virtual int32_t partitioner_cb(const Topic *topic, const std::string *key, int32_t partition_cnt,void *msg_opaque)=0;
//Partitioner callback function

Returns the partition for the given key in topic topic. The key can be NULL or a string.
partition_cnt is the number of partitions in the topic (used for hash calculation).
The return value must be between 0 and partition_cnt; if partitioning fails, RD_KAFKA_PARTITION_UA (-1) may be returned.
msg_opaque is the same msg_opaque supplied to the RdKafka::Producer::produce() call.

This callback function receives the following:

  • A pointer to the topic, topic.
  • A string pointer to the message key, key. The key may be null or a string.
  • An integer partition_cnt giving the number of partitions in the topic, used to decide which partition the message will be distributed to.
  • A pointer to the message's opaque data, msg_opaque, the same msg_opaque passed by the producer when issuing the message.

The callback function needs to return an integer value indicating the partition to which the message should be sent. This return value must be between 0 and partition_cnt. If the partition fails, RD_KAFKA_PARTITION_UA (-1) can be returned.

The purpose of the partitioner_cb callback is that whenever the producer needs to decide, while sending a message to a Kafka topic, which partition the message should go to, it calls this function. You implement the callback with your own logic and let it choose the partition based on the message's key or other characteristics; in this way you customize the partitioning strategy.

After configuring the parameters and creating a producer instance, you use RdKafka::Conf::set() to set the partitioner_cb property on the topic configuration, registering your custom partitioning function. Then, when you send a message with the producer, the Kafka client calls your partitioner_cb to determine the target partition.

Once a message is assigned to a partition, the producer sends it to that partition's message queue, from which consumers read.

When you have finished sending messages, you can shut down the producer instance.

C++ encapsulation example:

class HashPartitionerCb : public RdKafka::PartitionerCb
{
public:
    int32_t partitioner_cb (const RdKafka::Topic *topic, const std::string *key,
                            int32_t partition_cnt, void *msg_opaque)
    {
        char msg[128] = {0};
        // Note: key may be NULL when a message is produced without a key.
        int32_t partition_id = generate_hash(key->c_str(), key->size()) % partition_cnt;
        // Output format: [topic][key][partition_cnt][partition_id]
        // e.g.: [test][6419][2][1]
        sprintf(msg, "HashPartitionerCb:topic:[%s], key:[%s], partition_cnt:[%d], partition_id:[%d]", topic->name().c_str(),
                key->c_str(), partition_cnt, partition_id);
        std::cout << msg << std::endl;
        return partition_id;
    }
private:

    static inline unsigned int generate_hash(const char *str, size_t len)
    {
        unsigned int hash = 5381;
        for (size_t i = 0 ; i < len ; i++)
            hash = ((hash << 5) + hash) + str[i];
        return hash;
    }
};

Pseudocode example:
 

#include <iostream>
#include <string>
#include <librdkafka/rdkafkacpp.h>

class MyPartitionerCallback : public RdKafka::PartitionerCb {
public:
    int32_t partitioner_cb(const RdKafka::Topic *topic, const std::string *key, int32_t partition_cnt, void *msg_opaque) override {
        // Custom partitioning logic:
        // decide here, from the message key or other criteria, which partition
        // the message should be distributed to, using topic, key, partition_cnt, etc.

        // Suppose the custom logic simply hashes the key to pick a partition.
        if (key) {
            std::hash<std::string> hasher;
            size_t hash_value = hasher(*key);
            return static_cast<int32_t>(hash_value % partition_cnt);
        } else {
            // If the key is empty, fall back to the unassigned partition
            return RD_KAFKA_PARTITION_UA;
        }
    }
};

int main() {
    std::string errstr;

    // Create the global and topic configuration objects
    RdKafka::Conf *conf = RdKafka::Conf::create(RdKafka::Conf::CONF_GLOBAL);
    RdKafka::Conf *tconf = RdKafka::Conf::create(RdKafka::Conf::CONF_TOPIC);

    // Register the partitioner callback (it must be set on a CONF_TOPIC object)
    MyPartitionerCallback partitioner_callback;
    tconf->set("partitioner_cb", &partitioner_callback, errstr);

    // Create the producer instance
    RdKafka::Producer *producer = RdKafka::Producer::create(conf, errstr);
    if (!producer) {
        // Handle producer creation failure
        std::cerr << "Failed to create producer: " << errstr << std::endl;
        delete tconf;
        delete conf;
        return -1;
    }

    // Create the topic handle with the topic configuration
    RdKafka::Topic *topic = RdKafka::Topic::create(producer, "my_topic", tconf, errstr);

    // Produce a message; PARTITION_UA lets the partitioner pick the partition
    std::string payload = "payload";
    std::string key = "key";
    RdKafka::ErrorCode resp = producer->produce(
        topic, RdKafka::Topic::PARTITION_UA, RdKafka::Producer::RK_MSG_COPY,
        const_cast<char *>(payload.data()), payload.size(), &key, nullptr);

    if (resp != RdKafka::ERR_NO_ERROR) {
        // Handle message send failure
        std::cerr << "Failed to produce message: " << RdKafka::err2str(resp) << std::endl;
    }

    // More messages can be produced here

    // Close the producer
    delete topic;
    delete producer;
    delete tconf;
    delete conf;

    return 0;
}

2.7 RdKafka::Topic

The Topic object, a logical unit in Kafka.

RdKafka::Topic manages operations related to Kafka topics.

static Topic * create(Handle *base, const std::string &topic_str, Conf *conf, std::string &errstr);
//Create a Topic handle named topic_str using the conf configuration.

const std::string name ();
//Get the topic name.

bool partition_available(int32_t partition) const;
//Check whether the partition is available; may only be called from within the RdKafka::PartitionerCb callback.

ErrorCode offset_store(int32_t partition, int64_t offset);
//Store the offset for the topic's partition. Only usable with RdKafka::Consumer, not with the high-level RdKafka::KafkaConsumer interface.
//When using this interface, the auto.commit.enable parameter must be set to false.

virtual struct rd_kafka_topic_s *c_ptr () = 0;
//Returns the underlying rd_kafka_topic_t handle. Calling the C API through this handle is not recommended, but if the C++ API
//does not provide some functionality, you can use the C API to interact with the librdkafka core directly.
static const int32_t PARTITION_UA = -1;		//unassigned partition
static const int64_t OFFSET_BEGINNING = -2;	//special offset: consume from the beginning
static const int64_t OFFSET_END = -1;		//special offset: consume from the end
static const int64_t OFFSET_STORED = -1000;	//use the stored offset
  1. PARTITION_UA (-1): This constant represents an unspecified partition. In some cases, if you do not want to bind the consumer to a specific partition, you can use this constant to represent an unassigned partition.

  2. OFFSET_BEGINNING (-2): This constant indicates that messages will be consumed from the beginning of the partition. You can use this constant if you want to start consuming from the oldest message in the Kafka topic.

  3. OFFSET_END (-1): This constant is used to indicate that consumption starts from the end of the partition (the latest message). You can use this constant if you want the consumer to start consuming from the latest message in the topic.

  4. OFFSET_STORED (-1000): This constant indicates using the stored offset for consumption. Sometimes, consumers may store the consumption offset somewhere (such as external storage, database, etc.) so that they can continue consuming from this location later. This constant helps the consumer specify the stored offset as the starting position for consumption.

These constants provide flexible options so that consumers can choose different starting locations or partitions as needed when consuming Kafka topic messages to meet specific business needs.

Pseudocode example:
 

#include <vector>
#include <librdkafka/rdkafkacpp.h>

int main() {
    std::string errstr;

    // Create the Kafka consumer configuration
    RdKafka::Conf *conf = RdKafka::Conf::create(RdKafka::Conf::CONF_GLOBAL);

    // Set consumer configuration parameters (group.id is mandatory)...
    conf->set("bootstrap.servers", "localhost:9092", errstr);
    conf->set("group.id", "my_group", errstr);

    // Create the Kafka consumer (the high-level KafkaConsumer supports assign())
    RdKafka::KafkaConsumer *consumer = RdKafka::KafkaConsumer::create(conf, errstr);
    if (!consumer) {
        // Handle consumer creation failure
        return 1;
    }

    // Specify the partition and offset
    int32_t partition = RdKafka::Topic::PARTITION_UA;   // no partition specified
    int64_t offset = RdKafka::Topic::OFFSET_BEGINNING;  // consume from the beginning

    // Assign the topic partition to the consumer
    std::vector<RdKafka::TopicPartition *> partitions;
    partitions.push_back(RdKafka::TopicPartition::create("your_topic", partition, offset));
    RdKafka::ErrorCode resp = consumer->assign(partitions);
    if (resp != RdKafka::ERR_NO_ERROR) {
        // Handle partition assignment failure
        return 1;
    }

    // Start consuming messages
    while (true) {
        // Pull a message from the consumer (1000 ms timeout)
        RdKafka::Message *msg = consumer->consume(1000);

        if (msg->err()) {
            // Handle consume errors
        } else {
            // Process the received message
        }

        // Release the message
        delete msg;
    }

    // Close and release resources
    consumer->close();
    delete consumer;
    delete conf;

    return 0;
}

What does the above code mean?

int32_t partition = RdKafka::Topic::PARTITION_UA;  // no partition specified

It means the consumer does not specify a particular partition to consume, so the consumer is dynamically assigned to available partitions. In practice this lets consumers be distributed evenly across partitions according to the load-balancing strategy, improving overall consumption efficiency.

With the code above, the consumer does not read data from one fixed partition; instead, the Kafka consumer's assignment strategy (which is, in effect, the load-balancing strategy) assigns it a partition of the subscribed topic to read from.

1. Build the Topic object:

In the Kafka producer logic, you first create the RdKafka::Topic object, usually through the create function, which accepts parameters including the Kafka topic name and configuration. The Topic object is created so that messages can subsequently be sent to the specified topic.

RdKafka::Topic *topic = RdKafka::Topic::create(/* parameters */);

2. Build the message object:

Before producing, initialize and build the message: prepare the message content and decide its payload, key, partition, and other properties. (As noted in section 2.2, with librdkafka these properties are passed to produce() rather than wrapped in a standalone RdKafka::Message object.)

3. Send the message to the topic:

With the producer instance and the previously created RdKafka::Topic object, the constructed message can be sent to the specified topic in the Kafka cluster. The sending function accepts the Topic object and the message data as parameters:

producer->produce(topic, partition, /* msgflags, payload, len, key, msg_opaque */);

Here, producer is the previously created Kafka producer instance, and partition is the target partition.

2.8 RdKafka::Producer (core)

static Producer * create(Conf *conf, std::string &errstr);
//Create a new Producer client object. conf is used to replace the default configuration object and can be reused after this
//call. Returns the new Producer client object on success, or NULL on failure with a readable error message in errstr.

ErrorCode produce(Topic *topic, int32_t partition, int msgflags, void *payload, size_t len,
const std::string *key, void *msg_opaque);
//Produce and send a single message to the broker. msgflags: one of RK_MSG_BLOCK, RK_MSG_FREE, RK_MSG_COPY.
//Returns an error code.

ErrorCode produce(Topic *topic, int32_t partition, int msgflags, void *payload, size_t len,const void *key, 
size_t key_len, void *msg_opaque);
//Produce and send a single message to the broker, passing a key data pointer and key length.

ErrorCode produce(Topic *topic, int32_t partition, const std::vector< char > *payload, 
const std::vector< char > *key, void *msg_opaque);
//Produce and send a single message to the broker, passing the message and key as arrays; the arrays are copied.

ErrorCode flush (int timeout_ms);
//Wait for all outstanding produce requests to complete. To ensure all queued and in-flight produce requests complete before
//termination, flush should be performed before destroying the producer instance. This function calls Producer::poll() and
//therefore triggers callbacks.

ErrorCode purge (int purge_flags);
//Purge the messages currently handled by the producer. This call may block for a short time while background thread queues
//are purged. The application needs to call poll() or flush() afterwards to serve the dr_cb callbacks of the purged messages.

virtual Error *init_transactions (int timeout_ms) = 0;
//Initialize transactions for the Producer instance. Returns an RdKafka::Error object on failure, or NULL on success.
//Call RdKafka::Error::is_retriable() to check whether the returned error can be retried, and RdKafka::Error::is_fatal()
//to check whether it is a fatal error. The returned error object must be deleted.

virtual Error *begin_transaction () = 0;
//Start a transaction. init_transactions() must have been called successfully before this call.
//Returns NULL on success, or an error object on failure. Call RdKafka::Error::is_fatal() to check for a fatal error.
//The returned error object must be deleted.

virtual Error *send_offsets_to_transaction (const std::vector<TopicPartition *> &offsets, const ConsumerGroupMetadata *group_metadata, int timeout_ms) = 0;
//Send a list of TopicPartition offsets to the consumer group coordinator specified by group_metadata; the offsets are
//committed only if the transaction commits successfully.

virtual Error *commit_transaction (int timeout_ms) = 0;
//Commit the current transaction. Any outstanding messages are delivered before the transaction is actually committed.
//Returns NULL on success, or an error object on failure. The error object's methods can be used to check whether the error
//is retriable, fatal, or abortable.

virtual Error *abort_transaction (int timeout_ms) = 0;
//Abort the transaction. Used to recover from non-fatal, abortable transaction errors. Outstanding messages are purged.
3 Kafka producer client development


3.1 Necessary parameter configuration (bootstrap.servers)


(1) Specifies the list of broker addresses used to connect to the Kafka cluster, in the format host1:port1,host2:port2. One or more addresses can be set, separated by commas. The default value of this parameter is "".
(2) Note that not all broker addresses are needed here, because the producer discovers the other brokers from any given broker.
(3) It is recommended to configure at least two broker addresses, so that when one of them goes down the producer can still connect to the Kafka cluster.

A quick note: when our producer connects to the Kafka cluster, it may connect to one Kafka server in the cluster or to several; a Kafka server corresponds to a broker.
As shown below:

// Create the Kafka Conf object
m_config = RdKafka::Conf::create(RdKafka::Conf::CONF_GLOBAL);
if(m_config == NULL)
{
    std::cout << "Create RdKafka Conf failed." << std::endl;
}
// Create the Topic Conf object
m_topicConfig = RdKafka::Conf::create(RdKafka::Conf::CONF_TOPIC);
if(m_topicConfig == NULL)
{
    std::cout << "Create RdKafka Topic Conf failed." << std::endl;
}
// Set broker properties
RdKafka::Conf::ConfResult errCode;
m_dr_cb = new ProducerDeliveryReportCb;
std::string errorStr;
errCode = m_config->set("dr_cb", m_dr_cb, errorStr);
if(errCode != RdKafka::Conf::CONF_OK)
{
    std::cout << "Conf set failed:" << errorStr << std::endl;
}
m_event_cb = new ProducerEventCb;
errCode = m_config->set("event_cb", m_event_cb, errorStr);
if(errCode != RdKafka::Conf::CONF_OK)
{
    std::cout << "Conf set failed:" << errorStr << std::endl;
}

m_partitioner_cb = new HashPartitionerCb;
errCode = m_topicConfig->set("partitioner_cb", m_partitioner_cb, errorStr);
if(errCode != RdKafka::Conf::CONF_OK)
{
    std::cout << "Conf set failed:" << errorStr << std::endl;
}

errCode = m_config->set("statistics.interval.ms", "10000", errorStr);
if(errCode != RdKafka::Conf::CONF_OK)
{
    std::cout << "Conf set failed:" << errorStr << std::endl;
}

errCode = m_config->set("message.max.bytes", "10240000", errorStr);
if(errCode != RdKafka::Conf::CONF_OK)
{
    std::cout << "Conf set failed:" << errorStr << std::endl;
}
errCode = m_config->set("bootstrap.servers", m_brokers, errorStr);
if(errCode != RdKafka::Conf::CONF_OK)
{
    std::cout << "Conf set failed:" << errorStr << std::endl;
}

3.2 Other important producer parameters

3.2.1 acks


This parameter specifies how many replicas in the partition must receive the message before the producer considers it successfully written. acks is a very important producer parameter: it embodies the trade-off between message reliability and throughput. It takes three kinds of values (all of string type).

1. acks = 1. This is the default. After the producer sends a message, it receives a successful response from the server as soon as the leader replica of the partition has written the message. If the message cannot be written to the leader replica, for example because the leader crashed and a new leader is being elected, the producer receives an error response and can resend the message to avoid losing it. However, if the message is written to the leader and acknowledged, but the leader crashes before any follower replica pulls the message, the message is still lost, because the newly elected leader does not have it. acks = 1 is therefore a compromise between message reliability and throughput.




2. acks = 0. The producer does not wait for any response from the server after sending a message. If something goes wrong between sending the message and its being written to Kafka, so that Kafka never receives it, the producer has no way of knowing and the message is lost.
Under the same configuration environment, setting acks to 0 achieves the maximum throughput.


3. acks = -1 or acks = all. After sending a message, the producer must wait for all replicas in the ISR to write the message successfully before it receives a successful response from the server.

Under the same configuration environment, setting acks to -1 provides the strongest reliability.
But this alone does not guarantee that messages are reliable, because the ISR may contain only the leader replica, which degenerates into the acks = 1 case. Higher message reliability also requires coordinating parameters such as min.insync.replicas.


Note that the value configured by the acks parameter is a string type, not an integer type.

//Example:
RdKafka::Conf *conf = RdKafka::Conf::create(RdKafka::Conf::CONF_GLOBAL);
RdKafka::Conf::ConfResult ret;
ret = conf->set("acks", "1", errstr);
ret = conf->set("acks", "0", errstr);
ret = conf->set("acks", "all", errstr);

3.2.2 max.request.size


This parameter limits the maximum message size the producer client can send. The default value is 1048576 B, i.e. 1 MB. Under normal circumstances the default is sufficient for most application scenarios, and blindly increasing it is not recommended, especially if you do not have full control over the overall Kafka context, because this parameter interacts with others, such as the broker-side message.max.bytes parameter, and a mismatch can cause unnecessary errors.

For example, if the broker-side message.max.bytes is configured as 10 and max.request.size as 20, then when we send a message of size 15, the producer client reports an exception:
The request included a message larger than the max message size the server will accept.

In librdkafka, the client-side limit is set through message.max.bytes:

errCode = conf->set("message.max.bytes", "10240000", errorStr);


 

3.2.3 retries and retry.backoff.ms


retries: the number of retries, default 0.

retry.backoff.ms: the retry interval, default 100 ms.

1. The retries parameter configures the number of producer retries. The default value is 0, meaning no retry is performed when an exception occurs.

2. Some exceptions that occur before a message is successfully sent to the server are temporary, such as network jitter or a leader replica election, and can often recover by themselves. Configuring retries to a value greater than 0 lets the producer recover through internal retries instead of blindly propagating the exception to the application.

3. If the configured number of retries is reached, the producer gives up retrying and returns the exception. Not all exceptions can be resolved by retrying; for example, a message larger than the configured max.request.size will fail no matter how often it is retried.

4. Retrying is also related to the retry.backoff.ms parameter, default 100, which sets the time interval between two retries to avoid pointless frequent retries.

5. Before configuring retries and retry.backoff.ms, it is best to estimate the likely exception recovery time, so that the total retry time can be set greater than the recovery time and the producer does not give up retrying too early.

6. Kafka guarantees that messages within the same partition are ordered: if the producer sends messages in a certain order, they are written to the partition in that order, and consumers consume them in the same order.

7. For some applications order is critical, for example MySQL binlog transfer, where misordering has very serious consequences. If the retries parameter is set to a non-zero value and max.in.flight.requests.per.connection is configured greater than 1, misordering can occur: if the first batch of messages fails to be written while the second batch succeeds, the producer retries the first batch, and if that retry succeeds, the two batches end up out of order.

8. Generally speaking, when ordering must be guaranteed, it is recommended to set max.in.flight.requests.per.connection to 1 rather than setting retries to 0, although this also reduces overall throughput. max.in.flight.requests.per.connection limits the number of unacknowledged requests the client can send on a single connection; setting it to 1 means the client cannot send another request to the same broker until the previous request has been acknowledged, which prevents message misordering. A configuration sketch follows below.
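
A configuration sketch of this trade-off, assuming conf and errstr as in the earlier examples; librdkafka accepts "retries" and "max.in.flight.requests.per.connection" as aliases for its native property names, and the values here are illustrative:

conf->set("retries", "3", errstr);             // enable internal retries
conf->set("retry.backoff.ms", "300", errstr);  // 300 ms between attempts
// Strict ordering at the cost of throughput:
conf->set("max.in.flight.requests.per.connection", "1", errstr);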

3.2.4 compression.type


This parameter specifies the compression type for messages. The default value is "none", i.e. messages are not compressed by default.
It can also be configured as "gzip", "snappy", or "lz4".
Compressing messages can greatly reduce network traffic and network I/O, improving overall performance.
Message compression is an optimization that trades CPU time for space, so if you have strict latency requirements, compressing messages is not recommended.
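
For example, enabling lz4 compression; librdkafka accepts "compression.type" as an alias of its "compression.codec" property (conf and errstr as in the earlier examples):

if (conf->set("compression.type", "lz4", errstr) != RdKafka::Conf::CONF_OK)
    std::cout << "Conf set failed:" << errstr << std::endl;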

3.2.5 connection.max.idle.ms


This parameter specifies how long a connection may remain idle before it is closed. The default value is 540000 ms, i.e. 9 minutes.

3.2.6 linger.ms


This parameter specifies how long the producer waits for more messages (ProducerRecord) to join a ProducerBatch before sending it. The default value is 0.

The producer client sends a batch out when the ProducerBatch is full or when the waiting time exceeds the linger.ms value.

Increasing this value increases message latency, but at the same time it can improve throughput.

The linger.ms parameter is similar to the Nagle algorithm in the TCP protocol.
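
A sketch of trading a little latency for better batching; in librdkafka, "linger.ms" is an alias for "queue.buffering.max.ms" (conf and errstr as before):

conf->set("linger.ms", "5", errstr);  // wait up to 5 ms for a batch to fill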
 

3.2.7 receive.buffer.bytes


This parameter sets the size of the socket receive buffer (SO_RCVBUF). The default value is 32768 B, i.e. 32 KB.

If set to -1, the operating system default is used.

If the producer and Kafka are in different data centers, this value can be increased appropriately.

3.2.8 send.buffer.bytes


This parameter sets the size of the socket send buffer (SO_SNDBUF). The default value is 131072 B, i.e. 128 KB.
As with the receive.buffer.bytes parameter, if it is set to -1, the operating system default is used.
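
In librdkafka these two buffers are set through "socket.receive.buffer.bytes" and "socket.send.buffer.bytes", where 0 (rather than -1) keeps the system default; the sizes below are illustrative, assuming conf and errstr as in the earlier examples:

conf->set("socket.receive.buffer.bytes", "65536", errstr);  // 64 KB SO_RCVBUF
conf->set("socket.send.buffer.bytes", "0", errstr);         // keep the OS default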


3.2.9 request.timeout.ms


This parameter configures the maximum time the producer waits for a request response. The default value is 30000 ms.
You can choose to retry after a request times out.
Note that this parameter needs to be larger than the broker-side parameter replica.lag.time.max.ms, which reduces the probability of message duplication caused by client retries.
The value should be tuned to the network conditions, Kafka cluster load, and message-processing requirements: latency-sensitive scenarios can choose a smaller value, while unstable networks or heavily loaded clusters may need a larger one.
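
A one-line sketch of setting it explicitly (conf and errstr as before; 30000 is just the default):

conf->set("request.timeout.ms", "30000", errstr);  // wait up to 30 s for a response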
 

3.2.10 client.id


Used to set the client id corresponding to KafkaProducer. The default value is "".

3.2.11 batch.size


batch.size is one of the most important producer parameters: it plays a central role in tuning producer throughput and latency. The producer packs multiple messages bound for the same partition into a batch; when the batch is full, the producer sends all the messages in it. However, the producer does not always wait for a batch to fill up before sending; it may well send a batch that still has plenty of free space.

Clearly the batch size matters. A small batch holds few messages, so each send request carries few messages and producer throughput is low. A very large batch puts great pressure on memory, because the producer allocates a fixed amount of memory for the batch regardless of whether it can be filled. The batch.size setting is therefore a trade-off between time and space.

The default value of batch.size is 16384, i.e. 16 KB. This is a fairly conservative number; in practice, reasonably increasing it usually raises producer throughput accordingly.
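
librdkafka exposes batching through "batch.num.messages" and, in newer releases, a byte-based "batch.size"; a sketch with illustrative, untuned values (conf and errstr as before):

conf->set("batch.num.messages", "10000", errstr);  // max messages per batch
conf->set("batch.size", "65536", errstr);          // max batch size: 64 KB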










 

Note that declarations and definitions are kept separate.

Complete code:

kafka_producer.h

#ifndef KAFKAPRODUCER_H
#define KAFKAPRODUCER_H

#pragma once
#include <string>
#include <iostream>
#include "librdkafka/rdkafkacpp.h"

class KafkaProducer
{
public:
	/**
	* @brief KafkaProducer
	* @param brokers
	* @param topic
	* @param partition
	*/
	explicit KafkaProducer(const std::string& brokers, const std::string& topic, int partition);
	/**
	* @brief push Message to Kafka
	* @param str, message data
	*/
	void pushMessage(const std::string& str, const std::string& key);
	~KafkaProducer();

private:
	std::string m_brokers;			// broker list, comma-separated when there are several
	std::string m_topicStr;			// topic name
	int m_partition;				// partition

	RdKafka::Conf* m_config;        // Kafka Conf object
	RdKafka::Conf* m_topicConfig;   // Topic Conf object
	RdKafka::Topic* m_topic;		// Topic object
	RdKafka::Producer* m_producer;	// Producer object

	/* Whenever you see a class ending in Cb, inherit from it and implement the corresponding callback */
	RdKafka::DeliveryReportCb* m_dr_cb;
	RdKafka::EventCb* m_event_cb;
	RdKafka::PartitionerCb* m_partitioner_cb;
};

#endif

std::string m_brokers; // broker address list of the Kafka cluster

std::string m_topicStr; // name of the Kafka topic

int m_partition; // partition number the messages are sent to

RdKafka::Producer* m_producer; // Kafka producer instance

RdKafka::Topic* m_topic; // Kafka topic instance

RdKafka::Conf* m_config; // Kafka global configuration

RdKafka::Conf* m_topicConfig; // Kafka topic configuration

RdKafka::DeliveryReportCb* m_dr_cb; // producer delivery report callback

RdKafka::EventCb* m_event_cb; // producer event callback

RdKafka::PartitionerCb* m_partitioner_cb; // partitioner callback

kafka_producer.cc

#include "kafka_producer.h"

// call back
class ProducerDeliveryReportCb : public RdKafka::DeliveryReportCb
{
public:
	void dr_cb(RdKafka::Message &message)
	{
		if(message.err())
			std::cerr << "Message delivery failed: " << message.errstr() << std::endl;
		else
		{
			// Message delivered to topic test [0] at offset 135000
			std::cerr << "Message delivered to topic " << message.topic_name()
				<< " [" << message.partition() << "] at offset "
				<< message.offset() << std::endl;
		}
	}
};

class ProducerEventCb : public RdKafka::EventCb
{
public:
	void event_cb(RdKafka::Event &event)
	{
		switch (event.type())
		{
		case RdKafka::Event::EVENT_ERROR:
			std::cout << "RdKafka::Event::EVENT_ERROR: " << RdKafka::err2str(event.err()) << std::endl;
			break;
		case RdKafka::Event::EVENT_STATS:
			std::cout << "RdKafka::Event::EVENT_STATS: " << event.str() << std::endl;
			break;
		case RdKafka::Event::EVENT_LOG:
			std::cout << "RdKafka::Event::EVENT_LOG " << event.fac() << std::endl;
			break;
		case RdKafka::Event::EVENT_THROTTLE:
			std::cout << "RdKafka::Event::EVENT_THROTTLE " << event.broker_name() << std::endl;
			break;
		}
	}
};

class HashPartitionerCb : public RdKafka::PartitionerCb
{
public:
	int32_t partitioner_cb(const RdKafka::Topic *topic, const std::string *key,
		int32_t partition_cnt, void *msg_opaque)
	{
		char msg[128] = { 0 };
		// Note: key may be NULL when a message is produced without a key.
		int32_t partition_id = generate_hash(key->c_str(), key->size()) % partition_cnt;
		// Output format: [topic][key][partition_cnt][partition_id]
		// e.g.: [test][6419][2][1]
		sprintf(msg, "HashPartitionerCb:topic:[%s], key:[%s], partition_cnt:[%d], partition_id:[%d]", topic->name().c_str(),
			key->c_str(), partition_cnt, partition_id);
		std::cout << msg << std::endl;
		return partition_id;
	}
private:

	static inline unsigned int generate_hash(const char *str, size_t len)
	{
		unsigned int hash = 5381;
		for (size_t i = 0; i < len; i++)
			hash = ((hash << 5) + hash) + str[i];
		return hash;
	}
};


KafkaProducer::KafkaProducer(const std::string& brokers, const std::string& topic, int partition)
{
	m_brokers = brokers;
	m_topicStr = topic;
	m_partition = partition;

	/* Create the Kafka Conf object */
	m_config = RdKafka::Conf::create(RdKafka::Conf::CONF_GLOBAL);
	if(m_config==NULL)
		std::cout << "Create RdKafka Conf failed." << std::endl;

	/* Create the Topic Conf object */
	m_topicConfig = RdKafka::Conf::create(RdKafka::Conf::CONF_TOPIC);
	if (m_topicConfig == NULL)
		std::cout << "Create RdKafka Topic Conf failed." << std::endl;

	/* Set broker properties */
	RdKafka::Conf::ConfResult errCode;
	std::string errorStr;
	m_dr_cb = new ProducerDeliveryReportCb;
	// Set the dr_cb property
	errCode = m_config->set("dr_cb", m_dr_cb, errorStr);
	if (errCode != RdKafka::Conf::CONF_OK)
	{
		std::cout << "Conf set failed:" << errorStr << std::endl;
	}
	// Set the event_cb property
	m_event_cb = new ProducerEventCb;
	errCode = m_config->set("event_cb", m_event_cb, errorStr);
	if (errCode != RdKafka::Conf::CONF_OK)
	{
		std::cout << "Conf set failed:" << errorStr << std::endl;
	}
	// Custom partitioning strategy
	m_partitioner_cb = new HashPartitionerCb;
	errCode = m_topicConfig->set("partitioner_cb", m_partitioner_cb, errorStr);
	if (errCode != RdKafka::Conf::CONF_OK)
	{
		std::cout << "Conf set failed:" << errorStr << std::endl;
	}
	// Set other property values; these all go on the Kafka global configuration object
	errCode = m_config->set("statistics.interval.ms", "10000", errorStr);
	if (errCode != RdKafka::Conf::CONF_OK)
	{
		std::cout << "Conf set failed:" << errorStr << std::endl;
	}
	errCode = m_config->set("message.max.bytes", "10240000", errorStr);
	if (errCode != RdKafka::Conf::CONF_OK)
	{
		std::cout << "Conf set failed:" << errorStr << std::endl;
	}
	errCode = m_config->set("bootstrap.servers", m_brokers, errorStr);
	if (errCode != RdKafka::Conf::CONF_OK)
	{
		std::cout << "Conf set failed:" << errorStr << std::endl;
	}

	/* Create the Producer */
	m_producer = RdKafka::Producer::create(m_config, errorStr);
	if (m_producer == NULL)
	{
		std::cout << "Create Producer failed:" << errorStr << std::endl;
	}

	/* Create the Topic object */
	m_topic = RdKafka::Topic::create(m_producer, m_topicStr, m_topicConfig, errorStr);
	if (m_topic == NULL)
	{
		std::cout << "Create Topic failed:" << errorStr << std::endl;
	}
}

KafkaProducer::~KafkaProducer()
{
	while (m_producer->outq_len() > 0)
	{
		std::cerr << "Waiting for " << m_producer->outq_len() << std::endl;
		m_producer->flush(5000);
	}
	delete m_config;
	delete m_topicConfig;
	delete m_topic;
	delete m_producer;
	delete m_dr_cb;
	delete m_event_cb;
	delete m_partitioner_cb;
}

void KafkaProducer::pushMessage(const std::string& str, const std::string& key)
{
	int32_t len = str.length();
	void* payload = const_cast<void*>(static_cast<const void*>(str.data()));
	RdKafka::ErrorCode errorCode = m_producer->produce(
		m_topic,
		RdKafka::Topic::PARTITION_UA,
		RdKafka::Producer::RK_MSG_COPY,
		payload,
		len,
		&key,
		NULL);
	m_producer->poll(0);
	if (errorCode != RdKafka::ERR_NO_ERROR)
	{
		std::cerr << "Produce failed: " << RdKafka::err2str(errorCode) << std::endl;
		if (errorCode == RdKafka::ERR__QUEUE_FULL)
		{
			m_producer->poll(100);
		}
	}
}

The following is the flow of the KafkaProducer::KafkaProducer constructor:

  1. Initialize member variables:
    • m_brokers stores the Kafka broker addresses.
    • m_topicStr stores the Kafka topic name.
    • m_partition stores the partition number.

  2. Create the global configuration object (m_config):
    • Created through RdKafka::Conf::create(RdKafka::Conf::CONF_GLOBAL).
    • If creation fails, an error message is printed.

  3. Create the topic configuration object (m_topicConfig):
    • Created through RdKafka::Conf::create(RdKafka::Conf::CONF_TOPIC).
    • If creation fails, an error message is printed.

  4. Set the callback functions and other configuration properties:
    • Create a ProducerDeliveryReportCb instance as the delivery report callback.
    • Create a ProducerEventCb instance as the event callback.
    • Create a HashPartitionerCb instance as the custom partitioning callback.
    • Register these callbacks on the corresponding configuration objects with set().
    • Set other configuration properties such as the statistics interval, maximum message size, and bootstrap.servers.

  5. Create the Kafka producer instance (m_producer):
    • Created from the configuration object above; if creation fails, an error message is printed.

  6. Create the topic object (m_topic):
    • Created with RdKafka::Topic::create() from the producer, topic name, and topic configuration.

CMakeLists.txt

cmake_minimum_required(VERSION 2.8)

project(KafkaProducer)

set(CMAKE_CXX_STANDARD 11)
set(CMAKE_CXX_COMPILER "g++")
set(CMAKE_CXX_FLAGS "-std=c++11 ${CMAKE_CXX_FLAGS}")
set(CMAKE_INCLUDE_CURRENT_DIR ON)

# Kafka header path
include_directories(/usr/local/include/librdkafka)
# Kafka library path
link_directories(/usr/lib64)

aux_source_directory(. SOURCE)

add_executable(${PROJECT_NAME} ${SOURCE})
TARGET_LINK_LIBRARIES(${PROJECT_NAME} rdkafka++)

Test file main.cc

#include <iostream>
#include "KafkaProducer.h"
using namespace std;

int main()
{
    // Create the producer
    // KafkaProducer producer("127.0.0.1:9092,192.168.2.111:9092", "test", 0);
    KafkaProducer producer("127.0.0.1:9092", "test", 0);
    for(int i = 0; i < 10000; i++)
    {
        char msg[64] = {0};
        sprintf(msg, "%s%4d", "Hello RdKafka ", i);
        // Produce a message
        char key[8] = {0};      // the key is mainly used for load balancing
        sprintf(key, "%d", i);
        producer.pushMessage(msg, key);  
    }
    RdKafka::wait_destroyed(5000);
}

Compile:

mkdir build
cd build
cmake ..
make

4. Summary


The Kafka producer usage flow:

Create a Kafka configuration instance.
Create a Topic configuration instance.
Set the broker properties on the Kafka configuration instance.
Set the properties on the Topic configuration instance.
Register the callback functions (the partitioning-strategy callback must be registered on the Topic configuration instance).
Create the Kafka producer client instance.
Create the Topic instance.
Produce messages and block until production completes.
Wait for outstanding produce requests to complete (poll/flush).
Destroy the Kafka producer client instance.
 

Origin: blog.csdn.net/txh1873749380/article/details/134856445