Kafka producer

1. Communication process

Producers and consumers act as clients, and message brokers act as servers.

[Figure: network communication between clients and servers]

2. Concepts

1. Producer sending process: When the producer sends a message, it does not send it directly to the server. It first puts the message into a queue on the client, and a message-sending thread then pulls messages from that queue and sends them to the server in batches.

2. Record accumulator (RecordAccumulator): Caches the messages generated by the producer client in queues of batch records. A batch that is not yet full waits until enough messages have been collected. On each append, check whether the current batch in the queue is full: if it is not, keep appending to it; if it is, open a new batch.
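The append rule above can be sketched in a few lines of Python. This is a minimal illustration, not Kafka's implementation: the class names mirror the concepts in these notes, `BATCH_SIZE` and `drain_full` are hypothetical, and capacity is counted in records here for simplicity.

```python
from collections import defaultdict, deque

BATCH_SIZE = 3  # hypothetical capacity, counted in records to keep the sketch simple


class RecordBatch:
    """A fixed-capacity batch of records for one partition."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.records = []

    def is_full(self):
        return len(self.records) >= self.capacity

    def try_append(self, record):
        """Append if there is room; return False when the batch is full."""
        if self.is_full():
            return False
        self.records.append(record)
        return True


class RecordAccumulator:
    """Keeps one queue of batches per partition."""

    def __init__(self, batch_size=BATCH_SIZE):
        self.batch_size = batch_size
        self.batches = defaultdict(deque)  # partition -> deque of RecordBatch

    def append(self, partition, record):
        queue = self.batches[partition]
        # Try the newest batch first; open a new one if it is missing or full.
        if not queue or not queue[-1].try_append(record):
            batch = RecordBatch(self.batch_size)
            batch.try_append(record)
            queue.append(batch)

    def drain_full(self, partition):
        """Remove and return the full batches, oldest first."""
        queue = self.batches[partition]
        ready = []
        while queue and queue[0].is_full():
            ready.append(queue.popleft())
        return ready


accumulator = RecordAccumulator()
for i in range(7):
    accumulator.append(partition=0, record=f"msg-{i}")

full = accumulator.drain_full(0)
print([b.records for b in full])    # two full batches of three records each
print(len(accumulator.batches[0]))  # one partially filled batch is still waiting
```

Seven appends with a capacity of three leave two full batches ready for the sending thread and one open batch that keeps collecting.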

[Figure: appending a message to the record accumulator]

[Figure: the sending thread sending to the server]

3. Sending thread (Sender): Reads the batches held by the record accumulator and sends them to the server over the network. There are two ways to send: sending each partition's batch directly, or regrouping the batches by each partition's target node and sending them per node.
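The difference between the two sending methods is easiest to see by counting requests. The sketch below assumes a made-up `LEADER_OF` table that maps each partition to the node hosting its leader replica; the function names are illustrative only.

```python
from collections import defaultdict

# Hypothetical cluster metadata: partition number -> node hosting its leader replica.
LEADER_OF = {0: "broker-1", 1: "broker-2", 2: "broker-1", 3: "broker-2"}


def group_by_partition(batches):
    """Method 1: one request per partition."""
    return [(LEADER_OF[partition], {partition: records})
            for partition, records in batches.items()]


def group_by_node(batches):
    """Method 2: regroup by target node, so each node gets a single request."""
    per_node = defaultdict(dict)
    for partition, records in batches.items():
        per_node[LEADER_OF[partition]][partition] = records
    return sorted(per_node.items())


batches = {0: ["a", "b"], 1: ["c"], 2: ["d"], 3: ["e", "f"]}
per_partition = group_by_partition(batches)
per_node = group_by_node(batches)
print(len(per_partition))  # 4 requests
print(len(per_node))       # 2 requests
```

With four partitions spread over two nodes, grouping by node halves the number of network requests in this example.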

4. Client network connection object (NetworkClient): Manages network communication between the client and the server, including establishing connections, sending client requests, and reading client responses.

[Figure: methods of the connection object]

Restriction: For a given server, a new client request may not be sent until the previous client request has finished sending.

A client request is handled in one of two ways, depending on whether it expects a response:

(1) No response required: start sending the request → add the client request to the queue → send the request → request sent successfully → delete the client request from the queue → construct the client response.

(2) Response required: start sending the request → add the client request to the queue → send the request → request sent successfully → wait for the response → start receiving the response → receive the complete response → delete the client request from the queue → construct the client response.
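The restriction and the two flows can be modelled with a per-server queue of pending requests. This is a simplified sketch under assumed names (`InFlightRequests`, `send`, `on_send_completed`, `on_response_received`); requests are plain dictionaries and no real network I/O takes place.

```python
from collections import defaultdict, deque


class InFlightRequests:
    """Tracks, per server, the requests that are queued but not yet completed."""

    def __init__(self):
        self.queues = defaultdict(deque)  # node -> deque of request dicts

    def can_send_more(self, node):
        """A new request is allowed only once the previous one has finished sending."""
        queue = self.queues[node]
        return not queue or queue[-1]["sent"]

    def add(self, node, request):
        if not self.can_send_more(node):
            raise RuntimeError(f"previous request to {node} is still being sent")
        self.queues[node].append(request)

    def complete_last_sent(self, node):
        """Remove the most recently queued request (no response expected)."""
        return self.queues[node].pop()

    def complete_next(self, node):
        """Remove the oldest queued request (its response has arrived)."""
        return self.queues[node].popleft()


def send(in_flight, node, payload, expect_response):
    """Start sending: the request enters the queue before it is written out."""
    request = {"payload": payload, "expect_response": expect_response, "sent": False}
    in_flight.add(node, request)
    return request


def on_send_completed(in_flight, node, request):
    """The request has been fully written to the network."""
    request["sent"] = True
    if not request["expect_response"]:
        # Scenario 1: nothing to wait for, so dequeue and build the response now.
        in_flight.complete_last_sent(node)
        return {"request": request["payload"], "body": None}
    return None  # Scenario 2: the request stays queued until its response arrives.


def on_response_received(in_flight, node, body):
    """A complete response has been read (scenario 2)."""
    request = in_flight.complete_next(node)
    return {"request": request["payload"], "body": body}


in_flight = InFlightRequests()
r1 = send(in_flight, "broker-1", "produce-1", expect_response=True)
blocked = not in_flight.can_send_more("broker-1")  # r1 has not finished sending yet
on_send_completed(in_flight, "broker-1", r1)
r2 = send(in_flight, "broker-1", "produce-2", expect_response=False)
resp2 = on_send_completed(in_flight, "broker-1", r2)
resp1 = on_response_received(in_flight, "broker-1", body="ok")
print(blocked, resp2, resp1)
```

The request without a response is completed as soon as it is sent, while the one expecting a response stays in the queue until the response has been read in full.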

[Figure: flow between client request and client response]

5. Selector (Selector): Handles network connections and read/write processing. The network connection object (NetworkClient) uses the selector to process the client's network requests so that they are handled promptly. The selector uses Java NIO in asynchronous, non-blocking mode to manage connections and read/write requests, which lets a single thread manage multiple network connection channels. The advantage is that the producer client needs only one selector to communicate with multiple servers in the Kafka cluster at the same time.
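The idea of one thread serving several channels through one selector can be tried with Python's standard `selectors` module, which plays a role similar to Java NIO's `Selector`. This is only an analogue: local socket pairs stand in for broker connections, and the broker ids are made up.

```python
import selectors
import socket

selector = selectors.DefaultSelector()

# Three socket pairs stand in for connections to three brokers (hypothetical ids).
connections = {}
for broker_id in ("broker-1", "broker-2", "broker-3"):
    client_side, server_side = socket.socketpair()
    client_side.setblocking(False)
    # Registering a channel returns a key; `data` carries our connection id.
    selector.register(client_side, selectors.EVENT_READ, data=broker_id)
    connections[broker_id] = (client_side, server_side)

# Each "broker" writes a response on its own connection.
for broker_id, (_, server_side) in connections.items():
    server_side.sendall(f"response from {broker_id}".encode())

# One thread polls the selector and handles whichever channels are ready.
received = {}
while len(received) < len(connections):
    for key, events in selector.select(timeout=1.0):
        if events & selectors.EVENT_READ:
            received[key.data] = key.fileobj.recv(1024).decode()

for client_side, server_side in connections.values():
    selector.unregister(client_side)
    client_side.close()
    server_side.close()
selector.close()

print(sorted(received))
```

The polling loop never blocks on a single connection: it asks the selector which channels are readable and serves only those.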

[Figure: NIO concepts]

The relationship between SocketChannel, selection key, transport layer, and Kafka channel is as follows: registering the SocketChannel on the selector returns a selection key, the selection key is used to construct the transport layer, and the transport layer is used to construct the Kafka channel.

[Figure: Kafka channel write process]

[Figure: Kafka channel read process]

[Figure: selector polling]

[Figure: steps by which the client establishes a connection and the server accepts it]

Client and server events correspond to each other: a client connect (OP_CONNECT) matches a server accept (OP_ACCEPT), a client write (OP_WRITE) matches a server read (OP_READ), and a server write (OP_WRITE) matches a client read (OP_READ).

[Figure: selector steps on the client and the server]

6. Asynchronous sending mode: Pass a callback when calling send; once send returns, you can keep sending messages without waiting. When the result comes back, the callback's onCompletion method is executed. In other words, after sending a message the producer does not need to wait for the server to finish processing it before sending the next one.

7. Synchronous sending mode: send returns a Future, and you call get on it immediately. Future.get blocks until the result is returned, so the next message is sent only after the previous one has completed.
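The contrast between the two modes can be imitated with Python's `concurrent.futures`. This is an analogy rather than the Kafka client API: `SketchProducer` and `deliver` are hypothetical, and a single worker thread plays the role of the sending thread.

```python
from concurrent.futures import ThreadPoolExecutor


def deliver(message):
    """Stand-in for the broker round trip; returns a made-up acknowledgement."""
    return {"message": message, "acked": True}


class SketchProducer:
    """Toy producer: send() returns a Future and optionally takes a callback."""

    def __init__(self):
        self.sender = ThreadPoolExecutor(max_workers=1)  # plays the sending thread

    def send(self, message, callback=None):
        future = self.sender.submit(deliver, message)
        if callback is not None:
            # Runs once the result is available, like onCompletion.
            future.add_done_callback(lambda f: callback(f.result()))
        return future

    def close(self):
        self.sender.shutdown(wait=True)


producer = SketchProducer()

# Asynchronous: keep sending; results are handled later by the callback.
completed = []
for text in ("a", "bb", "ccc"):
    producer.send(text, callback=completed.append)

# Synchronous: block on the Future before doing anything else.
sync_result = producer.send("dddd").result()

producer.close()
print(completed)
print(sync_result)
```

The asynchronous loop never waits between sends, whereas the synchronous call does not return until its own result is available.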

Note: (1) If the message has no key, partitions are chosen round-robin so that messages are spread evenly across the partitions.

(2) If the message has a key, the key is hashed, and the partition number is the hash modulo the number of partitions.
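Both rules fit in one small function. The sketch below is illustrative: `choose_partition` and `NUM_PARTITIONS` are made-up names, and `zlib.crc32` is used only as a convenient stable hash (Kafka's default partitioner hashes keys with murmur2).

```python
import itertools
import zlib

NUM_PARTITIONS = 4
_round_robin = itertools.count()


def choose_partition(key, num_partitions=NUM_PARTITIONS):
    if key is None:
        # No key: rotate through the partitions in turn.
        return next(_round_robin) % num_partitions
    # Keyed: hash the key, then take the remainder by the partition count.
    # crc32 is a stand-in here; Kafka's default partitioner uses murmur2.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


unkeyed = [choose_partition(None) for _ in range(6)]
keyed = [choose_partition("user-42") for _ in range(3)]
print(unkeyed)  # [0, 1, 2, 3, 0, 1]
print(keyed)    # the same partition three times
```

Keyed messages always land on the same partition as long as the partition count does not change, which is what preserves per-key ordering.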

8. PartitionInfo object: Describes how one partition is distributed. Its member variables are the topic name, partition number, leader replica node, all replicas, and the ISR (in-sync replica) list. Selecting the leader replica for a message set works as follows: the message is first assigned a partition number, and the leader replica is then looked up by that number, which improves write performance.

[Figure: selecting the leader replica for a message set]
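As a rough picture of the data involved, the structure can be written as a small record type with a lookup by partition number. The field names follow the list above; `leader_for` and the sample metadata are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PartitionInfo:
    topic: str
    partition: int
    leader: str                  # node hosting the leader replica
    replicas: List[str]          # all replicas
    in_sync_replicas: List[str]  # the ISR list


def leader_for(partitions, topic, partition):
    """Return the node that should receive writes for the given partition."""
    for info in partitions:
        if info.topic == topic and info.partition == partition:
            return info.leader
    raise KeyError((topic, partition))


# Hypothetical metadata for a two-partition topic.
metadata = [
    PartitionInfo("orders", 0, "broker-1", ["broker-1", "broker-2"], ["broker-1", "broker-2"]),
    PartitionInfo("orders", 1, "broker-2", ["broker-2", "broker-3"], ["broker-2"]),
]
target = leader_for(metadata, "orders", 1)
print(target)  # broker-2
```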

9. SocketServer: Focuses on the network-layer communication protocol; the actual business logic is handed over to KafkaRequestHandler and KafkaApis.

[Figure: SocketServer implementation steps]

10. Reactor pattern: An acceptor thread receives all client connection requests and distributes the accepted connections to different processors for processing. Design idea: use different threads for the connection part and the request part, so that request processing never blocks incoming connections.
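A stripped-down version of this split can be run on the loopback interface. This is a teaching sketch, not Kafka's SocketServer: the processors here use blocking reads rather than their own selectors, connections are handed out round-robin, and all names and counts are assumptions.

```python
import queue
import socket
import threading

NUM_PROCESSORS = 2
NUM_CLIENTS = 4


def processor_loop(connections):
    """Processor thread: handles requests on the connections assigned to it."""
    while True:
        conn = connections.get()
        if conn is None:  # shutdown signal
            return
        with conn:
            request = conn.recv(1024)
            conn.sendall(b"echo:" + request)


def acceptor_loop(server, processor_queues, expected):
    """Acceptor thread: only accepts connections and hands them off."""
    for i in range(expected):
        conn, _ = server.accept()
        processor_queues[i % len(processor_queues)].put(conn)


server = socket.create_server(("127.0.0.1", 0))  # port 0: let the OS pick a free port
port = server.getsockname()[1]

processor_queues = [queue.Queue() for _ in range(NUM_PROCESSORS)]
processors = [threading.Thread(target=processor_loop, args=(q,)) for q in processor_queues]
acceptor = threading.Thread(target=acceptor_loop, args=(server, processor_queues, NUM_CLIENTS))
for thread in processors + [acceptor]:
    thread.start()

replies = []
for i in range(NUM_CLIENTS):
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(f"req-{i}".encode())
        replies.append(client.recv(1024).decode())

acceptor.join()
for q in processor_queues:
    q.put(None)
for thread in processors:
    thread.join()
server.close()
print(replies)
```

Because the acceptor does nothing except accept and hand off, a slow request on one processor does not delay new connections.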

11. Request queue and response queue of the request channel: The request channel is where the processors exchange data with the request handler threads and KafkaApis. When a processor adds a request to the request channel, the request handler threads and KafkaApis can take that request from it; when they add a response to the request channel, the processor can take that response from it. Requests and responses are sent and received in order: send request → receive request → send response → receive response.
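The exchange can be sketched with thread-safe queues. The layout below (one shared request queue and one response queue per processor) and all method names are assumptions made for illustration; the handler thread simply returns a string where the business logic would run.

```python
import queue
import threading

NUM_PROCESSORS = 2


class RequestChannel:
    """One shared request queue plus one response queue per processor."""

    def __init__(self, num_processors):
        self.requests = queue.Queue()
        self.responses = [queue.Queue() for _ in range(num_processors)]

    def send_request(self, request):
        self.requests.put(request)

    def receive_request(self):
        return self.requests.get()

    def send_response(self, processor_id, response):
        self.responses[processor_id].put(response)

    def receive_response(self, processor_id):
        return self.responses[processor_id].get(timeout=2.0)


def handler_loop(channel):
    """Request handler thread: the business logic would run here."""
    while True:
        request = channel.receive_request()
        if request is None:  # shutdown signal
            return
        processor_id, payload = request
        channel.send_response(processor_id, f"handled {payload}")


channel = RequestChannel(NUM_PROCESSORS)
handler = threading.Thread(target=handler_loop, args=(channel,))
handler.start()

# Each processor sends a request, then picks up its own response.
channel.send_request((0, "produce from processor 0"))
channel.send_request((1, "fetch from processor 1"))
responses = [channel.receive_response(0), channel.receive_response(1)]

channel.send_request(None)
handler.join()
print(responses)
```

Tagging each request with its processor id is what lets the response find its way back to the processor that owns the connection.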

 

 

Study notes on "Inside Kafka Technology: Graphical and detailed explanation of Kafka source code design and implementation" by Zheng Qihuang.


Origin blog.csdn.net/baidu_28068985/article/details/106164388