1. Serializing a message
A message consists of a key and a value.
Kafka provides built-in serializers for its basic data types; for custom classes, the business code must implement its own serialization.
A ProducerRecord object contains the key, value, and headers; at this point the key and value are still objects.
The key and value are serialized in KafkaProducer#doSend, yielding byte arrays for both.
The headers and the serialized key/value byte arrays are then appended to a ProducerBatch.
Relevant code:
org.apache.kafka.clients.producer.internals.ProducerBatch#recordsBuilder
org.apache.kafka.common.record.MemoryRecordsBuilder#appendStream
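To illustrate the custom-serialization point above, here is a minimal sketch of how a value class could be turned into a byte array before being appended to a batch. The class name, field layout, and encoding are assumptions for illustration; in a real producer this logic would live in a class implementing org.apache.kafka.common.serialization.Serializer and be registered via the value.serializer config.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical serializer for a value with a String name and an int age.
// Encoding (an assumption, not Kafka's): 4-byte name length, UTF-8 name bytes, 4-byte age.
public class PersonSerializer {

    public static byte[] serialize(String name, int age) {
        byte[] nameBytes = name.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buffer = ByteBuffer.allocate(4 + nameBytes.length + 4);
        buffer.putInt(nameBytes.length); // length prefix so the name can be decoded
        buffer.put(nameBytes);
        buffer.putInt(age);
        return buffer.array();
    }

    public static void main(String[] args) {
        byte[] data = serialize("alice", 30);
        // 4 (length field) + 5 ("alice" in UTF-8) + 4 (age) = 13 bytes
        System.out.println(data.length);
    }
}
```

Whatever encoding is chosen, the producer only ever sees the resulting byte[]; that is what doSend hands to the ProducerBatch.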
2. Kafka's TCP packets
The data in a ProducerBatch is converted, using Schema and Struct, into packets matching Kafka's wire format and sent over TCP.
Taking the sending of a produce request as an example, the call chain is:
org.apache.kafka.common.requests.AbstractRequest#toSend
org.apache.kafka.common.requests.AbstractRequest#serialize
org.apache.kafka.common.requests.AbstractRequestResponse#serialize
org.apache.kafka.common.requests.ProduceRequest#toStruct
org.apache.kafka.common.protocol.types.Schema#write
org.apache.kafka.common.requests.RequestHeader#toStruct
public Struct toStruct() {
    Schema schema = schema(apiKey.id, apiVersion);
    Struct struct = new Struct(schema);
    struct.set(API_KEY_FIELD_NAME, apiKey.id);
    struct.set(API_VERSION_FIELD_NAME, apiVersion);
    // only v0 of the controlled shutdown request is missing the clientId
    if (struct.hasField(CLIENT_ID_FIELD_NAME))
        struct.set(CLIENT_ID_FIELD_NAME, clientId);
    struct.set(CORRELATION_ID_FIELD_NAME, correlationId);
    return struct;
}
org.apache.kafka.common.requests.ProduceRequest#toStruct
public Struct toStruct() {
    // Store it in a local variable to protect against concurrent updates
    Map<TopicPartition, MemoryRecords> partitionRecords = partitionRecordsOrFail();
    short version = version();
    Struct struct = new Struct(ApiKeys.PRODUCE.requestSchema(version));
    Map<String, Map<Integer, MemoryRecords>> recordsByTopic = CollectionUtils.groupDataByTopic(partitionRecords);
    struct.set(ACKS_KEY_NAME, acks);
    struct.set(TIMEOUT_KEY_NAME, timeout);
    struct.setIfExists(NULLABLE_TRANSACTIONAL_ID, transactionalId);

    List<Struct> topicDatas = new ArrayList<>(recordsByTopic.size());
    for (Map.Entry<String, Map<Integer, MemoryRecords>> topicEntry : recordsByTopic.entrySet()) {
        Struct topicData = struct.instance(TOPIC_DATA_KEY_NAME);
        topicData.set(TOPIC_NAME, topicEntry.getKey());
        List<Struct> partitionArray = new ArrayList<>();
        for (Map.Entry<Integer, MemoryRecords> partitionEntry : topicEntry.getValue().entrySet()) {
            MemoryRecords records = partitionEntry.getValue();
            Struct part = topicData.instance(PARTITION_DATA_KEY_NAME)
                    .set(PARTITION_ID, partitionEntry.getKey())
                    .set(RECORD_SET_KEY_NAME, records);
            partitionArray.add(part);
        }
        topicData.set(PARTITION_DATA_KEY_NAME, partitionArray.toArray());
        topicDatas.add(topicData);
    }
    struct.set(TOPIC_DATA_KEY_NAME, topicDatas.toArray());
    return struct;
}
Assembling the packet:
public abstract class AbstractRequestResponse {
    /**
     * Visible for testing.
     */
    public static ByteBuffer serialize(Struct headerStruct, Struct bodyStruct) {
        ByteBuffer buffer = ByteBuffer.allocate(headerStruct.sizeOf() + bodyStruct.sizeOf());
        headerStruct.writeTo(buffer);
        bodyStruct.writeTo(buffer);
        buffer.rewind();
        return buffer;
    }
}

public class NetworkSend extends ByteBufferSend {

    public NetworkSend(String destination, ByteBuffer buffer) {
        super(destination, sizeDelimit(buffer));
    }

    private static ByteBuffer[] sizeDelimit(ByteBuffer buffer) {
        return new ByteBuffer[] {sizeBuffer(buffer.remaining()), buffer};
    }

    private static ByteBuffer sizeBuffer(int size) {
        ByteBuffer sizeBuffer = ByteBuffer.allocate(4);
        sizeBuffer.putInt(size);
        sizeBuffer.rewind();
        return sizeBuffer;
    }
}
From this we can infer Kafka's request packet format: a 4-byte length field, followed by the serialized headerStruct, then the serialized bodyStruct.
Correspondingly, as the NetworkSend code and its comments suggest, the receiving side parses the same framing in NetworkReceive.
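The 4-byte size-delimited framing described above can be sketched end to end. This is a simplified, self-contained illustration of the framing idea, not the actual NetworkSend/NetworkReceive code, which works with ByteBuffer arrays and socket channels.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class FramingDemo {

    // Mirrors NetworkSend#sizeDelimit: prefix the payload with its 4-byte length.
    static ByteBuffer frame(byte[] payload) {
        ByteBuffer out = ByteBuffer.allocate(4 + payload.length);
        out.putInt(payload.length); // big-endian size header, like ByteBuffer#putInt
        out.put(payload);
        out.rewind();
        return out;
    }

    // Mirrors what NetworkReceive does: read the 4-byte size, then that many payload bytes.
    static byte[] unframe(ByteBuffer in) {
        int size = in.getInt();
        byte[] payload = new byte[size];
        in.get(payload);
        return payload;
    }

    public static void main(String[] args) {
        // In Kafka the payload would be headerStruct + bodyStruct bytes.
        byte[] body = "header+body structs".getBytes(StandardCharsets.UTF_8);
        byte[] roundTripped = unframe(frame(body));
        System.out.println(new String(roundTripped, StandardCharsets.UTF_8));
    }
}
```

The length prefix is what lets the receiver know where one request ends and the next begins on the TCP stream.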