Kafka high throughput
1. Sequential reading and writing
Kafka continuously appends messages to the end of a log file, which lets it take full advantage of the disk's sequential read and write performance.
Sequential access avoids most of the head seek time of a hard disk and leaves only a small rotational delay, so it is much faster than random reads and writes.
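The append-only pattern can be sketched in a few lines. This is a toy illustration, not Kafka's actual log format: the class name `AppendOnlyLog`, the 4-byte length prefix, and the file name are all assumptions made for the example.

```python
import os
import struct
import tempfile


class AppendOnlyLog:
    """Toy append-only log: every record is written at the end of one file."""

    def __init__(self, path):
        self.path = path
        # "ab" opens for appending, so writes always land at the end of the file.
        self._file = open(path, "ab")

    def append(self, payload: bytes) -> int:
        """Append one record and return the byte position where it starts."""
        position = self._file.tell()
        # Hypothetical record layout: 4-byte big-endian length, then the payload.
        self._file.write(struct.pack(">I", len(payload)) + payload)
        self._file.flush()
        return position

    def read_all(self):
        """Scan the file from start to end, yielding payloads in write order."""
        with open(self.path, "rb") as reader:
            while True:
                header = reader.read(4)
                if len(header) < 4:
                    break
                (length,) = struct.unpack(">I", header)
                yield reader.read(length)

    def close(self):
        self._file.close()


def demo():
    """Write two records to a temporary log and read them back in order."""
    directory = tempfile.mkdtemp()
    log = AppendOnlyLog(os.path.join(directory, "example.log"))
    positions = [log.append(b"a"), log.append(b"bc")]
    records = list(log.read_all())
    log.close()
    return positions, records
```

Both writing and reading move through the file in one direction only, which is the access pattern that disks handle best.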
2. Zero copy
Linux kernel 2.2 introduced the sendfile system call, a mechanism commonly called "zero-copy". It transfers data from the kernel's page cache to the socket inside the kernel, so the data is no longer copied into a user-mode buffer and back again. Kafka uses this path, through Java's FileChannel.transferTo, when it sends log data to consumers.
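A minimal sketch of the same idea in Python, assuming a local socket pair stands in for a consumer connection. `socket.sendfile` uses `os.sendfile` where the platform provides it and falls back to ordinary `send()` otherwise; the function names `send_file_zero_copy` and `demo` are made up for this example.

```python
import os
import socket
import tempfile


def send_file_zero_copy(path: str, conn: socket.socket) -> int:
    """Send a whole file over a connected socket; return the bytes sent."""
    with open(path, "rb") as f:
        # Where os.sendfile is available, the kernel moves the bytes to the
        # socket without passing them through a user-space buffer.
        return conn.sendfile(f)


def demo() -> bytes:
    """Push a small temporary file through a local socket pair and read it back."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"message-1\nmessage-2\n")
    sender, receiver = socket.socketpair()
    try:
        sent = send_file_zero_copy(tmp.name, sender)
        sender.close()
        received = b""
        while len(received) < sent:
            chunk = receiver.recv(4096)
            if not chunk:
                break
            received += chunk
        return received
    finally:
        sender.close()
        receiver.close()
        os.unlink(tmp.name)
```

The application code never reads the file contents into its own buffer; it only hands the file object and the socket to the operating system.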
3. Partition
A topic in Kafka can be divided into multiple partitions, and each partition is further divided into multiple segments. Each operation therefore touches only a small part of the data, which keeps it lightweight and increases the capacity for parallel processing.
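The two levels of division can be sketched as follows. This is an illustration under stated assumptions: `zlib.crc32` is a stand-in hash (Kafka's default partitioner uses murmur2 for keyed messages), and the helper names are hypothetical. The segment lookup assumes each segment is identified by the first offset it contains.

```python
import bisect
import zlib


def choose_partition(key: bytes, num_partitions: int) -> int:
    """Map a message key to a partition index in [0, num_partitions)."""
    # crc32 is only a stand-in for this sketch, not Kafka's real hash.
    return zlib.crc32(key) % num_partitions


def route(messages, num_partitions):
    """Group (key, value) pairs into per-partition lists, preserving order."""
    partitions = [[] for _ in range(num_partitions)]
    for key, value in messages:
        partitions[choose_partition(key, num_partitions)].append(value)
    return partitions


def find_segment(base_offsets, offset):
    """Return the base offset of the segment that holds `offset`.

    `base_offsets` must be sorted ascending and start at or below `offset`.
    """
    index = bisect.bisect_right(base_offsets, offset) - 1
    return base_offsets[index]
```

Messages with the same key always land in the same partition, so different partitions can be written and read in parallel, and a read for a given offset only has to open one segment rather than the whole partition.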
4. Batch sending
Kafka allows messages to be sent in batches. When the producer sends a message, it can buffer the message locally and send the buffered messages to Kafka once a configured condition is met:
- The number of buffered messages reaches a fixed count
- A fixed time interval has elapsed
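The two flush conditions can be sketched as a small buffer class. This is not the Kafka producer's implementation: `BatchingBuffer`, its parameters, and the `send` callback are hypothetical names for illustration. In the Java producer the comparable settings are `batch.size` (a limit in bytes rather than a message count) and `linger.ms`.

```python
import time


class BatchingBuffer:
    """Buffer messages and flush when a count or age threshold is reached."""

    def __init__(self, send, max_messages=3, max_wait_seconds=0.05,
                 clock=time.monotonic):
        self._send = send                  # callable that receives a list of messages
        self._max_messages = max_messages
        self._max_wait = max_wait_seconds
        self._clock = clock                # injectable so the timing can be tested
        self._buffer = []
        self._first_at = None              # arrival time of the oldest buffered message

    def add(self, message):
        """Buffer one message; flush at once if the count threshold is hit."""
        if not self._buffer:
            self._first_at = self._clock()
        self._buffer.append(message)
        if len(self._buffer) >= self._max_messages:
            self.flush()

    def poll(self):
        """Call periodically; flushes if the oldest message has waited too long."""
        if self._buffer and self._clock() - self._first_at >= self._max_wait:
            self.flush()

    def flush(self):
        """Send everything buffered so far as one batch."""
        if self._buffer:
            batch, self._buffer = self._buffer, []
            self._send(batch)
```

A caller would invoke `add` for every message and `poll` from a timer or background loop, so that a slow trickle of messages is still sent after the wait limit.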
5. Data compression
Kafka also supports compressing message sets. The producer can compress a message set in GZIP or Snappy format. The advantage of compression is that it reduces the amount of data transmitted and eases the pressure on the network.
Batch sending and data compression work best together: compressing a single message on its own has little effect.
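The difference between compressing messages one by one and compressing a whole batch can be checked with a short experiment. The sketch below uses Python's `gzip` module and made-up JSON-like messages; the helper names and the message contents are assumptions for illustration, and the exact sizes depend on the data.

```python
import gzip


def compressed_sizes(messages):
    """Return (sum of per-message gzip sizes, gzip size of the whole batch)."""
    individually = sum(len(gzip.compress(m)) for m in messages)
    as_batch = len(gzip.compress(b"\n".join(messages)))
    return individually, as_batch


def demo():
    """Compare raw, per-message, and batched sizes for similar messages."""
    # Hypothetical log-like messages that share most of their structure.
    messages = [
        b'{"user_id": %d, "event": "page_view", "page": "/home"}' % i
        for i in range(200)
    ]
    raw = sum(len(m) for m in messages)
    individually, as_batch = compressed_sizes(messages)
    return raw, individually, as_batch
```

Each individually compressed message carries its own gzip header and trailer and offers the compressor very little repetition to exploit, whereas a batch of similar messages shares structure that compresses well.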