Knowledge point overview

1: Why is Kafka's read and write performance so fast?

      Kafka uses sequential reads and writes

         Sequential disk reads and writes run at roughly 600 MB/s; random access manages only about 100 KB/s.

         Random access and sequential access are the two patterns of disk input and output.

         Random write: data stored on disk occupies disk space. As the operating system keeps writing data files to disk and later deletes some of that data, the storage the deleted data occupied is freed up again. Continuously writing and deleting data over a long time produces a lot of space fragmentation, which causes a large data file to end up scattered across many non-contiguous regions of storage; reading and writing that part of the data is random I/O.

        Random access means the disk head must constantly seek in order to read and write data at different positions; compared with sequential access over a contiguous region, this costs far more time. At boot time, or when a large program starts, the computer reads a large number of small files that are not stored contiguously, which also falls under random reads. A minimal sketch of the sequential, append-only pattern follows below.
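
        To make the append-only idea concrete, here is a minimal Java sketch (not Kafka's actual code) of a sequential log writer: the file is opened in append mode, so every write lands at the current end of the file and the disk sees a purely sequential write pattern. The file name demo.log and the message contents are assumptions for illustration.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class SequentialAppendDemo {
    public static void main(String[] args) throws IOException {
        // APPEND mode: every write goes to the current end of the file,
        // so the disk never has to seek backwards between writes.
        try (FileChannel log = FileChannel.open(
                Path.of("demo.log"),            // hypothetical log file
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.APPEND)) {
            for (int i = 0; i < 10; i++) {
                ByteBuffer record = ByteBuffer.wrap(
                        ("message-" + i + "\n").getBytes(StandardCharsets.UTF_8));
                log.write(record); // appended sequentially at the tail
            }
        }
    }
}
```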

 

       A workaround is disk defragmentation, which consolidates fragmented files; but new fragments accumulate over time and read/write performance degrades again, and defragmentation cannot fix the random reading of many small files, so it is only a temporary measure. A better solution is to switch to a solid-state drive (SSD), which completely eliminates the mechanical hard disk's head movement and greatly improves read and write performance.

 

      Memory Mapped Files

         Even sequential writes to a hard disk cannot match the access speed of memory. So Kafka does not write data to disk in real time; it makes full use of the modern operating system's paged memory (the page cache) to improve I/O efficiency.

         Memory Mapped Files, also known as memory-mapped files, can generally represent a data file of around 20 GB on a 64-bit operating system. They work by using the operating system's pages directly to map the file into physical memory. Once the mapping is complete, operations on the mapped memory are synchronized to the hard disk by the operating system.

         Through mmap, a process reads and writes the file as if it were reading and writing memory (virtual memory, of course), without having to care about the size of physical memory; virtual memory handles all of that for us.
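
         As a minimal sketch of the same mechanism, Java NIO exposes mmap through FileChannel.map, which returns a MappedByteBuffer: writes to the buffer are plain memory stores that the OS later flushes to disk. The file name, mapping size, and payload below are assumptions for illustration, not Kafka's code.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MmapDemo {
    public static void main(String[] args) throws IOException {
        try (FileChannel ch = FileChannel.open(
                Path.of("mmap-demo.dat"),       // hypothetical data file
                StandardOpenOption.CREATE,
                StandardOpenOption.READ,
                StandardOpenOption.WRITE)) {
            // Map the first 4 KiB of the file into the process address space.
            MappedByteBuffer buf =
                    ch.map(FileChannel.MapMode.READ_WRITE, 0, 4096);

            // Writing to the buffer is just a memory store; the OS flushes
            // the dirty pages back to the file on disk later.
            buf.put("hello, mmap".getBytes(StandardCharsets.UTF_8));

            // force() asks the OS to flush the mapped pages now, like msync(2).
            buf.force();
        }
    }
}
```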

 

      How consumers read data

         How do you optimize a web server sending a static file? The answer is zero copy. Under the traditional model, sending a file from the hard disk works like this: the data is first copied into kernel space (the read system call uses DMA, hence kernel space), then copied into user space, then copied from user space back into kernel space (the socket buffer), and finally sent to the network card.

         With zero copy, the data goes from kernel space (the DMA buffer) straight to kernel space (the socket buffer), and from there directly to the network card.
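
         In Java, this zero-copy path is exposed as FileChannel.transferTo, which maps to sendfile(2) on Linux. The sketch below is a minimal illustration, not Kafka's code; the host localhost, port 9000, and file name demo.log are assumptions.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.FileChannel;
import java.nio.channels.SocketChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ZeroCopySend {
    public static void main(String[] args) throws IOException {
        try (FileChannel file = FileChannel.open(
                     Path.of("demo.log"), StandardOpenOption.READ);
             SocketChannel socket = SocketChannel.open(
                     new InetSocketAddress("localhost", 9000))) {
            long position = 0;
            long remaining = file.size();
            // transferTo() maps to sendfile(2) on Linux: bytes move from the
            // page cache to the socket without ever entering user space.
            while (remaining > 0) {
                long sent = file.transferTo(position, remaining, socket);
                position += sent;
                remaining -= sent;
            }
        }
    }
}
```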

To sum up:

         Kafka stores all messages in flat log files. It improves I/O speed with mmap; when writing, data is appended at the end of the file, so the write speed is optimal; when reading, data is pushed straight out with sendfile.

 
