How Does Kafka's Architecture Achieve Millions of Concurrent Writes per Second?

This article discusses some of Kafka's architectural design principles, a topic that comes up very frequently in interviews at Internet companies.

Kafka is a high-throughput, low-latency, high-concurrency, high-performance messaging middleware, very widely used in the big data field. A well-configured Kafka cluster can handle hundreds of thousands, even millions, of concurrent writes per second.

So how exactly does Kafka achieve such high throughput and performance? Let's talk through it.

1. Page Cache Technology + Sequential Disk Writes

First of all, Kafka writes every piece of data it receives to disk, as shown below:

This naturally raises a question: if data is stored on disk, and data is frequently written to disk files, won't performance be poor? Most people would assume that disk write performance is terrible.

Yes, if it really were as simple as the figure above, performance would indeed be quite poor.

But in fact Kafka has a very clever design here to guarantee write performance: Kafka writes files through the operating system's page cache.

The operating system itself maintains a layer of cache in memory called the page cache; we can also call it the os cache, meaning a cache managed by the operating system itself.

When writing to a disk file, the data can be written directly into the os cache, i.e. into memory only; the operating system then decides on its own when to actually flush the os cache data into the disk file.

This one step alone greatly improves disk-file write performance, because it is effectively writing to memory rather than to disk. It looks like this:
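To make this concrete, here is a minimal sketch (plain Python, not Kafka code; the file name `demo.log` is just for illustration). A `write()` call returns as soon as the data lands in the page cache, while `fsync()` is what actually blocks until the data reaches the disk:

```python
import os
import time

# Open an append-only file, the same access pattern as a log segment.
fd = os.open("demo.log", os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
payload = b"x" * 4096

t0 = time.perf_counter()
os.write(fd, payload)            # returns after copying into the os cache
cached = time.perf_counter() - t0

t0 = time.perf_counter()
os.fsync(fd)                     # blocks until the kernel flushes to disk
flushed = time.perf_counter() - t0

os.close(fd)
os.remove("demo.log")

# On a typical machine the buffered write is far faster than the fsync.
print(f"write to page cache: {cached*1e6:.1f} us, fsync to disk: {flushed*1e6:.1f} us")
```

Kafka relies on the kernel to do the flushing in the background, so the hot path pays only the memory-copy cost.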

The other crucial point about how Kafka writes data: it uses sequential disk writes. That is, it only appends data to the end of the file, and never modifies data at random positions within the file.

For a conventional mechanical disk, random writes (seeking to an arbitrary position in the file and writing data there) have genuinely poor performance.

But if you append data sequentially at the end of the file, sequential disk write performance can be basically on par with writing to memory.
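The append-only idea can be sketched with a toy log class (hypothetical names and a simplified length-prefixed layout, not Kafka's actual on-disk format):

```python
import os

class AppendOnlyLog:
    """A toy append-only segment file: records are only ever appended
    at the end, never rewritten in place."""

    def __init__(self, path: str):
        self.path = path
        self.f = open(path, "ab")           # O_APPEND: writes land at the end
        self.index = []                     # logical offset -> byte position

    def append(self, record: bytes) -> int:
        self.index.append(self.f.tell())
        # length-prefixed record, appended sequentially
        self.f.write(len(record).to_bytes(4, "big") + record)
        self.f.flush()
        return len(self.index) - 1          # the record's logical offset

    def read(self, offset: int) -> bytes:
        with open(self.path, "rb") as r:
            r.seek(self.index[offset])
            size = int.from_bytes(r.read(4), "big")
            return r.read(size)

log = AppendOnlyLog("segment-0.log")
log.append(b"msg-0")
log.append(b"msg-1")
first, second = log.read(0), log.read(1)
log.f.close()
os.remove("segment-0.log")
print(first, second)  # b'msg-0' b'msg-1'
```

Because the write position only ever moves forward, a mechanical disk's head never has to seek, which is the whole point of the design.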

So from the figure above we know that when Kafka writes data, on the one hand it writes through the OS-level page cache, which is high-performance because it is essentially writing to memory.

On the other hand, it uses sequential disk writes, so even when the data is flushed to disk, performance is still very high, roughly comparable to writing to memory.

Based on these two points, Kafka achieves ultra-high-performance data writes.

Now think about it: if writing one piece of data to Kafka took 1 millisecond, that would mean only 1,000 writes per second, right? But if Kafka is so performant that writing one piece of data takes only 0.01 milliseconds, doesn't that mean 100,000 writes per second?

So the key to guaranteeing tens or even hundreds of thousands of writes per second is to maximize the performance of each individual write, so that more data can be written per unit of time, raising throughput.
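The back-of-the-envelope arithmetic above, written out (a single-writer simplification, ignoring batching and parallelism):

```python
# With one writer, throughput is simply one second divided by the
# per-record write latency.
def writes_per_second(latency_ms: float) -> int:
    return round(1000 / latency_ms)

slow = writes_per_second(1.0)    # 1 ms per record
fast = writes_per_second(0.01)   # 0.01 ms per record
print(slow, fast)  # 1000 100000
```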

2. Zero-Copy

Having covered the write path, let's talk about the consumption side.

As you probably know, we frequently consume data from Kafka, and consuming essentially means reading data from Kafka's disk files and sending it to downstream consumers, as shown below.

If data is frequently read from disk and sent to consumers like this, where is the performance bottleneck?

Suppose Kafka did no optimization at all and simply read data from disk and sent it to the downstream consumer. The process would look roughly like this:

First, check whether the data to be read is in the os cache; if not, read it from the disk file into the os cache.

Next, the operating system copies the data from the os cache into the application process's cache; then the application copies the data from its own cache into the OS-level socket buffer; finally, the data is extracted from the socket buffer and sent to the network card, and from there out to the downstream consumer.
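The unoptimized path described above can be sketched as follows (plain Python; a socketpair stands in for the consumer connection, and `naive_send` is a hypothetical helper, not a Kafka API):

```python
import os
import socket

def naive_send(path: str, sock: socket.socket) -> int:
    """The two extra copies the text describes: read() pulls data from the
    page cache into a userspace buffer, send() copies it back into the
    kernel's socket buffer."""
    total = 0
    with open(path, "rb") as f:
        while chunk := f.read(64 * 1024):  # copy 1: os cache -> app buffer
            sock.sendall(chunk)            # copy 2: app buffer -> socket buffer
            total += len(chunk)
    return total

with open("payload.bin", "wb") as f:
    f.write(b"abc" * 1000)

left, right = socket.socketpair()          # stand-in for the consumer's link
sent = naive_send("payload.bin", left)
left.close()

received = b""
while chunk := right.recv(4096):
    received += chunk
right.close()
os.remove("payload.bin")
print(sent, len(received))  # 3000 3000
```

Every pass through that loop crosses the kernel/userspace boundary twice, which is exactly the overhead zero-copy eliminates.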

The whole process is shown below:

Looking at this diagram carefully, you can clearly see that there are two unnecessary copies!

One is the copy from the os cache into the application process's cache, and another from the application cache back into the OS socket buffer.

And to perform these two copies, several context switches occur in between: for a while the application is executing, then the context switches to the operating system.

So reading data this way burns a lot of performance.

To solve this problem, Kafka introduces zero-copy technology on the read path.

That is, the data in the operating system's cache is sent directly to the network card and then on to the downstream consumer, skipping the two data-copy steps in between. Only a descriptor is copied into the socket buffer; no data is copied into the socket buffer.

Look at the figure below to appreciate this elegant process:

With zero-copy, there is no need to copy the data from the os cache into the application cache and then from the application cache into the socket buffer; both copies are skipped, which is why it is called zero-copy.

Only a descriptor of the data is copied into the socket buffer; the data itself is then sent directly from the os cache to the network card. This greatly improves the performance of reading file data during consumption.
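On Linux, the zero-copy primitive behind this is the sendfile(2) system call, which Kafka invokes through Java's `FileChannel.transferTo`; Python exposes it as `os.sendfile`. A minimal sketch (the file name and toy payload are just for illustration):

```python
import os
import socket
import threading

# A toy "log segment" to serve.
with open("segment.bin", "wb") as f:
    f.write(b"kafka-record" * 100)

srv = socket.socket()
srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
srv.bind(("127.0.0.1", 0))               # any free port
srv.listen(1)
port = srv.getsockname()[1]

def serve_one():
    conn, _ = srv.accept()
    with open("segment.bin", "rb") as f:
        size = os.fstat(f.fileno()).st_size
        sent = 0
        while sent < size:
            # zero-copy: os cache -> network card, no userspace buffer
            sent += os.sendfile(conn.fileno(), f.fileno(), sent, size - sent)
    conn.close()

t = threading.Thread(target=serve_one)
t.start()

cli = socket.create_connection(("127.0.0.1", port))
data = b""
while chunk := cli.recv(4096):
    data += chunk
cli.close()
t.join()
srv.close()
os.remove("segment.bin")
print(len(data))  # 1200
```

The application only hands the kernel a file descriptor, an offset, and a byte count; the bytes themselves never pass through the application's memory.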

You will also notice that when reading data from disk, the os cache is checked first; if the data is there, it is read directly from memory.

If a Kafka cluster is well tuned, you will find that large amounts of data are written directly into the os cache and then read back from the os cache as well.

In effect, Kafka serves both reads and writes entirely from memory, which makes the overall performance extremely high.

As an aside, another time I may walk through the architecture of Elasticsearch; under the hood, ES also relies heavily on the os cache for high-performance retrieval over massive data sets, similar in principle to Kafka.

3. Final Summary

Through this article's walkthrough of Kafka's use of page cache technology, its sequential-disk-write approach, and its use of zero-copy, you should now understand the strategy each Kafka machine uses to read and write data at the lowest level, and why its performance can be so high, reaching a throughput of hundreds of thousands of messages per second.

This design thinking is a great help both when designing our own middleware architectures and when going out to interviews.



Origin: blog.csdn.net/chengxvsyu/article/details/93367461