How Kafka, as messaging middleware, achieves hundreds of thousands of QPS
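Sequential disk writes: messages are appended to the end of the log file (append-only), so the disk is written sequentially rather than at random positions
Page cache: writes do not go straight to disk; they go to the operating system's page cache (OS cache) first
Zero-copy: on reads, data that is already in the OS cache is taken from the OS cache, otherwise it is fetched from disk; either way it is sent directly to the network card with no additional copy operations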
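The append-only write path and the zero-copy read path above can be sketched in a few lines. This is a toy illustration under stated assumptions, not Kafka's implementation: `AppendOnlyLog`, `append`, and `send_to` are hypothetical names, and Python's `socket.sendfile` stands in for the file-to-socket transfer the broker performs.

```python
import os
import socket


class AppendOnlyLog:
    """Toy single-file log (hypothetical, not Kafka's code): append-only writes."""

    def __init__(self, path):
        self.path = path
        # Append mode: every write goes to the current end of the file.
        self._file = open(path, "ab")
        self._size = os.path.getsize(path)

    def append(self, payload: bytes) -> int:
        """Append one record; return the byte offset at which it starts."""
        offset = self._size
        self._file.write(payload)
        # flush() hands the bytes to the OS page cache. It does not force
        # them onto the physical disk; that would take os.fsync().
        self._file.flush()
        self._size += len(payload)
        return offset

    def send_to(self, sock: socket.socket, offset: int, count: int) -> int:
        """Send `count` bytes starting at `offset` from the file to the socket."""
        with open(self.path, "rb") as f:
            # socket.sendfile uses os.sendfile where the platform supports it,
            # so the bytes need not pass through this process's own buffers.
            return sock.sendfile(f, offset=offset, count=count)

    def close(self):
        self._file.close()
```

Appending `b"hello"` and then `b"kafka"` yields start offsets 0 and 5, and `send_to(sock, 5, 5)` delivers only the second record to the peer. Recently written bytes are normally still in the page cache, so such a read usually does not touch the disk at all.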
How Kafka, as messaging middleware, avoids losing messages
Distributed storage: a large volume of data cannot be stored on a single server, so the data of a topic is split into multiple Partitions that are distributed across servers;
High-availability architecture: each Partition can have replicas of its data. The primary replica is the Leader and the other replicas are Followers, which live on other servers. When the server hosting the Leader goes down, a Follower is promoted to Leader;
ISR mechanism to prevent data loss:
(1) Scenario: suppose a message has been written to the Leader but the Follower has not yet synchronized it, and then the Leader's host (or the Leader process itself) goes down. The Follower is promoted to Leader, and that message is missing.
(2) ISR solution: Kafka automatically maintains an ISR (in-sync replicas) list for each Partition. The list always contains the Leader, plus every Follower that is keeping pace with the Leader. A Follower that falls behind is kicked out of the ISR list, so if scenario (1) occurs, the new Leader can be chosen from the Followers that are still in sync.
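The ISR idea can be illustrated with a small in-memory model. This is a sketch under simplifying assumptions, not Kafka's code: the `Partition` class and its methods are hypothetical, each replica is reduced to a count of messages it holds, and lag is measured in messages (real brokers judge lag by time rather than by message count).

```python
class Partition:
    """Toy model of one partition's replicas (hypothetical names throughout)."""

    def __init__(self, leader, followers, max_lag=0):
        self.leader = leader
        # replica name -> number of messages that replica currently holds
        self.log_end = {leader: 0, **{f: 0 for f in followers}}
        self.max_lag = max_lag

    def append_to_leader(self, n=1):
        """The leader accepts n new messages."""
        self.log_end[self.leader] += n

    def follower_fetch(self, replica):
        """A follower catches up fully with the leader."""
        self.log_end[replica] = self.log_end[self.leader]

    def isr(self):
        """Leader plus every follower within max_lag messages of it."""
        leader_end = self.log_end[self.leader]
        return [r for r, end in self.log_end.items()
                if leader_end - end <= self.max_lag]

    def fail_leader(self):
        """Leader dies: promote a follower, but only one that is in the ISR."""
        candidates = [r for r in self.isr() if r != self.leader]
        if not candidates:
            raise RuntimeError("no in-sync follower available for promotion")
        del self.log_end[self.leader]
        self.leader = candidates[0]
        return self.leader
```

With leader `b1` and followers `b2` and `b3`, if one message is appended and only `b2` fetches it, the ISR is `["b1", "b2"]`. When `b1` fails, `b2` is promoted and the message survives; `b3`, which never had it, is not eligible.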
Kafka write flow
Each Partition must have at least one Follower in its ISR list that keeps up with the Leader's data
Every write must succeed on the Partition Leader and on at least one ISR Follower before it is considered successful
If these two conditions are not met, the write fails, and the producing system keeps retrying until both conditions are met; only then is the write considered successful
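The three rules above can be sketched as a toy write path. This is an illustration under assumptions, not Kafka's API: `NotEnoughReplicasError`, `write_once`, and `send_with_retry` are hypothetical names, replica logs are plain Python lists, and the retry loop is bounded by `max_attempts` so the example terminates (a real client would also wait between attempts).

```python
class NotEnoughReplicasError(Exception):
    """Raised when the ISR is too small to accept a write (hypothetical name)."""


def write_once(record, leader_log, isr_follower_logs, min_insync=2):
    """Accept the write only if leader + ISR followers reach min_insync."""
    if 1 + len(isr_follower_logs) < min_insync:
        raise NotEnoughReplicasError("ISR too small to accept the write")
    leader_log.append(record)
    for log in isr_follower_logs:
        log.append(record)
    # Offset of the record in the leader's log.
    return len(leader_log) - 1


def send_with_retry(record, attempt, max_attempts=5):
    """Call attempt(record) until it succeeds; return (offset, attempts used)."""
    for n in range(1, max_attempts + 1):
        try:
            return attempt(record), n
        except NotEnoughReplicasError:
            if n == max_attempts:
                raise  # give up and surface the failure to the caller
```

If the only Follower has dropped out of the ISR, `write_once` rejects the record and nothing is appended; once the Follower is back in the ISR, the retried write lands on both the Leader and the Follower.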
Configure the corresponding parameters along these lines to ensure that data written to Kafka is not lost
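The notes do not name the parameters. As a hedged reading, the standard Kafka settings below correspond to the rules described; the specific values are illustrative assumptions and should be checked against the documentation for the Kafka version in use.

```properties
# Producer side
# A write counts as successful only after all in-sync replicas have it.
acks=all
# Keep retrying failed sends (assumed value: effectively unlimited).
retries=2147483647

# Broker / topic side
# More than one replica per partition, so a Follower exists at all (assumed value).
default.replication.factor=3
# Leader plus at least one Follower must be in the ISR for a write to be accepted.
min.insync.replicas=2
# Never promote a replica that is outside the ISR.
unclean.leader.election.enable=false
```

With `acks=all` and `min.insync.replicas=2`, a write is rejected whenever the Leader is the only replica left in the ISR, which is what forces the producer to retry.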