Kafka Getting Started, Part III: An Analysis of the Producer Message Partitioning Mechanism

1. Why partition?
Kafka has the concept of a topic (Topic), a logical container that carries the actual data. Each topic is further divided into several partitions, so Kafka organizes messages in a three-level structure: topic, partition, message. Each message under a topic is stored in exactly one partition; it is not copied into multiple partitions. The official website has a diagram that shows this three-level structure clearly.
The purpose of partitioning is to provide load balancing. Put differently, the main reason for partitioning data is to make the system highly scalable. Different partitions can be placed on different node machines, and data reads and writes are performed at partition granularity, so each node can independently handle the read and write requests for its own partitions. We can also increase the throughput of the overall system by adding new node machines.
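
To make the three-level structure concrete, here is a minimal in-memory sketch. It is not how Kafka stores data; the `Topic` class, its `append` method, and the topic name `orders` are hypothetical names used only for illustration. It shows that a message lands in exactly one partition and gets a position within that partition (Kafka calls this position the offset).

```python
class Topic:
    """Toy model of a topic: a fixed number of append-only partitions."""

    def __init__(self, name, num_partitions):
        self.name = name
        # One independent, ordered log per partition.
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, partition, message):
        """Store the message in exactly one partition and return its position there."""
        log = self.partitions[partition]
        log.append(message)
        return len(log) - 1


topic = Topic("orders", num_partitions=3)
offset_a = topic.append(0, "order-created")   # first message in partition 0
offset_b = topic.append(0, "order-paid")      # second message in partition 0
offset_c = topic.append(2, "order-shipped")   # first message in partition 2
total_messages = sum(len(p) for p in topic.partitions)
```

Because each partition is its own log, partitions 0 and 2 could live on different machines and accept writes independently, which is the scalability argument made above.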
 
2. What partitioning strategies are there?
A partitioning strategy is the algorithm that decides which partition the producer sends a message to. Kafka provides a default partitioning strategy, and it also supports custom partitioning strategies. The common strategies are as follows:
  1. Round-robin strategy
The round-robin strategy distributes messages to partitions in order. Suppose a topic has three partitions: the first message is sent to partition 0, the second to partition 1, the third to partition 2, and so on, so the fourth message goes back to partition 0.
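If no partitioning strategy is specified, the producer writes messages to all partitions of the topic in round-robin fashion. This describes messages without a key in older clients; since Kafka 2.4 the default for keyless messages is a sticky strategy that fills a batch for one partition before moving to the next.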
The round-robin strategy balances load very well: it always spreads messages as evenly as possible across all partitions, which makes it the most reasonable default strategy and one of the most commonly used.
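
The round-robin assignment described above can be sketched in a few lines. This is an illustrative model, not Kafka's implementation; `RoundRobinPartitioner` and its `partition` method are hypothetical names.

```python
import itertools


class RoundRobinPartitioner:
    """Assigns messages to partitions 0, 1, ..., n-1, then wraps around."""

    def __init__(self):
        # Counts how many messages have been assigned so far.
        self._counter = itertools.count()

    def partition(self, num_partitions):
        # The n-th message (starting from 0) goes to partition n mod num_partitions.
        return next(self._counter) % num_partitions


rr = RoundRobinPartitioner()
assignments = [rr.partition(3) for _ in range(4)]
# The fourth message wraps back to partition 0.
```

With three partitions, the four calls yield partitions 0, 1, 2 and then 0 again, matching the example in the text.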
  2. Random strategy
The random strategy, also called the Randomness strategy, places each message on a randomly chosen partition.
It first gets the number of partitions of the topic, then returns a random non-negative integer smaller than that number as the partition index.
In essence, the random strategy also tries to spread messages evenly across all partitions, but in practice it distributes them less evenly than the round-robin strategy.
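
A minimal sketch of the random strategy, assuming only what the text states: pick a random integer below the partition count. The function name `random_partition` and the fixed seed are illustrative choices, not part of Kafka.

```python
import random
from collections import Counter


def random_partition(num_partitions, rng=random):
    """Return a random partition index in the range [0, num_partitions)."""
    return rng.randrange(num_partitions)


# A seeded generator keeps the demonstration repeatable.
rng = random.Random(42)
picks = [random_partition(3, rng) for _ in range(1000)]
counts = Counter(picks)  # messages per partition; roughly, not exactly, equal
```

Over many messages the counts come out close to each other but are rarely identical, which is why round-robin gives a more even spread.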
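  3. Key-ordering strategy
Kafka allows you to define a message key, or Key for short, for each message. The Key can be a string with a clear business meaning, or it can be used to represent message metadata.
In the days when Kafka did not yet support timestamps, engineers in some scenarios packed the message creation time into the Key. Once a message has a key, you can guarantee that all messages with the same key go to the same partition. Because messages within a partition are processed in order, this strategy is called the key-ordering strategy.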
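
The idea can be sketched by hashing the key and taking the result modulo the partition count. Kafka's own default partitioner hashes the serialized key with murmur2; the sketch below swaps in `zlib.crc32` from the Python standard library as a stand-in deterministic hash, so the partition numbers will not match Kafka's. The function name `key_partition` and the sample keys are hypothetical.

```python
import zlib


def key_partition(key, num_partitions):
    """Map a key to a partition; the same key always maps to the same partition."""
    # crc32 is a stand-in hash (an assumption); Kafka itself uses murmur2.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


events = [
    ("user-1", "login"),
    ("user-2", "login"),
    ("user-1", "purchase"),
    ("user-1", "logout"),
]

partitions = [[] for _ in range(3)]
for key, value in events:
    partitions[key_partition(key, 3)].append((key, value))

# All events for user-1 sit in one partition, in the order they were sent.
user1_partition = key_partition("user-1", 3)
user1_events = [v for k, v in partitions[user1_partition] if k == "user-1"]
```

Ordering is only guaranteed within a key's partition; events for different keys that land in different partitions have no defined order relative to each other.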

Origin www.cnblogs.com/tugeboke/p/11760387.html