Kafka partition allocation strategy

Kafka provides the consumer client parameter partition.assignment.strategy to set the partition allocation strategy between consumers and the topics they subscribe to. By default, this parameter is org.apache.kafka.clients.consumer.RangeAssignor, i.e. the RangeAssignor allocation strategy is used. In addition, Kafka provides two other allocation strategies: RoundRobinAssignor and StickyAssignor. The partition.assignment.strategy parameter can be set to multiple strategies, separated by commas.

One, RangeAssignor (default)

According to Kafka's default consumption logic, a partition can be consumed by only one consumer within the same consumer group (ConsumerGroup).

Suppose a consumer group currently has only one consumer, C0, subscribed to a topic that contains 7 partitions. C0 is then assigned all 7 partitions.

When a new consumer C1 joins the consumer group, some of the partitions originally assigned to C0 must be handed over to C1 according to the established allocation logic. C0 and C1 each consume the partitions assigned to them and do not interfere with each other.

When another consumer C2 joins the consumer group, C0, C1 and C2 each consume the partitions assigned to them.

If there are too many consumers, so that the number of consumers exceeds the number of partitions, some consumers are not assigned any partition. In the figure there are 8 consumers and 7 partitions, so the last consumer, C7, receives no partition and cannot consume any messages.

Principle:

The RangeAssignor strategy divides the number of partitions by the number of consumers to obtain a span, then hands out partitions in contiguous ranges of that span, so that partitions are spread across all consumers as evenly as possible. For each topic, RangeAssignor sorts the consumers in the group that subscribe to that topic lexicographically by name, then assigns each consumer a fixed range of partitions. If the partitions cannot be divided evenly, the consumers that come first in lexicographic order each receive one extra partition.

Suppose n = number of partitions / number of consumers (integer division) and m = number of partitions % number of consumers. Then each of the first m consumers is assigned n + 1 partitions, and each of the remaining (number of consumers - m) consumers is assigned n partitions.
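The counting rule above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the function name range_counts is made up for illustration and is not part of Kafka.

```python
def range_counts(num_partitions, num_consumers):
    """Return how many partitions each consumer gets, in sorted consumer order."""
    # n is the base share, m is the number of consumers that get one extra.
    n, m = divmod(num_partitions, num_consumers)
    return [n + 1 if i < m else n for i in range(num_consumers)]


print(range_counts(7, 3))  # [3, 2, 2]
print(range_counts(7, 8))  # [1, 1, 1, 1, 1, 1, 1, 0]
```

The second call corresponds to the case of 8 consumers and 7 partitions described earlier: the last consumer ends up with nothing.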

1, Suppose the consumer group has two consumers, C0 and C1, both subscribed to topics t0 and t1, and each topic has four partitions. The subscribed partitions are: t0p0, t0p1, t0p2, t0p3, t1p0, t1p1, t1p2, t1p3.

The final allocation result:

Consumer C0: t0p0, t0p1, t1p0, t1p1

Consumer C1: t0p2, t0p3, t1p2, t1p3

This allocation is perfectly even. But can the strategy always keep this property?

2, Suppose the two topics in the example above each have only three partitions. The subscribed partitions are then: t0p0, t0p1, t0p2, t1p0, t1p1, t1p2.

The final allocation result:

Consumer C0: t0p0, t0p1, t1p0, t1p1

Consumer C1: t0p2, t1p2

This allocation is clearly uneven. Because each topic is divided independently and C0 comes first in lexicographic order for both topics, C0 receives the extra partition of every topic. If the same situation repeats across more topics, some consumers may become overloaded.
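The two examples above can be reproduced with a small simulation. This is a simplified model of the per-topic range logic described in this section, not Kafka's Java implementation; range_assign and its argument shapes are assumptions chosen for illustration.

```python
def range_assign(partitions_per_topic, subscriptions):
    """partitions_per_topic: {topic: partition count}
    subscriptions: {consumer: set of subscribed topics}"""
    assignment = {consumer: [] for consumer in subscriptions}
    for topic in sorted(partitions_per_topic):
        # Consumers subscribed to this topic, in lexicographic order.
        members = sorted(c for c, topics in subscriptions.items() if topic in topics)
        if not members:
            continue
        n, m = divmod(partitions_per_topic[topic], len(members))
        start = 0
        for i, consumer in enumerate(members):
            length = n + 1 if i < m else n  # the first m consumers get one extra
            assignment[consumer] += [f"{topic}p{p}" for p in range(start, start + length)]
            start += length
    return assignment


subs = {"C0": {"t0", "t1"}, "C1": {"t0", "t1"}}
print(range_assign({"t0": 4, "t1": 4}, subs))  # even: 4 partitions each
print(range_assign({"t0": 3, "t1": 3}, subs))  # skewed: C0 gets 4, C1 gets 2
```

Because the loop starts over for every topic, the extra partition always lands on the same leading consumers, which is where the skew comes from.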

Two, RoundRobinAssignor

The RoundRobinAssignor strategy sorts all consumers in the consumer group, and all partitions of the topics they subscribe to, in lexicographic order, and then assigns the partitions to the consumers one by one in round-robin fashion.

The corresponding value of the partition.assignment.strategy parameter is org.apache.kafka.clients.consumer.RoundRobinAssignor.

1, If all consumers in the same consumer group have identical subscriptions, the allocation produced by RoundRobinAssignor is uniform.

Suppose the consumer group has two consumers, C0 and C1, both subscribed to topics t0 and t1, and each topic has three partitions. The subscribed partitions are: t0p0, t0p1, t0p2, t1p0, t1p1, t1p2.

The final allocation result:

Consumer C0: t0p0, t0p2, t1p1

Consumer C1: t0p1, t1p0, t1p2

2, If consumers in the same consumer group have different subscriptions, the assignment is no longer a complete round-robin, which may lead to an uneven allocation. If a consumer in the group does not subscribe to a topic, it is not assigned any partition of that topic.

Suppose the consumer group has three consumers, C0, C1 and C2, which together subscribe to three topics, t0, t1 and t2, with 1, 2 and 3 partitions respectively. The whole consumer group is therefore subscribed to six partitions: t0p0, t1p0, t1p1, t2p0, t2p1, t2p2. Specifically, C0 subscribes to topic t0, C1 subscribes to topics t0 and t1, and C2 subscribes to topics t0, t1 and t2.

The final allocation result:

Consumer C0: t0p0

Consumer C1: t1p0

Consumer C2: t1p1, t2p0, t2p1, t2p2

The RoundRobinAssignor strategy is therefore not perfect. This allocation is not optimal, because partition t1p1 could have been assigned to consumer C1 instead.
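Both round-robin results above can be reproduced with the following sketch. It models the behaviour described in this section (walk the sorted partitions, skip consumers that do not subscribe to the topic); round_robin_assign is an illustrative name, and the code is not Kafka's own implementation.

```python
def round_robin_assign(partitions_per_topic, subscriptions):
    """partitions_per_topic: {topic: partition count}
    subscriptions: {consumer: set of subscribed topics}"""
    members = sorted(subscriptions)
    assignment = {consumer: [] for consumer in members}
    i = 0  # position of the circular pointer over the sorted consumers
    for topic in sorted(partitions_per_topic):
        if not any(topic in subscriptions[c] for c in members):
            continue  # nobody subscribes to this topic, nothing to assign
        for p in range(partitions_per_topic[topic]):
            # Skip consumers that are not subscribed to this topic.
            while topic not in subscriptions[members[i % len(members)]]:
                i += 1
            assignment[members[i % len(members)]].append(f"{topic}p{p}")
            i += 1
    return assignment


same = {"C0": {"t0", "t1"}, "C1": {"t0", "t1"}}
print(round_robin_assign({"t0": 3, "t1": 3}, same))

mixed = {"C0": {"t0"}, "C1": {"t0", "t1"}, "C2": {"t0", "t1", "t2"}}
print(round_robin_assign({"t0": 1, "t1": 2, "t2": 3}, mixed))
```

In the second call the pointer has to skip C0 and C1 for every partition of t2, so all three of them pile up on C2.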

Three, StickyAssignor

Kafka introduced this allocation strategy starting with version 0.11.x.

It has two main goals:

(1) the allocation of partitions should be as even as possible;

(2) the allocation should stay as close as possible to the previous allocation.

When the two goals conflict, the first takes priority over the second. Because of these two goals, the implementation of StickyAssignor is much more complicated than that of RangeAssignor and RoundRobinAssignor. Let's look at some actual results of the StickyAssignor strategy.

1, Suppose the consumer group has three consumers, C0, C1 and C2, all subscribed to four topics, t0, t1, t2 and t3, and each topic has two partitions. The whole consumer group is therefore subscribed to eight partitions: t0p0, t0p1, t1p0, t1p1, t2p0, t2p1, t3p0, t3p1.

The final allocation result:

Consumer C0: t0p0, t1p1, t3p0

Consumer C1: t0p1, t2p0, t3p1

Consumer C2: t1p0, t2p1

At first glance this looks identical to the result of the RoundRobinAssignor strategy, but is that really the case? Suppose consumer C1 now leaves the consumer group. The group performs a rebalance and the partitions are reassigned. With the RoundRobinAssignor strategy, the allocation result is:

Consumer C0: t0p0, t1p0, t2p0, t3p0

Consumer C2: t0p1, t1p1, t2p1, t3p1

As this result shows, RoundRobinAssignor simply runs the round-robin again from scratch over consumers C0 and C2. With the StickyAssignor strategy, the allocation result is instead:

Consumer C0: t0p0, t1p1, t3p0, t2p0

Consumer C2: t1p0, t2p1, t0p1, t3p1

This result keeps every partition that C0 and C2 held in the previous allocation. The "burden" of the departed consumer C1 is shared between the two remaining consumers, C0 and C2, and the final allocation is still balanced.
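The idea of "keep what you have, hand out only what was freed" can be illustrated with a greedy sketch. This is not Kafka's StickyAssignor algorithm, which is considerably more involved and can also move partitions between surviving consumers to improve balance. The function sticky_reassign, its data shapes and the tie-breaking rule (earlier consumer name wins) are all assumptions made for this example.

```python
def sticky_reassign(previous, subscriptions):
    """previous: {consumer: [(topic, partition), ...]} from the last allocation,
    including consumers that have since left the group.
    subscriptions: {consumer: set of topics} for consumers still in the group."""
    # Surviving consumers keep every partition they owned and still subscribe to.
    assignment = {
        c: [tp for tp in previous.get(c, []) if tp[0] in subscriptions[c]]
        for c in subscriptions
    }
    kept = {tp for tps in assignment.values() for tp in tps}
    orphaned = sorted(tp for tps in previous.values() for tp in tps if tp not in kept)
    for tp in orphaned:
        candidates = [c for c in sorted(subscriptions) if tp[0] in subscriptions[c]]
        if not candidates:
            continue  # no remaining consumer subscribes to this topic
        # Give the partition to the least-loaded eligible consumer;
        # min() returns the first minimum, so ties go to the earlier name.
        target = min(candidates, key=lambda c: len(assignment[c]))
        assignment[target].append(tp)
    return assignment


previous = {
    "C0": [("t0", 0), ("t1", 1), ("t3", 0)],
    "C1": [("t0", 1), ("t2", 0), ("t3", 1)],
    "C2": [("t1", 0), ("t2", 1)],
}
topics = {"t0", "t1", "t2", "t3"}
print(sticky_reassign(previous, {"C0": topics, "C2": topics}))
```

For this particular input the greedy rule happens to give the same allocation as the StickyAssignor result listed above: C0 and C2 keep their five partitions, and C1's three partitions are split so that both end up with four.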

When partitions are reassigned, a partition may end up with a different consumer than before. Work that the previous consumer had only half finished then has to be redone by the new consumer, which is clearly a waste of system resources. As the word "sticky" in its name suggests, StickyAssignor gives the allocation a certain "stickiness", keeping two consecutive allocations as similar as possible and thereby reducing wasted system resources and other abnormal situations.

So far, every example has assumed that all consumers have identical subscriptions. Let's look at what happens when the subscriptions differ.

For example, suppose the same consumer group has three consumers, C0, C1 and C2, and the cluster has three topics, t0, t1 and t2, with 1, 2 and 3 partitions respectively. The cluster therefore has six partitions: t0p0, t1p0, t1p1, t2p0, t2p1, t2p2. Consumer C0 subscribes to topic t0, consumer C1 subscribes to topics t0 and t1, and consumer C2 subscribes to topics t0, t1 and t2.

With the RoundRobinAssignor strategy, the final allocation result is as follows (identical to the result in the RoundRobinAssignor section, repeated here for convenience):

Result set [1]:

Consumer C0: t0p0

Consumer C1: t1p0

Consumer C2: t1p1, t2p0, t2p1, t2p2

With the StickyAssignor strategy, the final allocation result is:

Result set [2]:

Consumer C0: t0p0

Consumer C1: t1p0, t1p1

Consumer C2: t2p0, t2p1, t2p2

This is an optimal solution. Consumer C0 does not subscribe to topics t1 and t2, so no partition of t1 or t2 can be assigned to it; the same reasoning applies to consumer C1 and topic t2.

If consumer C0 now leaves the consumer group, the allocation result of the RoundRobinAssignor strategy is:

Consumer C1: t0p0, t1p1

Consumer C2: t1p0, t2p0, t2p1, t2p2

The RoundRobinAssignor strategy keeps only three partitions of the original allocation of consumers C1 and C2: t2p0, t2p1 and t2p2 (compared with result set [1]). With the StickyAssignor strategy, the allocation result is:

Consumer C1: t1p0, t1p1, t0p0

Consumer C2: t2p0, t2p1, t2p2

The StickyAssignor strategy keeps all five partitions of the original allocation of consumers C1 and C2: t1p0, t1p1, t2p0, t2p1, t2p2.
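The "three partitions kept" versus "five partitions kept" comparison can be verified mechanically. The helper below is a hypothetical illustration: it simply counts the partitions that stay with the same consumer across the rebalance, using the allocation results listed in this article as input.

```python
def retained(before, after):
    """Count partitions that stay with the same consumer across a rebalance."""
    return sum(
        len(set(before.get(consumer, [])) & set(partitions))
        for consumer, partitions in after.items()
    )


# RoundRobinAssignor: result set [1], then the result after C0 leaves.
rr_before = {"C0": ["t0p0"], "C1": ["t1p0"], "C2": ["t1p1", "t2p0", "t2p1", "t2p2"]}
rr_after = {"C1": ["t0p0", "t1p1"], "C2": ["t1p0", "t2p0", "t2p1", "t2p2"]}

# StickyAssignor: result set [2], then the result after C0 leaves.
sticky_before = {"C0": ["t0p0"], "C1": ["t1p0", "t1p1"], "C2": ["t2p0", "t2p1", "t2p2"]}
sticky_after = {"C1": ["t1p0", "t1p1", "t0p0"], "C2": ["t2p0", "t2p1", "t2p2"]}

print(retained(rr_before, rr_after))          # 3
print(retained(sticky_before, sticky_after))  # 5
```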

Judging by the results, StickyAssignor is superior to the other two allocation strategies, but the code that implements it is very complex.

Origin blog.csdn.net/zuodaoyong/article/details/104383117