Kafka replication mechanism

First, what is a replication mechanism:

Replication generally means that a distributed system keeps copies of the same data on multiple machines connected over a network.

 

Second, the benefits of a replication mechanism:

1, provide data redundancy

Even if part of the system fails, the system can continue to operate, which increases overall availability and data durability.

2, provide high scalability

Supports horizontal scaling: read performance can be improved by adding machines, which raises read throughput.

3, improve data locality

Allows data to be placed geographically close to users, which reduces system latency.

 

Third, replicas in Kafka

1, a replica is essentially an append-only log: messages can only be appended to it

2, all replicas of the same partition store the same sequence of messages

3, replicas are spread across different Brokers, so data stays available when some of the Brokers go down (a Kafka cluster has several topics, each topic can be divided into several partitions, and each partition can be configured with multiple replicas)

[Figure: distribution of replicas across a Kafka cluster of three Brokers]
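To make the idea of spreading replicas across Brokers concrete, here is a minimal sketch. It uses a plain round-robin placement, which is an assumption chosen for illustration and not Kafka's actual assignment algorithm; the function name `assign_replicas` and the broker ids are hypothetical.

```python
def assign_replicas(num_partitions, replication_factor, broker_ids):
    """Place each partition's replicas on distinct brokers, round-robin.

    Illustrative only: real Kafka uses a more elaborate placement strategy.
    """
    if replication_factor > len(broker_ids):
        raise ValueError("replication factor cannot exceed the number of brokers")
    assignment = {}
    for partition in range(num_partitions):
        # Start at a different broker for each partition, then take the
        # next (replication_factor - 1) brokers, wrapping around.
        assignment[partition] = [
            broker_ids[(partition + i) % len(broker_ids)]
            for i in range(replication_factor)
        ]
    return assignment


# Three brokers, one topic with three partitions, two replicas per partition.
assignment = assign_replicas(num_partitions=3, replication_factor=2, broker_ids=[0, 1, 2])
for partition, brokers in assignment.items():
    print(f"partition {partition}: replicas on brokers {brokers}")
```

With this placement, losing any single Broker still leaves one replica of every partition on another Broker.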

 

Fourth, how Kafka ensures that all replicas of the same partition store the same sequence of messages:

It uses a leader-based (Leader-based) replication mechanism.

It works as follows:

 

1, Kafka divides replicas into two kinds: the leader replica (Leader Replica) and follower replicas (Follower Replica). When a partition is created, one replica is elected as the leader replica, and the remaining replicas automatically become follower replicas.

2, in Kafka, follower replicas do not serve clients. A follower replica does not handle client requests; its only task is to replicate from the leader replica. All read and write requests must be sent to the Broker that hosts the leader replica, and that Broker handles them. (So for now Kafka gets only one of the benefits of replication, namely data redundancy for high availability and high durability.)

3, when the Broker hosting the leader replica goes down, Kafka relies on ZooKeeper's watch capability to detect this in real time and immediately starts a new round of leader election, choosing one of the follower replicas as the new leader. When the old leader replica comes back after a restart, it can rejoin the cluster only as a follower replica.
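The three points above can be sketched as a small in-memory model. This is a toy simulation under stated assumptions, not Kafka code: the classes `Replica` and `Partition` and their methods are hypothetical, failover simply picks the first remaining replica, and log truncation on rejoin is not modelled.

```python
class Replica:
    def __init__(self, broker_id):
        self.broker_id = broker_id
        self.log = []  # append-only sequence of messages


class Partition:
    def __init__(self, broker_ids):
        self.replicas = {b: Replica(b) for b in broker_ids}
        self.leader_id = broker_ids[0]  # assumption: first replica starts as leader

    def produce(self, broker_id, message):
        # Only the broker hosting the leader replica accepts writes.
        if broker_id != self.leader_id:
            raise RuntimeError("not the leader for this partition")
        self.replicas[broker_id].log.append(message)

    def follower_fetch(self, follower_id):
        # A follower's only job: copy what it is missing from the leader.
        leader = self.replicas[self.leader_id]
        follower = self.replicas[follower_id]
        follower.log.extend(leader.log[len(follower.log):])

    def fail_leader(self):
        # Simplified failover: promote the first other replica.
        old = self.leader_id
        self.leader_id = next(b for b in self.replicas if b != old)
        return old


p = Partition([0, 1, 2])
p.produce(0, "m1")
p.produce(0, "m2")
p.follower_fetch(1)
p.follower_fetch(2)

old = p.fail_leader()             # broker 0 goes down, broker 1 takes over
p.produce(p.leader_id, "m3")
p.follower_fetch(old)             # broker 0 returns and catches up as a follower
print(p.leader_id, p.replicas[old].log)
```

The old leader ends up with the same log as the new leader, but only by fetching from it like any other follower.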

 

Fifth, under what conditions is a follower replica considered in sync with the Leader

Kafka introduces the In-sync Replicas set, also called the ISR. Replicas in the ISR are in sync with the Leader replica; conversely, follower replicas that are not in the ISR are considered out of sync with the Leader.

 

Sixth, Kafka In-sync Replicas (ISR)

1, the ISR is not just a set of follower replicas; it always includes the Leader replica. In some cases the ISR even contains only the Leader replica.

2, the Broker-side parameter replica.lag.time.max.ms (the maximum time a Follower replica may lag behind the Leader replica) controls which follower replicas count as in sync with the Leader. With a value of 10 seconds, for example, as long as a Follower replica does not lag behind the Leader replica continuously for more than 10 seconds, Kafka considers that Follower replica in sync with the Leader, even if at that moment it holds noticeably fewer messages than the Leader replica.

3, the ISR is a dynamically adjusted set, not a static one.

If a follower replica keeps pulling messages from the Leader replica more slowly than the Leader replica writes them, then once replica.lag.time.max.ms has elapsed the Follower replica is considered out of sync with the Leader replica and can no longer stay in the ISR. At that point Kafka automatically shrinks the ISR and "kicks" the replica out of it.

If that replica later gradually catches up with the Leader, it can be added back to the ISR.

4, if the ISR is empty, the leader replica is down as well, so the partition is unavailable and the Producer cannot send any messages to it. (Conversely, if the leader replica goes down while the ISR still has other replicas, a new leader can be elected from the ISR.)
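The time-based ISR rule can be illustrated with a short sketch. It is a simplification under assumptions: the class `IsrTracker` and its methods are hypothetical, time is passed in explicitly as milliseconds, and the 10-second threshold is the example value used in the text.

```python
REPLICA_LAG_TIME_MAX_MS = 10_000  # example value from the text


class IsrTracker:
    def __init__(self, leader_id, follower_ids, now_ms=0):
        self.leader_id = leader_id
        # Last time each follower was fully caught up with the leader.
        self.last_caught_up_ms = {f: now_ms for f in follower_ids}
        # The ISR always contains the leader itself.
        self.isr = {leader_id, *follower_ids}

    def record_fetch(self, follower_id, caught_up, now_ms):
        # Only a fetch that reaches the leader's log end resets the clock.
        if caught_up:
            self.last_caught_up_ms[follower_id] = now_ms

    def update_isr(self, now_ms):
        for follower_id, last in self.last_caught_up_ms.items():
            if now_ms - last > REPLICA_LAG_TIME_MAX_MS:
                self.isr.discard(follower_id)   # shrink: lagging too long
            else:
                self.isr.add(follower_id)       # expand: caught up again
        return self.isr


tracker = IsrTracker(leader_id=0, follower_ids=[1, 2])
tracker.record_fetch(1, caught_up=True, now_ms=9_000)
isr_after_lag = set(tracker.update_isr(now_ms=12_000))       # follower 2 is kicked out

tracker.record_fetch(2, caught_up=True, now_ms=13_000)
isr_after_catch_up = set(tracker.update_isr(now_ms=13_000))  # follower 2 rejoins
print(isr_after_lag, isr_after_catch_up)
```

Note that membership depends on how long a follower has been behind, not on how many messages it is missing.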

 

Seventh, how a new leader replica is elected when the Broker hosting the leader replica goes down

1, if the ISR is not empty, elect from the ISR

2, if the ISR is empty, Kafka can also elect from surviving replicas that are not in the ISR. This process is called Unclean leader election, and the Broker-side parameter unclean.leader.election.enable controls whether it is allowed. Enabling Unclean leader election may cause data loss, but the benefit is that the partition always has a Leader replica and never stops serving, which improves availability. Conversely, disabling Unclean leader election preserves data consistency and avoids message loss, at the expense of availability.

A distributed system can usually satisfy only two of consistency (Consistency), availability (Availability), and partition tolerance (Partition tolerance). On this issue, Kafka clearly gives you the right to choose between C and A.

It is strongly recommended not to enable Unclean leader election; after all, availability can be improved in other ways. Sacrificing data consistency for this bit of extra availability is not worth it.
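The election rule and the trade-off controlled by unclean.leader.election.enable can be summarised in a few lines. This is a sketch of the decision logic only; the function `elect_leader` and its arguments are hypothetical, and "pick the first candidate" is an assumption made for simplicity.

```python
def elect_leader(isr, live_replicas, unclean_leader_election_enable):
    """Return the new leader's broker id, or None if the partition stays offline."""
    # 1. Prefer a surviving replica that is still in the ISR.
    live_in_isr = [r for r in isr if r in live_replicas]
    if live_in_isr:
        return live_in_isr[0]
    # 2. No live ISR member: fall back to an out-of-sync replica only if
    #    unclean election is allowed (availability over consistency).
    if unclean_leader_election_enable and live_replicas:
        return live_replicas[0]
    # 3. Otherwise keep the partition unavailable (consistency over availability).
    return None


clean = elect_leader(isr=[1, 2], live_replicas=[2, 3], unclean_leader_election_enable=False)
offline = elect_leader(isr=[1], live_replicas=[3], unclean_leader_election_enable=False)
unclean = elect_leader(isr=[1], live_replicas=[3], unclean_leader_election_enable=True)
print(clean, offline, unclean)
```

In the third call, broker 3 becomes leader even though it may be missing messages, which is exactly where data loss can occur.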

PS1: electing a leader replica can be understood as electing the partition leader.

PS2: broker leader election is different from partition leader election.

Kafka's broker leader election is implemented by creating an ephemeral /controller node in ZooKeeper; the winning broker writes its own information into that node:
{ "version": 1, "brokerid": 1, "timestamp": "1512018424988" }
Relying on ZooKeeper's strong consistency, a node can be created successfully by only one client, and the broker that creates it becomes the leader, on a first come, first served basis. This leader is the cluster controller, responsible for all cluster-level affairs, large and small.
When the leader loses its connection to ZooKeeper, the ephemeral node is deleted. The other brokers watch this node, so when it is removed they receive an event notification and start a new leader election.
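The first come, first served election on the /controller node can be sketched with an in-memory stand-in for ZooKeeper. Everything here is hypothetical scaffolding for illustration: `FakeZooKeeper` and `Broker` are not real client classes, sessions are reduced to an owner id, and watches are plain callbacks.

```python
class FakeZooKeeper:
    """In-memory stand-in: ephemeral nodes plus delete notifications."""

    def __init__(self):
        self.nodes = {}      # path -> (data, owner)
        self.watchers = []   # callbacks invoked with the deleted path

    def create_ephemeral(self, path, data, owner):
        if path in self.nodes:
            return False     # someone else got there first
        self.nodes[path] = (data, owner)
        return True

    def watch(self, callback):
        self.watchers.append(callback)

    def session_lost(self, owner):
        # Ephemeral nodes disappear together with their owner's session.
        removed = [p for p, (_, o) in self.nodes.items() if o == owner]
        for path in removed:
            del self.nodes[path]
        for path in removed:
            for callback in list(self.watchers):
                callback(path)


class Broker:
    def __init__(self, broker_id, zk):
        self.broker_id, self.zk = broker_id, zk
        self.alive, self.is_controller = True, False
        zk.watch(self.on_node_deleted)

    def try_become_controller(self):
        data = {"version": 1, "brokerid": self.broker_id}
        self.is_controller = self.zk.create_ephemeral("/controller", data, self.broker_id)
        return self.is_controller

    def on_node_deleted(self, path):
        if path == "/controller" and self.alive:
            self.try_become_controller()   # re-run the election

    def crash(self):
        self.alive, self.is_controller = False, False
        self.zk.session_lost(self.broker_id)


zk = FakeZooKeeper()
brokers = [Broker(i, zk) for i in (1, 2, 3)]
for b in brokers:
    b.try_become_controller()
first_controller = [b.broker_id for b in brokers if b.is_controller]

brokers[0].crash()   # controller loses its ZooKeeper session
second_controller = [b.broker_id for b in brokers if b.is_controller]
print(first_controller, second_controller)
```

Only one create call can succeed at a time, so there is never more than one controller in this model.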

 

Eighth, if Follower replicas were allowed to serve reads, how would you avoid or mitigate the data inconsistency caused by a Follower replica being out of sync with the Leader replica?

.......

 

 

 


Origin www.cnblogs.com/jetqiu/p/11681838.html