Starting from comment visibility in WeChat Moments: applying causal consistency in distributed business systems

Overview

I have recently been reading Designing Data-Intensive Applications (DDIA) in its Chinese translation, which on the whole reads well. Chapter 9, "Consistency and Consensus," reminded me of a talk given by the technical director of WeChat Moments at the 2015 ArchSummit Global Architect Summit (the shared material can be obtained from the following places). Read alongside DDIA's exposition of causal consistency, the talk shows how comments and replies on a Moments post are replicated between multiple replicas across data centers (IDCs), and how applying the theory of causal consistency neatly solves the problem of write conflicts. Let's look at how causal consistency is applied in real business development.

Understanding causal consistency

Let us first look at a simple question-and-answer example from DDIA and use it to define what a causal relationship is:

Figure 1: question-answer causality

As Figure 1 shows, the observer of the dialogue first sees the answer, "About ten seconds ...", and only afterwards sees the question, "How far into the ...". This is confusing because it violates our intuition about cause and effect: if a question has been answered, the question itself must already exist, since whoever answered must have seen it. We say there is a causal dependency between the question and the answer.

Suppose we are designing such a Q&A platform. When a user fetches data, for example by refreshing the latest list of questions and answers much like refreshing the Zhihu recommendation page, the user must see the question first and then the answer. Otherwise the user will be thoroughly confused, because an answer with no corresponding question has no practical meaning.

As DDIA puts it, causality imposes an ordering on events: the cause comes before the effect, and a message is sent before it is received. Just as in real life, one thing leads to another: one node reads some data and writes a result, another node reads what was written and in turn writes something else, and so on. These chains of causally dependent operations define the causal order in the system, that is, what happened before what. This leads to causal consistency in distributed systems: if a system obeys the ordering imposed by causality, we say it is causally consistent.
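As a minimal sketch of this definition, the check below assumes each event depends on at most one earlier event (an answer depends on its question) and verifies that an observed sequence never shows an effect before its cause. The Event class, the depends_on field, and the function name are hypothetical names chosen for illustration; they do not come from DDIA.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Event:
    event_id: str
    depends_on: Optional[str] = None  # hypothetical field: id of the causing event


def respects_causality(observed):
    """Return True if every event appears after the event it depends on."""
    seen = set()
    for event in observed:
        if event.depends_on is not None and event.depends_on not in seen:
            return False  # an effect was observed before its cause
        seen.add(event.event_id)
    return True


question = Event("q1")
answer = Event("a1", depends_on="q1")

print(respects_causality([question, answer]))  # True
print(respects_causality([answer, question]))  # False
```

The second call corresponds to the observer in Figure 1, who sees the answer before the question.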

Causal consistency in WeChat Moments

Now let's look at the causal relationship formed by comments and replies to comments on a WeChat Moments post, and at how WeChat guarantees causal consistency between data centers so that a user browsing Moments never sees a reply without also seeing the comment it responds to. To follow what comes next, you should first go through the material mentioned earlier: the talk given by Chen Ming, technical director of WeChat Moments, at the 2015 ArchSummit Global Architect Summit.

Figure 2: WeChat's four globally distributed data centers

Figure 2 shows WeChat's four globally distributed data centers. User Wang has two friends, Mary and Kate, who are in different regions (data centers). For them to see the content of Wang's Moments, the relevant data must first be replicated between data centers and synchronized to each user's own IDC.

Figure 3-1: out-of-order data synchronization between replicas

Figure 3-2: out-of-order data synchronization between replicas

As Figure 3-1 shows, network delays and interruptions while replicating data between replicas are common in distributed systems. As a result, the two messages arrived out of order when they were synchronized to the data center of user Kate (Canada). The original order was: "Mary: Where is this?" -> "Wang: Mary, this is Meili Snow Mountain". In Kate's database, however, the messages appear in this order: "Wang: Mary, this is Meili Snow Mountain" -> "Mary: Where is this?", or at some moment a query returns only "Wang: Mary, this is Meili Snow Mountain". Wouldn't Kate be baffled?

How does WeChat deal with this problem? The analysis follows.

Figure 4-1: sorting out the causal relationships

Figure 4-2: sorting out the causal relationships

From Figures 4-1 and 4-2, we can treat Mary's comment on Wang's Moments post, "Mary: Where is this?", as the cause, and Wang's reply to that comment, "Wang: Mary, this is Meili Snow Mountain", as the effect. Under this convention, even if the two messages arrive out of order at Kate's data center, the comments can be reordered by their causal relationship when Kate browses Moments, so the exchange reads in the correct order. So what method does WeChat use to make every region follow this convention? Specifically:
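The idea of reordering by cause and effect can be sketched as follows. This is only an illustration under the assumption that each reply carries a pointer to the comment it answers; it is not the scheme WeChat uses. The tuple layout, the ids "c1" and "c2", and the function name are hypothetical.

```python
def visible_in_causal_order(received):
    """received: list of (comment_id, reply_to, text) in arrival order.

    Returns the texts that can be displayed, each cause before its effect.
    A reply whose parent comment has not arrived yet is held back.
    """
    by_id = {cid: (reply_to, text) for cid, reply_to, text in received}
    shown, result = set(), []

    def show(cid):
        if cid in shown:
            return True
        if cid not in by_id:
            return False  # the cause has not been replicated here yet
        reply_to, text = by_id[cid]
        if reply_to is not None and not show(reply_to):
            return False  # hold the reply back until its parent is visible
        shown.add(cid)
        result.append(text)
        return True

    for cid, _, _ in received:
        show(cid)
    return result


mary = ("c1", None, "Mary: Where is this?")
wang = ("c2", "c1", "Wang: Mary, this is Meili Snow Mountain")

print(visible_in_causal_order([wang, mary]))  # Mary's comment first, then Wang's reply
print(visible_in_causal_order([wang]))        # [] because the reply is held back
```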

Figure 5: causal consistency algorithm for Moments events

As Figure 5 shows, WeChat uses the following scheme:

  • Each comment has a unique, increasing numeric ID, so that duplicates can be detected
  • The ID of every new comment must be greater than the largest global ID already seen locally, to guarantee causality
  • All locally seen comments and new comments are broadcast to the other IDCs; comments with the same ID are merged and deduplicated

We can make some reasonable guesses about the technology behind these three rules:

1. Each comment has a unique, increasing numeric ID: behind this there is presumably an ID generator with a single global entry point, and every data center must obtain its IDs through that entry point.

2. The ID of every new comment must be greater than the largest global ID already seen locally, to guarantee causality: in the figure above, the Hong Kong data center has published comment 2 and has received the comments with IDs 1, 4, and 7 synchronized from the Shanghai data center. If another user in the Hong Kong region now posts a new comment, its ID must be greater than the largest global ID the Hong Kong data center can currently see, which is 7. The newly published comment therefore gets an ID greater than 7 (the figure carries a "skip 5" note), and this is where ID 8 in the figure comes from.
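The ID rule can be sketched as below. The source does not describe how WeChat's generator is implemented, so GlobalIdGenerator and next_id are hypothetical, and the sketch runs in a single process rather than across data centers.

```python
class GlobalIdGenerator:
    """Hypothetical single entry point that hands out unique, increasing IDs."""

    def __init__(self):
        self._last_issued = 0

    def next_id(self, local_max_seen):
        # The new ID must exceed every ID issued so far (uniqueness) and
        # every ID the requesting data center has already seen (causality).
        self._last_issued = max(self._last_issued, local_max_seen) + 1
        return self._last_issued


generator = GlobalIdGenerator()

# Hong Kong has published comment 2 and received 1, 4, 7 from Shanghai.
hong_kong_seen = {1, 2, 4, 7}
new_id = generator.next_id(max(hong_kong_seen))
hong_kong_seen.add(new_id)

print(new_id)  # 8
```

Because a reply can only be written after the comment it answers has been seen locally, the reply always receives a larger ID, so sorting by ID never places an effect before its cause.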

3. All locally seen comments and new comments are broadcast to the other IDCs; comments with the same ID are merged and deduplicated: so when does the broadcast happen? When a user in a region comments on a Moments post, that region requests a global ID and then broadcasts the comment event to the other data centers. Note that this step merges every ID sequence the region can see. For example, the Hong Kong data center merges the comment events on the same Moments post into the ID sequence 1, 2, 4, 7, 8 and broadcasts the sequence as a whole, so that the latest events for the post always go out together. If the Hong Kong IDC broadcast only 8 and an earlier broadcast had been lost along the way, other nodes such as the Canada IDC would miss some of the comment events; sending the data more than once is the measure that fills such gaps. The approach rests on one premise: a single Moments post usually does not attract many comments, so the redundant data exchanged stays small. Otherwise it would not work. As for merging and deduplicating comments with the same ID, Figure 5 shows that the Canada IDC receives the event sequence 1, 4, 7 from the Shanghai IDC and also the sequence 1, 4, 7, 8 synchronized from the Hong Kong IDC. The two broadcasts overlap, so the duplicates must be removed.
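The merge-and-deduplicate step on the receiving side can be sketched as follows, using the sequences quoted from Figure 5. The function name, the dictionary layout (comment ID mapped to text), and the placeholder texts are assumptions made for illustration.

```python
def merge_broadcasts(local, *incoming):
    """Merge broadcast comment sets into the local view, one entry per ID."""
    merged = dict(local)
    for batch in incoming:
        for comment_id, text in batch.items():
            merged.setdefault(comment_id, text)  # an ID already present is a duplicate
    return dict(sorted(merged.items()))  # ascending ID order respects causality


from_shanghai = {1: "comment 1", 4: "comment 4", 7: "comment 7"}
from_hong_kong = {1: "comment 1", 4: "comment 4", 7: "comment 7", 8: "comment 8"}

canada = merge_broadcasts({}, from_shanghai, from_hong_kong)
print(list(canada))  # [1, 4, 7, 8]

# Even if the Shanghai broadcast is lost, the whole-sequence broadcast
# from Hong Kong still delivers comments 1, 4, and 7.
canada_after_loss = merge_broadcasts({}, from_hong_kong)
print(list(canada_after_loss))  # [1, 4, 7, 8]
```

Broadcasting the whole sequence trades some redundant traffic for tolerance of lost messages, which is acceptable only because a single post rarely has many comments.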

Summary

That is how causal consistency is applied in a real distributed business system.

Source: www.cnblogs.com/king0101/p/11908305.html