Distributed Transactions: Saga and Data Consistency

Distributed transaction implementation

  1. The two-phase commit protocol (2PC), the three-phase commit protocol (3PC), and the Paxos algorithm.
  2. The X/Open DTP model (1994) has four parts: the application program (AP), the transaction manager (TM), the resource manager (RM), and the communication resource manager (CRM). Typically the TM is transaction middleware, the RM is a database, and the CRM is messaging middleware. The two-phase and three-phase commit protocols are derived from this model.
  3. The Paxos algorithm.
  4. The Saga implementation.

Two-phase commit (2PC)

  1. There are two phases: the prepare phase (voting phase) and the commit phase (execution phase).
  2. In the prepare phase, the coordinator sends the transaction information to the participants, asks whether they can commit the transaction, and waits for their replies.
  3. In the commit phase, if any participant reports failure or does not reply, the coordinator sends a rollback request to all participants and they roll back. If every participant replied yes, the coordinator sends a commit request instead.
  4. Problems:
    1. Synchronous blocking. During execution, all participating nodes block on the transaction. While a participant holds a shared resource, any other node that needs that resource is blocked.
    2. Single point of failure. The coordinator is critical: if it fails, participants stay blocked. In particular, if the coordinator fails during the second phase, all participants keep their transaction resources locked and cannot complete the transaction. (A new coordinator can be elected if the coordinator goes down, but that does not unblock participants that are already blocked because of the outage.)
    3. Data inconsistency. In the second phase, after the coordinator starts sending commit requests, a local network fault or a coordinator crash partway through can leave only some participants with the commit request. Those participants commit, while the others never receive the request and cannot commit, so the distributed system ends up with inconsistent data.
    4. A problem 2PC cannot solve: the coordinator crashes right after sending a commit message, and the only participant that received it crashes too. Even if a new coordinator is elected, the state of the transaction is uncertain: nobody knows whether it was committed.
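
The two phases described above can be sketched as follows. This is a minimal in-memory illustration, not a production protocol: the `Participant` class, its `can_commit` flag, and the `two_phase_commit` function are hypothetical names introduced here, and the sketch ignores timeouts, logging, and coordinator failure (exactly the cases listed under "Problems").

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Participant:
    """Hypothetical in-memory participant; a real one would be a database or service."""
    name: str
    can_commit: bool = True
    state: str = "init"

    def prepare(self) -> bool:
        # Phase 1: vote. A real participant would lock resources and write a log here.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self) -> None:
        self.state = "committed"

    def rollback(self) -> None:
        self.state = "rolled_back"


def two_phase_commit(participants: list[Participant]) -> bool:
    """Return True if the transaction committed, False if it was rolled back."""
    # Phase 1 (prepare / voting): ask every participant whether it can commit.
    votes = [p.prepare() for p in participants]

    # Phase 2 (commit / execution): commit only if every vote was yes.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False
```

Calling `two_phase_commit([Participant("a"), Participant("b", can_commit=False)])` returns `False` and leaves both participants in the `rolled_back` state, because a single "no" vote aborts the whole transaction.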

Three-phase commit (3PC)

  1. 3PC splits the prepare phase of 2PC in two, giving three phases: CanCommit, PreCommit, and DoCommit.
  2. CanCommit: the coordinator sends a request containing the transaction, asks the participants whether they can commit it, and waits for their replies.
  3. PreCommit: if any participant returns an error or times out, the transaction goes to the third step as an abort. If all participants reply normally, it goes to the third step to commit.
  4. DoCommit: if the transaction is to be aborted, the coordinator sends an abort request to all participants so that they roll back. Otherwise it sends the commit request.
  5. Problems:
    1. Compared with 2PC, 3PC mainly addresses the single point of failure and reduces blocking: if a participant does not hear from the coordinator in time, it commits by default instead of holding transaction resources and staying blocked. However, this mechanism can itself cause data inconsistency. If, because of network problems, an abort sent by the coordinator does not reach a participant in time, that participant commits after its timeout, while the participants that did receive the abort roll back. The two groups then hold inconsistent data.
  • Tip: neither the two-phase nor the three-phase commit protocol can completely solve the distributed consistency problem.
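
The inconsistency described for 3PC can be reproduced with a small simulation. This is a sketch under stated assumptions: the function name `do_commit_phase`, the state strings, and the `abort_delivered_to` parameter are all hypothetical, and the network is modeled simply as "the abort message either arrives or it does not".

```python
from __future__ import annotations


def do_commit_phase(states: dict[str, str], abort_delivered_to: set[str]) -> dict[str, str]:
    """Simulate the DoCommit phase after the coordinator has decided to abort.

    states: participant name -> state, expected to be "pre_committed".
    abort_delivered_to: participants that actually receive the abort message.
    """
    result = {}
    for name, state in states.items():
        if state != "pre_committed":
            result[name] = state            # not part of this round; leave untouched
        elif name in abort_delivered_to:
            result[name] = "rolled_back"    # abort arrived in time
        else:
            result[name] = "committed"      # no message before timeout: commit by default
    return result


outcome = do_commit_phase(
    {"A": "pre_committed", "B": "pre_committed"},
    abort_delivered_to={"A"},
)
print(outcome)  # {'A': 'rolled_back', 'B': 'committed'}
```

Participant A rolls back and participant B commits, which is the split outcome that the timeout-then-commit rule makes possible.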

The complex Paxos algorithm - many articles online already cover Paxos in depth, so only a brief introduction is given here

  1. Paxos is about getting nodes to eventually agree on the same value. It has two roles: proposer and acceptor.
  2. For example, suppose a cluster has multiple nodes. How does Paxos get these nodes to agree on their data? The approach has two main steps:
  3. The first phase is an election; no proposal content is put forward yet. In this phase, the acceptors (multiple nodes) elect a leader from among the proposers to make the proposal. A sequence number identifies who is better qualified to be elected: the higher the sequence number, the higher the eligibility.
  4. The second phase builds on the result of the first: the acceptors accept the proposal of the elected node, and the content of the proposal is made explicit.
  • The sequence number is very important here: in either phase, a proposer with a smaller sequence number is rejected. In the first phase, if an acceptor has already accepted a proposal from a previously elected proposer, it reports that proposal to the new proposer. The new proposer, even though it won the first phase, must then replace its own proposal with that earlier one when it moves on to the second phase, so that the nodes' views converge. If the acceptors have not previously accepted any proposal, the newly elected proposer can put forward its own.
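
The two phases and the role of the sequence number can be sketched for a single value as follows. This is a simplified single-process illustration, assuming no message loss and no concurrent proposers; the names `Acceptor`, `prepare`, `accept`, and `propose` are chosen here for illustration and do not come from the referenced articles.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class Acceptor:
    promised: int = -1                    # highest sequence number promised so far
    accepted_seq: int = -1                # sequence number of the accepted proposal, if any
    accepted_value: Optional[str] = None

    def prepare(self, seq: int):
        """Phase 1: promise to ignore smaller numbers and report any accepted proposal."""
        if seq <= self.promised:
            return None                   # rejected: sequence number too small
        self.promised = seq
        return (self.accepted_seq, self.accepted_value)

    def accept(self, seq: int, value: str) -> bool:
        """Phase 2: accept unless a higher sequence number has been promised since."""
        if seq < self.promised:
            return False
        self.promised = seq
        self.accepted_seq, self.accepted_value = seq, value
        return True


def propose(acceptors: list[Acceptor], seq: int, value: str) -> Optional[str]:
    """Run both phases for one proposer; return the chosen value, or None if rejected."""
    majority = len(acceptors) // 2 + 1

    # Phase 1: collect promises from the acceptors.
    promises = [r for r in (a.prepare(seq) for a in acceptors) if r is not None]
    if len(promises) < majority:
        return None

    # If any acceptor already accepted a proposal, adopt the one with the highest number.
    _, prev_value = max(promises, key=lambda p: p[0])
    if prev_value is not None:
        value = prev_value

    # Phase 2: ask the acceptors to accept the (possibly replaced) value.
    accepted = sum(a.accept(seq, value) for a in acceptors)
    return value if accepted >= majority else None
```

With three fresh acceptors, `propose(acceptors, 1, "x")` returns `"x"`. A later `propose(acceptors, 2, "y")` also returns `"x"`, because the new proposer is forced to adopt the previously accepted proposal, and a proposer that retries with sequence number 1 is rejected.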

References: the first article explains the principles very thoroughly; the second, from Zhihu, uses examples that are very easy to follow.

  1. The Paxos algorithm in distributed systems
  2. How to explain the Paxos algorithm in plain terms?

Saga (unfinished)

Requirements and Limitations
  1. Because network requests in a distributed system may be delayed, the services a Saga calls must be idempotent: calling them several times must produce the same result as calling them once.
  2. Saga guarantees only ACD (atomicity, consistency, durability); it does not guarantee isolation.
    1. Operating on the same resource at the same time can produce semantically inconsistent data.
    2. Concurrent operations can overwrite each other's results.
    3. Dirty read: while one saga is in the middle of modifying some data, another saga reads it and gets a value that is not final, leading to a wrong read.
    4. Non-repeatable read: while one saga is reading data, another saga modifies it, so two reads return inconsistent results.
    5. Solution: at the business level, add session and lock mechanisms so that operations on a resource can be serialized. Alternatively, isolate that part of the resources at the business level, and have business operations read the current state promptly so that they see the latest updates.
  3. When a Saga spans multiple transactions and one of them fails, how can we guarantee an effective recovery?
  4. Forward recovery: the Saga transaction manager checks whether each transaction succeeded, and failed transactions are retried either automatically (up to a configured number of retries) or manually.
  5. Backward recovery: compensating events are added manually, so that each microservice has a compensation mechanism to run after a failure.
  6. Choreographed Saga transactions: to be added
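
Forward recovery (retry) and backward recovery (compensation) from the list above can be combined in one small runner. This is a minimal sketch, assuming each step is a plain callable and that actions are idempotent so retries are safe; `SagaStep`, `run_saga`, and `max_retries` are hypothetical names, not part of any particular Saga framework.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SagaStep:
    name: str
    action: Callable[[], None]        # local transaction; assumed to be idempotent
    compensate: Callable[[], None]    # undoes the effect of the action


def run_saga(steps: List[SagaStep], max_retries: int = 2) -> bool:
    """Run steps in order. Return True if all succeeded, False if the saga was compensated."""
    done: List[SagaStep] = []
    for step in steps:
        for attempt in range(max_retries + 1):
            try:
                step.action()         # forward recovery: a failed action is retried
                done.append(step)
                break
            except Exception:
                if attempt == max_retries:
                    # Backward recovery: compensate completed steps in reverse order.
                    for finished in reversed(done):
                        finished.compensate()
                    return False
    return True
```

If the second of two steps keeps failing, the runner gives up after `max_retries` retries and calls `compensate` on the first step, so the overall effect is undone even though no global lock was held. Because intermediate results are visible to others before compensation runs, this still provides no isolation, which is the limitation noted above.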

Choerodon's Saga solution - with Kafka

  1. Choerodon's Saga uses an orchestrator as the transaction manager. When a service starts, it registers all of its SagaTasks with the manager. When a Saga instance starts, the consuming services actively pull the data for that instance by polling and execute the corresponding business logic. The execution state can be viewed in the transaction manager's UI.
  2. Idempotence is achieved by recording executions: each time a SagaTask runs, the execution is recorded against the instance, and if the same instance comes up again no new execution is generated.
  3. Data isolation is reduced to transaction management inside a single service. For example, when services A and B both need to change data in service C at the same moment, two Saga data records are generated. C pulls these records and performs the database operations itself, so isolation is handled by the database.
  4. Choerodon currently implements the forward recovery strategy. If better strategies are developed later, they will be shared promptly.
  5. Choreographed transactions: to be added
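
The "record each execution" approach to idempotence can be illustrated as follows. This is only a sketch of the idea and not Choerodon's actual API: the class `SagaTaskRunner`, its `run` method, and the `(instance_id, task_code)` key are assumptions made for illustration, and a real implementation would persist the record in a database instead of an in-memory set.

```python
from __future__ import annotations

from typing import Callable


class SagaTaskRunner:
    """Hypothetical consumer-side runner; not Choerodon's actual API."""

    def __init__(self) -> None:
        # Recorded executions, keyed by (saga instance id, task code).
        self._executed: set[tuple[str, str]] = set()

    def run(self, instance_id: str, task_code: str, handler: Callable[[], None]) -> bool:
        """Run handler at most once per (instance, task). Return False if it was skipped."""
        key = (instance_id, task_code)
        if key in self._executed:
            return False              # already recorded: do not execute again
        handler()
        self._executed.add(key)       # record only after the handler succeeded
        return True
```

A second `run("i1", "create-user", handler)` call for the same instance returns `False` without invoking the handler, so a task that is pulled twice by polling has its effect applied only once.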
Reference:

The Choerodon platform's solution for microservice data consistency: Dong Fan

Origin juejin.im/post/5d6a066af265da03b31be515