Distributed Transactions - Two/Three-Phase Commit

  The two-/three-phase commit protocols address the consistency and atomicity of operations on distributed data. When one logical operation spans multiple nodes (for example, decrementing a value on node A while incrementing a value on node B), the updates on the different nodes must succeed or fail as a single atomic unit.

  Solution: introduce a TC (Transaction Coordinator) above the distributed nodes. The TC coordinates the final commit: only if the preparatory operation (precommit) succeeds on every node does it issue a commit; otherwise it issues a rollback.

  In two-phase commit, the first phase is precommit: the TC sends a prepare message to each node; each node executes the operation (but does not commit), writes its redo log and undo log, and then informs the TC that it is ready.

  In the second phase, the TC issues the commit instruction after receiving ACK replies from all nodes; if some node fails to respond or replies that it cannot prepare, the TC issues an abort instruction instead. On receiving commit, each node commits its data; on receiving abort, each node rolls back by replaying its undo log.
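The two phases above can be sketched in code. This is a minimal illustration under assumptions, not a production protocol: the Node class, its prepare/commit/rollback methods, and the in-memory logs are all names invented for this example.

```python
# Minimal sketch of two-phase commit (hypothetical Node class and coordinator).

class Node:
    def __init__(self, name):
        self.name = name
        self.data = {}      # committed state
        self.pending = {}   # prepared-but-uncommitted changes
        self.redo_log = []  # what to re-apply after a crash
        self.undo_log = []  # previous values, for rollback

    def prepare(self, key, value):
        """Phase 1: execute without committing, write the logs, then ACK."""
        self.undo_log.append((key, self.data.get(key)))
        self.redo_log.append((key, value))
        self.pending[key] = value
        return True  # ACK means "I am able to commit successfully"

    def commit(self):
        """Phase 2, commit branch: make the pending changes effective."""
        self.data.update(self.pending)
        self.pending.clear()

    def rollback(self):
        """Phase 2, abort branch: replay the undo log to restore old state."""
        for key, old in reversed(self.undo_log):
            if old is None:
                self.data.pop(key, None)
            else:
                self.data[key] = old
        self.pending.clear()


def two_phase_commit(nodes, key, value):
    """The TC's logic: prepare everywhere, then commit only on unanimous ACK."""
    acks = [n.prepare(key, value) for n in nodes]   # phase 1
    if all(acks):                                   # phase 2
        for n in nodes:
            n.commit()
        return "committed"
    for n in nodes:
        n.rollback()
    return "aborted"
```

For example, `two_phase_commit([Node("A"), Node("B")], "x", 1)` commits on both nodes only because both ACKed the prepare.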

  Note that if the commit in the second phase fails, for example because the machine goes down, the node will continue the commit after the machine restarts; in short, a node's ACK in the first phase is a promise that it is able to commit successfully. If the commit still fails, manual intervention is required, as a well-known observation puts it:

no distributed asynchronous protocol can correctly agree in presence of crash-failures

  That is, two-phase commit does not handle failures in the commit phase caused by crashes and the like. In addition, two-phase commit has another weakness: if the TC itself goes down, all the nodes are left waiting for the TC's final instruction while the related resources, such as table locks, remain held. To avoid this, three-phase commit was proposed; its basic idea is as follows:

  The first phase, can-commit: the TC simply asks each node whether it can commit; in essence this checks whether the node, or its service, is healthy. A node that considers itself healthy replies with an ACK.

  The second phase, pre-commit: once the TC sees that every node is healthy, it issues the precommit instruction; each node, on receiving precommit, executes the operation and writes its undo and redo logs. Of course, if the TC finds a problematic node during the can-commit phase, it can abort without even notifying every node, because nodes do nothing during can-commit and so there is nothing to undo. The can-commit phase thus reduces, to some degree, the failure scenarios in which some nodes have already executed statements and written redo/undo logs while another node is unable to proceed.

  The third phase, do-commit: once the TC sees that every node's precommit is ready, it issues the commit instruction. Key point: if a node does not receive the commit command within the specified time, it commits automatically. It chooses to commit rather than roll back because, having passed the first two phases, the transaction will with high probability commit successfully, so for data consistency the timeout action is to commit. If the TC finds that some node erred in the precommit phase, or did not reply, it sends an abort instruction, and each node rolls back by applying the contents of its undo log.

  So the essential difference between three-phase and two-phase commit is that the commit phase gains a timeout-based commit mechanism, which avoids resources being held and unable to be released when the TC itself fails or the network fails.

Redo / undo logs

  To understand redo/undo logs, one first needs to understand how data operations work. In a traditional relational database, a transaction on the data proceeds roughly as follows:

  1) The client notifies the database server to lock the (table) resources;

  2) The client modifies the records (for example, editing them directly through a database client such as HeidiSQL);

  3) On commit, the client's modifications are sent to the server side; the server writes the data to the data files and releases the resources once processing completes (the data is flushed to disk).

  For many big-data databases, such as those based on an LSM tree:

  1) Data submitted by the client is first placed in memory (where it is easily lost);

  2) Once certain conditions are met, the in-memory data is flushed to the hard disk, making it durable.

  Either way, we can see that data not yet flushed to disk is at risk: once the client or the memory fails, for example due to a crash, the data is lost. To keep data recoverable even after such a loss, a redo-log mechanism is set up: every piece of data not yet flushed to disk is first recorded in the redo log, and on restart each service first reads the redo log and reloads the unflushed data, ensuring that no data is lost.
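The recover-from-redo-log idea above can be sketched like this; the file format and the function names are invented purely for illustration.

```python
import json
import os
import tempfile

def append_redo(log_path, key, value):
    """Record an unflushed write in the redo log before acknowledging it."""
    with open(log_path, "a") as f:
        f.write(json.dumps({"key": key, "value": value}) + "\n")

def recover(log_path):
    """On restart, replay the redo log to rebuild data that never reached disk."""
    data = {}
    if os.path.exists(log_path):
        with open(log_path) as f:
            for line in f:
                entry = json.loads(line)
                data[entry["key"]] = entry["value"]
    return data
```

Replaying the log in order means the last recorded value for each key wins, which is exactly the state the service held in memory before the crash.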

  In distributed transactions, to guarantee that data operated on in the precommit phase (but not yet flushed to disk) can still be restored after an unexpected service restart, the operations are likewise written to the redo file.

  As for the undo file, it is easier to understand: it records the state of the data before modification. When a rollback occurs, the in-memory values on each node are rolled back directly to the state before the change.
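Undo can be sketched the same way: record the prior value before modifying, then restore it on rollback. The helper names here are assumptions for the sake of the example.

```python
def apply_with_undo(data, undo_log, key, new_value):
    """Record the old value in the undo log, then apply the change in memory."""
    undo_log.append((key, data.get(key)))
    data[key] = new_value

def rollback(data, undo_log):
    """Restore the state before modification by replaying the undo log in reverse."""
    while undo_log:
        key, old = undo_log.pop()
        if old is None:
            data.pop(key, None)   # the key did not exist before
        else:
            data[key] = old
```

Replaying in reverse order matters: if the same key was changed twice, the oldest recorded value must be restored last.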

 

References:

https://blog.csdn.net/lengxiao1993/article/details/88290514
https://my.oschina.net/wangzhenchao/blog/736909
https://www.hollischuang.com/archives/681
https://www.letiantian.me/2014-06-18-db-undo-redo-checkpoint/

 


Origin: www.cnblogs.com/xiashiwendao/p/12103754.html