Distributed Transaction - 2PC

1. ACID is the basic standard for transactions. Its ultimate goals are consistency and durability; atomicity and isolation are the means of achieving those two. The basic principle of atomicity is that all operations on the data either take effect together or do not take effect at all. Isolation guarantees, at a minimum, that operations on the same set of data are mutually exclusive or serialized: for the same data A, transaction M cannot operate on A until transaction N has completed, so M either fails or waits for N to finish.

2. To provide atomicity in a distributed scenario, we have to build on top of local transactions. In other words, atomicity of a distributed transaction = atomicity within each process + atomicity across processes. Atomicity within a process is not covered here; see the Redo and Undo logging used by databases. Atomicity across processes means that either every process succeeds or every process fails. Isolation means guaranteeing mutual exclusion or serialization on a data partition. The simplest cross-process implementation is therefore for every machine to execute the transaction: if any one of them fails, all of them roll back; if all of them succeed, the transaction commits and ends. This is the classic 2PC protocol.

3. The 2PC protocol itself is very simple, but the machine that initiates the vote (the coordinator) must not be a single point of failure, so it needs a standby machine as a backup.

4. Once a network is involved, however, there is a problem: the network is unreliable. When a machine becomes unreachable, you cannot tell whether the machine is down or the network path is simply broken. Each machine must therefore be able to recover by itself, which means it must record how far its execution has progressed. Once it is reachable again (whether after crash recovery or network recovery), the machine first enters recovery mode, determines its own state from what it recorded, then synchronizes data with the authoritative host (the master), and only then rejoins the cluster and resumes serving external requests. This is how the most primitive distributed consensus protocol is derived, and it is very similar to ZooKeeper's ZAB protocol.
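
The "record where I am executing" idea can be sketched as an append-only state log that a node replays when it comes back. This is only an illustration under assumptions: the `StateLog` class, the JSON-lines format, and the state names are hypothetical and are not taken from any particular database or from ZAB.

```python
import json
import os


class StateLog:
    """Append-only log of (transaction id, state) records. Hypothetical format."""

    def __init__(self, path):
        self.path = path

    def record(self, txn_id, state):
        # Flush and fsync so the record is on disk before we move on.
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps({"txn": txn_id, "state": state}) + "\n")
            f.flush()
            os.fsync(f.fileno())

    def recover(self):
        """Replay the log and return the last recorded state of each transaction."""
        last = {}
        if not os.path.exists(self.path):
            return last
        with open(self.path, encoding="utf-8") as f:
            for line in f:
                entry = json.loads(line)
                last[entry["txn"]] = entry["state"]
        return last
```

After a restart, a node would call `recover()` first and then ask the master about any transaction whose last recorded state is not final; that second step is not shown here.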



5. 2PC, or two-phase commit, is a common atomic commit protocol for distributed transaction processing (it is often loosely described as a consensus algorithm). It makes it straightforward to coordinate all participants and to decide uniformly whether a transaction commits or rolls back, thereby keeping distributed data consistent.

6. As the name suggests, two-phase commit divides the commit of a transaction into two phases.
Phase 1: commit request (voting) phase
The coordinator asks every participant to pre-execute the transaction -> each participant pre-executes it locally (doing everything short of the final commit, such as recording Undo and Redo information in the transaction log) -> each participant replies to the coordinator with ACK or NO, or fails to reply before the timeout
Phase 2: commit execution phase
When all participants return ACK in phase 1, the transaction is committed:
the coordinator sends the commit request -> each participant commits its local transaction -> each participant returns ACK to the coordinator -> the transaction completes
When any participant returns NO or times out in phase 1, the transaction is rolled back:
the coordinator sends the rollback request -> each participant rolls back locally -> each participant returns ACK to the coordinator -> the transaction completes
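
The two phases above can be sketched in a few lines. This is a minimal in-memory illustration, not a production implementation: the `Participant` class, the `Vote` enum, and the state names are assumptions made for the example, and network calls, logging, and coordinator failure are left out.

```python
from enum import Enum


class Vote(Enum):
    ACK = "ACK"
    NO = "NO"
    TIMEOUT = "TIMEOUT"


class Participant:
    """Hypothetical in-memory participant; a real one would write Undo/Redo logs."""

    def __init__(self, name, vote=Vote.ACK):
        self.name = name
        self.vote = vote      # the answer this participant gives in phase 1
        self.state = "INIT"

    def prepare(self):
        # Phase 1: pre-execute locally and report the outcome.
        if self.vote is Vote.ACK:
            self.state = "PREPARED"
        return self.vote

    def commit(self):
        self.state = "COMMITTED"
        return Vote.ACK

    def rollback(self):
        self.state = "ROLLED_BACK"
        return Vote.ACK


def two_phase_commit(participants):
    """Return "COMMITTED" if every participant voted ACK, else "ROLLED_BACK"."""
    # Phase 1: collect a vote from every participant.
    votes = [p.prepare() for p in participants]

    # Phase 2: commit only on a unanimous ACK; any NO or TIMEOUT forces rollback.
    if all(v is Vote.ACK for v in votes):
        for p in participants:
            p.commit()
        return "COMMITTED"
    for p in participants:
        p.rollback()
    return "ROLLED_BACK"
```

For example, `two_phase_commit([Participant("a"), Participant("b", Vote.NO)])` rolls back both participants, because a single NO vote is enough to abort the transaction.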

7. Advantages and disadvantages of 2PC
Advantages: simple and clear, easy to implement.
Disadvantages:
Synchronous blocking:
One of the biggest problems with 2PC is synchronous blocking. A transaction takes two phases to complete, and all participants are blocked throughout, so most of their time is spent waiting for other participants to respond rather than doing actual computation.
Single point of failure:
The problem lies in the role of the coordinator. If the coordinator fails before it can send the commit or rollback command, or after sending it to only some of the participants, the whole 2PC process stalls, some participants may stay locked indefinitely, and the overall data may become inconsistent.
Too conservative:
When any participant fails or crashes, for its own reasons or otherwise:
1. The coordinator can only wait for the timeout, so the failure is not detected promptly.
2. The entire transaction fails. This runs against the original intent of distributed design, which is to combine high speed with good horizontal scalability; this approach to fault tolerance is very conservative.
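
The "too conservative" behaviour can be illustrated with a coordinator that waits for votes under a deadline. This is a sketch under assumptions: participants are modelled as plain Python callables returning `True` for ACK, and `collect_votes` and `decide` are hypothetical helper names. A single slow participant is enough to make the coordinator wait out the timeout and abort the whole transaction.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def collect_votes(prepare_fns, timeout_s):
    """Run every prepare function in parallel; treat a late reply as a NO vote."""
    pool = ThreadPoolExecutor(max_workers=len(prepare_fns))
    futures = [pool.submit(fn) for fn in prepare_fns]
    deadline = time.monotonic() + timeout_s
    votes = []
    for fut in futures:
        remaining = max(0.0, deadline - time.monotonic())
        try:
            votes.append(bool(fut.result(timeout=remaining)))
        except FutureTimeout:
            votes.append(False)   # no reply before the deadline: count it as NO
    pool.shutdown(wait=False)     # do not block the coordinator on stragglers
    return votes


def decide(votes):
    # One missing or negative vote aborts the transaction for everyone.
    return "COMMIT" if all(votes) else "ROLLBACK"
```

Note that the coordinator learns nothing before the deadline expires: it cannot distinguish a crashed participant from a slow one, which is the timeliness problem described in point 1.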
  

