Transaction Processing in Distributed Systems: Evolution and Challenges of 2PC and 3PC

In traditional monolithic applications, transaction management is relatively simple: data consistency can be achieved with ordinary database transactions. Once a system is split into services and moves to a distributed architecture, however, a single business operation may span several services and databases, and a local transaction is no longer enough. This is where distributed transactions come in; they are the key to ensuring data consistency in distributed systems.

1. The XA protocol

In a distributed system, data consistency must be ensured when a transaction spans multiple databases or resource managers (RMs). The XA protocol is a widely used distributed transaction standard that defines how a transaction manager coordinates these resource managers so that a distributed transaction executes correctly. XA itself is built on two-phase commit; three-phase commit is a later refinement intended to address some of its weaknesses. This article covers the principles and characteristics of both commit strategies and compares them.

2. Two-phase commit protocol (2PC)

The two-phase commit protocol is used to ensure the atomicity of distributed transactions: either all participating nodes commit the transaction, or none of them do. Execution is divided into two phases:

The first is the prepare phase; the second is the commit phase.

1) Prepare phase

The coordinator sends a prepare message to each participant. Each participant executes its local transaction without committing it, keeps the affected resources locked, and votes by returning either yes or no (abort).

2) Commit phase

The coordinator decides based on the votes collected in the prepare phase. If and only if every participant voted yes, the coordinator tells all participants to commit the transaction; otherwise it tells all participants to roll it back.
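
The two phases above can be sketched in a few lines of Python. This is a minimal in-memory illustration, not an XA implementation: the `Participant` class, its `can_commit` flag, and the `two_phase_commit` function are hypothetical names invented for this example, and network calls, logging, and failure recovery are left out.

```python
from enum import Enum


class Vote(Enum):
    YES = "yes"
    NO = "no"


class Participant:
    """Hypothetical in-memory participant; stands in for a resource manager."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # assumption: simulates whether local work succeeds
        self.state = "init"

    def prepare(self):
        # Phase 1: do the local work and hold locks, but do not commit yet.
        if self.can_commit:
            self.state = "prepared"
            return Vote.YES
        self.state = "aborted"
        return Vote.NO

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"


def two_phase_commit(participants):
    # Phase 1 (prepare): collect a vote from every participant.
    votes = [p.prepare() for p in participants]

    # Phase 2 (commit): commit only on a unanimous yes, otherwise roll back everyone.
    if all(v is Vote.YES for v in votes):
        for p in participants:
            p.commit()
        return "committed"
    for p in participants:
        p.rollback()
    return "aborted"


print(two_phase_commit([Participant("orders"), Participant("stock")]))
# committed
print(two_phase_commit([Participant("orders"), Participant("stock", can_commit=False)]))
# aborted
```

Note that between `prepare` and the coordinator's decision a participant sits in the `prepared` state with its locks held; this is the blocking window discussed in the comparison later in the article.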

3. Three-phase commit protocol (3PC)

The three-phase commit protocol adds a timeout mechanism to both the coordinator and the participants, and splits the first phase of two-phase commit into two steps. The overall flow becomes: ask whether the participants can commit, then lock resources, and finally commit.

Three Stages of Execution

  1. CanCommit phase: the coordinator sends a CanCommit request to each participant. A participant that is able to commit returns a Yes response; otherwise it returns a No response.
  2. PreCommit phase: the coordinator decides, based on the participants' responses, whether to go ahead with the PreCommit operation or abort the transaction.
  3. DoCommit phase: the coordinator decides, based on each participant's feedback from the PreCommit phase, whether to actually commit the transaction or abort it.
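
The coordinator side of these three phases can be sketched as follows. This is a simplified illustration under stated assumptions: the `Participant` class, the `handle_*` method names, and the `acks_pre_commit` flag are hypothetical, everything runs in one process, and a missing acknowledgement stands in for a real network timeout.

```python
class Participant:
    """Hypothetical in-memory participant; not a real resource manager."""

    def __init__(self, name, can_commit=True, acks_pre_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.acks_pre_commit = acks_pre_commit  # False simulates a lost or late ack
        self.state = "init"

    def handle_can_commit(self):
        # CanCommit: answer the query only; nothing is locked yet.
        return self.can_commit

    def handle_pre_commit(self):
        # PreCommit: lock resources and do the local work without committing.
        if not self.acks_pre_commit:
            return False
        self.state = "precommitted"
        return True

    def handle_do_commit(self):
        self.state = "committed"

    def handle_abort(self):
        self.state = "aborted"


def abort_all(participants):
    for p in participants:
        p.handle_abort()
    return "aborted"


def three_phase_commit(participants):
    # Phase 1 (CanCommit): every participant must answer Yes.
    if not all([p.handle_can_commit() for p in participants]):
        return abort_all(participants)

    # Phase 2 (PreCommit): every participant must acknowledge in time.
    if not all([p.handle_pre_commit() for p in participants]):
        return abort_all(participants)

    # Phase 3 (DoCommit): all acks received, so commit everywhere.
    for p in participants:
        p.handle_do_commit()
    return "committed"


print(three_phase_commit([Participant("orders"), Participant("stock")]))
# committed
print(three_phase_commit([Participant("orders"), Participant("stock", acks_pre_commit=False)]))
# aborted
```

The extra CanCommit round lets a transaction that cannot succeed be rejected before any participant locks resources, at the cost of one more message exchange.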

4. Comparison between 2PC and 3PC

Aspect         2PC                                          3PC
Timeout        Only the coordinator can time out            Both coordinator and participants can time out
Blocking       Participants hold locks from the prepare     CanCommit takes no locks; resources are locked
               phase until the final decision               only after PreCommit is reached
Consistency    Inconsistency can occur on node failure      Inconsistency can occur on node failure

In distributed systems, 2PC (two-phase commit) and 3PC (three-phase commit) are two common transaction protocols used to ensure the consistency of distributed transactions. Here is a comparison between them:

  1. Number of phases: 2PC has two phases, prepare and commit; 3PC has three phases, CanCommit, PreCommit, and DoCommit.
  2. Timeout mechanism: in 2PC only the coordinator can time out. A participant that has voted yes has no timeout of its own and waits until it receives the coordinator's decision. 3PC adds timeouts on both sides: the coordinator aborts the transaction if responses do not arrive within the specified time, and a participant that has already received PreCommit commits by default if it hears nothing further.
  3. Blocking: both 2PC and 3PC can block, but 3PC shortens the blocking window because resources are not locked during CanCommit and participants do not wait indefinitely.
  4. Consistency: neither 2PC nor 3PC completely solves the distributed data consistency problem, and full ACID guarantees cannot be assured when nodes fail. Even so, the ideas behind them are widely used in practical architectures.
  5. Availability: because 3PC separates pre-commit from the final commit and lets participants act on timeouts, it copes better than 2PC with failures such as a coordinator crash, which improves availability. A network partition can still lead to inconsistent outcomes.
  6. Implementation complexity: 3PC adds an extra round of messages and timeout handling on both sides, so it is more complex to implement than 2PC and adds latency.
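
The timeout and blocking differences listed above can be summarised as a small lookup table of what a participant does when the coordinator goes silent. This is a sketch of the commonly described default behaviour, not a specification: the state names (`init`, `prepared`, `ready`, `precommitted`) and the function `on_coordinator_timeout` are assumptions chosen for this example.

```python
# (protocol, participant state) -> action when no message arrives from the coordinator
TIMEOUT_ACTION = {
    ("2PC", "init"): "abort",          # has not voted yet, so it can give up safely
    ("2PC", "prepared"): "block",      # voted yes; must keep locks and wait for the decision
    ("3PC", "init"): "abort",          # has not answered CanCommit yet
    ("3PC", "ready"): "abort",         # answered Yes to CanCommit, PreCommit never arrived
    ("3PC", "precommitted"): "commit", # PreCommit seen, so everyone voted Yes: commit by default
}


def on_coordinator_timeout(protocol, state):
    """Return the default action for a participant whose coordinator is silent."""
    return TIMEOUT_ACTION[(protocol, state)]


print(on_coordinator_timeout("2PC", "prepared"))      # block
print(on_coordinator_timeout("3PC", "precommitted"))  # commit
```

The last row also shows why 3PC can still end up inconsistent: if the coordinator decided to abort but the abort message is lost, a participant in the `precommitted` state commits on timeout while others roll back.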

Origin blog.csdn.net/citywu123/article/details/131372805