Distributed Transactions: The Saga Pattern

1 Saga concepts

In 1987, Hector Garcia-Molina and Kenneth Salem of Princeton University published the paper "Sagas", which describes how to handle long-lived transactions. A saga is a long-lived transaction that can be decomposed into a collection of sub-transactions that may be interleaved with other transactions. Each sub-transaction is a real transaction that preserves database consistency.
Paper: Sagas

1.1 Composition of a Saga

  • Each saga consists of a series of sub-transactions T_i
  • Each T_i has a corresponding compensating action C_i, which undoes the effects of T_i

As this shows, Saga, unlike TCC, has no "reserve" (Try) action: each T_i is committed directly to the database.

A saga executes in one of two orders:

  • T_1, T_2, T_3, ..., T_n
  • T_1, T_2, ..., T_j, C_j, ..., C_2, C_1, where 0 < j < n

Saga defines two recovery strategies:

  • Backward recovery: if any sub-transaction fails, compensate all completed sub-transactions. This is the second execution order above, where T_j is the sub-transaction that failed. The effect is to undo every sub-transaction that had succeeded, so the result of the whole saga is revoked.
  • Forward recovery: retry the failed sub-transaction, on the assumption that every sub-transaction will eventually succeed. It suits scenarios that must succeed. The execution order looks like T_1, T_2, ..., T_j (fails), T_j (retry), ..., T_n, where T_j is the sub-transaction that failed. No C_i is needed in this case.

Obviously, forward recovery does not require compensating transactions. If the sub-transactions in your business will (eventually) always succeed, or if compensating transactions are difficult or impossible to define, forward recovery fits your needs better.

In theory, compensating transactions never fail. In a distributed world, however, a server may go down, the network may fail, and even a data center may lose power. What can we do then? The last resort is to provide a fallback, such as manual intervention.
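The two recovery strategies above can be sketched in a few lines of Python. This is only an illustration: the `Step` tuple, the function names, and the `trace` list are assumptions made for the example, not part of the original paper. Following the execution order given above, the backward variant also calls C_j for the failed T_j, which relies on C_j being safe to run when T_j never took effect.

```python
from typing import Callable, List, Tuple

# Hypothetical shape of one saga step: (name, T_i, C_i).
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_with_backward_recovery(steps: List[Step], trace: List[str]) -> bool:
    """Run T_1..T_n; if T_j fails, run C_j..C_1 and return False."""
    attempted: List[Step] = []
    for step in steps:
        name, transaction, _ = step
        attempted.append(step)
        try:
            transaction()
            trace.append(f"T {name}")
        except Exception:
            trace.append(f"T {name} failed")
            # Compensate in reverse order, starting with the failed step itself.
            for done_name, _, compensate in reversed(attempted):
                compensate()
                trace.append(f"C {done_name}")
            return False
    return True


def run_with_forward_recovery(
    steps: List[Step], trace: List[str], max_attempts: int = 3
) -> None:
    """Retry each failing T_j until it succeeds; no C_i is ever called."""
    for name, transaction, _ in steps:
        for attempt in range(1, max_attempts + 1):
            try:
                transaction()
                trace.append(f"T {name}")
                break
            except Exception:
                trace.append(f"T {name} failed (attempt {attempt})")
        else:
            # Retries exhausted: fall back to manual intervention.
            raise RuntimeError(f"{name} is still failing; manual intervention needed")
```

The retry limit in the forward variant is an arbitrary choice for the sketch; it marks the point where the fallback mentioned above, manual intervention, would take over.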

1.2 Conditions for using Saga

Saga looks promising for our needs. Can every long-lived transaction be handled this way? There are a few restrictions:

  1. A saga allows only two levels of nesting: the top-level saga and simple sub-transactions.
  2. At the outer level, full atomicity cannot be satisfied. In other words, a saga may see partial results of other sagas.
  3. Each sub-transaction should be an independent atomic action.
  4. In our business scenario, each business step (such as flight booking, car rental, hotel reservation, and payment) is naturally an independent action, and each one can rely on the database of its own service to guarantee atomicity.

Compensation also needs some thought:

  • A compensating transaction undoes the actions of transaction T_i from a semantic point of view, but it does not necessarily return the database to the state it was in when T_i executed. (For example, if the transaction triggers a missile launch, the action may not be undoable.)

But this is not a problem for our business. In fact, actions that are hard to undo can often still be compensated. For example, a transaction that sends an email can be compensated by sending another email that explains the problem.

ACID guarantees:

Saga offers the same ACID guarantees as TCC:

  • Atomicity: guaranteed under normal circumstances.
  • Consistency: at a given moment, the data in database A and database B may violate the consistency requirement, but they are consistent in the end.
  • Isolation: at a given moment, transaction A can read partially committed results of transaction B.
  • Durability: as with local transactions, data is durable once it is committed.

Saga does not provide the full ACID guarantees, because atomicity and isolation cannot be fully satisfied. The original paper puts it as follows:

full atomicity is not provided. That is, sagas may view the partial results of other sagas

Through the saga log, a saga can guarantee consistency and durability.


Comparison with TCC

Compared with TCC, the disadvantage of Saga is the lack of a reserve action, which makes compensation awkward to implement because T_i is a direct commit. Take a service that sends email. In TCC mode, it first saves a draft (Try) and then sends it (Confirm); to cancel, it simply deletes the draft (Cancel). A saga sends the email directly (T_i); to undo it, it has to send a second email explaining the cancellation (C_i), which is more troublesome to implement.

Now replace the email example with this one: service A, as soon as it completes T_i, sends an Event to the ESB (Enterprise Service Bus, which can be thought of as messaging middleware), and downstream services that listen for the Event do some work of their own and then send further Events to the ESB. If service A then performs the compensating action C_i, the compensation has to cascade through many levels.

However, the lack of a reserve action can also be seen as an advantage:

  • Some business logic is very simple. Applying TCC means modifying the original business logic, whereas Saga only needs an added compensating action.
  • TCC needs at least 2n communications, whereas Saga needs n (n = number of sub-transactions).
  • Some third-party services have no Try interface, which makes the TCC model tricky to implement, whereas Saga is straightforward.
  • No reserve action means no need to worry about releasing resources, and exception handling is simpler as well (compare Saga's recovery strategies with TCC's exception handling).

2 Saga implementation

Saga Log

A saga guarantees that every sub-transaction is either completed or compensated, but the saga system itself may crash. When it crashes, the saga may be in one of the following states:

  • The saga received a transaction request but has not started yet. Because the saga has not modified the state of any microservice involved in its sub-transactions, nothing needs to be done.
  • Some sub-transactions have completed. After the restart, the saga must resume from the last completed sub-transaction.
  • A sub-transaction has started but not completed. The remote service may have completed the transaction, the transaction may have failed, or the request may simply have timed out, so the saga can only re-issue the sub-transaction that was never confirmed as completed. This means sub-transactions must be idempotent.
  • A sub-transaction failed and its compensating transaction has not started yet. After the restart, the saga must execute the corresponding compensating transaction.
  • A compensating transaction has started but not completed. The solution is the same as for an unfinished sub-transaction: issue it again. This means compensating transactions must also be idempotent.
  • All sub-transactions or compensating transactions have completed. This is the same as the first case.

To recover from the states above, we must track every step of the sub-transactions and compensating transactions. We decided to meet this requirement with events, saving the following events in a persistent store called the saga log:

  • Saga started event: saves the whole saga request, which includes multiple transaction/compensation requests
  • Transaction started event: saves the corresponding transaction request
  • Transaction ended event: saves the corresponding transaction request and its response
  • Transaction aborted event: saves the corresponding transaction request and the reason for the failure
  • Transaction compensated event: saves the corresponding compensation request and its response
  • Saga ended event: marks the end of the saga request; nothing needs to be saved

By persisting these events in the saga log, we can restore a saga to any of the states above.

Because Saga only needs to persist events, and the event content is stored as JSON, the saga log can be implemented very flexibly: a database (SQL or NoSQL), a durable message queue, or even an ordinary file can store the events, although some of these help a saga recover its state faster than others.
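As a rough illustration of how a JSON saga log can drive recovery, here is a minimal Python sketch. The event type strings, the `tx` field, and both function names are assumptions made for the example, and a plain list stands in for the persistent store. It replays the log and reports which sub-transactions a restarted saga would re-issue or compensate under backward recovery.

```python
import json
from typing import Dict, List


def append_event(log: List[str], event_type: str, **payload) -> None:
    """Append one event to the log as a JSON line (the list stands in for durable storage)."""
    log.append(json.dumps({"type": event_type, **payload}))


def pending_work(log: List[str]) -> Dict[str, List[str]]:
    """Replay the log and report what a restarted saga still has to do."""
    started: List[str] = []
    ended, aborted, compensated = set(), set(), set()
    saga_ended = False
    for line in log:
        event = json.loads(line)
        kind, tx = event["type"], event.get("tx")
        if kind == "TransactionStarted" and tx not in started:
            started.append(tx)
        elif kind == "TransactionEnded":
            ended.add(tx)
        elif kind == "TransactionAborted":
            aborted.add(tx)
        elif kind == "TransactionCompensated":
            compensated.add(tx)
        elif kind == "SagaEnded":
            saga_ended = True

    if saga_ended:
        return {"retry": [], "compensate": []}
    if aborted:
        # Backward recovery: compensate, newest first, everything that was
        # started and has no recorded compensation yet (the failed step included).
        todo = [tx for tx in reversed(started) if tx not in compensated]
        return {"retry": [], "compensate": todo}
    # No failure recorded: re-issue sub-transactions that were never confirmed.
    return {"retry": [tx for tx in started if tx not in ended], "compensate": []}
```

Because the log has no "compensation started" event, a compensation that was in flight during the crash simply shows up as not yet compensated and is issued again, which is why compensating transactions need to be idempotent.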

Precautions

For services, implementing Saga imposes the following requirements:

  1. T_i and C_i are idempotent.
  2. C_i must be able to succeed; if it cannot, manual intervention is required.
  3. T_i followed by C_i and C_i followed by T_i must produce the same result: the sub-transaction is undone.

The first point requires T_i and C_i to be idempotent. Suppose, for example, that T_i times out, so we do not know whether it took effect. Under forward recovery, T_i is sent again, so T_i may be executed twice; T_i must therefore be idempotent. Under backward recovery, C_i is sent, and if C_i also times out it is sent again, so C_i may be executed twice; C_i must therefore be idempotent as well.
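One common way to get this idempotency is to have the caller attach a unique request id and have the service remember the result for each id. The sketch below is a hypothetical in-memory example (the `AccountService` class and its `debit` method are invented for illustration); a real service would keep the processed ids in its own database, updated in the same local transaction as the balance.

```python
from typing import Dict


class AccountService:
    """Hypothetical participant that applies each debit at most once per request id."""

    def __init__(self, balance: int) -> None:
        self.balance = balance
        self._results: Dict[str, int] = {}  # request_id -> balance after the debit

    def debit(self, request_id: str, amount: int) -> int:
        if request_id in self._results:
            # Duplicate delivery (for example a retry after a timeout):
            # return the stored result without debiting again.
            return self._results[request_id]
        self.balance -= amount
        self._results[request_id] = self.balance
        return self.balance
```

With this in place, the saga can safely resend T_i (or C_i, implemented the same way) whenever it is unsure about the outcome of the previous attempt.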

The second point requires that C_i must be able to succeed. This is easy to understand: if C_i cannot succeed, the saga cannot be fully undone, which is not acceptable. There will always be special cases, though, such as a bug in the C_i code or a service that stays down for a long time, and these call for manual intervention.

The third point looks odd at first glance, so here is an illustration. Consider again the scenario where T_i times out and we use backward recovery, sending C_i. Three cases are possible:

  1. The T_i request was lost: the service did not execute T_i before C_i and does not execute it afterwards.
  2. T_i executes before C_i.
  3. C_i executes before T_i.

The first case is easy to handle. For the second and third cases, T_i and C_i must be commutative, and the final result must be that the sub-transaction is undone.
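One way to make T_i and C_i commutative is for the service to remember, per request id, that a cancellation has been seen, so that a T_i arriving late becomes a no-op. The sketch below assumes a hypothetical `SeatService` with in-memory state; it is one possible approach, not something prescribed by the Saga paper.

```python
from typing import Dict


class SeatService:
    """Hypothetical participant where reserve (T_i) and cancel (C_i) commute."""

    def __init__(self, seats: int) -> None:
        self.free = seats
        self._state: Dict[str, str] = {}  # request_id -> "reserved" or "cancelled"

    def reserve(self, request_id: str) -> None:  # T_i
        if self._state.get(request_id) is None:
            self.free -= 1
            self._state[request_id] = "reserved"
        # "reserved": duplicate T_i, nothing to do.
        # "cancelled": C_i already arrived, so the late T_i must not take effect.

    def cancel(self, request_id: str) -> None:  # C_i
        if self._state.get(request_id) == "reserved":
            self.free += 1
        # Record the cancellation even if T_i was never seen (cases 1 and 3).
        self._state[request_id] = "cancelled"
```

Whichever of the three cases occurs, the seat count ends where it started and the request is marked as cancelled.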


3 Saga coordination

Coordinating sagas: the implementation of a saga includes logic that coordinates its steps. When a system command starts the saga, the coordination logic must select the first saga participant and tell it to execute a local transaction. Once that transaction completes, the saga's sequencing logic selects and invokes the next saga participant. This process continues until the saga has executed all of its steps. If any local transaction fails, the saga must execute the compensating transactions in reverse order. There are two different ways to structure a saga's coordination logic:

  • Choreography: distribute the decision making and sequencing among the saga participants. They communicate primarily by exchanging events.
  • Orchestration: centralize the saga's coordination logic in a saga orchestrator class. A saga orchestrator sends command messages to the saga participants, telling them which operations to perform.

3.1 Choreography

Choreography-based sagas: one way to implement a saga is to use choreography. With choreography, there is no central coordinator telling the saga participants what to do. Instead, the saga participants subscribe to each other's events and respond accordingly.


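The happy path through this saga is as follows:

  1. Order Service creates an Order in the APPROVAL_PENDING state and publishes an OrderCreated event.
  2. Consumer Service consumes the OrderCreated event, verifies that the consumer can place the order, and publishes a ConsumerVerified event.
  3. Kitchen Service consumes the OrderCreated event, validates the Order, creates a Ticket in the CREATE_PENDING state, and publishes a TicketCreated event.
  4. Accounting Service consumes the OrderCreated event and creates a CreditCardAuthorization in the PENDING state.
  5. Accounting Service consumes the TicketCreated and ConsumerVerified events, charges the consumer's credit card, and publishes a CreditCardAuthorized event.
  6. Kitchen Service consumes the CreditCardAuthorized event and changes the state of the Ticket to AWAITING_ACCEPTANCE.
  7. Order Service receives the CreditCardAuthorized event, changes the state of the Order to APPROVED, and publishes an OrderApproved event.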

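The Create Order saga must also handle the scenario in which a saga participant rejects the Order and publishes some kind of failure event. For example, the authorization of the consumer's credit card might fail. The saga must then execute compensating transactions to undo what has already been done. The figure shows the flow of events when Accounting Service cannot authorize the consumer's credit card.

The sequence of events is as follows:

  1. Order Service creates an Order in the APPROVAL_PENDING state and publishes an OrderCreated event.
  2. Consumer Service consumes the OrderCreated event, verifies that the consumer can place the order, and publishes a ConsumerVerified event.
  3. Kitchen Service consumes the OrderCreated event, validates the Order, creates a Ticket in the CREATE_PENDING state, and publishes a TicketCreated event.
  4. Accounting Service consumes the OrderCreated event and creates a CreditCardAuthorization in the PENDING state.
  5. Accounting Service consumes the TicketCreated and ConsumerVerified events, attempts to charge the consumer's credit card, and publishes a Credit Card Authorization Failed event.
  6. Kitchen Service consumes the Credit Card Authorization Failed event and changes the state of the Ticket to REJECTED.
  7. Order Service consumes the Credit Card Authorization Failed event and changes the state of the Order to REJECTED.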
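The two event flows above can be reproduced with a small in-memory sketch. Everything here is an assumption made for illustration: the `EventBus` class stands in for a real message broker, the services are plain functions sharing one `state` dictionary, and the event name `CreditCardAuthorizationFailed` is a guess at an identifier for the failure event.

```python
from collections import defaultdict, deque


class EventBus:
    """In-memory stand-in for a message broker; delivers events in FIFO order."""

    def __init__(self):
        self._handlers = defaultdict(list)
        self._queue = deque()
        self.history = []

    def subscribe(self, event, handler):
        self._handlers[event].append(handler)

    def publish(self, event, order_id):
        self._queue.append((event, order_id))

    def run(self):
        while self._queue:
            event, order_id = self._queue.popleft()
            self.history.append(event)
            for handler in self._handlers[event]:
                handler(event, order_id)


def wire_services(bus, state, card_ok):
    """Subscribe each service to the events it reacts to."""
    seen = defaultdict(set)  # events Accounting has received, per order

    def consumer_verify(event, order_id):
        bus.publish("ConsumerVerified", order_id)

    def kitchen_create_ticket(event, order_id):
        state["ticket"] = "CREATE_PENDING"
        bus.publish("TicketCreated", order_id)

    def accounting_open(event, order_id):
        state["authorization"] = "PENDING"

    def accounting_charge(event, order_id):
        # Accounting acts only after it has seen both prerequisite events.
        seen[order_id].add(event)
        if seen[order_id] == {"TicketCreated", "ConsumerVerified"}:
            outcome = "CreditCardAuthorized" if card_ok else "CreditCardAuthorizationFailed"
            bus.publish(outcome, order_id)

    def kitchen_on_outcome(event, order_id):
        ok = event == "CreditCardAuthorized"
        state["ticket"] = "AWAITING_ACCEPTANCE" if ok else "REJECTED"

    def order_on_outcome(event, order_id):
        if event == "CreditCardAuthorized":
            state["order"] = "APPROVED"
            bus.publish("OrderApproved", order_id)
        else:
            state["order"] = "REJECTED"

    bus.subscribe("OrderCreated", consumer_verify)
    bus.subscribe("OrderCreated", kitchen_create_ticket)
    bus.subscribe("OrderCreated", accounting_open)
    for event in ("TicketCreated", "ConsumerVerified"):
        bus.subscribe(event, accounting_charge)
    for event in ("CreditCardAuthorized", "CreditCardAuthorizationFailed"):
        bus.subscribe(event, kitchen_on_outcome)
        bus.subscribe(event, order_on_outcome)


def create_order(bus, state, order_id):
    """Order Service starts the saga by publishing OrderCreated."""
    state["order"] = "APPROVAL_PENDING"
    bus.publish("OrderCreated", order_id)
    bus.run()
```

Calling `wire_services(bus, state, card_ok=False)` before `create_order` drives the failure flow, in which the Ticket and the Order both end up REJECTED. Note that no function in the sketch describes the saga as a whole; the flow only emerges from the subscriptions.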

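Reliable event-based communication

When implementing choreography-based sagas, you must consider a couple of issues related to inter-service communication. The first is ensuring that a saga participant updates its database and publishes an event as part of a single database transaction.
The second is ensuring that a saga participant can map each event it receives to its own data.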
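A widely used technique for the first issue is the transactional outbox: the service writes the event into an outbox table in the same local transaction as its business data, and a separate relay publishes the rows afterwards. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the table layout and the function names are assumptions made for the example. Carrying the `order_id` in the event is one simple answer to the second issue, because it lets each consumer look up its own data.

```python
import json
import sqlite3


def open_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, status TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE outbox (seq INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT NOT NULL)"
    )
    return conn


def place_order(conn: sqlite3.Connection, order_id: str) -> None:
    """Insert the order and its event together: both commit or both roll back."""
    event = {"type": "OrderCreated", "order_id": order_id}
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "INSERT INTO orders (id, status) VALUES (?, ?)",
            (order_id, "APPROVAL_PENDING"),
        )
        conn.execute("INSERT INTO outbox (payload) VALUES (?)", (json.dumps(event),))


def drain_outbox(conn: sqlite3.Connection, publish) -> int:
    """Relay: hand each stored event to `publish`, then delete it from the outbox."""
    rows = conn.execute("SELECT seq, payload FROM outbox ORDER BY seq").fetchall()
    for seq, payload in rows:
        publish(json.loads(payload))
        # A crash between publish and delete causes the event to be sent again,
        # so consumers should handle duplicates.
        with conn:
            conn.execute("DELETE FROM outbox WHERE seq = ?", (seq,))
    return len(rows)
```

The relay gives at-least-once delivery rather than exactly-once, which is one more reason for participants to process events idempotently.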

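Benefits and drawbacks of choreography-based sagas

Choreography-based sagas have several benefits:

  • Simplicity: services publish events when they create, update, or delete business objects.
  • Loose coupling: the participants subscribe to events and have no direct knowledge of each other.

And there are some drawbacks:

  • More difficult to understand: unlike with orchestration, there is no single place in the code that defines the saga. Instead, choreography distributes the implementation of the saga among the services. As a result, it is sometimes hard for a developer to understand how a given saga works.
  • Cyclic dependencies between services: the saga participants subscribe to each other's events, which often creates cyclic dependencies. For example, if you examine the figure carefully, you will see cyclic dependencies such as Order Service → Accounting Service → Order Service. Although this is not necessarily a problem, cyclic dependencies are considered a design smell.
  • Risk of tight coupling: each saga participant needs to subscribe to every event that affects it. For example, Accounting Service must subscribe to all events that cause the consumer's credit card to be charged or refunded. As a result, there is a risk that it has to be updated in lockstep with the order lifecycle implemented by Order Service.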

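3.2 Orchestration

Orchestration is another way to implement sagas. With orchestration, you define an orchestrator class whose sole responsibility is to tell the saga participants what to do. The saga orchestrator communicates with the participants using command/asynchronous-reply style interaction.

Order Service first creates an Order and a Create Order saga orchestrator. After that, the flow for the happy path is as follows:

  • The saga orchestrator sends a Verify Consumer command to Consumer Service.
  • Consumer Service replies with a Consumer Verified message.
  • The saga orchestrator sends a Create Ticket command to Kitchen Service.
  • Kitchen Service replies with a Ticket Created message.
  • The saga orchestrator sends an Authorize Card message to Accounting Service.
  • Accounting Service replies with a Card Authorized message.
  • The saga orchestrator sends an Approve Ticket command to Kitchen Service.
  • The saga orchestrator sends an Approve Order command to Order Service.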
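The happy path above can be sketched as a small orchestrator class. This is a simplified illustration under several assumptions: participants are modelled as plain Python callables, a synchronous call stands in for the command/asynchronous-reply exchange, and the `"FAILED"` reply value is invented for the example. Compensation is deliberately left out.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical participant signature: (command, order_id) -> reply message.
Participant = Callable[[str, str], str]

# (participant, command) pairs in the order given by the happy path.
HAPPY_PATH: List[Tuple[str, str]] = [
    ("Consumer Service", "Verify Consumer"),
    ("Kitchen Service", "Create Ticket"),
    ("Accounting Service", "Authorize Card"),
    ("Kitchen Service", "Approve Ticket"),
    ("Order Service", "Approve Order"),
]


class CreateOrderSagaOrchestrator:
    """Tells each participant what to do; participants never call each other."""

    def __init__(self, participants: Dict[str, Participant]) -> None:
        self.participants = participants
        self.sent: List[str] = []  # commands sent so far, for inspection

    def run(self, order_id: str) -> bool:
        for service, command in HAPPY_PATH:
            self.sent.append(f"{service}: {command}")
            reply = self.participants[service](command, order_id)
            if reply == "FAILED":
                # A fuller sketch would now send compensating commands
                # to the participants that already completed their step.
                return False
        return True
```

All sequencing lives in `HAPPY_PATH` and `run`, so the whole saga can be read in one place, and the participants only need to understand the commands addressed to them.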

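Modeling saga orchestrators as state machines

A good way to model a saga orchestrator is as a state machine. A state machine consists of a set of states and a set of transitions between states that are triggered by events. Each transition can have an action, which for a saga is the invocation of a saga participant. The transitions between states are triggered by the completion of a local transaction performed by a saga participant. The current state and the specific outcome of the local transaction determine the state transition and what action, if any, to perform. There are also effective testing strategies for state machines. As a result, using a state machine model makes sagas easier to design, implement, and test.

The figure shows the state machine model for the Create Order Saga. This state machine consists of numerous states, including the following:

  • Verifying Consumer: the initial state. In this state, the saga is waiting for Consumer Service to verify that the consumer can place the order.
  • Creating Ticket: the saga is waiting for a reply to the Create Ticket command.
  • Authorizing Card: the saga is waiting for Accounting Service to authorize the consumer's credit card.
  • Order Approved: a final state indicating that the saga completed successfully.
  • Order Rejected: a final state indicating that the Order was rejected by one of the participants.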
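The states listed above translate naturally into a transition table keyed by the current state and the outcome of the last local transaction. The sketch below is an assumption-laden illustration: the `"success"`/`"failure"` outcome labels and the `run` helper are invented, and the failure transitions jump straight to Order Rejected without the compensating commands a real saga would send.

```python
from enum import Enum
from typing import Dict, List, Tuple


class State(Enum):
    VERIFYING_CONSUMER = "Verifying Consumer"
    CREATING_TICKET = "Creating Ticket"
    AUTHORIZING_CARD = "Authorizing Card"
    ORDER_APPROVED = "Order Approved"
    ORDER_REJECTED = "Order Rejected"


# (current state, outcome) -> (next state, commands to send as the action)
TRANSITIONS: Dict[Tuple[State, str], Tuple[State, Tuple[str, ...]]] = {
    (State.VERIFYING_CONSUMER, "success"): (State.CREATING_TICKET, ("Create Ticket",)),
    (State.VERIFYING_CONSUMER, "failure"): (State.ORDER_REJECTED, ()),
    (State.CREATING_TICKET, "success"): (State.AUTHORIZING_CARD, ("Authorize Card",)),
    (State.CREATING_TICKET, "failure"): (State.ORDER_REJECTED, ()),
    (State.AUTHORIZING_CARD, "success"): (
        State.ORDER_APPROVED,
        ("Approve Ticket", "Approve Order"),
    ),
    (State.AUTHORIZING_CARD, "failure"): (State.ORDER_REJECTED, ()),
}


def run(outcomes: List[str]) -> Tuple[State, List[str]]:
    """Feed a sequence of local-transaction outcomes through the state machine."""
    state = State.VERIFYING_CONSUMER
    commands = ["Verify Consumer"]  # sent when the saga starts
    for outcome in outcomes:
        state, actions = TRANSITIONS[(state, outcome)]
        commands.extend(actions)
    return state, commands
```

Because the table is plain data, each transition can be checked in isolation, which is what makes the state machine model easy to test.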

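Saga orchestration and transactional messaging

Each step of an orchestration-based saga consists of a service updating a database and publishing a message. For example, Order Service persists an Order and a Create Order Saga orchestrator and sends a message to the first saga participant. A saga participant, such as Kitchen Service, handles a command message by updating its database and sending a reply message. Order Service processes the participant's reply message by updating the state of the saga orchestrator and sending a command message to the next saga participant. A service must use transactional messaging in order to atomically update the database and publish messages.

Let's look at the benefits and drawbacks of using saga orchestration.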

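Benefits and drawbacks of orchestration-based sagas

Orchestration-based sagas have several benefits:

  1. Simpler dependencies: one benefit of orchestration is that it does not introduce cyclic dependencies. The saga orchestrator invokes the saga participants, but the participants do not invoke the orchestrator. As a result, the orchestrator depends on the participants but not vice versa, so there are no cyclic dependencies.
  2. Less coupling: each service implements an API that is invoked by the orchestrator, so it does not need to know about the events published by the saga participants.
  3. Improved separation of concerns and simpler business logic: the saga coordination logic is localized in the saga orchestrator. The domain objects are simpler and have no knowledge of the sagas they participate in. For example, when using orchestration, the Order class has no knowledge of any saga, so it has a simpler state machine model. During the execution of the Create Order saga, it transitions directly from the APPROVAL_PENDING state to the APPROVED state. The Order class has no intermediate states corresponding to the steps of the saga. As a result, the business logic is much simpler.

Orchestration also has a drawback:

  • The risk of centralizing too much business logic in the orchestrator. This results in a design where a smart orchestrator tells dumb services what operations to perform. Fortunately, you can avoid this problem by designing orchestrators that are solely responsible for sequencing and do not contain any other business logic.

I recommend using orchestration for all but the simplest sagas. Implementing the coordination logic for your sagas is just one of the design problems you need to solve.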



 





 

 

 





 



Origin www.cnblogs.com/tianyamoon/p/11969089.html