Distributed transaction design and practice

Definition of data consistency

  • For any user
  • At any time
  • From any location
  • Through any access method
  • Through any service
  • The data seen is consistent

Causes of inconsistent data

  • Data is scattered across many places
    • Multiple databases
    • Database and cache
  • Example: a second-hand trading platform
    • User, transaction, product, and other functions

Why distributed transactions arise

The system starts as a single process.

As it evolves, the monolithic service is split into microservices, and each service runs as a separate process.

When the request volume grows, a distributed cache is added to relieve pressure on the database.

Distributed transaction case

Buying goods on an e-commerce platform

Place an order -> reduce inventory -> pay

This is the distributed transaction problem. When the app buys something, the operation spans multiple services, which means multiple databases must be updated. A local transaction covers only one database, so it cannot guarantee consistency across all of them.
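To make the limitation concrete, here is a minimal sketch using two in-memory SQLite databases to stand in for the order and inventory services. The table names, the `place_order` helper, and the stock quantities are assumptions made for illustration only; they do not come from the original system.

```python
import sqlite3


def place_order(order_db, stock_db, item, qty):
    """Each service commits its own local transaction; nothing ties them together."""
    # Order service: a local transaction on its own database.
    with order_db:
        order_db.execute("INSERT INTO orders(item, qty) VALUES (?, ?)", (item, qty))
    # Inventory service: a separate local transaction on another database.
    with stock_db:
        cur = stock_db.execute(
            "UPDATE stock SET qty = qty - ? WHERE item = ? AND qty >= ?",
            (qty, item, qty),
        )
        if cur.rowcount == 0:
            raise RuntimeError("insufficient stock")


def demo():
    order_db = sqlite3.connect(":memory:")
    stock_db = sqlite3.connect(":memory:")
    order_db.execute("CREATE TABLE orders(item TEXT, qty INTEGER)")
    stock_db.execute("CREATE TABLE stock(item TEXT PRIMARY KEY, qty INTEGER)")
    stock_db.execute("INSERT INTO stock VALUES ('book', 1)")
    stock_db.commit()
    try:
        place_order(order_db, stock_db, "book", 5)
    except RuntimeError:
        pass  # the inventory step failed, but the order is already committed
    orders = order_db.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
    stock = stock_db.execute("SELECT qty FROM stock WHERE item = 'book'").fetchone()[0]
    return orders, stock
```

Calling `demo()` leaves one committed order even though the inventory was never reduced, which is exactly the inconsistency described above.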

Distributed transaction scenario

  • E-commerce order scenario
    • Place an order
    • Send a message to MQ
  • Consistency guarantee
    • Local transaction
      • Order operation
      • Send-MQ-message operation
      • Both are put in one local transaction

What's wrong with the above approach?

Problem: if sending the message times out, the sender does not know whether MQ handled it successfully or not. Rolling back the local transaction cannot recall a message the broker may already have accepted, so the two operations are not atomic.
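The timeout problem can be reproduced with a small sketch. `FlakyMQ` is a hypothetical broker stub (not a real MQ client) that accepts the message and then loses the reply; the `orders` table is likewise an assumption for illustration.

```python
import sqlite3


class FlakyMQ:
    """Hypothetical broker stub: it stores the message, then the reply is lost."""

    def __init__(self):
        self.delivered = []

    def send(self, msg):
        self.delivered.append(msg)  # the broker did receive the message
        raise TimeoutError("no reply from broker")  # the producer never learns that


def place_order(db, mq, order_id):
    try:
        with db:  # local transaction: commit on success, rollback on exception
            db.execute("INSERT INTO orders(id) VALUES (?)", (order_id,))
            mq.send({"order_id": order_id})
    except TimeoutError:
        return False
    return True


def demo():
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE orders(id INTEGER PRIMARY KEY)")
    mq = FlakyMQ()
    ok = place_order(db, mq, 1)
    rows = db.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
    return ok, rows, len(mq.delivered)
```

Here the order row is rolled back, yet the broker still holds a message announcing that order. Wrapping both steps in one local transaction did not make them atomic.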

 

Classification of distributed transactions

  • Rigid distributed transactions
    • Strong consistency
    • XA model
    • CAP
      • CP
  • Flexible distributed transactions
    • Eventual consistency
    • CAP, BASE theory
      • AP

Rigid distributed transactions

They satisfy the properties of traditional transactions:

ACID (Atomicity, Consistency, Isolation, Durability)

XA model

  • XA is defined in the X/Open CAE Specification for Distributed Transaction Processing (DTP). The DTP model consists of three roles: AP, RM, and TM
  • Application Program (AP): defines the transaction boundaries (the beginning and end of the transaction) and accesses resources within those boundaries
  • Resource Manager (RM): manages shared resources such as databases
  • Transaction Manager (TM): manages global transactions, assigns unique transaction identifiers, monitors execution progress, and handles commit, rollback, and failure recovery

2PC (two-phase commit, the standard implementation of the XA specification)

  • Analogy
    • Organizing a mountain hike: the organizer first asks everyone whether they can come, and confirms the trip only if all agree
  • Process
    • Two-phase commit is the standard implementation of the XA specification
    • The TM initiates the prepare vote
    • After every RM agrees, the TM initiates the commit
    • If a node goes down during the commit phase, then after it restarts the commit is retried as compensation based on XA recover
  • Disadvantages
    • Synchronous blocking model
    • Database resources stay locked for too long
    • Global locking (serializable isolation level), low concurrency
    • Not suitable for long-running transactions
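The prepare-then-commit flow can be sketched in a few lines. This is a toy in-memory model under stated assumptions: `Participant` and `two_phase_commit` are hypothetical names, and real XA resource managers also persist their state and hold locks between the two phases, which is omitted here.

```python
class Participant:
    """Hypothetical resource manager (RM) that votes in phase 1."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "init"

    def prepare(self):
        # Phase 1: a real RM would lock resources here before voting.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "rolled_back"


def two_phase_commit(participants):
    """Transaction manager (TM): commit only if every RM votes yes."""
    votes = [p.prepare() for p in participants]  # phase 1: collect votes
    if all(votes):
        for p in participants:  # phase 2: commit everywhere
            p.commit()
        return True
    for p in participants:  # phase 2: roll back everywhere
        p.rollback()
    return False
```

Every participant waits in the prepared state until the TM decides, which is where the blocking and long lock times listed above come from.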

Flexible distributed transactions

  • CAP
    • Partition tolerance (P) is mandatory in a distributed environment, so the trade-off is between C and A
  • BASE theory
    • Basically Available
    • Soft state
    • Eventual consistency
  • Architectural thinking
    • Flexible transactions are a compromise on the XA protocol. By relaxing the strong consistency requirement, they shorten the time database resources stay locked and improve availability
  • Classic implementations
    • TCC model
    • Saga model

 

TCC model

  • Try-Confirm-Cancel
  • The TCC model is implemented entirely by the business side: every sub-business must implement the Try, Confirm, and Cancel interfaces, which is highly invasive to business code
    • Resource locking is left to the business side
  • Try
    • Attempt the business operation: complete all checks and reserve the necessary business resources
  • Confirm
    • Actually execute the business operation, without repeating the business checks
  • Cancel
    • Release the business resources reserved in the Try phase
  • Case study
    • Remittance service and collection service
      • User A remits 500 yuan to user B
    • Remittance service
      • Try
        • Check that account A is valid and that its status is not "transferring" or "frozen"
        • Check that account A's balance is sufficient
        • Deduct 500 yuan from account A and set its status to "transferring"
        • Reserve the deducted funds by recording the event "transfer 500 yuan from A to B" in a message or log
      • Confirm
        • Do nothing
      • Cancel
        • Add 500 yuan back to account A
        • Release the reserved deduction from the log or message
    • Collection service
      • Try
        • Check that account B is valid
      • Confirm
        • Read the log or message and add 500 yuan to account B
        • Release the reserved deduction from the log or message
      • Cancel
        • Do nothing
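The remittance case above can be sketched roughly as follows. The class and method names, the in-memory `log` list standing in for the "message or log" store, and the `transfer` coordinator are all assumptions for illustration. The sketch follows the listed steps literally, so the remittance Confirm does nothing and account A's status is not reset on success.

```python
class RemittanceService:
    """Sender side (account A)."""

    def __init__(self, balance):
        self.balance = balance
        self.status = "normal"
        self.log = []  # stands in for the "message or log" store

    def try_(self, amount):
        if self.status in ("transferring", "frozen"):
            raise RuntimeError("account A is not available")
        if self.balance < amount:
            raise RuntimeError("insufficient balance")
        self.balance -= amount
        self.status = "transferring"
        self.log.append(amount)  # reserve the deducted funds

    def confirm(self):
        pass  # the money was already deducted in try_

    def cancel(self):
        if self.log:  # guard keeps cancel safe to call more than once
            self.balance += self.log.pop()
            self.status = "normal"


class CollectionService:
    """Receiver side (account B)."""

    def __init__(self, balance, valid=True):
        self.balance = balance
        self.valid = valid

    def try_(self):
        if not self.valid:
            raise RuntimeError("account B is invalid")

    def confirm(self, log):
        if log:
            self.balance += log.pop()  # read the reservation and release it

    def cancel(self):
        pass


def transfer(sender, receiver, amount):
    """Hypothetical coordinator: Try on both sides, then Confirm or Cancel."""
    try:
        sender.try_(amount)
        receiver.try_()
    except RuntimeError:
        sender.cancel()
        receiver.cancel()
        return False
    sender.confirm()
    receiver.confirm(sender.log)
    return True
```

Note how all checking and reserving happens in the business code itself, which is the intrusiveness the TCC model is criticized for.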

Saga model

  • Originates from the paper "Sagas", published by Hector Garcia-Molina and Kenneth Salem in 1987
  • The Saga model splits a distributed transaction into multiple local transactions. Each local transaction has an execution module and a compensation module (corresponding to Confirm and Cancel in TCC)
  • When any local transaction in the Saga fails, the earlier ones can be undone by calling their compensation methods, so the transaction as a whole reaches eventual consistency
  • When the Saga sub-transactions T1, T2, ..., TN have corresponding compensations C1, C2, ..., CN-1, the Saga system can guarantee that
    • Either the sequence T1, T2, ..., TN completes (the best case)
    • Or the sequence T1, T2, ..., TJ, CJ-1, ..., C2, C1 (0 < J < N) completes, where TJ is the sub-transaction that fails
  • Saga isolation
    • The business layer controls concurrency
      • Lock at the application layer
      • Freeze resources in advance at the application layer, etc.
  • Saga recovery methods
    • Backward recovery: if any sub-transaction fails, compensate all completed sub-transactions
    • Forward recovery: retry the failed sub-transaction, assuming every sub-transaction will eventually succeed
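Backward recovery can be sketched as a loop that remembers the compensation of each completed step. The `run_saga` function and the shape of `steps` (a list of action and compensation callables) are assumptions for illustration, not an API from the paper or any framework.

```python
def run_saga(steps):
    """Run (action, compensation) pairs in order; undo completed ones on failure."""
    done = []  # compensations of the sub-transactions that have completed
    for action, compensate in steps:
        try:
            action()
        except Exception:
            # Backward recovery: compensate in reverse order of completion.
            for comp in reversed(done):
                comp()
            return False
        done.append(compensate)
    return True
```

With three steps where the third one raises, the execution order is T1, T2, then C2, C1, matching the second sequence above. Forward recovery would instead retry the failing action until it succeeds.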

 

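Rigid distributed transactions vs. flexible distributed transactions

  • Business changes required
    • Rigid transactions (XA): none
    • Flexible transactions: yes
  • Rollback
    • Rigid transactions (XA): supported
    • Flexible transactions: a compensation interface must be implemented
  • Consistency
    • Rigid transactions (XA): strong consistency (CP)
    • Flexible transactions: eventual consistency (AP)
  • Isolation
    • Rigid transactions (XA): native support
    • Flexible transactions: a resource locking interface must be implemented
  • Concurrency performance
    • Rigid transactions (XA): severe degradation
    • Flexible transactions: slight degradation
  • Suitable scenarios
    • Rigid transactions (XA): short transactions, low concurrency
    • Flexible transactions: long transactions, high concurrency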

How we do it in practice

  • General approaches to solving a problem
    • Solve the problem itself
    • Make the problem go away
      • Example: the leaking ballpoint pen refill
  • A ballpoint pen refill starts to leak ink after writing about 20,000 times. Solving the problem itself means using better materials and more advanced manufacturing. Making the problem go away means limiting the ink so the refill can only write about 15,000 times: it runs dry and is thrown away before it can leak.
  • Making the problem go away is the first choice; solving the problem itself is the second
  • Option 1: eliminate distributed transactions from the business scenario
    • Idea: process the core business first and handle the rest asynchronously
  • Option 2: flexible distributed transactions

Flexible distributed transactions in practice

  • General approach
    • Local transaction --> short transaction
    • Distributed transaction --> long transaction
    • Split the long transaction into multiple short transactions
    • Case study
      • A[Order]->B[Reduce inventory]->C[Pay]
        • A->DB1
        • B->DB2
        • C->DB3
        • A/B/C all succeed
        • A/B succeed, C fails
          • Compensate
  • Business scenarios
    • Asynchronous scenarios
      • Distributed transactions driven by MQ messages
    • Synchronous scenarios
      • Distributed transactions based on asynchronous compensation

Design of distributed transactions in asynchronous scenarios

Asynchronous scenario

Trading

Place an order and pay

Solution 1: the business side provides a check-back interface that reports whether the local operation succeeded

    • Transactional message: MQ provides a distributed transaction capability similar to X/Open XA, and eventual consistency of the distributed transaction can be achieved through MQ transactional messages
    • Half message: a message that cannot be delivered yet. The sender has successfully sent the message to the MQ server, but the server has not received the producer's second confirmation for it. The message is then marked "temporarily undeliverable", and a message in this state is a half message
    • Message check-back: the second confirmation of a transactional message can be lost because of a network interruption, a producer restart, or similar reasons. When the MQ server finds during scanning that a message has stayed a half message for a long time, it actively asks the producer for the final status of the message (Commit or Rollback). This is message check-back

  • MQ transactional message design
    • A transactional message is an asynchronous-assurance transaction: the two transaction branches are decoupled asynchronously through MQ. The design also borrows from two-phase commit theory. The overall interaction is as follows:
    1. The transaction initiator first sends a prepare message to MQ
    2. After the prepare message is sent successfully, it executes the local transaction
    3. It returns commit or rollback according to the result of the local transaction
    4. On rollback, MQ deletes the prepare message without delivering it. On commit, MQ delivers the message to the consumer
    5. If the initiator crashes or times out while executing the local transaction, the MQ server keeps asking the producer for the transaction status
    6. Successful consumption on the consumer side is guaranteed by MQ
  • Cost
    • MQ must support half messages
    • MQ must be able to scan (traverse) its half messages
    • The business side must provide a check-back interface
  • Integration steps for the business side (shown in a figure in the original post)
  • Advantage
    • General-purpose
  • Disadvantages
    • The business side must provide a check-back interface, which is highly invasive to business code
    • Sending a message is not idempotent
    • The consumer must handle idempotence
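The six-step flow above can be modelled with a small in-memory sketch. `TransactionalMQ`, `send_in_transaction`, and the `lose_confirmation` flag are hypothetical names invented for this illustration; they are not the API of RocketMQ or any other broker.

```python
class TransactionalMQ:
    """Hypothetical broker that supports half messages and check-back."""

    def __init__(self):
        self.half = {}   # msg_id -> body, not yet visible to consumers
        self.queue = []  # messages visible to consumers

    def send_half(self, msg_id, body):
        self.half[msg_id] = body

    def end_transaction(self, msg_id, state):
        body = self.half.pop(msg_id, None)
        if body is not None and state == "commit":
            self.queue.append(body)  # rollback simply drops the half message

    def check_back(self, checker):
        # Scan the half messages and ask the producer for their final status.
        for msg_id in list(self.half):
            state = checker(msg_id)  # "commit", "rollback" or "unknown"
            if state != "unknown":
                self.end_transaction(msg_id, state)


def send_in_transaction(mq, msg_id, body, local_tx, lose_confirmation=False):
    mq.send_half(msg_id, body)  # step 1: prepare (half) message
    try:
        local_tx()  # step 2: local transaction
        state = "commit"
    except Exception:
        state = "rollback"
    if not lose_confirmation:  # steps 3 and 4: second confirmation
        mq.end_transaction(msg_id, state)
    return state
```

When the second confirmation is lost, the message stays in `half` until `check_back` calls the producer's check-back function (step 5), which is why the business side has to be able to answer whether a given local transaction committed.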

 

Solution 2: Local transaction message table

  • The local operation and the message record are written in the same local transaction, which keeps them strongly consistent
    • Local transaction table
    • Local transaction message table
      • mqMessages(msgid,content,topic,status)

  • Sending messages is not idempotent
    • At least once
    • Exactly once
    • At most once
  • The consumer must process messages idempotently
    • Distributed lock
  • A->B->C
    • A/B succeed, C fails
      • Record an error log
      • Raise an alert
      • Manual intervention
  • Advantage
    • Low intrusion into business code

Compared with providing a message check-back interface (RocketMQ), the local transaction message table is used more often in real asynchronous scenarios.
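A minimal sketch of the local message table, using SQLite so it stays self-contained. The `mqMessages` columns come from the text above; the `orders` table, the status values `pending` and `sent`, the topic name, and the `relay_pending` helper are assumptions. The consumer here deduplicates by `msgid` in memory, which is a simpler stand-in for the distributed lock mentioned above.

```python
import sqlite3
import uuid


def create_schema(db):
    db.execute("CREATE TABLE orders(id INTEGER PRIMARY KEY, item TEXT)")
    db.execute(
        "CREATE TABLE mqMessages(msgid TEXT PRIMARY KEY, content TEXT,"
        " topic TEXT, status TEXT)"
    )


def place_order(db, item):
    """Write the order and its outgoing message in ONE local transaction."""
    msgid = str(uuid.uuid4())
    with db:
        db.execute("INSERT INTO orders(item) VALUES (?)", (item,))
        db.execute(
            "INSERT INTO mqMessages VALUES (?, ?, ?, 'pending')",
            (msgid, item, "order.created"),
        )
    return msgid


def relay_pending(db, publish):
    """Background job: publish pending rows, then mark them as sent."""
    rows = db.execute(
        "SELECT msgid, content, topic FROM mqMessages WHERE status = 'pending'"
    ).fetchall()
    for msgid, content, topic in rows:
        publish(topic, msgid, content)  # if this raises, the row stays pending
        with db:
            db.execute("UPDATE mqMessages SET status = 'sent' WHERE msgid = ?", (msgid,))
    return len(rows)


class Consumer:
    """Idempotent consumer: a message id is handled at most once."""

    def __init__(self):
        self.seen = set()
        self.handled = []

    def on_message(self, topic, msgid, content):
        if msgid in self.seen:
            return
        self.seen.add(msgid)
        self.handled.append(content)
```

Because a crash between publishing and marking the row as sent causes a resend, delivery is at-least-once, and that is the reason the consumer side has to be idempotent.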

 

Distributed transaction design for synchronous scenarios

  • Synchronous scenarios
    • Homepage recommended product list
      • Product information
      • User information
      • Social information
    • Buying goods
      • Place an order -> A
      • Reduce inventory -> B
      • Pay -> C

Driven by the business logic layer

  • Solution
    • Distributed transactions based on asynchronous compensation
    • Three key points of the architecture design
      • Record the parameters of every call request before the call starts
      • On failure, invoke the compensation interface with the recorded parameters
      • The compensation interface must be idempotent
  • Overall architecture design

Scenario: A places the order, B reduces inventory, and C pays. Before each call, the Proxy stores the transaction ID, status, parameters, and other information, and then the local transaction runs; A, B, and C all follow this process. If all three succeed, the transaction state is changed to 2, which means success. If C fails, A and B are compensated using the recorded parameters and the transaction ID.

Business logic layer Proxy design (based on AOP)

    • Add a transaction annotation to logic-layer calls: @Around("execution(* *(..)) && @annotation(TX)")
    • Before the real business logic is called, the Proxy generates a globally unique TXID that identifies the transaction group. The TXID is stored in a ThreadLocal variable, written before the method starts and cleared after it completes. The TXID is also written to the remote database, and the transaction group is marked as started
    • Before the business logic layer calls the data access layer, the RPCProxy records the parameters of the current call request
    • If the business call succeeds, the call records of the current method are deleted or archived after the call completes
    • If the business call fails, the call chain is queried and compensated in reverse order

  • Data access layer design
    • Atomic interface
    • Compensation interface
      • Who provides it?
        • The business side
      • Idempotence guarantee
        • Use a local resource lock on the unique resource
    • On the atomic interface method, add an annotation that names its compensation method
    • @Compensable(cancelMethod = "cancelRecord")

  • Distributed transaction compensation service
    • Transaction group table (database table TDB)
      • Records the state of the transaction group
      • txid state timestamp
    • Transaction call table (database table TDB)
      • Records every call in the transaction group and its parameters
      • txid actionid callmethod pramatype params
    • Compensation strategy
      • When a call fails, the transaction group status is updated
      • The distributed transaction compensation service runs the compensation asynchronously
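A compensation pass over these two tables might look like the sketch below. The column names follow the text above; the table names `tx_group` and `tx_call`, the state values `failed` and `compensated`, the JSON encoding of `params`, and the `cancel_methods` registry are assumptions for illustration.

```python
import json
import sqlite3


def create_tdb():
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE tx_group(txid TEXT PRIMARY KEY, state TEXT, timestamp REAL)")
    db.execute(
        "CREATE TABLE tx_call(txid TEXT, actionid INTEGER, callmethod TEXT,"
        " pramatype TEXT, params TEXT)"
    )
    return db


def compensate_failed(db, cancel_methods):
    """Undo every failed transaction group, most recent call first."""
    failed = db.execute("SELECT txid FROM tx_group WHERE state = 'failed'").fetchall()
    for (txid,) in failed:
        calls = db.execute(
            "SELECT callmethod, params FROM tx_call WHERE txid = ?"
            " ORDER BY actionid DESC",
            (txid,),
        ).fetchall()
        for callmethod, params in calls:
            # Replay the recorded parameters against the cancel method.
            cancel_methods[callmethod](*json.loads(params))
        with db:
            db.execute("UPDATE tx_group SET state = 'compensated' WHERE txid = ?", (txid,))
    return len(failed)
```

A scheduler would call `compensate_failed` periodically. Since a crash can happen after a cancel method runs but before the state is updated, the cancel methods may be invoked again on the next pass, which is why the compensation interface must be idempotent.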

Distributed transaction success case

  • Normal flow of the create-order transaction group for a second-hand trade
    • Lock inventory -> Deduct red envelope -> Create order
  • The proxy layer records the call request parameters transparently
    • Records the start and end of the transaction scope
    • All remote calls succeed
    • No intrusion into business logic

Distributed transaction failure case

  • Abnormal flow of the create-order transaction group for a second-hand trade
    • A microservice's data access layer fails, and the proxy changes the transaction group status
    • The microservice business logic continues to run normally
    • The transaction compensation service performs the compensation asynchronously

OK, that is the end of distributed transactions. Time for a break. I am also looking for a job again, so feel free to contact me if you have an opening.

Origin blog.csdn.net/m0_50180963/article/details/115264513