Distributed Transaction (ACID)


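Isolation levels

Read Uncommitted (RU): can read data that another transaction is still modifying and has not committed.

Read Committed (RC): can read data once the modification has been committed, but cannot read data that is still being modified.

Repeatable Read (RR): repeated reads inside one transaction return the same result; changes that other transactions make in the meantime are not seen.

Serializable (SR): transactions behave as if they were executed one after another.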

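Isolation problems caused by concurrent transactions

1 Dirty read: transaction A reads data that transaction B is updating, then B rolls back, so the data A read is dirty.

2 Non-repeatable read: transaction A reads the same set of data several times, and between those reads transaction B updates the data and commits, so A sees different data before and after.

3 Phantom read: transaction A modifies data, transaction B inserts a row, and when A reads again it finds a row that has not been modified.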
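
The dirty-read case can be illustrated with a toy model. This is only a sketch: `ToyStore` and its methods are hypothetical names made up for illustration, and it models a single value with one in-flight write, not a real database engine.

```python
class ToyStore:
    """A single value with one committed copy and at most one uncommitted write."""

    def __init__(self, value):
        self.committed = value
        self.pending = None  # uncommitted write by some transaction, if any

    def write(self, value):
        self.pending = value

    def commit(self):
        if self.pending is not None:
            self.committed = self.pending
            self.pending = None

    def rollback(self):
        self.pending = None

    def read(self, level):
        # RU may see the in-flight write; RC only sees committed data.
        if level == "RU" and self.pending is not None:
            return self.pending
        return self.committed


store = ToyStore(100)
store.write(50)               # transaction B updates, not yet committed
dirty = store.read("RU")      # transaction A under RU sees the uncommitted 50
clean = store.read("RC")      # transaction A under RC still sees 100
store.rollback()              # B rolls back, so the 50 never really existed
after = store.read("RU")      # back to 100
```

Under RU, transaction A has acted on a value (50) that was never committed, which is exactly the dirty read described in item 1.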

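Isolation levels and concurrency problems

Isolation level          Dirty read   Non-repeatable read   Phantom read
Read Uncommitted (RU)    possible     possible              possible
Read Committed (RC)      no           possible              possible
Repeatable Read (RR)     no           no                    possible
Serializable (SR)        no           no                    no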

How does a stand-alone database solve this?

To guarantee ACID on a stand-alone database, database vendors basically rely on transactions + locks + MVCC.

Atomicity: achieved through transactions plus locking.

Consistency: also resolved through transactions.

Isolation: ORACLE implements the RC isolation level through UNDO + transactions + locks; its UNDO-based consistent reads give each statement a consistent snapshot, which avoids phantom and non-repeatable reads within a single query.

What is a transaction? A transaction is a mechanism that guarantees data is consistent before and after a modification. If 100 rows are to be modified, it must not happen that 50 rows are modified successfully while the other 50 fail.

Nor may the data end up inconsistent with the business logic, for example account A is debited 50 successfully but the credit of 50 to account B fails.

It follows that a transaction may consist of a single statement or of multiple statements.

By default, a single statement starts a transaction automatically, while a multi-statement transaction must be opened manually.

Use the following commands to group multiple statements into one transaction.

start transaction;

commit;

Given these transaction markers, the database either applies all the row changes made by the statements together, or fails them all together.

What if some changes succeed and some fail? The changes that succeeded are rolled back.

Thus, the atomicity requirement is fulfilled through the transaction.
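
The account example can be tried on a real stand-alone engine. The following is a minimal sketch using Python's built-in `sqlite3` module; the `account` table, the `CHECK` constraint, and the `transfer` helper are assumptions made up for illustration, not something the original describes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES ('A', 100), ('B', 0)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))


ok_first = transfer(conn, "A", "B", 50)    # both updates succeed
ok_second = transfer(conn, "A", "B", 80)   # debit violates the CHECK constraint
final = balances(conn)
```

In the second call the credit to B succeeds first and the debit from A then fails, so the successful half is rolled back and the balances stay as they were after the first transfer.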

Distributed Transaction (ACID)

In a distributed database, transactions or modification statements are dispatched to the underlying shards, 4 or more of them.

Each shard is an independent stand-alone database.

If a statement goes to only one shard, there is no problem: the inherent mechanism of the underlying database can be used to achieve atomicity.

What if the transaction spans multiple shards?

For example, when a single modification statement is issued to 4 shards, by default the large transaction is split into 4 sub-transactions.

The 4 sub-transactions are sent to the underlying databases for execution.

The same problem then arises as in a stand-alone database where some rows are modified successfully and others are not.

For example, two shards are modified successfully and the other two fail.

The failure may be because the shard's database has hung, the sub-transaction is waiting for a lock, or the sub-transaction was killed after hitting a deadlock.

The upper layer then registers the execution of each sub-transaction on the different shards, marking it OK if it succeeded and FAILED otherwise.

GTID   SQL_ID   SHARDING01   SHARDING02   SHARDING03   SHARDING04
00X1   S02      OK           OK           FAILED       FAILED

When the modification succeeds only partially, the shards marked OK are rolled back.

In addition, a TIME_OUT should be set for each issued transaction. After the timeout, the execution is considered failed, and a ROLLBACK command is issued to the shards that succeeded.

For multi-statement transactions, we add a SQL_ID and require every SQL statement in the transaction to be OK on all shard nodes; if any one is FAILED, the transaction must be rolled back.
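
The status registration and rollback can be sketched roughly as follows. This is an illustration under stated assumptions, not the implementation of any particular product: each shard is an in-memory SQLite database, `make_shard` and `run_distributed` are hypothetical helpers, the status labels are assumed names, and the sub-transaction on each shard is deliberately left uncommitted until every status is known (otherwise the OK shards could no longer be rolled back). The timeout is only checked after the statement returns, which is a simplification.

```python
import sqlite3
import time

OK, FAILED = "OK", "FAILED"  # status labels (assumed names)


def make_shard(rows):
    """Create one stand-alone shard (an in-memory SQLite database)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE t (id INTEGER PRIMARY KEY, qty INTEGER CHECK (qty >= 0))"
    )
    conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
    conn.commit()
    return conn


def run_distributed(shards, gtid, sql_id, sql, params=(), timeout=1.0):
    """Run one statement on every shard; commit everywhere or roll back everywhere."""
    status = {}
    for name, conn in shards.items():
        start = time.monotonic()
        try:
            conn.execute(sql, params)  # sub-transaction stays open (uncommitted)
            elapsed = time.monotonic() - start
            status[name] = OK if elapsed <= timeout else FAILED
        except sqlite3.Error:
            status[name] = FAILED
    all_ok = all(s == OK for s in status.values())
    for conn in shards.values():
        if all_ok:
            conn.commit()
        else:
            conn.rollback()  # undoes the work on the shards that were OK
    return {"GTID": gtid, "SQL_ID": sql_id, **status}


shards = {
    "SHARDING01": make_shard([(1, 10)]),
    "SHARDING02": make_shard([(2, 10)]),
    "SHARDING03": make_shard([(3, 2)]),
    "SHARDING04": make_shard([(4, 1)]),
}
# The decrement violates the CHECK constraint on shards 03 and 04.
record = run_distributed(shards, "00X1", "S02", "UPDATE t SET qty = qty - 5")
qty_shard1 = shards["SHARDING01"].execute("SELECT qty FROM t").fetchone()[0]
```

Here `record` holds one status per shard for the given GTID and SQL_ID, and because two shards failed, the change on the two OK shards is undone as well. A multi-statement transaction would keep one such record per SQL_ID and commit only when all of them are OK.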

Consistency

At first sight this should not be a problem in distributed mode, because each query, once issued to an underlying database, is served through that database's MVCC.

When a row locked by a transaction is encountered, the old version of the data is read from the MVCC UNDO.

But problems like the following still occur. For example, two shards are read at time T1 and the other two shards are read at time T2, and in between the data on those two shards has been modified and committed.

Because the query reaches the latter two shards at a different time, the data read from the four shards is mutually inconsistent.

The design of ZTE DB is to create a GTID in the upper layer and give each underlying table a hidden GTID column; each query also obtains a GTID, which is then compared with the GTID of each row.

If the data has been updated, the read fails. Since the MVCC mechanism is not used, writes effectively block reads.
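
A rough sketch of that comparison is shown below. It assumes GTIDs are monotonically increasing integers and that each row stores the GTID of the transaction that last wrote it; `read_shard` and `StaleSnapshotError` are hypothetical names, and the actual ZTE DB mechanism may differ.

```python
class StaleSnapshotError(Exception):
    """Raised when a row is newer than the GTID the query obtained."""


def read_shard(rows, query_gtid):
    """Return row values, failing if any row was written after query_gtid."""
    result = {}
    for key, (value, row_gtid) in rows.items():
        if row_gtid > query_gtid:
            raise StaleSnapshotError(
                f"row {key!r} has GTID {row_gtid} > {query_gtid}"
            )
        result[key] = value
    return result


# Each row stores (value, hidden GTID of the transaction that last wrote it).
shard_a = {"x": (10, 5)}
shard_b = {"y": (20, 5)}

query_gtid = 7                       # GTID obtained by the query at time T1
first = read_shard(shard_a, query_gtid)

shard_b["y"] = (99, 8)               # another transaction commits with GTID 8
try:
    second = read_shard(shard_b, query_gtid)   # read at time T2
except StaleSnapshotError:
    second = None                    # the read fails instead of mixing versions
```

The query never returns a mix of old and new data across shards, but the price is that a concurrent write makes the read fail.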

To avoid this, the source code of MYSQL or PG has to be modified: the interface must be extended to support rollback (historical) queries based on the GTID value, that is, MVCC snapshot reads.

Likewise, repeatable read and protection against phantom reads are harder to achieve in a distributed environment, because the MVCC mechanism of the underlying databases (MYSQL, PG, ORACLE) cannot be used directly.

Therefore, it is necessary to modify the MYSQL kernel and add a distributed MVCC interface; the compute-node middleware above then issues transactions and query statements through this interface.

Each transaction and query statement carries a time marker such as a GTID or SCN. The underlying database adds a column to each row of data to store the GTID (or SCN, TIMESTAMP).

A statement issued through the distributed interface decides whether to read from the snapshot according to the hidden GTID. Statements that do not go through the distributed interface use the original MYSQL transaction ID or ORACLE's SCN.
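
The idea of a GTID-based snapshot read can be sketched as follows. This is a toy model under assumptions: GTIDs are increasing integers, every committed version of a row is kept together with its GTID, and `VersionedRow` is a hypothetical class, not an interface of MYSQL, PG, or ORACLE.

```python
import bisect


class VersionedRow:
    """Keeps every committed version of a row, ordered by GTID."""

    def __init__(self):
        self.gtids = []    # ascending GTIDs
        self.values = []   # values[i] was written by gtids[i]

    def write(self, gtid, value):
        # Assumes writes arrive in increasing GTID order.
        self.gtids.append(gtid)
        self.values.append(value)

    def read(self, snapshot_gtid):
        """Return the newest value with GTID <= snapshot_gtid, or None."""
        i = bisect.bisect_right(self.gtids, snapshot_gtid)
        return self.values[i - 1] if i else None


row_on_shard1 = VersionedRow()
row_on_shard2 = VersionedRow()
row_on_shard1.write(5, "a1")
row_on_shard2.write(5, "b1")

snapshot = 7                             # GTID carried by the query
early = row_on_shard1.read(snapshot)     # read at T1
row_on_shard2.write(8, "b2")             # committed between T1 and T2
late = row_on_shard2.read(snapshot)      # read at T2 still sees "b1"
```

Because both shards answer against the same GTID, the query sees versions from one point in time even though the shards are read at different moments, and the writer is not blocked.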

Origin blog.51cto.com/15080028/2643029