Database transactions and locks

A database transaction is a set of business-processing logic that must either succeed or fail as a whole (atomicity), must leave the data in a consistent state before and after execution (consistency), must not be affected by the processing of other concurrent transactions (isolation), and must make its changes permanent once it commits (durability). A specific database product uses various technical means to implement these properties: for example, the undo log implements rollback, and different locking strategies implement different degrees of isolation. A clear understanding of these implementation details helps in correctly designing and implementing complex business logic. The rest of this article focuses on how transaction isolation is achieved.

 

Transaction isolation is essentially about how concurrent reads and writes of database data are controlled. Concurrent reading and writing can cause the following problems:

  • Dirty reads
  • Non-repeatable reads
  • Phantom reads
  • Lost updates caused by transaction rollback (the first type of lost update)
  • Lost updates caused by writes overwriting each other (the second type of lost update)

Controlling concurrent access therefore amounts to solving the problems above. Because different problems require different degrees of locking, transaction isolation is divided into four levels:

  • Read Uncommitted (RU): a transaction may read data that other transactions have not yet committed
  • Read Committed (RC): solves dirty reads
  • Repeatable Read (RR): solves dirty reads and non-repeatable reads
  • Serializable: solves dirty reads, non-repeatable reads, and phantom reads

(The discussion below is based on MySQL.)

Database locks are divided into:

  • Shared lock (read lock, or S lock): if transaction T holds a shared lock on an object, any transaction may also take a shared lock on it and read the data, but no transaction may take an X lock on it or modify it
  • Exclusive lock (write lock, or X lock): if transaction T holds an exclusive lock on an object, only T may modify the data, and no other transaction may take an S or X lock on it until T releases its lock
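The S/X compatibility rules above can be condensed into a small matrix. The following is a minimal sketch (the function and names are my own illustration, not MySQL internals):

```python
# S/X lock compatibility: a requested lock is granted only if it is
# compatible with every lock already held by other transactions.
COMPATIBLE = {
    ("S", "S"): True,   # shared locks coexist
    ("S", "X"): False,  # a held X lock blocks a new S lock
    ("X", "S"): False,  # a held S lock blocks a new X lock
    ("X", "X"): False,  # exclusive locks never coexist
}

def can_grant(requested: str, held_by_others: list) -> bool:
    """Return True if `requested` ('S' or 'X') is compatible with all held locks."""
    return all(COMPATIBLE[(requested, h)] for h in held_by_others)
```

For example, `can_grant("S", ["S", "S"])` is granted, while any request against a held X lock is blocked.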

Depending on the locked object, there are:

  • Table lock: locks the entire table; generally used by DDL operations
  • Record lock (index record lock): locks a single index record
  • Gap lock: locks an interval between index records
  • Next-key lock: a record lock plus a gap lock

In the simplest approach, achieving RC or RR isolation would require taking a shared lock on every record read and an exclusive lock on every record written; preventing phantom reads would even require table-level locking and full serialization. That is unacceptably inefficient, so to improve concurrency many databases implement MVCC (multi-version concurrency control). Its basic principles are as follows:

Each data record carries three hidden fields: a creation version number, a deletion version number, and a rollback pointer.

Whenever the database opens a new transaction, it assigns the transaction a monotonically increasing unique transaction ID and generates a read view containing the list of transaction IDs currently active in the system. Under the RR isolation level, MVCC behaves as follows:

1. An insert sets the new record's creation version number to the current transaction ID.

2. A delete sets the record's deletion version number to the current transaction ID.

3. An update marks the current record as deleted (its deletion version number is set to the current transaction ID) and creates a new record whose creation version number is the current transaction ID.

4. A select must avoid dirty reads and non-repeatable reads, so it may only see records that already existed before the transaction opened and were either never deleted or deleted only after the transaction opened: create_version < up_limit_tid && (delete_version == null || delete_version > up_limit_tid)
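The visibility rule in step 4 can be sketched directly in code. This is a toy model of the article's simplified two-version-number scheme (field and function names are my own, not InnoDB's actual hidden columns):

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Record:
    create_version: int            # transaction ID that created this row version
    delete_version: Optional[int]  # transaction ID that deleted it, or None
    value: str

def visible(rec: Record, up_limit_tid: int) -> bool:
    """Rule from step 4: the row must have been created before our transaction
    opened, and either never deleted or deleted only after we opened."""
    created_before = rec.create_version < up_limit_tid
    not_deleted_for_us = rec.delete_version is None or rec.delete_version > up_limit_tid
    return created_before and not_deleted_for_us

def snapshot_select(rows: List[Record], up_limit_tid: int) -> List[str]:
    """A snapshot read: filter all row versions through the visibility rule."""
    return [r.value for r in rows if visible(r, up_limit_tid)]
```

With `up_limit_tid = 4`, a row created by transaction 5 is invisible, a row deleted by transaction 3 is invisible, and a row deleted by transaction 9 is still visible.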

 

Under RC, MVCC regenerates the read view before every read, so each statement sees the changes made by the most recently committed transactions.
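The RR/RC difference is exactly when the read view is taken. In this toy model the read view is reduced to a single cutoff transaction ID (a simplification of the active-transaction list described above; the class is my own illustration):

```python
class Txn:
    """Toy model: RR fixes its read-view cutoff when the transaction starts;
    RC refreshes the cutoff on every read (statement)."""
    def __init__(self, isolation, current_tid_fn):
        self.isolation = isolation
        self.current_tid_fn = current_tid_fn    # returns the latest committed tid
        self.fixed_cutoff = current_tid_fn()    # snapshot taken at txn start

    def cutoff(self):
        if self.isolation == "RR":
            return self.fixed_cutoff            # same snapshot for the whole txn
        return self.current_tid_fn()            # RC: new read view per statement

# Demo: another transaction commits while rr/rc are open.
state = {"latest": 3}
rr = Txn("RR", lambda: state["latest"])
rc = Txn("RC", lambda: state["latest"])
state["latest"] = 7   # a concurrent transaction commits
```

After the concurrent commit, `rr.cutoff()` still returns the original snapshot (repeatable reads), while `rc.cutoff()` reflects the new commit (reads are not repeatable, but never dirty).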

MVCC enables concurrent non-blocking reads. Such a non-blocking read is a snapshot read and takes no locks; a plain query is a snapshot read:

  • select * from table where ?;

In contrast to snapshot reads, current reads must take locks:

  •  select * from table where ? lock in share mode;
  • select * from table where ? for update;
  • insert into table values (…);
  • update table set ? where ?;
  • delete from table where ?;

 

With the support of MVCC, the locking of each isolation level is as follows:

Read Committed (RC)

Snapshot reads take no locks.

For current reads, the RC isolation level locks the records that are read (record locks), but phantom reads can still occur.

If an update or delete statement cannot use an index, RC locks every record it scans (effectively the whole table) and then releases the locks on records that do not match the condition. Releasing locks early like this means MySQL, for efficiency, deviates from the strict two-phase locking protocol.

 

Repeatable Read (RR)

Snapshot reads are not locked

 

For current reads, the RR isolation level locks the records that are read (record locks) and also locks the ranges that are scanned, so new records matching the query conditions cannot be inserted (gap locks); phantom reads do not occur.

 

 

Solutions for lost updates:

The root cause of lost updates is that write operations based on the same version of the data are all allowed to succeed. Both the first and second types of lost updates can be avoided with either:

1. Pessimistic locking: select * from table where ? for update;

2. Optimistic locking with a version number
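The version-number approach is a compare-and-set: the UPDATE's WHERE clause checks that the version is unchanged since it was read. A minimal sketch, using sqlite3 as a stand-in for MySQL (table and column names are my own illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER, version INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 100, 1)")

def optimistic_update(conn, account_id, new_balance, expected_version):
    """Compare-and-set: the UPDATE only matches while the version is unchanged.
    Returns True if the write won, False if another writer got there first."""
    cur = conn.execute(
        "UPDATE account SET balance = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_balance, account_id, expected_version),
    )
    return cur.rowcount == 1

# Two writers both read version 1; only the first compare-and-set succeeds,
# so the second write cannot silently overwrite the first (no lost update).
first = optimistic_update(conn, 1, 150, expected_version=1)
second = optimistic_update(conn, 1, 80, expected_version=1)  # stale version
```

The losing writer must re-read the row (picking up the new version) and retry, rather than overwriting blind.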

 

Auction-related design

Choosing a transaction isolation level:

The auction business needs repeatable reads and must prevent lost updates, but does not need to prevent phantom reads.

1. Optimistic locking solves the lost-update problem.

2. A customized update interface satisfies the repeatable-read requirement, and in some circumstances can also solve phantom reads.

With these measures the isolation requirements can be met even under RC; compared with RR there are no gap locks, which improves the system's concurrency.

 

Margin (deposit) critical states

The fundamental solution here is pessimistic locking.

 

In high-concurrency scenarios, the optimistic lock on the price can be rewritten as an inventory-style scheme:

In the statement that updates the auction-item record, replace the optimistic-lock version-number comparison with bid_price > current_price and status = 'bidding'. When bid prices are fairly spread out, this improves the system's concurrency to some extent. With a version number as the optimistic lock, if user A bids 1000 and user B bids 2000 and the two conflict, whichever bidder updates the lock first wins; if A gets there first, B's bid fails even though B bid higher.
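The price itself acts as the lock: any strictly higher bid may commit, regardless of write order. A minimal sketch, again using sqlite3 as a stand-in for MySQL (table and column names are my own illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, current_price INTEGER, status TEXT)")
conn.execute("INSERT INTO item VALUES (1, 0, 'bidding')")

def place_bid(conn, item_id, bid_price):
    """Inventory-style condition: a bid wins as long as it is strictly higher
    than the stored price, no matter which writer committed first."""
    cur = conn.execute(
        "UPDATE item SET current_price = ? "
        "WHERE id = ? AND ? > current_price AND status = 'bidding'",
        (bid_price, item_id, bid_price),
    )
    return cur.rowcount == 1

a_ok = place_bid(conn, 1, 1000)  # user A bids 1000
b_ok = place_bid(conn, 1, 2000)  # user B's higher bid still succeeds after A
low = place_bid(conn, 1, 500)    # a lower bid is correctly rejected
```

Under a version-number lock, B's bid would have failed after A's conflicting update; here the higher bid always gets through.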

 

High-availability database schemes for the auction system

Database high-availability options:

  • One writer, many readers: a single master handles writes and multiple slaves handle reads. Reads are balanced across the read replicas, so if one replica goes down it can simply be isolated, giving high availability for reads. The write node is still a single point, however, and replication from the write node to the read nodes has some delay. With this scheme, if the master fails the auction bidding business fails with it, although read-only browsing traffic can continue to be served.
  • Cold-standby master plus many readers: in normal operation this works like one-writer-many-readers; when the master fails, writes are switched to the standby machine. Writes of new data and reads of historical data can then continue, while updates to historical data may not be possible. When the master fails, the business can therefore largely keep running rather than going down entirely. This deployment is basically workable for fixed-price sales, but auction bidding is a strongly consistent, context-dependent process, so any read/write-splitting scheme in which the primary and standby may be inconsistent cannot be used.
  • Dual master: two masters accept writes simultaneously, with strong (synchronous) replication between them, so if one write node fails the other takes over. The drawback is that if replication between the masters times out or jitters, the affected business operation may be judged a failure. The key to dual master is therefore keeping the two masters' data centers as close as possible: for Alibaba data centers within 50 km, latency is roughly 1-2 ms, while at 800 km it can reach tens of milliseconds. One bid operation involves about five data writes, so with nearby data centers the added latency is roughly 10-20 ms, which is not a big problem.
  • Triple master: the current master replicates synchronously to two standby masters and can return as soon as either one acknowledges. This improves availability when the network between data centers fluctuates, without increasing the data latency.
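The triple-master commit rule — ship the write to both standbys in parallel, acknowledge on the first confirmation — can be sketched as a toy simulation (this models only the quorum logic, not MySQL's actual semi-synchronous replication):

```python
import concurrent.futures
import time

def replicate(standby, delay_s):
    """Simulated synchronous replication to one standby master."""
    time.sleep(delay_s)
    return standby

def commit_with_quorum(delays, timeout_s=1.0):
    """Replicate to both standbys in parallel; the commit is acknowledged as
    soon as the FIRST standby confirms, tolerating jitter on the other link."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(replicate, name, d) for name, d in delays.items()]
        done, _ = concurrent.futures.wait(
            futures,
            timeout=timeout_s,
            return_when=concurrent.futures.FIRST_COMPLETED,
        )
        return len(done) >= 1

# One standby is fast, the other is jittery; the commit still succeeds quickly.
ok = commit_with_quorum({"standby-1": 0.01, "standby-2": 0.5})
```

The slow link does not block the acknowledgement decision, which is exactly why this scheme raises availability during inter-datacenter network fluctuation.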
