Getting Started with Databases (MySQL): Transactions

  • MySQL server logical architecture
  • Transaction ACID
  • Transaction log
  • Transaction isolation levels
  • Deadlock
  • MVCC
  • Transactions in MySQL

 I. MySQL server logical architecture

Before looking at transactions, it helps to understand how the MySQL server is organized logically, because that explains why a transaction mechanism is needed. It also helps in understanding how transactions work: which problems they can solve, which they cannot, and where the root causes of those problems lie. All of this starts from the server's logical architecture.

1.1 MySQL server logical architecture diagram:

Source: https://blog.csdn.net/fuzhongmin05/article/details/70904190 . The same blog is also a useful reference on MySQL performance optimization.

MySQL's logical architecture is generally divided into three layers. The top layer (the part that deals with clients) is not unique to MySQL: most network-based client/server tools and services have a similar architecture. It covers connection handling, authorization and authentication, security, and so on.

The second layer is the core of the MySQL server. It includes query parsing, analysis, optimization, caching, and all built-in functions (for example date, time, math, and encryption functions). All functionality that spans storage engines is implemented in this layer: stored procedures, triggers, views, and so on.

The lowest layer is the storage engines, which are responsible for storing and retrieving MySQL's data. Like the various file systems under Linux, each storage engine has its own strengths and weaknesses. The server layer communicates with the storage engines through the storage engine API, which hides the differences between engines and makes them transparent to the upper-layer query process. The API contains dozens of low-level functions for operations such as "begin a transaction" or "fetch the row with this primary key". Storage engines do not parse SQL (InnoDB is an exception: it parses foreign key definitions, because the MySQL server itself does not implement that feature), and they do not communicate with one another; they simply respond to requests from the server layer.

1.2 Concurrency Control:

For each new client request, the server's connection/thread handling component accepts the request and creates a new thread in the server's memory. Every connection a user makes to the server gets its own thread, taken from the thread pool in the process address space, and that thread responds to the client's requests. The user's queries run in that thread's memory space, and the results are also cached there before being returned. Thread reuse and destruction are handled by the connection/thread manager.

Simply put, when a user initiates a request, the connection/thread handler sets aside memory and provides the means to run the query.

When several users issue requests at once, concurrency is inevitable. MySQL controls concurrency at two levels: the server layer and the storage engine layer. Take an email box on a Unix system as an example. In the classic mbox format, all messages in a mailbox are concatenated end to end in a single file. This format is very friendly for reading and parsing mail, and delivering mail is easy too: just append the new message to the end of the file.

But if two processes deliver mail to the same mailbox at the same time, the two messages are interleaved at the end of the mailbox file, and the concurrent writes corrupt the data. A well-designed mail delivery system prevents this with a lock. If a client tries to deliver a message while the mailbox is locked by another client, it must wait until the lock is released.

This locking scheme works well in practice, but it does not support concurrent processing. Because only one process can modify the mailbox at any given time, it becomes a problem for a high-volume mailbox system.

 1.3 Lock mechanism:

 1.3.1 Read-Write Lock

Several users reading the same mailbox at the same time is not a problem. But if one user is reading a mailbox while another is deleting message number 25, the result is undefined: the reader may exit with an error, or may read inconsistent mailbox data. If the mailbox is thought of as a database table and each message as a row, the same problem exists in a database. The solution is concurrency control: concurrent reads and writes are handled by a locking system made up of two types of locks, usually called shared locks and exclusive locks, also known as read locks and write locks.

Read lock: a read lock is shared, meaning read locks do not block each other. Multiple users can read the same resource at the same time without interfering with one another, but a read lock blocks writes.

Write lock: a write lock is exclusive, meaning it blocks both other write locks and read locks.
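
In InnoDB the two lock types can be requested explicitly with locking reads. The following is only a sketch: the `mail` table and its `id` column are hypothetical names chosen to match the mailbox analogy, and the statements are meant to be typed into two separate client sessions in the order shown.

```sql
-- Session A: take a shared (read) lock on one row.
START TRANSACTION;
SELECT * FROM mail WHERE id = 25 LOCK IN SHARE MODE;

-- Session B: a second shared lock on the same row is granted immediately.
START TRANSACTION;
SELECT * FROM mail WHERE id = 25 LOCK IN SHARE MODE;

-- Session B: a delete needs an exclusive (write) lock, so this statement
-- blocks for as long as session A still holds its shared lock.
DELETE FROM mail WHERE id = 25;

-- Session A: committing releases the shared lock and lets B's DELETE proceed.
COMMIT;

-- Session B: finish the transaction.
COMMIT;
```

An exclusive lock can also be requested directly with `SELECT ... FOR UPDATE`. MySQL 8.0 additionally accepts `FOR SHARE` as a newer spelling of `LOCK IN SHARE MODE`.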

1.3.2 Lock granularity:

Lock granularity is the scope of data that a read or write lock covers. Ideally you lock only the part of the data being modified, not the whole resource. For example, locking a row is better than locking the table. At any time, on a given resource, the less data is locked, the higher the concurrency the system can sustain, as long as the locks do not conflict with one another (conflicts lead to deadlocks, which are introduced later).

But the ideal is only the ideal, because locking itself consumes resources. Every lock operation (acquiring a lock, checking whether a lock has been released, releasing a lock, and so on) adds overhead. If the system spends a lot of time managing locks rather than accessing data, its performance suffers.

So you need to strike a balance between locking overhead and data safety. Most commercial database systems do not offer much choice: they generally apply row-level locking on tables, implemented in various complex ways to perform well when there are many locks. MySQL, by contrast, offers a variety of options: each storage engine can implement its own locking strategy and lock granularity.

Table lock: the entire table is locked before a write (insert, delete, update, and so on). Other users can obtain read locks only when no write lock is held, and read locks do not block each other.

Row-level lock: only the rows currently being operated on are locked.

Table locks are the lowest-overhead strategy, but they restrict concurrency heavily. In certain scenarios table locks can still perform well. For example, READ LOCAL table locks allow some types of concurrent write operations. In addition, write locks have higher priority than read locks, so a write lock request can be inserted in the queue ahead of read locks, whereas a read lock cannot be inserted ahead of a write lock. In that sense, table locks can suit scenarios dominated by large writes. Although storage engines manage their own locks, MySQL itself still uses various table locks for different purposes. For example, the server uses a table lock for statements such as ALTER TABLE, ignoring the storage engine's lock mechanism.
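
Table locks can also be taken by hand with `LOCK TABLES`. A minimal sketch, again using a hypothetical `mail` table:

```sql
-- Take a table-level read lock. This session may read the table but not
-- write to it; writes from other sessions wait until the lock is released.
LOCK TABLES mail READ;
SELECT COUNT(*) FROM mail;
UNLOCK TABLES;

-- Take a table-level write lock. Only this session can read or write the
-- table until the lock is released.
LOCK TABLES mail WRITE;
DELETE FROM mail WHERE id = 25;
UNLOCK TABLES;
```

While the read lock is held, an INSERT, UPDATE, or DELETE on `mail` from the locking session itself is rejected with an error rather than queued.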

Row-level locking supports concurrent processing to the greatest extent. It is implemented in InnoDB, XtraDB, and some other storage engines. Row-level locking is implemented only in the storage engine layer; the MySQL server layer does not implement it.

 II. Transaction ACID

A transaction is a group of atomic SQL queries, or an independent unit of work.

After reading that one-sentence definition you may still not know what a transaction is or why to use one. That is because a transaction is a fairly complex process: it concerns not just the data being operated on but the entire course of the operation. Before looking at transactions in detail, consider a classic case:

Suppose a bank's database has two tables: a checking table and a savings table. The customer Jane wants to transfer $200 from her checking account to her savings account, which requires at least three steps:

1. Check that the checking account balance is more than $200.

2. Subtract $200 from the checking account balance.

3. Add $200 to the savings account balance.

Under normal circumstances, executing these three SQL statements in order accomplishes the transfer. But as the earlier discussion of concurrency showed, processes started by several users may operate on these two records at the same time, which makes the following scenarios possible:

Case 1: Between step 1 and step 2, another concurrent user process modifies the balance so that it falls below $200. Step 2 then inevitably fails.

Case 2: The system crashes after step 2 executes and step 3 is never carried out. The user inevitably loses $200.

Case 3: After step 2 executes, another user process wipes out the entire checking account balance, so the bank may end up giving Jane $200 for nothing.

How a database system actually solves these problems is not discussed here. The real implementation is a very complex data storage problem that a blog post cannot cover clearly, and at the application development level there is no need to understand all of it, because the solutions follow a strict standard. As application developers we only need to know which scenarios the standard applies to, what problems can arise under what circumstances, and how closely a database system implements the standard in practice, so that we can design our databases accordingly.

Normally the three steps are wrapped into one transaction. But using a transaction does not by itself guarantee that the three steps complete correctly, unless the system passes the strict ACID test.
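
The three steps wrapped into one transaction might look like the sketch below. The `balance` and `customer_id` columns and the customer id 1 are assumptions made for illustration; the original only names the two tables.

```sql
START TRANSACTION;

-- Step 1: read the balance with a locking read, so that no other
-- transaction can change this row before we commit.
SELECT balance FROM checking WHERE customer_id = 1 FOR UPDATE;

-- Step 2: subtract $200 from checking (the application continues only
-- if step 1 showed a sufficient balance).
UPDATE checking SET balance = balance - 200.00 WHERE customer_id = 1;

-- Step 3: add $200 to savings.
UPDATE savings SET balance = balance + 200.00 WHERE customer_id = 1;

-- Make both changes permanent together. On any error the application
-- would issue ROLLBACK instead.
COMMIT;
```

The `FOR UPDATE` in step 1 is one possible way to rule out Case 1; whether it is needed depends on the isolation level and on how the application checks the balance.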

ACID: the four basic properties required for database transactions to execute correctly: atomicity, consistency, isolation, and durability.

Atomicity: a transaction is an indivisible unit of work. Its operations either all succeed and are committed or all fail and are rolled back; executing only part of them is impossible. For example, in Case 1, if step 2 fails, step 3 is not executed.

Consistency: the database always moves from one consistent state to another consistent state. For example, in Case 2, if the system crashes between steps 2 and 3, the checking account does not lose $200, because step 3 never ran and the transaction is therefore never committed. How this is achieved is explained later.

Isolation: changes made by a transaction are not visible to other transactions until it is finally committed. For example, when step 2 has completed but step 3 has not yet run, another user process querying the checking record does not see the $200 deduction. This holds only in theory, however: it depends on the transaction isolation level, which is explained in detail later.

Durability: once a transaction is committed, its changes are saved permanently in the database, and the modified data is not lost even if the system crashes. Durability comes in many levels, of course: some durability strategies provide very strong data safety and some do not, so durability is relative. Otherwise, what would be the point of data backups?

 III. Transaction log

The transaction log is a file that stores the changes made to the database. It records inserts, updates, deletes, commits, rollbacks, and database schema changes. It is an important component of data backup and recovery, and it is also required by SQL Remote or a [Replication Agent] to replicate data.

The transaction log can be said to be the foundation of transactions. So what exactly is it, and how does it work?

The transaction log comprises the redo log, the undo log (rollback log), and the log group.

redo log: for every SQL statement in a transaction, MySQL first records all the data operations involved in the redo log and then synchronizes them to the corresponding data files. That is, before a record in a database file is modified, all the corresponding modifications must already be recorded in the redo log. The redo log has two parts: the redo log buffer and the redo log file.

  • redo log buffer: the log is written to a buffer in memory, which is much faster than writing directly to disk. But memory cannot satisfy the durability requirement: if the system crashes at this point, data is lost.
  • redo log file: the log is written to disk, which is much slower than the redo log buffer, but a log on disk satisfies the basic durability requirement.

To balance performance and durability, a transaction's log is generally written to memory first (the redo log buffer) and then synchronized from memory to disk (the redo log file). Disk I/O is much slower than memory, but it is still much faster than the scattered writes to the database files, because the redo log file occupies contiguous space on disk. Once the operations are recorded in the redo log file they are synchronized to the database files, and that is the whole process by which a transaction writes data to the database.

Flushing the redo log buffer to the redo log file guarantees durability, but it also reduces performance noticeably. The innodb_flush_log_at_trx_commit parameter changes the policy for writing from the redo log buffer to the redo log file, but relaxing it sacrifices some durability and makes data loss possible. Which flush policy to use is a trade-off that depends on the actual situation.
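
As a rough illustration, the flush policy can be inspected and changed at runtime. The value 2 below is just an example setting, not a recommendation, and changing a global variable requires the appropriate administrative privilege.

```sql
-- Show the current flush policy (the default is 1).
SHOW GLOBAL VARIABLES LIKE 'innodb_flush_log_at_trx_commit';

-- 1: write and flush the redo log to disk at every commit (full durability).
-- 0: write and flush about once per second; a crash of mysqld can lose
--    roughly the last second of transactions.
-- 2: write at every commit but flush about once per second; an operating
--    system crash or power loss can lose roughly the last second.
SET GLOBAL innodb_flush_log_at_trx_commit = 2;
```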

The redo log is a physical log: it records operations on database pages rather than logical inserts, deletes, updates, and queries. The redo log is idempotent.

undo log: the rollback log, which can be understood as a backup of the data taken before it is modified. If a transaction fails partway, with some of its SQL not executed successfully, the database can undo it according to the undo log, restoring all modified data logically. For example, if 100 rows were inserted, they are deleted; if 50 rows were updated, they are updated back according to the undo log. The undo log is therefore a logical log, unlike the redo log, which records physical page operations.

log group: a redo log group contains multiple redo log files. When one log file in the group is full, the log is written to the next redo log file in the group. When all the redo log files in the group are full, the log wraps around to the first redo log file and overwrites the old redo log with the new one. If the device holding the redo log fails, the redo log is lost, so the redo log is not guaranteed to be available at all times. For this reason log groups also support mirrored log groups, and log groups are generally placed on devices with redundancy, such as RAID 1.

The redo log is stored in the redo log files, while the undo log is stored in a special segment inside the database called the undo segment, which is located in the shared tablespace. Both the redo log and the undo log are means of recovering the database.

In MySQL, the InnoDB storage engine supports transactions and the MyISAM storage engine does not. Both the redo log and the undo log are products of InnoDB. MySQL has another important log, the binary log (binlog), which is required in a MySQL master-slave replication environment. The binlog is not a storage-engine-level log: changes made to the database through any storage engine are recorded in the binary log.

The redo log and the binlog are written to disk at different times: InnoDB keeps writing to the redo log file while the transaction is in progress, whereas the binlog is written to disk only once, when the transaction commits.

View the InnoDB storage engine's log configuration parameters:

show global variables like '%innodb%log%';

For descriptions of the individual log parameters, see this blog: http://www.zsythink.net/archives/1216 . For details, see also the official documentation: https://dev.mysql.com/doc/refman/5.7/en/innodb-storage-engine.html

 IV. Transaction isolation levels

The previous section showed how the transaction log provides atomicity and durability. One essential ACID property, isolation, has not been covered yet. So what is isolation? Why is it needed, and how do the isolation levels differ?

Isolation means that the operations inside a transaction are isolated from other transactions, so that concurrently executing transactions cannot interfere with each other. The isolation levels are:

  • READ UNCOMMITTED (uncommitted read)
  • READ COMMITTED (read committed)
  • REPEATABLE READ (repeatable read)
  • SERIALIZABLE (serializable)

 4.1: READ UNCOMMITTED (uncommitted read)

Read uncommitted means that changes made in a transaction are visible to other transactions even before the transaction commits. Suppose one user process has started transaction A and another user process has started transaction B. Transaction A executes a data modification statement but has not yet committed, and transaction B immediately reads the data that A has just modified. Under the READ UNCOMMITTED isolation level, B reads the data as modified by A.

In terms of performance READ UNCOMMITTED is not much better than the other levels, yet it lacks many of their benefits, so it is rarely used unless really necessary.

A dirty read is a read of data that another transaction has modified but not yet committed. The harm is not in the read itself: it appears when the modifying transaction rolls back, because the reader has then seen data that was never committed, and if it acts on that data it can write inconsistent values back to the database.
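
A dirty read can be reproduced with two client sessions, roughly as follows. The sketch reuses the t_employee table (columns ename and sal) from the deadlock example later in this article; the salary value 9999 is arbitrary.

```sql
-- Session B: allow dirty reads for this session only.
SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

-- Session A: modify a row but do not commit yet.
START TRANSACTION;
UPDATE t_employee SET sal = 9999 WHERE ename = 'JONES';

-- Session B: this read returns sal = 9999 although A has not committed.
START TRANSACTION;
SELECT sal FROM t_employee WHERE ename = 'JONES';

-- Session A: undo the change. The value session B saw never became
-- part of the committed data.
ROLLBACK;

-- Session B: the same query now returns the original salary.
SELECT sal FROM t_employee WHERE ename = 'JONES';
COMMIT;
```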

4.2: READ COMMITTED (read committed)

From the moment it begins, a transaction can only "see" changes made by transactions that have already committed. In other words, whatever a transaction does is invisible to other transactions until it commits. This level is sometimes called non-repeatable read, because running the same query twice within one transaction may return different results.
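
A non-repeatable read can be observed with two sessions, for example as sketched below. It assumes the t_employee table used elsewhere in this article and that autocommit is enabled in session B (the MySQL default).

```sql
-- Session A: switch to READ COMMITTED and read a row inside a transaction.
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
START TRANSACTION;
SELECT sal FROM t_employee WHERE ename = 'BLAKE';   -- first read

-- Session B: change the row; autocommit commits the statement at once.
UPDATE t_employee SET sal = sal + 100 WHERE ename = 'BLAKE';

-- Session A: the same query now returns the newly committed value,
-- so the first read was not repeatable.
SELECT sal FROM t_employee WHERE ename = 'BLAKE';   -- second read
COMMIT;
```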

4.3: REPEATABLE READ (repeatable read)

This level guarantees that reading the same record multiple times within the same transaction returns the same result. In theory, however, the REPEATABLE READ isolation level cannot solve another problem: phantom reads.

Phantom read: when a transaction reads the records in some range and another transaction inserts a new record into that range, the first transaction sees phantom rows when it reads the range again. The InnoDB and XtraDB storage engines solve the phantom read problem with multi-version concurrency control (MVCC).

REPEATABLE READ is MySQL's default transaction isolation level.

4.4: SERIALIZABLE (serializable)

SERIALIZABLE is the highest isolation level. It avoids the phantom read problem described above by forcing transactions to execute serially. Simply put, SERIALIZABLE places a lock on every row it reads, which can cause many timeouts and a lot of lock contention. This level is rarely used in practice; consider it only when data consistency is absolutely required and having no concurrency is acceptable.
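
To see which level is in effect and to switch between the four levels, something like the following can be used. It is a sketch against the t_employee table from the deadlock example; the LIKE pattern is used because the variable name differs between MySQL versions.

```sql
-- Show the isolation level variable; the pattern matches both the older
-- name tx_isolation and the newer name transaction_isolation.
SHOW VARIABLES LIKE '%isolation%';

-- Change the level for the rest of the current session.
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;

-- Without SESSION or GLOBAL, the new level applies to the next
-- transaction only.
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
START TRANSACTION;
SELECT COUNT(*) FROM t_employee;
COMMIT;
```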

———————————————————————————————————————

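ANSI SQL isolation levels:

Isolation level     Dirty read   Non-repeatable read   Phantom read
READ UNCOMMITTED    possible     possible              possible
READ COMMITTED      no           possible              possible
REPEATABLE READ     no           no                    possible
SERIALIZABLE        no           no                    no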

 V. Deadlock

A deadlock occurs when two or more transactions each hold a lock on a resource and request a lock on a resource held by the other, producing a circular wait. Deadlocks can arise when multiple transactions lock resources in different orders, and also when multiple transactions lock the same resource at the same time. Consider the following example:

Transaction 1:

start transaction ;
update t_employee set sal=5000 where ename='JONES';
update t_employee set sal=4500 where ename='BLAKE';
commit ;

Transaction 2:

start transaction ;
update t_employee set sal=6000 where ename='BLAKE';
update t_employee set sal=3800 where ename='JONES';
commit ;

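You can try opening two consoles connected to the database and executing the statements in the following order:

 At the third step, both processes wait for a while and then report a timeout error: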

ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
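
One interleaving that makes the two transactions above wait on each other can be sketched as follows. This ordering is an assumption rather than a record of the author's exact test, and it assumes t_employee has an index on ename so that each UPDATE locks only the matching row.

```sql
-- Step 1, session 1: lock the JONES row.
START TRANSACTION;
UPDATE t_employee SET sal = 5000 WHERE ename = 'JONES';

-- Step 2, session 2: lock the BLAKE row.
START TRANSACTION;
UPDATE t_employee SET sal = 6000 WHERE ename = 'BLAKE';

-- Step 3, session 1: needs the BLAKE row, which session 2 holds, so it waits.
UPDATE t_employee SET sal = 4500 WHERE ename = 'BLAKE';

-- Step 4, session 2: needs the JONES row, which session 1 holds.
-- Each session now waits for the other: a deadlock.
UPDATE t_employee SET sal = 3800 WHERE ename = 'JONES';

-- Whichever session receives an error should roll back and retry.
ROLLBACK;
```

With InnoDB's deadlock detection enabled, one session typically receives ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction. A plain lock wait that is never resolved ends instead with ERROR 1205 once innodb_lock_wait_timeout elapses.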

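To deal with deadlocks, database systems implement various deadlock detection and deadlock timeout mechanisms. More sophisticated systems, such as the InnoDB storage engine, can detect the circular dependency of a deadlock and return an error immediately. The other approach is to give up the lock request once the query reaches the configured lock wait timeout, which is less friendly. InnoDB currently handles a deadlock by rolling back the transaction that holds the fewest row-level exclusive locks.

Lock behavior and ordering depend on the storage engine: executing statements in the same order causes a deadlock in some storage engines and not in others. Deadlocks have two kinds of causes: some result from real data conflicts, which are usually hard to avoid, while others are purely a consequence of how the storage engine is implemented.

In the example test above, the transaction isolation level is REPEATABLE READ and the storage engine is InnoDB.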

 

 For a more detailed analysis of deadlocks and their solutions, see this blog: https://blog.csdn.net/qq_36132127/article/details/81272293

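 VI. MVCC

MVCC stands for Multi-Version Concurrency Control. Most of MySQL's transactional storage engines do not use plain row-level locking; to improve concurrency they generally implement multi-version concurrency control as well. This is not limited to MySQL: other database systems such as Oracle and PostgreSQL also implement MVCC, but their mechanisms differ, because there is no unified standard for implementing MVCC.

In many cases MVCC avoids locking altogether, so its overhead is lower. Although the mechanisms differ, most implementations provide non-blocking reads, and writes lock only the necessary rows. MVCC works by keeping a snapshot of the data as of some point in time: no matter how long a transaction runs, the data it sees is consistent at every point within the transaction.

As noted above, MVCC is implemented differently in different storage engines; typical variants are optimistic concurrency control and pessimistic concurrency control.

InnoDB implements MVCC by storing two hidden columns with each row: one holds the row's creation time and the other its expiration time (or "deletion time"). What is stored is not an actual time value but the system version number. Here is how InnoDB's SQL operations actually work with these two hidden columns:

INSERT: saves the current system version number as the row version of each newly inserted row.

DELETE: saves the current system version number as the deletion marker of each deleted row.

UPDATE: inserts a new row whose row version is the current system version number, and saves the current system version number on the original row as its deletion marker. (In other words, modifying data actually adds a new row and deletes the original one.)

SELECT: finds rows whose version is earlier than or equal to the current transaction's version number and whose deletion version is undefined (or greater than the current transaction's version). In other words, it finds records older than the current transaction's version that the transaction itself has not deleted or modified. Records the current transaction inserted before the query are visible, records it modified before the query are visible in their modified form, and deletions it made before the query are reflected as well.

It follows from the above that repeatable read does not mean that two identical read statements inside a transaction always return exactly the same result.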
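
The snapshot behavior, and the way a transaction's own writes change what it reads, can be tried out with two sessions. A sketch, assuming the t_employee table from the deadlock example, the default REPEATABLE READ level, and autocommit enabled in session B:

```sql
-- Session A: the first plain SELECT in the transaction establishes the snapshot.
START TRANSACTION;
SELECT sal FROM t_employee WHERE ename = 'JONES';

-- Session B: commit a change to a different row.
UPDATE t_employee SET sal = sal + 100 WHERE ename = 'BLAKE';

-- Session A: a plain SELECT still reads from the snapshot, so B's
-- committed change to BLAKE is not visible here.
SELECT sal FROM t_employee WHERE ename = 'BLAKE';

-- Session A: change a row itself, then repeat the first query.
UPDATE t_employee SET sal = sal + 1 WHERE ename = 'JONES';
SELECT sal FROM t_employee WHERE ename = 'JONES';   -- shows A's own update

COMMIT;
```

The last SELECT returns a different value from the first one even though both run in the same REPEATABLE READ transaction, because a transaction always sees its own changes.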

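 VII. Transactions in MySQL

By default MySQL treats every SQL statement as a single-statement transaction and commits it automatically. The following statements show whether autocommit is enabled globally and for the current session.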

show global variables like 'autocommit%';
show session variables like 'autocommit%';

Autocommit can also be turned off manually for the current session:

set autocommit=0;

If autocommit has been turned off manually, you must issue COMMIT yourself after executing SQL data manipulation statements.

 

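 Besides the implicit transaction around each auto-committed statement, you can start and commit transactions manually. The syntax of MySQL's transaction control statements is: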

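START TRANSACTION | BEGIN [WORK]
COMMIT [WORK] [AND [NO] CHAIN] [[NO] RELEASE]
ROLLBACK [WORK] [AND [NO] CHAIN] [[NO] RELEASE]

START TRANSACTION or BEGIN: starts a transaction. Because BEGIN ... END is also used in stored procedures as the keywords that wrap a block of SQL statements, START TRANSACTION is generally used to start a transaction, to avoid confusion.

COMMIT or COMMIT WORK: commits the transaction. That is, the changes made to the data by the SQL statements between START TRANSACTION and COMMIT become permanent.

ROLLBACK or ROLLBACK WORK: rolls back the transaction. That is, it undoes the data operations performed in the transaction before the rollback statement; operations performed after it are not undone, which deserves attention.

In addition to these three statements, savepoints can be used to control rollback.

SAVEPOINT identifier              # create a savepoint
ROLLBACK TO SAVEPOINT identifier  # roll back to that savepoint
RELEASE SAVEPOINT identifier      # delete a savepoint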
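
A short sketch of a savepoint in use, based on the t_employee table from the deadlock example (the savepoint name after_jones is arbitrary):

```sql
START TRANSACTION;

-- This change is kept.
UPDATE t_employee SET sal = 5000 WHERE ename = 'JONES';

-- Mark a point the transaction can return to.
SAVEPOINT after_jones;

-- This change is undone by the partial rollback below.
UPDATE t_employee SET sal = 4500 WHERE ename = 'BLAKE';

-- Undo everything done after the savepoint; the transaction stays open.
ROLLBACK TO SAVEPOINT after_jones;

-- Only the JONES update becomes permanent.
COMMIT;
```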

I originally meant to write an example here, but it no longer seems necessary. If you need one, see the following two blogs:

http://www.zsythink.net/archives/1216

https://www.cnblogs.com/Yiran583/p/7125455.html

Origin www.cnblogs.com/ZheOneAndOnly/p/12148946.html