MySQL Technical Insider: InnoDB Storage Engine study notes, Chapter 7: Transactions

A transaction takes the database from one consistent state to another. When a transaction commits, the database guarantees that either all of its modifications are saved or none of them are.

InnoDB transactions fully conform to the ACID properties:
1. Atomicity: the entire transaction is an indivisible unit of work.
2. Consistency: the database's integrity constraints are not violated before or after the transaction.
3. Isolation: the effects of a transaction are invisible to other transactions until it commits. Isolation is implemented with locks.
4. Durability: once a transaction commits, its result is permanent. Even if the server crashes, the data can be recovered.

Atomicity, consistency, and durability are accomplished through redo and undo.

InnoDB's transaction log is implemented with the redo log files and the InnoDB log buffer. When a transaction starts, an LSN (Log Sequence Number) is recorded for it. While the transaction executes, its log records are inserted into the InnoDB log buffer. When the transaction commits, the log buffer must be written to disk (this is the behavior under the default setting, innodb_flush_log_at_trx_commit=1). In other words, the log is written before the data pages. This approach is called Write-Ahead Logging (WAL).
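
As a quick check of the flush policy mentioned above, the setting can be inspected from any session. This is a minimal sketch. The meanings in the comments are the documented ones: 1 flushes the log to disk at every commit, while 0 and 2 defer the flush to roughly once per second.

```sql
-- Show the current redo log flush policy (1 is the default)
SHOW VARIABLES LIKE 'innodb_flush_log_at_trx_commit';

-- Optional, needs sufficient privileges: trade some durability for speed.
-- With 2 the log is written at each commit but flushed to disk about once per second.
-- SET GLOBAL innodb_flush_log_at_trx_commit = 2;
```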

InnoDB uses WAL to guarantee transaction integrity, which means that the data pages on disk and the pages in the in-memory buffer pool are not always in sync. A modification to a page in the buffer pool is first written to the redo log file, and the page itself is written to disk later, asynchronously. The following command shows the gap between the current disk state and the log:

SHOW ENGINE INNODB STATUS;

An example of the gap between disk contents and log contents:

CREATE TABLE z (
    a      INT,
    PRIMARY KEY(a)
) ENGINE = InnoDB;

DELIMITER $$

CREATE PROCEDURE load_test(count INT)
BEGIN
    DECLARE i INT UNSIGNED DEFAULT 0;
    -- Insert all rows inside a single transaction
    START TRANSACTION;
    WHILE i < count DO
        INSERT INTO z SELECT i;
        SET i = i + 1;
    END WHILE;
    COMMIT;
END$$

DELIMITER ;

First look at the current redo log state in the LOG section of the SHOW ENGINE INNODB STATUS output. There, Log sequence number is the current LSN, Log flushed up to is the LSN that has been flushed to the redo log file, and Last checkpoint at is the LSN up to which pages have been flushed to disk. Then call the stored procedure to insert data and check the status again.
After the insert, the values of Log flushed up to and Last checkpoint at are no longer equal. Although Log sequence number and Log flushed up to are equal in this example, they may differ in a production environment, because flushing from the log buffer to the redo log file does not happen only at transaction commit. It can also happen at other times, for example when one redo log file in the redo log file group fills up and InnoDB switches to the next file.
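
The comparison described above can be reproduced roughly as follows. This is a sketch that assumes the table z and the procedure load_test defined earlier. The row count 10000 is an arbitrary choice, and the actual LSN values depend on the server.

```sql
-- Before loading: note Log sequence number, Log flushed up to
-- and Last checkpoint at in the LOG section of the output
SHOW ENGINE INNODB STATUS;

-- Insert the rows in one transaction
CALL load_test(10000);

-- After loading: compare the same three values again
SHOW ENGINE INNODB STATUS;
```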

Modifying the database generates undo information in addition to redo. When a transaction or statement fails, or when the user requests a rollback with the ROLLBACK command, the undo information is used to restore the data to its state before the modification. Undo is stored in undo segments inside the shared tablespace.

Undo restores the database logically, not physically. For example, a transaction may modify a few records in a page while other transactions modify other records in the same page. The page therefore cannot simply be reverted to its state at the start of the transaction, because that would affect the other transactions. Similarly, when a transaction inserts a large amount of data and causes new segments to be allocated, the tablespace grows. A ROLLBACK undoes the inserts, but the tablespace does not shrink.

Even after the transaction that modified the data has committed, its undo pages remain for a while. This is because undo pages are reclaimed by the master thread, and the master thread does not reclaim all undo pages at once.

Under the default configuration, the MySQL command line runs in autocommit mode, so to start a transaction from the command line you must use BEGIN, START TRANSACTION, or SET AUTOCOMMIT=0. SQL Server behaves the same way, whereas Oracle does not autocommit by default.

MySQL transaction control statements:
1. START TRANSACTION | BEGIN: explicitly start a transaction.
2. COMMIT: commit the transaction. COMMIT WORK is the same statement with the optional WORK keyword. The completion_type parameter controls what happens when the transaction completes. When it is 0 (the default), COMMIT simply commits. When it is 1, COMMIT is equivalent to COMMIT AND CHAIN, which immediately opens a new transaction with the same isolation level. When it is 2, COMMIT is equivalent to COMMIT RELEASE, which disconnects from the server after the transaction commits.
3. ROLLBACK: roll back the transaction and undo its modifications. ROLLBACK WORK is the same statement with the optional WORK keyword, and completion_type affects it in the same way as COMMIT.
4. SAVEPOINT identifier: create a savepoint in a transaction. A transaction can have multiple savepoints.
5. RELEASE SAVEPOINT identifier: delete a savepoint. If the identifier does not exist, the statement raises an error.
6. ROLLBACK TO [SAVEPOINT] identifier: used together with SAVEPOINT to roll the transaction back to the specified savepoint. If the identifier does not exist, an error is raised. Unlike ROLLBACK, this statement does not end the transaction, so the transaction must still be ended explicitly afterwards.
7. SET TRANSACTION: set the transaction isolation level.
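
The savepoint statements in the list can be tried with a short session like the one below. It is only a sketch: the table name sp_demo and the savepoint name s1 are made up for illustration.

```sql
-- Hypothetical table for illustration
CREATE TABLE sp_demo (a INT PRIMARY KEY) ENGINE = InnoDB;

START TRANSACTION;
INSERT INTO sp_demo VALUES (1);
SAVEPOINT s1;
INSERT INTO sp_demo VALUES (2);
ROLLBACK TO SAVEPOINT s1;   -- undoes the insert of 2; the transaction stays open
RELEASE SAVEPOINT s1;       -- removes the savepoint; using s1 after this raises an error
COMMIT;                     -- ends the transaction; only row 1 is committed

SELECT * FROM sp_demo;
```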

In a stored procedure, BEGIN cannot be used to start a transaction explicitly, because MySQL parses BEGIN inside a stored procedure as the start of a BEGIN… END block. Only START TRANSACTION can start a transaction there.

A test with completion_type set to 1 shows that two inserts of the value 2 issued after the commit belong to one transaction, even though no statement such as BEGIN was used to start that transaction explicitly.
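
One sequence consistent with the completion_type = 1 behavior described above is sketched here. The table ct_demo is hypothetical, and the statements are meant to be typed into an interactive session, since the duplicate-key error would stop a batch script.

```sql
-- Hypothetical table for illustration
CREATE TABLE ct_demo (a INT PRIMARY KEY) ENGINE = InnoDB;

SET @@completion_type = 1;

BEGIN;
INSERT INTO ct_demo SELECT 1;
COMMIT WORK;                    -- commits, then immediately opens a chained transaction

INSERT INTO ct_demo SELECT 2;   -- runs inside the chained transaction
INSERT INTO ct_demo SELECT 2;   -- duplicate-key error; only this statement is undone
ROLLBACK;                       -- undoes the first insert of 2 as well, then chains again

SELECT * FROM ct_demo;          -- only the row 1 remains

SET @@completion_type = 0;      -- restore the default
COMMIT;                         -- close the transaction opened by the chained ROLLBACK
```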

A test with completion_type set to 2 shows that the connection to the server is closed after the transaction commits.

If a statement fails inside a transaction, that statement is rolled back automatically, but the transaction as a whole is not. Statements executed earlier in the same transaction are unaffected, and the user must still issue COMMIT or ROLLBACK manually.

The following SQL statements will produce implicit COMMIT operations:
1. DDL statements: ALTER DATABASE… UPGRADE DATA DIRECTORY NAME (updates the directory name associated with the database), ALTER EVENT (modifies an event; events run scheduled tasks), ALTER PROCEDURE, ALTER TABLE, ALTER VIEW, CREATE DATABASE, CREATE EVENT, CREATE INDEX, CREATE PROCEDURE, CREATE TABLE, CREATE TRIGGER, CREATE VIEW, DROP DATABASE, DROP EVENT, DROP INDEX, DROP PROCEDURE, DROP TABLE, DROP TRIGGER, DROP VIEW, RENAME TABLE, TRUNCATE TABLE.
2. Statements that implicitly modify the mysql database: CREATE USER, DROP USER, GRANT, RENAME USER, REVOKE, SET PASSWORD.
3. Administrative statements: ANALYZE TABLE, CACHE INDEX, CHECK TABLE, LOAD INDEX INTO CACHE, OPTIMIZE TABLE, REPAIR TABLE.

In SQL Server, by contrast, even DDL statements can be rolled back.

TRUNCATE TABLE produces the same result as a DELETE of the whole table, but it is a DDL statement and cannot be rolled back.
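
The implicit commit can be observed with a small experiment. The tables ic_demo and ic_other are hypothetical names used only for this sketch.

```sql
-- Hypothetical tables for illustration
CREATE TABLE ic_demo (a INT PRIMARY KEY) ENGINE = InnoDB;

START TRANSACTION;
INSERT INTO ic_demo SELECT 1;
CREATE TABLE ic_other (a INT) ENGINE = InnoDB;   -- DDL: implicitly commits the open transaction
ROLLBACK;                                        -- nothing is left to roll back
SELECT * FROM ic_demo;                           -- the row 1 is still there

START TRANSACTION;
TRUNCATE TABLE ic_demo;                          -- DDL: implicit commit, cannot be rolled back
ROLLBACK;
SELECT * FROM ic_demo;                           -- the table stays empty
```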

Because the InnoDB engine supports transactions, we should pay attention not only to the number of requests per second (QPS, Questions Per Second) but also to the transaction throughput (TPS, Transactions Per Second). TPS is calculated as (com_commit + com_rollback) / time, where com_commit and com_rollback are MySQL status variables. Only explicitly committed or rolled-back transactions are counted in these two variables.
In MySQL 5.1, the status variables handler_commit and handler_rollback can be used to count both the explicit and the implicit transaction commit and rollback operations of the InnoDB engine, but they have a problem under the InnoDB Plugin.
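
The counters in the formula can be read as below. This is a sketch only: dividing by Uptime gives the average since server start, whereas sampling the counters twice and dividing the difference by the elapsed seconds gives the current TPS.

```sql
-- Explicit commits and rollbacks since server start
SHOW GLOBAL STATUS LIKE 'Com_commit';
SHOW GLOBAL STATUS LIKE 'Com_rollback';

-- Seconds since the server started, usable as "time" for a long-run average
SHOW GLOBAL STATUS LIKE 'Uptime';

-- Handler-level counters, which also include implicit commits and rollbacks
SHOW GLOBAL STATUS LIKE 'Handler_commit';
SHOW GLOBAL STATUS LIKE 'Handler_rollback';
```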

The ISO and ANSI SQL standards define four transaction isolation levels, but few database vendors follow them strictly. Oracle, for example, does not support the READ UNCOMMITTED and REPEATABLE READ isolation levels. The four isolation levels defined by the SQL standard are:
1. READ UNCOMMITTED
2. READ COMMITTED
3. REPEATABLE READ
4. SERIALIZABLE

READ UNCOMMITTED is called browse access. READ COMMITTED is called cursor stability. REPEATABLE READ, as defined by the standard, has no protection against phantom reads (a phantom read occurs when the same transaction runs the same query twice and the second run sees rows that the first did not). Under this isolation level, MVCC snapshot reads use the snapshot taken at the start of the transaction, so an ordinary SELECT cannot see changes that other transactions make while the current transaction is running. If the statement being executed modifies data or takes locks, such as UPDATE, DELETE, SELECT… LOCK IN SHARE MODE, or SELECT… FOR UPDATE, the read is called a current read, because these statements read the real, latest data rather than the snapshot. This can lead to a situation where a SELECT cannot find a row but a DELETE can still delete it, which is a phantom read. To prevent phantoms in current reads, the SELECT must be issued as a locking read. For a range search, MySQL then uses Next-Key Locks to lock the searched range, so other transactions cannot modify rows in it, and even current reads in the same transaction see no phantoms. For this reason the InnoDB engine under REPEATABLE READ already meets the SQL standard's SERIALIZABLE isolation requirement, and the SERIALIZABLE isolation level is mainly used for InnoDB distributed transactions.
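
The effect of a locking range read can be sketched with two sessions. The table nk_demo and its values are hypothetical. The session B statement is left commented out because it has to be run from a second connection while session A is still open.

```sql
-- Hypothetical table for illustration
CREATE TABLE nk_demo (a INT PRIMARY KEY) ENGINE = InnoDB;
INSERT INTO nk_demo VALUES (1), (2), (5);

-- Session A, under REPEATABLE READ (the InnoDB default)
START TRANSACTION;
SELECT * FROM nk_demo WHERE a > 2 FOR UPDATE;   -- locking read: next-key locks cover the range

-- Session B, in another connection while A is still open:
-- INSERT INTO nk_demo VALUES (4);              -- waits for A, or fails with a lock wait timeout

-- Session A again
SELECT * FROM nk_demo WHERE a > 2 FOR UPDATE;   -- same rows as before, no phantom row 4
COMMIT;
```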

The lower the transaction isolation level, the fewer locks the transaction requests or the shorter the lock holding time.

Set the current session or global transaction isolation level:

SET [GLOBAL | SESSION] TRANSACTION ISOLATION LEVEL
{
READ UNCOMMITTED
| READ COMMITTED
| REPEATABLE READ
| SERIALIZABLE
}

If you want to set the default transaction isolation level when MySQL starts, you need to modify the MySQL configuration file:

[mysqld]
transaction-isolation = READ-COMMITTED

View the transaction isolation level of the current session (the tx_isolation variable was removed in MySQL 8.0, where it is named transaction_isolation):

SELECT @@tx_isolation;

View the global transaction isolation level:

SELECT @@global.tx_isolation;
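
A short session combining the statements above might look like this. The sketch assumes MySQL 5.7.20 or later, where the variable transaction_isolation exists; on older servers the tx_isolation name shown above applies instead.

```sql
-- Change the level for the current session only
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;

-- Session value, now READ-COMMITTED
SELECT @@transaction_isolation;

-- Global value, not changed by the SESSION form
SELECT @@global.transaction_isolation;
```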

Under the SERIALIZABLE transaction isolation level, the InnoDB engine implicitly appends LOCK IN SHARE MODE to every SELECT, so consistent non-locking reads are no longer available at this isolation level.

Under the READ COMMITTED transaction isolation level, InnoDB uses Gap Locks only for unique constraint checks and foreign key constraint checks. In MySQL 5.1, this isolation level works only when the binary log format is ROW. If the binary log format is STATEMENT, statements that modify InnoDB tables fail with an error.

MySQL 5.0 does not support the ROW binary log format. There, the parameter innodb_locks_unsafe_for_binlog can be set to 1 in order to use the READ COMMITTED isolation level with a STATEMENT-format binary log. This is unsafe, however, and can cause data inconsistency between master and slave. Consider the following sequence on a small table:
1. Start transaction A on the master, which deletes the rows in the range less than or equal to 5, and do not commit it yet.
2. Start transaction B on the master, which inserts a row into that range, and commit it.
3. Commit transaction A.
4. Compare the data on the master with the data on the slave.
The data turns out to be inconsistent. There are two reasons for this. First, under the READ COMMITTED isolation level the transaction takes no Gap Locks, so transaction B can insert another row within the range less than or equal to 5. Second, the STATEMENT-format binary log records the SQL statements executed on the master in the order in which the transactions commit. The master therefore actually executes the delete first and then the insert, while the binary log records the insert first and then the delete, which is logically inconsistent. REPEATABLE READ avoids the first problem. Since MySQL 5.1 the ROW binary log format is supported, which avoids the second.
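
One sequence consistent with the description above is sketched here. The table rc_demo and its values are hypothetical, chosen only to match the "less than or equal to 5" range in the text. The inconsistency itself appears only with a STATEMENT-format binary log and innodb_locks_unsafe_for_binlog = 1 as described for MySQL 5.0. The two sessions must be separate connections.

```sql
-- Hypothetical table for illustration
CREATE TABLE rc_demo (b INT PRIMARY KEY) ENGINE = InnoDB;
INSERT INTO rc_demo VALUES (1), (2), (4), (5);

-- Session A on the master
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
BEGIN;
DELETE FROM rc_demo WHERE b <= 5;     -- locks the existing rows only, no gap lock

-- Session B on the master
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
BEGIN;
INSERT INTO rc_demo SELECT 3;         -- not blocked, 3 falls into an unlocked gap
COMMIT;                               -- B is written to the binary log first

-- Session A
COMMIT;                               -- A's DELETE is logged after B's INSERT

SELECT * FROM rc_demo;                -- master: the row 3 remains
-- A slave replaying the statements in binary log order (INSERT, then DELETE)
-- ends up with no rows in rc_demo.
```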

It is recommended to use the ROW binary log format, because it records row changes rather than plain SQL statements, which avoids some cases of the master and slave getting out of sync.

The InnoDB engine supports XA transactions, through which distributed transactions can be implemented. A distributed transaction allows multiple independent transactional resources to participate in one global transaction. A transactional resource is usually a relational database system, but it can also be another kind of resource. A global transaction requires that all participating transactions either commit or roll back together. When distributed transactions are used, the InnoDB transaction isolation level must be SERIALIZABLE.

XA transactions allow distributed transactions between different databases, as long as each node participating in the global transaction supports XA transactions.

A distributed transaction consists of one or more resource managers, a transaction manager, and an application program:
1. Resource manager: provides access to transactional resources. A database is usually a resource manager.
2. Transaction manager: coordinates the transactions that take part in the global transaction, and needs to communicate with every participating resource manager.
3. Application: defines the transaction boundaries and specifies the operations in the global transaction.
Distributed transactions use two-phase commit. In the first phase, every node participating in the global transaction prepares and tells the transaction manager whether it is ready to commit. In the second phase, the transaction manager tells the resource managers to perform ROLLBACK or COMMIT. If any node reports that it cannot commit, all nodes are told to roll back.
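
From the point of view of a single MySQL node, the two phases map onto the XA statements roughly as follows. This is a sketch: the table xa_demo and the identifier 'demo_xid' are made up, and in a real deployment a transaction manager, not a person at the prompt, issues these statements on every participating node.

```sql
-- Hypothetical table for illustration
CREATE TABLE xa_demo (a INT PRIMARY KEY) ENGINE = InnoDB;

-- Isolation level required by the text for distributed transactions
SET SESSION TRANSACTION ISOLATION LEVEL SERIALIZABLE;

XA START 'demo_xid';              -- open the branch of the global transaction
INSERT INTO xa_demo VALUES (1);
XA END 'demo_xid';                -- no more work in this branch

XA PREPARE 'demo_xid';            -- phase 1: this node reports it is ready to commit
XA RECOVER;                       -- lists branches that are in the prepared state
XA COMMIT 'demo_xid';             -- phase 2: commit (XA ROLLBACK 'demo_xid' would undo it)
```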

Java's JTA (Java Transaction API) supports MySQL's distributed transactions well.

The innodb_support_xa parameter can be used to check whether XA transaction support is enabled, and the default is ON.

XA support lives at the storage engine layer of MySQL. Therefore, even without any external XA transaction, different storage engines inside MySQL coordinate through XA. Suppose a local transaction is started with START TRANSACTION, a record is inserted into table t1 of the NDB Cluster engine, a record is inserted into table t2 of the InnoDB engine, and then COMMIT is issued. MySQL coordinates this commit internally with an XA transaction as well, which guarantees that the operations on the two tables are atomic.

Bad transaction habits:
1. Committing in a loop. A bad stored procedure:

CREATE PROCEDURE load1(count INT UNSIGNED)
BEGIN
DECLARE s INT UNSIGNED DEFAULT 1;
DECLARE c CHAR(80)     DEFAULT REPEAT('a', 80);
WHILE s <= count DO
	INSERT INTO t1 SELECT NULL, c;
	COMMIT;
	SET s = s + 1;
END WHILE;
END;

In the above example, the COMMIT can be removed, because the InnoDB engine autocommits by default. Even with the COMMIT removed there is a problem: when an error occurs, the database is left in an unknown intermediate state. For example, if we want to insert 10,000 rows and an error occurs at the 5,000th, the rows inserted before the error are already committed. Another problem is performance: the stored procedure above is slower than putting the whole loop into a single transaction, as follows:

CREATE PROCEDURE load3(count INT UNSIGNED)
BEGIN
DECLARE s INT UNSIGNED DEFAULT 1;
DECLARE c CHAR(80)     DEFAULT REPEAT('a', 80);
START TRANSACTION;
WHILE s <= count DO
	INSERT INTO t1 SELECT NULL, c;
	SET s = s + 1;
END WHILE;
COMMIT;
END;

Putting the whole insert loop into one transaction, as above, is much faster, because every commit has to write the redo log. Alternatively, the first stored procedure with its COMMIT removed (named load2 here) can be called inside an explicit transaction as follows, which also improves performance:

BEGIN;
CALL load2(10000);
COMMIT;

Unlike some other databases, the InnoDB engine does not require transactions to be released as soon as possible, and it does not forbid long-running transactions. It also does not have Oracle's classic problem of raising Snapshot Too Old when there is not enough UNDO. There is therefore no reason to commit repeatedly inside a loop.

2. Using autocommit. After a transaction is started explicitly, under the default settings (completion_type is 0), MySQL automatically executes SET AUTOCOMMIT=0, and then executes SET AUTOCOMMIT=1 after COMMIT or ROLLBACK.

Autocommit behavior differs between the APIs of different languages. The MySQL C API autocommits by default, whereas the MySQL Python API automatically executes SET AUTOCOMMIT=0.

3. Using automatic rollback. The InnoDB engine supports automatic rollback of a transaction by defining a HANDLER. If an error occurs in the stored procedure, the transaction is rolled back automatically, as in the following stored procedure:

CREATE PROCEDURE sp_auto_rollback_demo()
BEGIN
DECLARE EXIT HANDLER FOR SQLEXCEPTION ROLLBACK;
START TRANSACTION;
INSERT INTO b SELECT 1;
INSERT INTO b SELECT 2;
INSERT INTO b SELECT 1;
INSERT INTO b SELECT 3;
COMMIT;
END;

The above stored procedure first defines an EXIT handler that rolls back when an error is caught. Table b has a single INT primary key column, so inserting the second 1 raises an error and the automatic rollback is performed.
With this approach, however, the caller cannot tell whether the stored procedure hit an error. The following version makes that visible:

CREATE PROCEDURE sp_auto_rollback_demo()
BEGIN
DECLARE EXIT HANDLER FOR SQLEXCEPTION
	BEGIN ROLLBACK; SELECT -1; END;    -- a BEGIN END block may follow HANDLER; it runs when an exception occurs
START TRANSACTION;
INSERT INTO b SELECT 1;
INSERT INTO b SELECT 2;
INSERT INTO b SELECT 1;
INSERT INTO b SELECT 3;
COMMIT;
END;

Running the above stored procedure returns -1, which tells the caller that an exception occurred but not what kind of error it was. In SQL Server, SET XACT_ABORT ON rolls back the transaction when an exception occurs and still raises the exception, so developers can catch it and obtain the details. In MySQL, transactions should be controlled by the application program rather than inside stored procedures, so that the program can catch the exception.
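
If the rollback has to stay inside a stored procedure, one possible variant on MySQL 5.5 or later is to re-raise the error after rolling back, so that the calling program still receives the original error. This is only a sketch, not the approach of these notes: the procedure name sp_rollback_resignal_demo is made up, and table b is the single-column table used above.

```sql
DELIMITER $$

CREATE PROCEDURE sp_rollback_resignal_demo()
BEGIN
    DECLARE EXIT HANDLER FOR SQLEXCEPTION
    BEGIN
        ROLLBACK;    -- undo the whole transaction
        RESIGNAL;    -- pass the original error on to the caller
    END;
    START TRANSACTION;
    INSERT INTO b SELECT 1;
    INSERT INTO b SELECT 2;
    INSERT INTO b SELECT 1;   -- duplicate key: triggers the handler
    INSERT INTO b SELECT 3;
    COMMIT;
END$$

DELIMITER ;

-- The caller sees the duplicate-key error instead of a bare -1
CALL sp_rollback_resignal_demo();
```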

Origin blog.csdn.net/tus00000/article/details/113795242