1. Transactions
A database transaction is a sequence of database operations that reads and modifies data. It covers everything executed from the start of the transaction to its end. Transaction processing maintains the integrity of the database by ensuring that a batch of SQL statements is either executed in full or not executed at all. Among MySQL's commonly used storage engines, only databases or tables that use InnoDB support transactions; MyISAM, for example, does not.
The characteristics of a transaction (ACID):
1. Atomicity: Either all operations in a transaction are executed or none of them are. If an error occurs at any step during execution, the transaction is rolled back to the state before it began, as if it had never happened.
2. Durability: Once a transaction has committed, its changes to the data are permanent and are not lost even if the system fails.
3. Isolation: The database allows multiple transactions to read and modify data concurrently, so their operations may interleave and a transaction may see inconsistent data. Isolation prevents this. There are four transaction isolation levels: read uncommitted, read committed, repeatable read, and serializable.
4. Consistency: The integrity of the database must hold both before a transaction starts and after it ends, and all data written and read must conform to the preset rules. The other three properties all serve to ensure consistency.
Implementation principle of transactions
MySQL has many log files, such as the binary log, the error log, and the query log. The InnoDB engine additionally provides two logs that are specific to transactions: the redo log and the undo log (rollback log). The redo log ensures the durability of transactions, and the undo log ensures their atomicity and isolation.
1. Realization of atomicity
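As a rough illustration of how an undo log can support atomicity, the sketch below records the old value of every key before changing it and replays those records in reverse order on rollback. It is a toy in-memory model, not InnoDB's implementation; ToyTransaction and its methods are hypothetical names, and a Python dict stands in for the table.

```python
class ToyTransaction:
    """Toy model: an in-memory dict stands in for the table (assumption)."""

    _MISSING = object()  # marks "key did not exist before the write"

    def __init__(self, table):
        self.table = table
        self.undo_log = []  # (key, old_value) records, oldest first

    def write(self, key, value):
        # Record the old value before modifying the data.
        old = self.table.get(key, self._MISSING)
        self.undo_log.append((key, old))
        self.table[key] = value

    def rollback(self):
        # Undo the changes in reverse order to restore the starting state.
        while self.undo_log:
            key, old = self.undo_log.pop()
            if old is self._MISSING:
                self.table.pop(key, None)
            else:
                self.table[key] = old

    def commit(self):
        # After commit the old values are no longer needed for rollback.
        self.undo_log.clear()


accounts = {"alice": 100, "bob": 50}
tx = ToyTransaction(accounts)
tx.write("alice", 70)
tx.write("bob", 80)
tx.write("carol", 10)   # simulate a failure after this step
tx.rollback()
print(accounts)         # the table is back to its state before the transaction
```

Rolling back in reverse order matters when the same key is written more than once, because the oldest record holds the value from before the transaction began.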
Transaction isolation level
MySQL uses a client/server architecture, so multiple clients can connect to one server, and the server may process multiple transactions at the same time, some of which access the same data. In theory, while one transaction is executing, other transactions that need the same data would have to wait in line and could access it only after the first transaction commits. This has a relatively large impact on performance, so transaction isolation levels were introduced.
Transaction isolation level implementation principle (MVCC)
MVCC (multi-version concurrency control) works with the undo log and the version chain so that a read and a write of the same record by different transactions can run concurrently, which improves system performance.
With MVCC, ordinary read operations do not need to take locks, which improves the database's concurrent processing capability. MVCC is what makes the read committed and repeatable read isolation levels possible.
InnoDB implements MVCC by adding two hidden columns to each row record: one holds the transaction id, and the other holds the rollback pointer.
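The version chain formed by these two hidden columns can be pictured with the minimal sketch below. RowVersion and walk_chain are hypothetical names chosen for illustration, and a Python object reference stands in for the rollback pointer, which in InnoDB refers to an undo log record rather than an in-memory object.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class RowVersion:
    value: str
    trx_id: int                                  # transaction that wrote this version
    roll_pointer: Optional["RowVersion"] = None  # previous version, None for the oldest


def walk_chain(newest: RowVersion) -> Iterator[RowVersion]:
    """Yield versions from newest to oldest by following roll_pointer."""
    version: Optional[RowVersion] = newest
    while version is not None:
        yield version
        version = version.roll_pointer


v1 = RowVersion("a", trx_id=80)
v2 = RowVersion("b", trx_id=100, roll_pointer=v1)
v3 = RowVersion("c", trx_id=200, roll_pointer=v2)

print([(v.value, v.trx_id) for v in walk_chain(v3)])
```

A reader walks this chain from the newest version and stops at the first version that is visible to it.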
What is ReadView
A ReadView is the snapshot that a snapshot read uses to determine which version in the version chain is visible to the current transaction.
A ReadView contains:
- m_ids. The list of transaction ids of the read-write transactions that are active (not yet committed) in the system when the ReadView is generated.
- min_trx_id. The smallest transaction id among the active read-write transactions when the ReadView is generated, that is, the smallest value in m_ids.
- max_trx_id. The transaction id that the system would assign to the next transaction when the ReadView is generated.
- creator_trx_id. The transaction id of the transaction that generated this ReadView.
How does a ReadView determine which version of a record is visible to the current transaction? Let trx_id be the transaction id stored in the version being accessed.
1. If trx_id = creator_trx_id in the ReadView, the current transaction is accessing a version it modified itself, so the version is visible.
2. If trx_id < min_trx_id in the ReadView, the transaction that wrote this version committed before the current transaction generated the ReadView, so the version is visible.
3. If trx_id >= max_trx_id in the ReadView, the transaction that wrote this version started after the current transaction generated the ReadView, so the version is not visible.
4. If min_trx_id <= trx_id < max_trx_id, check whether trx_id is in m_ids. There are two cases:
(1) If trx_id is in m_ids, the transaction that wrote this version was still active when the ReadView was created, so the version is not visible.
(2) If trx_id is not in m_ids, the transaction that wrote this version had already committed when the ReadView was created, so the version is visible.
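The four visibility rules can be sketched as a single function. This is a minimal illustration, not InnoDB's code: the ReadView class and is_visible are hypothetical names, the variable trx_id is the transaction id stored in the version being checked, and a trx_id equal to max_trx_id is treated as not visible because that id had not been assigned yet when the ReadView was created.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class ReadView:
    m_ids: FrozenSet[int]   # active (uncommitted) read-write transactions
    min_trx_id: int         # smallest id in m_ids
    max_trx_id: int         # id the system would assign to the next transaction
    creator_trx_id: int     # transaction that created this ReadView


def is_visible(trx_id: int, view: ReadView) -> bool:
    """Return True if a version written by trx_id is visible to view."""
    if trx_id == view.creator_trx_id:
        return True            # rule 1: our own change
    if trx_id < view.min_trx_id:
        return True            # rule 2: committed before the ReadView existed
    if trx_id >= view.max_trx_id:
        return False           # rule 3: started after the ReadView was created
    return trx_id not in view.m_ids  # rule 4: visible only if already committed


view = ReadView(m_ids=frozenset({100, 200}), min_trx_id=100,
                max_trx_id=201, creator_trx_id=200)
for trx_id in (80, 100, 150, 200, 201):
    print(trx_id, is_visible(trx_id, view))
```

In this example, versions written by transactions 80, 150, and 200 are visible, while those written by 100 (still active) and 201 (started later) are not.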
The timing of ReadView generation
The biggest difference between read committed (READ COMMITTED) and repeatable read (REPEATABLE READ) is when they generate a ReadView.
1. Read committed: Within a transaction, a new ReadView is generated before every data read.
2. Repeatable read: Within a transaction, a ReadView is generated only when data is read for the first time, and every later read uses that same ReadView.
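The effect of this timing difference can be seen in the toy simulation below. It is a sketch under simplifying assumptions: ToyDb holds a single row, a ReadView is reduced to the set of active ids plus the next id to be assigned, and all class and method names are hypothetical.

```python
READ_COMMITTED = "READ COMMITTED"
REPEATABLE_READ = "REPEATABLE READ"


class ToyDb:
    """Toy single-row store: versions are (trx_id, value), newest first."""

    def __init__(self):
        self.versions = [(1, "v1")]
        self.active = set()    # ids of uncommitted transactions
        self.next_trx_id = 2

    def begin(self):
        trx_id = self.next_trx_id
        self.next_trx_id += 1
        self.active.add(trx_id)
        return trx_id

    def write(self, trx_id, value):
        self.versions.insert(0, (trx_id, value))

    def commit(self, trx_id):
        self.active.discard(trx_id)

    def snapshot(self):
        # Simplified ReadView: active ids plus the next id to be assigned.
        return frozenset(self.active), self.next_trx_id


class Reader:
    def __init__(self, db, level):
        self.db, self.level = db, level
        self.trx_id = db.begin()
        self.view = None

    def read(self):
        # READ COMMITTED: new ReadView on every read.
        # REPEATABLE READ: ReadView created once, on the first read.
        if self.level == READ_COMMITTED or self.view is None:
            self.view = self.db.snapshot()
        active, max_trx_id = self.view
        for trx_id, value in self.db.versions:
            own = trx_id == self.trx_id
            committed = trx_id < max_trx_id and trx_id not in active
            if own or committed:
                return value
        return None


db = ToyDb()
rc, rr = Reader(db, READ_COMMITTED), Reader(db, REPEATABLE_READ)
writer = db.begin()
db.write(writer, "v2")
first = (rc.read(), rr.read())    # writer not committed yet
db.commit(writer)
second = (rc.read(), rr.read())   # writer has committed
print(first, second)
```

Both readers see "v1" at first. After the writer commits, the read committed reader sees "v2" because it takes a fresh ReadView, while the repeatable read reader still sees "v1" through its original ReadView.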
MVCC summary
MVCC controls how concurrent transactions access the same record by means of the version chain (built from undo log records) and the ReadView. MySQL decides whether a version is visible to the current transaction by comparing the transaction id of that version with the ids stored in the ReadView (creator_trx_id, min_trx_id, max_trx_id, and the list m_ids).
Visibility depends on whether the transaction id of the version is equal to, less than, or greater than those ids, and on whether it appears in m_ids.
Generating a ReadView before every data read corresponds to read committed (READ COMMITTED), and generating a ReadView only when data is read for the first time corresponds to repeatable read (REPEATABLE READ).
MySQL locks
Locks in MySQL are divided into table locks, row-level locks, and gap locks.
Table lock: A table lock is the coarsest-grained lock in MySQL. It locks the entire table being operated on and suits large batch operations such as rebuilding a table or backing up a whole table. It is taken and released with the LOCK TABLES and UNLOCK TABLES statements.
Because a table lock covers the entire table, its concurrency is poor. On the other hand, locking itself consumes resources (acquiring, checking, and releasing locks), so when a large amount of data must be locked, a table lock can save considerable resources. Different storage engines in MySQL support different locks: MyISAM supports table locks, and InnoDB supports both table locks and row locks.
Row lock: A row lock is the finest-grained lock in MySQL. It locks only the rows being operated on, so other transactions can still access other rows, which suits high-concurrency scenarios. Row locks can be requested with the SELECT ... FOR UPDATE and SELECT ... LOCK IN SHARE MODE statements. The probability of lock conflict is the lowest, but the locking overhead is higher and deadlocks may occur.
Row-level locks are further divided into shared locks and exclusive locks.
1. Shared lock: Also called a read lock, or S lock for short. Multiple transactions can hold shared locks on the same row at the same time and run concurrently, that is, a read lock does not block another read lock. However, while a transaction holds a shared lock, other transactions cannot obtain an exclusive lock on the row and must wait for the shared lock to be released.
2. Exclusive lock: Also called a write lock, or X lock for short. Only one transaction can hold it at a time, and that transaction can both read and modify the row. No other transaction can acquire a shared or exclusive lock on the row until the exclusive lock is released.
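The compatibility rules for shared and exclusive locks can be written down as a small lookup table. The sketch below is only an illustration of those rules; COMPATIBLE and can_grant are hypothetical names, and "S" and "X" stand for shared and exclusive locks on the same row.

```python
# Keys are (held, requested). True means the requested lock can be granted
# while the held lock exists.
COMPATIBLE = {
    ("S", "S"): True,
    ("S", "X"): False,
    ("X", "S"): False,
    ("X", "X"): False,
}


def can_grant(requested, held_by_others):
    """Return True if `requested` ("S" or "X") conflicts with no held lock."""
    return all(COMPATIBLE[(held, requested)] for held in held_by_others)


print(can_grant("S", ["S", "S"]))  # several readers can share a row
print(can_grant("X", ["S"]))       # a writer must wait for the reader
print(can_grant("S", ["X"]))       # a reader must wait for the writer
print(can_grant("X", []))          # no locks held, so the lock is granted
```

Only the shared/shared combination is compatible; every combination that involves an exclusive lock makes the requester wait.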
Gap lock: A gap lock locks an interval rather than a single record. InnoDB introduced gap locks to solve the phantom read problem, which also helps meet the requirements of the serializable isolation level.
Phantom read: A phantom read occurs when a transaction runs the same range query twice and the second query returns rows that the first one did not.
For example, suppose the user table has only 101 records, with userid values 1, 2, ..., 100, 101, and consider the following SQL: select * from user where userid > 100 for update; This is a range-condition retrieval. InnoDB locks not only the matching record with userid 101 but also the "gap" where userid is greater than 101 (records that do not exist yet), which prevents other transactions from adding data at the end of the table.
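The blocking effect in this example can be modeled as an interval check. This is a simplified sketch: insert_must_wait is a hypothetical helper, the locked gap is represented as an open interval (low, high), and the exact ranges InnoDB locks depend on the index and isolation level, which the sketch does not model.

```python
import math


def insert_must_wait(new_key, locked_gaps):
    """Return True if new_key falls inside any locked open interval (low, high)."""
    return any(low < new_key < high for low, high in locked_gaps)


# Assumed effect of: select * from user where userid > 100 for update;
# the existing row 101 is locked, and so is the gap after it.
locked_gaps = [(101, math.inf)]

print(insert_must_wait(102, locked_gaps))   # falls in the locked gap
print(insert_must_wait(5000, locked_gaps))  # still in the locked gap
print(insert_must_wait(0, locked_gaps))     # outside the locked range
```

Because every new userid above 101 falls in the locked gap, a repeated range query inside the locking transaction cannot encounter newly inserted rows.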
Lock conflict
When multiple users access the database concurrently and request to modify the same data at the same time, lock conflicts occur. A lock conflict means that a transaction wants to access a resource that is already locked and must wait for the lock to be released, which makes the transaction wait and reduces database performance.
Lock conflicts involve two types of lock: shared locks and exclusive locks. A shared lock, also known as a read lock, can be held by multiple transactions at the same time and does not prevent other transactions from obtaining shared locks; it ensures data consistency during concurrent reads. An exclusive lock, also known as a write lock, is mutually exclusive: once a transaction acquires it, other transactions cannot obtain shared or exclusive locks on the same data, which ensures the atomicity of the transaction's operations.
SQL optimization
1. Avoid select * in queries; list the specific fields you need.
(1) It saves resources and reduces network and IO overhead: the data has to be read from disk, and fetching fields that are not used adds to both.
(2) It can affect data security. If a table contains accounts, passwords, and similar fields, select * may leak user information, including private fields added to the table later.
(3) A covering index cannot be used.
2. Avoid using or to connect conditions in the where clause.
Using or may cause the engine to give up the index and perform a full table scan, for example:
select id from t where num=10 or num=20
It can be rewritten as follows:
select id from t where num=10
union all
select id from t where num=20
3. A fuzzy query with a leading wildcard also leads to a full table scan
select id from t where name like '%abc%'
4. Try to use numeric values instead of string types
For example: select id from t where num is null
You can set a default value of 0 on num to ensure that the num column in the table contains no null values.
SQL execution plan (explain)
Explain: Use explain to see how the optimizer would execute a SQL query, so that you know how MySQL processes your SQL statement and can analyze the performance bottleneck of the query or the table structure.
The role of Explain:
It shows the order in which tables are read, the type of each data read operation, which indexes could be used, which indexes are actually used, the references between tables, and how many rows of each table the optimizer expects to examine.
EXPLAIN SELECT * FROM USER WHERE id = 1
1. id
The SELECT identifier. This is the sequence number of the SELECT within the query.
Rows with the same id can be considered a group and are executed sequentially from top to bottom. Across groups, the larger the id value, the higher the priority and the earlier the execution.
EXPLAIN SELECT * FROM employee e WHERE e.deptId = (SELECT id FROM dept d WHERE d.id = 1)
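The ordering rule for the id column can be expressed as a stable sort. The sketch below applies that rule to hypothetical EXPLAIN rows; execution_order is an illustrative helper, and the row data is made up rather than taken from a real EXPLAIN run.

```python
def execution_order(rows):
    """Order EXPLAIN rows: larger id first; equal ids keep top-to-bottom order."""
    # sorted() is stable, so rows with the same id stay in their original order.
    return sorted(rows, key=lambda row: -row["id"])


# Hypothetical EXPLAIN output, listed top to bottom as it would be printed.
rows = [
    {"id": 1, "table": "e"},
    {"id": 1, "table": "t"},
    {"id": 2, "table": "d"},
]
print([row["table"] for row in execution_order(rows)])
```

Here the subquery row with id 2 comes first, followed by the two rows with id 1 in their printed order.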
2. select_type
3. type