Summary of MySQL database interview questions (2022 version)

Title | Address
Summary of Java Virtual Machine Interview Questions (2022 Edition) | https://blog.csdn.net/qq_37924396/article/details/125881033
Summary of Java Collection Interview Questions (2022 Edition) | https://blog.csdn.net/qq_37924396/article/details/126058839
Summary of Mysql database interview questions (2022 version) | https://blog.csdn.net/qq_37924396/article/details/125901358
Summary of Spring Interview Questions (2022 Edition) | https://blog.csdn.net/qq_37924396/article/details/126354473
Summary of Redis interview questions (2022 version) | https://blog.csdn.net/qq_37924396/article/details/126111149
Summary of Java Concurrency Interview Questions (2022 Edition) | https://blog.csdn.net/qq_37924396/article/details/125984564
Summary of Distributed Interview Questions (2022 Edition) | https://blog.csdn.net/qq_37924396/article/details/126256455


1. Storage engine

1.1 Storage Engine

1. Several common database engines

InnoDB
InnoDB is the engine of choice for transactional databases: it supports transaction-safe tables (ACID), row-level locking, and foreign keys. InnoDB is the default storage engine in MySQL.

MyISAM
MyISAM is based on the older ISAM storage engine and extends it. It was one of the most commonly used storage engines in web, data warehousing and other read-heavy environments.
MyISAM offers high insert and query speed, but does not support transactions.

Link: Comparison of advantages and disadvantages of several MySQL database engines

2. The difference between InnoDB and MyISAM

1) InnoDB supports transactions, but MyISAM does not.
2) InnoDB supports foreign keys, but MyISAM does not. Therefore converting an InnoDB table with foreign keys to a MyISAM table will fail.
3) Both InnoDB and MyISAM use B+ Tree indexes, but InnoDB uses a clustered index while MyISAM uses a non-clustered index.
4) InnoDB does not store the number of rows in a table, so executing select count(*) from table requires a full table scan. MyISAM keeps the row count in a variable, so the same query is very fast (note: only when there is no WHERE clause).
Why doesn't InnoDB keep such a variable? Because of InnoDB's transactional nature, different transactions may see a different number of rows in the same table at the same moment.
5) InnoDB supports both table-level and row-level (default) locks, while MyISAM only supports table-level locks.
InnoDB's row locks are implemented on indexes, not on physical row records. If a statement does not hit an index, row locks cannot be used and the lock degenerates into a table lock.
6) An InnoDB table must have a clustered index key (such as a primary key). If none is specified, InnoDB picks a suitable unique index or generates a hidden Row_id column to serve as the default primary key, while MyISAM tables can have no primary key at all.
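
As a quick illustration (the table and column names below are made up for this sketch), the storage engine is chosen per table with the ENGINE clause and can be inspected afterwards:

-- hypothetical tables, purely to illustrate the ENGINE clause
CREATE TABLE orders_innodb (
    id   BIGINT PRIMARY KEY,
    note VARCHAR(100)
) ENGINE = InnoDB;

CREATE TABLE orders_myisam (
    id   BIGINT PRIMARY KEY,
    note VARCHAR(100)
) ENGINE = MyISAM;

-- check which engine each table uses
SHOW TABLE STATUS LIKE 'orders_%';

-- fast on MyISAM (row count kept in metadata), full table scan on InnoDB
SELECT COUNT(*) FROM orders_myisam;
SELECT COUNT(*) FROM orders_innodb;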

1.2 Storage structure

1.2.1 What are InnoDB pages, regions, and segments?

Page (Page)
First, InnoDB divides the physical disk into pages (page), the size of each page is 16 KB by default, and the page is the smallest storage unit. Pages are divided into many formats according to the needs of upper-layer applications, such as indexes and logs. We mainly talk about data pages, that is, pages that store actual data.

Extent (Extent)
If pages were the only level of organization, the number of pages would be huge, and allocating and reclaiming storage space would be very troublesome, because maintaining the state of so many individual pages is costly.

Therefore, InnoDB introduced the concept of Extent. By default, an area is composed of 64 consecutive pages, which is 1MB. It is easier to allocate and reclaim storage space through Extent.

Segment (Segment)
Why should the segment be introduced? It starts with the index. We all know that the purpose of the index is to speed up the search, which is a typical method of exchanging space for time.

The leaf nodes of the B+ tree store the actual data, and the non-leaf nodes are index pages. The B+ tree therefore divides pages into two kinds, leaf pages and non-leaf pages, and this is where segments come in: each index in InnoDB creates two segments to store these two kinds of pages respectively.

Segment is a logical organization, and its hierarchical structure is Segment, Extent, and Page from top to bottom.

1.2.2 What data does a page consist of?

A data page consists of the following parts:

File Header
is used to describe the external information of the data page, such as which tablespace it belongs to, the page numbers of the front and back pages, etc.

Page Header
is used to describe the specific information in the data page, such as how many records exist, the location of the first record, etc.

infimum and supremum records
infimum and supremum are two records generated by the system, representing the minimum and maximum record values respectively. The record after infimum is the user record with the smallest key, and the record before supremum is the user record with the largest key; all records in the page are linked together through the next_record field.

User Records
user records, that is, the actual rows of the table; here we describe the commonly used Compact row format.

Besides the columns we insert, InnoDB also adds some hidden columns: trx_id (transaction ID) and roll_pointer (rollback pointer) are always present.

row_id is not always present. It follows this strategy: the user-specified primary key is used first; if there is no primary key, a non-null unique key is used; if there is neither, the system automatically generates a hidden row_id column.


Free Space
is the currently unused storage in the page, into which new records can be inserted.

Page Directory
is similar to the directory of a dictionary. Based on primary key order, a slot is created for every 4-8 records to record their positions. When searching for data by primary key, we first locate the slot the record belongs to, and then search linearly within that slot. This is more efficient than traversing the page's record linked list from front to back.

File Trailer
The File Header stores the checksum computed in memory before flushing, and the File Trailer stores the checksum written after flushing. If an exception occurs while flushing to disk, the checksums in the File Trailer and File Header will not match, which indicates that the flush went wrong.

1.2.3 The process of inserting records in the page?

1) If there is enough room in Free Space, space is allocated directly for the new record; the next_record of the previous last record is pointed to the newly inserted record, and the next_record of the new record is pointed to the supremum record.

2) If the Free Space space is not enough, first reorganize the fragments caused by the previous deletion, and then insert the record according to the above steps.

3) If the current page space is still insufficient after defragmentation, re-apply for a page, initialize the page, and insert records according to the above steps

2. Index

1. How many types or categories of indexes are there?

From the physical structure, indexes can be divided into two types: clustered and non-clustered.
A clustered index stores the index and the data together: once the index entry is found, the data is found. The B+ tree primary key index described above is a clustered index.
A non-clustered index stores the data separately from the index: the lookup first finds the index entry, then goes back to the table through the primary key recorded in it to fetch the row. InnoDB has one and only one clustered index, while MyISAM indexes are all non-clustered.

ref: The difference between a clustered index and a non-clustered index

In terms of application, indexes can be divided into the following categories:
Ordinary index : the basic index type in MySQL, with no restrictions; duplicate and NULL values are allowed in the indexed column, and it exists purely to improve query efficiency. Created with ALTER TABLE table_name ADD INDEX index_name (column);

Unique index : the values in the indexed column must be unique, but NULL values are allowed. Created with ALTER TABLE table_name ADD UNIQUE index_name (column);

Primary key index : a special unique index, also implemented as the clustered index, which does not allow NULL values and is created automatically by the database;

Composite index : an index built on multiple columns of a table; it follows the leftmost prefix matching rule;

Full-text index : used on CHAR, VARCHAR and TEXT columns; historically it was only available on the MyISAM engine (InnoDB has supported it since MySQL 5.6).
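
A quick sketch of how these index types are created (the table and column names below are hypothetical, just for illustration):

-- hypothetical table for illustration only
CREATE TABLE user_info (
    id    BIGINT NOT NULL,
    name  VARCHAR(50),
    email VARCHAR(100),
    bio   TEXT,
    PRIMARY KEY (id)                                              -- primary key (clustered) index
);

ALTER TABLE user_info ADD INDEX idx_name (name);                  -- ordinary index
ALTER TABLE user_info ADD UNIQUE uk_email (email);                -- unique index
ALTER TABLE user_info ADD INDEX idx_name_email (name, email);     -- composite index
ALTER TABLE user_info ADD FULLTEXT ft_bio (bio);                  -- full-text index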

2. What is the difference between Hash and B+ tree index?

Hash
1) Hash is faster for equivalent query, but cannot perform range query. Because after the index is built through the Hash function, the order of the index cannot be consistent with the original order, so the range query cannot be supported. Similarly, sorting by index is also not supported.
2) Hash does not support fuzzy query and leftmost prefix matching of multi-column index, because the value of Hash function is unpredictable, such as the calculated value of AA and AB has no correlation.
3) A hash index can never avoid going back to the table to fetch the data.
4) Although equality lookups are efficient, performance is not stable: when a key value has many duplicates, hash collisions occur and query efficiency may drop sharply.

B+ Tree
1) B+ tree is essentially a search tree, which naturally supports range query and sorting.
2) When certain conditions (clustered index, covering index, etc.) are met, the query can be completed only through the index without returning to the table.
3) The query efficiency is relatively stable, because each query is from the root node to the leaf node, and is the height of the tree.

3. Why use B+ tree instead of binary search tree as index?

We know that the search cost of a balanced binary tree is O(log n), and as the tree gets taller, search efficiency drops. In addition, index files are not small, so they are stored on disk.
When the file system reads data from disk, it generally reads in units of pages. A binary tree node holds very little data per page, so the operating system has to read more pages, which means more random disk I/O. Reading data from disk into memory is random I/O and is one of the most expensive operations in a database.
Therefore the height of such a tree grows quickly as the amount of data grows, and every update has to keep the tree balanced through left and right rotations, which makes it unsuitable for index files stored on disk.

4. Why use B+ tree instead of B tree as index?

Before that, let's understand the difference between B+ tree and B tree:

B-tree
A B-tree stores data in both non-leaf and leaf nodes, so the time complexity of a query is O(1) at best and O(log n) at worst. A B+ tree stores data only in leaf nodes; non-leaf nodes store only keys (which may be repeated at the leaf level), so the time complexity of a query is always O(log n).

B+ tree
The leaf nodes of a B+ tree are linked together in a list, so a full traversal only needs to scan the leaf-node list, whereas a B-tree can only be traversed via in-order traversal.

Why are B+ trees more suitable for database indexes than B trees?

1. The B+ tree reduces the number of IOs.
Because index files are large and stored on disk, and the non-leaf nodes of a B+ tree store only keys rather than data, a single page can hold more keys. That means more of the keys being searched are read into memory in one read, and fewer random disk I/Os are needed.

2. B+ tree query efficiency is more stable
Since the data only exists on leaf nodes, the search efficiency is fixed at O(log n), so the query efficiency of B+ tree is more stable than that of B tree.

3. The B+ tree is more suitable for range search.
The leaf nodes of a B+ tree are linked in order, so scanning all the data only requires one pass over the leaf nodes, which is good for full scans and range queries. A B-tree also stores data in non-leaf nodes, so it can only be scanned via in-order traversal. In other words, B+ trees are more efficient for range queries and ordered traversal.

ref: Why is B+ tree more suitable for database index than B tree?

5. What is the leftmost matching principle?

The leftmost column comes first: any continuous prefix of the index starting from the leftmost column can be matched. Matching stops when a range condition (>, <, between, like) is encountered. Suppose we build a composite index (a, b, c).
For example, a query with only b = 2 cannot use an index built in the order (a, b); but a = 1 and b = 2 is fine, and so is b = 2 and a = 1, because the optimizer automatically adjusts the order of a and b. Another example: with a = 1 and b = 2 and c > 3 and d = 4 on an index built in the order (a, b, c, d), column d will not use the index, because c is a range condition and matching stops after it.
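
A small sketch (hypothetical table t with columns a, b, c, d) of which predicates can use a composite index:

-- hypothetical table, for illustration
CREATE TABLE t (
    a INT, b INT, c INT, d INT,
    INDEX idx_abcd (a, b, c, d)
);

EXPLAIN SELECT * FROM t WHERE a = 1 AND b = 2 AND c = 3 AND d = 4;  -- uses a, b, c, d
EXPLAIN SELECT * FROM t WHERE a = 1 AND b = 2 AND c > 3 AND d = 4;  -- uses a, b, c; d not used (range on c stops matching)
EXPLAIN SELECT * FROM t WHERE b = 2;                                -- leftmost column a missing, index not used for filtering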

ref left matching principle

6. What is index pushdown?

Index condition pushdown (ICP) is a query optimization introduced in MySQL 5.6.
Without index condition pushdown, when a secondary index is used for a query, the storage engine locates rows through the index and returns them to the MySQL server, and the server then checks whether each row satisfies the remaining conditions.

With index condition pushdown, if some conditions involve only indexed columns, the MySQL server passes that part of the conditions down to the storage engine; the storage engine checks them against the index entries and only reads and returns the rows that satisfy them.

Index condition push-down optimization can reduce the number of times the storage engine queries the underlying table, and can also reduce the number of times the MySQL server receives data from the storage engine.
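
A hedged sketch, reusing the hypothetical user_info table and idx_name_email index from above; with ICP in effect, EXPLAIN typically shows "Using index condition" in the Extra column:

-- the LIKE on email cannot be used for the index lookup, but with ICP it is checked
-- against the (name, email) index entries inside the storage engine
EXPLAIN SELECT * FROM user_info
WHERE name = 'ls' AND email LIKE '%@example.com';

-- ICP can be toggled through the optimizer switch to compare plans
SET optimizer_switch = 'index_condition_pushdown=off';
SET optimizer_switch = 'index_condition_pushdown=on';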

7. Index failure

1. If a LIKE pattern begins with %, the index is not used; if the pattern has no leading % (only a trailing %), the index can be used.
2. A composite index is not used when the leftmost column is missing from the conditions (leftmost matching principle).
3. OR in the condition: for the index to take effect with OR, every column in the OR condition must have its own index.
4. If the column type is a string, the value in the condition must be quoted; otherwise an implicit conversion happens and the index is not used.
5. Mathematical operations on an indexed column in the WHERE clause prevent the index from being used.
6. Functions applied to an indexed column in the WHERE clause prevent the index from being used.
7. If the WHERE condition uses an inequality (such as WHERE column != ...), MySQL will generally not be able to use the index.
8. If MySQL estimates that a full table scan is faster than using the index, the index is not used, for example on a table with very little data.
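
A few hedged examples (again using the hypothetical user_info table with an index on name) of predicates that typically prevent index use:

EXPLAIN SELECT * FROM user_info WHERE name LIKE '%s';        -- leading %: index not used
EXPLAIN SELECT * FROM user_info WHERE name LIKE 'l%';        -- trailing % only: index can be used
EXPLAIN SELECT * FROM user_info WHERE UPPER(name) = 'LS';    -- function on indexed column: index not used
EXPLAIN SELECT * FROM user_info WHERE id + 1 = 10;           -- arithmetic on indexed column: index not used
EXPLAIN SELECT * FROM user_info WHERE name = 123;            -- implicit type conversion: index not used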

8. What is a back-to-table query?

In the index section above, we distinguished clustered indexes from non-clustered indexes.
In a clustered index, the B+ tree stores the entire row; a non-clustered (secondary) index only stores the indexed column's value plus the primary key of the corresponding row.

Is the definition of a clustered index similar to that of a primary key index?

In fact, when we do not define a primary key, MySQL picks the first column (from left to right) that has a unique index and a NOT NULL constraint to build the clustered index, and uses it in place of the primary key index.

Clustered index ≈ primary key index = unique constraint + non-null constraint

So why all this background? What exactly is a back-to-table query?

For example, suppose there is a table with columns id, name, sex and type,
where id is the primary key and an ordinary index is built on name.
For example, I write the following query statement

select * from table where name ='ls'
Because of the ordinary index on the name column, the name index will be used; but the name index only stores the name value and the corresponding primary key id, so the sex and type information we want cannot be found there. So what do we do?
We have to go to the primary key index and look the row up again,
because the primary key index stores the complete row, while the name index does not.


The query back to the table actually executes two B+ tree queries, which is a time-consuming and laborious operation.
Therefore, we should try to avoid querying back to the table as much as possible.

What is the essence of a back-to-table query? It is that the ordinary (secondary) index cannot provide all the columns we want, so we have to go back to the primary key index, i.e. the clustered index, to fetch the rest of the data.

So the key to avoiding back-to-table queries is: covering indexes.

How do you understand it?
for example

select id, name from table where name = 'ls'

This will not execute the query back to the table

select id, name, sex from table where name = 'ls'

This requires performing a query back to the table, because sex cannot be found on the name index at all.

So how do we solve it?
Create a composite index on name and sex together.
That is index covering: make the index cover all the columns our select needs, and the back-to-table query disappears.
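
A hedged sketch of the fix, on a hypothetical table matching the example above:

-- hypothetical table: t_user(id, name, sex, type), id is the primary key
ALTER TABLE t_user ADD INDEX idx_name_sex (name, sex);

-- served entirely from the secondary index; EXPLAIN typically shows "Using index"
EXPLAIN SELECT id, name, sex FROM t_user WHERE name = 'ls';

-- still needs a back-to-table lookup, because type is not in the index
EXPLAIN SELECT id, name, sex, type FROM t_user WHERE name = 'ls';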

ref: What is back-to-table query? How to avoid query back to the table?

3. Transactions

3.1 What is a database transaction?

A database transaction is an indivisible sequence of database operations and the basic unit of database concurrency control. Its execution must take the database from one consistent state to another. A transaction is a logical group of operations that are either all performed or none performed.

3.2 What are the four characteristics of transactions (ACID)?

Atomicity : a transaction is the smallest unit of execution and cannot be split; atomicity guarantees that its actions are either all completed or none at all.
Consistency : before and after a transaction executes, the data remains consistent; multiple transactions reading the same data see the same result.
Isolation : when the database is accessed concurrently, one user's transaction is not interfered with by other transactions; concurrent transactions are independent of each other.
Durability : once a transaction is committed, its changes to the data in the database are permanent, and even a database failure should not affect them.

3.3 What problems can concurrent transactions cause?

Dirty reads, phantom reads, and non-repeatable reads.
ref : problems caused by concurrent transactions

3.4 What are dirty reads, phantom reads and non-repeatable reads?

Dirty read : One transaction reads data that has not been committed by another transaction. Transaction A reads the data updated by transaction B, and then B rolls back the operation, then the data read by A is dirty data.

Non-repeatable read : The content of the data read twice in a transaction is inconsistent. Transaction A reads the same data multiple times, and transaction B updates and submits the data during the multiple reads of transaction A, resulting in inconsistent results when transaction A reads the same data multiple times.

Phantom read : the number of rows read twice within one transaction differs. For example, system administrator A is changing all students' grades from numeric scores to ABCDE grades, while system administrator B inserts a new record with a numeric score at the same time. When administrator A finishes the change, he finds one record still unchanged, as if he had seen a phantom; this is called a phantom read.

Summary: non-repeatable reads and phantom reads are easy to confuse. Non-repeatable reads are about modification, while phantom reads are about insertion or deletion. To solve non-repeatable reads it is enough to lock the rows that satisfy the condition; to solve phantom reads, the table must be locked.

3.5 What are the isolation levels of transactions?

The first isolation level: Read uncommitted
If a transaction has started writing data, other transactions are not allowed to write at the same time, but they are allowed to read this row. This level can be implemented with an exclusive write lock that does not exclude readers. It avoids lost updates, but dirty reads can occur: transaction B may read data that transaction A has not yet committed.

Solves lost updates, but dirty reads can still occur.

The second isolation level: Read committed
A reading transaction allows other transactions to read and write the same data, while a writing transaction forbids other transactions from accessing the row. This level avoids dirty reads, but non-repeatable reads can occur: transaction A reads some data, transaction B then updates it and commits, and when transaction A reads the same data again it has changed.

Solves lost updates and dirty reads.

The third isolation level: Repeatable read (the MySQL default)
Within one transaction the same data can be read multiple times, and as long as the transaction has not ended, other transactions cannot modify it, so the reads within the transaction always return the same result; hence the name repeatable read. A transaction that reads data blocks write transactions (but allows other readers), and a write transaction blocks all other transactions (both reads and writes). This avoids non-repeatable reads and dirty reads, but phantom reads may still occur. It can be implemented with shared read locks and exclusive write locks.

Solves lost updates, dirty reads and non-repeatable reads, but phantom reads can still occur.

The fourth isolation level: Serializable
provides strict transaction isolation. Transactions must execute serially, one after another, and cannot run concurrently. Row-level locks alone cannot achieve serializability; other mechanisms are needed to ensure that newly inserted rows are not seen by transactions that are querying. Serializable is the highest isolation level and also the most expensive: performance is very low, so it is rarely used. At this level transactions execute in order, which avoids dirty reads, non-repeatable reads and phantom reads.

Solves lost updates, dirty reads, non-repeatable reads and phantom reads.

Isolation level   | Dirty read | Non-repeatable read | Phantom read
Read uncommitted  | √          | √                   | √
Read committed    | ×          | √                   | √
Repeatable read   | ×          | ×                   | √
Serializable      | ×          | ×                   | ×

The default isolation level in MySQL is Repeatable read.
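
To check or change the isolation level (the variable is named transaction_isolation in MySQL 5.7.20+ and 8.0; older versions use tx_isolation):

-- view the current isolation level
SELECT @@transaction_isolation;

-- change it for the current session or globally
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
SET GLOBAL  TRANSACTION ISOLATION LEVEL REPEATABLE READ;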
ref transaction isolation level

3.6 How are ACID properties implemented?

Atomicity : a transaction's statements either all take effect or none do; this is the core property from which a transaction is defined. It is implemented mainly through the undo log.
Durability : after a transaction commits, its changes are guaranteed not to be lost even in the event of a crash. It is implemented mainly through the redo log.
Isolation : a transaction's execution is affected as little as possible by other transactions. InnoDB's default isolation level is RR, which is implemented mainly through the lock mechanism (including next-key locks) and MVCC (including the hidden columns on each row, the undo-log-based version chain and ReadView).
Consistency : the ultimate goal of transactions; achieving consistency requires guarantees at both the database level and the application level.

ref The realization principle of ACID characteristics

4. Lock

1. What are the functions of database locks and what kind of locks are there?

When the database has concurrent transactions, data inconsistencies may occur. At this time, some mechanisms are needed to ensure the order of access. The lock mechanism is such a mechanism. That is, the role of the lock is to solve the concurrency problem.

From the granularity of locks, locks can be divided into table locks, row locks, and page locks.
Row-level lock : It is a kind of lock with the finest locking granularity, which means that only the row currently being operated is locked. Row-level locks can greatly reduce conflicts in database operations. Its locking granularity is the smallest, but the locking overhead is also the largest.
Row-level locks are expensive, slow to lock, and deadlocks may occur. But the locking granularity is the smallest, the probability of lock conflicts is the lowest, and the concurrency is the highest.

Table-level lock : It is a kind of lock with the largest granularity, which means to lock the entire table currently being operated. It is simple to implement, consumes less resources, and is supported by most MySQL engines.

Page-level lock : It is a lock with a granularity between row-level locks and table-level locks. Table-level locks are fast, but have more conflicts, and row-level locks have fewer conflicts, but are slower. Therefore, a compromised page level is taken to lock a group of adjacent records at a time.

The overhead and locking time are between table locks and row locks, and deadlocks will occur. The locking granularity is between table locks and row locks, and the concurrency is average.

From the nature of use, it can be divided into shared locks, exclusive locks and update locks.
Share Lock : S lock, also known as read lock, is used for all read-only data operations.
S locks are not exclusive, and multiple concurrent transactions are allowed to lock the same resource, but X locks are not allowed while adding S locks, that is, resources cannot be modified. The S lock is usually released immediately after the end of the read, without waiting for the end of the transaction.

Exclusive lock : X lock, also known as write lock, means to write data.
The X lock only allows one transaction to lock the same resource, and it will not be released until the end of the transaction. Any other transaction must wait until the X lock is released to access the page.

Use the select * from table_name for update; statement to generate an X lock.

Update lock : U lock, which is used to schedule X locks on resources, allowing other transactions to read, but not allowing U locks or X locks to be applied.
When the read page is about to be updated, it is upgraded to an X lock, and the U lock cannot be released until the end of the transaction. Therefore, the U lock is used to avoid the deadlock phenomenon caused by the use of shared locks.

Classification and summary of ref database locks

Subjectively divided, it can be divided into optimistic lock and pessimistic lock.

Optimistic Lock : As the name suggests, it is subjectively determined that the resource will not be modified, so the data is read without locking, and only when updating, the version number mechanism is used to confirm whether the resource has been modified.
Optimistic locking is suitable for multi-read application types, which can improve the throughput of the system.

Pessimistic Lock : As the name suggests, it has strong exclusive and exclusive characteristics. Every time data is read, it is considered to be modified by other transactions, so each operation needs to be locked.

2. The concept of optimistic lock and pessimistic lock

Pessimistic locking
always assumes the worst case: every time data is read, it is assumed that someone else will modify it, so the data is locked on every access, and anyone else who wants it blocks until the lock is released. Traditional relational databases use many such mechanisms, for example row locks, table locks, read locks and write locks, all of which are acquired before the operation is performed. SELECT ... FOR UPDATE is one example; the synchronized keyword in Java is also a pessimistic lock.

Optimistic lock
, as the name suggests, is optimistic: every time data is read, it assumes that no one else will modify it, so no lock is taken. Only when updating does it check whether anyone else updated the data in the meantime, typically using a version number or similar mechanism. Optimistic locking suits read-heavy applications and improves throughput; the write_condition mechanism some databases provide is in fact an optimistic lock.
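
A minimal sketch of version-number optimistic locking, assuming a hypothetical account table with a version column:

-- hypothetical table: account(id, balance, version)
-- 1) read the row along with its version, without locking
SELECT balance, version FROM account WHERE id = 1;

-- 2) update only if the version has not changed since the read (assume we read version = 5)
UPDATE account
SET balance = balance - 100, version = version + 1
WHERE id = 1 AND version = 5;
-- if 0 rows are affected, someone else updated the row first; re-read and retry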

3. How does MySQL implement pessimistic locking

1. Pessimistic locking is implemented with select ... for update; the lock is released with commit after the operation is completed.
2. With the InnoDB engine, row-level locks are used by default: when the condition explicitly hits an index, only the matching rows are locked;
when there is no condition, or the condition does not hit an index, the whole table is locked; when the condition is a range on an index, the matching range of records is locked.
3. When no matching rows are found, nothing is locked.
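
A minimal pessimistic-locking sketch (same hypothetical account table as above):

-- hypothetical table: account(id, balance), id is the primary key
START TRANSACTION;

-- this statement takes an exclusive (X) row lock, since id is an indexed column
SELECT balance FROM account WHERE id = 1 FOR UPDATE;

-- other transactions that try to lock or modify this row now block
UPDATE account SET balance = balance - 100 WHERE id = 1;

COMMIT;   -- releases the lock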

4. What is MVCC and its implementation?

The full English name of MVCC is Multiversion Concurrency Control, which means multi-version concurrency control in Chinese, which can prevent reading and writing from blocking each other, and is mainly used to improve concurrency efficiency when solving non-repeatable reading and phantom reading problems.
The principle is to realize the concurrency control of the database through the management of multiple versions of the data row. Simply put, it is to save the historical version of the data. You can determine whether the data is displayed by comparing the version number. There is no need to lock when reading data to ensure the isolation effect of the transaction.

ref MVCC Principle
ref MVCC Detailed Explanation

5. What is the relationship between isolation level and lock?

1) Read Uncommitted (read uncommitted content)
does not need to add a shared lock to read data, so that it will not conflict with the exclusive lock on the modified data;

2) Read Committed (read the submitted content)
read operations need to add a shared lock, but release the shared lock after the statement is executed;

3) Repeatable Read
read operations take a shared lock that is not released before the transaction commits, i.e. the shared lock is held until the transaction completes;

4) SERIALIZABLE
is the most restrictive, because this level locks the entire range of keys and holds the lock until the transaction completes.

ref The relationship between transaction isolation level and lock in Innodb

5. Cluster

1. Master-slave replication

1. What is master-slave replication?

Master-slave replication builds a database environment that is kept identical to the master database, i.e. the slave database. The master is generally the near-real-time business database.

2. What is the role of master-slave replication?

  • Read-write separation enables the database to support greater concurrency.
  • High availability, hot backup of data, as a backup database, after the primary database server fails, it can switch to the secondary database to continue working to avoid data loss.
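
A hedged sketch of pointing a replica at a master (host, credentials and binlog coordinates below are placeholders; MySQL 8.0.22+ renames these commands to CHANGE REPLICATION SOURCE TO / START REPLICA):

-- run on the slave, with placeholder values
CHANGE MASTER TO
    MASTER_HOST = '192.168.0.10',
    MASTER_USER = 'repl',
    MASTER_PASSWORD = '***',
    MASTER_LOG_FILE = 'mysql-bin.000001',
    MASTER_LOG_POS = 4;

START SLAVE;

-- check the replication threads and lag
SHOW SLAVE STATUS\G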

6. Log

1. What are the common logs in MySQL?

Redo log (redo log) : a physical log
whose role is to ensure transaction durability. The redo log records the state after a change is made and is used to recover committed transaction data that had not yet been written to the data files.

Rollback log (undo log) : a logical log
whose role is to ensure transaction atomicity. It saves the version of the data before the transaction modified it, so it can be used for rollback, and it also provides non-locking reads under multi-version concurrency control (MVCC).

Binary log (binlog) : a logical log,
commonly used for master-slave synchronization or data synchronization, and also for point-in-time recovery of the database.

The error log (errorlog)
records information about MySQL starting and stopping, and errors that occur during the running of the server. By default, the system's error log function is turned off, and error messages are output to standard error output.

The general query log (general query log)
records every command received by the server, regardless of whether the command statement is correct or not, so it will bring a lot of overhead, so it is also turned off by default.

The slow query log (slow query log)
records queries whose execution time exceeds the threshold (10 seconds by default) and, optionally, queries that do not use indexes; only successfully executed statements are recorded.

The relay log (relay log)
stores the received binlog log content in the slave node for master-slave synchronization.
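
For example, the slow query log can be switched on dynamically (these are standard server system variables):

-- enable the slow query log and lower the threshold to 1 second
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1;

-- see where it is written
SHOW VARIABLES LIKE 'slow_query_log_file';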

ref: Understanding of several logs in MySQL

2.Undo Log

1. What is undo log

Undo log: before a transaction modifies data, MySQL records the pre-update version of the data in the undo log. If the transaction is rolled back or the database crashes, the information recorded in the undo log can be used to roll the data back. It also provides reads under multi-version concurrency control (MVCC).

2.undo log life cycle

Generation: the undo log is generated before the transaction's modifications take effect.
Destruction: after the transaction commits, the undo log cannot be deleted immediately; it is put on a list to be cleaned up, and the purge thread decides whether the undo log space can be reclaimed by checking whether any other transaction still needs the earlier versions kept in the undo segment.
Note: writing the undo log itself also generates redo log, because the undo log also needs to be protected persistently.

3. The function of the undo log

  • The undo log enables transaction rollback.
    When we perform a data update, we record not only the redo log but also the undo log. If the transaction is rolled back for some reason, MySQL executes the rollback operation and uses the undo log to restore the data to its state before the transaction started.

For example, we execute the following delete statement:

delete from book where id = 1; 

At this point the undo log records the opposite operation, an insert statement (the reverse operation). When the transaction needs to be rolled back, that SQL is effectively executed, completely restoring the data to what it was before the modification, which achieves the purpose of transaction rollback.

Another example is that we execute an update statement:

update book set name = "三国" where id = 1;   ---修改之前name=西游记

At this time, the undo log will record an opposite update statement, as follows:

update book set name = "西游记" where id = 1;

If something goes wrong with this modification, the undo log can be used to roll it back, ensuring the atomicity of the transaction.

4. The working principle of the undo log

The flow is as follows:
when transaction A performs an update operation, change id=1 to id=2. First, the cached data in the buffer pool will be modified, and the old data will be backed up to the undo log buffer at the same time, and the sql statement of the restore operation will be recorded. At this time, if transaction B wants to query the modified data, but transaction A has not yet submitted, then transaction B will query the data before transaction A is modified from the undo log buffer, that is, id=1. At this time, the undo log buffer will persist the data to the undo log log (disk operation). After the undo log is persisted, the data will actually be written to the disk, that is, to the ibd file. The commit of the transaction will be executed at the end.

5. The storage mechanism of the undo log

(The original post illustrates the storage mechanism with a diagram.)

6. Configuration parameters of the undo log


innodb_max_undo_log_size: the maximum size of an undo log file, 1GB by default (initial size 10MB); by default the undo logs live in ibdata1
innodb_undo_directory: the directory where the undo logs are stored
innodb_undo_logs: the number of rollback segments, 128 by default
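
These can be inspected at runtime:

SHOW VARIABLES LIKE 'innodb_undo%';
SHOW VARIABLES LIKE 'innodb_max_undo_log_size';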

ref: Undo log log details

3.Redo Log

1. What is Redo Log

The redo log (redo log) is unique to the InnoDB storage engine, which allows MySQL to have crash recovery capabilities.
For example, if the MySQL instance hangs or goes down, when restarting, the InnoDB storage engine will use the redo log to restore the data to ensure the persistence and integrity of the data.

The redo log can be divided into the following two parts

  • The redo log buffer (Redo Log Buffer), kept in memory, which is volatile
  • The redo log file (Redo Log File), persisted on disk, which is not easily lost
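
A small sketch of the standard InnoDB system variables involved in these two parts:

-- the in-memory buffer and the on-disk files can be sized through these variables
SHOW VARIABLES LIKE 'innodb_log_buffer_size';
SHOW VARIABLES LIKE 'innodb_log_file_size';

-- controls when the buffer is flushed to the redo log file on commit (0, 1 or 2)
SHOW VARIABLES LIKE 'innodb_flush_log_at_trx_commit';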

2. Flush-to-disk rules

7. SQL syntax


Origin blog.csdn.net/qq_37924396/article/details/125901358