MySQL lock and MVCC interview questions (collected from a website)

1. Describe the database transaction isolation levels.

ACID:

Atomicity: implemented with the undo log (which also backs MVCC)

Consistency: the core and most essential requirement

Isolation: implemented with locks and MVCC (multi-version concurrency control)

Durability: implemented with the redo log

A database has four transaction isolation levels: read uncommitted, read committed, repeatable read, and serializable. Different isolation levels allow different anomalies, namely dirty reads, non-repeatable reads, and phantom reads. The isolation level should therefore be chosen according to the application scenario.

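The anomalies that can occur at each isolation level are as follows:

| Isolation level | Dirty read | Non-repeatable read | Phantom read |
| --- | --- | --- | --- |
| READ-UNCOMMITTED | possible | possible | possible |
| READ-COMMITTED | prevented | possible | possible |
| REPEATABLE-READ | prevented | prevented | possible |
| SERIALIZABLE | prevented | prevented | prevented |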

The SQL standard defines four isolation levels:

  • READ-UNCOMMITTED (read uncommitted): Changes made by a transaction are visible to other transactions even before they are committed. Reading such uncommitted data is called a dirty read.
  • READ-COMMITTED (read committed): A transaction reads only committed data. This is the default isolation level of most databases. If another transaction modifies and commits the data while this transaction is running, the same query can return different results before and after the change. This is called a non-repeatable read.
  • REPEATABLE-READ (repeatable read): This is the default isolation level of MySQL. It prevents dirty reads and also ensures that reading the same record several times within one transaction returns the same result. However, phantom reads can still occur at this level. A phantom read means that transaction A reads the rows in a certain range, another transaction B inserts rows into that range, and when transaction A reads the range again it sees the new rows.
  • SERIALIZABLE (serializable): The highest isolation level. Transactions behave as if they were executed one after another, so they cannot interfere with each other. This level prevents dirty reads, non-repeatable reads, and phantom reads.

The transaction isolation mechanism is implemented with locks and concurrency scheduling. The concurrency scheduling uses MVCC (Multi-Version Concurrency Control), which supports features such as consistent concurrent reads and rollback by keeping the old versions of modified data.

Because a lower isolation level requires fewer locks, most database systems default to READ-COMMITTED (read committed). Note, however, that the InnoDB storage engine uses **REPEATABLE-READ (repeatable read)** by default, and this does not cause a noticeable performance penalty.
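As a minimal sketch, the isolation level can be inspected and changed per session as follows. This assumes MySQL 8.0, where the system variable is named transaction_isolation (older versions used tx_isolation).

```sql
-- Show the isolation level of the current session (MySQL 8.0 variable name).
SELECT @@transaction_isolation;

-- Change the level for all later transactions in the current session.
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;

-- Change the level for the next transaction only.
-- This form is not allowed while a transaction is already active.
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
```

Setting the level per session is convenient for experiments, because it does not affect other connections.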

2. Implementation principle of MVCC

See the MVCC documentation for details.

3. How does MySQL solve phantom reads?

Transaction A reads data matching certain conditions. Meanwhile, transaction B inserts new rows that match the same conditions. When transaction A reads again with the original conditions and sees the rows newly inserted by transaction B, this is called a phantom read.

CREATE TABLE `user` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(255) DEFAULT NULL,
  `age` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB ;

INSERT into user VALUES (1,'1',20),(5,'5',20),(15,'15',30),(20,'20',30);

Assume the following business scenario:

| Time | Transaction 1 | Transaction 2 |
| --- | --- | --- |
| | begin; | |
| T1 | select * from user where age = 20; (2 rows) | |
| T2 | | insert into user values(25,'25',20); commit; |
| T3 | select * from user where age = 20; (2 rows) | |
| T4 | update user set name='00' where age = 20; (3 rows affected) | |
| T5 | select * from user where age = 20; (3 rows) | |

The execution process is as follows:

1. At T1, transaction 1 reads the rows with age 20 and gets 2 records.

2. At T2, transaction 2 inserts a new record whose age is also 20 and commits.

3. At T3, transaction 1 reads the rows with age 20 again and still gets 2 records. The row inserted by transaction 2 does not affect the reads of transaction 1.

4. At T4, transaction 1 updates the rows with age 20, and 3 rows are affected.

5. At T5, transaction 1 reads the rows with age 20 again and gets 3 records. The third record is the row inserted by transaction 2, so a phantom read has occurred.

This raises a question: why is the phantom read problem not solved in this scenario?

The answer lies in the difference between snapshot reads and current reads. A plain select * from … where … is a snapshot read and takes no locks, while select … for update, select … lock in share mode, update, and delete are current reads. If a transaction uses only snapshot reads, phantom reads do not occur. However, phantom reads can occur when snapshot reads and current reads are mixed in the same transaction.

If all reads are current reads, how is the phantom read problem solved?

truncate table user;
INSERT into user VALUES (1,'1',20),(5,'5',20),(15,'15',30),(20,'20',30);

| Time | Transaction 1 | Transaction 2 |
| --- | --- | --- |
| | begin; | |
| T1 | select * from user where age = 20 for update; | |
| T2 | | insert into user values(25,'25',20); (blocks, waiting for the lock) |
| T3 | select * from user where age = 20 for update; | |

Here transaction 2 is blocked and cannot complete until transaction 1 commits. In essence, InnoDB uses the gap lock mechanism (gap locks combined with record locks as next-key locks) to solve the phantom read problem for current reads.
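To observe the blocking without waiting indefinitely, the lock wait timeout of the second session can be shortened. The following is a sketch for the scenario above; the value of 5 seconds is an arbitrary choice for illustration.

```sql
-- Session 2, while transaction 1 still holds its locks.
-- Wait at most 5 seconds for a row lock instead of the default 50.
SET SESSION innodb_lock_wait_timeout = 5;

INSERT INTO user VALUES (25, '25', 20);
-- If transaction 1 has not committed within the timeout, this statement is
-- expected to fail with error 1205 (lock wait timeout exceeded).
```

Once transaction 1 commits or rolls back, the same insert goes through.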

4. How do SQL joins work?

Traditionally MySQL supports only one join algorithm, Nested-Loop Join, and does not support merge joins; hash joins were only added in MySQL 8.0.18. However, MySQL implements several variants of Nested-Loop Join that improve join efficiency.

1、Simple Nested-Loop Join

This algorithm is relatively simple. Take row R1 from the driving table R and compare it with every row of table S, then do the same for R2, R3, and so on until all rows of R have been matched, and then merge the results. Although the algorithm is simple, it has to scan table S once for every row of R (RN times in total), so the overhead is relatively high.

2、Index Nested-Loop Join

In an Index Nested-Loop Join, the non-driving table has an index on the join column, so rows no longer need to be compared one by one. The index reduces the number of comparisons and speeds up the query. This is one of the main reasons why join columns are usually required to be indexed.

With this algorithm, each row of the driving table is looked up through the index on the join column of the non-driving table. Only when a matching entry is found in the index does MySQL go back to the table to fetch the row. As for the choice of driving table, the MySQL optimizer generally picks the table with fewer rows, although for particularly complex SQL a poor choice cannot be ruled out.

If the join key of the non-driving table is its primary key, performance is very high. If it is not the primary key and the join returns many rows, efficiency is lower because multiple table lookups are needed: the secondary index is searched first, and then the row is fetched using the primary key ID stored in the secondary index. In this case performance is relatively poor.
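The effect of an index on the join column can be checked with EXPLAIN. This is a small sketch; the tables dept and emp and the index idx_dept_id are hypothetical names used only for illustration, and the actual plan depends on table statistics.

```sql
-- Hypothetical tables, for illustration only.
CREATE TABLE dept (
  id   INT NOT NULL PRIMARY KEY,
  name VARCHAR(50)
) ENGINE=InnoDB;

CREATE TABLE emp (
  id      INT NOT NULL PRIMARY KEY,
  dept_id INT NOT NULL,
  name    VARCHAR(50),
  KEY idx_dept_id (dept_id)   -- secondary index on the join column
) ENGINE=InnoDB;

-- Join on an indexed column. In the EXPLAIN output, an access type of
-- "ref" or "eq_ref" on the non-driving table indicates one index lookup
-- per row of the driving table.
EXPLAIN
SELECT d.name, e.name
FROM dept d
JOIN emp e ON e.dept_id = d.id;
```

If emp is chosen as the non-driving table, the lookup goes through idx_dept_id and then back to the clustered index to fetch e.name, which is the table lookup described above.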

3、Block Nested-Loop Join

When an index is available, MySQL tries to use the Index Nested-Loop Join algorithm. When the join column has no index, MySQL does not fall back to the Simple Nested-Loop Join introduced first; it prefers the Block Nested-Loop Join algorithm instead (replaced by hash join from MySQL 8.0.20 on).

Compared with Simple Nested-Loop Join, Block Nested-Loop Join adds an intermediate step: the join buffer. The columns of the driving table that the query needs are buffered in the join buffer, and the non-driving table is then compared against the buffered rows in a batch. Many comparisons are thus merged into one pass, which reduces how often the non-driving table is accessed. If all buffered rows fit into the join buffer, table S only needs to be scanned once. The join buffer is used only in this situation, that is, when the join cannot use an index.

In MySQL, the size of the join buffer is set with the join_buffer_size parameter. The default is join_buffer_size=256K. During the search, MySQL caches all the columns it needs in the join buffer, including the selected columns, not just the join columns. A statement that joins N tables can allocate up to N-1 join buffers during execution.
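The following sketch shows how the join buffer setting can be inspected and how its use shows up in a plan. The tables t1 and t2 are hypothetical, and the exact wording of the Extra column depends on the MySQL version.

```sql
-- Show the current join buffer size in bytes (262144 = 256K by default).
SHOW VARIABLES LIKE 'join_buffer_size';

-- Raise it to 1 MB for the current session only.
SET SESSION join_buffer_size = 1048576;

-- Hypothetical tables whose join column has no index.
CREATE TABLE t1 (a INT, b INT) ENGINE=InnoDB;
CREATE TABLE t2 (a INT, c INT) ENGINE=InnoDB;

-- With no usable index on the join column, the Extra column of the plan is
-- expected to mention the join buffer: "Using join buffer (Block Nested Loop)"
-- in older versions, or "Using join buffer (hash join)" from MySQL 8.0.20 on.
EXPLAIN SELECT * FROM t1 JOIN t2 ON t1.a = t2.a;
```

Adding an index on the join column is usually a better fix than enlarging the buffer, since it lets the optimizer switch to an Index Nested-Loop Join.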

5. Explain how database indexes work: the underlying data structure, what is stored in the leaf nodes, and when an index fails to be used.

See the video for the implementation principle of indexes, the underlying data structure, and the data stored in the leaf nodes.

Cases in which an index is not used:

1. The query on a composite index does not follow the leftmost-prefix rule.

2. A range condition (<, >, like) on an earlier column of a composite index prevents the columns after it from being used.

3. An operation is applied to the indexed column (calculation, function, type conversion).

4. is null and is not null conditions may prevent the index from being used, depending on the data distribution.

5. Use the or operator sparingly: if any of the columns joined by or has no index, the index is not used.

6. A string column is compared with a value that is not in quotation marks, which triggers an implicit type conversion.

7. The columns used to join two tables differ in type, length, or character set.

8. A like pattern starts with %.

9. If MySQL estimates that a full table scan is faster than using the index, the index is not used.
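Several of these cases can be reproduced with EXPLAIN. The sketch below uses a hypothetical table staff with hypothetical indexes idx_name_age and idx_phone; the optimizer's actual choice also depends on the data, so the comments describe the typical outcome rather than a guaranteed one.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE staff (
  id    INT NOT NULL PRIMARY KEY,
  name  VARCHAR(50),
  age   INT,
  phone VARCHAR(20),
  KEY idx_name_age (name, age),
  KEY idx_phone (phone)
) ENGINE=InnoDB;

-- Can use idx_name_age: the leftmost column appears in the condition.
EXPLAIN SELECT * FROM staff WHERE name = 'Tom' AND age = 20;

-- Case 1: the leftmost column (name) is missing from the condition.
EXPLAIN SELECT * FROM staff WHERE age = 20;

-- Case 3: a function is applied to the indexed column.
EXPLAIN SELECT * FROM staff WHERE LEFT(name, 1) = 'T';

-- Case 6: a string column is compared with an unquoted number,
-- so every phone value has to be converted before comparison.
EXPLAIN SELECT * FROM staff WHERE phone = 13800000000;

-- Case 8: the pattern starts with %.
EXPLAIN SELECT * FROM staff WHERE name LIKE '%om';

-- For comparison, a prefix pattern can use the index as a range scan.
EXPLAIN SELECT * FROM staff WHERE name LIKE 'To%';
```

In the EXPLAIN output, a type of ALL with a NULL key indicates a full table scan, while ref or range with a non-NULL key indicates that the index is used.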

6. How does MySQL shard databases and tables?

Use middleware such as Mycat or ShardingSphere to shard databases and tables. Choose the appropriate middleware and strategy: horizontal database sharding, horizontal table sharding, vertical database sharding, or vertical table sharding.

When sharding databases and tables, try to follow these principles:

1. Do not shard unless you have to;

2. If you must shard, choose appropriate sharding rules and plan them in advance;

3. Reduce the need for cross-database joins as much as possible through data redundancy or table grouping;

4. Because it is hard for database middleware to make good choices when implementing joins across shards, and high performance is extremely difficult to achieve, business queries should use multi-table joins as little as possible.

7. What storage engines are there?

You can list the storage engines supported by your database with show engines.
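As a short sketch, the same information is also available as a queryable table in information_schema, which makes it easy to filter:

```sql
-- List all storage engines and whether each one is available.
SHOW ENGINES;

-- The same information as a table. SUPPORT is DEFAULT for the engine that
-- is used when CREATE TABLE does not name one.
SELECT ENGINE, SUPPORT, TRANSACTIONS
FROM information_schema.ENGINES;
```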

8. Describe the difference between InnoDB and MyISAM.

| Difference | InnoDB | MyISAM |
| --- | --- | --- |
| Transactions | Supported | Not supported |
| Foreign keys | Supported | Not supported |
| Indexes | Clustered and non-clustered indexes | Non-clustered indexes only |
| Row locks | Supported | Not supported |
| Table locks | Supported | Supported |
| Storage files | frm, ibd | frm, myi, myd |
| Row count | The table must be scanned each time to count the rows | The row count is stored in a variable (only usable for queries without conditions) |

How to choose?

1. Do you need transactions? If so, choose InnoDB. If not, MyISAM is an option.

2. If most requests on the table are reads, you can consider MyISAM. If there are both reads and writes, use InnoDB.

The default storage engine of MySQL is now InnoDB, and InnoDB is the recommended choice.
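To check which engine existing tables use, and to move a table to InnoDB, something like the following can be used. The table legacy_log is a hypothetical example.

```sql
-- Hypothetical MyISAM table, for illustration only.
CREATE TABLE legacy_log (
  id  INT NOT NULL PRIMARY KEY,
  msg VARCHAR(255)
) ENGINE=MyISAM;

-- Show which engine each table in the current schema uses.
SELECT TABLE_NAME, ENGINE
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE();

-- Convert the table to InnoDB. This rebuilds the table, so it can take
-- a while on large tables.
ALTER TABLE legacy_log ENGINE = InnoDB;
```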

9. Describe the difference between a clustered index and a non-clustered index.

When InnoDB stores data, the rows must be bound to an index. By default this is the primary key. If there is no primary key, a unique key is chosen. If there is no unique key either, a hidden 6-byte rowid is generated. An index that is stored together with the row data is called a clustered index, and an index that is stored separately from the data is called a non-clustered index.

The InnoDB storage engine has both clustered and non-clustered indexes, while the MyISAM storage engine has only non-clustered indexes.
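The difference shows up in how a query through a secondary index is answered. The following is a sketch with a hypothetical table person and index idx_age; whether the optimizer actually picks idx_age depends on the data.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE person (
  id   INT NOT NULL PRIMARY KEY,  -- clustered index: leaf nodes hold the full rows
  name VARCHAR(50),
  age  INT,
  KEY idx_age (age)               -- secondary index: leaf nodes hold age and id
) ENGINE=InnoDB;

-- Needs name, which is not stored in idx_age: after finding the entries in
-- idx_age, InnoDB looks each row up again in the clustered index by id.
EXPLAIN SELECT name FROM person WHERE age = 20;

-- Needs only age and id, both stored in idx_age, so the query can be
-- answered from the secondary index alone (EXPLAIN usually shows
-- "Using index" in the Extra column).
EXPLAIN SELECT id, age FROM person WHERE age = 20;
```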

10. What are the isolation levels of transactions, and what problems do they solve?

See question 1.

11. Describe the principle of the MySQL master-slave replication mechanism. What are the main modes of MySQL master-slave replication?

See the MySQL master-slave replication principle document.

12. How do you optimize SQL? What key data should you look at in the execution plan?

See the execution plan document.

13. Why does MySQL choose the B+ tree as its storage structure? Why not a hash table, a binary tree, or a red-black tree?

See question 5.

14. Describe optimistic locking and pessimistic locking in MySQL. What types of locks are there?

Optimistic locking is not built into the database. If you need it, you have to implement it yourself. The usual approach is to add a version field to the table. Each update increments the version by 1, and before the change is committed the update checks whether the version still matches the one that was read.
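The version check is usually folded into the update statement itself. This is a minimal sketch; the table account and its columns are hypothetical.

```sql
-- Hypothetical table with a version column, for illustration only.
CREATE TABLE account (
  id      INT NOT NULL PRIMARY KEY,
  balance INT NOT NULL,
  version INT NOT NULL DEFAULT 0
) ENGINE=InnoDB;

INSERT INTO account VALUES (1, 100, 0);

-- Step 1: read the row and remember its version (0 here).
SELECT balance, version FROM account WHERE id = 1;

-- Step 2: update only if the version is still the one that was read,
-- and increment the version in the same statement.
UPDATE account
SET balance = 80, version = version + 1
WHERE id = 1 AND version = 0;

-- Step 3: check the number of affected rows. 1 means the update succeeded.
-- 0 means another transaction changed the row first, so the application
-- should re-read the row and retry.
SELECT ROW_COUNT();
```

Putting the version comparison in the WHERE clause makes the check and the write a single atomic statement, so no explicit lock is needed.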

Most locks in MySQL are pessimistic locks. By granularity they are divided into row locks and table locks:

Row locks:

Shared lock (S): when reading a row, an S lock is added to prevent others from modifying it

Exclusive lock (X): when modifying a row, an X lock is added to prevent others from modifying it at the same time

| | X | S |
| --- | --- | --- |
| X | Not compatible | Not compatible |
| S | Not compatible | Compatible |

Record lock: a lock on an index record

Gap lock: locks the gap between index records; used at the repeatable read isolation level and above

Next-key lock: record lock + gap lock

Table locks:

Intention lock: before locking a row, a transaction must first acquire the corresponding intention lock on the table. It is divided into intention shared locks and intention exclusive locks.

Auto-increment lock: a special table-level lock used for auto-increment fields

Meaning of the lock modes:

IX: intention exclusive lock

X: locks the record itself and the gap before the record

S: locks the record itself and the gap before the record

X, REC_NOT_GAP: locks only the record itself

S, REC_NOT_GAP: locks only the record itself

X, GAP: gap lock, does not lock the record itself

S, GAP: gap lock, does not lock the record itself

X, GAP, INSERT_INTENTION: insert intention lock
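These lock modes are what the LOCK_MODE column reports when locks are inspected. As a sketch, assuming MySQL 8.0 with the performance schema enabled, the locks held or requested by running transactions can be listed like this:

```sql
-- List the locks currently held or requested by InnoDB transactions.
-- LOCK_TYPE is TABLE or RECORD; LOCK_STATUS is GRANTED or WAITING.
SELECT ENGINE_TRANSACTION_ID, OBJECT_NAME, INDEX_NAME,
       LOCK_TYPE, LOCK_MODE, LOCK_STATUS, LOCK_DATA
FROM performance_schema.data_locks;
```

Running this from a third session while one transaction holds a select … for update and another is blocked on an insert shows which lock the blocked transaction is waiting for.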

15. How are the atomicity and durability of MySQL guaranteed?

Atomicity is achieved through the undo log, and durability is achieved through the redo log.


Origin blog.csdn.net/qq_31536117/article/details/135029611