MySQL interview questions and answers

1. What is MySQL?

MySQL is a relational database, which stores data directly on the hard disk. Its counterpart is the NoSQL (non-relational) database, with Redis as a representative example.

2. The difference between MyISAM and InnoDB

MyISAM was the default storage engine before MySQL 5.5, and InnoDB has been the default since 5.5. The differences between the two are as follows:

a. Row-level locking (locking a single row of a table so that other transactions cannot modify or delete it; finer granularity, higher concurrency): MyISAM supports only table-level locks, while InnoDB supports both row-level and table-level locks and uses row-level locks by default.

b. Transactions and safe recovery after a crash: MyISAM emphasizes performance, and each query is atomic, but it does not support transactions. InnoDB provides transactions, foreign keys, crash recovery, and other capabilities.

c. Foreign keys (a field in one table used to reference rows of another table): MyISAM does not support them; InnoDB does.

d. MVCC (multi-version concurrency control): supported only by InnoDB. A sketch of choosing an engine follows.
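The engine is chosen per table when it is created. A minimal sketch (the orders and logs tables are illustrative):

    -- InnoDB (the default since MySQL 5.5): transactions, row-level locks, foreign keys
    CREATE TABLE orders (
        id     BIGINT PRIMARY KEY AUTO_INCREMENT,
        status VARCHAR(20) NOT NULL
    ) ENGINE = InnoDB;

    -- MyISAM: table-level locks only, no transactions
    CREATE TABLE logs (
        id      BIGINT PRIMARY KEY AUTO_INCREMENT,
        message TEXT
    ) ENGINE = MyISAM;

    -- Check which engine a table uses
    SHOW TABLE STATUS LIKE 'orders';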

3. What is an index? What are common indexes? 

An index is a data structure that speeds up lookups on one or more columns, at the cost of extra storage and slower writes. The following simple example shows how to create an index in the database:
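A minimal sketch, assuming a hypothetical users table:

    -- Create the table first
    CREATE TABLE users (
        id    BIGINT PRIMARY KEY AUTO_INCREMENT,
        email VARCHAR(255) NOT NULL,
        name  VARCHAR(100)
    ) ENGINE = InnoDB;

    -- Add a single-column index to the existing table
    CREATE INDEX idx_email ON users (email);

    -- Add a composite (multi-column) index
    ALTER TABLE users ADD INDEX idx_name_email (name, email);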

Common indexes, classified by the data structure they use:

a. BTree index:

MySQL's BTree index is actually implemented as a B+ tree, a variant of the B-tree; however, the two engines (MyISAM and InnoDB) store it differently:

MyISAM: the data field of each B+Tree leaf node stores the address of the data record (a non-clustered index). A lookup first searches the index with the B+Tree search algorithm; if the key exists, the value of its data field is taken and used as the address to read the corresponding data record.

InnoDB: its data files are themselves index files. Unlike MyISAM, where index files and data files are separate, the InnoDB table data file is itself an index structure organized as a B+Tree, and the data fields of the leaf nodes store the complete data records. The key of this index is the table's primary key, so the InnoDB data file itself is the primary index. This is called a "clustered index". All other indexes are secondary (auxiliary) indexes, whose data fields store the primary key of the corresponding record rather than its address; this also differs from MyISAM.

When searching by the primary index, you can go directly to the node containing the key and retrieve the data; when searching by a secondary index, you first retrieve the primary key value and then look it up through the primary index. For this reason, when designing tables it is not recommended to use overly long fields as primary keys, nor non-monotonic fields, since both cause frequent splits in the primary index. The two lookup paths are illustrated below.
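A sketch of the two lookup paths, assuming the hypothetical users table above with its idx_email index:

    -- Primary-index lookup: the clustered-index leaf node holds the full row
    SELECT * FROM users WHERE id = 42;

    -- Secondary-index lookup: walk idx_email to find the primary key,
    -- then walk the clustered index to fetch the row ("back to the table")
    SELECT * FROM users WHERE email = 'a@example.com';

    -- Covering query: every secondary-index entry already stores the primary
    -- key, so this can be answered from idx_email alone
    SELECT id FROM users WHERE email = 'a@example.com';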

b. Hash index:

For a hash index, the underlying data structure is a hash table. When the workload consists mostly of single-record equality lookups, a hash index offers the fastest queries; for most other scenarios (range scans, sorting), a BTree index is recommended. A sketch follows.
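A minimal sketch. Note that InnoDB silently ignores an explicit USING HASH and builds a BTree anyway, so the example uses the MEMORY engine, which does support hash indexes:

    -- MEMORY tables use hash indexes by default; USING HASH makes it explicit
    CREATE TABLE session_cache (
        session_id CHAR(32) NOT NULL,
        user_id    BIGINT NOT NULL,
        PRIMARY KEY (session_id) USING HASH
    ) ENGINE = MEMORY;

    -- Fast: a single-record equality lookup
    SELECT user_id FROM session_cache WHERE session_id = 'abc123';

    -- Not served by the hash index: range queries need a BTree
    SELECT * FROM session_cache WHERE session_id > 'abc';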
4. What problems do concurrent transactions pose?
1. Dirty read:
A transaction modifies some data but has not yet committed the change to the database. If another transaction reads that data and then uses it, it has read uncommitted ("dirty") data, and the operations based on it are naturally incorrect.
2. Lost modification:
When one transaction reads a piece of data, another transaction accesses it as well. If the first transaction modifies the data and the other transaction then modifies it too, the first modification is overwritten and lost. This is a "lost modification" (lost update).
3. Non-repeatable read:
A transaction reads the same data multiple times. Before it finishes, another transaction accesses and modifies that data, so the first transaction's two reads return different results. This causes a "non-repeatable read".
4. Phantom read:
A transaction reads a set of rows, and another transaction then inserts rows matching the same condition. When the first transaction reads again, it finds records that did not exist before, as if an illusion had occurred; this is called a "phantom read". A two-session sketch of the first of these problems follows.
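A two-session sketch of a dirty read, assuming a hypothetical accounts table; comments mark which session runs each statement:

    -- Session B: deliberately allow dirty reads
    SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

    -- Session A:
    START TRANSACTION;
    UPDATE accounts SET balance = balance - 100 WHERE id = 1;  -- not yet committed

    -- Session B: sees the uncommitted balance -- a dirty read
    SELECT balance FROM accounts WHERE id = 1;

    -- Session A: the change is undone; the value Session B read never officially existed
    ROLLBACK;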
5. What are the isolation levels of transactions? What is the default isolation level for MySQL?
1. Read uncommitted (rarely used):
The lowest isolation level; it allows reading uncommitted data changes, which may cause dirty reads, phantom reads, and non-repeatable reads.
2. Read committed:
Allows reading only data that concurrent transactions have already committed. This prevents dirty reads, but phantom reads and non-repeatable reads may still occur.
3. Repeatable read (MySQL's default level):
Multiple reads of the same field within a transaction return consistent results, unless the data is modified by that transaction itself. This prevents dirty reads and non-repeatable reads, but phantom reads may still occur.
4. Serializable:
The highest isolation level, fully compliant with the ACID isolation requirements. Transactions execute one after another, so they cannot interfere with each other; this level prevents dirty reads, non-repeatable reads, and phantom reads. (All read operations under serializable acquire shared locks, which can degrade performance, so in practice choose the isolation level that fits the situation.) Checking and changing the level is sketched below.
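A minimal sketch of inspecting and changing the level (the system variable is named transaction_isolation in MySQL 8.0; versions before 5.7.20 call it tx_isolation):

    -- Check the current level; expect REPEATABLE-READ by default
    SELECT @@transaction_isolation;

    -- Change it for the current session only
    SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;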
6. Lock mechanism and InnoDB lock algorithms
In a database, the lock mechanism controls how data is accessed concurrently, ensuring the correctness and consistency of transactions. Several lock algorithms are used to implement this mechanism.
InnoDB locks are mainly divided into two modes: shared locks and exclusive locks.
Shared lock: allows multiple transactions to read the same data at the same time, but none of them may write it.
Exclusive lock: only the transaction holding the lock may read and write the data; other transactions cannot acquire any lock on it.
InnoDB adopts a two-phase locking protocol, automatically placing the appropriate shared or exclusive locks on the relevant data while a transaction executes. The following three lock algorithms are implemented:
a. Record (row-level) locks: place a shared or exclusive lock on a single row, ensuring isolation between transactions.
b. Gap locks: lock the gap between index records in a range, so other transactions cannot insert new rows into that range even though those rows do not yet exist.
c. Insert intention locks: a form of gap lock taken before an INSERT to announce the intent to insert into a gap, which helps transactions coordinate their access order and avoid deadlocks.

Suppose there is a bank account system in which multiple users can query accounts, transfer money, and make deposits at the same time. The following are metaphors for row-level locks, gap locks, and insert intention locks in this scenario:

  • Row-level lock: each account is kept in its own cabinet. A user locks only the cabinet they need to operate on, and other users cannot modify or delete its contents until that user finishes the operation and releases the cabinet.

  • Gap lock: imagine an empty locker between every pair of accounts. If a user locks the lockers around an account, other users cannot take the empty lockers in front of or behind it, even though no account actually occupies them. Likewise, when a transaction places a gap lock on a range of data rows, other transactions cannot insert new rows into that range, even though those rows do not actually exist. In the bank account system, this means that while a user is transferring to or depositing into an account, other users cannot insert new records immediately before or after that account, although they can still query it.

  • Insert intention lock: like a bank recording, when it processes a transfer request, the intent that a transaction may modify an account at a certain position. Other transactions can continue to read and modify existing data, but cannot insert new records before or after that position, and the intention lock persists until the transaction commits or rolls back. This kind of lock helps avoid deadlocks, because it lets transactions coordinate the order in which they access data. In the bank account system, a user who intends to deposit or transfer to an account requests an insert intention lock to announce the coming modification; other users can continue to query and operate, but cannot insert at that position until the user finishes and the lock is released. Explicit locking reads are sketched below.
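A minimal sketch of taking row locks explicitly, again assuming the hypothetical accounts table (FOR SHARE is MySQL 8.0 syntax; older versions use LOCK IN SHARE MODE):

    START TRANSACTION;

    -- Shared lock: other transactions may also read-lock this row, but none may write it
    SELECT balance FROM accounts WHERE id = 1 FOR SHARE;

    -- Exclusive lock: blocks all other locks on this row until commit or rollback
    SELECT balance FROM accounts WHERE id = 2 FOR UPDATE;

    UPDATE accounts SET balance = balance - 100 WHERE id = 2;
    COMMIT;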

7. Large table optimization
1. First, limit the range of data queried: constrain every query (for example, with a WHERE condition on an indexed column) rather than scanning the whole table.
2. Separate database reads and writes: the primary (master) library handles writes while the replica (slave) libraries handle reads.
3. Vertical partitioning: split a table with many columns into multiple tables, dividing by column.
4. Horizontal partitioning: keep the table structure unchanged and use some strategy to split the data into fragments, so that each piece is dispersed into a different table or database, achieving distribution; horizontal partitioning can support very large data volumes, as sketched below.
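A minimal sketch of horizontal partitioning within a single server, using MySQL's native PARTITION BY (sharding across multiple databases additionally requires routing logic in the application or middleware):

    CREATE TABLE orders_sharded (
        id      BIGINT NOT NULL,
        user_id BIGINT NOT NULL,
        amount  DECIMAL(10, 2),
        PRIMARY KEY (id)      -- the partitioning column must appear in every unique key
    ) ENGINE = InnoDB
    PARTITION BY HASH (id)
    PARTITIONS 4;             -- rows are spread across 4 physical partitions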
8. How to handle the id primary key after sharding (splitting databases and tables)?
1. After splitting into multiple tables, each table's auto-increment counter starts again from 1, so ids collide across tables; a globally unique id is therefore needed.
2. The id primary key is generally handled in one of these ways: a. a database auto-increment id (sketched below); b. using Redis to generate ids.
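A minimal sketch of approach a, the classic single-row "ticket table" whose auto-increment counter hands out globally unique ids (table and column names are illustrative):

    CREATE TABLE id_generator (
        id   BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
        stub CHAR(1) NOT NULL UNIQUE
    ) ENGINE = InnoDB;

    -- REPLACE keeps the table at one row while still advancing the counter
    REPLACE INTO id_generator (stub) VALUES ('a');
    SELECT LAST_INSERT_ID();  -- the new globally unique id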
9. How a SQL statement is executed in MySQL
1. Lexical analyzer: splits the SQL string into individual tokens, such as keywords, table names, and column names.
2. Syntax analyzer: checks whether the token stream conforms to the grammar rules and generates a parse tree.
3. Optimizer: optimizes the query and selects the best execution plan, weighing strategies such as index usage, table join order, and WHERE-condition rewriting.
4. Executor: executes the statement according to the plan produced by the optimizer, reading rows, filtering those that match the query, and processing them.
5. Return the result set: the executor returns the result set to the client. The chosen plan can be inspected as shown below.
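A minimal sketch of inspecting the plan the optimizer chose, reusing the hypothetical users table and its idx_email index:

    -- Shows the access type, the chosen index, and the estimated rows examined
    EXPLAIN SELECT id FROM users WHERE email = 'a@example.com';

    -- EXPLAIN ANALYZE (MySQL 8.0.18+) also runs the query and reports actual timings
    EXPLAIN ANALYZE SELECT id FROM users WHERE email = 'a@example.com';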
