[Interview] What is the difference between InnoDB and MyISAM in MySQL?

Foreword

Many developers use MySQL as their database, but most of their experience is with SQL statements and ORM code, and they know little about the underlying implementation. For example, they may not be clear on what InnoDB and MyISAM, the subject of the question above, actually are. Questions like this come up frequently in interviews at large companies (such as Tencent), so it is important to understand them correctly.

In fact, InnoDB and MyISAM are two "storage engines" of MySQL.

1. Database storage engine

A storage engine is the underlying software component of a database. The database management system (DBMS) uses the storage engine to create, query, update, and delete data. Different storage engines provide different storage mechanisms, indexing techniques, locking levels, and other features, so choosing a particular engine gives you its particular capabilities.

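2. How do you know which engine your database uses?

SHOW ENGINES;

This statement lists the storage engines the server supports and marks the default one. To see which engine a particular table uses, run SHOW TABLE STATUS or SHOW CREATE TABLE for that table.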

3. How the storage engines work

First, the answer to a question that often comes up in interviews, "What data structure do the MyISAM and InnoDB engines use for their indexes?":

Both use B+ trees; the difference is what the tree stores:

  • In MyISAM, the B+ tree stores the address of the actual data. The index is kept separate from the data and only points to it. This type of index is called a nonclustered index.
  • In InnoDB, the B+ tree stores the actual data itself. This type of index is called a clustered index.

4. B tree and B+ tree

So what is a B+ tree?
A B+ tree is a variant of the B tree, so start with the B tree:

The B tree is a multi-way tree, also known as a balanced multi-way search tree. Its rules are:

  • Keys within every node are arranged in ascending order, following the rule that smaller keys go to the left and larger keys to the right
  • Number of child nodes: a non-leaf node has more than 1 and at most M child nodes, where M>=2, except in an empty tree; non-root internal nodes have at least ceil(M/2) (Note: the order M is the maximum number of search paths out of a node; M=2 gives a binary tree and M=3 a 3-way tree)
  • Number of keys: a branch node holds at least ceil(M/2)-1 and at most M-1 keys (Note: ceil() rounds toward positive infinity, so ceil(1.1) is 2)
  • Leaf nodes have null child pointers, and all leaf nodes are at the same depth
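The key-count rule can be checked with a few lines of Python. This is only a sketch of the arithmetic in the rules above; the function name btree_node_bounds is made up for illustration and is not part of MySQL.

```python
import math


def btree_node_bounds(m: int) -> dict:
    """Key and child limits for a branch node of an order-m B tree."""
    if m < 2:
        raise ValueError("order M must be at least 2")
    return {
        "min_keys": math.ceil(m / 2) - 1,  # lower bound from the rules above
        "max_keys": m - 1,                 # one key fewer than the child limit
        "max_children": m,                 # at most M search paths per node
    }


# Print the limits for a few small orders.
for order in (3, 4, 5):
    print(order, btree_node_bounds(order))
```

For M=5, for example, a branch node holds between 2 and 4 keys and has at most 5 children.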

And for B+ tree:

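  • The B+ tree is an enhanced version of the B tree. Internal nodes hold only keys, so each node fits more of them, and the records (or pointers to them) are all kept in the leaf nodes, which are linked in key order. Every lookup therefore travels from the root to a leaf, which makes query time more stable, and the search cost is close to that of a binary search.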
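To make the idea concrete, here is a toy two-level B+ tree in Python: one root that holds only separator keys, and a chain of linked leaves that hold the values. It is a simplified sketch, not how MySQL implements its indexes; all names (Leaf, build_two_level_tree, and so on) are hypothetical, and the tree is built once from a non-empty list instead of supporting inserts and node splits.

```python
from bisect import bisect_left, bisect_right


class Leaf:
    def __init__(self, keys, values):
        self.keys = keys      # sorted keys stored in this leaf
        self.values = values  # payloads, parallel to keys
        self.next = None      # link to the right sibling leaf


def build_two_level_tree(items, leaf_size):
    """Build a root (list of separator keys) plus linked leaves from (key, value) pairs."""
    items = sorted(items)
    leaves = []
    for i in range(0, len(items), leaf_size):
        chunk = items[i:i + leaf_size]
        leaves.append(Leaf([k for k, _ in chunk], [v for _, v in chunk]))
    for left, right in zip(leaves, leaves[1:]):
        left.next = right
    # Separator i is the smallest key of leaf i + 1; the root stores keys only.
    separators = [leaf.keys[0] for leaf in leaves[1:]]
    return separators, leaves


def find_leaf(separators, leaves, key):
    """Descend from the root to the single leaf that could contain key."""
    return leaves[bisect_right(separators, key)]


def point_lookup(separators, leaves, key):
    leaf = find_leaf(separators, leaves, key)
    i = bisect_left(leaf.keys, key)  # binary search inside the leaf
    if i < len(leaf.keys) and leaf.keys[i] == key:
        return leaf.values[i]
    return None


def range_scan(separators, leaves, low, high):
    """Collect pairs with low <= key <= high by walking the leaf chain."""
    leaf = find_leaf(separators, leaves, low)
    result = []
    while leaf is not None:
        for k, v in zip(leaf.keys, leaf.values):
            if k > high:
                return result
            if k >= low:
                result.append((k, v))
        leaf = leaf.next
    return result


tree = build_two_level_tree([(k, f"row{k}") for k in range(1, 11)], leaf_size=3)
print(point_lookup(*tree, 5))
print(range_scan(*tree, 3, 8))
```

Every point lookup touches the root and exactly one leaf, which is why query time is stable, and a range query only needs one descent followed by a walk along the leaf links.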

5. MyISAM

Back to MyISAM. Because a MyISAM index file stores only the addresses of the data records, the leaf nodes of its B+ tree hold addresses rather than rows. In MyISAM there is no structural difference between the primary index and a secondary index (secondary key).
[Figure: MyISAM index structure]
To retrieve a row, MyISAM first searches the index using the B+Tree search algorithm. If the specified key exists, it takes the value of the key's data field and uses that value as the address from which to read the corresponding data record.
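The two-step retrieval described above (search the index, then read the record at the stored address) can be imitated in Python. This is a rough model under stated assumptions: an in-memory buffer stands in for the data file, a sorted list stands in for the B+ tree, and the fixed-width record layout and all function names are invented for the example.

```python
import io
import struct
from bisect import bisect_left

# Hypothetical fixed-width row: 4-byte integer id + 16-byte name.
RECORD = struct.Struct("<i16s")


def write_rows(rows):
    """Write rows to a 'data file' and build two indexes of (key, offset) pairs."""
    data = io.BytesIO()
    id_index, name_index = [], []
    for row_id, name in rows:
        offset = data.tell()  # address of this record in the data file
        data.write(RECORD.pack(row_id, name.encode("utf-8")))
        id_index.append((row_id, offset))
        name_index.append((name, offset))
    return data, sorted(id_index), sorted(name_index)


def lookup(data, index, key):
    """Search the index for key, then read the record at the stored address."""
    keys = [k for k, _ in index]
    i = bisect_left(keys, key)
    if i == len(keys) or keys[i] != key:
        return None
    data.seek(index[i][1])
    row_id, raw_name = RECORD.unpack(data.read(RECORD.size))
    return row_id, raw_name.rstrip(b"\x00").decode("utf-8")


data, id_index, name_index = write_rows([(30, "carol"), (10, "alice"), (20, "bob")])
print(lookup(data, id_index, 20))
print(lookup(data, name_index, "carol"))
```

Note that the primary index (id_index) and the secondary index (name_index) have exactly the same shape here: both map a key to an address in the data file, and the same lookup function serves both.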

6. InnoDB

For InnoDB, the table data file itself is an index structure organized as a B+Tree, and the data field of each leaf node stores a complete data record.

Because InnoDB uses the table's primary key as the index key, the InnoDB data file itself is the primary index. And because the data file must be clustered by primary key, every table that uses InnoDB as its storage engine needs one. If you do not specify a primary key explicitly, MySQL uses the first unique index whose columns are all NOT NULL. If there is none, InnoDB generates a hidden field to cluster the table on: a 6-byte integer row ID.
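The selection order described above can be written down as a small decision function. The sketch below only models the rule; IndexDef, TableDef, and choose_clustered_key are hypothetical names for this example and do not correspond to any MySQL API.

```python
from dataclasses import dataclass, field


@dataclass
class IndexDef:
    name: str
    columns: list
    unique: bool = False
    primary: bool = False


@dataclass
class TableDef:
    not_null_columns: set
    indexes: list = field(default_factory=list)


def choose_clustered_key(table: TableDef) -> str:
    """Pick the index a table is clustered on, in the order described above."""
    # 1. An explicitly declared primary key always wins.
    for idx in table.indexes:
        if idx.primary:
            return idx.name
    # 2. Otherwise, the first unique index whose columns are all NOT NULL.
    for idx in table.indexes:
        if idx.unique and all(c in table.not_null_columns for c in idx.columns):
            return idx.name
    # 3. Otherwise, fall back to the hidden 6-byte row ID.
    return "hidden row ID"


users = TableDef(
    not_null_columns={"email"},
    indexes=[
        IndexDef("uk_phone", ["phone"], unique=True),  # phone may be NULL
        IndexDef("uk_email", ["email"], unique=True),
    ],
)
print(choose_clustered_key(users))
```

In this example uk_phone is skipped because its column allows NULL, so the table is clustered on uk_email.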

7. The difference between InnoDB and MyISAM

  • InnoDB supports transactions, but MyISAM does not. In InnoDB, autocommit is on by default, so each SQL statement is wrapped in its own transaction and committed automatically. This hurts speed, so it is best to put multiple SQL statements between begin and commit so that they form a single transaction;

  • InnoDB supports foreign keys, while MyISAM does not. Converting an InnoDB table that contains foreign keys to MyISAM will fail;

  • InnoDB uses a clustered index. The data file is bound to the index, so the table must have a primary key, and lookups through the primary key are very efficient. A secondary index, however, requires two lookups: the first finds the primary key, and the second uses the primary key to find the row. The primary key should therefore not be too large, because every secondary index stores it and grows with it. MyISAM uses a non-clustered index: the data file is separate from the index, and the index stores pointers into the data file. Its primary key index and secondary indexes are independent of one another;

  • InnoDB does not store the exact number of rows in a table, so select count(*) from table has to scan the table (or one of its indexes) to count them. MyISAM keeps the row count of the whole table in a variable, so the same statement only needs to read that variable and is very fast;

  • Before MySQL 5.6, InnoDB did not support full-text indexes while MyISAM did; since 5.6, InnoDB supports them as well. MyISAM can still be faster for simple, read-mostly queries.
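The "two lookups" cost of an InnoDB secondary index, mentioned in the list above, can be illustrated with a toy table in Python. This is a sketch under simplifying assumptions: sorted lists stand in for the B+ trees, and the class and method names (ToyClusteredTable, get_by_pk, get_by_name) are invented for the example.

```python
from bisect import bisect_left, insort


class ToyClusteredTable:
    def __init__(self):
        self.pk_keys = []     # sorted primary keys
        self.pk_rows = []     # full rows, parallel to pk_keys (the "leaf" holds the row)
        self.name_index = []  # secondary index: sorted (name, primary_key) pairs

    def insert(self, pk, name, balance):
        i = bisect_left(self.pk_keys, pk)
        if i < len(self.pk_keys) and self.pk_keys[i] == pk:
            raise ValueError("duplicate primary key")
        self.pk_keys.insert(i, pk)
        self.pk_rows.insert(i, {"id": pk, "name": name, "balance": balance})
        # The secondary index stores the primary key, not a row address.
        insort(self.name_index, (name, pk))

    def get_by_pk(self, pk):
        """One lookup: the clustered index leads straight to the row."""
        i = bisect_left(self.pk_keys, pk)
        if i < len(self.pk_keys) and self.pk_keys[i] == pk:
            return self.pk_rows[i]
        return None

    def get_by_name(self, name):
        """Two lookups: secondary index -> primary key -> clustered index -> row."""
        rows = []
        i = bisect_left(self.name_index, (name,))
        while i < len(self.name_index) and self.name_index[i][0] == name:
            pk = self.name_index[i][1]        # first lookup yields the primary key
            rows.append(self.get_by_pk(pk))   # second lookup fetches the row
            i += 1
        return rows


table = ToyClusteredTable()
table.insert(2, "bob", 50)
table.insert(1, "alice", 100)
print(table.get_by_pk(1))
print(table.get_by_name("bob"))
```

Because every entry in name_index carries a copy of the primary key, a wide primary key makes every secondary index wider too, which is the reason for keeping primary keys small.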

MyISAM compared with InnoDB, aspect by aspect:

Composition
  • MyISAM: Each MyISAM table is stored on disk as three files. Each file name begins with the table name, and the extension indicates the file type: .frm files store the table definition, .MYD (MYData) files store the data, and .MYI (MYIndex) files store the indexes.
  • InnoDB: The on-disk resources are the InnoDB tablespace data files and their log files. The size of an InnoDB table is limited only by the operating system's maximum file size (2GB on some older systems).

Transaction processing
  • MyISAM: Emphasizes performance and executes faster than InnoDB, but provides no transaction support.
  • InnoDB: Provides transactions, foreign keys, and other advanced database features.

SELECT, UPDATE, INSERT, and DELETE operations
  • MyISAM: If you mostly run SELECT statements, MyISAM is the better choice.
  • InnoDB: 1. If your workload performs a large number of INSERT or UPDATE operations, use InnoDB tables for performance reasons. 2. On DELETE FROM table, InnoDB does not recreate the table but deletes the rows one by one. 3. The LOAD TABLE FROM MASTER operation (found in older MySQL versions) does not work for InnoDB. The workaround is to change the InnoDB table to a MyISAM table, import the data, and then change it back to InnoDB, but this does not work for tables that use InnoDB-only features (such as foreign keys).

AUTO_INCREMENT handling
  • MyISAM: Each table has one AUTO_INCREMENT column that is handled internally. MyISAM updates this column automatically for INSERT and UPDATE operations, which makes AUTO_INCREMENT columns faster (by at least 10%). Once the value at the top of the sequence has been deleted, it cannot be reused. (When an AUTO_INCREMENT column is defined as the last column of a multicolumn index, values removed from the top of the sequence can be reused.) The AUTO_INCREMENT value can be reset with ALTER TABLE or myisamchk. In a MyISAM table, the AUTO_INCREMENT column can also be part of a composite index together with other columns.
  • InnoDB: The AUTO_INCREMENT column must be the first or only column of an index. If you specify an AUTO_INCREMENT column for a table, the InnoDB table handle in the data dictionary contains a counter, called the auto-increment counter, that is used to assign new values to the column. The auto-increment counter is stored only in main memory, not on disk (this is the behavior up to MySQL 5.7). For the algorithm behind this counter, see "How AUTO_INCREMENT columns work in InnoDB".

Row count
  • MyISAM: For select count(*) from table, MyISAM simply reads the saved row count. Note that when the count(*) statement includes a where condition, both engines do the same work.
  • InnoDB: Does not store the exact number of rows in a table, so select count(*) from table has to scan the table to count how many rows there are.

Locks
  • MyISAM: Table locks.
  • InnoDB: Provides row-level locking and non-locking consistent reads in SELECT statements, in the style of Oracle. However, InnoDB's row locking is not absolute: if MySQL cannot determine the range to scan when executing a SQL statement, InnoDB ends up locking the entire table as well, for example update table set num=1 where name like "%aaa%"

When selecting a storage engine, an appropriate storage engine should be selected according to the characteristics of the application system. For complex application systems, multiple storage engines can also be selected for combination according to the actual situation. The following are the usage environments of several commonly used storage engines.

  • InnoDB: MySQL's default storage engine, designed for transaction processing applications, with support for foreign keys. If the application has high requirements for transactional integrity, needs data consistency under concurrency, and performs many updates and deletes in addition to inserts and queries, InnoDB is the more suitable choice. It not only effectively reduces the locking caused by deletes and updates, but also guarantees complete commit and rollback of transactions. For systems that demand high data accuracy, such as billing or financial systems, InnoDB is the most suitable choice.
  • MyISAM: If the application mainly reads and inserts, with few updates and deletes, and has modest requirements for transactional integrity and concurrency, this storage engine is a very good fit.
  • MEMORY: Keeps all data in RAM and provides very fast access where records and similar data need to be located quickly. MEMORY has two drawbacks. First, table size is limited: a table that is too large cannot be held in memory. Second, the data does not survive a shutdown or crash, so you must make sure the table's contents can be restored after the database terminates abnormally. MEMORY tables are usually used for small tables that are not updated frequently, to get results quickly.
  • MERGE: Logically combines a series of identical MyISAM tables and refers to them as one object. The advantage of a MERGE table is that it gets around the size limit of a single MyISAM table, and spreading the underlying tables across multiple disks can improve access efficiency. This makes it well suited to VLDB (very large database) environments such as data warehousing.

Summary

In an interview, you are generally only asked to describe how InnoDB and MyISAM differ in use. To explain why those differences exist, however, you need to understand the underlying implementation, and that in turn requires some understanding of the B+ tree. After reading this article, you should have a clearer picture of the principles behind the two engines, and be one step closer to the offer you want.

Origin blog.csdn.net/u011397981/article/details/131911557