MySQL Advanced - Storage Engines and Indexes


Table of contents

1. Storage engine

1.1. View and set storage engine commands

1.2. InnoDB engine

1.2.1. Introduction

1.2.2. Advantages

1.2.3. ACID characteristics of InnoDB transactions

1.2.4. InnoDB architecture

1.3. MyISAM engine

1.3.1. MyISAM engine introduction

1.3.2. InnoDB vs. MyISAM

1.4. Other engines

2. Index

2.1. Introduction

2.2. B+ tree

2.2.1. B+ tree introduction

2.2.2. Demonstration of InnoDB's B+ tree clustered index, storing data and directories

2.3. Index scheme of InnoDB

2.3.1. Clustered index

2.3.2. Non-clustered index (also known as auxiliary index, secondary index)

2.3.3. The difference between clustered index and non-clustered index

2.3.4. Joint index

2.4. MyISAM index scheme

2.5. Comparison between MyISAM and InnoDB

2.6. The cost of indexing

2.7. Hash structure

2.7.1. Hash structure introduction

2.7.2. InnoDB adaptive hash index

2.7.3. The difference between Hash index and B+ tree index

2.8. B-tree

2.8.1. Introduction

2.8.2. Differences between B+ tree and B tree

2.9. Red-black tree

3. InnoDB data storage structure

3.1. Page

3.1.1. Page: the basic storage unit of the database

3.1.2. The size of the data page

3.1.3. Page structure

3.2. The relationship between row, page, extent, segment and tablespace

3.3. InnoDB row format

3.3.1. Four row formats

3.3.2. Commands specifying the row format


1. Storage engine 

1.1. View and set storage engine commands

List the storage engines that MySQL provides:

show engines;

View the default storage engine: 

show variables like '%storage_engine%';

or:

SELECT @@default_storage_engine;

Specify the storage engine when creating a table:

CREATE TABLE table_name (
    column_definitions
) ENGINE = engine_name;

Change the storage engine of an existing table:

ALTER TABLE table_name ENGINE = engine_name;

1.2. InnoDB engine

1.2.1. Introduction

InnoDB supports foreign keys and transactions. Its row-level locks suit high-concurrency workloads and give better performance for inserts, updates, and deletes. It caches both indexes and data, so it needs more memory, and it is suitable for storing large amounts of data. It also uses more disk space: a table can have several non-clustered indexes, and together they may take more space than the records themselves.

Before MySQL 8.0, an InnoDB table is stored on disk as two files, with the extensions .frm and .ibd:

  • The .frm file stores the table structure and table definition;
  • The .ibd file stores the table's data and indexes.

Features: 

  • MySQL has included the InnoDB storage engine since version 3.23.34a, and InnoDB has been the default storage engine since MySQL 5.5.
  • InnoDB is MySQL's default transactional engine. It is designed to handle large numbers of short-lived transactions, and it guarantees that a transaction is either completely committed (COMMIT) or rolled back (ROLLBACK).
  • If the workload needs updates and deletes in addition to inserts and queries, InnoDB should be preferred.
  • Unless there is a very specific reason to use another storage engine, InnoDB should be preferred.
  • InnoDB is designed for maximum performance when processing large data volumes.
  • In earlier releases, dictionary data was stored in metadata files, non-transactional tables, and so on. MySQL 8.0 removes these metadata files: .frm, .par, .trn, .isl, and db.opt files no longer exist.
  • table_name.frm stores the table structure (in MySQL 8.0 it is merged into table_name.ibd); table_name.ibd stores data and indexes.
  • InnoDB generally performs better for inserts, updates, and deletes; MyISAM can perform better for queries.
  • MyISAM caches only indexes, not the actual data. InnoDB caches both indexes and data, so it needs more memory, and memory size has a decisive impact on its performance.

1.2.2. Advantages

The InnoDB storage engine has many advantages in practice, such as convenient operation, improved database performance, and low maintenance cost. If the server crashes because of a hardware or software problem, no additional action is required after restarting it. InnoDB crash recovery automatically finalizes the changes that were committed before the crash and undoes the changes that were in progress but not committed, so work can continue from where it stopped.

The InnoDB storage engine maintains a buffer pool in main memory, so frequently used data is processed directly in memory. This cache applies to many kinds of information and speeds up processing.

On dedicated database servers, up to 80% of physical memory is often assigned to the buffer pool. Other advantages include:

  • If related data is split across different tables, foreign keys can enforce referential integrity. When data is updated or deleted, the related data in other tables can be updated or deleted automatically. An insert into a child table is rejected if there is no corresponding row in the parent table.
  • If data on disk or in memory becomes corrupted, a checksum mechanism warns you before the bad data is used.
  • When each table has an appropriate primary key, operations that involve the primary key columns are optimized automatically.
  • Inserts, updates, and deletes are optimized by an automatic mechanism called change buffering. InnoDB not only allows concurrent reads and writes on the same table, it also caches changed data to streamline disk I/O.
  • The performance benefits are not limited to large tables with long-running queries. When the same rows are accessed over and over, the adaptive hash index makes these lookups faster.
  • Tables and their indexes can be compressed, and indexes can be created or dropped with much less impact on performance and availability.
  • For large TEXT and BLOB data, the DYNAMIC row format provides a more efficient storage layout.
  • The internal workings of the storage engine can be monitored by querying tables in the INFORMATION_SCHEMA database.
  • InnoDB tables can be mixed with tables of other storage engines in the same statement.
  • InnoDB can handle large amounts of data even on operating systems that limit file size to 2GB.
  • InnoDB is designed for CPU efficiency and maximum performance when processing large data volumes.

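1.2.3. ACID characteristics of InnoDB transactions

Atomicity: All operations in a transaction take effect together, or none of them do.

Consistency: A transaction moves the database from one valid state to another.

Isolation: Concurrent transactions are isolated from each other.

Durability: Once a transaction commits successfully, its changes are persisted in the database.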

The ACID model is a set of database design principles that emphasize reliability, which matters for business data and mission-critical applications. MySQL includes components, such as the InnoDB storage engine, that adhere closely to the ACID model, so that data is not corrupted and results are not distorted when something unexpected happens, such as a crash. When you rely on these ACID-compliant features, you do not need to build your own consistency checks and crash recovery mechanisms. If you have additional software safeguards or extremely reliable hardware, or if the application can tolerate a small amount of data loss and inconsistency, you can adjust MySQL settings to trade some ACID reliability for higher performance.

The following describes how InnoDB features map onto the four aspects of the ACID model:

1. Atomicity. The atomicity aspect of ACID mainly involves InnoDB transactions. Related MySQL features include:

  • The autocommit setting.
  • The COMMIT statement.
  • The ROLLBACK statement.
  • Operational data from the INFORMATION_SCHEMA tables.

2. Consistency. The consistency aspect of ACID mainly involves internal InnoDB processing that protects data from crashes. Related MySQL features include:

  • The InnoDB doublewrite buffer.
  • InnoDB crash recovery.

3. Isolation. The isolation aspect mainly involves the isolation level that applies to each transaction. Related MySQL features include:

  • The autocommit setting.
  • The SET TRANSACTION ISOLATION LEVEL statement.
  • The low-level details of InnoDB locking.

4. Durability. The durability aspect of ACID involves MySQL software features interacting with your particular hardware configuration. Because hardware is so varied and complex, there are no fixed rules to follow here. Related MySQL features and factors include:

  • The InnoDB doublewrite buffer, controlled by the innodb_doublewrite configuration option.
  • The innodb_flush_log_at_trx_commit configuration option.
  • The sync_binlog configuration option.
  • The innodb_file_per_table configuration option.
  • The write buffer in a storage device.
  • A battery-backed cache in a storage device.
  • The operating system that runs MySQL.
  • An uninterruptible power supply (UPS).
  • Your backup strategy.
  • For distributed or hosted applications, the location of the hardware and the network conditions.
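
To make the atomicity aspect above concrete, here is a minimal sketch of COMMIT and ROLLBACK behaviour. It is an illustration under stated assumptions, not InnoDB itself: it uses Python's built-in sqlite3 module as a stand-in transactional engine, and the `account` table and the `transfer` and `balances` helpers are hypothetical names invented for this example.

```python
import sqlite3

# SQLite (in memory) stands in for a transactional engine such as InnoDB.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 0)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one transaction; return True if committed."""
    try:
        # The first data-changing statement implicitly opens a transaction.
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
        )
        # This debit violates the CHECK constraint if src has too little money.
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
        )
        conn.commit()
        return True
    except sqlite3.IntegrityError:
        conn.rollback()  # also undoes the credit that had already succeeded
        return False


def balances(conn):
    """Return all (id, balance) pairs ordered by id."""
    return conn.execute("SELECT id, balance FROM account ORDER BY id").fetchall()
```

Calling `transfer(conn, 1, 2, 500)` fails on the debit and leaves both balances untouched, while `transfer(conn, 1, 2, 30)` commits both updates together. In MySQL the same pattern would be written with START TRANSACTION, COMMIT, and ROLLBACK against an InnoDB table.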

1.2.4. InnoDB architecture

1. Buffer pool. The buffer pool is an area of main memory that caches table and index data as it is accessed. It lets frequently used data be processed directly in memory, which speeds up processing.

2. Change buffer. The change buffer is a special data structure that caches changes to secondary index pages when the affected pages are not in the buffer pool. When another operation later reads such a page into the buffer pool, the buffered changes are merged into it. Unlike clustered indexes, secondary indexes are usually non-unique, and inserts into them arrive in relatively random order. The purge operation, which runs periodically when the system is mostly idle, writes the updated index pages to disk. Query performance can degrade noticeably while change buffer merging is in progress. In memory, the change buffer occupies part of the InnoDB buffer pool. On disk, it is part of the system tablespace. The kinds of changes that are buffered are controlled by the innodb_change_buffering configuration option.

3. Adaptive hash index. With a suitable workload and enough memory for the buffer pool, the adaptive hash index lets InnoDB perform more like an in-memory database without sacrificing transactional features or reliability. It is configured with the innodb_adaptive_hash_index option, or disabled at server startup with --skip-innodb_adaptive_hash_index.

4. Log buffer. The log buffer holds the data that is about to be written to the redo log. Its size is set by the innodb_log_buffer_size configuration option. The contents of the log buffer are flushed to the log files on disk periodically. A large log buffer lets large transactions run without writing redo log data to disk before they commit.

5. System tablespace. The system tablespace contains the InnoDB data dictionary, the doublewrite buffer, the change buffer, and the undo logs, and it can also contain table and index data. Because multiple tables share it, the system tablespace is regarded as a shared tablespace.

6. Doublewrite buffer. The doublewrite buffer is located in the system tablespace and receives the data pages that are flushed from the buffer pool. Only after a page has been flushed and written to the doublewrite buffer does InnoDB write it to its proper position in the data file.

7. Undo log. An undo log is a collection of undo records associated with a single transaction; each record describes how to undo the transaction's latest change. If another transaction needs to see the original data, the unmodified data is retrieved from the undo log records. Undo logs live in undo log segments, which are contained in rollback segments.

8. File-per-table tablespace. With file-per-table tablespaces, each table is created in its own data file instead of in the system tablespace. This is enabled with the innodb_file_per_table configuration option. Each tablespace is represented by a single .ibd data file, which is created in the database directory by default.

9. General tablespace. A general tablespace is a shared InnoDB tablespace created with the CREATE TABLESPACE syntax. It can be created outside the MySQL data directory, can hold multiple tables, and supports tables of all row formats.

10. Undo tablespace. An undo tablespace consists of one or more files that contain undo logs. The number of undo tablespaces is set by the innodb_undo_tablespaces configuration option.

11. Temporary tablespace. User-created temporary tables and disk-based internal temporary tables are created in the temporary tablespace. The innodb_temp_data_file_path configuration option defines the relative path, name, size, and attributes of its data files. If the option is left empty, an auto-extending data file is created by default in the directory given by innodb_data_home_dir.

12. Redo log. The redo log is a disk-based data structure used during crash recovery to correct data written by incomplete transactions. During normal operation, the redo log encodes requests that change InnoDB table data. Changes that had not finished updating the data files before an unexpected shutdown are replayed automatically during initialization.
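
The change buffer in item 2 above can be sketched as a toy model. This is only an illustration of the idea (buffer a change when the page is not cached, merge it when the page is next read); the class `ToySecondaryIndex`, its fields, and the representation of a page as a set of keys are all assumptions made for this example and do not reflect InnoDB's real data structures.

```python
class ToySecondaryIndex:
    """Toy model of change buffering for secondary index pages (hypothetical)."""

    def __init__(self, pages_on_disk):
        # page_id -> set of index keys stored in that page on disk
        self.disk = {pid: set(keys) for pid, keys in pages_on_disk.items()}
        self.buffer_pool = {}    # page_id -> page contents cached in memory
        self.change_buffer = {}  # page_id -> pending [("insert" | "delete", key)]
        self.disk_reads = 0

    @staticmethod
    def _apply(page, op, key):
        if op == "insert":
            page.add(key)
        else:
            page.discard(key)

    def modify(self, page_id, op, key):
        """Apply a change directly if the page is cached, otherwise buffer it."""
        if page_id in self.buffer_pool:
            self._apply(self.buffer_pool[page_id], op, key)
        else:
            self.change_buffer.setdefault(page_id, []).append((op, key))

    def read_page(self, page_id):
        """Load the page if needed, merging any buffered changes into it."""
        if page_id not in self.buffer_pool:
            self.disk_reads += 1
            page = set(self.disk[page_id])
            for op, key in self.change_buffer.pop(page_id, []):
                self._apply(page, op, key)
            self.buffer_pool[page_id] = page
        return self.buffer_pool[page_id]


index = ToySecondaryIndex({1: {10, 20}, 2: {30}})
index.modify(1, "insert", 15)   # page 1 is not cached: the change is buffered
index.modify(1, "delete", 20)   # still no disk read
reads_before = index.disk_reads
merged = sorted(index.read_page(1))  # one disk read, then the merge
```

The point of the design is that the two modifications cost no random disk read at the time they are made; the read is deferred until some other operation needs the page anyway.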

1.3. MyISAM engine 

1.3.1. MyISAM engine introduction

MyISAM does not support foreign keys or transactions. It uses table-level locks, which are unsuitable for high concurrency. It caches only indexes (which hold data addresses), not records, so its memory requirements are low. Queries can be faster than in InnoDB, because InnoDB must maintain MVCC consistency when reading and caches more. It also uses less disk space, because its indexes store record addresses rather than complete records.

A MyISAM table is stored on disk as three files, with the extensions .frm, .MYD, and .MYI:

  • The .frm file stores the table structure and table definition;
  • The .MYD file stores the table's data;
  • The .MYI file stores the table's indexes.

MyISAM features:

  • MyISAM provides many features, including full-text indexing, compression, and spatial functions (GIS), but it does not support transactions, row-level locks, or foreign keys. Its clearest defect is that it cannot recover safely after a crash.
  • It was the default storage engine before MySQL 5.5.
  • Its advantage is fast access. It suits applications that have no transaction integrity requirements or that consist mainly of SELECT and INSERT statements.
  • It stores the row count as a separate statistic, so COUNT(*) queries are very efficient.
  • table_name.frm stores the table structure; table_name.MYD stores data (MYData); table_name.MYI stores indexes (MYIndex).
  • Application scenarios: read-only or read-mostly workloads.

1.3.2. InnoDB vs. MyISAM


| Comparison | InnoDB | MyISAM |
| Foreign keys and transactions | Supported | Not supported |
| Row and table locks | Row locks: only the affected row is locked during an operation and other rows are not affected, which suits highly concurrent operations | Table locks: the entire table is locked even when a single record is operated on, which does not suit highly concurrent operations |
| Cache | Caches both indexes and data; memory requirements are high, and memory size has a decisive impact on performance | Caches only indexes, not the actual data |
| Focus | Transactions: concurrent writes, transactions, more resources | Performance: saves resources, low consumption, simple workloads, fast queries |
| Default engine | MySQL 5.5 and later | Before MySQL 5.5 |

1.4. Other engines

  • Archive engine: used for data archiving. It is ideal for storing large amounts of independent, historical data that is rarely read. Inserts are fast, but query support is relatively poor.
  • Blackhole engine: writes are discarded, and reads return empty results.
  • CSV engine: stores data as comma-separated values.
  • Memory engine: keeps tables in memory. It stores all data in RAM for fast access where non-critical data needs to be looked up quickly. It was formerly known as the HEAP engine.
  • Federated engine: accesses remote tables. It connects separate MySQL servers so that a logical database can be built from multiple physical servers, which suits distributed or data mart environments.
  • Merge engine: manages a collection of tables made up of multiple MyISAM tables.
  • NDB engine: the dedicated storage engine for MySQL Cluster. It is a highly redundant engine in which multiple data nodes serve requests jointly to improve overall performance and safety. It suits applications with large data volumes and high requirements for safety and performance.

2. Index

2.1. Introduction 

MySQL's official definition of an index is: an index is a data structure that helps MySQL retrieve data efficiently.

An index is an ordered data structure for fast lookups.

The nature of an index: an index is a data structure. You can think of it as a sorted data structure for fast lookups that satisfies a particular search algorithm. These data structures point to the data in some way, so that advanced search algorithms can be implemented on top of them.

Indexes are implemented by the storage engine, so the indexes of different engines are not necessarily identical, and not every engine supports every index type. The InnoDB storage engine uses B+ tree indexes. Each storage engine also defines the maximum number of indexes per table and the maximum index length. All storage engines support at least 16 indexes per table, with a total index length of at least 256 bytes. Some storage engines support more indexes and longer index lengths.

Advantages:

(1) Like the catalogue index of a university library, an index improves the efficiency of data retrieval and lowers the database's I/O cost by reducing the number of disk I/O operations. This is the main reason for creating indexes.

(2) A unique index guarantees the uniqueness of each row in a table.

(3) For referential integrity, indexes speed up joins between tables: queries that join a child table with its parent table run faster.

(4) Queries that use grouping or sorting clauses can be much faster, because the index is already sorted. This reduces the time spent on grouping and sorting and lowers CPU consumption.

Disadvantages:

Indexes also have disadvantages, mainly the following:

(1) Creating and maintaining indexes takes time, and the time grows with the amount of data.

(2) Indexes occupy disk space. In addition to the space used by the table data, each index takes up physical space on disk. With a large number of indexes, the index files may reach the maximum file size sooner than the data file.

(3) Indexes greatly improve query speed, but they slow down updates to the table. Whenever rows are inserted, deleted, or modified, the indexes must be maintained as well, which slows data maintenance.

Therefore, weigh the advantages and disadvantages carefully when deciding whether to use an index.

2.2. B+ tree

2.2.1. B+ tree introduction

A B+ tree is a tree data structure commonly used in databases and operating system file systems. It keeps data stable and ordered, and its insert and update operations have stable logarithmic time complexity. Elements are inserted into a B+ tree from the bottom up, and the bottom level is level 0, which is the opposite of a binary tree.

An m-order B+ tree has the following characteristics:

  1. The number of keys in each non-leaf node equals its number of children;
  2. The root node holds 2 to m keys, and every other node holds ⌈m/2⌉ to m keys;
  3. All leaves are on the same level.

B+ tree structure:

The fewer levels the tree has, the fewer I/O operations a lookup needs and the faster the query.

Notes on InnoDB's B+ tree indexes:

1. The root page never moves once it has been created.
2. Directory entry records in internal nodes must be unique.
3. A page stores at least 2 records.

As a rough estimate, suppose a data page holds 100 records and a directory page holds 1,000 directory entries. A 4-level B+ tree can then hold 1,000 × 1,000 × 1,000 × 100 = 100 billion records, so the B+ trees used in practice rarely exceed 4 levels. Within each page, records are located by binary search, which keeps lookups fast.

Capacity estimate for an InnoDB non-clustered index:

An InnoDB page is 16KB. The primary key of a typical table is an INT (4 bytes) or a BIGINT (8 bytes), and a pointer is generally 4 or 8 bytes. A page (one node of the B+Tree) therefore stores roughly 16KB / (8B + 8B) = 1K key values; because this is only an estimate, K is taken as 10^3 for convenience. A B+Tree index of depth 3 can then cover about 10^3 × 10^3 × 10^3 = 10^9, that is 1 billion, records (assuming that a data page also stores 10^3 rows).

In practice nodes are not completely full, so the height of a B+Tree in a database is generally 2 to 4 levels. InnoDB is designed to keep the root node resident in memory, so looking up the row for a given key value needs at most 1 to 3 disk I/O operations (the cached root node does not count).
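
The two estimates above can be reproduced with a few lines of arithmetic. This is a back-of-the-envelope sketch, not a measurement: the function name `bplus_tree_capacity` is made up for this example, and it assumes every page is completely full.

```python
def bplus_tree_capacity(levels, fanout, records_per_leaf):
    """Upper bound on rows reachable through a B+ tree with `levels` levels.

    The top (levels - 1) levels are directory pages holding `fanout` entries
    each; the bottom level consists of leaf pages holding `records_per_leaf`.
    """
    if levels < 1:
        raise ValueError("a tree has at least one level")
    return fanout ** (levels - 1) * records_per_leaf


PAGE_SIZE = 16 * 1024             # default InnoDB page size in bytes
ENTRY_SIZE = 8 + 8                # 8-byte BIGINT key plus 8-byte pointer
fanout = PAGE_SIZE // ENTRY_SIZE  # entries per directory page

# Estimate 1: 1,000 directory entries per page, 100 records per data page.
by_level = {n: bplus_tree_capacity(n, 1000, 100) for n in range(1, 5)}

# Estimate 2: depth 3 with about 10^3 entries in every page, leaves included.
depth3 = bplus_tree_capacity(3, 1000, 1000)

print(fanout)    # 1024, rounded to 10^3 in the text
print(by_level)  # {1: 100, 2: 100000, 3: 100000000, 4: 100000000000}
print(depth3)    # 1000000000
```

Each extra level multiplies the capacity by the fanout, which is why a handful of levels is enough even for very large tables.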

Data page size:

MyISAM manages its index files in blocks whose size is 1KB by default.

The data page size of the InnoDB storage engine is configurable, and the default is 16KB. It is set with the innodb_page_size parameter: MySQL 5.6 allows 4KB, 8KB, and 16KB, and MySQL 5.7 added 32KB and 64KB.

2.2.2. Demonstration of InnoDB's B+ tree clustered index, storing data and directories

Each record is in Compact row format:

CREATE TABLE index_demo(
 c1 INT,
 c2 INT,
 c3 CHAR(1),
 PRIMARY KEY(c1)
 ) ROW_FORMAT = Compact;

Demonstration: Assume that a data page can hold only three records and a directory page can hold only four directory entries. The storage layout is then as follows:

Single directory (2-level B+ tree):

A large directory that nests several small directories (3-level B+ tree):

2.3. Index scheme of InnoDB

2.3.1. Clustered index

A clustered index is not a separate index type but a way of storing data: a B+ tree whose directory entries and ordering are based on the primary key, with all user records stored in the leaf nodes. Because the records live in the leaf nodes of the B+ tree, the index is the data and the data is the index.

Features:

1. Records and pages are ordered by primary key value, both between levels and within a level, which has three implications:

  • The records in a page form a singly linked list in ascending primary key order.
  • The pages that store user records form a doubly linked list in ascending order of the primary keys they contain.
  • The pages that store directory entry records are divided into levels, and the pages within one level form a doubly linked list in ascending order of the primary keys in their directory entries.

2. The leaf nodes of the B+ tree store complete user records.

A complete user record is one that stores the values of all columns (including hidden columns).

Advantages:

  • Data access is faster. The clustered index keeps the index and the data in the same B+ tree, so fetching data through the clustered index is faster than through a non-clustered index.
  • Sorted lookups and range lookups on the primary key are very fast.
  • When a query reads a range of data in clustered index order, the rows are stored close together, so the database does not need to fetch them from many separate data blocks. This saves a lot of I/O.

Disadvantages:

  • Insert speed depends heavily on insert order. Inserting in primary key order is fastest; otherwise page splits occur and seriously hurt performance. For this reason, InnoDB tables generally use an auto-increment ID column as the primary key.
  • Updating a primary key is expensive, because the updated row has to be moved. For this reason, the primary key of an InnoDB table is generally treated as non-updatable.
  • Access through a secondary index needs two index lookups: the first finds the primary key value, and the second finds the row by that primary key value.
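
The effect of insert order on page splits, listed among the disadvantages above, can be illustrated with a toy model of leaf pages. This is a simplified sketch, not InnoDB's actual split algorithm: the function `insert_all`, the page capacity of 4, and the rule that an append at the very end of the key range opens a fresh page without moving anything are assumptions made for this illustration.

```python
import bisect


def insert_all(keys, capacity=4):
    """Insert distinct keys into a chain of sorted pages.

    Returns (page_splits, records_moved). A full page that receives a key
    anywhere except at the very end of the key range is split in half.
    """
    pages = []  # each page is a sorted list of keys; pages are in key order
    splits = moved = 0
    for key in keys:
        if not pages:
            pages.append([key])
            continue
        # Locate the last page whose first key is <= key (or the first page).
        first_keys = [page[0] for page in pages]
        idx = max(bisect.bisect_right(first_keys, key) - 1, 0)
        page = pages[idx]
        pos = bisect.bisect_left(page, key)
        if len(page) < capacity:
            page.insert(pos, key)
        elif idx == len(pages) - 1 and pos == len(page):
            # Append at the end of the last page: start a new page, move nothing.
            pages.append([key])
        else:
            # Split the full page: the upper half moves to a new page.
            mid = capacity // 2
            right = page[mid:]
            del page[mid:]
            pages.insert(idx + 1, right)
            splits += 1
            moved += len(right)
            bisect.insort(right if key > right[0] else page, key)
    return splits, moved


ascending = insert_all(range(100))            # like an auto-increment key
descending = insert_all(range(99, -1, -1))    # every insert hits the first page
print(ascending)   # (0, 0)
print(descending)
```

In this model, ascending keys never move an existing record, while keys that land in the middle or at the front of the key range repeatedly split pages and relocate records, which is the cost that an auto-increment primary key avoids.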

2.3.2. Non-clustered index (also known as auxiliary index, secondary index)

The clustered index builds its directory and orders records and pages by primary key, so it only helps when the search condition is the primary key.

To search by another column, you build a separate B+ tree (a non-clustered index) whose directory and ordering are based on that non-primary-key column. Its leaf nodes store the value of the non-primary-key column and the value of the primary key. A lookup first finds the primary key value through the indexed column and then goes back to the table (that is, back to the clustered index) to fetch the complete record by primary key.

A non-clustered index is also a way of storing data: a B+ tree whose directory and ordering are based on a non-primary-key column, with leaf nodes that store the value of that column and the value of the primary key.

Note: An auto-increment primary key is recommended for InnoDB tables (AUTO_INCREMENT values start at 1 by default), because:

  • An InnoDB primary key should not be too long. Every secondary index stores the primary key, so a long primary key costs disk space and performance: a page is only 16KB, a larger key means fewer records per page, and the B+ tree becomes deeper.
  • B+ tree insert speed depends heavily on insert order. A non-monotonic primary key causes frequent page splits and reorganization, whereas auto-increment keys are always appended in order.

In practice, a MySQL table has one clustered index, used for primary key lookups and for going back to the table, and several non-clustered (auxiliary, secondary) indexes for lookups on non-primary-key columns.

Why don't the leaf nodes of a non-clustered index store complete records, so that going back to the table would be unnecessary?

A table with 100 columns could need 99 non-clustered indexes. If each of them stored complete records, a great deal of disk space would be wasted.
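
The two-step lookup described above (secondary index first, then back to the clustered index) can be sketched with plain Python containers. This is a simplified illustration: a dict stands in for the clustered B+ tree and a sorted list for the secondary B+ tree, and the sample rows and the name `find_by_c2` are invented for the example.

```python
import bisect

# Clustered index: primary key c1 -> complete row.
clustered = {
    1: {"c1": 1, "c2": 40, "c3": "a"},
    3: {"c1": 3, "c2": 10, "c3": "b"},
    5: {"c1": 5, "c2": 40, "c3": "c"},
    8: {"c1": 8, "c2": 25, "c3": "d"},
}

# Secondary index on c2: sorted (c2, primary key) pairs and nothing else.
secondary_c2 = sorted((row["c2"], pk) for pk, row in clustered.items())


def find_by_c2(value):
    """Return complete rows with c2 == value via the secondary index."""
    rows = []
    # Step 1: locate the first entry with this c2 value in the secondary index.
    start = bisect.bisect_left(secondary_c2, (value,))
    for c2, pk in secondary_c2[start:]:
        if c2 != value:
            break
        # Step 2: go back to the table (clustered index) by primary key.
        rows.append(clustered[pk])
    return rows


print(secondary_c2)                         # [(10, 3), (25, 8), (40, 1), (40, 5)]
print([r["c1"] for r in find_by_c2(40)])    # [1, 5]
```

The secondary index holds only two values per entry, which is why it stays small no matter how many columns the table has.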

2.3.3. The difference between clustered index and non-clustered index

  1. Leaf nodes: the leaf nodes of a clustered index store the data records themselves, while the leaf nodes of a non-clustered index store the location of the data (in InnoDB, the primary key value). Non-clustered indexes do not affect the physical storage order of the table.
  2. Quantity: a table can have only one clustered index, because the data can be stored in only one sorted order, but it can have multiple non-clustered indexes, that is, multiple index directories for retrieving data.
  3. Efficiency: queries through a clustered index are efficient, but inserts, deletes, and updates are less efficient than with a non-clustered index, because the clustered index stores complete records, which are slow to move.

2.3.4. Joint index

Several columns can be used together as the sorting rule, that is, one non-clustered index can be created on multiple columns at once.

For example, to have the B+ tree sorted by the c2 and c3 columns means:

  • Each directory entry record contains the c2 value, the c3 value, and a page number, and each user record contains the c2 value, the c3 value, and the primary key;
  • Records and pages are first sorted by column c2;
  • Records with the same c2 value are then sorted by column c3.

Note:

A B+ tree sorted by the c2 and c3 columns is called a joint index, and it is still a non-clustered index. It is not the same thing as creating separate indexes on c2 and c3. The differences are:

  • Creating a joint index builds only one B+ tree.
  • Creating separate indexes on c2 and c3 builds two B+ trees, one sorted by c2 and one sorted by c3.
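
The sorting rule of a joint index on (c2, c3) corresponds to ordinary tuple ordering, which makes it easy to sketch. The sample rows and the helper names below are invented for this illustration, a sorted list stands in for the B+ tree leaf level, and the range search assumes that c2 is an integer column.

```python
import bisect

rows = [  # (c1 primary key, c2, c3)
    (1, 4, "u"), (3, 9, "d"), (5, 3, "y"), (10, 4, "o"), (20, 2, "e"), (4, 4, "a"),
]

# Joint index on (c2, c3): sorted by c2 first, then by c3; each entry also
# carries the primary key so the full row can be fetched afterwards.
joint_index = sorted((c2, c3, c1) for c1, c2, c3 in rows)


def pks_where_c2(value):
    """Entries with the same c2 are adjacent, so a binary search finds them."""
    lo = bisect.bisect_left(joint_index, (value,))
    hi = bisect.bisect_left(joint_index, (value + 1,))
    return [pk for _, _, pk in joint_index[lo:hi]]


def pks_where_c3(value):
    """c3 is only sorted within one c2 value, so every entry must be checked."""
    return [pk for _, c3, pk in joint_index if c3 == value]


print(joint_index)
# [(2, 'e', 20), (3, 'y', 5), (4, 'a', 4), (4, 'o', 10), (4, 'u', 1), (9, 'd', 3)]
print(pks_where_c2(4))    # [4, 10, 1]
print(pks_where_c3("d"))  # [3]
```

Within the three entries that share c2 = 4, the c3 values appear in order, but across the whole index they do not, which is why a condition on c3 alone cannot use the ordering of this single tree.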

2.4. MyISAM index scheme

The MyISAM engine also uses a B+Tree as its index structure, but the data field of a leaf node stores the indexed column value together with the address of the data record.

MyISAM has no clustered index: every MyISAM index, including the primary key index, is non-clustered and works like a secondary index.

2.5. Comparison between MyISAM and InnoDB

All MyISAM indexes are non-clustered, whereas InnoDB has one clustered index per table.

① Number of lookups: In InnoDB, a single search of the clustered index by primary key value finds the record. In MyISAM, a lookup always needs a return-to-table step, so every MyISAM index is effectively a secondary index.

② Whether the index is the data: An InnoDB data file is itself an index file. MyISAM keeps index files and data files separate, and the index file stores only the addresses of data records.

③ Content of the leaf node data field: In an InnoDB non-clustered index, the data field stores the primary key value of the record; in a MyISAM index, it stores the indexed column value and the record address. In other words, every InnoDB non-clustered index refers to the primary key in its data field.

④ Query speed: MyISAM's return-to-table step is very fast, because it reads the data directly from the file at the stored address offset. InnoDB takes the primary key and then searches the clustered index for the record; this is not slow, but it is still not as fast as direct access by address.

⑤ Whether a primary key is required: An InnoDB table must have a primary key (a MyISAM table need not). If none is declared, InnoDB uses the first unique index whose columns are all NOT NULL as the primary key. If there is no such index either, InnoDB generates a hidden 6-byte integer row ID column and uses it as the primary key.
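
Point ④ above, fetching a row directly by its address offset, can be sketched as follows. This is a toy layout rather than the real .MYD or .MYI file format: the fixed-length record layout, the in-memory `data_file`, and the dict used in place of the index B+ tree are assumptions made for this example.

```python
import struct

# Fixed-length row layout: c1 INT, c2 INT, c3 CHAR(1), little-endian, no padding.
RECORD = struct.Struct("<ii1s")

rows = [(1, 40, b"a"), (3, 10, b"b"), (5, 40, b"c")]

# MyISAM style: the data file holds the rows, and the index maps a key to the
# byte offset of its row inside that file.
data_file = b"".join(RECORD.pack(*row) for row in rows)
myisam_index = {row[0]: i * RECORD.size for i, row in enumerate(rows)}


def myisam_lookup(pk):
    """Look up the offset in the index, then read the row at that offset."""
    offset = myisam_index[pk]
    return RECORD.unpack_from(data_file, offset)


# InnoDB style: a secondary index entry leads to a primary key, which must
# then be searched in the clustered index, where the row itself is stored.
innodb_clustered = {row[0]: row for row in rows}
innodb_secondary_c2 = {10: [3], 40: [1, 5]}


def innodb_lookup_by_c2(value):
    """Secondary index gives primary keys; the clustered index gives rows."""
    return [innodb_clustered[pk] for pk in innodb_secondary_c2.get(value, [])]


print(myisam_index)       # {1: 0, 3: 9, 5: 18}
print(myisam_lookup(3))   # (3, 10, b'b')
print(innodb_lookup_by_c2(40))
```

The MyISAM-style read is a single seek to a known offset, while the InnoDB-style read performs a second index search by primary key.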

2.6. The cost of indexing

Indexes are useful, but they should not be created indiscriminately, because each one costs space and time:

Space cost

Every index requires its own B+ tree. Each node of a B+ tree is a data page, and a page occupies 16KB of storage space by default. A large B+ tree is made up of many data pages, which adds up to a large amount of storage space.

Time cost

Every time you add, delete, or modify data in the table, every B+ tree index has to be modified as well. The nodes at each level of a B+ tree are sorted in ascending order of the index column values and linked into a doubly linked list. Within a node, the records (user records in leaf nodes, directory entry records in inner nodes) form a one-way linked list, also in ascending order of the index column values. Insert, delete, and update operations can break the ordering of nodes and records, so the storage engine needs extra time for record shifting, page splitting, and page recycling to restore it. If we build many indexes, the B+ tree behind each one needs this maintenance, which slows down performance.
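The time cost can be made concrete with a small sketch. It assumes a hypothetical `ToyTable` in which one sorted list per indexed column stands in for one B+ tree; real page splits and record shifting are not modeled, only the fact that each write touches every index.

```python
import bisect

class ToyTable:
    """Keeps one sorted list per indexed column, standing in for one B+ tree each."""

    def __init__(self, indexed_columns):
        self.rows = []
        self.indexes = {col: [] for col in indexed_columns}
        self.index_writes = 0          # counts how many index structures were updated

    def insert(self, row):
        self.rows.append(row)
        for col, entries in self.indexes.items():
            bisect.insort(entries, row[col])   # keep each index sorted on every insert
            self.index_writes += 1

table = ToyTable(["id", "name", "age"])
table.insert({"id": 2, "name": "bob", "age": 25})
table.insert({"id": 1, "name": "alice", "age": 30})
```

Two inserts into a table with three indexes cause six ordered insertions, so the maintenance work grows with the number of indexes.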

2.7. Hash structure

2.7.1. Hash Structure Introduction

A hash table stores entries in buckets and handles conflicts with the chain address method (separate chaining); in implementations such as Java's HashMap, a chain whose length is greater than 8 is converted into a red-black tree.

A hash is itself a function, also called a hash function, and it can greatly improve the efficiency of data retrieval. A hash algorithm (such as MD5, SHA1, SHA2, or SHA3) converts an input into an output deterministically: the same input always yields the same output, and even a slight change in the input usually produces a different output.

Advantages: Inserts, deletes, updates, and equality lookups take O(1) time on average, which is faster than a B+ tree;

Disadvantages: range searches are inefficient and degrade to O(n); sorting is very slow because the data is stored out of order; for a multi-column key the hash value is computed over all fields combined, so a single field cannot be looked up on its own; and it is not recommended when there are many repeated values, because repeatedly comparing and reorganizing chains or red-black trees on conflicts is time-consuming;
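The trade-off between equality lookups and range searches can be sketched as follows. A Python dict plays the role of the hash structure and a sorted list stands in for ordered B+ tree leaves; the keys and helper names are assumptions made for the example.

```python
import bisect

keys = [5, 12, 17, 23, 35, 42]

# Hash structure: an equality lookup is a single probe on average.
hash_index = {k: f"row-{k}" for k in keys}

def hash_range(lo, hi):
    # A range query must examine every entry, because hashing discards order.
    return sorted(k for k in hash_index if lo <= k <= hi)

# Ordered structure: find both ends by binary search, then take the slice between them.
sorted_keys = sorted(keys)

def ordered_range(lo, hi):
    start = bisect.bisect_left(sorted_keys, lo)
    end = bisect.bisect_right(sorted_keys, hi)
    return sorted_keys[start:end]
```

Both functions return the same keys, but `hash_range` visits all entries while `ordered_range` only touches the matching slice.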

2.7.2, innoDB adaptive hash index

InnoDB does not support user-created hash indexes, but it does provide an adaptive hash index. If some data is accessed frequently and certain conditions are met, the address of its data page is stored in a hash table, so the next query can locate the page directly. This gives the B+ tree some of the advantages of a hash index.
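The idea can be illustrated with a toy cache. This is not InnoDB's implementation: the class name, the `HOT_THRESHOLD` value, and the "count accesses per key" rule are assumptions chosen only to show how a frequently used lookup can bypass the tree search.

```python
from collections import Counter

HOT_THRESHOLD = 3   # hypothetical; InnoDB's real criteria are internal and more involved

class ToyAdaptiveHash:
    def __init__(self, btree_lookup):
        self.btree_lookup = btree_lookup   # slow path: function mapping key -> page address
        self.access_counts = Counter()
        self.hash_table = {}               # fast path: key -> page address
        self.btree_searches = 0

    def find_page(self, key):
        if key in self.hash_table:         # hot key: skip the tree search entirely
            return self.hash_table[key]
        self.btree_searches += 1
        page = self.btree_lookup(key)
        self.access_counts[key] += 1
        if self.access_counts[key] >= HOT_THRESHOLD:
            self.hash_table[key] = page    # remember the page address for next time
        return page

pages = {10: "page-3", 20: "page-7"}
ahi = ToyAdaptiveHash(pages.__getitem__)
for _ in range(5):
    ahi.find_page(10)
```

Of the five lookups for key 10, only the first three go through the slow path; the remaining ones are answered from the hash table.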

2.7.3. The difference between Hash index and B+ tree index

1. A hash index cannot serve range queries, but a B+ tree can. The data a hash index points to is unordered, while the leaf nodes of a B+ tree form an ordered linked list.

2. A hash index does not support the leftmost principle of a joint index (that is, a query cannot use only some columns of the joint index), while a B+ tree does. For a joint index, the hash index merges the index keys and computes one hash value over all of them, rather than hashing each column separately. Therefore a query that supplies only one or a few columns of the joint index cannot use it.

3. A hash index does not support ORDER BY optimization, because the data it points to is unordered, whereas the data in a B+ tree index is ordered and can speed up ORDER BY on the indexed field. Similarly, a hash index cannot be used for fuzzy queries, while a B+ tree index can be used for a LIKE query whose wildcard comes at the end (such as 'abc%').
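The leftmost-principle difference in point 2 can be sketched with a composite key. The sample rows, the dict used as a hash index, and the helper names are assumptions for illustration; a sorted list of tuples stands in for the ordered B+ tree entries.

```python
import bisect

rows = [("li", 20), ("li", 31), ("wang", 25), ("zhang", 18)]   # joint key (name, age)

# Hash index: one hash value per complete (name, age) pair.
hash_index = {(name, age): (name, age) for name, age in rows}

def hash_lookup(*key_parts):
    # Only a lookup with every column of the joint key can match.
    return hash_index.get(tuple(key_parts))

# Ordered index: entries are sorted by name first, then by age.
sorted_index = sorted(rows)

def prefix_lookup(name):
    # Entries sharing the leftmost column are contiguous, so a prefix search works.
    start = bisect.bisect_left(sorted_index, (name,))
    result = []
    for entry in sorted_index[start:]:
        if entry[0] != name:
            break
        result.append(entry)
    return result
```

`hash_lookup("li")` finds nothing because the hash was computed over both columns, while `prefix_lookup("li")` returns both matching entries.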

2.8, B-tree

2.8.1. Introduction 

B tree stands for Balance Tree, a multi-way balanced search tree. It is abbreviated as B-Tree (the hyphen joins the two words and is not a minus sign). Its height is much smaller than that of a balanced binary tree, since each node has many children rather than two.

Each node of a B-tree can have at most M child nodes, and M is called the order of the B-tree. Each disk block contains keywords and pointers to child nodes. If a disk block contains x keywords, it holds x+1 pointers (for example, a node with the two keywords 17 and 35 has three child nodes whose primary key values are less than 17, between 17 and 35, and greater than 35, respectively). A 100-order B-tree with 3 layers can store up to about 1 million index entries.
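A single search step inside one B-tree node, using the 17/35 example, might look like the sketch below. The function name and the returned tuples are assumptions for illustration; the last lines simply restate the 100-order, 3-layer estimate as arithmetic.

```python
import bisect

def search_step(keys, target):
    """Decide what to do at one node that holds the sorted keywords `keys`."""
    pos = bisect.bisect_left(keys, target)
    if pos < len(keys) and keys[pos] == target:
        return ("found", pos)      # a B-tree search can stop at this node
    return ("descend", pos)        # otherwise follow child pointer number `pos`

node_keys = [17, 35]               # x = 2 keywords, so x + 1 = 3 child pointers

order = 100
layers = 3
approx_capacity = order ** layers  # 100 * 100 * 100 = 1,000,000
```

A target below 17 follows pointer 0, one between 17 and 35 follows pointer 1, and one above 35 follows pointer 2.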

Summary:

1. If inserting or deleting nodes leaves the B tree unbalanced, it restores balance by automatically adjusting the positions of nodes.

2. The keyword set is distributed over the whole tree, that is, both leaf nodes and non-leaf nodes store data. A search may therefore end at a non-leaf node.

3. Its search performance is equivalent to a binary search over the complete set of keywords.

2.8.2 Differences between B+ tree and B tree

1. Number of keywords in non-leaf nodes: for a tree of order k, a B+ tree non-leaf node has k keywords and k children, while a B-tree non-leaf node has k-1 keywords and k children;

2. Record storage location: in a B+ tree all records are stored in the leaf nodes, while in a B tree records are stored in every node;

3. Role of non-leaf nodes: B+ tree non-leaf nodes store only index entries, while B-tree non-leaf nodes also store records;

4. Relationship between leaf nodes: all leaf nodes of a B+ tree form an ordered doubly linked list; the leaf nodes of a B tree are ordered but have no pointers between them;
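Point 4 is what makes range scans cheap in a B+ tree. The sketch below is a simplification: the `Leaf` class and helper names are hypothetical, only forward pointers are modeled, and the scan starts from the first leaf instead of descending from the root to the leaf that contains the lower bound.

```python
class Leaf:
    def __init__(self, keys):
        self.keys = keys     # sorted keys stored in this leaf
        self.next = None     # pointer to the right-hand neighbour leaf

def link(leaves):
    # Chain the leaves from left to right and return the first one.
    for left, right in zip(leaves, leaves[1:]):
        left.next = right
    return leaves[0]

def range_scan(start_leaf, lo, hi):
    result = []
    leaf = start_leaf
    while leaf is not None:
        for key in leaf.keys:
            if key > hi:
                return result        # keys are ordered, so the scan can stop here
            if key >= lo:
                result.append(key)
        leaf = leaf.next             # follow the leaf chain, never going back up the tree
    return result

first_leaf = link([Leaf([1, 4]), Leaf([7, 9]), Leaf([12, 15])])
```

Once the starting leaf is known, the scan only walks sideways along the chain, which a B tree without leaf pointers cannot do.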

Both B-tree and B+ tree can be used as the data structure of the index, and the B+ tree is used in MySQL.

However, B-tree and B+ tree have their own application scenarios. It cannot be said that B+ tree is completely better than B-tree, and vice versa.

Because the intermediate nodes of the B+ tree do not store data directly, each page holds more index entries. The benefits are:

Query efficiency is higher (the tree is shorter and fatter than a B tree), fewer I/Os are needed, query performance is more stable, and range queries are more efficient.

To reduce I/O, is the whole index tree loaded at once?

No. Data pages are loaded one at a time: the top-level directory page first, then the lower-level directory page, and finally the record page.

  1. The database index is stored on disk. If the amount of data is large, the index is inevitably large too, possibly exceeding several gigabytes.
  2. When we query through the index, it is impossible to load several gigabytes of index into memory. What we can do is load disk pages one by one, because each disk page corresponds to a node of the index tree.

What is the storage capacity of a B+ tree? Why is it said that finding a row record generally needs only 1 to 3 disk I/Os?

The page size in the InnoDB storage engine is 16KB. The primary key of a typical table is INT (4 bytes) or BIGINT (8 bytes), and a pointer is generally 4 or 8 bytes, so one page (one node of the B+Tree) stores about 16KB/(8B+8B) = 1K key values (because this is an estimate, K is taken as 10^3 for convenience). A B+Tree index of depth 3 can therefore maintain about 10^3 * 10^3 * 10^3 = 1 billion records (assuming that a data page also stores 10^3 row records).

In practice, nodes may not be completely full, so the height of a B+Tree in a database is generally 2 to 4 layers. MySQL's InnoDB storage engine is designed to keep the root node resident in memory, so finding the row record for a given key value needs at most 1 to 3 disk I/O operations (the root node is cached and does not count toward the I/Os).
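The estimate can be reproduced with a few lines of arithmetic. The constants follow the assumptions stated above (16KB pages, 8-byte keys, 8-byte pointers, and leaf pages holding as many rows as a directory page holds entries); the function name is made up for the example.

```python
PAGE_SIZE = 16 * 1024     # bytes per InnoDB page (default)
KEY_BYTES = 8             # BIGINT primary key
POINTER_BYTES = 8         # assumed pointer size

fanout = PAGE_SIZE // (KEY_BYTES + POINTER_BYTES)   # entries per directory page

def max_rows(height, rows_per_leaf=fanout):
    # (height - 1) levels of directory pages fan out to the leaf pages,
    # and each leaf page is assumed to hold `rows_per_leaf` row records.
    return fanout ** (height - 1) * rows_per_leaf
```

With these numbers `fanout` is 1024, and `max_rows(3)` is 1024^3, slightly more than one billion rows for a tree of height 3.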

2.9, red-black tree

Why does the database use B+ trees instead of red-black trees? 

Because the B+ tree is multi-way and the red-black tree is binary, the B+ tree is shorter and fatter, so queries need fewer I/Os and perform better.

Red-black tree: an approximately balanced binary tree. The height difference between the left and right subtrees may be greater than 1, so searches are slightly slower than in a strictly balanced binary tree, but inserts and deletes are cheaper, which makes it suitable for frequent insertion and deletion.

  • Every node is either black or red;
  • The root node is black, and every leaf node is a black empty node (often omitted);
  • No two adjacent nodes (parent and child) can both be red;
  • All paths from any node to each of its leaves contain the same number of black nodes;
  • Query performance is a stable O(log n), and the height is at most 2*log2(n+1);
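The height bound in the last point explains why a binary tree costs more I/Os than a multi-way tree. The sketch below compares the red-black worst case with the number of levels a multi-way tree needs; the function names and the fanout of 1024 (one 16KB page of 16-byte entries) are assumptions for illustration.

```python
import math

def red_black_max_height(n):
    # Upper bound on the height of a red-black tree with n nodes.
    return 2 * math.log2(n + 1)

def multiway_levels(n, fanout):
    # Smallest number of levels such that fanout ** levels >= n.
    levels = 1
    while fanout ** levels < n:
        levels += 1
    return levels
```

For a billion keys the red-black bound is close to 60 levels, whereas a tree with a fanout of 1024 needs only 3 levels, and each level visited can mean one disk read.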

Three, innoDB data storage structure

3.1, page

3.1.1, page: the basic storage unit of the database

A page is the basic unit of interaction between disk and memory.

InnoDB divides data into many pages, and the default page size in InnoDB is 16KB.

Because the page is the basic unit of interaction between disk and memory, at least 16KB is read from disk into memory at a time, and at least 16KB in memory is flushed to disk at a time. In other words, whether the database reads one row or many rows, it loads the whole pages that contain those rows.

The page is therefore the basic unit in which the database manages storage space, and also the smallest unit of a database I/O operation. A page can store multiple row records.

Records are stored as rows, but the database does not read in units of rows. Otherwise one read (that is, one I/O operation) could process only one row of data, and efficiency would be very low.

3.1.2, the size of the data page

The data page size of MyISAM is fixed at 1KB, that is to say, the data of the MyISAM storage engine is managed in blocks of 1KB.

The data page size of the InnoDB storage engine is configurable, and the default is 16KB. It is set through the parameter innodb_page_size when the MySQL instance is initialized; the supported values are 4KB, 8KB, 16KB, 32KB and 64KB (the 32KB and 64KB sizes were added in MySQL 5.7).

 

3.1.3, page structure

Page a, page b, page c ... page n need not be physically contiguous, as long as they are connected through a doubly linked list. The records in each data page form a one-way linked list in ascending order of primary key value, and each data page generates a page directory for the records stored in it. When searching for a record by primary key, you can run a binary search over the page directory (binary search works on ordered data) to quickly locate the corresponding slot, and then traverse the records in the group that belongs to that slot to find the specified record.

Different database management systems (DBMS for short) have different page sizes. For example, in MySQL's InnoDB storage engine, the default page size is 16KB. View page size:

show variables like '%innodb_page_size%';
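The two-step search inside a page (binary search over slots, then a short walk through one group) can be sketched as follows. This is a simplified model: the group size of 4, the helper names, and the choice to store each group's largest key in its slot are assumptions for the example, and the Infimum and Supremum records are left out.

```python
import bisect

def build_page(sorted_keys, group_size=4):
    # Split the ordered records into groups; each slot remembers its group's largest key.
    groups = [sorted_keys[i:i + group_size]
              for i in range(0, len(sorted_keys), group_size)]
    slots = [group[-1] for group in groups]
    return groups, slots

def find_in_page(groups, slots, target):
    slot_no = bisect.bisect_left(slots, target)   # binary search over the page directory
    if slot_no == len(slots):
        return None                               # target is larger than every key in the page
    for key in groups[slot_no]:                   # short linear walk inside one group
        if key == target:
            return (slot_no, key)
    return None

groups, slots = build_page([1, 3, 5, 7, 9, 11, 13, 15, 17, 19])
```

Here the page directory is `[7, 15, 19]`, so a search for 11 jumps straight to the second group and checks at most four records there.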

 

The internal structure of the page:

If pages are classified by type, the common ones are data pages (which store B+ tree nodes), system pages, undo pages, and transaction data pages. The data page is the one we use most often.

The 16KB storage space of a data page is divided into seven parts: the file header (File Header), page header (Page Header), minimum and maximum records (Infimum + Supremum), user records (User Records), free space (Free Space), page directory (Page Directory), and file tail (File Trailer).

A schematic diagram of the page structure is shown below:

The functions of these 7 parts are as follows, we briefly summarize them as shown in the table below 

We can divide these 7 structures into 3 parts:

Part 1: File Header (file header) and File Trailer (file tail)

Part 2: User Records (user records), maximum and minimum records, Free Space (free space)

Part 3: Page Directory (page directory), Page Header (page header) 

3.2, the relationship between row, page, area, segment and table space

In addition, the database also has the concepts of Extent, Segment and Tablespace.

The relationship between rows, pages, extents, segments, and tablespaces is shown in the following figure:

An extent (Extent) is a storage structure one level larger than a page. In the InnoDB storage engine, an extent is allocated as 64 consecutive pages. Because the page size in InnoDB is 16KB by default, the size of an extent is 64*16KB = 1MB.

A segment (Segment) consists of one or more extents. An extent is a contiguously allocated space in the file system (64 consecutive pages in InnoDB), but the extents within a segment are not required to be adjacent to each other. A segment is the allocation unit in a database, and different types of database objects live in different segments. When we create tables and indexes, the corresponding segments are created: a table segment when a table is created, and an index segment when an index is created.

A tablespace (Tablespace) is a logical container, and the objects stored in a tablespace are segments. A tablespace can contain one or more segments, but a segment can belong to only one tablespace. A database consists of one or more tablespaces, which can be divided into the system tablespace, user tablespaces, undo tablespaces, temporary tablespaces, and so on.

 

3.3, innoDB row format

3.3.1. Four row formats

Data is usually inserted into a table row by row. The way these records are stored on disk is called the row format (also called the record format).

The InnoDB storage engine defines four row formats: Compact, Redundant, Dynamic, and Compressed.

Compact

In MySQL 5.1, the default row format is Compact. A complete record can be divided into two parts: the record's additional information and the record's real data.

 

 

Dynamic

In MySQL 8.0, the default row format is Dynamic.

The Dynamic and Compressed row formats are very similar to the Compact row format, but they handle row overflow data differently:

  • Compressed and Dynamic use full row overflow for data stored in BLOB columns. As shown in the figure, the data page stores only a 20-byte pointer (the address of the overflow page), and the actual data is stored in the Off Page (overflow page).
  • Compact and Redundant keep part of the data (a 768-byte prefix) in the record's real data area.
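The difference in how much of an overflowing column stays in the data page can be written down as a small sketch. It assumes the column has already been selected for off-page storage, and it assumes that Compact and Redundant keep a 20-byte overflow pointer next to the 768-byte prefix (the pointer for these two formats is not stated above); the constant and function names are made up for the example.

```python
OVERFLOW_POINTER_BYTES = 20   # address of the overflow page
PREFIX_BYTES = 768            # prefix kept in the page by Compact and Redundant

def in_page_bytes_for_overflowing_column(row_format):
    """Bytes that stay in the data page for one column stored off-page."""
    fmt = row_format.upper()
    if fmt in ("DYNAMIC", "COMPRESSED"):
        return OVERFLOW_POINTER_BYTES                  # pointer only
    if fmt in ("COMPACT", "REDUNDANT"):
        return PREFIX_BYTES + OVERFLOW_POINTER_BYTES   # prefix plus pointer (assumed)
    raise ValueError(f"unknown row format: {row_format}")
```

Under these assumptions a Dynamic row leaves 20 bytes in the page for such a column, while a Compact row leaves 788 bytes, which is why Dynamic fits more rows per page when large columns overflow.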

Compressed 

The Compressed row format additionally compresses the row data it stores with the zlib algorithm, so it is very effective for long data such as BLOB, TEXT, and VARCHAR.

 

Redundant 

Redundant was InnoDB's row record storage format before MySQL 5.0. MySQL 5.0 and later still support Redundant for compatibility with the page format of earlier versions.

 

As can be seen from the figure above, unlike the Compact row format, the header of the Redundant row format is a field length offset list, which is likewise stored in reverse column order.

3.3.2. Commands specifying the line format

View the default row format of MySQL8:

SELECT @@innodb_default_row_format;

You can also use the following syntax to view the row format used by a specific table:

SHOW TABLE STATUS LIKE 'table_name'\G

Specify the row format in a statement that creates or modifies a table:

CREATE TABLE table_name (column_definitions) ROW_FORMAT=row_format_name;

ALTER TABLE table_name ROW_FORMAT=row_format_name;

Example:

CREATE TABLE record_test_table (
    col1 VARCHAR(8),
    col2 VARCHAR(8) NOT NULL,
    col3 CHAR(8),
    col4 VARCHAR(8)
) CHARSET=ascii ROW_FORMAT=COMPACT;

Insert two records into the table:

INSERT INTO record_test_table(col1, col2, col3, col4)
VALUES
('zhangsan', 'lisi', 'wangwu', 'songhk'),
('tong', 'chen', NULL, NULL);
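For these two records, part of the "additional information" of a Compact row can be worked out by hand. The sketch below assumes the commonly described Compact layout, which is not spelled out above: a list of the lengths of non-NULL variable-length columns in reverse column order, and a NULL bitmap with one bit per nullable column, also in reverse order. The column table and the function name are written for this example only, and the record header is not modeled.

```python
# (column name, variable length?, nullable?) for record_test_table
columns = [
    ("col1", True, True),
    ("col2", True, False),
    ("col3", False, True),   # CHAR(8) with the ascii charset is treated as fixed length here
    ("col4", True, True),
]

def compact_extra_info(values):
    # Lengths of the non-NULL variable-length columns, listed in reverse column order.
    lengths = [len(v) for (_, variable, _), v in zip(columns, values)
               if variable and v is not None]
    var_len_list = list(reversed(lengths))

    # One bit per nullable column; the first nullable column is the lowest bit,
    # so reading the bitmap from high bit to low bit gives reverse column order.
    null_bitmap = 0
    nullable_values = [v for (_, _, nullable), v in zip(columns, values) if nullable]
    for position, v in enumerate(nullable_values):
        if v is None:
            null_bitmap |= 1 << position
    return var_len_list, null_bitmap

first = compact_extra_info(("zhangsan", "lisi", "wangwu", "songhk"))
second = compact_extra_info(("tong", "chen", None, None))
```

Under these assumptions the first record has the length list 06 04 08 and a NULL bitmap of 0, while the second has the length list 04 04 and a NULL bitmap of binary 110 (col4 and col3 are NULL, col1 is not).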


Origin blog.csdn.net/qq_40991313/article/details/130308116