Revealing the MySQL index world: concepts, classifications, and application scenarios all in one place

1. Index concept

A MySQL index is a data structure used to improve query performance. It lets the database retrieve rows without scanning the whole table, which reduces the time spent searching for specific data in large data sets. An index acts like the table of contents of a book: it maps key values to the locations of the actual rows, speeding up access to the data in a table.

2. Index type

MySQL supports multiple types of indexes, which can be divided according to different classification criteria. The following are common index types in MySQL, introduced according to their classification:

1. Classification according to data structure:

  • B+Tree index: The B+Tree (a variant of the B-tree) is the most common index structure in database management systems, and it is the default index structure of MySQL's InnoDB and MyISAM storage engines. Because its keys are kept in sorted order, a B+Tree index supports equality queries, range queries, and sorting.
  • Hash index: An index built using a hash algorithm, suitable for equality queries, but not suitable for range queries and sorting.
  • Full-text index: Index for full-text search, supporting keyword search on text data.
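The three structures above can be requested explicitly when a table is created. The following is a minimal sketch: the table and index names (article, session_cache, and so on) are hypothetical, and it assumes MySQL 5.6 or later, where InnoDB accepts FULLTEXT indexes.

```sql
-- Hypothetical tables, used only for illustration.
CREATE TABLE article (
   id INT PRIMARY KEY,
   author VARCHAR(50),
   body TEXT,
   INDEX author_index (author) USING BTREE,   -- B+Tree index (the InnoDB default)
   FULLTEXT INDEX body_fulltext (body)        -- full-text index
) ENGINE=InnoDB;

-- Hash indexes are available in the MEMORY storage engine.
CREATE TABLE session_cache (
   token CHAR(32),
   user_id INT,
   INDEX token_hash (token) USING HASH
) ENGINE=MEMORY;

-- A full-text index is queried with MATCH ... AGAINST rather than LIKE.
SELECT id FROM article WHERE MATCH(body) AGAINST('index');
```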

2. Classification according to nature & purpose:

Primary key index, unique index and ordinary index are common index types in databases. They have different characteristics and uses:

  1. Primary Key Index:
    • Properties: The primary key index is a special unique index used to uniquely identify each row in the table.
    • Uniqueness: The values of the primary key columns must be unique and cannot be NULL.
    • Main purpose: A table can have only one primary key, and it serves as the identifier by which each row is located.
CREATE TABLE example (
   id INT PRIMARY KEY,
   name VARCHAR(50)
);
  2. Unique Index:
    • Properties: A unique index requires that all non-NULL values in the indexed column are distinct.
    • Uniqueness: Index values for different rows must be unique. NULL is the exception: a column that permits NULL can hold multiple NULL values.
    • Main purpose: A unique index is used to ensure that the data in a column or group of columns does not contain duplicates.
CREATE TABLE example (
   id INT,
   email VARCHAR(50) UNIQUE,
   PRIMARY KEY (id)
);
  3. Ordinary Index:
    • Properties: An ordinary index is the most basic index type and has no uniqueness constraint.
    • Uniqueness: Duplicate index values are allowed.
    • Main purpose: Ordinary indexes are used to speed up the retrieval of data in the table, and can serve both equality queries and range queries.
CREATE TABLE example (
   id INT,
   name VARCHAR(50),
   INDEX name_index (name)
);
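The difference between the three kinds of index shows up when rows are inserted. The sketch below uses a hypothetical member table and adds the indexes after table creation with CREATE INDEX; the error number is what MySQL reports for a duplicate key.

```sql
-- Hypothetical table for illustration.
CREATE TABLE member (
   id INT PRIMARY KEY,
   email VARCHAR(50),
   name VARCHAR(50)
);

-- Indexes can also be added after the table exists.
CREATE UNIQUE INDEX email_unique ON member (email);
CREATE INDEX name_index ON member (name);

INSERT INTO member VALUES (1, 'a@example.com', 'Ann');
INSERT INTO member VALUES (2, NULL, 'Ann');   -- duplicate name: allowed by the ordinary index
INSERT INTO member VALUES (3, NULL, 'Bob');   -- second NULL email: allowed by the unique index

-- Rejected with error 1062 (duplicate entry): the email already exists.
INSERT INTO member VALUES (4, 'a@example.com', 'Cy');

-- Rejected with error 1062 as well: primary key value 1 is already taken.
INSERT INTO member VALUES (1, 'b@example.com', 'Di');
```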

3. Classification according to scope:

Single-column indexes and composite indexes are two common index types that differ in how they are used and in their performance impact.

1. Single-Column Index:

  • Properties: A single-column index is an index created for a single column in the table.
  • Uniqueness: A single-column index can be a unique index or an ordinary index. A unique index requires that the values in the indexed column are unique, while an ordinary index allows duplicate values.
  • Purpose: Single-column indexes are usually used to speed up operations such as equality queries and range queries on a single column.
CREATE TABLE example (
   id INT PRIMARY KEY,
   name VARCHAR(50),
   INDEX name_index (name)
);

2. Composite Index:

  • Properties: A composite index is an index created on multiple columns of a table. The columns are combined into one index in the order in which they are declared.
  • Uniqueness: A composite index can be a unique index, which requires the combination of values across the indexed columns to be unique, or it can be an ordinary index.
  • Purpose: Composite indexes are often used to speed up queries that filter on several columns together, such as an equality condition on the leading column combined with an equality or range condition on the next column.
CREATE TABLE example (
   id INT,
   category VARCHAR(50),
   price DECIMAL(10,2),
   INDEX category_price_index (category, price)
);
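Because the columns of a composite index are ordered, a query generally benefits only when it constrains the leading column. The statements below are a sketch against the example table above; the literal values are made up, and the plan the optimizer actually picks depends on table statistics.

```sql
-- Assumes the example table above with INDEX category_price_index (category, price).

-- Can use the index: the leading column is constrained.
EXPLAIN SELECT * FROM example WHERE category = 'books';

-- Can use both index columns: equality on category, then a range on price.
EXPLAIN SELECT * FROM example WHERE category = 'books' AND price BETWEEN 10 AND 50;

-- Typically cannot use the index for a lookup: the leading column is missing.
EXPLAIN SELECT * FROM example WHERE price BETWEEN 10 AND 50;
```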

4. Classification according to storage location:

Clustered indexes and non-clustered indexes are two commonly used index types in MySQL.

1. Clustered index

A table can have only one clustered index, because the clustered index determines the order in which the rows themselves are stored. The leaf nodes of the clustered index hold the complete data rows, so a query on the clustered index reaches the matching rows directly, with no additional lookup.

Clustered indexes are mainly used in the following scenarios:

  • Primary key: In InnoDB, the primary key index is the clustered index. Its leaf nodes store the full rows, ordered by primary key value, so a query by primary key locates the matching rows directly. If a table has no primary key, InnoDB clusters on the first unique index whose columns are all NOT NULL, and otherwise on a hidden row ID.
  • Range query: A range query retrieves the rows whose key falls within a given interval. Because rows with adjacent key values are stored next to each other in the clustered index, such queries can read the matching rows sequentially.

2. Non-clustered index

The leaf nodes of a non-clustered index store a reference to the data row rather than the row itself: in InnoDB the reference is the row's primary key value, and in MyISAM it is the row's position in the data file. A query on a non-clustered index therefore first locates the leaf entry and then follows that reference to fetch the row.

Non-clustered indexes are mainly used in the following scenarios:

  • Non-primary key: Non-clustered indexes can be created on columns other than the primary key column.
  • Equality query: An equality query retrieves the rows whose column matches a given value. A non-clustered index on that column lets MySQL find the matching entries without scanning the table.
  • Join operation: A join operation refers to querying data in two or more tables. For join operations, creating a non-clustered index on the join condition column can improve the efficiency of the join operation.

InnoDB and MyISAM are two commonly used storage engines in MySQL. The following differences exist between InnoDB and MyISAM support for clustered and nonclustered indexes:

InnoDB

  • InnoDB supports clustered indexes, and each table has exactly one. The leaf nodes of the clustered index store the data rows of the table, so a query on the clustered index locates the matching rows directly.
  • InnoDB also supports non-clustered (secondary) indexes. Their leaf nodes store the primary key value of the row, which is then used to find the row in the clustered index.

MyISAM

  • MyISAM does not support clustered indexes. Rows are stored in a separate data file, independent of the order of any index.
  • MyISAM supports non-clustered indexes, including its primary key index. Their leaf nodes store the position of the row in the data file.

Therefore, the corresponding relationship between clustered indexes and non-clustered indexes and InnoDB and MyISAM is as shown in the following table:

Index type             InnoDB       MyISAM
Clustered index        Supported    Not supported
Non-clustered index    Supported    Supported
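The storage engine is chosen per table, so the same definition can be created both ways for comparison. This is a sketch with hypothetical table names; the comments describe what each engine keeps in the index leaf nodes.

```sql
-- Hypothetical tables; the names are for illustration only.
CREATE TABLE orders_innodb (
   id INT PRIMARY KEY,                  -- clustered index: leaf nodes hold the full rows
   customer_id INT,
   INDEX customer_index (customer_id)   -- secondary index: leaf nodes hold customer_id plus id
) ENGINE=InnoDB;

CREATE TABLE orders_myisam (
   id INT PRIMARY KEY,                  -- non-clustered: leaf nodes hold the row position in the data file
   customer_id INT,
   INDEX customer_index (customer_id)
) ENGINE=MyISAM;

-- List the indexes defined on each table.
SHOW INDEX FROM orders_innodb;
SHOW INDEX FROM orders_myisam;
```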

3. Usage scenarios

MySQL indexes are a common way to improve query speed. An index is a special data structure that keeps the values of one or more columns of a table in sorted order. When a user executes a query, MySQL searches the index using the query conditions and uses the result to locate the matching records quickly.

The usage scenarios of MySQL indexes mainly include the following aspects:

  • Large table query: Without an index, MySQL has to perform a full table scan, reading row by row until it has found every record that meets the query conditions. This can be very time consuming for large tables. With an index, MySQL can locate the matching records directly, which improves query speed.
  • Join operation: A join operation queries data from two or more tables, and its efficiency depends heavily on the join condition. If the columns in the join condition are indexed, MySQL can use the index to look up the matching records in the joined table instead of scanning it for every row.
  • Data sorting: Without a suitable index, MySQL has to sort the result set itself, which can be very time consuming for large tables. When the sort order matches an index, MySQL can read the rows in index order and skip the separate sort step.
  • Multiple column query: If the query condition involves multiple columns and none of them is indexed, MySQL has to check every row against every condition. With an index that covers those columns, such as a composite index, MySQL can quickly narrow the search to the records that meet the conditions.
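Whether a query actually benefits from an index can be checked with EXPLAIN. The schema below is hypothetical and only illustrates the scenarios listed above; the optimizer chooses plans based on table statistics, so results on a small or empty table may differ.

```sql
-- Hypothetical schema for illustration.
CREATE TABLE customer (
   id INT PRIMARY KEY,
   city VARCHAR(50),
   INDEX city_index (city)
);
CREATE TABLE purchase (
   id INT PRIMARY KEY,
   customer_id INT,
   created_at DATETIME,
   INDEX customer_created_index (customer_id, created_at)
);

-- Filtering: key = city_index in the EXPLAIN output indicates an index lookup
-- rather than a full table scan (type = ALL).
EXPLAIN SELECT * FROM customer WHERE city = 'Berlin';

-- Join: the index on purchase.customer_id lets MySQL look up matching rows per customer.
EXPLAIN SELECT c.id, p.id
FROM customer c JOIN purchase p ON p.customer_id = c.id
WHERE c.city = 'Berlin';

-- Sorting: entries for one customer are already ordered by created_at in the index,
-- so the Extra column would normally not show "Using filesort".
EXPLAIN SELECT * FROM purchase WHERE customer_id = 42 ORDER BY created_at;
```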

When using MySQL indexes, you need to pay attention to the following points:

  • Indexes will occupy a certain amount of storage space. Therefore, when creating an index, you need to consider the impact of the index on storage space.
  • Indexes slow down writes. Every INSERT, UPDATE, or DELETE that touches an indexed column must also update the index, so each additional index adds to the cost of modifying data.
  • Indexes should be based on actual query requirements. An index on a column that rarely appears in query conditions brings little benefit to reads while still adding storage and write overhead.
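One way to act on these points is to review which indexes exist and whether they are used. The sketch below assumes MySQL 5.7 or later with the sys schema installed and the Performance Schema enabled; the table and index names refer to the earlier example table.

```sql
-- Show the indexes defined on a table, including cardinality estimates.
SHOW INDEX FROM example;

-- Indexes with no recorded use since the server started (sys schema view).
SELECT * FROM sys.schema_unused_indexes;

-- Remove an index that is no longer needed.
ALTER TABLE example DROP INDEX name_index;
```

Usage counters reset when the server restarts, so an index that appears unused shortly after a restart may still be needed by infrequent queries.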

4. Related topics

  • 1. What is an index? What does an index do?
    • An index is a data structure that helps the database perform queries, sorting, grouping, joins, and similar operations quickly. Its function is to improve database performance by reducing disk I/O and shortening query time.
  • 2. What types of indexes are there?
    • There are many types of indexes, the common ones are as follows:
      • Primary key index: The primary key index is a unique index that ensures every row in the table can be identified uniquely. It is the identifier of the table, and a table can have only one primary key index.
      • Unique index: A unique index does not allow duplicate values, which ensures the uniqueness of the indexed column. A table can have multiple unique indexes, and each one can be defined on a single column or on a combination of columns.
      • Ordinary index: An ordinary index is the most basic index. It carries no uniqueness constraint and can be created on any indexable column. Ordinary indexes improve query speed but slow down inserts, updates, and deletes.
      • Composite index: A composite index is an index composed of multiple columns, which allows queries on several conditions at once. The order of the columns in a composite index is important and affects which queries can use it.
      • Full-text index: A full-text index is an index for text-type columns. It supports word segmentation, matching, and relevance ranking on text content. Full-text indexing is suitable for search-style scenarios, but its maintenance cost is high.
  • 3. What data structures does the index have?
    • There are many index data structures, the common ones are as follows:
      • B+ tree index: A B+ tree is a multi-way balanced search tree. Each node can have multiple child nodes, and all leaf nodes are at the same level and linked to each other. The advantages of the B+ tree index are fast queries, convenient range queries, and high space utilization. The disadvantage is that insertions and deletions can trigger tree adjustments, so the maintenance cost is high. The B+ tree is the default index type of MySQL's InnoDB and MyISAM engines and is suitable for most scenarios.
      • Hash index: A hash index is based on a hash table. Each entry is located by the hash value of its key, so lookup speed depends only on computing the hash value. The advantage of a hash index is that equality queries are extremely fast. The disadvantages are that it does not support range queries, sorting, or grouping, and that it is prone to hash collisions and low space utilization. Hash indexes are suitable for in-memory data, such as tables in MySQL's Memory engine.
      • Full-text index: A special index in MySQL used for retrieving text data. Full-text indexes enable queries based on the relevance of words in the text rather than exact matches.
  • 4. What are clustered indexes and non-clustered indexes? What's the difference?

Clustered indexes and non-clustered indexes are two commonly used index types in MySQL. The differences between clustered indexes and non-clustered indexes are mainly reflected in the following aspects:

  • Structure of the index: The leaf nodes of the clustered index store the data rows in the table, while the leaf nodes of the non-clustered index store pointers to the data rows.
  • Function of the index: The clustered index determines the physical order of data in the table, while a non-clustered index is used only to improve query efficiency.
  • Number of indexes: A table can have only one clustered index, but can have multiple non-clustered indexes.

  • 5. What is a table lookup (back-to-table) query?

When a query uses an index but also needs the values of columns that are not stored in that index, a table lookup is required. The database takes the row reference found in the index, fetches the actual data row, and reads the remaining column values from it. In InnoDB, this means taking the primary key value from the secondary index and searching the clustered index a second time.

In MyISAM, because the index and data are stored separately, any column that is not in the index has to be read from the data file in an additional step. This can cause a performance penalty, especially when the query involves a large number of table lookups.
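A table lookup can be avoided when every column the query needs is already in the index, often called a covering index. The following sketch uses a hypothetical employee table on InnoDB, where a secondary index also carries the primary key value.

```sql
-- Hypothetical table for illustration.
CREATE TABLE employee (
   id INT PRIMARY KEY,
   name VARCHAR(50),
   department VARCHAR(50),
   salary DECIMAL(10,2),
   INDEX department_index (department)
) ENGINE=InnoDB;

-- Needs a table lookup: salary is not stored in department_index.
EXPLAIN SELECT salary FROM employee WHERE department = 'sales';

-- No table lookup: department and id are both available in the secondary index,
-- so EXPLAIN reports "Using index" in the Extra column.
EXPLAIN SELECT id, department FROM employee WHERE department = 'sales';
```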

  • 6. Why does MyISAM not support clustered indexes?

MyISAM's lack of clustered indexes follows from its design goals of simplicity and read performance. The main reasons are:

    • Table-level locking: MyISAM uses table-level locking instead of row-level locking. Table-level locking is simpler and efficient for read-intensive workloads, and a storage layout in which every index is non-clustered keeps the locking strategy straightforward.
    • Simple storage structure: MyISAM stores data and indexes separately, which simplifies the storage structure and reduces storage and maintenance overhead. Non-clustered indexes fit this design naturally.
    • No transaction support: MyISAM does not support transactions, so it does not need to guarantee atomicity, consistency, isolation, and durability the way InnoDB does. Without those requirements, non-clustered indexes are easier to implement.
    • Full-text search: MyISAM has long supported full-text indexes, which are separate, non-clustered index structures.


Origin blog.csdn.net/citywu123/article/details/134789558