Does a full table scan in MySQL always mean a performance problem?

How many ways does a database have to find a row of data in a table?

There are two: a full table scan or an index lookup.

A full table scan obtains the query result by reading all the data in the table. The biggest problem with this method is that as the amount of data increases, so does the time spent scanning it on disk, and performance drops sharply. In MySQL, we can use the EXPLAIN command to view the execution plan of an SQL statement, for example (using sample data):

EXPLAIN
SELECT *
FROM employee;

Name         |Value   |
-------------+--------+
id           |1       |
select_type  |SIMPLE  |
table        |employee|
partitions   |        |
type         |ALL     |
possible_keys|        |
key          |        |
key_len      |        |
ref          |        |
rows         |25      |
filtered     |100.0   |
Extra        |        |

In the query plan above, the value of the type field is ALL, which means a full table scan.
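
A full table scan also shows up when the WHERE clause filters on a column that has no index. The following is a minimal sketch, assuming the sample employee table has a salary column with no index on it (both are assumptions; the output above does not confirm them):

```sql
-- Assumption: employee.salary exists and has no index.
-- With no usable index, MySQL has to read every row and apply the filter afterwards.
EXPLAIN
SELECT *
FROM employee
WHERE salary > 10000;
```

Under those assumptions, type would again be expected to be ALL, usually with Using where in the Extra column.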

An index lookup locates data quickly through an index (usually a B+ tree or B* tree). The following example looks up employee information by primary key:

EXPLAIN
SELECT *
FROM employee
WHERE emp_id = 10;

Name         |Value   |
-------------+--------+
id           |1       |
select_type  |SIMPLE  |
table        |employee|
partitions   |        |
type         |const   |
possible_keys|PRIMARY |
key          |PRIMARY |
key_len      |4       |
ref          |const   |
rows         |1       |
filtered     |100.0   |
Extra        |        |

The value of the type field in the output is const, indicating that the data is looked up through the primary key or a unique index and that at most one row is returned. This is a very fast access method. Because only one row can match, the optimizer can treat its column values as constants, hence the name const.

An index lookup may also take the form of an index range scan. For example:

EXPLAIN
SELECT *
FROM employee
WHERE emp_id BETWEEN 10 AND 12;

Name         |Value      |
-------------+-----------+
id           |1          |
select_type  |SIMPLE     |
table        |employee   |
partitions   |           |
type         |range      |
possible_keys|PRIMARY    |
key          |PRIMARY    |
key_len      |4          |
ref          |           |
rows         |3          |
filtered     |100.0      |
Extra        |Using where|

The value of the type field in the output is range, indicating that the data is obtained through a range scan of the primary key index.
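
The range scan above uses the primary key. A range scan on a secondary index can be sketched the same way. The hire_date column and the index name idx_emp_hire_date are hypothetical, introduced only for illustration:

```sql
-- Assumption: employee has a hire_date column; idx_emp_hire_date is a made-up index name.
CREATE INDEX idx_emp_hire_date ON employee (hire_date);

-- A range condition on the indexed column makes the secondary index a candidate.
EXPLAIN
SELECT *
FROM employee
WHERE hire_date BETWEEN '2020-01-01' AND '2020-12-31';
```

Whether the optimizer actually reports range here or falls back to ALL depends on the table size and the index statistics. On a table with only a few dozen rows it may well prefer the full table scan.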

Generally speaking, finding data through an index is more efficient than a full table scan. However, there are still some cases where a full table scan is the better choice, including:

  • The table is so small that a full table scan is faster than an index lookup. This is especially true compared with a range scan on a secondary index, because after scanning the index MySQL has to go back to the table to fetch the rows, which is random I/O. For example, a configuration table with fewer than 10 rows can be read quickly with a full table scan.
  • The query has no filter condition on an indexed column, or the indexed condition matches a large part of the table, so a full table scan is faster. One typical scenario is aggregate analysis in a data warehouse, which usually needs to summarize the data in the entire table.
  • The cardinality (number of distinct values) of the index is too low, such as a separate index on a gender field. In this case, MySQL estimates that the index would match a large number of rows, so using it would perform worse than a full table scan.
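
The cardinality mentioned in the last item can be checked before deciding whether an index is worth having. This is a rough sketch; the gender column is an assumed name taken from the example above and may not exist in the sample table:

```sql
-- The Cardinality column shows MySQL's estimate of distinct values for each index.
SHOW INDEX FROM employee;

-- Exact figures for a candidate column (assumption: a column named gender exists).
SELECT COUNT(DISTINCT gender) AS distinct_values,
       COUNT(*)               AS total_rows
FROM employee;
```

If distinct_values is tiny compared with total_rows, each index value matches a large share of the table, and an index on that column alone is unlikely to beat a full table scan.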

A full table scan in the execution plan does not necessarily mean that the query has a performance problem; it may be the correct choice made by the MySQL optimizer after its analysis. If we have confirmed that the full table scan is not the optimal method, several techniques can help the optimizer choose another access method: using the ANALYZE TABLE command to update statistics, using the index hint FORCE INDEX to force the use of an index, or lowering the system variable max_seeks_for_key, which caps the number of key seeks the optimizer assumes an index lookup will need and therefore makes it prefer indexes over table scans.
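
The three techniques can be sketched as follows. The dept_id column and the index name idx_emp_dept are hypothetical and only serve the illustration, and the value 100 is an arbitrary low number rather than a recommendation:

```sql
-- 1. Refresh the statistics the optimizer relies on.
ANALYZE TABLE employee;

-- 2. Force a specific index with an index hint.
--    Assumption: employee has a dept_id column; idx_emp_dept is a made-up name.
CREATE INDEX idx_emp_dept ON employee (dept_id);

EXPLAIN
SELECT *
FROM employee FORCE INDEX (idx_emp_dept)
WHERE dept_id = 5;

-- 3. Make index access look cheaper for the current session only.
SET SESSION max_seeks_for_key = 100;

-- Restore the session value to the global setting afterwards.
SET SESSION max_seeks_for_key = DEFAULT;
```

It is worth running EXPLAIN again after each step, and comparing actual execution times, to confirm that the new plan really is faster than the full table scan.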

Origin blog.csdn.net/horses/article/details/131228811