MariaDB SELECT with index used but looks like table scan

cwweng :

I have a MariaDB 10.4 database with a huge table (about 100 million rows) for storing crawled posts. The table contains 4x columns; one of them, lastUpdate (datetime), is indexed.

Recently I tried to select posts by lastUpdate. Most of the queries return quickly and use the index, but some take minutes even though they return fewer rows, and they look like a table scan.

This is the EXPLAIN output for the query without conditions.

> explain select 1 from SourceAttr;
+------+-------------+------------+-------+---------------+---------------+---------+------+----------+-------------+
| id   | select_type | table      | type  | possible_keys | key           | key_len | ref  | rows     | Extra       |
+------+-------------+------------+-------+---------------+---------------+---------+------+----------+-------------+
|    1 | SIMPLE      | SourceAttr | index | NULL          | idxCreateDate | 5       | NULL | 79830491 | Using index |
+------+-------------+------------+-------+---------------+---------------+---------+------+----------+-------------+

This is the EXPLAIN output and the number of rows returned for the slow query. The row estimate in the EXPLAIN is almost equal to the one above.

> explain select 1 from SourceAttr where (lastUpdate >= '2020-01-11 11:46:37' AND lastUpdate < '2020-01-12 11:46:37');
+------+-------------+------------+-------+---------------+---------------+---------+------+----------+--------------------------+
| id   | select_type | table      | type  | possible_keys | key           | key_len | ref  | rows     | Extra                    |
+------+-------------+------------+-------+---------------+---------------+---------+------+----------+--------------------------+
|    1 | SIMPLE      | SourceAttr | index | idxLastUpdate | idxLastUpdate | 5       | NULL | 79827437 | Using where; Using index |
+------+-------------+------------+-------+---------------+---------------+---------+------+----------+--------------------------+

> select 1 from SourceAttr where (lastUpdate >= '2020-01-11 11:46:37' AND lastUpdate < '2020-01-12 11:46:37');
394454 rows in set (14 min 40.908 sec)

This is the fast one.

> explain select 1 from SourceAttr where (lastUpdate >= '2020-01-15 11:46:37' AND lastUpdate < '2020-01-16 11:46:37');
+------+-------------+------------+-------+---------------+---------------+---------+------+---------+--------------------------+
| id   | select_type | table      | type  | possible_keys | key           | key_len | ref  | rows    | Extra                    |
+------+-------------+------------+-------+---------------+---------------+---------+------+---------+--------------------------+
|    1 | SIMPLE      | SourceAttr | range | idxLastUpdate | idxLastUpdate | 5       | NULL | 3699041 | Using where; Using index |
+------+-------------+------------+-------+---------------+---------------+---------+------+---------+--------------------------+

> select 1 from SourceAttr where (lastUpdate >= '2020-01-15 11:46:37' AND lastUpdate < '2020-01-16 11:46:37');
1352552 rows in set (2.982 sec)

Is there any reason that might cause this?

Thanks a lot.

Bill Karwin :

When you see type: index, it's called an index scan. This is almost as bad as a table scan.

Notice the rows: 79827437 in the EXPLAIN of the two slow queries. This means it's examining over 79 million items in the scanned index, either idxCreateDate or idxLastUpdate. So it's basically examining every index entry, which takes nearly as long as examining every row of the table.

By contrast, the quick query says rows: 3699041, so it's estimating fewer than 3.7 million rows examined. That is more than 20x fewer.
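
One way to dig further is to compare the optimizer's estimate with what the query really examines. The statements below are a minimal sketch of that check, reusing the table, index, and date range from the question. That stale index statistics play a role here is only an assumption, not something the EXPLAIN output proves. Note that MariaDB's ANALYZE statement executes the query, so on the slow range it may itself take minutes.

```sql
-- Sketch only: assumes the SourceAttr table and idxLastUpdate index from the question.

-- Refresh the index statistics that the optimizer's row estimates are based on.
ANALYZE TABLE SourceAttr;

-- Re-check the plan for the slow date range.
-- type: range means only the matching part of idxLastUpdate is read;
-- type: index means the whole index is still being scanned.
EXPLAIN
SELECT 1
FROM SourceAttr
WHERE lastUpdate >= '2020-01-11 11:46:37'
  AND lastUpdate <  '2020-01-12 11:46:37';

-- ANALYZE <statement> runs the query and shows the actual row count (r_rows)
-- next to the estimate (rows), so a bad estimate becomes visible.
ANALYZE
SELECT 1
FROM SourceAttr
WHERE lastUpdate >= '2020-01-11 11:46:37'
  AND lastUpdate <  '2020-01-12 11:46:37';
```

If the plan still shows type: index after the statistics are refreshed, then the estimate is probably not the whole story and the optimizer's choice needs a closer look.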
