2、MySQL 索引失效的场景

索引失效的场景：

1、没有 where 条件

直接看 SQL 语句

2、where 条件中所在的列没有建立索引

show index from t;

3、从表中取得数据超过某个阈值。通常认为是 20~30%，即使 where 条件和索引都满足，也不会走索引

看表的行数、看下索引列的 cardinality 值，card 值只能直观反映 = 操作符返回的行数。
对于>=、<=、like、between and 的情况，card 值不能直观判断返回值的数据量。
有时候可以尝试着执行一下，但要注意，不是执行真正的 SQL，而是主要是为了得到 where 访问条件返回的行数，所以可以使用下面的技巧实现需求的转换：
select  x.c_id,s.s_name from xuanke x join student s on x.stu_id=s.stu_id  where s.s_name like ‘abc%’; ——————>
select count(1) from student where s_name like ‘abc%’;
上面的 sql 是不是原始 SQL，而是你想得到某个结果而自己写的 SQL。

4、多列索引没有使用前导列

show index from t1;

5、索引本身失效

如何查找失效索引？
如果索引失效，重建索引！
MySQL 中目前没法查看索引的状态信息。
Use information_schema，show tables，有 innodb_sys_indexs 系统所有的索引信息，
但索引状态信息实际没法看，show index from 语法其实访问的是
information_schema.STATISTICS 数据字典。

6、where 条件列上不干净

比如在列上有函数：MySQL 中不支持，Oracle 中支持在列上建立函数索引
select ... from t1 where upper(name) like 'ABC%';
比如在 where 条件中存在运算：
select ... from t1 where id-100<30;

7、小表也尽量走索引，由于 gap 锁的存在

select 尽量走索引，对于 dml 一定要走索引（否则就是全表锁，导致业务串行化）

8、使用了 ignore index hints

使用了 hints，强制忽略了某个索引，导致没走索引

9、统计信息不真实(严重不真实)

统计信息可能会出现严重不真实导致不走索引：表中有 1000 万，索引唯一值有 20 万，但是旧统计信息中唯一值的数量才 2，导致不走索引，走全表扫描。
如何判断统计信息是否真实：
show table status like 't1';
show index from t1;

手工收集统计信息：
analyze table t1;

例如
对一个表做了 truncate 以后，系统在随后的时间里面，启动了一个自动收集统计信息的作业，这个表的行数更新变成 0 行。随后，对这个表进行数据的
导入，导入 1000 万行，这时候 MySQL 不会去看表中真实的行数，还是会看统计信息的 0 行，这时候会出现问题，需要手工收集统计信息。
但是对于递增式的、每日规律变化的情况，统计信息没有必要每日收集。

对于统计信息的收集：
学会使用 UE、notepad++等工具批量手工收集统计信息：
Analyze table table_name；
利用 select concat(‘analyze table ',TABLE_NAME,';') from tables where table_schema=’tpcc1000’;
复制所有的表，利用 UE 编辑器的列编辑模式，直接去掉不必要的列，加上 analyze table 和分号变成语句，直接在 MySQL 里执行就好了。
 
举例：
MySQL> select concat('analyze table ',TABLE_NAME,';') from information_schema.tables where table_schema='TENNIS';
+-----------------------------------------+
| concat('analyze table ',TABLE_NAME,';') |
+-----------------------------------------+
| analyze table COMMITTEE_MEMBERS; |
| analyze table MATCHES; |
| analyze table PENALTIES; |
| analyze table PLAYERS; |
| analyze table TEAMS; |

在 ultraedit 里使用列模式，直接编辑，编辑完直接复制粘贴到 MySQL 里执行就行。
analynize table COMMITTEE_MEMBERS ;
analynize table MATCHES ;
analynize table PENALTIES ;
analynize table PLAYERS ;
analynize table TEAMS ;
 
MySQL 自动收集统计信息的参数：
比如在 show table status like ‘customer’；的时候会自动收集，最好是手工收集。
1、自动存储参数：innodb_stats_persistent
2、变化量大的情况下，自动收集参数：innodb_stats_auto_recalc：
MySQL> show variables like '%stat%';
| innodb_stats_sample_pages | 8 —

—不管你访问多少，每次都是随机扫 8 个页，看看里面有多少行。
| innodb_stats_persistent_sample_pages | 20 ——
analyze 是采样 20 个页，一般可以设置成 64 个页，所以 analyze 准确一些，建议定期手工收集一下。

| innodb_stats_persistent | ON ——
收集完统计信息以后，把收集的信息永久保存到数据字典里面去，数据库重新启动的时候这个统计信息还在，还可以继续使用，如果是 off，存到内存里面去，下次启动就没了，所以这个参数一定要是 on。
| innodb_stats_auto_recalc | ON —

—当 update 或 delete 等操作产生大的影响时，如果这个参数是 on，会触发统计信息的自动收集。例如：变化超过 20%就触发自动收集，有时候会关闭。

统计信息：
1、表的行数
2、索引列的唯一值的数量

关于统计信息需要知道：
1、只有在统计的时候，才会更新对应的数据
2、统计信息使用来生成执行计划的
3、统计信息没有必要和表、索引保持实时更新
比如：一个表行数是 1000 万，索引列唯一值的数量是 20 万，走索引效果很好；如果这个表的行数变成了 2000 万，索引列唯一值的数量变成了 40万，不影响走索引的效果，所以一般在对表做 dml 时不会主动更新统计信息，因为这样会加重系统的负担。

4、统计信息总是近似的反应表和索引的信息
如何手工修改表的行数以及 cardinality 值：为了欺骗 MySQL 是否走索引
需要注意：在执行 show table status like 语句时会自动收集统计信息。

1、根据 mysql.innodb_table_stats 数据字典修改 n_rows 值：
[root@localhost][mysql]> select * from mysql.innodb_table_stats limit 1;
+---------------+-------------------+---------------------+--------+----------------------+--------------------------+
| database_name | table_name        | last_update         | n_rows |clustered_index_size  | sum_of_other_index_sizes |
+---------------+-------------------+---------------------+--------+----------------------+--------------------------+
| TENNIS        | COMMITTEE_MEMBERS | 2016-05-05 01:47:33 |16      | 1                    | 0                        |
+---------------+-------------------+---------------------+--------+----------------------+--------------------------+
1 row in set (0.00 sec)
[root@localhost][mysql]>

2、根据 mysql.innodb_index_stats 数据字典修改 cardinality 值：
[root@localhost][tpcc1000]> select * from mysql.innodb_index_stats where database_name='tpcc1000' and table_name='customer' and stat_description='c_first';
+---------------+------------+------------+---------------------+--------------+------------+-------------+------------------+
| database_name | table_name | index_name | last_update         | stat_name    | stat_value | sample_size | stat_description |
+---------------+------------+------------+---------------------+--------------+------------+-------------+------------------+
| tpcc1000      | customer   | id_first   | 2016-11-03 15:18:10 | n_diff_pfx01 | 299855     | 20          | c_first |
+---------------+------------+------------+---------------------+--------------+------------+-------------+------------------+
1 row in set (0.00 sec)
[root@localhost][tpcc1000]> update mysql.innodb_index_stats set stat_value=10 where database_name='tpcc1000' and table_name='customer' and stat_description='c_first';
Query OK, 1 row affected (0.01 sec)
Rows matched: 1 Changed: 1 Warnings: 0

[root@localhost][tpcc1000]> select * from mysql.innodb_index_stats where database_name='tpcc1000' and table_name='customer' and stat_description='c_first';
+---------------+------------+------------+---------------------+--------------+------------+-------------+------------------+
| database_name | table_name | index_name | last_update         | stat_name    | stat_value | sample_size | stat_description |
+---------------+------------+------------+---------------------+--------------+------------+-------------+------------------+
| tpcc1000      | customer   | id_first   | 2016-12-28 05:51:27 | n_diff_pfx01 | 10         | 20          | c_first          |
+---------------+------------+------------+---------------------+--------------+------------+-------------+------------------+
1 row in set (0.01 sec)

[root@localhost][tpcc1000]>
mysql> select * from xuanke where c_id >1000;
Empty set (0.31 sec)
mysql> explain select * from xuanke where c_id >1000;
+----+-------------+--------+-------+-------------------+-------------------+---------+------+------+-----------------------+
| id | select_type | table  | type  | possible_keys     | key               |key_len  | ref  | rows | Extra |
+----+-------------+--------+-------+-------------------+-------------------+---------+------+------+-----------------------+
| 1  | SIMPLE      | xuanke | range | FK_Relationship_3 | FK_Relationship_3 | 5       | NULL |    1 | Using index condition |
+----+-------------+--------+-------+-------------------+-------------------+---------+------+------+-----------------------+
1 row in set (0.00 sec)

mysql> explain select * from xuanke where c_id >100;
+----+-------------+--------+------+-------------------+------+---------+------+--------+-------------+
| id | select_type | table | type  | possible_keys     | key  | key_len | ref  | rows   |Extra |
+----+-------------+--------+------+-------------------+------+---------+------+--------+-------------+
| 1  | SIMPLE      | xuanke | ALL  | FK_Relationship_3 | NULL | NULL    |NULL  | 148988 | Using where |
+----+-------------+--------+------+-------------------+------+---------+------+--------+-------------+
1 row in set (0.38 sec)
mysql>

10、数据倾斜的情况下

例如状态值的列，有可能出现没有走索引的情况：
select * from dingdan where dingdanzhuangtai='未处理';
如何解决行数据倾斜：
　　1、使用手工修改统计信息，card 值提升一下
　　2、使用 like 的时候，会临时性使用采样的方式，从表中取 8 个数据块，统计“未处理”值的数量，这时候反而准确了。
　　3、使用 force index 和 ignore index 来做特殊处理，在 Oracle 中不会出现数据倾斜导致不走索引的情况，因为有数据值所占百分比，能够正确引导是否走索引。

11、CBO 计算走索引花费太大

根本原因还是从表中访问的行数过多
针对 like、<=、>=、between and 等不确定的一些条件，会进行动态采样，可能出现有时候走索引，有时候不走索引的情况。
数据库最核心的组件是优化器，对 SQL 进行解析，生成执行计划。

优化器工作模式：
 1、RBO（rule based optimization），基于规则的优化器，条件太苛刻，现在基本不用
主要干什么：
　　1、是否走索引（定义规则为：是否有 where 条件、where 条件是否有索引等，如果满足规则就走索引，不满足就不走索引）
　　2、表的连接顺序等（定义规则为：按照写的 SQL 中的表连接顺序）针对这种优化器，我们在写 SQL 的时候，就需要了解这种优化器的工作习性（规则库），按照他的脾气来，强烈依赖 SQL 的写法。

2、CBO（cost based optimization），基于成本的优化器（现在主要是 CBO，MySQL只有 CBO）
　　1、在解析以前，会做一件事情，对 SQL 进行改写，改写成更合理的一些 SQL语句
　　2、将执行路径列出来，计算每一个执行路径的成本(cpu 和 io 的成本，主要是 io 成本)，基于统计信息进行计算，估算一个成本
　　3、选择执行成本最低的 SQL 作为执行计划

对于 CBO：
1、不过度依赖 SQL 写法
2、严重依赖统计信息

12、隐式类型转换导致索引失效

1、数字列不害怕类型转换
2、字符串列非常害怕隐式类型转换，因此对于字符串的列，一定要加上''：
字符串列“坑”
我们习惯于将很多列定位为字符串，例如手机号列，在 where 的时候，也习惯与=123，不加''，因为我们认为这是数字，但是定义的是字符列，发生隐式转换，导致索引失效。
3、日期列不害怕隐式类型转换
设计原则：
如果存储的是数字，就定义成数字列
如果存储的是日期，就定义成日期列
如果存储的是字符串，where 条件的时候，右面一定要加上''

13、<>会导致索引失效

因为数据库认为等于的时候会取少量数据，认为不等于会取大量的数据

14、在 where 条件的 like 中%在前

where like '%abc'，有索引也会失效，后面的中这种写法索引可能不会失效
select * from ... where name like 'abc%'

15、not in(值的列表)经常索引失效，in(值的列表)一般走索引

因为认为 not in 时结果集会比较大，而 in 的时候结果集会比较小。

16、对于 not in 和 not exists 子查询的情况，索引不一定失效

对 not in、not exists 和 left join 之间的相互转换，索引都生效了：
mysql> explain select * from student s where s.stu_id not in (select stu_id from xuanke);
+----+-------------+--------+-------+-------------------+-------------------+---------+------ +--------+-------------+
| id | select_type | table  | type  | possible_keys     | key |key_len      | ref     | rows  | Extra  |
+----+-------------+--------+-------+-------------------+-------------------+---------+------ +--------+-------------+
| 1  | PRIMARY     | s      | ALL   | NULL              | NULL              | NULL    | NULL  | 194737 | Using where |
| 2  | SUBQUERY    | xuanke | index | FK_Relationship_1 |FK_Relationship_1  | 5       | NULL  | 149877 | Using index |
+----+-------------+--------+-------+-------------------+-------------------+---------+------ +--------+-------------+
2 rows in set (0.00 sec)
mysql> explain select * from student s where not exists (select 1 from xuanke x where x.stu_id=s.stu_id);
+----+--------------------+-------+------+-------------------+-------------------+---------+-----------------+--------+-------------+
| id | select_type        | table | type | possible_keys     | key| key_len      | ref     | rows            | Extra |
+----+--------------------+-------+------+-------------------+-------------------+---------+-----------------+--------+-------------+
| 1 | PRIMARY             | s     | ALL  | NULL              | NULL              | NULL    | NULL            | 194737 | Using where |
| 2 | DEPENDENT SUBQUERY  | x     | ref  | FK_Relationship_1 |FK_Relationship_1  | 5       | xuanke.s.stu_id | 1      | Using index |
+----+--------------------+-------+------+-------------------+-------------------+---------+-----------------+--------+-------------+
mysql> explain select * from student s left join xuanke x on s.stu_id=x.stu_id and x.c_id is null;
+----+-------------+-------+------+-------------------------------------+-------------------+---------+-----------------+--------+-------------+
| id | select_type | table | type | possible_keys                       | key               | key_len | ref             | rows   | Extra |
+----+-------------+-------+------+-------------------------------------+-------------------+---------+-----------------+--------+-------------+
| 1  | SIMPLE      | s     | ALL  | NULL                                | NULL              | NULL    | NULL            | 194737 | NULL|
| 1  | SIMPLE      | x     | ref  | FK_Relationship_1,FK_Relationship_3 |FK_Relationship_1  | 5       | xuanke.s.stu_id | 1      | Using where |
+----+-------------+-------+------+-------------------------------------+-------------------+---------+-----------------+--------+-------------+
2 rows in set (0.00 sec)

17、对于 in、exists 子查询的情况，索引一般也不会失效

mysql> explain select * from student s where exists (select 1 from xuanke x
where x.stu_id=s.stu_id);
+----+--------------------+-------+------+-------------------+-------------------+---------+-----------------+--------+-------------+
| id | select_type        | table | type | possible_keys     | key               | key_len | ref             | rows   | Extra |
+----+--------------------+-------+------+-------------------+-------------------+---------+-----------------+--------+-------------+
| 1  | PRIMARY            | s     | ALL  | NULL              | NULL              | NULL    | NULL            | 194737 | Using where |
| 2  | DEPENDENT SUBQUERY | x     | ref  | FK_Relationship_1 |FK_Relationship_1  | 5       | xuanke.s.stu_id | 1      | Using index |
+----+--------------------+-------+------+-------------------+-------------------+---------+-----------------+--------+-------------+
2 rows in set (0.00 sec)
mysql> explain select * from student s where s.stu_id in (select stu_id from xuanke);
+----+--------------+-------------+--------+-------------------+-------------------+--------- +-----------------+--------+-------------+
| id | select_type  | table       | type   | possible_keys     | key               | key_len  | ref             | rows   | Extra       |
+----+--------------+-------------+--------+-------------------+-------------------+--------- +-----------------+--------+-------------+
| 1  | SIMPLE       | s           | ALL    | PRIMARY           |NULL               | NULL     | NULL            | 194737 | Using where |
| 1  | SIMPLE       | <subquery2> | eq_ref | <auto_key>        |<auto_key>         | 5        | xuanke.s.stu_id | 1      | NULL        |
| 2  | MATERIALIZED | xuanke      | index  | FK_Relationship_1 |FK_Relationship_1  | 5        | NULL            | 149877 | Using index |
+----+--------------+-------------+--------+-------------------+-------------------+--------- +-----------------+--------+-------------+
3 rows in set (0.00 sec)

18、对于日期时间列来说，下面的索引会失效

发生了隐式类型转换，导致索引会失效：
explain select * from login_record1 where t_time=cast('2011-1-1' as date);
explain select * from login_record1 where d_date=cast('13:00:00' as time);
explain select * from login_record1 where t_time='2011-1-1'; //索引不会失效

19、is null 一般会走索引，即使所有的数据都为空，这是一个 bug

is not null 有时候走索引，有时候不走索引，还是比较准确，主要看空值和非空值的数量。
is null 可能会成为一个坑。
MySQL 没有存储数据分布，因此在进行 where 条件的时候，可能会出现数据倾斜的盲点，反而采用 like 等模糊匹配的时候，因为会刺激 mysql 进行动态采样，反而会比较准确。有时候会出现下面的一些写法：
　　1、like 'abc';来代替='abc'
　　2、between 1 and 1 来代替=1

2、MySQL 索引失效的场景

猜你喜欢