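Index Optimization and Sort Optimization

Index Optimization

Index Types

Single-column index
Composite index
Unique index

Example

Create the tables: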

create table staffs(
id int primary key auto_increment,
name varchar(24) not null default "",
age int not null default 0,
pos varchar(20) not null default "",
add_time timestamp not null default CURRENT_TIMESTAMP
)charset utf8;

create table user(
id int not null auto_increment primary key,
name varchar(20) default null,
age int default null,
email varchar(20) default null
)charset=utf8;

Insert data:

insert into staffs(`name`,`age`,`pos`,`add_time`) values('z3',22,'manager',now());
insert into staffs(`name`,`age`,`pos`,`add_time`) values('July',23,'dev',now());
insert into staffs(`name`,`age`,`pos`,`add_time`) values('2000',23,'dev',now());

insert into user(name,age,email) values('1aa1',21,'[email protected]');
insert into user(name,age,email) values('2aa2',22,'[email protected]');
insert into user(name,age,email) values('3aa3',23,'[email protected]');
insert into user(name,age,email) values('4aa4',25,'[email protected]');

Create a composite index:

create index idx_staffs_nameAgePos on staffs(name,age,pos);

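There is a well-known mnemonic for using MySQL indexes:
Full-value matching is my favorite; the leftmost-prefix rule must be obeyed.
The leading column must not be missing; the middle columns must not be skipped.
Do little computation on indexed columns; everything after a range condition is lost.
Put the LIKE percent sign on the right; with a covering index, do not write the star.
Not-equal, NULL checks, and OR can defeat the index, so use them sparingly.
Never drop the quotes around varchar values; advanced SQL is not that hard.

What this means:
When a composite index is built on several columns, it works best to use all of those columns in the query, in the same order as in the index.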

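With the index idx_staffs_nameAgePos created above, the leading name column must appear in the conditions, otherwise the index cannot be used for lookups. If pos is used, age must be present as well; when age is skipped, only the name part of the index helps narrow the search.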
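
The leftmost-prefix rule can be checked with EXPLAIN. The following is a minimal sketch that assumes the staffs table and idx_staffs_nameAgePos from above; with only three rows the optimizer may choose differently than it would on a large table, so treat the comments as what to look for rather than guaranteed output.

```sql
-- Compare the key and key_len columns of each EXPLAIN result.

-- Full match on all three columns, in index order.
EXPLAIN SELECT * FROM staffs WHERE name = 'July' AND age = 23 AND pos = 'dev';

-- Leftmost prefix only: the name part of the index can be used.
EXPLAIN SELECT * FROM staffs WHERE name = 'July';

-- Leading column missing: the index cannot be used for a lookup.
EXPLAIN SELECT * FROM staffs WHERE age = 23 AND pos = 'dev';

-- Middle column skipped: only the name part can narrow the search.
EXPLAIN SELECT * FROM staffs WHERE name = 'July' AND pos = 'dev';
```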

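Do not perform any operation on an indexed column (calculations, functions, or automatic/manual type conversion); doing so makes the index unusable for that condition and leads to a full table scan.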
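
As an illustration, the two queries below return the same row from the sample data but give the optimizer very different options. This is a sketch against the staffs table above; the use of LEFT() is just one example of a function applied to the column.

```sql
-- Equality on the bare column: the index on name can be used.
EXPLAIN SELECT * FROM staffs WHERE name = 'July';

-- Function wrapped around the indexed column: MySQL has to evaluate
-- LEFT(name, 4) for every row, so expect type = ALL (full table scan).
EXPLAIN SELECT * FROM staffs WHERE LEFT(name, 4) = 'July';
```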

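If a column in the middle of the index is used in a range condition (>, <, and so on), the index columns after it can no longer be used to narrow the search. For example, with the condition where name = 'z3' and age > 18 and pos = 'manager'; on the index above, the pos = 'manager' part does not use the index. A prefix LIKE such as 'kk%' is also a range scan, but it is a partial exception, as the exercises later in this article show.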
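
A quick way to see the effect of a range condition is to compare key_len between an all-equality query and one with a range in the middle. A sketch against the staffs table above:

```sql
-- All equality conditions: key_len should cover name, age, and pos.
EXPLAIN SELECT * FROM staffs WHERE name = 'z3' AND age = 22 AND pos = 'manager';

-- Range on age: expect a shorter key_len (name and age only) and
-- type = range; pos is checked after the index lookup.
EXPLAIN SELECT * FROM staffs WHERE name = 'z3' AND age > 18 AND pos = 'manager';
```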
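In a LIKE pattern, a leading percent sign makes the index unusable for lookups. Sometimes a trailing-wildcard pattern cannot find what you need. If you need % on both sides and still want to avoid a full table scan, use a covering query: select only columns that are contained in the index (ideally matching its columns and order), that is, write select name, age, pos instead of select *.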
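
The three cases can be compared side by side. This sketch uses the staffs table above; the search strings are made up for illustration.

```sql
-- Trailing wildcard: a range scan on the index is possible.
EXPLAIN SELECT * FROM staffs WHERE name LIKE 'Ju%';

-- Leading wildcard with select *: expect a full table scan (type = ALL).
EXPLAIN SELECT * FROM staffs WHERE name LIKE '%ul%';

-- Leading wildcard, but only indexed columns are selected: MySQL can
-- scan the index instead of the table (look for "Using index" in Extra).
EXPLAIN SELECT name, age, pos FROM staffs WHERE name LIKE '%ul%';
```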

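Conditions with is null and is not null are often unable to use the index, and the same goes for not-equal comparisons (!= or <>); or can also make the index ineffective.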
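
Whether these conditions really defeat the index depends on the MySQL version, column nullability, and data distribution, so it is worth checking each case with EXPLAIN. A sketch against the staffs table above:

```sql
-- Not-equal comparison on the leading index column.
EXPLAIN SELECT * FROM staffs WHERE name != 'July';

-- NULL check; name is declared NOT NULL in this table.
EXPLAIN SELECT * FROM staffs WHERE name IS NOT NULL;

-- OR across the leading column and a non-leading column.
EXPLAIN SELECT * FROM staffs WHERE name = 'July' OR pos = 'dev';
```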

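Omitting the quotes around a varchar value also makes the index ineffective, because MySQL then has to convert the column values implicitly to compare them with a number.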
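
The row with name '2000' inserted earlier makes this easy to try. Both queries below find that row, but only the first compares strings with strings:

```sql
-- Quoted literal: string comparison, the index on name can be used.
EXPLAIN SELECT * FROM staffs WHERE name = '2000';

-- Unquoted literal: each name value is converted to a number for the
-- comparison, so expect a full table scan (type = ALL).
EXPLAIN SELECT * FROM staffs WHERE name = 2000;
```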

Exercises

Assume index(a,b,c)

where clause	Is the index used?
where a = 3	Y, uses a
where a = 3 and b = 5	Y, uses a, b
where a = 3 and b = 5 and c = 4	Y, uses a, b, c
where b = 3, or where b = 3 and c = 4, or where c = 4	N
where a = 3 and c = 5	Y, uses a only; c is not used because b is skipped in the middle
where a = 3 and b > 4 and c = 5	Y, uses a, b
where a = 3 and b like 'kk%' and c = 4	Y, uses a, b, c
where a = 3 and b like '%kk' and c = 4	Y, uses a only
where a = 3 and b like '%kk%' and c = 4	Y, uses a only

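Sort Optimization

Analysis

Observe: run for at least a day and look at the slow SQL in production
Enable the slow query log and set a threshold, for example treating anything over 5 seconds as slow SQL, and capture those statements
explain + slow SQL analysis
show profile
Tune the database server parameters (done by operations or the DBA)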
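
The capture and profiling steps above might look like the following in a MySQL session. This is a sketch: the 5-second threshold comes from the text, the profiled query and the query number 1 are placeholders, changing global variables requires sufficient privileges, and SHOW PROFILE is deprecated in recent MySQL versions.

```sql
-- Check whether the slow query log is on and where it is written.
SHOW VARIABLES LIKE 'slow_query_log%';

-- Enable it and set the threshold to 5 seconds
-- (the new threshold applies to connections opened afterwards).
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 5;

-- Number of statements that have exceeded the threshold so far.
SHOW GLOBAL STATUS LIKE 'Slow_queries';

-- Profile a statement in the current session.
SET profiling = 1;
SELECT * FROM staffs WHERE name = 'July';
SHOW PROFILES;                            -- lists Query_ID values
SHOW PROFILE CPU, BLOCK IO FOR QUERY 1;   -- use a Query_ID from the list
```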

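Summary

Enable the slow query log and capture slow queries
explain + slow SQL analysis
show profile to inspect how a SQL statement is executed inside the MySQL server
Tune the database server parameters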

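Small Table Drives Large Table

The driving table: when a join is specified with filtering conditions, MySQL uses the table with fewer rows satisfying those conditions as the driving table; when no filtering conditions are given, the table with fewer rows to scan is the driving table. Roughly speaking, this is how the MySQL optimizer decides the execution order: the small table drives the large table.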
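
In EXPLAIN output for a join, the first row is normally the driving table. The sketch below joins the two sample tables on name, which is an arbitrary choice for illustration; with so few rows the optimizer's pick may not be meaningful.

```sql
-- Let the optimizer pick the driving table; it is the first row
-- of the EXPLAIN output.
EXPLAIN
SELECT s.name, u.email
FROM staffs s
INNER JOIN user u ON u.name = s.name
WHERE s.age = 23;

-- STRAIGHT_JOIN forces the left table (staffs) to be read first,
-- which is useful for comparing the two join orders.
EXPLAIN
SELECT s.name, u.email
FROM staffs s
STRAIGHT_JOIN user u ON u.name = s.name;
```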

order by Optimization

For the order by clause, prefer sorting by index and avoid sorting by filesort

Create the table and insert test data

create table tbla(
age int,
birth timestamp not null
);
insert into tbla(age,birth) values(22,now());
insert into tbla(age,birth) values(23,now());
insert into tbla(age,birth) values(24,now());

Create the index

create index idx_tbla_agebrith on tbla(age,birth);

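Analysis

MySQL supports two ways of sorting, filesort and index. The index method is more efficient: MySQL produces sorted output simply by scanning the index. The filesort method is less efficient.

order by uses the index method in two cases:

The order by clause uses the leftmost columns of the index
The where clause and the order by clause together satisfy the leftmost-prefix rule of the index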
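
These rules can be tried against tbla and idx_tbla_agebrith. The following sketch lists a few typical cases; the comments describe what is usually seen in the Extra column, which can differ by MySQL version.

```sql
-- order by the leftmost index column: sorted by the index.
EXPLAIN SELECT age, birth FROM tbla WHERE age > 20 ORDER BY age;

-- order by both columns in index order: sorted by the index.
EXPLAIN SELECT age, birth FROM tbla WHERE age > 20 ORDER BY age, birth;

-- order by skips the leading column while age is only a range:
-- typically shows "Using filesort".
EXPLAIN SELECT age, birth FROM tbla WHERE age > 20 ORDER BY birth;

-- Mixed sort directions on an ascending index:
-- typically shows "Using filesort".
EXPLAIN SELECT age, birth FROM tbla ORDER BY age ASC, birth DESC;
```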

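filesort has two algorithms, double-pass sort and single-pass sort:

Double-pass sort: used before MySQL 4.1. As the name suggests, it reads from disk twice to get the final data: it first reads the row pointers and the order by columns and sorts them, then scans the sorted list and, for each entry, reads the corresponding row from the table again for output
Single-pass sort: reads all the columns the query needs from disk, sorts them in the buffer by the order by columns, then scans the sorted list for output. It is faster because it avoids reading the data a second time and turns random I/O into sequential I/O, but it uses more space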

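Optimization strategy: adjust MySQL parameters

Increase the sort_buffer_size setting
Increase the max_length_for_sort_data setting

Speeding up order by
1. select * is a big mistake with order by; select only the columns you need

When the total size of the queried columns is smaller than max_length_for_sort_data and the sort columns are not of type text or blob, the improved algorithm, single-pass sort, is used
With either algorithm the data may exceed the capacity of sort_buffer; when it does, temporary files are created for a merge sort, which causes multiple rounds of I/O

2. Try increasing sort_buffer_size
3. Try increasing max_length_for_sort_data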
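
Both parameters can be inspected and changed per session before touching the server configuration. The values below are placeholders chosen for illustration, not recommendations; note also that max_length_for_sort_data is deprecated as of MySQL 8.0.20, where it no longer has any effect.

```sql
-- Current values.
SHOW VARIABLES LIKE 'sort_buffer_size';
SHOW VARIABLES LIKE 'max_length_for_sort_data';

-- Raise them for the current session only (example values, in bytes).
SET SESSION sort_buffer_size = 1048576;
SET SESSION max_length_for_sort_data = 4096;

-- A growing Sort_merge_passes counter suggests sorts are spilling
-- to temporary files.
SHOW STATUS LIKE 'Sort_merge_passes';
```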

夕麻
