《MySQL高级》索引分析和优化笔记——order by 排序分析

2023年 7月 11日数据运维贤蛋大眼萌

学习《MySQL高级》高阳老师讲解索引课程的笔记，本篇侧重对 order by 排序分析

建表

# 建表
CREATE TABLE tblA(
#id int primary key not null autp_increment,
age int,
birth timestamp not null
);

insert into tblA(age,birth) values(22,now());
insert into tblA(age,birth) values(23,now());
insert into tblA(age,birth) values(24,now());

# 建立复合索引
CREATE INDEX idx_A_ageBirth on tblA(age,birth);

select * from tblA;

Order By 优化（索引分析）

由于本表中只有两个字段 age, birth，复合索引都覆盖了，所以 select * 就相当于 select age, birth, 查询直接走索引，不需要回表。此处仅关注排序（order by）是否会出现文件排序（filesort）。

1.1 explain select * from tblA where age > 20 order by age;

+----+-------------+-------+------------+-------+----------------+----------------+---------+------+------+----------+--------------------------+
| id | select_type | table | partitions | type  | possible_keys  | key            | key_len | ref  | rows | filtered | Extra                    |
+----+-------------+-------+------------+-------+----------------+----------------+---------+------+------+----------+--------------------------+
|  1 | SIMPLE      | tblA  | NULL       | index | idx_A_ageBirth | idx_A_ageBirth | 9       | NULL |    3 |   100.00 | Using where; Using index |
+----+-------------+-------+------------+-------+----------------+----------------+---------+------+------+----------+--------------------------+

排序用到了 age 字段的索引，不会出现filesort。

1.2 explain select * from tblA where age > 20 order by age, birth;

+----+-------------+-------+------------+-------+----------------+----------------+---------+------+------+----------+--------------------------+
| id | select_type | table | partitions | type  | possible_keys  | key            | key_len | ref  | rows | filtered | Extra                    |
+----+-------------+-------+------------+-------+----------------+----------------+---------+------+------+----------+--------------------------+
|  1 | SIMPLE      | tblA  | NULL       | index | idx_A_ageBirth | idx_A_ageBirth | 9       | NULL |    3 |   100.00 | Using where; Using index |
+----+-------------+-------+------------+-------+----------------+----------------+---------+------+------+----------+--------------------------+

排序，age, birth 符合复合索引顺序，所以排序用到了age, birth 两个字段的索引，不会出现filesort。

1.3 explain select from tblA where age > 20 order by birth;

+----+-------------+-------+------------+-------+----------------+----------------+---------+------+------+----------+------------------------------------------+
| id | select_type | table | partitions | type  | possible_keys  | key            | key_len | ref  | rows | filtered | Extra                                    |
+----+-------------+-------+------------+-------+----------------+----------------+---------+------+------+----------+------------------------------------------+
|  1 | SIMPLE      | tblA  | NULL       | index | idx_A_ageBirth | idx_A_ageBirth | 9       | NULL |    3 |   100.00 | Using where; Using index; Using filesort |
+----+-------------+-------+------------+-------+----------------+----------------+---------+------+------+----------+------------------------------------------+

查询时用到了 age 字段的索引；排序时，由于 age 是范围，范围后面全失效，所以不能走 age, birth 索引进行排序，那么就会出现filesort。

若 age 是个等值查询，排序时就不会出现filesort。见下面语句：

mysql> explain select * from tblA where age = 22 order by birth;
+----+-------------+-------+------------+------+----------------+----------------+---------+-------+------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys  | key            | key_len | ref   | rows | filtered | Extra                    |
+----+-------------+-------+------------+------+----------------+----------------+---------+-------+------+----------+--------------------------+
|  1 | SIMPLE      | tblA  | NULL       | ref  | idx_A_ageBirth | idx_A_ageBirth | 5       | const |    1 |   100.00 | Using where; Using index |
+----+-------------+-------+------------+------+----------------+----------------+---------+-------+------+----------+--------------------------+

1.4 explain select * from tblA where age > 20 order by birth, age;

+----+-------------+-------+------------+-------+----------------+----------------+---------+------+------+----------+------------------------------------------+
| id | select_type | table | partitions | type  | possible_keys  | key            | key_len | ref  | rows | filtered | Extra                                    |
+----+-------------+-------+------------+-------+----------------+----------------+---------+------+------+----------+------------------------------------------+
|  1 | SIMPLE      | tblA  | NULL       | index | idx_A_ageBirth | idx_A_ageBirth | 9       | NULL |    3 |   100.00 | Using where; Using index; Using filesort |
+----+-------------+-------+------------+-------+----------------+----------------+---------+------+------+----------+------------------------------------------+

排序时，由于 birth, age 不符合复合索引的顺序，所以会出现filesort。

2.1 explain select * from tblA order by birth;

+----+-------------+-------+------------+-------+---------------+----------------+---------+------+------+----------+-----------------------------+
| id | select_type | table | partitions | type  | possible_keys | key            | key_len | ref  | rows | filtered | Extra                       |
+----+-------------+-------+------------+-------+---------------+----------------+---------+------+------+----------+-----------------------------+
|  1 | SIMPLE      | tblA  | NULL       | index | NULL          | idx_A_ageBirth | 9       | NULL |    3 |   100.00 | Using index; Using filesort |
+----+-------------+-------+------------+-------+---------------+----------------+---------+------+------+----------+-----------------------------+

排序时，仅有birth，会出现filesort。

2.2 explain select * from tblA where birth > ‘2016-01-28 00:00:00’ order by birth;

+----+-------------+-------+------------+-------+---------------+----------------+---------+------+------+----------+------------------------------------------+
| id | select_type | table | partitions | type  | possible_keys | key            | key_len | ref  | rows | filtered | Extra                                    |
+----+-------------+-------+------------+-------+---------------+----------------+---------+------+------+----------+------------------------------------------+
|  1 | SIMPLE      | tblA  | NULL       | index | NULL          | idx_A_ageBirth | 9       | NULL |    3 |    33.33 | Using where; Using index; Using filesort |
+----+-------------+-------+------------+-------+---------------+----------------+---------+------+------+----------+------------------------------------------+

排序时，where 和 order by 的字段顺序不符合复合索引顺序，会出现文件排序。

2.3 explain select from tblA where birth > ‘2016-01-28 00:00:00’ order by age;

+----+-------------+-------+------------+-------+---------------+----------------+---------+------+------+----------+--------------------------+
| id | select_type | table | partitions | type  | possible_keys | key            | key_len | ref  | rows | filtered | Extra                    |
+----+-------------+-------+------------+-------+---------------+----------------+---------+------+------+----------+--------------------------+
|  1 | SIMPLE      | tblA  | NULL       | index | NULL          | idx_A_ageBirth | 9       | NULL |    3 |    33.33 | Using where; Using index |
+----+-------------+-------+------------+-------+---------------+----------------+---------+------+------+----------+--------------------------+

排序时，where 和 order by 的字段顺序经过优化器优化后符合符合索引的顺序，不会出现文件排序。

2.4 explain select from tblA order by age asc, birth desc;

+----+-------------+-------+------------+-------+---------------+----------------+---------+------+------+----------+-----------------------------+
| id | select_type | table | partitions | type  | possible_keys | key            | key_len | ref  | rows | filtered | Extra                       |
+----+-------------+-------+------------+-------+---------------+----------------+---------+------+------+----------+-----------------------------+
|  1 | SIMPLE      | tblA  | NULL       | index | NULL          | idx_A_ageBirth | 9       | NULL |    3 |   100.00 | Using index; Using filesort |
+----+-------------+-------+------------+-------+---------------+----------------+---------+------+------+----------+-----------------------------+

排序时，order by 的字段 age、birth 虽然符合符合索引顺序，但是 age 按升序、birth 按降序，这样无法利用索引排序，会出现文件排序。若 age、birth 同升或同降，则不会出现文件排序。

Order By 无法使用索引时的优化

如果 Order By 用不上索引的话，就会出现 filesort。filesort 有两种算法：双路排序和单路排序。

双路排序

MySQL4.1之前是使用双路排序，字面意思就是两次扫描磁盘，最终得到数据，

读取行指针和 orderby 列，对他们进行排序，然后扫描已经排序好的列表，按照列表中的值重新从列表中读取对应的数据输出。从磁盘取排序字段，在buffer进行排序，再从磁盘取其他字段。

取一批数据，要对磁盘进行了两次扫描，众所周知，I/O是很耗时的，所以在mysql4.1之后，出现了第二种改进的算法，就是单路排序。

单路排序

从磁盘读取查询需要的所有列，按照order by列在buffer对它们进行排序，然后扫描排序后的列表进行输出，它的效率更快些，避免了第二次读取数据。并且把随机IO变成了顺序IO，但是它会使用更多的空间，因为它把每一行都保存在内存中了。

一般来说单路排序更好，但也会出现问题。这是为什么？

在sort_buffer中， 单路排序比多路排序要多占用很多空间，因为单路排序是把所有字段都取出，所以有可能取出的数据的总大小超出了sort_buffer的容量，导致每次只能取sort_ buffer容量大小的数据，进行排序(创建tmp文件，多路合并)，排完再取sort_buffer容量大小，再排…从而多次I/O。

本来想省一次I/O操作，反而导致了大量的I/O操作，反而得不偿失。

如何优化？通过增加 sort_buffer_size 容量和max_length_for_sort_data