Mysql一些SQL优化的方式

数据运维 2024-03-12 爱可生开源社区手机阅读

本人对mysql的理解不深，前一整子公司有mysql相关的分享，学到了一些东西，加上业务上有一些慢sql优化，所以做了以下的总结

准备工作

首先准备两张表，account表与user表，两表可根据account_id联合查询，两张表各插入10000条数据

CREATE TABLE `account` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `email` varchar(100) DEFAULT NULL,
  `mobile_area_code` varchar(16) DEFAULT NULL,
  `mobile` varchar(64) DEFAULT NULL,
  `password` varchar(64) DEFAULT NULL,
  `create_time` datetime NOT NULL,
  PRIMARY KEY (`id`),
  KEY `idx_password` (`password`),
  KEY `idx_key_part` (`mobile`,`mobile_area_code`),
  KEY `idx_mobile_area_code_create_time`(`mobile_area_code`, `create_time`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8mb4;


CREATE TABLE `user` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `account_id` int(11) DEFAULT NULL,
  `nick_name` varchar(100) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `idx_account_id` (`account_id`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8mb4;

delimiter $$
CREATE PROCEDURE proc_initData4()
BEGIN
    declare i int;
    set i = 1;
    set @dt='2023-06-01 00:00:00';
    WHILE i<=1000000 DO
        if i % 2 = 0 then
        INSERT INTO account (id, `email`, `mobile_area_code`, `mobile`, `password`, `create_time`) VALUES (i, uuid(), '86', uuid(), uuid(), date_add(@dt, interval i minute));
        else
        INSERT INTO account (id, `email`, `mobile_area_code`, `mobile`, `password`, `create_time`) VALUES (i, uuid(), '96', uuid(), uuid(), date_add(@dt, interval i minute));
        end if;
        SET i = i+1;
    END WHILE;
end $$
CALL proc_initData4();


delimiter $$
CREATE PROCEDURE proc_initData3()
BEGIN
    declare i int;
    set i = 1;
    WHILE i<=1000000 DO
       INSERT INTO user (id, `account_id`, `nick_name`) VALUES (i, i, uuid());
        SET i = i+1;
    END WHILE;
end $$
CALL proc_initData3();

explain

先看看explain中有什么关键字

mysql> explain select * from account;
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------+
| id | select_type | table   | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------+
|  1 | SIMPLE      | account | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 9986 |   100.00 | NULL  |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)

列名	作用
id	在一个查询中每个select关键字都对应一个唯一的id
select_type	select关键字对应的查询类型
table	表名
type	每个select查询中针对该表的访问方法
possible_keys	可能用到的索引
key	实际用到的索引
key_len	实际用到的索引长度
ref	当使用索引列等值查询时，与索引列等值匹配的对象信息
rows	预计需要读取的记录数
filtered	按搜索条件过滤后剩余条数的占比
Extra	一些额外信息

id

id值如果相同，可以认为是一组，从上往下顺序执行；在所有组中，id值越大，优先级越高，越先执行

如下图，id值相同，说明是同一组，从上往下顺序执行，出现在前面的是驱动表，出现在后面的是被驱动表，查询优化器准备让account表作为驱动表，user表作为被驱动表执行查询

mysql> explain select * from account a inner join user u on a.id = u.account_id;
+----+-------------+-------+------------+------+----------------+----------------+---------+-----------+--------+----------+-------+
| id | select_type | table | partitions | type | possible_keys  | key            | key_len | ref       | rows   | filtered | Extra |
+----+-------------+-------+------------+------+----------------+----------------+---------+-----------+--------+----------+-------+
|  1 | SIMPLE      | a     | NULL       | ALL  | PRIMARY        | NULL           | NULL    | NULL      | 990675 |   100.00 | NULL  |
|  1 | SIMPLE      | u     | NULL       | ref  | idx_account_id | idx_account_id | 5       | test.a.id |      1 |   100.00 | NULL  |
+----+-------------+-------+------------+------+----------------+----------------+---------+-----------+--------+----------+-------+
2 rows in set, 1 warning (0.07 sec)

如下图，语句包含子查询，id值不同，值越大，优先级越高，越先执行，所以是先执行子查询，再执行account表相关的查询

mysql> explain select * from account where id in (select account_id from user where id > 9800) or password > 'a4c9429f-d81d';
+----+-------------+---------+------------+-------+--------------------+---------+---------+------+------+----------+-------------+
| id | select_type | table   | partitions | type  | possible_keys      | key     | key_len | ref  | rows | filtered | Extra       |
+----+-------------+---------+------------+-------+--------------------+---------+---------+------+------+----------+-------------+
|  1 | PRIMARY     | account | NULL       | ALL   | idx_password       | NULL    | NULL    | NULL | 9986 |   100.00 | Using where |
|  2 | SUBQUERY    | user    | NULL       | range | PRIMARY,account_id | PRIMARY | 4       | NULL |  200 |   100.00 | Using where |
+----+-------------+---------+------------+-------+--------------------+---------+---------+------+------+----------+-------------+
2 rows in set, 1 warning (0.00 sec)

如下图，不是所有的子查询id值都会不同，mysql有一个查询优化器，可能会对语句进行优化重写，转换为连接查询

mysql> explain select * from account where id in (select account_id from user where id > 9800) and password > 'a4c9429f-d81d';
+----+--------------+-------------+------------+--------+------------------------+---------------------+---------+-----------------+--------+----------+-------------+
| id | select_type  | table       | partitions | type   | possible_keys          | key                 | key_len | ref             | rows   | filtered | Extra       |
+----+--------------+-------------+------------+--------+------------------------+---------------------+---------+-----------------+--------+----------+-------------+
|  1 | SIMPLE       | account     | NULL       | ALL    | PRIMARY,idx_password   | NULL                | NULL    | NULL            | 990675 |    50.00 | Using where |
|  1 | SIMPLE       | <subquery2> | NULL       | eq_ref | <auto_distinct_key>    | <auto_distinct_key> | 5       | test.account.id |      1 |   100.00 | NULL        |
|  2 | MATERIALIZED | user        | NULL       | range  | PRIMARY,idx_account_id | PRIMARY             | 4       | NULL            | 498167 |   100.00 | Using where |
+----+--------------+-------------+------------+--------+------------------------+---------------------+---------+-----------------+--------+----------+-------------+
3 rows in set, 1 warning (0.05 sec)

如下图，union 会对多个查询的结果合并起来并进行去重，所以会新建一个的临时表，id为null表明这个临时表是为了合并查询结果集创建的

mysql> explain select * from account where id > 9900 union select * from account where id > 9901;
+----+--------------+------------+------------+-------+---------------+---------+---------+------+--------+----------+-----------------+
| id | select_type  | table      | partitions | type  | possible_keys | key     | key_len | ref  | rows   | filtered | Extra           |
+----+--------------+------------+------------+-------+---------------+---------+---------+------+--------+----------+-----------------+
|  1 | PRIMARY      | account    | NULL       | range | PRIMARY       | PRIMARY | 4       | NULL | 495337 |   100.00 | Using where     |
|  2 | UNION        | account    | NULL       | range | PRIMARY       | PRIMARY | 4       | NULL | 495337 |   100.00 | Using where     |
| NULL | UNION RESULT | <union1,2> | NULL       | ALL   | NULL          | NULL    | NULL    | NULL |   NULL |     NULL | Using temporary |
+----+--------------+------------+------------+-------+---------------+---------+---------+------+--------+----------+-----------------+
3 rows in set, 1 warning (0.04 sec)

如下图，union all不会进行去重，自然也没有的临时表

mysql> explain select * from account where id > 9900 union all select * from account where id > 9901;
+----+-------------+---------+------------+-------+---------------+---------+---------+------+--------+----------+-------------+
| id | select_type | table   | partitions | type  | possible_keys | key     | key_len | ref  | rows   | filtered | Extra       |
+----+-------------+---------+------------+-------+---------------+---------+---------+------+--------+----------+-------------+
|  1 | PRIMARY     | account | NULL       | range | PRIMARY       | PRIMARY | 4       | NULL | 495337 |   100.00 | Using where |
|  2 | UNION       | account | NULL       | range | PRIMARY       | PRIMARY | 4       | NULL | 495337 |   100.00 | Using where |
+----+-------------+---------+------------+-------+---------------+---------+---------+------+--------+----------+-------------+
2 rows in set, 1 warning (0.00 sec)

select _type

下表中，上面都有例子，就不赘述了

select_type	意思
SIMPLE	不包含union或者子查询
PRIMARY	对于包含union、union all或者子查询的大查询来说，他是由几个小查询组成的，最左边的查询的select_type就是PRIMARY
UNION	对于包含union、union all的大查询，最左边的查询的select_type是PRIMARY，其余的小查询就是union
UNION RESULT	使用union查询后会进行去重
SUBQUERY	包含子查询，并且该子查询不会被查询优化器转化成连接查询

下图中，先看id值，id值为2的语句先执行，id值为1的有两个，说明是连接查询，注意：<subquery2>最后的2表示这个表是由id值为2的语句衍生出来的表

mysql> explain select * from account where id in (select account_id from user where id > 9900);
+----+--------------+-------------+------------+--------+------------------------+---------------------+---------+-----------------+--------+----------+-------------+
| id | select_type  | table       | partitions | type   | possible_keys          | key                 | key_len | ref             | rows   | filtered | Extra       |
+----+--------------+-------------+------------+--------+------------------------+---------------------+---------+-----------------+--------+----------+-------------+
|  1 | SIMPLE       | account     | NULL       | ALL    | PRIMARY                | NULL                | NULL    | NULL            | 990675 |   100.00 | NULL        |
|  1 | SIMPLE       | <subquery2> | NULL       | eq_ref | <auto_distinct_key>    | <auto_distinct_key> | 5       | test.account.id |      1 |   100.00 | NULL        |
|  2 | MATERIALIZED | user        | NULL       | range  | PRIMARY,idx_account_id | PRIMARY             | 4       | NULL            | 498167 |   100.00 | Using where |
+----+--------------+-------------+------------+--------+------------------------+---------------------+---------+-----------------+--------+----------+-------------+
3 rows in set, 1 warning (0.00 sec)

type

const

如下图，const表示通过索引一次就找到了，当主键或者唯一二级索引与常量进行等值匹配时，type为const

mysql> explain select * from account where id = 16;
+----+-------------+---------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
| id | select_type | table   | partitions | type  | possible_keys | key     | key_len | ref   | rows | filtered | Extra |
+----+-------------+---------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | account | NULL       | const | PRIMARY       | PRIMARY | 4       | const |    1 |   100.00 | NULL  |
+----+-------------+---------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
1 row in set, 1 warning (0.02 sec)

eq_ref

如下图，eq_ref常见于主键或者唯一二级索引扫描，下图被驱动表扫描主键

mysql> explain select * from account a inner join user u on a.id = u.account_id;
+----+-------------+-------+------------+--------+---------------+---------+---------+-------------------+------+----------+-------------+
| id | select_type | table | partitions | type   | possible_keys | key     | key_len | ref               | rows | filtered | Extra       |
+----+-------------+-------+------------+--------+---------------+---------+---------+-------------------+------+----------+-------------+
|  1 | SIMPLE      | u     | NULL       | ALL    | account_id    | NULL    | NULL    | NULL              | 9851 |   100.00 | Using where |
|  1 | SIMPLE      | a     | NULL       | eq_ref | PRIMARY       | PRIMARY | 4       | test.u.account_id |    1 |   100.00 | NULL        |
+----+-------------+-------+------------+--------+---------------+---------+---------+-------------------+------+----------+-------------+
2 rows in set, 1 warning (0.01 sec)

ref

如下图，ref常见于普通二级索引与常量进行等值匹配

mysql> explain select * from account where password = 'a4c799db-d81d-11eb-9e9b-54ce0dd3fc76';
+----+-------------+---------+------------+------+---------------+--------------+---------+-------+------+----------+-------+
| id | select_type | table   | partitions | type | possible_keys | key          | key_len | ref   | rows | filtered | Extra |
+----+-------------+---------+------------+------+---------------+--------------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | account | NULL       | ref  | idx_password  | idx_password | 403     | const |    1 |   100.00 | NULL  |
+----+-------------+---------+------------+------+---------------+--------------+---------+-------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)

ref_or_null

如下图，ref_or_null常见于在ref的基础上，新增可为null的查询

mysql> explain select * from account where password = 'a4c799db-d81d-11eb-9e9b-54ce0dd3fc76' or password is null;
+----+-------------+---------+------------+-------------+---------------+--------------+---------+-------+------+----------+-----------------------+
| id | select_type | table   | partitions | type        | possible_keys | key          | key_len | ref   | rows | filtered | Extra                 |
+----+-------------+---------+------------+-------------+---------------+--------------+---------+-------+------+----------+-----------------------+
|  1 | SIMPLE      | account | NULL       | ref_or_null | idx_password  | idx_password | 403     | const |    2 |   100.00 | Using index condition |
+----+-------------+---------+------------+-------------+---------------+--------------+---------+-------+------+----------+-----------------------+
1 row in set, 1 warning (0.01 sec)

range

范围查询，较为简单

index和All

如下图，index常见于索引树全表扫描，与All的聚簇索引全表扫描不同，index所扫描的文件较聚簇索引文件小

mysql> explain select mobile from account where mobile_area_code = '86';
+----+-------------+---------+------------+-------+---------------+--------------+---------+------+------+----------+--------------------------+
| id | select_type | table   | partitions | type  | possible_keys | key          | key_len | ref  | rows | filtered | Extra                    |
+----+-------------+---------+------------+-------+---------------+--------------+---------+------+------+----------+--------------------------+
|  1 | SIMPLE      | account | NULL       | index | NULL          | idx_key_part | 470     | NULL | 9986 |    10.00 | Using where; Using index |
+----+-------------+---------+------------+-------+---------------+--------------+---------+------+------+----------+--------------------------+
1 row in set, 1 warning (0.00 sec)

possible_keys和key

如下图，可能用到的索引为idx_password,idx_key_part，实际用到的索引为idx_password，用什么索引是mysql的查询优化器经过成本计算后决定的，所以数据库中没有用到的索引需要删除，不然查询优化器计算成本需要更多的时间

mysql> explain select * from account where mobile = 'a4c4a5fe-d81d-11eb-9e9b-54ce0dd3fc76' and password = 'a4c4a5ff-d81d-11eb-9e9b-54ce0dd3fc76';
+----+-------------+---------+------------+------+---------------------------+--------------+---------+-------+------+----------+-------------+
| id | select_type | table   | partitions | type | possible_keys             | key          | key_len | ref   | rows | filtered | Extra       |
+----+-------------+---------+------------+------+---------------------------+--------------+---------+-------+------+----------+-------------+
|  1 | SIMPLE      | account | NULL       | ref  | idx_password,idx_key_part | idx_password | 403     | const |    1 |     5.00 | Using where |
+----+-------------+---------+------------+------+---------------------------+--------------+---------+-------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

ref

如下图，在访问方法为eq_ref时，与索引列做等值匹配的是user表的account_id

mysql> explain select * from account a inner join user u on a.id = u.account_id;
+----+-------------+-------+------------+--------+---------------+---------+---------+-------------------+------+----------+-------------+
| id | select_type | table | partitions | type   | possible_keys | key     | key_len | ref               | rows | filtered | Extra       |
+----+-------------+-------+------------+--------+---------------+---------+---------+-------------------+------+----------+-------------+
|  1 | SIMPLE      | u     | NULL       | ALL    | account_id    | NULL    | NULL    | NULL              | 9851 |   100.00 | Using where |
|  1 | SIMPLE      | a     | NULL       | eq_ref | PRIMARY       | PRIMARY | 4       | test.u.account_id |    1 |   100.00 | NULL        |
+----+-------------+-------+------------+--------+---------------+---------+---------+-------------------+------+----------+-------------+
2 rows in set, 1 warning (0.00 sec)

如下图，等值匹配的也可以是一个函数

mysql> explain select * from account a inner join user u on a.id = upper(u.account_id);
+----+-------------+-------+------------+--------+---------------+---------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type   | possible_keys | key     | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+------------+--------+---------------+---------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | u     | NULL       | ALL    | NULL          | NULL    | NULL    | NULL | 9851 |   100.00 | NULL        |
|  1 | SIMPLE      | a     | NULL       | eq_ref | PRIMARY       | PRIMARY | 4       | func |    1 |   100.00 | Using where |
+----+-------------+-------+------------+--------+---------------+---------+---------+------+------+----------+-------------+
2 rows in set, 1 warning (0.00 sec)

rows和filtered

如下图，rows表示此次查询预计需要扫描的行数，filter表示预计需要扫描的行数中有多少行符合我们的条件，下图中9986条都符合我们的条件，所以filtered值应该越高越好

平时联表查询的时候，我们应该多关注filtered值，filtered越高，说明驱动表扫描出来的行数大部分都是有用的，回表的次数就会变少，语句才会更高效

mysql> explain select * from account;
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------+
| id | select_type | table   | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------+
|  1 | SIMPLE      | account | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 9986 |   100.00 | NULL  |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)

Extra

Using where

如下图，使用全表扫描来执行语句，where条件中有针对某个字段的查询

mysql> explain select * from user where nick_name = 'a';
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | user  | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 9851 |    10.00 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

如下图，使用某个索引来执行语句，where条件中包含了该索引列不存在的字段

mysql> explain select * from account where mobile = 'a4c4a5fe-d81d-11eb-9e9b-54ce0dd3fc76' and password = 'a4c4a5ff-d81d-11eb-9e9b-54ce0dd3fc76';
+----+-------------+---------+------------+------+---------------------------+--------------+---------+-------+------+----------+-------------+
| id | select_type | table   | partitions | type | possible_keys             | key          | key_len | ref   | rows | filtered | Extra       |
+----+-------------+---------+------------+------+---------------------------+--------------+---------+-------+------+----------+-------------+
|  1 | SIMPLE      | account | NULL       | ref  | idx_password,idx_key_part | idx_password | 403     | const |    1 |     5.00 | Using where |
+----+-------------+---------+------------+------+---------------------------+--------------+---------+-------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

Using filesort

如下图，filesort表示在没有使用索引排序的情况下，在内存中（记录较少）或者在磁盘中（记录较多）进行排序

mysql> explain select * from user order by nick_name desc limit 10;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra          |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------+
|  1 | SIMPLE      | user  | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 9851 |   100.00 | Using filesort |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------+
1 row in set, 1 warning (0.00 sec)

Using temporary

如下图，mysql在无法使用索引的情况下，需要使用内部的临时表来执行语句，值得注意的是，以下语句extra中还有Using filesort，这是因为mysql会在group by后面加上order by

mysql> explain select nick_name, count(*) from user group by nick_name;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+---------------------------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                           |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+---------------------------------+
|  1 | SIMPLE      | user  | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 9851 |   100.00 | Using temporary; Using filesort |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+---------------------------------+
1 row in set, 1 warning (0.00 sec)

Using index

如下图，语句中查询的字段某个索引可以满足，不需要回表

mysql> explain select mobile from account where mobile = 'a4c4a5fe-d81d-11eb-9e9b-54ce0dd3fc76';
+----+-------------+---------+------------+------+---------------+--------------+---------+-------+------+----------+-------------+
| id | select_type | table   | partitions | type | possible_keys | key          | key_len | ref   | rows | filtered | Extra       |
+----+-------------+---------+------------+------+---------------+--------------+---------+-------+------+----------+-------------+
|  1 | SIMPLE      | account | NULL       | ref  | idx_key_part  | idx_key_part | 403     | const |    1 |   100.00 | Using index |
+----+-------------+---------+------------+------+---------------+--------------+---------+-------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

Using index condition

没有找到合适的例子

Using join buffer

如下图，没有添加索引联表查询的情况下，会使用join buffer来提高查询速度

mysql> explain select * from account a inner join user u on u.nick_name = a.password;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                                              |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+
|  1 | SIMPLE      | a     | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 9047 |   100.00 | NULL                                               |
|  1 | SIMPLE      | u     | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 9851 |    10.00 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+
2 rows in set, 1 warning (0.00 sec)

常见的sql优化

加索引

大部分慢sql是没有加索引或索引没有加对导致，网上关于怎么加索引的文章非常多，我就不赘述了，只提以下两点

很多文章说区分度不大的字段不宜加索引，其实有点一棍子打死的嫌疑，比如上面的account表，mobile_area_code只有86、96两个值，假如百分之90的数据是96，百分之10的数据是86，而有一个业务场景恰恰是要查86相关的数据，这个时候索引的作用就很大了，所以加索引有一些规则可以参考，但还是要看业务具体实现

多用组合索引，组合索引可以提高索引复用率，覆盖索引可以减少回表

深度分页

业务上比较常见的问题，列表翻页到比较大的页面时接口会越来越慢，sql常会变成这样

select * from account order by password limit 5000, 10;

这时可用下面这条语句优化

select * from account where id >= (select id from account order by password limit 5000, 1) limit 10;

explain一下，可以看到第一条语句全表扫描了

mysql> explain select * from account order by password limit 5000, 10;
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+----------------+
| id | select_type | table   | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra          |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+----------------+
|  1 | SIMPLE      | account | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 9975 |   100.00 | Using filesort |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+----------------+

那为什么使用第二条语句就能加快查询呢？

第一条语句排序所有记录，其实我们不需要这么多记录，第二条语句只拿了我们需要的记录，时间自然就少了

分页查询还有一个容易踩的坑，用于排序的字段必须是唯一的，使用主键或者唯一的业务单号，如果排序的字段不唯一（比如创建时间），分页的过程会出现少处理或者多处理数据，严重的话会陷在死循环中出不来

强制索引

我们应该经常会遇到sql语句走错索引的情况，比如下面这条sql，本来是走idx_account_id这个索引的，最终mysql决定走主键索引

当然，这只是一个例子，并不是说下面这条sql走idx_account_id索引更快

mysql> explain select * from user where account_id between 1000 and 10002000 order by id desc limit 10;
+----+-------------+-------+------------+-------+----------------+---------+---------+------+------+----------+----------------------------------+
| id | select_type | table | partitions | type  | possible_keys  | key     | key_len | ref  | rows | filtered | Extra                            |
+----+-------------+-------+------------+-------+----------------+---------+---------+------+------+----------+----------------------------------+
|  1 | SIMPLE      | user  | NULL       | index | idx_account_id | PRIMARY | 4       | NULL |   20 |    50.00 | Using where; Backward index scan |
+----+-------------+-------+------------+-------+----------------+---------+---------+------+------+----------+----------------------------------+
1 row in set, 1 warning (0.01 sec)

如果是在测试环境，可以使用optimizer trace 功能查看索引变更的原因，首先需要开启optimizer trace

SET optimizer_trace="enabled=on";
select * from information_schema.OPTIMIZER_TRACE;

比如上面的sql就是因为排序的原因导致索引变更

{
  "reconsidering_access_paths_for_index_ordering": {
    "clause": "ORDER BY",
    "steps": [
    ],
    "index_order_summary": {
      "table": "`user`",
      "index_provides_order": true,
      "order_direction": "desc",
      "index": "PRIMARY",
      "plan_changed": true,
      "access_type": "index"
    }
  }
}

线上一般不会开启optimizer trace 功能，测试环境的数据量又不够，所以这种分析方式比较局限

在平时的业务开发中，本人偶尔会遇到这种情况，A索引明明更快，结果走了B索引，如果是明确知道B索引更快，可以使用下面的语句强制走B索引

select * from user force index(idx_account_id) where account_id between 1000 and 10002000 order by id desc limit 10;

select /*+ INDEX(user idx_account_id) */ * from user where account_id between 1000 and 10002000 order by id desc limit 10;

强制索引的弊端

硬编码，不够优雅，代码需跟随索引名变更

滞后性，慢sql治理一般是在功能上线之后，强制索引需要重新测试、打包上线，如果是非常严重的慢sql，这种方式不够敏捷

巧用union

业务开发中常有获取不同状态位数据，然后根据创建时间排序的需求，形式大概如下

select * from account where mobile_area_code in ('86', '96') order by mobile_area_code, create_time desc limit 10;

explain一下，发现全表扫描了，没有用上索引

mysql> explain select * from account where mobile_area_code in ('86', '96') order by mobile_area_code, create_time desc limit 10;
+----+-------------+---------+------------+------+----------------------------------+------+---------+------+--------+----------+-----------------------------+
| id | select_type | table   | partitions | type | possible_keys                    | key  | key_len | ref  | rows   | filtered | Extra                       |
+----+-------------+---------+------------+------+----------------------------------+------+---------+------+--------+----------+-----------------------------+
|  1 | SIMPLE      | account | NULL       | ALL  | idx_mobile_area_code_create_time | NULL | NULL    | NULL | 991309 |   100.00 | Using where; Using filesort |
+----+-------------+---------+------------+------+----------------------------------+------+---------+------+--------+----------+-----------------------------+
1 row in set, 1 warning (0.02 sec)

为了用上idx_mobile_area_code_create_time组合索引，需先固定mobile_area_code，对二级索引进行反向扫描，得出以下语句，用上了索引，执行速度也快了许多

select *
from (
  (select * from account where mobile_area_code = '86' order by mobile_area_code, create_time desc limit 10)
     union all
  (select * from account where mobile_area_code = '96' order by mobile_area_code, create_time desc limit 10)
) t
order by mobile_area_code, create_time desc
limit 10

巧用虚拟列

还是上面的account表，如果想要统计7月1日10点到10日10点，创建时间在8点和20点的国内账号，按小时汇总

大部分的同学会这样写

select DATE_FORMAT(create_time, '%H') hour, count(1) cnt from account
 where mobile_area_code= '86'
 and create_time >= '2023-07-01 10:00:00'
 and create_time <= '2023-07-10 10:00:00'
GROUP BY hour HAVING hour  in ('08','20')

explain，用上了临时表

mysql> explain select DATE_FORMAT(create_time, '%H') hour ,count(1) cnt_order  from account  where mobile_area_code= '86'  and create_time >= '2023-07-01 10:00:00'  and create_time <= '2023-07-10 10:00:00' GROUP
BY hour HAVING hour  in ('08','20');
+----+-------------+---------+------------+-------+----------------------------------+----------------------------------+---------+------+------+----------+-------------------------------------------+
| id | select_type | table   | partitions | type  | possible_keys                    | key                              | key_len | ref  | rows | filtered | Extra                                     |
+----+-------------+---------+------------+-------+----------------------------------+----------------------------------+---------+------+------+----------+-------------------------------------------+
|  1 | SIMPLE      | account | NULL       | range | idx_mobile_area_code_create_time | idx_mobile_area_code_create_time | 72      | NULL | 6481 |   100.00 | Using where; Using index; Using temporary |
+----+-------------+---------+------------+-------+----------------------------------+----------------------------------+---------+------+------+----------+-------------------------------------------+
1 row in set, 1 warning (0.00 sec)

加个虚拟列和索引

alter table account add column hour_ char(2)  GENERATED ALWAYS AS (DATE_FORMAT(create_time, '%H')) VIRTUAL;
alter table account add key idx_area_code_hour_create_time_4(mobile_area_code, hour_, create_time);

改写sql，用上虚拟列索引

select hour_ hour,count(1) cnt from account
where hour_ in ('08','20')
 and mobile_area_code = '86'
 and create_time >= '2023-07-01 10:00:00'
 and create_time <= '2023-07-10 10:00:00'
group by  hour_;

explain一下，临时表没了，性能提升了一点

mysql> explain select hour_ hour,count(1) cnt from account where hour_ in ('08','20')  and mobile_area_code = '86'  and create_time >= '2023-07-01 10:00:00'  and create_time <= '2023-07-10 10:00:00' group by  hour_;
+----+-------------+---------+------------+-------+-------------------------------------------------------------------+----------------------------------+---------+------+------+----------+--------------------------+
| id | select_type | table   | partitions | type  | possible_keys                                                     | key                              | key_len | ref  | rows | filtered | Extra                    |
+----+-------------+---------+------------+-------+-------------------------------------------------------------------+----------------------------------+---------+------+------+----------+--------------------------+
|  1 | SIMPLE      | account | NULL       | range | idx_mobile_area_code_create_time,idx_area_code_hour_create_time_4 | idx_area_code_hour_create_time_4 | 81      | NULL |  540 |   100.00 | Using where; Using index |
+----+-------------+---------+------------+-------+-------------------------------------------------------------------+----------------------------------+---------+------+------+----------+--------------------------+
1 row in set, 1 warning (0.00 sec)

模糊转精确

业务开发中模糊查询是较难处理的，但业务上确确实实有这样的需求，很难砍，对数据库的压力又很大，本人所在的业务一般采用下面两种方式

判断字段长度转精确，字段是支持模糊查询的，但用户输入的是精确的字段，比如订单号，很多用户是复制过来的，如果符合长度，直接精确查询，不符合长度，再进行模糊查询，可以减少一定量的慢sql语句

业务上要求用户输入前面5位或后5位字符串，后端再经过中间表转换为精确字符，再进行查询

虽然但是，模糊查询的处理跟业务强相关，本人业务的方式不一定适用所有

排序优化

根据极客时间Mysql45讲第16讲的例子，order by工作方式分为两种，全字段排序、rowid排序

CREATE TABLE `t` (
 `id` int(11) NOT NULL,
 `city` varchar(16) NOT NULL,
 `name` varchar(16) NOT NULL,
 `age` int(11) NOT NULL,
 `addr` varchar(128) DEFAULT NULL,
 PRIMARY KEY (`id`),
 KEY `city` (`city`)
) ENGINE=InnoDB;

比如下面的查询语句

select city,name,age from t where city='杭州' order by name limit 1000;

根据sort_buffer的大小，如果sort_buffer足够大，会把需要返回的字段全部放进去，根据排序字段排完序之后返回

如果sort_buffer不够大，会把主键id和排序字段放进去，排完序之后再回表，查询到所有业务字段之后再返回

当然，sort_buffer不够大的情况还会使用到文件排序

按照这个原理，在业务上，经常有根据创建时间，更新时间排序的需求，如果根据创建时间来排序，sort_buffer要放多一个字段，有一个比较鸡贼的办法，创建时间的大小跟id是相差无几的，我们使用id来替换创建时间，当然，这个还是有一定的误差的，不同的业务创建时间的取法是不一样的，需要根据业务来评估

代码分治

将一条复杂sql拆分为多条sql，会提高代码复杂度，消耗测试资源，最好在写需求的时候就识别出来，说得容易，人都有惰性，大家懂的

总结

单表查询的情况下，慢sql一般都是没加索引或者没用上索引，比如隐式转换、函数计算、模糊查询，使用子查询

多表查询的情况下，慢sql优化主要思想就是分治，通过代码或者sql的方式尽量缩小范围、分批处理数据