Mysql一些SQL优化的方式

2024年 3月 12日 64.5k 0

本人对mysql的理解不深,前一整子公司有mysql相关的分享,学到了一些东西,加上业务上有一些慢sql优化,所以做了以下的总结

准备工作

首先准备两张表,account表与user表,两表可根据account_id联合查询,两张表各插入10000条数据

CREATE TABLE `account` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `email` varchar(100) DEFAULT NULL,
  `mobile_area_code` varchar(16) DEFAULT NULL,
  `mobile` varchar(64) DEFAULT NULL,
  `password` varchar(64) DEFAULT NULL,
  `create_time` datetime NOT NULL,
  PRIMARY KEY (`id`),
  KEY `idx_password` (`password`),
  KEY `idx_key_part` (`mobile`,`mobile_area_code`),
  KEY `idx_mobile_area_code_create_time`(`mobile_area_code`, `create_time`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8mb4;
​
​
CREATE TABLE `user` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `account_id` int(11) DEFAULT NULL,
  `nick_name` varchar(100) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `idx_account_id` (`account_id`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8mb4;
delimiter $$
CREATE PROCEDURE proc_initData4()
BEGIN
    declare i int;
    set i = 1;
    set @dt='2023-06-01 00:00:00';
    WHILE i<=1000000 DO
        if i % 2 = 0 then
        INSERT INTO account (id, `email`, `mobile_area_code`, `mobile`, `password`, `create_time`) VALUES (i, uuid(), '86', uuid(), uuid(), date_add(@dt, interval i minute));
        else
        INSERT INTO account (id, `email`, `mobile_area_code`, `mobile`, `password`, `create_time`) VALUES (i, uuid(), '96', uuid(), uuid(), date_add(@dt, interval i minute));
        end if;
        SET i = i+1;
    END WHILE;
end $$
CALL proc_initData4();
​
​
delimiter $$
CREATE PROCEDURE proc_initData3()
BEGIN
    declare i int;
    set i = 1;
    WHILE i<=1000000 DO
       INSERT INTO user (id, `account_id`, `nick_name`) VALUES (i, i, uuid());
        SET i = i+1;
    END WHILE;
end $$
CALL proc_initData3();

explain

先看看explain中有什么关键字

mysql> explain select * from account;
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------+
| id | select_type | table   | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------+
|  1 | SIMPLE      | account | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 9986 |   100.00 | NULL  |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)
列名 作用
id 在一个查询中每个select关键字都对应一个唯一的id
select_type select关键字对应的查询类型
table 表名
type 每个select查询中针对该表的访问方法
possible_keys 可能用到的索引
key 实际用到的索引
key_len 实际用到的索引长度
ref 当使用索引列等值查询时,与索引列等值匹配的对象信息
rows 预计需要读取的记录数
filtered 按搜索条件过滤后剩余条数的占比
Extra 一些额外信息
id

id值如果相同,可以认为是一组,从上往下顺序执行;在所有组中,id值越大,优先级越高,越先执行

如下图,id值相同,说明是同一组,从上往下顺序执行,出现在前面的是驱动表,出现在后面的是被驱动表,查询优化器准备让account表作为驱动表,user表作为被驱动表执行查询

mysql> explain select * from account a inner join user u on a.id = u.account_id;
+----+-------------+-------+------------+------+----------------+----------------+---------+-----------+--------+----------+-------+
| id | select_type | table | partitions | type | possible_keys  | key            | key_len | ref       | rows   | filtered | Extra |
+----+-------------+-------+------------+------+----------------+----------------+---------+-----------+--------+----------+-------+
|  1 | SIMPLE      | a     | NULL       | ALL  | PRIMARY        | NULL           | NULL    | NULL      | 990675 |   100.00 | NULL  |
|  1 | SIMPLE      | u     | NULL       | ref  | idx_account_id | idx_account_id | 5       | test.a.id |      1 |   100.00 | NULL  |
+----+-------------+-------+------------+------+----------------+----------------+---------+-----------+--------+----------+-------+
2 rows in set, 1 warning (0.07 sec)

如下图,语句包含子查询,id值不同,值越大,优先级越高,越先执行,所以是先执行子查询,再执行account表相关的查询

mysql> explain select * from account where id in (select account_id from user where id > 9800) or password > 'a4c9429f-d81d';
+----+-------------+---------+------------+-------+--------------------+---------+---------+------+------+----------+-------------+
| id | select_type | table   | partitions | type  | possible_keys      | key     | key_len | ref  | rows | filtered | Extra       |
+----+-------------+---------+------------+-------+--------------------+---------+---------+------+------+----------+-------------+
|  1 | PRIMARY     | account | NULL       | ALL   | idx_password       | NULL    | NULL    | NULL | 9986 |   100.00 | Using where |
|  2 | SUBQUERY    | user    | NULL       | range | PRIMARY,account_id | PRIMARY | 4       | NULL |  200 |   100.00 | Using where |
+----+-------------+---------+------------+-------+--------------------+---------+---------+------+------+----------+-------------+
2 rows in set, 1 warning (0.00 sec)

如下图,不是所有的子查询id值都会不同,mysql有一个查询优化器,可能会对语句进行优化重写,转换为连接查询

mysql> explain select * from account where id in (select account_id from user where id > 9800) and password > 'a4c9429f-d81d';
+----+--------------+-------------+------------+--------+------------------------+---------------------+---------+-----------------+--------+----------+-------------+
| id | select_type  | table       | partitions | type   | possible_keys          | key                 | key_len | ref             | rows   | filtered | Extra       |
+----+--------------+-------------+------------+--------+------------------------+---------------------+---------+-----------------+--------+----------+-------------+
|  1 | SIMPLE       | account     | NULL       | ALL    | PRIMARY,idx_password   | NULL                | NULL    | NULL            | 990675 |    50.00 | Using where |
|  1 | SIMPLE       | <subquery2> | NULL       | eq_ref | <auto_distinct_key>    | <auto_distinct_key> | 5       | test.account.id |      1 |   100.00 | NULL        |
|  2 | MATERIALIZED | user        | NULL       | range  | PRIMARY,idx_account_id | PRIMARY             | 4       | NULL            | 498167 |   100.00 | Using where |
+----+--------------+-------------+------------+--------+------------------------+---------------------+---------+-----------------+--------+----------+-------------+
3 rows in set, 1 warning (0.05 sec)

如下图,union 会对多个查询的结果合并起来并进行去重,所以会新建一个<union1,2>的临时表,id为null表明这个临时表是为了合并查询结果集创建的

mysql> explain select * from account where id > 9900 union select * from account where id > 9901;
+----+--------------+------------+------------+-------+---------------+---------+---------+------+--------+----------+-----------------+
| id | select_type  | table      | partitions | type  | possible_keys | key     | key_len | ref  | rows   | filtered | Extra           |
+----+--------------+------------+------------+-------+---------------+---------+---------+------+--------+----------+-----------------+
|  1 | PRIMARY      | account    | NULL       | range | PRIMARY       | PRIMARY | 4       | NULL | 495337 |   100.00 | Using where     |
|  2 | UNION        | account    | NULL       | range | PRIMARY       | PRIMARY | 4       | NULL | 495337 |   100.00 | Using where     |
| NULL | UNION RESULT | <union1,2> | NULL       | ALL   | NULL          | NULL    | NULL    | NULL |   NULL |     NULL | Using temporary |
+----+--------------+------------+------------+-------+---------------+---------+---------+------+--------+----------+-----------------+
3 rows in set, 1 warning (0.04 sec)

如下图,union all不会进行去重,自然也没有<union1,2>的临时表

mysql> explain select * from account where id > 9900 union all select * from account where id > 9901;
+----+-------------+---------+------------+-------+---------------+---------+---------+------+--------+----------+-------------+
| id | select_type | table   | partitions | type  | possible_keys | key     | key_len | ref  | rows   | filtered | Extra       |
+----+-------------+---------+------------+-------+---------------+---------+---------+------+--------+----------+-------------+
|  1 | PRIMARY     | account | NULL       | range | PRIMARY       | PRIMARY | 4       | NULL | 495337 |   100.00 | Using where |
|  2 | UNION       | account | NULL       | range | PRIMARY       | PRIMARY | 4       | NULL | 495337 |   100.00 | Using where |
+----+-------------+---------+------------+-------+---------------+---------+---------+------+--------+----------+-------------+
2 rows in set, 1 warning (0.00 sec)
select _type

下表中,上面都有例子,就不赘述了

select_type 意思
SIMPLE 不包含union或者子查询
PRIMARY 对于包含union、union all或者子查询的大查询来说,他是由几个小查询组成的,最左边的查询的select_type就是PRIMARY
UNION 对于包含union、union all的大查询,最左边的查询的select_type是PRIMARY,其余的小查询就是union
UNION RESULT 使用union查询后会进行去重
SUBQUERY 包含子查询,并且该子查询不会被查询优化器转化成连接查询

下图中,先看id值,id值为2的语句先执行,id值为1的有两个,说明是连接查询,注意:<subquery2>最后的2表示这个表是由id值为2的语句衍生出来的表

mysql> explain select * from account where id in (select account_id from user where id > 9900);
+----+--------------+-------------+------------+--------+------------------------+---------------------+---------+-----------------+--------+----------+-------------+
| id | select_type  | table       | partitions | type   | possible_keys          | key                 | key_len | ref             | rows   | filtered | Extra       |
+----+--------------+-------------+------------+--------+------------------------+---------------------+---------+-----------------+--------+----------+-------------+
|  1 | SIMPLE       | account     | NULL       | ALL    | PRIMARY                | NULL                | NULL    | NULL            | 990675 |   100.00 | NULL        |
|  1 | SIMPLE       | <subquery2> | NULL       | eq_ref | <auto_distinct_key>    | <auto_distinct_key> | 5       | test.account.id |      1 |   100.00 | NULL        |
|  2 | MATERIALIZED | user        | NULL       | range  | PRIMARY,idx_account_id | PRIMARY             | 4       | NULL            | 498167 |   100.00 | Using where |
+----+--------------+-------------+------------+--------+------------------------+---------------------+---------+-----------------+--------+----------+-------------+
3 rows in set, 1 warning (0.00 sec)
type
const

如下图,const表示通过索引一次就找到了,当主键或者唯一二级索引与常量进行等值匹配时,type为const

mysql> explain select * from account where id = 16;
+----+-------------+---------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
| id | select_type | table   | partitions | type  | possible_keys | key     | key_len | ref   | rows | filtered | Extra |
+----+-------------+---------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | account | NULL       | const | PRIMARY       | PRIMARY | 4       | const |    1 |   100.00 | NULL  |
+----+-------------+---------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
1 row in set, 1 warning (0.02 sec)
eq_ref

如下图,eq_ref常见于主键或者唯一二级索引扫描,下图被驱动表扫描主键

mysql> explain select * from account a inner join user u on a.id = u.account_id;
+----+-------------+-------+------------+--------+---------------+---------+---------+-------------------+------+----------+-------------+
| id | select_type | table | partitions | type   | possible_keys | key     | key_len | ref               | rows | filtered | Extra       |
+----+-------------+-------+------------+--------+---------------+---------+---------+-------------------+------+----------+-------------+
|  1 | SIMPLE      | u     | NULL       | ALL    | account_id    | NULL    | NULL    | NULL              | 9851 |   100.00 | Using where |
|  1 | SIMPLE      | a     | NULL       | eq_ref | PRIMARY       | PRIMARY | 4       | test.u.account_id |    1 |   100.00 | NULL        |
+----+-------------+-------+------------+--------+---------------+---------+---------+-------------------+------+----------+-------------+
2 rows in set, 1 warning (0.01 sec)
ref

如下图,ref常见于普通二级索引与常量进行等值匹配

mysql> explain select * from account where password = 'a4c799db-d81d-11eb-9e9b-54ce0dd3fc76';
+----+-------------+---------+------------+------+---------------+--------------+---------+-------+------+----------+-------+
| id | select_type | table   | partitions | type | possible_keys | key          | key_len | ref   | rows | filtered | Extra |
+----+-------------+---------+------------+------+---------------+--------------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | account | NULL       | ref  | idx_password  | idx_password | 403     | const |    1 |   100.00 | NULL  |
+----+-------------+---------+------------+------+---------------+--------------+---------+-------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)
ref_or_null

如下图,ref_or_null常见于在ref的基础上,新增可为null的查询

mysql> explain select * from account where password = 'a4c799db-d81d-11eb-9e9b-54ce0dd3fc76' or password is null;
+----+-------------+---------+------------+-------------+---------------+--------------+---------+-------+------+----------+-----------------------+
| id | select_type | table   | partitions | type        | possible_keys | key          | key_len | ref   | rows | filtered | Extra                 |
+----+-------------+---------+------------+-------------+---------------+--------------+---------+-------+------+----------+-----------------------+
|  1 | SIMPLE      | account | NULL       | ref_or_null | idx_password  | idx_password | 403     | const |    2 |   100.00 | Using index condition |
+----+-------------+---------+------------+-------------+---------------+--------------+---------+-------+------+----------+-----------------------+
1 row in set, 1 warning (0.01 sec)
range

范围查询,较为简单

index和All

如下图,index常见于索引树全表扫描,与All的聚簇索引全表扫描不同,index所扫描的文件较聚簇索引文件小

mysql> explain select mobile from account where mobile_area_code = '86';
+----+-------------+---------+------------+-------+---------------+--------------+---------+------+------+----------+--------------------------+
| id | select_type | table   | partitions | type  | possible_keys | key          | key_len | ref  | rows | filtered | Extra                    |
+----+-------------+---------+------------+-------+---------------+--------------+---------+------+------+----------+--------------------------+
|  1 | SIMPLE      | account | NULL       | index | NULL          | idx_key_part | 470     | NULL | 9986 |    10.00 | Using where; Using index |
+----+-------------+---------+------------+-------+---------------+--------------+---------+------+------+----------+--------------------------+
1 row in set, 1 warning (0.00 sec)
possible_keys和key

如下图,可能用到的索引为idx_password,idx_key_part,实际用到的索引为idx_password,用什么索引是mysql的查询优化器经过成本计算后决定的,所以数据库中没有用到的索引需要删除,不然查询优化器计算成本需要更多的时间

mysql> explain select * from account where mobile = 'a4c4a5fe-d81d-11eb-9e9b-54ce0dd3fc76' and password = 'a4c4a5ff-d81d-11eb-9e9b-54ce0dd3fc76';
+----+-------------+---------+------------+------+---------------------------+--------------+---------+-------+------+----------+-------------+
| id | select_type | table   | partitions | type | possible_keys             | key          | key_len | ref   | rows | filtered | Extra       |
+----+-------------+---------+------------+------+---------------------------+--------------+---------+-------+------+----------+-------------+
|  1 | SIMPLE      | account | NULL       | ref  | idx_password,idx_key_part | idx_password | 403     | const |    1 |     5.00 | Using where |
+----+-------------+---------+------------+------+---------------------------+--------------+---------+-------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
ref

如下图,在访问方法为eq_ref时,与索引列做等值匹配的是user表的account_id

mysql> explain select * from account a inner join user u on a.id = u.account_id;
+----+-------------+-------+------------+--------+---------------+---------+---------+-------------------+------+----------+-------------+
| id | select_type | table | partitions | type   | possible_keys | key     | key_len | ref               | rows | filtered | Extra       |
+----+-------------+-------+------------+--------+---------------+---------+---------+-------------------+------+----------+-------------+
|  1 | SIMPLE      | u     | NULL       | ALL    | account_id    | NULL    | NULL    | NULL              | 9851 |   100.00 | Using where |
|  1 | SIMPLE      | a     | NULL       | eq_ref | PRIMARY       | PRIMARY | 4       | test.u.account_id |    1 |   100.00 | NULL        |
+----+-------------+-------+------------+--------+---------------+---------+---------+-------------------+------+----------+-------------+
2 rows in set, 1 warning (0.00 sec)

如下图,等值匹配的也可以是一个函数

mysql> explain select * from account a inner join user u on a.id = upper(u.account_id);
+----+-------------+-------+------------+--------+---------------+---------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type   | possible_keys | key     | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+------------+--------+---------------+---------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | u     | NULL       | ALL    | NULL          | NULL    | NULL    | NULL | 9851 |   100.00 | NULL        |
|  1 | SIMPLE      | a     | NULL       | eq_ref | PRIMARY       | PRIMARY | 4       | func |    1 |   100.00 | Using where |
+----+-------------+-------+------------+--------+---------------+---------+---------+------+------+----------+-------------+
2 rows in set, 1 warning (0.00 sec)
rows和filtered

如下图,rows表示此次查询预计需要扫描的行数,filter表示预计需要扫描的行数中有多少行符合我们的条件,下图中9986条都符合我们的条件,所以filtered值应该越高越好

平时联表查询的时候,我们应该多关注filtered值,filtered越高,说明驱动表扫描出来的行数大部分都是有用的,回表的次数就会变少,语句才会更高效

mysql> explain select * from account;
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------+
| id | select_type | table   | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------+
|  1 | SIMPLE      | account | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 9986 |   100.00 | NULL  |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)
Extra
Using where

如下图,使用全表扫描来执行语句,where条件中有针对某个字段的查询

mysql> explain select * from user where nick_name = 'a';
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | user  | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 9851 |    10.00 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

如下图,使用某个索引来执行语句,where条件中包含了该索引列不存在的字段

mysql> explain select * from account where mobile = 'a4c4a5fe-d81d-11eb-9e9b-54ce0dd3fc76' and password = 'a4c4a5ff-d81d-11eb-9e9b-54ce0dd3fc76';
+----+-------------+---------+------------+------+---------------------------+--------------+---------+-------+------+----------+-------------+
| id | select_type | table   | partitions | type | possible_keys             | key          | key_len | ref   | rows | filtered | Extra       |
+----+-------------+---------+------------+------+---------------------------+--------------+---------+-------+------+----------+-------------+
|  1 | SIMPLE      | account | NULL       | ref  | idx_password,idx_key_part | idx_password | 403     | const |    1 |     5.00 | Using where |
+----+-------------+---------+------------+------+---------------------------+--------------+---------+-------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
Using filesort

如下图,filesort表示在没有使用索引排序的情况下,在内存中(记录较少)或者在磁盘中(记录较多)进行排序

mysql> explain select * from user order by nick_name desc limit 10;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra          |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------+
|  1 | SIMPLE      | user  | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 9851 |   100.00 | Using filesort |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------+
1 row in set, 1 warning (0.00 sec)
Using temporary

如下图,mysql在无法使用索引的情况下,需要使用内部的临时表来执行语句,值得注意的是,以下语句extra中还有Using filesort,这是因为mysql会在group by后面加上order by

mysql> explain select nick_name, count(*) from user group by nick_name;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+---------------------------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                           |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+---------------------------------+
|  1 | SIMPLE      | user  | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 9851 |   100.00 | Using temporary; Using filesort |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+---------------------------------+
1 row in set, 1 warning (0.00 sec)
Using index

如下图,语句中查询的字段某个索引可以满足,不需要回表

mysql> explain select mobile from account where mobile = 'a4c4a5fe-d81d-11eb-9e9b-54ce0dd3fc76';
+----+-------------+---------+------------+------+---------------+--------------+---------+-------+------+----------+-------------+
| id | select_type | table   | partitions | type | possible_keys | key          | key_len | ref   | rows | filtered | Extra       |
+----+-------------+---------+------------+------+---------------+--------------+---------+-------+------+----------+-------------+
|  1 | SIMPLE      | account | NULL       | ref  | idx_key_part  | idx_key_part | 403     | const |    1 |   100.00 | Using index |
+----+-------------+---------+------------+------+---------------+--------------+---------+-------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
Using index condition

没有找到合适的例子

Using join buffer

如下图,没有添加索引联表查询的情况下,会使用join buffer来提高查询速度

mysql> explain select * from account a inner join user u on u.nick_name = a.password;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                                              |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+
|  1 | SIMPLE      | a     | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 9047 |   100.00 | NULL                                               |
|  1 | SIMPLE      | u     | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 9851 |    10.00 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+
2 rows in set, 1 warning (0.00 sec)

常见的sql优化

加索引

大部分慢sql是没有加索引或索引没有加对导致,网上关于怎么加索引的文章非常多,我就不赘述了,只提以下两点

  • 很多文章说区分度不大的字段不宜加索引,其实有点一棍子打死的嫌疑,比如上面的account表,mobile_area_code只有86、96两个值,假如百分之90的数据是96,百分之10的数据是86,而有一个业务场景恰恰是要查86相关的数据,这个时候索引的作用就很大了,所以加索引有一些规则可以参考,但还是要看业务具体实现
  • 多用组合索引,组合索引可以提高索引复用率,覆盖索引可以减少回表
  • 深度分页

    业务上比较常见的问题,列表翻页到比较大的页面时接口会越来越慢,sql常会变成这样

    select * from account order by password limit 5000, 10;
    

    这时可用下面这条语句优化

    select * from account where id >= (select id from account order by password limit 5000, 1) limit 10;
    

    explain一下,可以看到第一条语句全表扫描了

    mysql> explain select * from account order by password limit 5000, 10;
    +----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+----------------+
    | id | select_type | table   | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra          |
    +----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+----------------+
    |  1 | SIMPLE      | account | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 9975 |   100.00 | Using filesort |
    +----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+----------------+
    

    那为什么使用第二条语句就能加快查询呢?

    image.png

    第一条语句排序所有记录,其实我们不需要这么多记录,第二条语句只拿了我们需要的记录,时间自然就少了

    分页查询还有一个容易踩的坑,用于排序的字段必须是唯一的,使用主键或者唯一的业务单号,如果排序的字段不唯一(比如创建时间),分页的过程会出现少处理或者多处理数据,严重的话会陷在死循环中出不来

    强制索引

    我们应该经常会遇到sql语句走错索引的情况,比如下面这条sql,本来是走idx_account_id这个索引的,最终mysql决定走主键索引

    当然,这只是一个例子,并不是说下面这条sql走idx_account_id索引更快

    mysql> explain select * from user where account_id between 1000 and 10002000 order by id desc limit 10;
    +----+-------------+-------+------------+-------+----------------+---------+---------+------+------+----------+----------------------------------+
    | id | select_type | table | partitions | type  | possible_keys  | key     | key_len | ref  | rows | filtered | Extra                            |
    +----+-------------+-------+------------+-------+----------------+---------+---------+------+------+----------+----------------------------------+
    |  1 | SIMPLE      | user  | NULL       | index | idx_account_id | PRIMARY | 4       | NULL |   20 |    50.00 | Using where; Backward index scan |
    +----+-------------+-------+------------+-------+----------------+---------+---------+------+------+----------+----------------------------------+
    1 row in set, 1 warning (0.01 sec)
    

    如果是在测试环境,可以使用optimizer trace 功能查看索引变更的原因,首先需要开启optimizer trace

    SET optimizer_trace="enabled=on";
    select * from information_schema.OPTIMIZER_TRACE;
    

    比如上面的sql就是因为排序的原因导致索引变更

    {
      "reconsidering_access_paths_for_index_ordering": {
        "clause": "ORDER BY",
        "steps": [
        ],
        "index_order_summary": {
          "table": "`user`",
          "index_provides_order": true,
          "order_direction": "desc",
          "index": "PRIMARY",
          "plan_changed": true,
          "access_type": "index"
        }
      }
    }
    

    线上一般不会开启optimizer trace 功能,测试环境的数据量又不够,所以这种分析方式比较局限

    在平时的业务开发中,本人偶尔会遇到这种情况,A索引明明更快,结果走了B索引,如果是明确知道B索引更快,可以使用下面的语句强制走B索引

    select * from user force index(idx_account_id) where account_id between 1000 and 10002000 order by id desc limit 10;
    ​
    select /*+ INDEX(user idx_account_id) */ * from user where account_id between 1000 and 10002000 order by id desc limit 10;
    

    强制索引的弊端

  • 硬编码,不够优雅,代码需跟随索引名变更
  • 滞后性,慢sql治理一般是在功能上线之后,强制索引需要重新测试、打包上线,如果是非常严重的慢sql,这种方式不够敏捷
  • 巧用union

    业务开发中常有获取不同状态位数据,然后根据创建时间排序的需求,形式大概如下

    select * from account where mobile_area_code in ('86', '96') order by mobile_area_code, create_time desc limit 10;
    

    explain一下,发现全表扫描了,没有用上索引

    mysql> explain select * from account where mobile_area_code in ('86', '96') order by mobile_area_code, create_time desc limit 10;
    +----+-------------+---------+------------+------+----------------------------------+------+---------+------+--------+----------+-----------------------------+
    | id | select_type | table   | partitions | type | possible_keys                    | key  | key_len | ref  | rows   | filtered | Extra                       |
    +----+-------------+---------+------------+------+----------------------------------+------+---------+------+--------+----------+-----------------------------+
    |  1 | SIMPLE      | account | NULL       | ALL  | idx_mobile_area_code_create_time | NULL | NULL    | NULL | 991309 |   100.00 | Using where; Using filesort |
    +----+-------------+---------+------------+------+----------------------------------+------+---------+------+--------+----------+-----------------------------+
    1 row in set, 1 warning (0.02 sec)
    

    为了用上idx_mobile_area_code_create_time组合索引,需先固定mobile_area_code,对二级索引进行反向扫描,得出以下语句,用上了索引,执行速度也快了许多

    select *
    from (
      (select * from account where mobile_area_code = '86' order by mobile_area_code, create_time desc limit 10)
         union all
      (select * from account where mobile_area_code = '96' order by mobile_area_code, create_time desc limit 10)
    ) t
    order by mobile_area_code, create_time desc
    limit 10
    
    巧用虚拟列

    还是上面的account表,如果想要统计7月1日10点到10日10点,创建时间在8点和20点的国内账号,按小时汇总

    大部分的同学会这样写

    select DATE_FORMAT(create_time, '%H') hour, count(1) cnt from account
     where mobile_area_code= '86'
     and create_time >= '2023-07-01 10:00:00'
     and create_time <= '2023-07-10 10:00:00'
    GROUP BY hour HAVING hour  in ('08','20')
    

    explain,用上了临时表

    mysql> explain select DATE_FORMAT(create_time, '%H') hour ,count(1) cnt_order  from account  where mobile_area_code= '86'  and create_time >= '2023-07-01 10:00:00'  and create_time <= '2023-07-10 10:00:00' GROUP
    BY hour HAVING hour  in ('08','20');
    +----+-------------+---------+------------+-------+----------------------------------+----------------------------------+---------+------+------+----------+-------------------------------------------+
    | id | select_type | table   | partitions | type  | possible_keys                    | key                              | key_len | ref  | rows | filtered | Extra                                     |
    +----+-------------+---------+------------+-------+----------------------------------+----------------------------------+---------+------+------+----------+-------------------------------------------+
    |  1 | SIMPLE      | account | NULL       | range | idx_mobile_area_code_create_time | idx_mobile_area_code_create_time | 72      | NULL | 6481 |   100.00 | Using where; Using index; Using temporary |
    +----+-------------+---------+------------+-------+----------------------------------+----------------------------------+---------+------+------+----------+-------------------------------------------+
    1 row in set, 1 warning (0.00 sec)
    

    加个虚拟列和索引

    alter table account add column hour_ char(2)  GENERATED ALWAYS AS (DATE_FORMAT(create_time, '%H')) VIRTUAL;
    alter table account add key idx_area_code_hour_create_time_4(mobile_area_code, hour_, create_time);
    

    改写sql,用上虚拟列索引

    select hour_ hour,count(1) cnt from account
    where hour_ in ('08','20')
     and mobile_area_code = '86'
     and create_time >= '2023-07-01 10:00:00'
     and create_time <= '2023-07-10 10:00:00'
    group by  hour_;
    

    explain一下,临时表没了,性能提升了一点

    mysql> explain select hour_ hour,count(1) cnt from account where hour_ in ('08','20')  and mobile_area_code = '86'  and create_time >= '2023-07-01 10:00:00'  and create_time <= '2023-07-10 10:00:00' group by  hour_;
    +----+-------------+---------+------------+-------+-------------------------------------------------------------------+----------------------------------+---------+------+------+----------+--------------------------+
    | id | select_type | table   | partitions | type  | possible_keys                                                     | key                              | key_len | ref  | rows | filtered | Extra                    |
    +----+-------------+---------+------------+-------+-------------------------------------------------------------------+----------------------------------+---------+------+------+----------+--------------------------+
    |  1 | SIMPLE      | account | NULL       | range | idx_mobile_area_code_create_time,idx_area_code_hour_create_time_4 | idx_area_code_hour_create_time_4 | 81      | NULL |  540 |   100.00 | Using where; Using index |
    +----+-------------+---------+------------+-------+-------------------------------------------------------------------+----------------------------------+---------+------+------+----------+--------------------------+
    1 row in set, 1 warning (0.00 sec)
    
    模糊转精确

    业务开发中模糊查询是较难处理的,但业务上确确实实有这样的需求,很难砍,对数据库的压力又很大,本人所在的业务一般采用下面两种方式

  • 判断字段长度转精确,字段是支持模糊查询的,但用户输入的是精确的字段,比如订单号,很多用户是复制过来的,如果符合长度,直接精确查询,不符合长度,再进行模糊查询,可以减少一定量的慢sql语句
  • 业务上要求用户输入前面5位或后5位字符串,后端再经过中间表转换为精确字符,再进行查询
  • 虽然但是,模糊查询的处理跟业务强相关,本人业务的方式不一定适用所有

    排序优化

    根据极客时间Mysql45讲第16讲的例子,order by工作方式分为两种,全字段排序、rowid排序

    CREATE TABLE `t` (
     `id` int(11) NOT NULL,
     `city` varchar(16) NOT NULL,
     `name` varchar(16) NOT NULL,
     `age` int(11) NOT NULL,
     `addr` varchar(128) DEFAULT NULL,
     PRIMARY KEY (`id`),
     KEY `city` (`city`)
    ) ENGINE=InnoDB;
    

    比如下面的查询语句

    select city,name,age from t where city='杭州' order by name limit 1000;
    

    根据sort_buffer的大小,如果sort_buffer足够大,会把需要返回的字段全部放进去,根据排序字段排完序之后返回

    image.png
    如果sort_buffer不够大,会把主键id和排序字段放进去,排完序之后再回表,查询到所有业务字段之后再返回

    image.png

    当然,sort_buffer不够大的情况还会使用到文件排序

    按照这个原理,在业务上,经常有根据创建时间,更新时间排序的需求,如果根据创建时间来排序,sort_buffer要放多一个字段,有一个比较鸡贼的办法,创建时间的大小跟id是相差无几的,我们使用id来替换创建时间,当然,这个还是有一定的误差的,不同的业务创建时间的取法是不一样的,需要根据业务来评估

    代码分治

    将一条复杂sql拆分为多条sql,会提高代码复杂度,消耗测试资源,最好在写需求的时候就识别出来,说得容易,人都有惰性,大家懂的

    总结

  • 单表查询的情况下,慢sql一般都是没加索引或者没用上索引,比如隐式转换、函数计算、模糊查询,使用子查询
  • 多表查询的情况下,慢sql优化主要思想就是分治,通过代码或者sql的方式尽量缩小范围、分批处理数据
  • 相关文章

    Oracle如何使用授予和撤销权限的语法和示例
    Awesome Project: 探索 MatrixOrigin 云原生分布式数据库
    下载丨66页PDF,云和恩墨技术通讯(2024年7月刊)
    社区版oceanbase安装
    Oracle 导出CSV工具-sqluldr2
    ETL数据集成丨快速将MySQL数据迁移至Doris数据库

    发布评论