本人对mysql的理解不深,前一整子公司有mysql相关的分享,学到了一些东西,加上业务上有一些慢sql优化,所以做了以下的总结
准备工作
首先准备两张表,account表与user表,两表可根据account_id联合查询,两张表各插入10000条数据
CREATE TABLE `account` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`email` varchar(100) DEFAULT NULL,
`mobile_area_code` varchar(16) DEFAULT NULL,
`mobile` varchar(64) DEFAULT NULL,
`password` varchar(64) DEFAULT NULL,
`create_time` datetime NOT NULL,
PRIMARY KEY (`id`),
KEY `idx_password` (`password`),
KEY `idx_key_part` (`mobile`,`mobile_area_code`),
KEY `idx_mobile_area_code_create_time`(`mobile_area_code`, `create_time`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8mb4;
CREATE TABLE `user` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`account_id` int(11) DEFAULT NULL,
`nick_name` varchar(100) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `idx_account_id` (`account_id`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8mb4;
delimiter $$
CREATE PROCEDURE proc_initData4()
BEGIN
declare i int;
set i = 1;
set @dt='2023-06-01 00:00:00';
WHILE i<=1000000 DO
if i % 2 = 0 then
INSERT INTO account (id, `email`, `mobile_area_code`, `mobile`, `password`, `create_time`) VALUES (i, uuid(), '86', uuid(), uuid(), date_add(@dt, interval i minute));
else
INSERT INTO account (id, `email`, `mobile_area_code`, `mobile`, `password`, `create_time`) VALUES (i, uuid(), '96', uuid(), uuid(), date_add(@dt, interval i minute));
end if;
SET i = i+1;
END WHILE;
end $$
CALL proc_initData4();
delimiter $$
CREATE PROCEDURE proc_initData3()
BEGIN
declare i int;
set i = 1;
WHILE i<=1000000 DO
INSERT INTO user (id, `account_id`, `nick_name`) VALUES (i, i, uuid());
SET i = i+1;
END WHILE;
end $$
CALL proc_initData3();
explain
先看看explain中有什么关键字
mysql> explain select * from account;
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------+
| 1 | SIMPLE | account | NULL | ALL | NULL | NULL | NULL | NULL | 9986 | 100.00 | NULL |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)
列名 | 作用 |
---|---|
id | 在一个查询中每个select关键字都对应一个唯一的id |
select_type | select关键字对应的查询类型 |
table | 表名 |
type | 每个select查询中针对该表的访问方法 |
possible_keys | 可能用到的索引 |
key | 实际用到的索引 |
key_len | 实际用到的索引长度 |
ref | 当使用索引列等值查询时,与索引列等值匹配的对象信息 |
rows | 预计需要读取的记录数 |
filtered | 按搜索条件过滤后剩余条数的占比 |
Extra | 一些额外信息 |
id
id值如果相同,可以认为是一组,从上往下顺序执行;在所有组中,id值越大,优先级越高,越先执行
如下图,id值相同,说明是同一组,从上往下顺序执行,出现在前面的是驱动表,出现在后面的是被驱动表,查询优化器准备让account表作为驱动表,user表作为被驱动表执行查询
mysql> explain select * from account a inner join user u on a.id = u.account_id;
+----+-------------+-------+------------+------+----------------+----------------+---------+-----------+--------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+----------------+----------------+---------+-----------+--------+----------+-------+
| 1 | SIMPLE | a | NULL | ALL | PRIMARY | NULL | NULL | NULL | 990675 | 100.00 | NULL |
| 1 | SIMPLE | u | NULL | ref | idx_account_id | idx_account_id | 5 | test.a.id | 1 | 100.00 | NULL |
+----+-------------+-------+------------+------+----------------+----------------+---------+-----------+--------+----------+-------+
2 rows in set, 1 warning (0.07 sec)
如下图,语句包含子查询,id值不同,值越大,优先级越高,越先执行,所以是先执行子查询,再执行account表相关的查询
mysql> explain select * from account where id in (select account_id from user where id > 9800) or password > 'a4c9429f-d81d';
+----+-------------+---------+------------+-------+--------------------+---------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+-------+--------------------+---------+---------+------+------+----------+-------------+
| 1 | PRIMARY | account | NULL | ALL | idx_password | NULL | NULL | NULL | 9986 | 100.00 | Using where |
| 2 | SUBQUERY | user | NULL | range | PRIMARY,account_id | PRIMARY | 4 | NULL | 200 | 100.00 | Using where |
+----+-------------+---------+------------+-------+--------------------+---------+---------+------+------+----------+-------------+
2 rows in set, 1 warning (0.00 sec)
如下图,不是所有的子查询id值都会不同,mysql有一个查询优化器,可能会对语句进行优化重写,转换为连接查询
mysql> explain select * from account where id in (select account_id from user where id > 9800) and password > 'a4c9429f-d81d';
+----+--------------+-------------+------------+--------+------------------------+---------------------+---------+-----------------+--------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------+-------------+------------+--------+------------------------+---------------------+---------+-----------------+--------+----------+-------------+
| 1 | SIMPLE | account | NULL | ALL | PRIMARY,idx_password | NULL | NULL | NULL | 990675 | 50.00 | Using where |
| 1 | SIMPLE | <subquery2> | NULL | eq_ref | <auto_distinct_key> | <auto_distinct_key> | 5 | test.account.id | 1 | 100.00 | NULL |
| 2 | MATERIALIZED | user | NULL | range | PRIMARY,idx_account_id | PRIMARY | 4 | NULL | 498167 | 100.00 | Using where |
+----+--------------+-------------+------------+--------+------------------------+---------------------+---------+-----------------+--------+----------+-------------+
3 rows in set, 1 warning (0.05 sec)
如下图,union 会对多个查询的结果合并起来并进行去重,所以会新建一个<union1,2>的临时表,id为null表明这个临时表是为了合并查询结果集创建的
mysql> explain select * from account where id > 9900 union select * from account where id > 9901;
+----+--------------+------------+------------+-------+---------------+---------+---------+------+--------+----------+-----------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------+------------+------------+-------+---------------+---------+---------+------+--------+----------+-----------------+
| 1 | PRIMARY | account | NULL | range | PRIMARY | PRIMARY | 4 | NULL | 495337 | 100.00 | Using where |
| 2 | UNION | account | NULL | range | PRIMARY | PRIMARY | 4 | NULL | 495337 | 100.00 | Using where |
| NULL | UNION RESULT | <union1,2> | NULL | ALL | NULL | NULL | NULL | NULL | NULL | NULL | Using temporary |
+----+--------------+------------+------------+-------+---------------+---------+---------+------+--------+----------+-----------------+
3 rows in set, 1 warning (0.04 sec)
如下图,union all不会进行去重,自然也没有<union1,2>的临时表
mysql> explain select * from account where id > 9900 union all select * from account where id > 9901;
+----+-------------+---------+------------+-------+---------------+---------+---------+------+--------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+-------+---------------+---------+---------+------+--------+----------+-------------+
| 1 | PRIMARY | account | NULL | range | PRIMARY | PRIMARY | 4 | NULL | 495337 | 100.00 | Using where |
| 2 | UNION | account | NULL | range | PRIMARY | PRIMARY | 4 | NULL | 495337 | 100.00 | Using where |
+----+-------------+---------+------------+-------+---------------+---------+---------+------+--------+----------+-------------+
2 rows in set, 1 warning (0.00 sec)
select _type
下表中,上面都有例子,就不赘述了
select_type | 意思 |
---|---|
SIMPLE | 不包含union或者子查询 |
PRIMARY | 对于包含union、union all或者子查询的大查询来说,他是由几个小查询组成的,最左边的查询的select_type就是PRIMARY |
UNION | 对于包含union、union all的大查询,最左边的查询的select_type是PRIMARY,其余的小查询就是union |
UNION RESULT | 使用union查询后会进行去重 |
SUBQUERY | 包含子查询,并且该子查询不会被查询优化器转化成连接查询 |
下图中,先看id值,id值为2的语句先执行,id值为1的有两个,说明是连接查询,注意:<subquery2>
最后的2表示这个表是由id值为2的语句衍生出来的表
mysql> explain select * from account where id in (select account_id from user where id > 9900);
+----+--------------+-------------+------------+--------+------------------------+---------------------+---------+-----------------+--------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------+-------------+------------+--------+------------------------+---------------------+---------+-----------------+--------+----------+-------------+
| 1 | SIMPLE | account | NULL | ALL | PRIMARY | NULL | NULL | NULL | 990675 | 100.00 | NULL |
| 1 | SIMPLE | <subquery2> | NULL | eq_ref | <auto_distinct_key> | <auto_distinct_key> | 5 | test.account.id | 1 | 100.00 | NULL |
| 2 | MATERIALIZED | user | NULL | range | PRIMARY,idx_account_id | PRIMARY | 4 | NULL | 498167 | 100.00 | Using where |
+----+--------------+-------------+------------+--------+------------------------+---------------------+---------+-----------------+--------+----------+-------------+
3 rows in set, 1 warning (0.00 sec)
type
const
如下图,const表示通过索引一次就找到了,当主键或者唯一二级索引与常量进行等值匹配时,type为const
mysql> explain select * from account where id = 16;
+----+-------------+---------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
| 1 | SIMPLE | account | NULL | const | PRIMARY | PRIMARY | 4 | const | 1 | 100.00 | NULL |
+----+-------------+---------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
1 row in set, 1 warning (0.02 sec)
eq_ref
如下图,eq_ref常见于主键或者唯一二级索引扫描,下图被驱动表扫描主键
mysql> explain select * from account a inner join user u on a.id = u.account_id;
+----+-------------+-------+------------+--------+---------------+---------+---------+-------------------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+--------+---------------+---------+---------+-------------------+------+----------+-------------+
| 1 | SIMPLE | u | NULL | ALL | account_id | NULL | NULL | NULL | 9851 | 100.00 | Using where |
| 1 | SIMPLE | a | NULL | eq_ref | PRIMARY | PRIMARY | 4 | test.u.account_id | 1 | 100.00 | NULL |
+----+-------------+-------+------------+--------+---------------+---------+---------+-------------------+------+----------+-------------+
2 rows in set, 1 warning (0.01 sec)
ref
如下图,ref常见于普通二级索引与常量进行等值匹配
mysql> explain select * from account where password = 'a4c799db-d81d-11eb-9e9b-54ce0dd3fc76';
+----+-------------+---------+------------+------+---------------+--------------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+------+---------------+--------------+---------+-------+------+----------+-------+
| 1 | SIMPLE | account | NULL | ref | idx_password | idx_password | 403 | const | 1 | 100.00 | NULL |
+----+-------------+---------+------------+------+---------------+--------------+---------+-------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)
ref_or_null
如下图,ref_or_null常见于在ref的基础上,新增可为null的查询
mysql> explain select * from account where password = 'a4c799db-d81d-11eb-9e9b-54ce0dd3fc76' or password is null;
+----+-------------+---------+------------+-------------+---------------+--------------+---------+-------+------+----------+-----------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+-------------+---------------+--------------+---------+-------+------+----------+-----------------------+
| 1 | SIMPLE | account | NULL | ref_or_null | idx_password | idx_password | 403 | const | 2 | 100.00 | Using index condition |
+----+-------------+---------+------------+-------------+---------------+--------------+---------+-------+------+----------+-----------------------+
1 row in set, 1 warning (0.01 sec)
range
范围查询,较为简单
index和All
如下图,index常见于索引树全表扫描,与All的聚簇索引全表扫描不同,index所扫描的文件较聚簇索引文件小
mysql> explain select mobile from account where mobile_area_code = '86';
+----+-------------+---------+------------+-------+---------------+--------------+---------+------+------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+-------+---------------+--------------+---------+------+------+----------+--------------------------+
| 1 | SIMPLE | account | NULL | index | NULL | idx_key_part | 470 | NULL | 9986 | 10.00 | Using where; Using index |
+----+-------------+---------+------------+-------+---------------+--------------+---------+------+------+----------+--------------------------+
1 row in set, 1 warning (0.00 sec)
possible_keys和key
如下图,可能用到的索引为idx_password,idx_key_part,实际用到的索引为idx_password,用什么索引是mysql的查询优化器经过成本计算后决定的,所以数据库中没有用到的索引需要删除,不然查询优化器计算成本需要更多的时间
mysql> explain select * from account where mobile = 'a4c4a5fe-d81d-11eb-9e9b-54ce0dd3fc76' and password = 'a4c4a5ff-d81d-11eb-9e9b-54ce0dd3fc76';
+----+-------------+---------+------------+------+---------------------------+--------------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+------+---------------------------+--------------+---------+-------+------+----------+-------------+
| 1 | SIMPLE | account | NULL | ref | idx_password,idx_key_part | idx_password | 403 | const | 1 | 5.00 | Using where |
+----+-------------+---------+------------+------+---------------------------+--------------+---------+-------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
ref
如下图,在访问方法为eq_ref时,与索引列做等值匹配的是user表的account_id
mysql> explain select * from account a inner join user u on a.id = u.account_id;
+----+-------------+-------+------------+--------+---------------+---------+---------+-------------------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+--------+---------------+---------+---------+-------------------+------+----------+-------------+
| 1 | SIMPLE | u | NULL | ALL | account_id | NULL | NULL | NULL | 9851 | 100.00 | Using where |
| 1 | SIMPLE | a | NULL | eq_ref | PRIMARY | PRIMARY | 4 | test.u.account_id | 1 | 100.00 | NULL |
+----+-------------+-------+------------+--------+---------------+---------+---------+-------------------+------+----------+-------------+
2 rows in set, 1 warning (0.00 sec)
如下图,等值匹配的也可以是一个函数
mysql> explain select * from account a inner join user u on a.id = upper(u.account_id);
+----+-------------+-------+------------+--------+---------------+---------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+--------+---------------+---------+---------+------+------+----------+-------------+
| 1 | SIMPLE | u | NULL | ALL | NULL | NULL | NULL | NULL | 9851 | 100.00 | NULL |
| 1 | SIMPLE | a | NULL | eq_ref | PRIMARY | PRIMARY | 4 | func | 1 | 100.00 | Using where |
+----+-------------+-------+------------+--------+---------------+---------+---------+------+------+----------+-------------+
2 rows in set, 1 warning (0.00 sec)
rows和filtered
如下图,rows表示此次查询预计需要扫描的行数,filter表示预计需要扫描的行数中有多少行符合我们的条件,下图中9986条都符合我们的条件,所以filtered值应该越高越好
平时联表查询的时候,我们应该多关注filtered值,filtered越高,说明驱动表扫描出来的行数大部分都是有用的,回表的次数就会变少,语句才会更高效
mysql> explain select * from account;
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------+
| 1 | SIMPLE | account | NULL | ALL | NULL | NULL | NULL | NULL | 9986 | 100.00 | NULL |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)
Extra
Using where
如下图,使用全表扫描来执行语句,where条件中有针对某个字段的查询
mysql> explain select * from user where nick_name = 'a';
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | user | NULL | ALL | NULL | NULL | NULL | NULL | 9851 | 10.00 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
如下图,使用某个索引来执行语句,where条件中包含了该索引列不存在的字段
mysql> explain select * from account where mobile = 'a4c4a5fe-d81d-11eb-9e9b-54ce0dd3fc76' and password = 'a4c4a5ff-d81d-11eb-9e9b-54ce0dd3fc76';
+----+-------------+---------+------------+------+---------------------------+--------------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+------+---------------------------+--------------+---------+-------+------+----------+-------------+
| 1 | SIMPLE | account | NULL | ref | idx_password,idx_key_part | idx_password | 403 | const | 1 | 5.00 | Using where |
+----+-------------+---------+------------+------+---------------------------+--------------+---------+-------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
Using filesort
如下图,filesort表示在没有使用索引排序的情况下,在内存中(记录较少)或者在磁盘中(记录较多)进行排序
mysql> explain select * from user order by nick_name desc limit 10;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------+
| 1 | SIMPLE | user | NULL | ALL | NULL | NULL | NULL | NULL | 9851 | 100.00 | Using filesort |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------+
1 row in set, 1 warning (0.00 sec)
Using temporary
如下图,mysql在无法使用索引的情况下,需要使用内部的临时表来执行语句,值得注意的是,以下语句extra中还有Using filesort,这是因为mysql会在group by后面加上order by
mysql> explain select nick_name, count(*) from user group by nick_name;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+---------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+---------------------------------+
| 1 | SIMPLE | user | NULL | ALL | NULL | NULL | NULL | NULL | 9851 | 100.00 | Using temporary; Using filesort |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+---------------------------------+
1 row in set, 1 warning (0.00 sec)
Using index
如下图,语句中查询的字段某个索引可以满足,不需要回表
mysql> explain select mobile from account where mobile = 'a4c4a5fe-d81d-11eb-9e9b-54ce0dd3fc76';
+----+-------------+---------+------------+------+---------------+--------------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+------+---------------+--------------+---------+-------+------+----------+-------------+
| 1 | SIMPLE | account | NULL | ref | idx_key_part | idx_key_part | 403 | const | 1 | 100.00 | Using index |
+----+-------------+---------+------------+------+---------------+--------------+---------+-------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
Using index condition
没有找到合适的例子
Using join buffer
如下图,没有添加索引联表查询的情况下,会使用join buffer来提高查询速度
mysql> explain select * from account a inner join user u on u.nick_name = a.password;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+
| 1 | SIMPLE | a | NULL | ALL | NULL | NULL | NULL | NULL | 9047 | 100.00 | NULL |
| 1 | SIMPLE | u | NULL | ALL | NULL | NULL | NULL | NULL | 9851 | 10.00 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+
2 rows in set, 1 warning (0.00 sec)
常见的sql优化
加索引
大部分慢sql是没有加索引或索引没有加对导致,网上关于怎么加索引的文章非常多,我就不赘述了,只提以下两点
深度分页
业务上比较常见的问题,列表翻页到比较大的页面时接口会越来越慢,sql常会变成这样
select * from account order by password limit 5000, 10;
这时可用下面这条语句优化
select * from account where id >= (select id from account order by password limit 5000, 1) limit 10;
explain一下,可以看到第一条语句全表扫描了
mysql> explain select * from account order by password limit 5000, 10;
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+----------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+----------------+
| 1 | SIMPLE | account | NULL | ALL | NULL | NULL | NULL | NULL | 9975 | 100.00 | Using filesort |
+----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+----------------+
那为什么使用第二条语句就能加快查询呢?
第一条语句排序所有记录,其实我们不需要这么多记录,第二条语句只拿了我们需要的记录,时间自然就少了
分页查询还有一个容易踩的坑,用于排序的字段必须是唯一的,使用主键或者唯一的业务单号,如果排序的字段不唯一(比如创建时间),分页的过程会出现少处理或者多处理数据,严重的话会陷在死循环中出不来
强制索引
我们应该经常会遇到sql语句走错索引的情况,比如下面这条sql,本来是走idx_account_id这个索引的,最终mysql决定走主键索引
当然,这只是一个例子,并不是说下面这条sql走idx_account_id索引更快
mysql> explain select * from user where account_id between 1000 and 10002000 order by id desc limit 10;
+----+-------------+-------+------------+-------+----------------+---------+---------+------+------+----------+----------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+----------------+---------+---------+------+------+----------+----------------------------------+
| 1 | SIMPLE | user | NULL | index | idx_account_id | PRIMARY | 4 | NULL | 20 | 50.00 | Using where; Backward index scan |
+----+-------------+-------+------------+-------+----------------+---------+---------+------+------+----------+----------------------------------+
1 row in set, 1 warning (0.01 sec)
如果是在测试环境,可以使用optimizer trace 功能查看索引变更的原因,首先需要开启optimizer trace
SET optimizer_trace="enabled=on";
select * from information_schema.OPTIMIZER_TRACE;
比如上面的sql就是因为排序的原因导致索引变更
{
"reconsidering_access_paths_for_index_ordering": {
"clause": "ORDER BY",
"steps": [
],
"index_order_summary": {
"table": "`user`",
"index_provides_order": true,
"order_direction": "desc",
"index": "PRIMARY",
"plan_changed": true,
"access_type": "index"
}
}
}
线上一般不会开启optimizer trace 功能,测试环境的数据量又不够,所以这种分析方式比较局限
在平时的业务开发中,本人偶尔会遇到这种情况,A索引明明更快,结果走了B索引,如果是明确知道B索引更快,可以使用下面的语句强制走B索引
select * from user force index(idx_account_id) where account_id between 1000 and 10002000 order by id desc limit 10;
select /*+ INDEX(user idx_account_id) */ * from user where account_id between 1000 and 10002000 order by id desc limit 10;
强制索引的弊端
巧用union
业务开发中常有获取不同状态位数据,然后根据创建时间排序的需求,形式大概如下
select * from account where mobile_area_code in ('86', '96') order by mobile_area_code, create_time desc limit 10;
explain一下,发现全表扫描了,没有用上索引
mysql> explain select * from account where mobile_area_code in ('86', '96') order by mobile_area_code, create_time desc limit 10;
+----+-------------+---------+------------+------+----------------------------------+------+---------+------+--------+----------+-----------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+------+----------------------------------+------+---------+------+--------+----------+-----------------------------+
| 1 | SIMPLE | account | NULL | ALL | idx_mobile_area_code_create_time | NULL | NULL | NULL | 991309 | 100.00 | Using where; Using filesort |
+----+-------------+---------+------------+------+----------------------------------+------+---------+------+--------+----------+-----------------------------+
1 row in set, 1 warning (0.02 sec)
为了用上idx_mobile_area_code_create_time组合索引,需先固定mobile_area_code,对二级索引进行反向扫描,得出以下语句,用上了索引,执行速度也快了许多
select *
from (
(select * from account where mobile_area_code = '86' order by mobile_area_code, create_time desc limit 10)
union all
(select * from account where mobile_area_code = '96' order by mobile_area_code, create_time desc limit 10)
) t
order by mobile_area_code, create_time desc
limit 10
巧用虚拟列
还是上面的account表,如果想要统计7月1日10点到10日10点,创建时间在8点和20点的国内账号,按小时汇总
大部分的同学会这样写
select DATE_FORMAT(create_time, '%H') hour, count(1) cnt from account
where mobile_area_code= '86'
and create_time >= '2023-07-01 10:00:00'
and create_time <= '2023-07-10 10:00:00'
GROUP BY hour HAVING hour in ('08','20')
explain,用上了临时表
mysql> explain select DATE_FORMAT(create_time, '%H') hour ,count(1) cnt_order from account where mobile_area_code= '86' and create_time >= '2023-07-01 10:00:00' and create_time <= '2023-07-10 10:00:00' GROUP
BY hour HAVING hour in ('08','20');
+----+-------------+---------+------------+-------+----------------------------------+----------------------------------+---------+------+------+----------+-------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+-------+----------------------------------+----------------------------------+---------+------+------+----------+-------------------------------------------+
| 1 | SIMPLE | account | NULL | range | idx_mobile_area_code_create_time | idx_mobile_area_code_create_time | 72 | NULL | 6481 | 100.00 | Using where; Using index; Using temporary |
+----+-------------+---------+------------+-------+----------------------------------+----------------------------------+---------+------+------+----------+-------------------------------------------+
1 row in set, 1 warning (0.00 sec)
加个虚拟列和索引
alter table account add column hour_ char(2) GENERATED ALWAYS AS (DATE_FORMAT(create_time, '%H')) VIRTUAL;
alter table account add key idx_area_code_hour_create_time_4(mobile_area_code, hour_, create_time);
改写sql,用上虚拟列索引
select hour_ hour,count(1) cnt from account
where hour_ in ('08','20')
and mobile_area_code = '86'
and create_time >= '2023-07-01 10:00:00'
and create_time <= '2023-07-10 10:00:00'
group by hour_;
explain一下,临时表没了,性能提升了一点
mysql> explain select hour_ hour,count(1) cnt from account where hour_ in ('08','20') and mobile_area_code = '86' and create_time >= '2023-07-01 10:00:00' and create_time <= '2023-07-10 10:00:00' group by hour_;
+----+-------------+---------+------------+-------+-------------------------------------------------------------------+----------------------------------+---------+------+------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+-------+-------------------------------------------------------------------+----------------------------------+---------+------+------+----------+--------------------------+
| 1 | SIMPLE | account | NULL | range | idx_mobile_area_code_create_time,idx_area_code_hour_create_time_4 | idx_area_code_hour_create_time_4 | 81 | NULL | 540 | 100.00 | Using where; Using index |
+----+-------------+---------+------------+-------+-------------------------------------------------------------------+----------------------------------+---------+------+------+----------+--------------------------+
1 row in set, 1 warning (0.00 sec)
模糊转精确
业务开发中模糊查询是较难处理的,但业务上确确实实有这样的需求,很难砍,对数据库的压力又很大,本人所在的业务一般采用下面两种方式
虽然但是,模糊查询的处理跟业务强相关,本人业务的方式不一定适用所有
排序优化
根据极客时间Mysql45讲第16讲的例子,order by工作方式分为两种,全字段排序、rowid排序
CREATE TABLE `t` (
`id` int(11) NOT NULL,
`city` varchar(16) NOT NULL,
`name` varchar(16) NOT NULL,
`age` int(11) NOT NULL,
`addr` varchar(128) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `city` (`city`)
) ENGINE=InnoDB;
比如下面的查询语句
select city,name,age from t where city='杭州' order by name limit 1000;
根据sort_buffer的大小,如果sort_buffer足够大,会把需要返回的字段全部放进去,根据排序字段排完序之后返回
如果sort_buffer不够大,会把主键id和排序字段放进去,排完序之后再回表,查询到所有业务字段之后再返回
当然,sort_buffer不够大的情况还会使用到文件排序
按照这个原理,在业务上,经常有根据创建时间,更新时间排序的需求,如果根据创建时间来排序,sort_buffer要放多一个字段,有一个比较鸡贼的办法,创建时间的大小跟id是相差无几的,我们使用id来替换创建时间,当然,这个还是有一定的误差的,不同的业务创建时间的取法是不一样的,需要根据业务来评估
代码分治
将一条复杂sql拆分为多条sql,会提高代码复杂度,消耗测试资源,最好在写需求的时候就识别出来,说得容易,人都有惰性,大家懂的