Mysql一些SQL优化的方式
本人对mysql的理解不深,前一整子公司有mysql相关的分享,学到了一些东西,加上业务上有一些慢sql优化,所以做了以下的总结
准备工作
首先准备两张表,account表与user表,两表可根据account_id联合查询,两张表各插入10000条数据
CREATE TABLE `account` ( `id` int(11) NOT NULL AUTO_INCREMENT, `email` varchar(100) DEFAULT NULL, `mobile_area_code` varchar(16) DEFAULT NULL, `mobile` varchar(64) DEFAULT NULL, `password` varchar(64) DEFAULT NULL, `create_time` datetime NOT NULL, PRIMARY KEY (`id`), KEY `idx_password` (`password`), KEY `idx_key_part` (`mobile`,`mobile_area_code`), KEY `idx_mobile_area_code_create_time`(`mobile_area_code`, `create_time`) ) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8mb4; CREATE TABLE `user` ( `id` int(11) NOT NULL AUTO_INCREMENT, `account_id` int(11) DEFAULT NULL, `nick_name` varchar(100) DEFAULT NULL, PRIMARY KEY (`id`), KEY `idx_account_id` (`account_id`) ) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8mb4;
delimiter $$ CREATE PROCEDURE proc_initData4() BEGIN declare i int; set i = 1; set @dt='2023-06-01 00:00:00'; WHILE i<=1000000 DO if i % 2 = 0 then INSERT INTO account (id, `email`, `mobile_area_code`, `mobile`, `password`, `create_time`) VALUES (i, uuid(), '86', uuid(), uuid(), date_add(@dt, interval i minute)); else INSERT INTO account (id, `email`, `mobile_area_code`, `mobile`, `password`, `create_time`) VALUES (i, uuid(), '96', uuid(), uuid(), date_add(@dt, interval i minute)); end if; SET i = i+1; END WHILE; end $$ CALL proc_initData4(); delimiter $$ CREATE PROCEDURE proc_initData3() BEGIN declare i int; set i = 1; WHILE i<=1000000 DO INSERT INTO user (id, `account_id`, `nick_name`) VALUES (i, i, uuid()); SET i = i+1; END WHILE; end $$ CALL proc_initData3();
explain
先看看explain中有什么关键字
mysql> explain select * from account; +----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------+ | 1 | SIMPLE | account | NULL | ALL | NULL | NULL | NULL | NULL | 9986 | 100.00 | NULL | +----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------+ 1 row in set, 1 warning (0.00 sec)
列名 | 作用 |
---|---|
id | 在一个查询中每个select关键字都对应一个唯一的id |
select_type | select关键字对应的查询类型 |
table | 表名 |
type | 每个select查询中针对该表的访问方法 |
possible_keys | 可能用到的索引 |
key | 实际用到的索引 |
key_len | 实际用到的索引长度 |
ref | 当使用索引列等值查询时,与索引列等值匹配的对象信息 |
rows | 预计需要读取的记录数 |
filtered | 按搜索条件过滤后剩余条数的占比 |
Extra | 一些额外信息 |
id
id值如果相同,可以认为是一组,从上往下顺序执行;在所有组中,id值越大,优先级越高,越先执行
如下图,id值相同,说明是同一组,从上往下顺序执行,出现在前面的是驱动表,出现在后面的是被驱动表,查询优化器准备让account表作为驱动表,user表作为被驱动表执行查询
mysql> explain select * from account a inner join user u on a.id = u.account_id; +----+-------------+-------+------------+------+----------------+----------------+---------+-----------+--------+----------+-------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+------+----------------+----------------+---------+-----------+--------+----------+-------+ | 1 | SIMPLE | a | NULL | ALL | PRIMARY | NULL | NULL | NULL | 990675 | 100.00 | NULL | | 1 | SIMPLE | u | NULL | ref | idx_account_id | idx_account_id | 5 | test.a.id | 1 | 100.00 | NULL | +----+-------------+-------+------------+------+----------------+----------------+---------+-----------+--------+----------+-------+ 2 rows in set, 1 warning (0.07 sec)
如下图,语句包含子查询,id值不同,值越大,优先级越高,越先执行,所以是先执行子查询,再执行account表相关的查询
mysql> explain select * from account where id in (select account_id from user where id > 9800) or password > 'a4c9429f-d81d'; +----+-------------+---------+------------+-------+--------------------+---------+---------+------+------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------+------------+-------+--------------------+---------+---------+------+------+----------+-------------+ | 1 | PRIMARY | account | NULL | ALL | idx_password | NULL | NULL | NULL | 9986 | 100.00 | Using where | | 2 | SUBQUERY | user | NULL | range | PRIMARY,account_id | PRIMARY | 4 | NULL | 200 | 100.00 | Using where | +----+-------------+---------+------------+-------+--------------------+---------+---------+------+------+----------+-------------+ 2 rows in set, 1 warning (0.00 sec)
如下图,不是所有的子查询id值都会不同,mysql有一个查询优化器,可能会对语句进行优化重写,转换为连接查询
mysql> explain select * from account where id in (select account_id from user where id > 9800) and password > 'a4c9429f-d81d'; +----+--------------+-------------+------------+--------+------------------------+---------------------+---------+-----------------+--------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+--------------+-------------+------------+--------+------------------------+---------------------+---------+-----------------+--------+----------+-------------+ | 1 | SIMPLE | account | NULL | ALL | PRIMARY,idx_password | NULL | NULL | NULL | 990675 | 50.00 | Using where | | 1 | SIMPLE | <subquery2> | NULL | eq_ref | <auto_distinct_key> | <auto_distinct_key> | 5 | test.account.id | 1 | 100.00 | NULL | | 2 | MATERIALIZED | user | NULL | range | PRIMARY,idx_account_id | PRIMARY | 4 | NULL | 498167 | 100.00 | Using where | +----+--------------+-------------+------------+--------+------------------------+---------------------+---------+-----------------+--------+----------+-------------+ 3 rows in set, 1 warning (0.05 sec)
如下图,union 会对多个查询的结果合并起来并进行去重,所以会新建一个<union1,2>的临时表,id为null表明这个临时表是为了合并查询结果集创建的
mysql> explain select * from account where id > 9900 union select * from account where id > 9901; +----+--------------+------------+------------+-------+---------------+---------+---------+------+--------+----------+-----------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+--------------+------------+------------+-------+---------------+---------+---------+------+--------+----------+-----------------+ | 1 | PRIMARY | account | NULL | range | PRIMARY | PRIMARY | 4 | NULL | 495337 | 100.00 | Using where | | 2 | UNION | account | NULL | range | PRIMARY | PRIMARY | 4 | NULL | 495337 | 100.00 | Using where | | NULL | UNION RESULT | <union1,2> | NULL | ALL | NULL | NULL | NULL | NULL | NULL | NULL | Using temporary | +----+--------------+------------+------------+-------+---------------+---------+---------+------+--------+----------+-----------------+ 3 rows in set, 1 warning (0.04 sec)
如下图,union all不会进行去重,自然也没有<union1,2>的临时表
mysql> explain select * from account where id > 9900 union all select * from account where id > 9901; +----+-------------+---------+------------+-------+---------------+---------+---------+------+--------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------+------------+-------+---------------+---------+---------+------+--------+----------+-------------+ | 1 | PRIMARY | account | NULL | range | PRIMARY | PRIMARY | 4 | NULL | 495337 | 100.00 | Using where | | 2 | UNION | account | NULL | range | PRIMARY | PRIMARY | 4 | NULL | 495337 | 100.00 | Using where | +----+-------------+---------+------------+-------+---------------+---------+---------+------+--------+----------+-------------+ 2 rows in set, 1 warning (0.00 sec)
select _type
下表中,上面都有例子,就不赘述了
select_type | 意思 |
---|---|
SIMPLE | 不包含union或者子查询 |
PRIMARY | 对于包含union、union all或者子查询的大查询来说,他是由几个小查询组成的,最左边的查询的select_type就是PRIMARY |
UNION | 对于包含union、union all的大查询,最左边的查询的select_type是PRIMARY,其余的小查询就是union |
UNION RESULT | 使用union查询后会进行去重 |
SUBQUERY | 包含子查询,并且该子查询不会被查询优化器转化成连接查询 |
下图中,先看id值,id值为2的语句先执行,id值为1的有两个,说明是连接查询,注意:<subquery2>
最后的2表示这个表是由id值为2的语句衍生出来的表
mysql> explain select * from account where id in (select account_id from user where id > 9900); +----+--------------+-------------+------------+--------+------------------------+---------------------+---------+-----------------+--------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+--------------+-------------+------------+--------+------------------------+---------------------+---------+-----------------+--------+----------+-------------+ | 1 | SIMPLE | account | NULL | ALL | PRIMARY | NULL | NULL | NULL | 990675 | 100.00 | NULL | | 1 | SIMPLE | <subquery2> | NULL | eq_ref | <auto_distinct_key> | <auto_distinct_key> | 5 | test.account.id | 1 | 100.00 | NULL | | 2 | MATERIALIZED | user | NULL | range | PRIMARY,idx_account_id | PRIMARY | 4 | NULL | 498167 | 100.00 | Using where | +----+--------------+-------------+------------+--------+------------------------+---------------------+---------+-----------------+--------+----------+-------------+ 3 rows in set, 1 warning (0.00 sec)
type
const
如下图,const表示通过索引一次就找到了,当主键或者唯一二级索引与常量进行等值匹配时,type为const
mysql> explain select * from account where id = 16; +----+-------------+---------+------------+-------+---------------+---------+---------+-------+------+----------+-------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------+------------+-------+---------------+---------+---------+-------+------+----------+-------+ | 1 | SIMPLE | account | NULL | const | PRIMARY | PRIMARY | 4 | const | 1 | 100.00 | NULL | +----+-------------+---------+------------+-------+---------------+---------+---------+-------+------+----------+-------+ 1 row in set, 1 warning (0.02 sec)
eq_ref
如下图,eq_ref常见于主键或者唯一二级索引扫描,下图被驱动表扫描主键
mysql> explain select * from account a inner join user u on a.id = u.account_id; +----+-------------+-------+------------+--------+---------------+---------+---------+-------------------+------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+--------+---------------+---------+---------+-------------------+------+----------+-------------+ | 1 | SIMPLE | u | NULL | ALL | account_id | NULL | NULL | NULL | 9851 | 100.00 | Using where | | 1 | SIMPLE | a | NULL | eq_ref | PRIMARY | PRIMARY | 4 | test.u.account_id | 1 | 100.00 | NULL | +----+-------------+-------+------------+--------+---------------+---------+---------+-------------------+------+----------+-------------+ 2 rows in set, 1 warning (0.01 sec)
ref
如下图,ref常见于普通二级索引与常量进行等值匹配
mysql> explain select * from account where password = 'a4c799db-d81d-11eb-9e9b-54ce0dd3fc76'; +----+-------------+---------+------------+------+---------------+--------------+---------+-------+------+----------+-------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------+------------+------+---------------+--------------+---------+-------+------+----------+-------+ | 1 | SIMPLE | account | NULL | ref | idx_password | idx_password | 403 | const | 1 | 100.00 | NULL | +----+-------------+---------+------------+------+---------------+--------------+---------+-------+------+----------+-------+ 1 row in set, 1 warning (0.00 sec)
ref_or_null
如下图,ref_or_null常见于在ref的基础上,新增可为null的查询
mysql> explain select * from account where password = 'a4c799db-d81d-11eb-9e9b-54ce0dd3fc76' or password is null; +----+-------------+---------+------------+-------------+---------------+--------------+---------+-------+------+----------+-----------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------+------------+-------------+---------------+--------------+---------+-------+------+----------+-----------------------+ | 1 | SIMPLE | account | NULL | ref_or_null | idx_password | idx_password | 403 | const | 2 | 100.00 | Using index condition | +----+-------------+---------+------------+-------------+---------------+--------------+---------+-------+------+----------+-----------------------+ 1 row in set, 1 warning (0.01 sec)
range
范围查询,较为简单
index和All
如下图,index常见于索引树全表扫描,与All的聚簇索引全表扫描不同,index所扫描的文件较聚簇索引文件小
mysql> explain select mobile from account where mobile_area_code = '86'; +----+-------------+---------+------------+-------+---------------+--------------+---------+------+------+----------+--------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------+------------+-------+---------------+--------------+---------+------+------+----------+--------------------------+ | 1 | SIMPLE | account | NULL | index | NULL | idx_key_part | 470 | NULL | 9986 | 10.00 | Using where; Using index | +----+-------------+---------+------------+-------+---------------+--------------+---------+------+------+----------+--------------------------+ 1 row in set, 1 warning (0.00 sec)
possible_keys和key
如下图,可能用到的索引为idx_password,idx_key_part,实际用到的索引为idx_password,用什么索引是mysql的查询优化器经过成本计算后决定的,所以数据库中没有用到的索引需要删除,不然查询优化器计算成本需要更多的时间
mysql> explain select * from account where mobile = 'a4c4a5fe-d81d-11eb-9e9b-54ce0dd3fc76' and password = 'a4c4a5ff-d81d-11eb-9e9b-54ce0dd3fc76'; +----+-------------+---------+------------+------+---------------------------+--------------+---------+-------+------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------+------------+------+---------------------------+--------------+---------+-------+------+----------+-------------+ | 1 | SIMPLE | account | NULL | ref | idx_password,idx_key_part | idx_password | 403 | const | 1 | 5.00 | Using where | +----+-------------+---------+------------+------+---------------------------+--------------+---------+-------+------+----------+-------------+ 1 row in set, 1 warning (0.00 sec)
ref
如下图,在访问方法为eq_ref时,与索引列做等值匹配的是user表的account_id
mysql> explain select * from account a inner join user u on a.id = u.account_id; +----+-------------+-------+------------+--------+---------------+---------+---------+-------------------+------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+--------+---------------+---------+---------+-------------------+------+----------+-------------+ | 1 | SIMPLE | u | NULL | ALL | account_id | NULL | NULL | NULL | 9851 | 100.00 | Using where | | 1 | SIMPLE | a | NULL | eq_ref | PRIMARY | PRIMARY | 4 | test.u.account_id | 1 | 100.00 | NULL | +----+-------------+-------+------------+--------+---------------+---------+---------+-------------------+------+----------+-------------+ 2 rows in set, 1 warning (0.00 sec)
如下图,等值匹配的也可以是一个函数
mysql> explain select * from account a inner join user u on a.id = upper(u.account_id); +----+-------------+-------+------------+--------+---------------+---------+---------+------+------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+--------+---------------+---------+---------+------+------+----------+-------------+ | 1 | SIMPLE | u | NULL | ALL | NULL | NULL | NULL | NULL | 9851 | 100.00 | NULL | | 1 | SIMPLE | a | NULL | eq_ref | PRIMARY | PRIMARY | 4 | func | 1 | 100.00 | Using where | +----+-------------+-------+------------+--------+---------------+---------+---------+------+------+----------+-------------+ 2 rows in set, 1 warning (0.00 sec)
rows和filtered
如下图,rows表示此次查询预计需要扫描的行数,filter表示预计需要扫描的行数中有多少行符合我们的条件,下图中9986条都符合我们的条件,所以filtered值应该越高越好
平时联表查询的时候,我们应该多关注filtered值,filtered越高,说明驱动表扫描出来的行数大部分都是有用的,回表的次数就会变少,语句才会更高效
mysql> explain select * from account; +----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------+ | 1 | SIMPLE | account | NULL | ALL | NULL | NULL | NULL | NULL | 9986 | 100.00 | NULL | +----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+-------+ 1 row in set, 1 warning (0.00 sec)
Extra
Using where
如下图,使用全表扫描来执行语句,where条件中有针对某个字段的查询
mysql> explain select * from user where nick_name = 'a'; +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+ | 1 | SIMPLE | user | NULL | ALL | NULL | NULL | NULL | NULL | 9851 | 10.00 | Using where | +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+ 1 row in set, 1 warning (0.00 sec)
如下图,使用某个索引来执行语句,where条件中包含了该索引列不存在的字段
mysql> explain select * from account where mobile = 'a4c4a5fe-d81d-11eb-9e9b-54ce0dd3fc76' and password = 'a4c4a5ff-d81d-11eb-9e9b-54ce0dd3fc76'; +----+-------------+---------+------------+------+---------------------------+--------------+---------+-------+------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------+------------+------+---------------------------+--------------+---------+-------+------+----------+-------------+ | 1 | SIMPLE | account | NULL | ref | idx_password,idx_key_part | idx_password | 403 | const | 1 | 5.00 | Using where | +----+-------------+---------+------------+------+---------------------------+--------------+---------+-------+------+----------+-------------+ 1 row in set, 1 warning (0.00 sec)
Using filesort
如下图,filesort表示在没有使用索引排序的情况下,在内存中(记录较少)或者在磁盘中(记录较多)进行排序
mysql> explain select * from user order by nick_name desc limit 10; +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------+ | 1 | SIMPLE | user | NULL | ALL | NULL | NULL | NULL | NULL | 9851 | 100.00 | Using filesort | +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------+ 1 row in set, 1 warning (0.00 sec)
Using temporary
如下图,mysql在无法使用索引的情况下,需要使用内部的临时表来执行语句,值得注意的是,以下语句extra中还有Using filesort,这是因为mysql会在group by后面加上order by
mysql> explain select nick_name, count(*) from user group by nick_name; +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+---------------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+---------------------------------+ | 1 | SIMPLE | user | NULL | ALL | NULL | NULL | NULL | NULL | 9851 | 100.00 | Using temporary; Using filesort | +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+---------------------------------+ 1 row in set, 1 warning (0.00 sec)
Using index
如下图,语句中查询的字段某个索引可以满足,不需要回表
mysql> explain select mobile from account where mobile = 'a4c4a5fe-d81d-11eb-9e9b-54ce0dd3fc76'; +----+-------------+---------+------------+------+---------------+--------------+---------+-------+------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------+------------+------+---------------+--------------+---------+-------+------+----------+-------------+ | 1 | SIMPLE | account | NULL | ref | idx_key_part | idx_key_part | 403 | const | 1 | 100.00 | Using index | +----+-------------+---------+------------+------+---------------+--------------+---------+-------+------+----------+-------------+ 1 row in set, 1 warning (0.00 sec)
Using index condition
没有找到合适的例子
Using join buffer
如下图,没有添加索引联表查询的情况下,会使用join buffer来提高查询速度
mysql> explain select * from account a inner join user u on u.nick_name = a.password; +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+ | 1 | SIMPLE | a | NULL | ALL | NULL | NULL | NULL | NULL | 9047 | 100.00 | NULL | | 1 | SIMPLE | u | NULL | ALL | NULL | NULL | NULL | NULL | 9851 | 10.00 | Using where; Using join buffer (Block Nested Loop) | +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+ 2 rows in set, 1 warning (0.00 sec)
常见的sql优化
加索引
大部分慢sql是没有加索引或索引没有加对导致,网上关于怎么加索引的文章非常多,我就不赘述了,只提以下两点
深度分页
业务上比较常见的问题,列表翻页到比较大的页面时接口会越来越慢,sql常会变成这样
select * from account order by password limit 5000, 10;
这时可用下面这条语句优化
select * from account where id >= (select id from account order by password limit 5000, 1) limit 10;
explain一下,可以看到第一条语句全表扫描了
mysql> explain select * from account order by password limit 5000, 10; +----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+----------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+----------------+ | 1 | SIMPLE | account | NULL | ALL | NULL | NULL | NULL | NULL | 9975 | 100.00 | Using filesort | +----+-------------+---------+------------+------+---------------+------+---------+------+------+----------+----------------+
那为什么使用第二条语句就能加快查询呢?
第一条语句排序所有记录,其实我们不需要这么多记录,第二条语句只拿了我们需要的记录,时间自然就少了
分页查询还有一个容易踩的坑,用于排序的字段必须是唯一的,使用主键或者唯一的业务单号,如果排序的字段不唯一(比如创建时间),分页的过程会出现少处理或者多处理数据,严重的话会陷在死循环中出不来
强制索引
我们应该经常会遇到sql语句走错索引的情况,比如下面这条sql,本来是走idx_account_id这个索引的,最终mysql决定走主键索引
当然,这只是一个例子,并不是说下面这条sql走idx_account_id索引更快
mysql> explain select * from user where account_id between 1000 and 10002000 order by id desc limit 10; +----+-------------+-------+------------+-------+----------------+---------+---------+------+------+----------+----------------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+-------+----------------+---------+---------+------+------+----------+----------------------------------+ | 1 | SIMPLE | user | NULL | index | idx_account_id | PRIMARY | 4 | NULL | 20 | 50.00 | Using where; Backward index scan | +----+-------------+-------+------------+-------+----------------+---------+---------+------+------+----------+----------------------------------+ 1 row in set, 1 warning (0.01 sec)
如果是在测试环境,可以使用optimizer trace 功能查看索引变更的原因,首先需要开启optimizer trace
SET optimizer_trace="enabled=on"; select * from information_schema.OPTIMIZER_TRACE;
比如上面的sql就是因为排序的原因导致索引变更
{ "reconsidering_access_paths_for_index_ordering": { "clause": "ORDER BY", "steps": [ ], "index_order_summary": { "table": "`user`", "index_provides_order": true, "order_direction": "desc", "index": "PRIMARY", "plan_changed": true, "access_type": "index" } } }
线上一般不会开启optimizer trace 功能,测试环境的数据量又不够,所以这种分析方式比较局限
在平时的业务开发中,本人偶尔会遇到这种情况,A索引明明更快,结果走了B索引,如果是明确知道B索引更快,可以使用下面的语句强制走B索引
select * from user force index(idx_account_id) where account_id between 1000 and 10002000 order by id desc limit 10; select /*+ INDEX(user idx_account_id) */ * from user where account_id between 1000 and 10002000 order by id desc limit 10;
强制索引的弊端
巧用union
业务开发中常有获取不同状态位数据,然后根据创建时间排序的需求,形式大概如下
select * from account where mobile_area_code in ('86', '96') order by mobile_area_code, create_time desc limit 10;
explain一下,发现全表扫描了,没有用上索引
mysql> explain select * from account where mobile_area_code in ('86', '96') order by mobile_area_code, create_time desc limit 10; +----+-------------+---------+------------+------+----------------------------------+------+---------+------+--------+----------+-----------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------+------------+------+----------------------------------+------+---------+------+--------+----------+-----------------------------+ | 1 | SIMPLE | account | NULL | ALL | idx_mobile_area_code_create_time | NULL | NULL | NULL | 991309 | 100.00 | Using where; Using filesort | +----+-------------+---------+------------+------+----------------------------------+------+---------+------+--------+----------+-----------------------------+ 1 row in set, 1 warning (0.02 sec)
为了用上idx_mobile_area_code_create_time组合索引,需先固定mobile_area_code,对二级索引进行反向扫描,得出以下语句,用上了索引,执行速度也快了许多
select * from ( (select * from account where mobile_area_code = '86' order by mobile_area_code, create_time desc limit 10) union all (select * from account where mobile_area_code = '96' order by mobile_area_code, create_time desc limit 10) ) t order by mobile_area_code, create_time desc limit 10
巧用虚拟列
还是上面的account表,如果想要统计7月1日10点到10日10点,创建时间在8点和20点的国内账号,按小时汇总
大部分的同学会这样写
select DATE_FORMAT(create_time, '%H') hour, count(1) cnt from account where mobile_area_code= '86' and create_time >= '2023-07-01 10:00:00' and create_time <= '2023-07-10 10:00:00' GROUP BY hour HAVING hour in ('08','20')
explain,用上了临时表
mysql> explain select DATE_FORMAT(create_time, '%H') hour ,count(1) cnt_order from account where mobile_area_code= '86' and create_time >= '2023-07-01 10:00:00' and create_time <= '2023-07-10 10:00:00' GROUP BY hour HAVING hour in ('08','20'); +----+-------------+---------+------------+-------+----------------------------------+----------------------------------+---------+------+------+----------+-------------------------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------+------------+-------+----------------------------------+----------------------------------+---------+------+------+----------+-------------------------------------------+ | 1 | SIMPLE | account | NULL | range | idx_mobile_area_code_create_time | idx_mobile_area_code_create_time | 72 | NULL | 6481 | 100.00 | Using where; Using index; Using temporary | +----+-------------+---------+------------+-------+----------------------------------+----------------------------------+---------+------+------+----------+-------------------------------------------+ 1 row in set, 1 warning (0.00 sec)
加个虚拟列和索引
alter table account add column hour_ char(2) GENERATED ALWAYS AS (DATE_FORMAT(create_time, '%H')) VIRTUAL; alter table account add key idx_area_code_hour_create_time_4(mobile_area_code, hour_, create_time);
改写sql,用上虚拟列索引
select hour_ hour,count(1) cnt from account where hour_ in ('08','20') and mobile_area_code = '86' and create_time >= '2023-07-01 10:00:00' and create_time <= '2023-07-10 10:00:00' group by hour_;
explain一下,临时表没了,性能提升了一点
mysql> explain select hour_ hour,count(1) cnt from account where hour_ in ('08','20') and mobile_area_code = '86' and create_time >= '2023-07-01 10:00:00' and create_time <= '2023-07-10 10:00:00' group by hour_; +----+-------------+---------+------------+-------+-------------------------------------------------------------------+----------------------------------+---------+------+------+----------+--------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------+------------+-------+-------------------------------------------------------------------+----------------------------------+---------+------+------+----------+--------------------------+ | 1 | SIMPLE | account | NULL | range | idx_mobile_area_code_create_time,idx_area_code_hour_create_time_4 | idx_area_code_hour_create_time_4 | 81 | NULL | 540 | 100.00 | Using where; Using index | +----+-------------+---------+------------+-------+-------------------------------------------------------------------+----------------------------------+---------+------+------+----------+--------------------------+ 1 row in set, 1 warning (0.00 sec)
模糊转精确
业务开发中模糊查询是较难处理的,但业务上确确实实有这样的需求,很难砍,对数据库的压力又很大,本人所在的业务一般采用下面两种方式
虽然但是,模糊查询的处理跟业务强相关,本人业务的方式不一定适用所有
排序优化
根据极客时间Mysql45讲第16讲的例子,order by工作方式分为两种,全字段排序、rowid排序
CREATE TABLE `t` ( `id` int(11) NOT NULL, `city` varchar(16) NOT NULL, `name` varchar(16) NOT NULL, `age` int(11) NOT NULL, `addr` varchar(128) DEFAULT NULL, PRIMARY KEY (`id`), KEY `city` (`city`) ) ENGINE=InnoDB;
比如下面的查询语句
select city,name,age from t where city='杭州' order by name limit 1000;
根据sort_buffer的大小,如果sort_buffer足够大,会把需要返回的字段全部放进去,根据排序字段排完序之后返回
如果sort_buffer不够大,会把主键id和排序字段放进去,排完序之后再回表,查询到所有业务字段之后再返回
当然,sort_buffer不够大的情况还会使用到文件排序
按照这个原理,在业务上,经常有根据创建时间,更新时间排序的需求,如果根据创建时间来排序,sort_buffer要放多一个字段,有一个比较鸡贼的办法,创建时间的大小跟id是相差无几的,我们使用id来替换创建时间,当然,这个还是有一定的误差的,不同的业务创建时间的取法是不一样的,需要根据业务来评估
代码分治
将一条复杂sql拆分为多条sql,会提高代码复杂度,消耗测试资源,最好在写需求的时候就识别出来,说得容易,人都有惰性,大家懂的