##
MySQL 8 优化案例:Why Are Queries with Many IN Values More Expensive After Upgrading to MySQL 8.x?
作者毫无保留的用案例告诉我们如何优化多个in值的SQL语句,强推
原文:https://www.percona.com/blog/why-are-queries-with-many-in-values-more-expensive-after-upgrading-to-mysql-8-x/
About the Author:Przemysław Malkowski
Przemek joined Support Team at Percona in August 2012. Before that he spent over five years working for Wikia.com (Quantcast Top 50) as System Administrator where he was a key person responsible for seamless building up MySQL powered database infrastructure. Besides MySQL he worked on maintaining all other parts of LAMP stack, with main focus on automation, monitoring and backups.
Some of our Percona Support customers report performance degradation after a major MySQL upgrade, and there can be many different reasons for this. These days, the most common major upgrade is from MySQL 5.7 (which recently reached EOL) to 8.0, and I am going to emphasize one important case that affects many database instances.
Range optimization problem
Queries having many IN values use Equality Range Optimization in MySQL query optimizer. Let’s say our query looks like this:
SELECT COUNT(*) FROM test.sbtest1 WHERE id IN (10,50,200,...,30822);
There are ten thousand values in the compared list. Running this one under MySQL 5.7 results in the following execution statistics:
mysql57 > source query1.sql
+----------+
| count(*) |
+----------+
| 17433 |
+----------+
1 row in set (0.16 sec)
Slow log entry (truncated for readability):
# Time: 2024-06-23T13:34:05.912909Z
# User@Host: msandbox[msandbox] @ localhost [] Id: 6
# Query_time: 0.161071 Lock_time: 0.021591 Rows_sent: 1 Rows_examined: 17433
SET timestamp=1719149645;
select count(*) from test.sbtest1 where id in (...);
As well as handler stats confirm the index is used properly and optimization worked:
mysql57 > show status like 'ha%';
+----------------------------+-------+
| Variable_name | Value |
+----------------------------+-------+
| Handler_commit | 1 |
| Handler_delete | 0 |
| Handler_discover | 0 |
| Handler_external_lock | 2 |
| Handler_mrr_init | 0 |
| Handler_prepare | 0 |
| Handler_read_first | 0 |
| Handler_read_key | 17433 |
| Handler_read_last | 0 |
| Handler_read_next | 0 |
| Handler_read_prev | 0 |
| Handler_read_rnd | 0 |
| Handler_read_rnd_next | 0 |
| Handler_rollback | 0 |
| Handler_savepoint | 0 |
| Handler_savepoint_rollback | 0 |
| Handler_update | 0 |
| Handler_write | 0 |
+----------------------------+-------+
18 rows in set (0.00 sec)
The query EXPLAIN confirms the range type and index:
mysql57 > source explain1.sql
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: sbtest1
partitions: NULL
type: range
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: NULL
rows: 17433
filtered: 100.00
Extra: Using where; Using index
1 row in set, 1 warning (0.06 sec)
However, in our upgraded instance, using the same table, we end up with the following results:
mysql80 > source query1.sql
+----------+
| count(*) |
+----------+
| 17433 |
+----------+
1 row in set, 1 warning (0.38 sec)
mysql80 > show warnings;
*************************** 1. row ***************************
Level: Warning
Code: 3170
Message: Memory capacity of 8388608 bytes for 'range_optimizer_max_mem_size' exceeded. Range optimization was not done for this query.
1 row in set (0.00 sec)
So the above warning already informs us about problems, which lead to a much worse optimizer plan as below:
# Time: 2024-06-23T13:44:51.610282Z
# User@Host: msandbox[msandbox] @ localhost [] Id: 18
# Query_time: 0.385067 Lock_time: 0.000004 Rows_sent: 1 Rows_examined: 1200000
SET timestamp=1719150291;
select count(*) from test.sbtest1 where id in (...);
mysql80 > show status like 'ha%';
+----------------------------+---------+
| Variable_name | Value |
+----------------------------+---------+
| Handler_commit | 1 |
| Handler_delete | 0 |
| Handler_discover | 0 |
| Handler_external_lock | 2 |
| Handler_mrr_init | 0 |
| Handler_prepare | 0 |
| Handler_read_first | 1 |
| Handler_read_key | 1 |
| Handler_read_last | 0 |
| Handler_read_next | 1200000 |
| Handler_read_prev | 0 |
| Handler_read_rnd | 0 |
| Handler_read_rnd_next | 0 |
| Handler_rollback | 0 |
| Handler_savepoint | 0 |
| Handler_savepoint_rollback | 0 |
| Handler_update | 0 |
| Handler_write | 0 |
+----------------------------+---------+
18 rows in set (0.00 sec)
Query plan surprisingly shows a secondary index on a different column, and practically all table rows scanned:
mysql80 > source explain1.sql
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: sbtest1
partitions: NULL
type: index
possible_keys: PRIMARY
key: k
key_len: 4
ref: NULL
rows: 1183608
filtered: 50.00
Extra: Using where; Using index
1 row in set, 2 warnings (0.05 sec)
Now, the range optimization has the allowed memory limit defined by the range_optimizer_max_mem_size
variable, which is, however, the same for MySQL 5.7 and 8.0 (8MB)! So why doesn’t the very same query against the same table fit in the newer version?
Let’s check the related memory footprint as instrumented in Performance Schema. On 5.7 (obtained after truncating the P_S table and running the query), it is about 5.5 MB:
mysql57 > select * from performance_schema.memory_summary_by_thread_by_event_name where thread_id=(select THREAD_ID from performance_schema.threads where processlist_id=CONNECTION_ID()) and
event_name='memory/sql/test_quick_select'G
*************************** 1. row ***************************
THREAD_ID: 32
EVENT_NAME: memory/sql/test_quick_select
COUNT_ALLOC: 104
COUNT_FREE: 104
SUM_NUMBER_OF_BYTES_ALLOC: 5705856
SUM_NUMBER_OF_BYTES_FREE: 5705856
LOW_COUNT_USED: 0
CURRENT_COUNT_USED: 0
HIGH_COUNT_USED: 104
LOW_NUMBER_OF_BYTES_USED: 0
CURRENT_NUMBER_OF_BYTES_USED: 0
HIGH_NUMBER_OF_BYTES_USED: 5705856
1 row in set (0.00 sec)
But on MySQL 8.0, it is much higher, around 11.5 MB, so higher than the allowed limit of 8 MB:
mysql8.0 > select * from performance_schema.memory_summary_by_thread_by_event_name where thread_id=PS_CURRENT_THREAD_ID() and event_name='memory/sql/test_quick_select'G
*************************** 1. row ***************************
THREAD_ID: 47
EVENT_NAME: memory/sql/test_quick_select
COUNT_ALLOC: 18
COUNT_FREE: 18
SUM_NUMBER_OF_BYTES_ALLOC: 12099576
SUM_NUMBER_OF_BYTES_FREE: 12099576
LOW_COUNT_USED: 0
CURRENT_COUNT_USED: 0
HIGH_COUNT_USED: 18
LOW_NUMBER_OF_BYTES_USED: 0
CURRENT_NUMBER_OF_BYTES_USED: 0
HIGH_NUMBER_OF_BYTES_USED: 12099576
1 row in set (0.00 sec)
Increasing the variable fixes the query plan:
mysql80 > set range_optimizer_max_mem_size=12*1024*1024;
Query OK, 0 rows affected (0.00 sec)
mysql80 > source query1.sql
+----------+
| count(*) |
+----------+
| 17433 |
+----------+
1 row in set (0.10 sec)
In my opinion, this memory requirement difference is unjustified. Therefore I reported the regression here:
https://bugs.mysql.com/bug.php?id=115327
And separately, a misleading documentation bug:
https://bugs.mysql.com/bug.php?id=115062
This regression becomes more impactful the more indexes are on the table, as MySQL 8.0 is affected by this other related bug:
https://bugs.mysql.com/bug.php?id=104000
Summary
Upgrading to MySQL 8.x may be challenging and should be well-tested before implementation to avoid some very bad surprises, like the one above! In the long term, though, it is inevitable, as 5.7 reached EOL. Before this particular regression or any other that affects you is fixed, you may consider using our post-EOL support for the time.
As an addition, let me remind you that we have a tool useful for checking if any query behaves differently on the new version. Here is a quick update that illustrates how the extra warning was spot by it:
$ pt-upgrade h=127.0.0.1,P=5756,u=msandbox,p=***,D=test h=127.0.0.1,P=8055,u=msandbox,p=msandbox,D=test slow57.log
#-----------------------------------------------------------------------
# Logs
#-----------------------------------------------------------------------
File: slow57.log
Size: 141681
#-----------------------------------------------------------------------
# Hosts
#-----------------------------------------------------------------------
host1:
DSN: h=127.0.0.1,P=5756
hostname: przemek-dbg
MySQL: MySQL Community Server (GPL) 5.7.44
host2:
DSN: h=127.0.0.1,P=8055
hostname: przemek-dbg
MySQL: MySQL Community Server - GPL 8.0.37
########################################################################
# Query class F4A5056EC85D02D0
########################################################################
Reporting class because it has diffs, but hasn't been reported yet.
Total queries 1
Unique queries 1
Discarded queries 0
select count(*) from test.sbtest? where id in(?+)
##
## Warning diffs: 1
##
-- 1.
No warning 3170
vs.
Code: 3170
Level: Warning
Message: Memory capacity of 8388608 bytes for 'range_optimizer_max_mem_size' exceeded. Range optimization was not done for this query.
select count(*) from test.sbtest1 where id in (90, ... ,13668,15161)
#-----------------------------------------------------------------------
# Stats
#-----------------------------------------------------------------------
failed_queries 0
not_select 0
queries_filtered 0
queries_no_diffs 0
queries_read 1
queries_with_diffs 1
queries_with_errors 0
MySQL Performance Tuning is an essential guide covering the critical aspects of MySQL performance optimization.