MySQL 8 优化案例:多个IN值的参数优化

2024年 7月 1日 52.6k 0

##

MySQL 8 优化案例:Why Are Queries with Many IN Values More Expensive After Upgrading to MySQL 8.x?

作者毫无保留的用案例告诉我们如何优化多个in值的SQL语句,强推

原文:https://www.percona.com/blog/why-are-queries-with-many-in-values-more-expensive-after-upgrading-to-mysql-8-x/

About the Author:Przemysław Malkowski
Przemek joined Support Team at Percona in August 2012. Before that he spent over five years working for Wikia.com (Quantcast Top 50) as System Administrator where he was a key person responsible for seamless building up MySQL powered database infrastructure. Besides MySQL he worked on maintaining all other parts of LAMP stack, with main focus on automation, monitoring and backups.

Some of our Percona Support customers report performance degradation after a major MySQL upgrade, and there can be many different reasons for this. These days, the most common major upgrade is from MySQL 5.7 (which recently reached EOL) to 8.0, and I am going to emphasize one important case that affects many database instances.

Range optimization problem

Queries having many IN values use Equality Range Optimization in MySQL query optimizer. Let’s say our query looks like this:

SELECT COUNT(*) FROM test.sbtest1 WHERE id IN (10,50,200,...,30822);

There are ten thousand values in the compared list. Running this one under MySQL 5.7 results in the following execution statistics:

mysql57 > source query1.sql
+----------+
| count(*) |
+----------+
|    17433 |
+----------+
1 row in set (0.16 sec)

Slow log entry (truncated for readability):

# Time: 2024-06-23T13:34:05.912909Z
# User@Host: msandbox[msandbox] @ localhost []  Id:     6
# Query_time: 0.161071  Lock_time: 0.021591 Rows_sent: 1  Rows_examined: 17433
SET timestamp=1719149645;
select count(*) from test.sbtest1 where id in (...);

As well as handler stats confirm the index is used properly and optimization worked:

mysql57 > show status like 'ha%';
+----------------------------+-------+
| Variable_name              | Value |
+----------------------------+-------+
| Handler_commit             | 1     |
| Handler_delete             | 0     |
| Handler_discover           | 0     |
| Handler_external_lock      | 2     |
| Handler_mrr_init           | 0     |
| Handler_prepare            | 0     |
| Handler_read_first         | 0     |
| Handler_read_key           | 17433 |
| Handler_read_last          | 0     |
| Handler_read_next          | 0     |
| Handler_read_prev          | 0     |
| Handler_read_rnd           | 0     |
| Handler_read_rnd_next      | 0     |
| Handler_rollback           | 0     |
| Handler_savepoint          | 0     |
| Handler_savepoint_rollback | 0     |
| Handler_update             | 0     |
| Handler_write              | 0     |
+----------------------------+-------+
18 rows in set (0.00 sec)

The query EXPLAIN confirms the range type and index:

mysql57 > source explain1.sql
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: sbtest1
   partitions: NULL
         type: range
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: NULL
         rows: 17433
     filtered: 100.00
        Extra: Using where; Using index
1 row in set, 1 warning (0.06 sec)

However, in our upgraded instance, using the same table, we end up with the following results:

mysql80 > source query1.sql
+----------+
| count(*) |
+----------+
|    17433 |
+----------+
1 row in set, 1 warning (0.38 sec)

mysql80 > show warnings;
*************************** 1. row ***************************
  Level: Warning
   Code: 3170
Message: Memory capacity of 8388608 bytes for 'range_optimizer_max_mem_size' exceeded. Range optimization was not done for this query.
1 row in set (0.00 sec)

So the above warning already informs us about problems, which lead to a much worse optimizer plan as below:

# Time: 2024-06-23T13:44:51.610282Z
# User@Host: msandbox[msandbox] @ localhost []  Id:    18
# Query_time: 0.385067  Lock_time: 0.000004 Rows_sent: 1  Rows_examined: 1200000
SET timestamp=1719150291;
select count(*) from test.sbtest1 where id in (...);

mysql80 > show status like 'ha%';
+----------------------------+---------+
| Variable_name              | Value   |
+----------------------------+---------+
| Handler_commit             | 1       |
| Handler_delete             | 0       |
| Handler_discover           | 0       |
| Handler_external_lock      | 2       |
| Handler_mrr_init           | 0       |
| Handler_prepare            | 0       |
| Handler_read_first         | 1       |
| Handler_read_key           | 1       |
| Handler_read_last          | 0       |
| Handler_read_next          | 1200000 |
| Handler_read_prev          | 0       |
| Handler_read_rnd           | 0       |
| Handler_read_rnd_next      | 0       |
| Handler_rollback           | 0       |
| Handler_savepoint          | 0       |
| Handler_savepoint_rollback | 0       |
| Handler_update             | 0       |
| Handler_write              | 0       |
+----------------------------+---------+
18 rows in set (0.00 sec)

Query plan surprisingly shows a secondary index on a different column, and practically all table rows scanned:

mysql80 > source explain1.sql
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: sbtest1
   partitions: NULL
         type: index
possible_keys: PRIMARY
          key: k
      key_len: 4
          ref: NULL
         rows: 1183608
     filtered: 50.00
        Extra: Using where; Using index
1 row in set, 2 warnings (0.05 sec)

Now, the range optimization has the allowed memory limit defined by the range_optimizer_max_mem_size variable, which is, however, the same for MySQL 5.7 and 8.0 (8MB)! So why doesn’t the very same query against the same table fit in the newer version?
Let’s check the related memory footprint as instrumented in Performance Schema. On 5.7 (obtained after truncating the P_S table and running the query), it is about 5.5 MB:

mysql57 > select * from performance_schema.memory_summary_by_thread_by_event_name where thread_id=(select THREAD_ID from performance_schema.threads where processlist_id=CONNECTION_ID()) and
event_name='memory/sql/test_quick_select'G
*************************** 1. row ***************************
                   THREAD_ID: 32
                  EVENT_NAME: memory/sql/test_quick_select
                 COUNT_ALLOC: 104
                  COUNT_FREE: 104
   SUM_NUMBER_OF_BYTES_ALLOC: 5705856
    SUM_NUMBER_OF_BYTES_FREE: 5705856
              LOW_COUNT_USED: 0
          CURRENT_COUNT_USED: 0
             HIGH_COUNT_USED: 104
    LOW_NUMBER_OF_BYTES_USED: 0
CURRENT_NUMBER_OF_BYTES_USED: 0
   HIGH_NUMBER_OF_BYTES_USED: 5705856
1 row in set (0.00 sec)

But on MySQL 8.0, it is much higher, around 11.5 MB, so higher than the allowed limit of 8 MB:

mysql8.0 > select * from performance_schema.memory_summary_by_thread_by_event_name where thread_id=PS_CURRENT_THREAD_ID() and event_name='memory/sql/test_quick_select'G
*************************** 1. row ***************************
                   THREAD_ID: 47
                  EVENT_NAME: memory/sql/test_quick_select
                 COUNT_ALLOC: 18
                  COUNT_FREE: 18
   SUM_NUMBER_OF_BYTES_ALLOC: 12099576
    SUM_NUMBER_OF_BYTES_FREE: 12099576
              LOW_COUNT_USED: 0
          CURRENT_COUNT_USED: 0
             HIGH_COUNT_USED: 18
    LOW_NUMBER_OF_BYTES_USED: 0
CURRENT_NUMBER_OF_BYTES_USED: 0
   HIGH_NUMBER_OF_BYTES_USED: 12099576
1 row in set (0.00 sec)

Increasing the variable fixes the query plan:

mysql80 > set range_optimizer_max_mem_size=12*1024*1024;
Query OK, 0 rows affected (0.00 sec)

mysql80 > source query1.sql
+----------+
| count(*) |
+----------+
|    17433 |
+----------+
1 row in set (0.10 sec)

In my opinion, this memory requirement difference is unjustified. Therefore I reported the regression here:
https://bugs.mysql.com/bug.php?id=115327

And separately, a misleading documentation bug:
https://bugs.mysql.com/bug.php?id=115062

This regression becomes more impactful the more indexes are on the table, as MySQL 8.0 is affected by this other related bug:
https://bugs.mysql.com/bug.php?id=104000

Summary

Upgrading to MySQL 8.x may be challenging and should be well-tested before implementation to avoid some very bad surprises, like the one above! In the long term, though, it is inevitable, as 5.7 reached EOL. Before this particular regression or any other that affects you is fixed, you may consider using our post-EOL support for the time.

As an addition, let me remind you that we have a tool useful for checking if any query behaves differently on the new version. Here is a quick update that illustrates how the extra warning was spot by it:

$ pt-upgrade h=127.0.0.1,P=5756,u=msandbox,p=***,D=test h=127.0.0.1,P=8055,u=msandbox,p=msandbox,D=test slow57.log

#-----------------------------------------------------------------------
# Logs
#-----------------------------------------------------------------------

File: slow57.log
Size: 141681

#-----------------------------------------------------------------------
# Hosts
#-----------------------------------------------------------------------

host1:

  DSN:       h=127.0.0.1,P=5756
  hostname:  przemek-dbg
  MySQL:     MySQL Community Server (GPL) 5.7.44

host2:

  DSN:       h=127.0.0.1,P=8055
  hostname:  przemek-dbg
  MySQL:     MySQL Community Server - GPL 8.0.37

########################################################################
# Query class F4A5056EC85D02D0
########################################################################

Reporting class because it has diffs, but hasn't been reported yet.

Total queries      1
Unique queries     1
Discarded queries  0

select count(*) from test.sbtest? where id in(?+)

##
## Warning diffs: 1
##

-- 1.

No warning 3170

vs.

   Code: 3170
  Level: Warning
Message: Memory capacity of 8388608 bytes for 'range_optimizer_max_mem_size' exceeded. Range optimization was not done for this query.

select count(*) from test.sbtest1 where id in (90, ... ,13668,15161)

#-----------------------------------------------------------------------
# Stats
#-----------------------------------------------------------------------

failed_queries        0
not_select            0
queries_filtered      0
queries_no_diffs      0
queries_read          1
queries_with_diffs    1
queries_with_errors   0

MySQL Performance Tuning is an essential guide covering the critical aspects of MySQL performance optimization.

相关文章

Oracle如何使用授予和撤销权限的语法和示例
Awesome Project: 探索 MatrixOrigin 云原生分布式数据库
下载丨66页PDF,云和恩墨技术通讯(2024年7月刊)
社区版oceanbase安装
Oracle 导出CSV工具-sqluldr2
ETL数据集成丨快速将MySQL数据迁移至Doris数据库

发布评论