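MySQL can crash for many reasons. On the hardware side, bad disk blocks can corrupt pages and faulty memory can cause invalid memory accesses. On the software side, there are bugs in MySQL itself. Diagnosing a crash usually means combining the error log, core files, the application's SQL, table definitions, and other information. Even with all of that, some crashes remain hard to pin down and linger as long-standing unresolved issues.
Today I came across an article on the Percona blog about a MySQL bug. The stack trace in it looked familiar: I believe I have run into it before, and as I recall it was written off as a bad disk block. I am recording it here for reference the next time it shows up.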
Trigger conditions:
- Concurrent modifications of a single record; the statements can be a mix of delete/update/insert
- The bug was reported against version 5.7.29
Error stack:
In our production environment:
The server crashed several times these days:
InnoDB: Assertion failure in thread 47758491551488 in file rem0rec.cc line 586
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.7/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
06:05:59 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
Attempting to collect some information that could help diagnose the problem.
As this is a crash and something is definitely wrong, the information
collection process might fail.
key_buffer_size=1048576
read_buffer_size=131072
max_used_connections=4
max_threads=151
thread_count=4
connection_count=4
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 61035 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
Thread pointer: 0x2b6fe0000b70
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 2b6fa3ec7d00 thread_stack 0x40000
/u01/kongzhi.kz/mysql-servers/mysql-5.7.29/bu-Debug/sql/mysqld(my_print_stacktrace+0x35)[0x18460a6]
/u01/kongzhi.kz/mysql-servers/mysql-5.7.29/bu-Debug/sql/mysqld(handle_fatal_signal+0x3f6)[0xe88063]
/lib64/libpthread.so.0(+0xf6d0)[0x2b6f998966d0]
/lib64/libc.so.6(gsignal+0x37)[0x2b6f9b211277]
/lib64/libc.so.6(abort+0x148)[0x2b6f9b212968]
/u01/kongzhi.kz/mysql-servers/mysql-5.7.29/bu-Debug/sql/mysqld[0x1ac9962]
/u01/kongzhi.kz/mysql-servers/mysql-5.7.29/bu-Debug/sql/mysqld(_Z20rec_get_offsets_funcPKhPK12dict_index_tPmmPKcmPP16mem_block_info_t+0x121)[0x19a36a3]
/u01/kongzhi.kz/mysql-servers/mysql-5.7.29/bu-Debug/sql/mysqld(_Z15row_search_mvccPh15page_cur_mode_tP14row_prebuilt_tmm+0x16f3)[0x1a281ff]
/u01/kongzhi.kz/mysql-servers/mysql-5.7.29/bu-Debug/sql/mysqld(_ZN11ha_innobase10index_readEPhPKhj16ha_rkey_function+0x43e)[0x18985b4]
/u01/kongzhi.kz/mysql-servers/mysql-5.7.29/bu-Debug/sql/mysqld(_ZN7handler14index_read_mapEPhPKhm16ha_rkey_function+0x64)[0xf137c8]
/u01/kongzhi.kz/mysql-servers/mysql-5.7.29/bu-Debug/sql/mysqld(_ZN7handler17ha_index_read_mapEPhPKhm16ha_rkey_function+0x1b0)[0xf053e4]
/u01/kongzhi.kz/mysql-servers/mysql-5.7.29/bu-Debug/sql/mysqld(_ZN7handler16read_range_firstEPK12st_key_rangeS2_bb+0xed)[0xf0ef3b]
/u01/kongzhi.kz/mysql-servers/mysql-5.7.29/bu-Debug/sql/mysqld(_ZN7handler21multi_range_read_nextEPPc+0x14f)[0xf0ce7d]
/u01/kongzhi.kz/mysql-servers/mysql-5.7.29/bu-Debug/sql/mysqld(_ZN10DsMrr_impl10dsmrr_nextEPPc+0x3a)[0xf0dd38]
/u01/kongzhi.kz/mysql-servers/mysql-5.7.29/bu-Debug/sql/mysqld(_ZN11ha_innobase21multi_range_read_nextEPPc+0x2a)[0x18aa52e]
/u01/kongzhi.kz/mysql-servers/mysql-5.7.29/bu-Debug/sql/mysqld(_ZN18QUICK_RANGE_SELECT8get_nextEv+0xa8)[0x16f1b88]
/u01/kongzhi.kz/mysql-servers/mysql-5.7.29/bu-Debug/sql/mysqld[0x1431816]
/u01/kongzhi.kz/mysql-servers/mysql-5.7.29/bu-Debug/sql/mysqld(_Z12mysql_updateP3THDR4ListI4ItemES4_y15enum_duplicatesPyS6_+0x17f4)[0x15c04cd]
/u01/kongzhi.kz/mysql-servers/mysql-5.7.29/bu-Debug/sql/mysqld(_ZN14Sql_cmd_update23try_single_table_updateEP3THDPb+0x25b)[0x15c6de7]
/u01/kongzhi.kz/mysql-servers/mysql-5.7.29/bu-Debug/sql/mysqld(_ZN14Sql_cmd_update7executeEP3THD+0x93)[0x15c7353]
/u01/kongzhi.kz/mysql-servers/mysql-5.7.29/bu-Debug/sql/mysqld(_Z21mysql_execute_commandP3THDb+0x2b5d)[0x150e6a0]
/u01/kongzhi.kz/mysql-servers/mysql-5.7.29/bu-Debug/sql/mysqld(_Z11mysql_parseP3THDP12Parser_state+0x639)[0x1513ddd]
/u01/kongzhi.kz/mysql-servers/mysql-5.7.29/bu-Debug/sql/mysqld(_Z16dispatch_commandP3THDPK8COM_DATA19enum_server_command+0xc9a)[0x15096a5]
/u01/kongzhi.kz/mysql-servers/mysql-5.7.29/bu-Debug/sql/mysqld(_Z10do_commandP3THD+0x4b2)[0x150851d]
/u01/kongzhi.kz/mysql-servers/mysql-5.7.29/bu-Debug/sql/mysqld(handle_connection+0x1e0)[0x1639004]
/u01/kongzhi.kz/mysql-servers/mysql-5.7.29/bu-Debug/sql/mysqld(pfs_spawn_thread+0x170)[0x1cc171c]
/lib64/libpthread.so.0(+0x7e25)[0x2b6f9988ee25]
/lib64/libc.so.6(clone+0x6d)[0x2b6f9b2d9bad]
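Since this crash is easy to mistake for disk corruption, it can help to compare a new error log against the trace above mechanically. The following is a minimal Python sketch that pulls the assertion location, the signal number, and the symbol names out of an error log. The function names `crash_signature` and `looks_like_this_crash` are hypothetical helpers, and the matching rule (assertion in `rem0rec.cc` with `rec_get_offsets_func` and `row_search_mvcc` on the stack) is only a heuristic taken from the trace above. A match does not rule out real corruption, and the line number is not compared because it may differ between builds.

```python
import re

# Patterns based on the log lines shown above.
ASSERT_RE = re.compile(r"Assertion failure in thread (\d+) in file (\S+) line (\d+)")
SIGNAL_RE = re.compile(r"mysqld got signal (\d+)")
# Matches "(symbol+0xOFFSET)[0xADDR]"; frames without a symbol name are skipped.
FRAME_RE = re.compile(r"\(([A-Za-z_][A-Za-z0-9_]*)\+0x[0-9a-fA-F]+\)\[0x[0-9a-fA-F]+\]")

def crash_signature(log_text):
    """Extract the assertion file/line, signal number, and frame symbols."""
    assertion = ASSERT_RE.search(log_text)
    signal = SIGNAL_RE.search(log_text)
    return {
        "file": assertion.group(2) if assertion else None,
        "line": int(assertion.group(3)) if assertion else None,
        "signal": int(signal.group(1)) if signal else None,
        "frames": FRAME_RE.findall(log_text),  # mangled names, in log order
    }

def looks_like_this_crash(sig):
    """Heuristic only: same source file and the same two InnoDB frames."""
    frames = sig["frames"]
    return (
        sig["file"] == "rem0rec.cc"
        and any("rec_get_offsets_func" in f for f in frames)
        and any("row_search_mvcc" in f for f in frames)
    )

# A shortened excerpt of the log above, used as sample input.
SAMPLE = """\
InnoDB: Assertion failure in thread 47758491551488 in file rem0rec.cc line 586
06:05:59 UTC - mysqld got signal 6 ;
/lib64/libc.so.6(abort+0x148)[0x2b6f9b212968]
mysqld(_Z20rec_get_offsets_funcPKhPK12dict_index_tPmmPKcmPP16mem_block_info_t+0x121)[0x19a36a3]
mysqld(_Z15row_search_mvccPh15page_cur_mode_tP14row_prebuilt_tmm+0x16f3)[0x1a281ff]
"""

if __name__ == "__main__":
    sig = crash_signature(SAMPLE)
    print(sig["file"], sig["line"], sig["signal"])
    print(looks_like_this_crash(sig))
```

The symbols are left mangled on purpose so the sketch needs nothing beyond the standard library; a substring test on the mangled name is enough for this comparison.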
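Root cause analysis:
In the source code, when a single record is updated concurrently, the record that prev_rec points to may be invalid, and a later thread cannot lock that record. If the page holding the record is then modified by the purge thread, and the record is the last one on that page, the crash can occur.
Bug reports: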
https://bugs.mysql.com/bug.php?id=99286
https://jira.percona.com/browse/PS-7163