作者简介:高鹏,笔名八怪。《深入理解MySQL主从原理》图书作者,同时运营个人公众号“MySQL学习”,持续分享遇到的有趣case以及代码解析!
一、问题重现
版本5.7.11,我们首先设置参数:
- max_connections = 20
- table_open_cache = 20
- table_definition_cache=10
- open_files_limit=300
- innodb_open_files=300
- OS中设置 hard nofile 300 soft nofile 300
这样重启后我们的数据库中open_files_limit设置为300如下:
mysql> show variables like '%open_files_limit%';
+------------------+-------+
| Variable_name | Value |
+------------------+-------+
| open_files_limit | 300 |
+------------------+-------+
1 row in set (0.00 sec)
接下来我们建立3个大约80分区的表,注意建立期间可能要重启一次数据库,释放一次open table cache/open table share,接下来我们连续打开3个表当打开最后一个表的时候如下:
mysql> select * from testpar1;
ERROR 2013 (HY000): Lost connection to MySQL server during query
查看日志如下:
2020-10-16T22:36:41.736524-05:00 2 [ERROR] InnoDB: Operating system error number 24 in a file operation.
2020-10-16T22:36:41.736527-05:00 2 [ERROR] InnoDB: Error number 24 means 'Too many open files'
2020-10-16T22:36:41.736529-05:00 2 [Note] InnoDB: Some operating system error numbers are described at http://dev.mysql.com/doc/refman/5.7/en/operating-system-error-codes.html
2020-10-16T22:36:41.736531-05:00 2 [Warning] InnoDB: Cannot open './test/testpar1#P#p20190528.ibd'. Have you deleted .ibd files under a running mysqld server?
2020-10-16T22:36:41.736534-05:00 2 [ERROR] InnoDB: Trying to do I/O to a tablespace which exists without .ibd data file. I/O type: read, page: [page id: space=74, page number=3], I/O length: 16384 bytes
2020-10-16T22:36:41.736537-05:00 2 [ERROR] InnoDB: trying to read page [page id: space=74, page number=3] in nonexisting or being-dropped tablespace
2020-10-16T22:36:41.736541-05:00 2 [ERROR] [FATAL] InnoDB: Unable to read page [page id: space=74, page number=3] into the buffer pool after 100 attempts. The most probable cause of this error may be that the table has been corrupted. Or, the table was compressed with with an algorithm that is not supported by this instance. If it is not a decompress failure, you can try to fix this problem by using innodb_force_recovery. Please see http://dev.mysql.com/doc/refman/5.7/en/ for more details. Aborting...
2020-10-16 22:36:41 0x7f8b751ab700 InnoDB: Assertion failure in thread 140236941866752 in file ut0ut.cc line 920
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.7/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
02:36:41 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
Attempting to collect some information that could help diagnose the problem.
As this is a crash and something is definitely wrong, the information
collection process might fail.
key_buffer_size=2147483648
read_buffer_size=131072
max_used_connections=1
max_threads=20
thread_count=1
connection_count=1
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 2120458 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
Thread pointer: 0x7f8b38000ae0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7f8b751aaea8 thread_stack 0x40000
/opt/mysql/mysql3320/bin/mysqld(my_print_stacktrace+0x35)[0xf45595]
/opt/mysql/mysql3320/bin/mysqld(handle_fatal_signal+0x4a4)[0x77fd34]
/lib64/libpthread.so.0(+0xf630)[0x7f8c7bbea630]
/lib64/libc.so.6(gsignal+0x37)[0x7f8c7a76a387]
/lib64/libc.so.6(abort+0x148)[0x7f8c7a76ba78]
/opt/mysql/mysql3320/bin/mysqld[0x10dd465]
/opt/mysql/mysql3320/bin/mysqld(_ZN2ib5fatalD1Ev+0xb3)[0x10e2b33]
/opt/mysql/mysql3320/bin/mysqld(_Z16buf_page_get_genRK9page_id_tRK11page_size_tmP11buf_block_tmPKcmP5mtr_tb+0xf56)[0x11267c6]
/opt/mysql/mysql3320/bin/mysqld(_Z31btr_cur_open_at_index_side_funcbP12dict_index_tmP9btr_cur_tmPKcmP5mtr_t+0x3a5)[0x1103405]
/opt/mysql/mysql3320/bin/mysqld[0x1062d7b]
/opt/mysql/mysql3320/bin/mysqld(_Z15row_search_mvccPh15page_cur_mode_tP14row_prebuilt_tmm+0x1f62)[0x106d4f2]
/opt/mysql/mysql3320/bin/mysqld(_ZN11ha_innobase10index_readEPhPKhj16ha_rkey_function+0x247)[0xf7c687]
/opt/mysql/mysql3320/bin/mysqld(_ZN11ha_innobase11index_firstEPh+0x39)[0xf6ccf9]
/opt/mysql/mysql3320/bin/mysqld(_ZN11ha_innopart19index_first_in_partEjPh+0x2d)[0xf8fa7d]
/opt/mysql/mysql3320/bin/mysqld(_ZN16Partition_helper36handle_unordered_scan_next_partitionEPh+0x182)[0xc6d6e2]
/opt/mysql/mysql3320/bin/mysqld(_ZN7handler14ha_index_firstEPh+0x127)[0x7c8787]
/opt/mysql/mysql3320/bin/mysqld(_Z15join_read_firstP7QEP_TAB+0x6f)[0xce6ddf]
/opt/mysql/mysql3320/bin/mysqld(_Z10sub_selectP4JOINP7QEP_TABb+0x2ba)[0xce7cca]
/opt/mysql/mysql3320/bin/mysqld(_ZN4JOIN4execEv+0x28a)[0xce622a]
/opt/mysql/mysql3320/bin/mysqld(_Z12handle_queryP3THDP3LEXP12Query_resultyy+0x250)[0xd51d00]
/opt/mysql/mysql3320/bin/mysqld[0xd12973]
/opt/mysql/mysql3320/bin/mysqld(_Z21mysql_execute_commandP3THDb+0x3325)[0xd16605]
/opt/mysql/mysql3320/bin/mysqld(_Z11mysql_parseP3THDP12Parser_state+0x3ad)[0xd1851d]
/opt/mysql/mysql3320/bin/mysqld(_Z16dispatch_commandP3THDPK8COM_DATA19enum_server_command+0x117d)[0xd196fd]
/opt/mysql/mysql3320/bin/mysqld(_Z10do_commandP3THD+0x194)[0xd1a1f4]
/opt/mysql/mysql3320/bin/mysqld(handle_connection+0x29c)[0xde605c]
/opt/mysql/mysql3320/bin/mysqld(pfs_spawn_thread+0x174)[0x1213184]
/lib64/libpthread.so.0(+0x7ea5)[0x7f8c7bbe2ea5]
/lib64/libc.so.6(clone+0x6d)[0x7f8c7a8328dd]
我们注意日志中的error number 24 ,这是就是Linux errno
- EMFILE 24 /* Too many open files */
二、5.7中open_files_limit 设置规则
在MySQL的官方文档中有如下说明:
The effective open_files_limit value is based on the value specified at system startup (if any) and
the values of max_connections and table_open_cache, using these formulas:
• 10 + max_connections + (table_open_cache * 2)
• max_connections * 5
• Operating system limit if that limit is positive but not Infinity
• If operating system limit is Infinity: open_files_limit value if specified at startup, 5000 if not
The server attempts to obtain the number of file descriptors using the maximum of those values. If that many descriptors cannot be obtained, the server attempts to obtain as many as the system will permit.The effective value is 0 on systems where MySQL cannot change the number of open files.
On Unix, the value cannot be set greater than ulimit -n.
也就是说参数的设置规则来自4个方面:
- 10 + max_connections + (table_open_cache * 2)
- max_connections * 5
- 操作系统设置的open files
- 参数open_files_limit的设置,默认值为5000
对于如上4个值会取其中的最大值,获取操作系统open files设置的时候使用的是getrlimit函数获取,设置则使用的是setrlimit函数,这个比较和设置函数如下:
static uint set_max_open_files(uint max_file_limit) my_set_max_open_files
{
struct rlimit rlimit;
uint old_cur;
DBUG_ENTER("set_max_open_files");
DBUG_PRINT("enter",("files: %u", max_file_limit));
if (!getrlimit(RLIMIT_NOFILE,&rlimit)) //获取OS设置的 nofile值
{
old_cur= (uint) rlimit.rlim_cur;
DBUG_PRINT("info", ("rlim_cur: %u rlim_max: %u",
(uint) rlimit.rlim_cur,
(uint) rlimit.rlim_max));
if (rlimit.rlim_cur == (rlim_t) RLIM_INFINITY)
rlimit.rlim_cur = max_file_limit;
if (rlimit.rlim_cur >= max_file_limit) //如果OS设置大于 调整的参数 那么以os设置为准
DBUG_RETURN(rlimit.rlim_cur); /* purecov: inspected */
rlimit.rlim_cur= rlimit.rlim_max= max_file_limit;
if (setrlimit(RLIMIT_NOFILE, &rlimit)) //如果OS值小于 则进行调整设置为 调整的值
max_file_limit= old_cur; /* Use original value */
else//成功
{
rlimit.rlim_cur= 0; /* Safety if next call fails */
(void) getrlimit(RLIMIT_NOFILE,&rlimit);
DBUG_PRINT("info", ("rlim_cur: %u", (uint) rlimit.rlim_cur));
if (rlimit.rlim_cur) /* If call didn't fail */
max_file_limit= (uint) rlimit.rlim_cur;
}
}
DBUG_PRINT("exit",("max_file_limit: %u", max_file_limit));
DBUG_RETURN(max_file_limit);
}
对于innodb_open_files来说不能大于open_files_limit的设置,这段代码如下:
if (innobase_open_files < 10) { //不能小于10
innobase_open_files = 300;
if (srv_file_per_table && table_cache_size > 300) {
innobase_open_files = table_cache_size;
}
}
if (innobase_open_files > (long) open_files_limit) { //不能超过open_files_limit的大小
ib::warn() < (long) table_cache_size) {
innobase_open_files = table_cache_size;
}
}
但是值得注意的是如果我们使用mysqld_safe来启动数据库,那么mysqld_safe也会通过my_print_defaults进行参数解析获取其中的open_files_limit,同时调用ulimit -n 进行设置如下:
+ arg=--open-files-limit=300
+ case "$arg" in
+ open_files=300
...
+ ulimit -n 300
因为mysqld_safe和mysqld属于父子进程,那么mysqld复制了其中的PCB那么资源使用上限和mysqld_safe一致,比如这里mysqld的file limit也是300。并且我们需要注意的另外一点,如果mysqld_safe不重启而修改了参数文件中open_files_limit,光是重启了mysqld比如kill 发起杀掉mysqld,起来后open_file_limits也不会更改,因为默认mysqld_safe解析了参数文件的open_files_limit会带入启动项目(--open-files-limit=65535),但是mysqld_safe并没有重启,值还是老的,并且会覆盖mysqld重新读取的参数文件设置。我们可以简单测试一下如下:
- mytest简单的获取当前进程的file limit
#include
#include
#include
#include
int main(void){
struct rlimit rlimit;
getrlimit(RLIMIT_NOFILE,&rlimit);
printf("limit files soft limit is %ld ,hard limit is %ld\n",(uint) rlimit.rlim_cur,(uint) rlimit.rlim_max);
sleep(1000);
}
- mytest.sh 设置ulimit -n 调用mytest,模拟mysqld_safe调用mysqld
#!/bin/bash
ulimit -n 300
/opt/mysql/mysql3306/install/support-files/mytest
测试如下
- 当前OS设置open file limit是30000,直接跑mytest:
[root@mgr1 support-files]# ulimit -a |grep file
core file size (blocks, -c) unlimited
file size (blocks, -f) unlimited
open files (-n) 30000
file locks (-x) unlimited
[root@mgr1 support-files]# ./mytest
limit files soft limit is 30000 ,hard limit is 30000
- 当前OS设置open file limit是30000,mytest.sh调用mytest:
[root@mgr1 support-files]# ulimit -a |grep file
core file size (blocks, -c) unlimited
file size (blocks, -f) unlimited
open files (-n) 30000
file locks (-x) unlimited
[root@mgr1 support-files]# ./mytest.sh
limit files soft limit is 300 ,hard limit is 300
这是一个父子进程
root 25782 25751 0 03:43 pts/2 00:00:00 /bin/bash ./mytest.sh
root 25783 25782 0 03:43 pts/2 00:00:00 /opt/mysql/mysql3306/install/support-files/mytest
因此我们如果参数文件设置了open_files_limit,那么使用mysqld_safe启动的时候将会使用open_files_limit设置的值,这一点5.7版本是要注意的。总结一下:
- 如果使用mysqld直接启动数据库那么将遵守官方文档规则。
- 如果使用mysqld_safe拉取mysqld,如果参数文件设置了open_files_limit那么mysqld_safe会获取open_files_limit的设置并且使用ulimit -n 进行修改,然后拉起mysqld,不依赖OS的设置,官方文档规则去掉第三条。
- 如果使用mysqld_safe拉取mysqld,如果参数文件没有设置open_files_limit则遵守官方文档规则。
三、错误流程
1、报错流程
- os_file_get_last_error_low:
ib::error()
<< "Operating system error number "
<< err
<< " in a file operation.";
...
} else if (!os_diagnose_all_o_direct_einval(err)) {
if (strerror(err) != NULL) {
ib::error()
<< "Error number " << err << " means '"
<< strerror(err) << "'";
}
ib::info() << OPERATING_SYSTEM_ERROR_MSG;
}
- fil_node_open_file :
ib::warn() << "Cannot open '" <name << "'."
" Have you deleted .ibd files under a"
" running mysqld server?";
- _fil_io :
if (!req_type.ignore_missing()) {
ib::error()
<< "Trying to do I/O to a tablespace"
" which exists without .ibd data file."
" I/O type: "
<< (req_type.is_read()
? "read" : "write") 、
<< ", page: "
<< page_id_t(page_id.space(),
cur_page_no)
<< ", I/O length: " << len << " bytes";
}
- buf_read_page
if (err == DB_TABLESPACE_DELETED) {
ib::error() << "trying to read page " << page_id
<< " in nonexisting or being-dropped tablespace";
}
2、信号处理发出
当上面的报错联系100次后将会触发crash,其中BUF_PAGE_READ_MAX_RETRIES为100
if (buf_read_page(page_id, page_size, trx)) { //进行物理文件读取
buf_read_ahead_random(page_id, page_size,
ibuf_inside(mtr), trx);
retries = 0;
} else if (retries < BUF_PAGE_READ_MAX_RETRIES) {
++retries;
DBUG_EXECUTE_IF(
"innodb_page_corruption_retries",
retries = BUF_PAGE_READ_MAX_RETRIES;
);
} else {
ib::fatal() << "Unable to read page " << page_id
<< " into the buffer pool after "
<< BUF_PAGE_READ_MAX_RETRIES << " attempts."
" The most probable cause of this error may"
" be that the table has been corrupted. Or,"
" the table was compressed with with an"
" algorithm that is not supported by this"
" instance. If it is not a decompress failure,"
" you can try to fix this problem by using"
" innodb_force_recovery."
" Please see " REFMAN " for more"
" details. Aborting...";
}
fatal的析构函数如下:
fatal::~fatal()
{
sql_print_error("[FATAL] InnoDB: %s", m_oss.str().c_str());
ut_error;
}
ut_error会调用assert(0),触发abort(),发出signal 6信号,导致crash。值得注意的是5.7.29的报错已经不一样了,这段代码有所变化,并且不会触发crash,报错如下:
2020-10-15T23:24:06.546126-05:00 2 [Note] InnoDB: Some operating system error numbers are described at http://dev.mysql.com/doc/refman/5.7/en/operating-system-error-codes.html
2020-10-15T23:24:06.546130-05:00 2 [Note] InnoDB: fil_sys open file LRU len 276
2020-10-15T23:24:06.547016-05:00 2 [ERROR] InnoDB: Operating system error number 24 in a file operation.
2020-10-15T23:24:06.547030-05:00 2 [ERROR] InnoDB: Error number 24 means 'Too many open files'
2020-10-15T23:24:06.547035-05:00 2 [Note] InnoDB: Some operating system error numbers are described at http://dev.mysql.com/doc/refman/5.7/en/operating-system-error-codes.html
2020-10-15T23:24:06.547039-05:00 2 [Note] InnoDB: fil_sys open file LRU len 276
四、关于assert断言
我们可以简单写一个assert触发一下abort函数看看捕获的信号是什么如下:
#include
#include
#include
#include
#include
#define ut_error assert(0)
void handsig6(int sig)
{
if(sig == SIGABRT){
printf("%s\n%s\n","I catch SIGABRT singal is 6","do other things exp dump crash stack!!")
exit(0);
}
}
int fatal()
{
printf("get fatal error\n");
ut_error;
}
int main(void)
{
signal(SIGABRT,handsig6);
fatal();
}
执行如下:
get fatal error
a.out: test10.c:23: fatal: Assertion `0' failed.
I catch SIGABRT singal is 6
do other things exp dump crash stack!!
可以看到触发的正是SIGABRT信号即信号6