故障现象:
开发反馈说mysql数据库表无法打开。涉及的库为两节点的mgr环境:mysql8.0.32+mgr
处理思路:
初步怀疑系统可能宕机过,需检查系统日志,数据库日志,查看mysql mgr状态再分析确定下一步处理办法。
处理过程:
登录主从服务器,检查系统日志,看不到108直接重启的故障原因tail -2000 /var/log/messages.XXX|more
检查mysql mgr状态:
从状态看,貌似107节点应该正常,事实上无法打开库表。
检查107的系统是否重启过:last|grep boot正常
检查108的系统是否重启过:last|grep boot异常,9月30日19:55系统自动重启过
查看两节点的mysql日志,两节点之间无法正常通信33061端口:
主节点107日志:
2023-09-30T19:54:04.152523+08:00 0 [Warning] [MY-011493] [Repl] Plugin group_replication reported: 'Member with address 192.168.52.108:3306 has become unreachable.'
2023-09-30T19:54:04.176596+08:00 0 [ERROR] [MY-011495] [Repl] Plugin group_replication reported: 'This server is not able to reach a majority of members in the group. This server will now block all updates. The server will remain blocked un
til contact with the majority is restored. It is possible to use group_replication_force_members to force a new group membership.'
2023-09-30T19:56:39.266157+08:00 0 [Warning] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Old incarnation found while trying to add node 192.168.52.108:33061 16960749990850918. Please stop the old node or wait for it to leav
e the group.'
2023-09-30T19:56:39.366009+08:00 0 [Warning] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Old incarnation found while trying to add node 192.168.52.108:33061 16960749990850918. Please stop the old node or wait for it to leav
e the group.'
2023-09-30T19:56:39.465009+08:00 0 [Warning] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Old incarnation found while trying to add node 192.168.52.108:33061 16960749990850918. Please stop the old node or wait for it to leav
e the group.'
2023-09-30T19:56:39.564792+08:00 0 [Warning] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Old incarnation found while trying to add node 192.168.52.108:33061 16960749990850918. Please stop the old node or wait for it to leav
e the group.'
2023-09-30T19:56:39.664804+08:00 0 [Warning] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Old incarnation found while trying to add node 192.168.52.108:33061 16960749990850918. Please stop the old node or wait for it to leav
e the group.'
从节点108日志:
2023-09-30T19:56:03.980014+08:00 0 [Warning] [MY-011068] [Server] The syntax 'log_slave_updates' is deprecated and will be removed in a future release. Please use log_replica_updates instead.
2023-09-30T19:56:03.980061+08:00 0 [Warning] [MY-011069] [Server] The syntax '--master-info-repository' is deprecated and will be removed in a future release.
2023-09-30T19:56:03.980086+08:00 0 [Warning] [MY-011069] [Server] The syntax '--relay-log-info-repository' is deprecated and will be removed in a future release.
2023-09-30T19:56:03.980135+08:00 0 [Warning] [MY-011069] [Server] The syntax '--transaction-write-set-extraction' is deprecated and will be removed in a future release.
2023-09-30T19:56:03.981383+08:00 0 [System] [MY-010116] [Server] /home/mysql/mysql-8.0.32/bin/mysqld (mysqld 8.0.32) starting as process 2734
2023-09-30T19:56:04.264772+08:00 1 [System] [MY-013576] [InnoDB] InnoDB initialization has started.
2023-09-30T19:56:11.643084+08:00 1 [System] [MY-013577] [InnoDB] InnoDB initialization has ended.
2023-09-30T19:56:13.335966+08:00 0 [System] [MY-013587] [Repl] Plugin group_replication reported: 'Plugin 'group_replication' is starting.'
2023-09-30T19:56:21.403812+08:00 0 [System] [MY-010229] [Server] Starting XA crash recovery...
2023-09-30T19:56:21.415597+08:00 0 [System] [MY-010232] [Server] XA crash recovery finished.
2023-09-30T19:56:27.741955+08:00 0 [Warning] [MY-010068] [Server] CA certificate ca.pem is self signed.
2023-09-30T19:56:27.742108+08:00 0 [System] [MY-013602] [Server] Channel mysql_main configured to support TLS. Encrypted connections are now supported for this channel.
2023-09-30T19:56:27.743034+08:00 0 [Warning] [MY-013595] [Server] Failed to initialize TLS for channel: mysql_admin. See below for the description of exact issue.
2023-09-30T19:56:27.743076+08:00 0 [Warning] [MY-010069] [Server] Failed to set up SSL because of the following SSL library error: SSL context is not usable without certificate and private key
2023-09-30T19:56:27.743094+08:00 0 [System] [MY-013603] [Server] No TLS configuration was given for channel mysql_admin; re-using TLS configuration of channel mysql_main.
2023-09-30T19:56:27.871595+08:00 0 [Warning] [MY-010604] [Repl] Neither --relay-log nor --relay-log-index were used; so replication may break when this MySQL server acts as a slave and has his hostname changed!! Please use '--relay-log=disp
atch02-relay-bin' to avoid this problem.
2023-09-30T19:56:39.054566+08:00 0 [System] [MY-010931] [Server] /home/mysql/mysql-8.0.32/bin/mysqld: ready for connections. Version: '8.0.32' socket: '/home/mysql/mysql.sock' port: 3306 MySQL Community Server - GPL.
2023-09-30T19:56:39.054584+08:00 0 [System] [MY-011323] [Server] X Plugin ready for connections. Bind-address: '::' port: 33060, socket: /tmp/mysqlx.sock
2023-09-30T19:56:39.054633+08:00 0 [System] [MY-013292] [Server] Admin interface ready for connections, address: '192.168.52.108' port: 33062
2023-09-30T19:56:39.055794+08:00 4 [System] [MY-011565] [Repl] Plugin group_replication reported: 'Setting super_read_only=ON.'
2023-09-30T19:56:39.076400+08:00 12 [System] [MY-010597] [Repl] 'CHANGE MASTER TO FOR CHANNEL 'group_replication_applier' executed'. Previous state master_host='', master_port= 0, master_log_file='', master_log_pos= 4, master_bind=''.
New state master_host='', master_port= 0, master_log_file='', master_log_pos= 4, master_bind=''.
2023-09-30T19:56:40.162652+08:00 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Error connecting to all peers. Member join failed. Local port: 33061'
2023-09-30T19:56:40.237092+08:00 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] The member was unable to join the group. Local port: 33061'
2023-09-30T19:56:46.252861+08:00 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Error connecting to all peers. Member join failed. Local port: 33061'
2023-09-30T19:56:46.316789+08:00 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] The member was unable to join the group. Local port: 33061'
2023-09-30T19:56:52.343533+08:00 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Error connecting to all peers. Member join failed. Local port: 33061'
通过netstat -tunlp|grep 3306发现两个节点均不存在33061端口,于是决定重启两节点的mysql服务。
在两节点都停止服务:systemctl stop mysql
[root@test01 ~]# systemctl status mysql
● mysql.service - LSB: start and stop MySQL
Loaded: loaded (/etc/rc.d/init.d/mysql; bad; vendor preset: disabled)
Active: deactivating (stop) since Sat 2023-10-07 12:09:15 CST; 1min 2s ago
Docs: man:systemd-sysv-generator(8)
Process: 9547 ExecStart=/etc/rc.d/init.d/mysql start (code=exited, status=0/SUCCESS)
Control: 9706 (mysql)
Tasks: 212
CGroup: /system.slice/mysql.service
├─ 9563 /bin/sh /home/mysql/mysql-8.0.32/bin/mysqld_safe --datadir=/data/mysql --pid-file=/data/mysql/test01.pid
├─10238 /home/mysql/mysql-8.0.32/bin/mysqld --basedir=/home/mysql/mysql-8.0.32 --datadir=/data/mysql --plugin-dir=/home/mysql/mysql-8.0.32/lib/plugin --user=mysql --log-error=test01.err --pid-file=/data/mysql/test01.pi...
└─control
├─ 9706 /bin/sh /etc/rc.d/init.d/mysql stop
└─10233 sleep 1
Aug 14 14:11:11 test01 systemd[1]: Starting LSB: start and stop MySQL...
Aug 14 14:11:18 test01 mysql[9547]: Starting MySQL....... SUCCESS!
Aug 14 14:11:18 test01 systemd[1]: Started LSB: start and stop MySQL.
Oct 07 12:09:15 test01 systemd[1]: Stopping LSB: start and stop MySQL...
[root@test01 ~]# ps -ef|grep mysql
root 9563 1 0 Aug14 ? 00:00:00 /bin/sh /home/mysql/mysql-8.0.32/bin/mysqld_safe --datadir=/data/mysql --pid-file=/data/mysql/test01.pid
root 9706 1 0 12:09 ? 00:00:00 /bin/sh /etc/rc.d/init.d/mysql stop
root 9948 9887 0 12:09 pts/2 00:00:00 systemctl stop mysql
mysql 10238 9563 15 Aug14 ? 8-09:29:21 /home/mysql/mysql-8.0.32/bin/mysqld --basedir=/home/mysql/mysql-8.0.32 --datadir=/data/mysql --plugin-dir=/home/mysql/mysql-8.0.32/lib/plugin --user=mysql --log-error=test01.err --pid-file=/data/mysql/test01.pid --socket=/home/mysql/mysql.sock --port=3306
root 10441 10133 0 12:10 pts/3 00:00:00 grep --color=auto mysql
[root@test01 ~]# ps -ef|grep mysql
root 10592 10133 0 12:10 pts/3 00:00:00 grep --color=auto mysql
[root@test01 ~]# systemctl start mysql
[root@test01 ~]# mysql -uroot -p"passwd2023" -P3306 -h192.168.52.107
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor. Commands end with ; or g.
Your MySQL connection id is 26
Server version: 8.0.32 MySQL Community Server - GPL
Copyright (c) 2000, 2023, Oracle and/or its affiliates.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or 'h' for help. Type 'c' to clear the current input statement.
mysql> SET GLOBAL group_replication_bootstrap_group=ON;
Query OK, 0 rows affected (0.00 sec)
mysql> START GROUP_REPLICATION;
Query OK, 0 rows affected (25.54 sec)
mysql> SET GLOBAL group_replication_bootstrap_group=OFF;
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+----------------------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION | MEMBER_COMMUNICATION_STACK |
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+----------------------------+
| group_replication_applier | 673c35f1-dc20-11ed-8b5b-005056a27233 | 192.168.52.107 | 3306 | ONLINE | PRIMARY | 8.0.32 | XCom |
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+----------------------------+
1 row in set (0.00 sec)
启动108从节点mysql服务:
[root@test02 mysql]# systemctl start mysql
[root@test02 mysql]# mysql -uroot -p"passwd2023" -P3306 -h192.168.52.108
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor. Commands end with ; or g.
Your MySQL connection id is 13
Server version: 8.0.32 MySQL Community Server - GPL
Copyright (c) 2000, 2023, Oracle and/or its affiliates.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or 'h' for help. Type 'c' to clear the current input statement.
mysql> SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+----------------------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION | MEMBER_COMMUNICATION_STACK |
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+----------------------------+
| group_replication_applier | 673c35f1-dc20-11ed-8b5b-005056a27233 | 192.168.52.107 | 3306 | ONLINE | PRIMARY | 8.0.32 | XCom |
| group_replication_applier | e64615df-dc20-11ed-80b9-005056a26559 | 192.168.52.108 | 3306 | RECOVERING | SECONDARY | 8.0.32 | XCom |
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+----------------------------+
2 rows in set (0.00 sec)
观察108从节点自动恢复
mysql> SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+----------------------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION | MEMBER_COMMUNICATION_STACK |
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+----------------------------+
| group_replication_applier | 673c35f1-dc20-11ed-8b5b-005056a27233 | 192.168.52.107 | 3306 | ONLINE | PRIMARY | 8.0.32 | XCom |
| group_replication_applier | e64615df-dc20-11ed-80b9-005056a26559 | 192.168.52.108 | 3306 | ONLINE | SECONDARY | 8.0.32 | XCom |
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+----------------------------+
2 rows in set (0.00 sec)
处理结果:
两节点正常的mysql端口状态:
两节点正常的mgr状态: