介绍
Mysql 从库
Mysql 从库通过接收(或者是主动拉取)主库binlog日志,转换为自身中继日志,然后应用中继日志内容实现主从数据同步。
从库 CRC32 HASH算法
Mysql 从库进行数据同步,实则是重放一次主库的 dml 语句(不是原SQL重放),在同步表没有主键和唯一索引的情况下,从库为了提升同步效率,利用CRC32 HASH算法进行优化,但这个算法存在一个BUG,2个不同行的HASH值可能相同,这就造成从库在更新时,有可能更新到错误的行,从而引发后续的更新异常,从库同步中断。
在Mysql bug 列表 和 MOS 中均有此BUG的说明。
从库相关参数
控制从库在进行行匹配的参数为slave_rows_search_algorithms。
- 1、此参数在Mysql 8.0 开始默认值为“INDEX_SCAN,HASH_SCAN”
- 2、Mysql 8.0.18 标记为废弃
- 3、Mysql 8.3.0 彻底弃用,不允许调整。
以下为参数可设置值。
Command-Line Format | --slave-rows-search-algorithms=value |
---|---|
Deprecated | 8.0.18 |
System Variable | slave_rows_search_algorithms |
Scope | Global |
Dynamic | Yes |
SET_VAR Hint Applies |
No |
Type | Set |
Default Value | INDEX_SCAN,HASH_SCAN |
Valid Values |
|
BUG场景再现
环境准备
使用沙箱工具搭建Mysql 8.0.26 主从环境
[mysql@19db2 ~]$ dbdeployer versions
Basedir: /home/mysql/opt/mysql
5.7.29 5.7.31 8.0.26 8.0.33 8.0.34
[mysql@19db2 ~]$ dbdeployer deploy replication 8.0.26 -n 2 --gtid --force
Installing and starting master
. sandbox server started
Installing and starting slave1
.. sandbox server started
$HOME/sandboxes/rsandbox_8_0_26/initialize_slaves
initializing slave 1
Replication directory installed in $HOME/sandboxes/rsandbox_8_0_26
run 'dbdeployer usage multiple' for basic instructions'
[mysql@19db2 rsandbox_8_0_26]$ ./n1
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 11
Server version: 8.0.26 MySQL Community Server - GPL
....
master [localhost:21627] {msandbox} ((none)) > show variables like '%slave_rows_search_algorithms%';
+------------------------------+----------------------+
| Variable_name | Value |
+------------------------------+----------------------+
| slave_rows_search_algorithms | INDEX_SCAN,HASH_SCAN |
+------------------------------+----------------------+
1 row in set (0.00 sec)
[mysql@19db2 rsandbox_8_0_26]$ ./n2
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 12
Server version: 8.0.26 MySQL Community Server - GPL
....
slave1 [localhost:21628] {msandbox} ((none)) > show variables like '%slave_rows_search_algorithms%';
+------------------------------+----------------------+
| Variable_name | Value |
+------------------------------+----------------------+
| slave_rows_search_algorithms | INDEX_SCAN,HASH_SCAN |
+------------------------------+----------------------+
1 row in set (0.00 sec)
测试数据准备
master [localhost:21627] {msandbox} ((none)) > create database fails;
Query OK, 1 row affected (0.00 sec)
master [localhost:21627] {msandbox} ((none)) > use fails;
Database changed
-- 测试表 无主键 无唯一索引
master [localhost:21627] {msandbox} (fails) > CREATE TABLE `task_sequence` (
-> `T_NAME` varchar(100) DEFAULT NULL ,
-> `CURRENT_VALUE` decimal(10,0) DEFAULT '0' ,
-> `T_INCREMENT` decimal(1,0) DEFAULT '1' ,
-> `BIZ_TYPE` varchar(10) DEFAULT NULL
-> ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb3 ;
Query OK, 0 rows affected, 1 warning (0.01 sec)
-- 主库插入2条模拟数据
master [localhost:21627] {msandbox} (fails) > insert into task_sequence (T_NAME,CURRENT_VALUE,T_INCREMENT,BIZ_TYPE) values('20230820110700',267,1,'04');
Query OK, 1 row affected (0.00 sec)
master [localhost:21627] {msandbox} (fails) > insert into task_sequence (T_NAME,CURRENT_VALUE,T_INCREMENT,BIZ_TYPE) values('20240412120134',329,1,'03');
Query OK, 1 row affected (0.00 sec)
主从库验证模拟数据
此时主从同步正常,数据正常
master [localhost:21627] {msandbox} (fails) > select * from task_sequence;
+----------------+---------------+-------------+----------+
| T_NAME | CURRENT_VALUE | T_INCREMENT | BIZ_TYPE |
+----------------+---------------+-------------+----------+
| 20230820110700 | 267 | 1 | 04 |
| 20240412120134 | 329 | 1 | 03 |
+----------------+---------------+-------------+----------+
2 rows in set (0.00 sec)
slave1 [localhost:21628] {msandbox} (fails) > select * from task_sequence;
+----------------+---------------+-------------+----------+
| T_NAME | CURRENT_VALUE | T_INCREMENT | BIZ_TYPE |
+----------------+---------------+-------------+----------+
| 20230820110700 | 267 | 1 | 04 |
| 20240412120134 | 329 | 1 | 03 |
+----------------+---------------+-------------+----------+
2 rows in set (0.00 sec)
更新数据,BUG验证
下面将对T_NAME=‘20240412120134’ 行进行更新,验证CRC32 HASH算法BUG
Time 1 主库第一次更新T_NAME=‘20240412120134’
master [localhost:21627] {msandbox} (fails) > update task_sequence set CURRENT_VALUE=CURRENT_VALUE+T_INCREMENT where T_NAME='20240412120134';
Query OK, 1 row affected (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 0
此时主从数据同步正常
master [localhost:21627] {msandbox} (fails) > select * from task_sequence;
+----------------+---------------+-------------+----------+
| T_NAME | CURRENT_VALUE | T_INCREMENT | BIZ_TYPE |
+----------------+---------------+-------------+----------+
| 20230820110700 | 267 | 1 | 04 |
| 20240412120134 | 330 | 1 | 03 | select * from task_sequence;
+----------------+---------------+-------------+----------+
| T_NAME | CURRENT_VALUE | T_INCREMENT | BIZ_TYPE |
+----------------+---------------+-------------+----------+
| 20230820110700 | 267 | 1 | 04 |
| 20240412120134 | 330 | 1 | 03 | update task_sequence set CURRENT_VALUE=CURRENT_VALUE+T_INCREMENT where T_NAME='20240412120134';
Query OK, 1 row affected (0.01 sec)
Rows matched: 1 Changed: 1 Warnings: 0
此时主从同步正常,但数据已经异常,从库本应该更新T_NAME='20240412120134’的CURRENT_VALUE为331,但是由于CRC32 BUG存在,错误的更新了T_NAME='20230820110700’的CURRENT_VALUE为331,主从数据已经不一致。
master [localhost:21627] {msandbox} (fails) > select * from task_sequence;
+----------------+---------------+-------------+----------+
| T_NAME | CURRENT_VALUE | T_INCREMENT | BIZ_TYPE |
+----------------+---------------+-------------+----------+
| 20230820110700 | 267 | 1 | 04 |
| 20240412120134 | 331 | 1 | 03 | select * from task_sequence;
+----------------+---------------+-------------+----------+
| T_NAME | CURRENT_VALUE | T_INCREMENT | BIZ_TYPE |
+----------------+---------------+-------------+----------+
| 20230820110700 | 331 | 1 | 04 | show replica status\G
*************************** 1. row ***************************
Replica_IO_State: Waiting for source to send event
Source_Host: 127.0.0.1
Source_User: rsandbox
Source_Port: 21627
Connect_Retry: 60
Source_Log_File: mysql-bin.000001
Read_Source_Log_Pos: 10280
Relay_Log_File: mysql-relay.000002
Relay_Log_Pos: 10495
Relay_Source_Log_File: mysql-bin.000001
Replica_IO_Running: Yes select * from task_sequence;
+----------------+---------------+-------------+----------+
| T_NAME | CURRENT_VALUE | T_INCREMENT | BIZ_TYPE |
+----------------+---------------+-------------+----------+
| 20230820110700 | 267 | 1 | 04 |
| 20240412120134 | 332 | 1 | 03 | select * from task_sequence;
+----------------+---------------+-------------+----------+
| T_NAME | CURRENT_VALUE | T_INCREMENT | BIZ_TYPE |
+----------------+---------------+-------------+----------+
| 20230820110700 | 331 | 1 | 04 |
| 20240412120134 | 330 | 1 | 03 | show replica status\G
*************************** 1. row ***************************
Replica_IO_State: Waiting for source to send event
Source_Host: 127.0.0.1
Source_User: rsandbox
Source_Port: 21627
Connect_Retry: 60
Source_Log_File: mysql-bin.000001
Read_Source_Log_Pos: 10637
Relay_Log_File: mysql-relay.000002
Relay_Log_Pos: 10495
Relay_Source_Log_File: mysql-bin.000001
Replica_IO_Running: Yes
Replica_SQL_Running: No