Mysql 因 CRC32 HASH算法BUG造成从库同步异常

2024年 7月 26日 66.7k 0

介绍

Mysql 从库

    Mysql 从库通过接收(或者是主动拉取)主库binlog日志,转换为自身中继日志,然后应用中继日志内容实现主从数据同步。

从库 CRC32 HASH算法

    Mysql 从库进行数据同步,实则是重放一次主库的 dml 语句(不是原SQL重放),在同步表没有主键和唯一索引的情况下,从库为了提升同步效率,利用CRC32 HASH算法进行优化,但这个算法存在一个BUG,2个不同行的HASH值可能相同,这就造成从库在更新时,有可能更新到错误的行,从而引发后续的更新异常,从库同步中断。

    在Mysql bug 列表 和 MOS 中均有此BUG的说明。
Mysql 因 CRC32 HASH算法BUG造成从库同步异常-1

Mysql 因 CRC32 HASH算法BUG造成从库同步异常-2

从库相关参数

控制从库在进行行匹配的参数为slave_rows_search_algorithms。

  • 1、此参数在Mysql 8.0 开始默认值为“INDEX_SCAN,HASH_SCAN”
  • 2、Mysql 8.0.18 标记为废弃
  • 3、Mysql 8.3.0 彻底弃用,不允许调整。

以下为参数可设置值。

Command-Line Format --slave-rows-search-algorithms=value
Deprecated 8.0.18
System Variable slave_rows_search_algorithms
Scope Global
Dynamic Yes
SET_VAR Hint Applies No
Type Set
Default Value INDEX_SCAN,HASH_SCAN
Valid Values

TABLE_SCAN,INDEX_SCAN

INDEX_SCAN,HASH_SCAN

TABLE_SCAN,HASH_SCAN

TABLE_SCAN,INDEX_SCAN,HASH_SCAN (equivalent to INDEX_SCAN,HASH_SCAN)

BUG场景再现

环境准备

    使用沙箱工具搭建Mysql 8.0.26 主从环境

[mysql@19db2 ~]$ dbdeployer versions
Basedir: /home/mysql/opt/mysql
5.7.29 5.7.31 8.0.26 8.0.33 8.0.34
[mysql@19db2 ~]$ dbdeployer deploy replication 8.0.26 -n 2 --gtid --force
Installing and starting master
. sandbox server started
Installing and starting slave1
.. sandbox server started
$HOME/sandboxes/rsandbox_8_0_26/initialize_slaves
initializing slave 1
Replication directory installed in $HOME/sandboxes/rsandbox_8_0_26
run 'dbdeployer usage multiple' for basic instructions'

[mysql@19db2 rsandbox_8_0_26]$ ./n1
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 11
Server version: 8.0.26 MySQL Community Server - GPL

....

master [localhost:21627] {msandbox} ((none)) > show variables like '%slave_rows_search_algorithms%';
+------------------------------+----------------------+
| Variable_name | Value |
+------------------------------+----------------------+
| slave_rows_search_algorithms | INDEX_SCAN,HASH_SCAN |
+------------------------------+----------------------+
1 row in set (0.00 sec)

[mysql@19db2 rsandbox_8_0_26]$ ./n2
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 12
Server version: 8.0.26 MySQL Community Server - GPL
....

slave1 [localhost:21628] {msandbox} ((none)) > show variables like '%slave_rows_search_algorithms%';
+------------------------------+----------------------+
| Variable_name | Value |
+------------------------------+----------------------+
| slave_rows_search_algorithms | INDEX_SCAN,HASH_SCAN |
+------------------------------+----------------------+
1 row in set (0.00 sec)

测试数据准备

master [localhost:21627] {msandbox} ((none)) > create database fails;
Query OK, 1 row affected (0.00 sec)

master [localhost:21627] {msandbox} ((none)) > use fails;
Database changed

-- 测试表 无主键 无唯一索引
master [localhost:21627] {msandbox} (fails) > CREATE TABLE `task_sequence` (
-> `T_NAME` varchar(100) DEFAULT NULL ,
-> `CURRENT_VALUE` decimal(10,0) DEFAULT '0' ,
-> `T_INCREMENT` decimal(1,0) DEFAULT '1' ,
-> `BIZ_TYPE` varchar(10) DEFAULT NULL
-> ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb3 ;
Query OK, 0 rows affected, 1 warning (0.01 sec)

-- 主库插入2条模拟数据
master [localhost:21627] {msandbox} (fails) > insert into task_sequence (T_NAME,CURRENT_VALUE,T_INCREMENT,BIZ_TYPE) values('20230820110700',267,1,'04');
Query OK, 1 row affected (0.00 sec)

master [localhost:21627] {msandbox} (fails) > insert into task_sequence (T_NAME,CURRENT_VALUE,T_INCREMENT,BIZ_TYPE) values('20240412120134',329,1,'03');
Query OK, 1 row affected (0.00 sec)

主从库验证模拟数据

    此时主从同步正常,数据正常

master [localhost:21627] {msandbox} (fails) > select * from task_sequence;
+----------------+---------------+-------------+----------+
| T_NAME | CURRENT_VALUE | T_INCREMENT | BIZ_TYPE |
+----------------+---------------+-------------+----------+
| 20230820110700 | 267 | 1 | 04 |
| 20240412120134 | 329 | 1 | 03 |
+----------------+---------------+-------------+----------+
2 rows in set (0.00 sec)

slave1 [localhost:21628] {msandbox} (fails) > select * from task_sequence;
+----------------+---------------+-------------+----------+
| T_NAME | CURRENT_VALUE | T_INCREMENT | BIZ_TYPE |
+----------------+---------------+-------------+----------+
| 20230820110700 | 267 | 1 | 04 |
| 20240412120134 | 329 | 1 | 03 |
+----------------+---------------+-------------+----------+
2 rows in set (0.00 sec)

更新数据,BUG验证

    下面将对T_NAME=‘20240412120134’ 行进行更新,验证CRC32 HASH算法BUG

Time 1 主库第一次更新T_NAME=‘20240412120134’

master [localhost:21627] {msandbox} (fails) > update task_sequence set CURRENT_VALUE=CURRENT_VALUE+T_INCREMENT where T_NAME='20240412120134';
Query OK, 1 row affected (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 0

    此时主从数据同步正常

master [localhost:21627] {msandbox} (fails) > select * from task_sequence;
+----------------+---------------+-------------+----------+
| T_NAME | CURRENT_VALUE | T_INCREMENT | BIZ_TYPE |
+----------------+---------------+-------------+----------+
| 20230820110700 | 267 | 1 | 04 |
| 20240412120134 | 330 | 1 | 03 | select * from task_sequence;
+----------------+---------------+-------------+----------+
| T_NAME | CURRENT_VALUE | T_INCREMENT | BIZ_TYPE |
+----------------+---------------+-------------+----------+
| 20230820110700 | 267 | 1 | 04 |
| 20240412120134 | 330 | 1 | 03 | update task_sequence set CURRENT_VALUE=CURRENT_VALUE+T_INCREMENT where T_NAME='20240412120134';
Query OK, 1 row affected (0.01 sec)
Rows matched: 1 Changed: 1 Warnings: 0

    此时主从同步正常,但数据已经异常,从库本应该更新T_NAME='20240412120134’的CURRENT_VALUE为331,但是由于CRC32 BUG存在,错误的更新了T_NAME='20230820110700’的CURRENT_VALUE为331,主从数据已经不一致。

master [localhost:21627] {msandbox} (fails) > select * from task_sequence;
+----------------+---------------+-------------+----------+
| T_NAME | CURRENT_VALUE | T_INCREMENT | BIZ_TYPE |
+----------------+---------------+-------------+----------+
| 20230820110700 | 267 | 1 | 04 |
| 20240412120134 | 331 | 1 | 03 | select * from task_sequence;
+----------------+---------------+-------------+----------+
| T_NAME | CURRENT_VALUE | T_INCREMENT | BIZ_TYPE |
+----------------+---------------+-------------+----------+
| 20230820110700 | 331 | 1 | 04 | show replica status\G
*************************** 1. row ***************************
Replica_IO_State: Waiting for source to send event
Source_Host: 127.0.0.1
Source_User: rsandbox
Source_Port: 21627
Connect_Retry: 60
Source_Log_File: mysql-bin.000001
Read_Source_Log_Pos: 10280
Relay_Log_File: mysql-relay.000002
Relay_Log_Pos: 10495
Relay_Source_Log_File: mysql-bin.000001
Replica_IO_Running: Yes select * from task_sequence;
+----------------+---------------+-------------+----------+
| T_NAME | CURRENT_VALUE | T_INCREMENT | BIZ_TYPE |
+----------------+---------------+-------------+----------+
| 20230820110700 | 267 | 1 | 04 |
| 20240412120134 | 332 | 1 | 03 | select * from task_sequence;
+----------------+---------------+-------------+----------+
| T_NAME | CURRENT_VALUE | T_INCREMENT | BIZ_TYPE |
+----------------+---------------+-------------+----------+
| 20230820110700 | 331 | 1 | 04 |
| 20240412120134 | 330 | 1 | 03 | show replica status\G
*************************** 1. row ***************************
Replica_IO_State: Waiting for source to send event
Source_Host: 127.0.0.1
Source_User: rsandbox
Source_Port: 21627
Connect_Retry: 60
Source_Log_File: mysql-bin.000001
Read_Source_Log_Pos: 10637
Relay_Log_File: mysql-relay.000002
Relay_Log_Pos: 10495
Relay_Source_Log_File: mysql-bin.000001
Replica_IO_Running: Yes
Replica_SQL_Running: No

相关文章

Oracle如何使用授予和撤销权限的语法和示例
Awesome Project: 探索 MatrixOrigin 云原生分布式数据库
下载丨66页PDF,云和恩墨技术通讯(2024年7月刊)
社区版oceanbase安装
Oracle 导出CSV工具-sqluldr2
ETL数据集成丨快速将MySQL数据迁移至Doris数据库

发布评论