MyCat分库分表实时同步到GreatSQL

2024年 1月 3日 39.2k 0

这个事情怎么产生的

MyCat作为经典的分库分表中间件,在长时间内被广泛认为是管理超大MySQL数据库集合的有效解决方案。近来接到客户需求,需要将MyCat集群迁移到GreatSQL中,并且在一段时间内需要实时从MyCat中同步数据到GreatSQL中,全量同步数据比较容易操作,增量同步有如下两个棘手的问题:

  • 多个server,不同的库名字,都要同步到GreatSQL一个库中,即同步关系如下
  • server1:db1.tab->gdb:db.tab;
    server2:db2.tab->gdb:db.tab;
    server3:db3.tab->gdb:db.tab;
  • ddl同步多次执行会冲突。当MyCat的表中添加一个索引、添加一个字段时,实际上是后端所有db都会执行这个DDL,同步到GreatSQL时,多次执行DDL,复制会异常中断。
  • 为了解决上面两个问题,经过查询资料,发现有两个不常用,官方也不建议使用的功能,刚好能够满足需求

  • 为解决库名映射问题:需要在配置文件中添加参数
  • replicate_rewrite_db="channel_1:test_rep1->test_rep"
    replicate_rewrite_db="channel_2:test_rep2->test_rep"
    replicate_rewrite_db="channel_3:test_rep3->test_rep"
  • 为了解决DDL同步后重复执行导致复制中断问题,在配置文件中添加
  • slave-skip-errors=ddl_exist_errors

    验证一下

    为了简化问题,MyCat集群咱们就不搭建了,简化为多源同步复制问题。

    1.初始化4个实例,同步关系如下

    源端口

    源DB_NAME

    目标端口

    目标映射DB

    channel_name

    3306

    test_rep1

    3309

    test_rep

    channel_3306

    3307

    test_rep2

    3309

    test_rep

    channel_3307

    3308

    test_rep3

    3309

    test_rep

    channel_3308

    2.在3309的实例配置文件中,添加库映射关系配置和DDL冲突忽略参数

    replicate_rewrite_db="channel_3306:test_rep1->test_rep"
    replicate_rewrite_db="channel_3307:test_rep2->test_rep"
    replicate_rewrite_db="channel_3308:test_rep3->test_rep"
    slave-skip-errors=ddl_exist_errors

    4.在3309实例中,配置三个channel

    greatsql> change master to MASTER_HOST = '172.17.137.91',MASTER_USER = 'greatsql',MASTER_PASSWORD = 'greatsql',MASTER_PORT = 3306,MASTER_AUTO_POSITION = 1 FOR CHANNEL 'channel_3306';
    
    greatsql> change master to MASTER_HOST = '172.17.137.91',MASTER_USER = 'greatsql',MASTER_PASSWORD = 'greatsql',MASTER_PORT = 3307,MASTER_AUTO_POSITION = 1 FOR CHANNEL 'channel_3307';
    
    greatsql> change master to MASTER_HOST = '172.17.137.91',MASTER_USER = 'greatsql',MASTER_PASSWORD = 'greatsql',MASTER_PORT = 3308,MASTER_AUTO_POSITION = 1 FOR CHANNEL 'channel_3308';
    
    greatsql> start slave;

    3.检查channel配置状态

    greatsql> show slave status \G
    *************************** 1. row ***************************
                   Slave_IO_State: Waiting for source to send event
                      Master_Host: 172.17.137.91
                      Master_User: greatsql
                      Master_Port: 3306
                    Connect_Retry: 60
                  Master_Log_File: binlog.000002
              Read_Master_Log_Pos: 1119
                   Relay_Log_File: relaylog-channel_3306.000007
                    Relay_Log_Pos: 397
            Relay_Master_Log_File: binlog.000002
                 Slave_IO_Running: Yes
                Slave_SQL_Running: Yes
                  Replicate_Do_DB: 
              Replicate_Ignore_DB: 
               Replicate_Do_Table: 
           Replicate_Ignore_Table: 
          Replicate_Wild_Do_Table: 
      Replicate_Wild_Ignore_Table: 
                       Last_Errno: 0
                       Last_Error: 
                     Skip_Counter: 0
              Exec_Master_Log_Pos: 1119
                  Relay_Log_Space: 606
                  Until_Condition: None
                   Until_Log_File: 
                    Until_Log_Pos: 0
               Master_SSL_Allowed: No
               Master_SSL_CA_File: 
               Master_SSL_CA_Path: 
                  Master_SSL_Cert: 
                Master_SSL_Cipher: 
                   Master_SSL_Key: 
            Seconds_Behind_Master: 0
    Master_SSL_Verify_Server_Cert: No
                    Last_IO_Errno: 0
                    Last_IO_Error: 
                   Last_SQL_Errno: 0
                   Last_SQL_Error: 
      Replicate_Ignore_Server_Ids: 
                 Master_Server_Id: 3306
                      Master_UUID: 5facacd7-9ed6-11ee-b76b-00163e5af5d6
                 Master_Info_File: mysql.slave_master_info
                        SQL_Delay: 0
              SQL_Remaining_Delay: NULL
          Slave_SQL_Running_State: Replica has read all relay log; waiting for more updates
               Master_Retry_Count: 86400
                      Master_Bind: 
          Last_IO_Error_Timestamp: 
         Last_SQL_Error_Timestamp: 
                   Master_SSL_Crl: 
               Master_SSL_Crlpath: 
               Retrieved_Gtid_Set: 
                Executed_Gtid_Set: 49b66af0-9ed6-11ee-ae4f-00163e5af5d6:3,
    5a369d0b-9ed6-11ee-b3d1-00163e5af5d6:3,
    5facacd7-9ed6-11ee-b76b-00163e5af5d6:1-4,
    9a5f9c26-4262-11ee-85fd-00163e5af5d6:1-16571224
                    Auto_Position: 1
             Replicate_Rewrite_DB: (test_rep1,test_rep)
                     Channel_Name: channel_3306
               Master_TLS_Version: 
           Master_public_key_path: 
            Get_master_public_key: 0
                Network_Namespace: 
    *************************** 2. row ***************************
                   Slave_IO_State: Waiting for source to send event
                      Master_Host: 172.17.137.91
                      Master_User: greatsql
                      Master_Port: 3307
                    Connect_Retry: 60
                  Master_Log_File: binlog.000002
              Read_Master_Log_Pos: 1119
                   Relay_Log_File: relaylog-channel_3307.000004
                    Relay_Log_Pos: 1034
            Relay_Master_Log_File: binlog.000002
                 Slave_IO_Running: Yes
                Slave_SQL_Running: Yes
                  Replicate_Do_DB: 
              Replicate_Ignore_DB: 
               Replicate_Do_Table: 
           Replicate_Ignore_Table: 
          Replicate_Wild_Do_Table: 
      Replicate_Wild_Ignore_Table: 
                       Last_Errno: 0
                       Last_Error: 
                     Skip_Counter: 0
              Exec_Master_Log_Pos: 1119
                  Relay_Log_Space: 1243
                  Until_Condition: None
                   Until_Log_File: 
                    Until_Log_Pos: 0
               Master_SSL_Allowed: No
               Master_SSL_CA_File: 
               Master_SSL_CA_Path: 
                  Master_SSL_Cert: 
                Master_SSL_Cipher: 
                   Master_SSL_Key: 
            Seconds_Behind_Master: 0
    Master_SSL_Verify_Server_Cert: No
                    Last_IO_Errno: 0
                    Last_IO_Error: 
                   Last_SQL_Errno: 0
                   Last_SQL_Error: 
      Replicate_Ignore_Server_Ids: 
                 Master_Server_Id: 3307
                      Master_UUID: 5a369d0b-9ed6-11ee-b3d1-00163e5af5d6
                 Master_Info_File: mysql.slave_master_info
                        SQL_Delay: 0
              SQL_Remaining_Delay: NULL
          Slave_SQL_Running_State: Replica has read all relay log; waiting for more updates
               Master_Retry_Count: 86400
                      Master_Bind: 
          Last_IO_Error_Timestamp: 
         Last_SQL_Error_Timestamp: 
                   Master_SSL_Crl: 
               Master_SSL_Crlpath: 
               Retrieved_Gtid_Set: 5a369d0b-9ed6-11ee-b3d1-00163e5af5d6:1-2:4
                Executed_Gtid_Set: 49b66af0-9ed6-11ee-ae4f-00163e5af5d6:3,
    5a369d0b-9ed6-11ee-b3d1-00163e5af5d6:3,
    5facacd7-9ed6-11ee-b76b-00163e5af5d6:1-4,
    9a5f9c26-4262-11ee-85fd-00163e5af5d6:1-16571224
                    Auto_Position: 1
             Replicate_Rewrite_DB: (test_rep2,test_rep)
                     Channel_Name: channel_3307
               Master_TLS_Version: 
           Master_public_key_path: 
            Get_master_public_key: 0
                Network_Namespace: 
    *************************** 3. row ***************************
                   Slave_IO_State: Waiting for source to send event
                      Master_Host: 172.17.137.91
                      Master_User: greatsql
                      Master_Port: 3308
                    Connect_Retry: 60
                  Master_Log_File: binlog.000002
              Read_Master_Log_Pos: 1119
                   Relay_Log_File: relaylog-channel_3308.000004
                    Relay_Log_Pos: 1034
            Relay_Master_Log_File: binlog.000002
                 Slave_IO_Running: Yes
                Slave_SQL_Running: Yes
                  Replicate_Do_DB: 
              Replicate_Ignore_DB: 
               Replicate_Do_Table: 
           Replicate_Ignore_Table: 
          Replicate_Wild_Do_Table: 
      Replicate_Wild_Ignore_Table: 
                       Last_Errno: 0
                       Last_Error: 
                     Skip_Counter: 0
              Exec_Master_Log_Pos: 1119
                  Relay_Log_Space: 1243
                  Until_Condition: None
                   Until_Log_File: 
                    Until_Log_Pos: 0
               Master_SSL_Allowed: No
               Master_SSL_CA_File: 
               Master_SSL_CA_Path: 
                  Master_SSL_Cert: 
                Master_SSL_Cipher: 
                   Master_SSL_Key: 
            Seconds_Behind_Master: 0
    Master_SSL_Verify_Server_Cert: No
                    Last_IO_Errno: 0
                    Last_IO_Error: 
                   Last_SQL_Errno: 0
                   Last_SQL_Error: 
      Replicate_Ignore_Server_Ids: 
                 Master_Server_Id: 3308
                      Master_UUID: 49b66af0-9ed6-11ee-ae4f-00163e5af5d6
                 Master_Info_File: mysql.slave_master_info
                        SQL_Delay: 0
              SQL_Remaining_Delay: NULL
          Slave_SQL_Running_State: Replica has read all relay log; waiting for more updates
               Master_Retry_Count: 86400
                      Master_Bind: 
          Last_IO_Error_Timestamp: 
         Last_SQL_Error_Timestamp: 
                   Master_SSL_Crl: 
               Master_SSL_Crlpath: 
               Retrieved_Gtid_Set: 49b66af0-9ed6-11ee-ae4f-00163e5af5d6:1-2:4
                Executed_Gtid_Set: 49b66af0-9ed6-11ee-ae4f-00163e5af5d6:3,
    5a369d0b-9ed6-11ee-b3d1-00163e5af5d6:3,
    5facacd7-9ed6-11ee-b76b-00163e5af5d6:1-4,
    9a5f9c26-4262-11ee-85fd-00163e5af5d6:1-16571224
                    Auto_Position: 1
             Replicate_Rewrite_DB: (test_rep3,test_rep)
                     Channel_Name: channel_3308
               Master_TLS_Version: 
           Master_public_key_path: 
            Get_master_public_key: 0
                Network_Namespace: 
    3 rows in set, 1 warning (0.00 sec)

    在上面的输出中,可以重点关注如下字段信息,说明db转换映射成功

    $ MYSQL_PWD=greatsql mysql -ugreatsql -h127.0.0.1 -P3309 -e 'show replica status \G'| grep -wE 'Replica_IO_Running|Replica_SQL_Running|Replicate_Rewrite_DB|Channel_Name'
               Replica_IO_Running: Yes
              Replica_SQL_Running: Yes
             Replicate_Rewrite_DB: (test_rep1,test_rep)
                     Channel_Name: channel_3306
               Replica_IO_Running: Yes
              Replica_SQL_Running: Yes
             Replicate_Rewrite_DB: (test_rep2,test_rep)
                     Channel_Name: channel_3307
               Replica_IO_Running: Yes
              Replica_SQL_Running: Yes
             Replicate_Rewrite_DB: (test_rep3,test_rep)
                     Channel_Name: channel_3308

    在3309实例中,查询replica_skip_errors,确认复制异常跳过的错误码,设置为ddl_exist_errors会自动转换为如下错误码

    greatsql> select @@replica_skip_errors;
    +---------------------------------------------------+
    | @@replica_skip_errors                             |
    +---------------------------------------------------+
    | 1007,1008,1050,1051,1054,1060,1061,1068,1091,1146 |
    +---------------------------------------------------+
    1 row in set (0.00 sec)

    5.数据同步验证

  • 在3309库中,创建database test_rep
  • 这个库需要手动创建,是测试发现映射关系只对库下面的表生效,库不会自动转换创建。

  • 在3306库中,创建database test_rep1,并且创建tab1表
  • $ mysql -ugreatsql -h127.0.0.1 -p -P3306 -s
    
    greatsql> create database test_rep1;
    greatsql> use test_rep1;
    greatsql> create table tab1(id int primary key auto_increment, cname varchar(10), age int);
  • 在3307库中,创建database test_rep2,并且创建tab1表
  • $ mysql -ugreatsql -h127.0.0.1 -p -P3307 -s
    
    greatsql> create database test_rep2;
    greatsql> use test_rep2;
    greatsql> create table tab1(id int primary key auto_increment, cname varchar(10), age int);
  • 在3308库中,创建database test_rep3,并且创建tab1表
  • $ mysql -ugreatsql -h127.0.0.1 -p -P3308 -s
    
    greatsql> create database test_rep3;
    greatsql> use test_rep3;
    greatsql> create table tab1(id int primary key auto_increment, cname varchar(10), age int);
  • 在3309中确认database及表的同步
  • $ mysql -ugreatsql -h127.0.0.1 -p -P3309 -s
    
    greatsql> show databases;
    Database
    information_schema
    mysql
    performance_schema
    sys
    test_db
    test_rep
    test_rep1
    test_rep2
    test_rep3
    
    greatsql> show tables from test_rep;
    Tables_in_test_rep
    tab1
    greatsql> show tables from test_rep1;
    greatsql> show tables from test_rep2;
    greatsql> show tables from test_rep3;

    从上面的信息可以看出,在3309中,3306、3307、3308中创建的库均按照原有的名字进行了同步,但是表只同步在了3309映射的库test_rep中。

  • 分别在3306、3307、3308中插入一条记录
  • 3306 : insert into test_rep1.tab1 values(1,'a',10);
    3307 : insert into test_rep2.tab1 values(2,'b',20);
    3308 : insert into test_rep3.tab1 values(3,'c',30);

    然后在各自节点查询数据插入情况

    $ mysql -ugreatsql -h127.0.0.1 -p -P3306 -s test_rep1 -e 'select * from tab1'
    id       cname    age
    1        a        10
    
    $ mysql -ugreatsql -h127.0.0.1 -p -P3307 -s test_rep2 -e 'select * from tab1'
    id       cname    age
    2        b        20
    
    $ mysql -ugreatsql -h127.0.0.1 -p -P3308 -s test_rep3 -e 'select * from tab1'
    id       cname    age
    3        c        30
    
    $ mysql -ugreatsql -h127.0.0.1 -p -P3309 -s test_rep -e 'select * from tab1'
    id       cname    age
    1        a        10
    2        b        20
    3        c        30
    
    $ mysql -ugreatsql -h127.0.0.1 -p -P3309 -s test_rep1 -e 'select * from tab1'
    ERROR 1146 (42S02) at line 1: Table 'test_rep1.tab1' doesn't exist
    
    $ mysql -ugreatsql -h127.0.0.1 -p -P3309 -s test_rep2 -e 'select * from tab1'
    ERROR 1146 (42S02) at line 1: Table 'test_rep2.tab1' doesn't exist
    
    $ mysql -ugreatsql -h127.0.0.1 -p -P3309 -s test_rep3 -e 'select * from tab1'
    ERROR 1146 (42S02) at line 1: Table 'test_rep3.tab1' doesn't exist

    从上面的查询情况可以看出,3306、3307、3308节点中只有一条记录,并且记录都被同步到了3309的test_rep.tab1表中,而且在3309的test_rep1、test_rep2、test_rep3中是没有表存在的。

  • 分别在3306、3307、3308给表tab创建一个索引
  • greatsql> alter table tab1 add index idx_cname(cname);
  • 观察3309中表的索引情况,可以看到索引idx_cname被同步过来了
  • $ mysql -ugreatsql -h127.0.0.1 -p -P3309 -s
    
    greatsql> use test_rep
    greatsql> show create table tab1 \G
    *************************** 1. row ***************************
           Table: tab1
    Create Table: CREATE TABLE `tab1` (
      `id` int NOT NULL AUTO_INCREMENT,
      `cname` varchar(10) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
      `age` int DEFAULT NULL,
      PRIMARY KEY (`id`),
      KEY `idx_cname` (`cname`)
    ) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
    1 row in set (0.00 sec)
  • 分别在3306、3307、3308做update、delete操作
  • greatsql> update test_rep1.tab1 set age=110 where id=1;
    greatsql> update test_rep2.tab1 set age=120 where id=2;
    greatsql> update test_rep3.tab1 set age=130 where id=3;
    
    greatsql> delete from test_rep1.tab1 where id=1;
    greatsql> delete from test_rep2.tab1 where id=1;
    greatsql> delete from test_rep3.tab1 where id=1;

    查看3309的数据同步情况,确认数据被清理

    $ mysql -ugreatsql -h127.0.0.1 -p -P3309 -s
    Welcome to the MySQL monitor.  Commands end with ; or \g.
    greatsql> select * from test_rep.tab1;
  • 观察3个channel的同步情况,可以确认三个复制同步均正常
  • $ mysql -ugreatsql -h127.0.0.1 -p -P3309 -e 'show replica status \G'| grep -E 'Replica_IO_Running|Replica_SQL_Running|Channel_Name'
    greatsql: [Warning] Using a password on the command line interface can be insecure.
               Replica_IO_Running: Yes
              Replica_SQL_Running: Yes
        Replica_SQL_Running_State: Replica has read all relay log; waiting for more updates
                     Channel_Name: channel_3306
               Replica_IO_Running: Yes
              Replica_SQL_Running: Yes
        Replica_SQL_Running_State: Replica has read all relay log; waiting for more updates
                     Channel_Name: channel_3307
               Replica_IO_Running: Yes
              Replica_SQL_Running: Yes
        Replica_SQL_Running_State: Replica has read all relay log; waiting for more updates
                     Channel_Name: channel_3308

    至此dml、ddl同步均验证。

    方案缺陷

  • 本方案中,业务访问MyCat的表名字,和server后端的表名字完全一致,只是库名字不相同,然后MyCat代理表名和实际server的表名字可以不相同,这种情况下,暂时无法映射处理
  • MyCat代理的实际上是多个单独的库,如果这些库之前没有做自增主键步长处理,或者其他一些主键不重复策略,同步过程中,会存在主键冲突导致数据同步中断的情况,需要提前准备处理方案。
  • 最后附上参考资料

    • https://dev.mysql.com/doc/refman/8.0/en/replication-options-replica.html#option_mysqld_replicate-rewrite-db
    • https://dev.mysql.com/doc/refman/8.0/en/replication-options-replica.html#option_mysqld_slave-skip-errors

    相关文章

    Oracle如何使用授予和撤销权限的语法和示例
    Awesome Project: 探索 MatrixOrigin 云原生分布式数据库
    下载丨66页PDF,云和恩墨技术通讯(2024年7月刊)
    社区版oceanbase安装
    Oracle 导出CSV工具-sqluldr2
    ETL数据集成丨快速将MySQL数据迁移至Doris数据库

    发布评论