TiDB Cloud Migration in Practice (Part 1): Self-Hosted MySQL

May 7, 2024

1. Glossary

| # | Term | Description |
|---|------|-------------|
| 1 | PD Server | Short for Placement Driver Server; the management module of the cluster. |
| 2 | Pump | Records the binlog produced by TiDB in real time, sorts it by transaction commit time, and serves it to Drainer for consumption. |
| 3 | Drainer | Collects and merges binlogs from all Pump nodes, converts them into SQL or data in a specified format, and replicates them to the downstream. |
| 4 | Mydumper | A fork of the upstream mydumper used for logical backups of MySQL/TiDB, with optimizations for TiDB's characteristics. |
| 5 | Loader | A data import tool developed by PingCAP for importing data into MySQL/TiDB. |
| 6 | Sync-diff-inspector | A tool for checking whether two copies of data in MySQL/TiDB are consistent. |
| 7 | TiUP | The package manager available starting with TiDB 4.0; it simplifies installation and maintenance, and also supports cluster deployment, scaling in/out, and stress testing. |
| 8 | Kafka | An open-source stream-processing platform developed by the Apache Software Foundation; a high-throughput distributed publish-subscribe messaging system. |
| 9 | TSO | Short for TimeStamp Oracle; the globally unique timestamp assigned to a transaction. |

Notes:

  • This article focuses on validating a general path and procedure for migrating TiDB to cloud database products; it does not cover special cases (such as errors or bugs).
  • For the cloud database products involved, refer to the official documentation.

2. Migration Paths

[Figure: TiDB migration paths]

The diagram above shows four main migration paths.

  • The TiDB cluster replicates data through Pump + Drainer to a customer self-hosted MySQL instance, and DTS then migrates it to cloud databases such as RDS MySQL, PolarDB MySQL, PolarDB-X, and ADB MySQL.
  • The TiDB cluster replicates data through Pump + Drainer to a customer self-hosted Kafka instance, and DTS then migrates it to RDS MySQL, from which it can be further replicated to cloud databases such as PolarDB-X and ADB MySQL.
  • The TiDB cluster replicates data through Pump + Drainer directly to RDS MySQL, from which it can be further replicated to cloud databases such as PolarDB-X and ADB MySQL.
  • DTS migrates the full data from the TiDB cluster to RDS MySQL, and Pump + Drainer then handles the incremental migration.

Comparison of the four migration paths

| # | Path | Pros | Cons |
|---|------|------|------|
| 1 | Self-hosted MySQL | Supports multiple target database types; full and incremental migration to the cloud puts no load on the source TiDB cluster; the migration scheme is flexible. | Needs an extra self-hosted MySQL hop and its resources; with large data volumes the full data import can become a bottleneck; tables must have a primary key or a unique key. |
| 2 | Self-hosted Kafka | No full data backup is needed, which suits large data volumes. | Needs an extra Kafka hop and its resources; the full migration may put load on the source and the network; tables must have a primary key or a unique key. |
| 3 | Direct to the cloud database | Suitable when DTS is not used; no extra DTS network configuration is needed. | With large data volumes the full data import into the cloud database can become a bottleneck. |
| 4 | DTS full + incremental | No full data backup is needed, which suits large data volumes; no intermediate hop; incremental data is more controllable and the approach is more flexible. | The incremental start position must be located precisely; the full migration may put load on the source and the network; re-applying incremental data needs manual intervention; tables must have a primary key or a unique key. |

This article discusses migration path 1 in detail.

2.1 Network

Depending on where the customer's self-hosted MySQL instance is located, there are two typical scenarios.

  • The self-hosted MySQL instance is in the customer's VPC on the cloud, as shown in Scenario 1 below.
  • The self-hosted MySQL instance is in the customer's on-premises IDC, as shown in Scenario 2 below.

[Figure: Scenario 1, self-hosted MySQL in the customer's VPC on the cloud]

[Figure: Scenario 2, self-hosted MySQL in the customer's on-premises IDC]

Like Scenario 2, Scenario 1 has the advantage that the data migration puts no direct load on the TiDB cluster (DTS extracts both the full and the incremental data from the self-hosted MySQL instance). In addition, Scenario 1 requires no extra network configuration and keeps the network load more controllable, so it is the more convenient option for migrating customer data to the cloud.

Notes:

  • How DTS works in the two scenarios with the different network access methods (VPN, leased line, redundant leased lines, smart access gateway, CEN) will be covered in a separate article; you can also refer to the network and DTS documentation.

2.2 Migration Steps

Here we use ECS to simulate the TiDB cluster, starting from an initial cluster that has no Pump + Drainer configured.

| Step | Task | Description |
|------|------|-------------|
| 1 | Deploy the TiDB cluster | Deploy a TiDB 4.0 cluster with TiUP to simulate the customer environment. Reference: Deploy a TiDB Cluster Using TiUP. |
| 2 | Load sample data | Load the sample database and data to simulate customer data. Reference: Import Example Database. |
| 3 | Add Pump | Scale out the cluster with Pump servers. |
| 4 | Full backup | Back up the full cluster data with Mydumper. |
| 5 | Import full data | Import the full data generated in step 4 into the self-hosted MySQL instance. |
| 6 | Add Drainer | Scale out the cluster with a Drainer node to replicate incremental data. |
| 7 | Verify the result | Verify that the self-hosted MySQL instance and the TiDB cluster hold the same data. |

3. Migration

We now start the migration simulation. To save resources, all instances are deployed on a single ECS instance.

3.1 Deploy the TiDB Cluster

| # | ECS spec | OS | Disks | Network |
|---|----------|----|-------|---------|
| 1 | 8 cores / 32 GB | CentOS 7.6 64-bit | 40 GB system disk, 100 GB ESSD data disk | Pay-by-traffic, 100 Mbps cap, public IP, VPC |

[Figure: TiDB cluster topology on a single ECS instance]

3.1.1 Install TiUP

[root@lincloud259 ~]# cd /data
[root@lincloud259 data]# mkdir -m 777 tidb_cluster

[root@lincloud259 ~]# curl --proto '=https' --tlsv1.2 -sSf https://tiup-mirrors.pingcap.com/install.sh | sh
[root@lincloud259 ~]# . /root/.bash_profile
[root@lincloud259 ~]# tiup cluster
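
Before deploying, the installation can be verified quickly; the following commands are a minimal check (assuming the install script above has already updated PATH via .bash_profile):

[root@lincloud259 ~]# tiup --version
[root@lincloud259 ~]# tiup list tidb    # list the TiDB versions available from the mirror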

3.1.2 Configure SSH

# Configure SSH to allow more concurrent sessions
[root@lincloud259 ~]# vi /etc/ssh/sshd_config
[root@lincloud259 ~]# grep -i '^maxsessions' /etc/ssh/sshd_config
MaxSessions 90

[root@lincloud259 ~]# systemctl restart sshd.service
[root@lincloud259 ~]# sshd -T | grep -i maxsessions
maxsessions 90

3.1.3 Deploy the Cluster

[root@lincloud259 ~]# cd /data/tidb_cluster
[root@lincloud259 tidb_cluster]# vi topo.yaml
# # Global variables are applied to all deployments and used as the default value of
# # the deployments if a specific deployment value is missing.
global:
 user: "tidb"
 ssh_port: 22
 deploy_dir: "/data/tidb_cluster/tidb-deploy/jacky"
 data_dir: "/data/tidb_cluster/tidb-data/jacky"

# # Monitored variables are applied to all the machines.
monitored:
 node_exporter_port: 9100
 blackbox_exporter_port: 9115

server_configs:
 tidb:
   log.slow-threshold: 300
 tikv:
   readpool.storage.use-unified-pool: false
   readpool.coprocessor.use-unified-pool: true
 pd:
   replication.enable-placement-rules: true
 tiflash:
   logger.level: "info"

pd_servers:
 - host: 10.23.199.10

tidb_servers:
 - host: 10.23.199.10
   port: 4000
   status_port: 10080

 - host: 10.23.199.10
   port: 4001
   status_port: 10081

 - host: 10.23.199.10
   port: 4002
   status_port: 10082

tikv_servers:
 - host: 10.23.199.10
   port: 20160
   status_port: 20180

 - host: 10.23.199.10
   port: 20161
   status_port: 20181

 - host: 10.23.199.10
   port: 20162
   status_port: 20182

tiflash_servers:
 - host: 10.23.199.10

monitoring_servers:
 - host: 10.23.199.10

grafana_servers:
 - host: 10.23.199.10

Note:

  • Because all cluster components are deployed on the same ECS instance, do not configure NUMA binding for the TiDB servers.
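
With the topology file in place, the cluster is deployed and started. The commands below are a sketch; the cluster name jacky and version v4.0.1 are taken from the tiup cluster display jacky output shown later in this article, so adjust them to your environment:

[root@lincloud259 tidb_cluster]# tiup cluster deploy jacky v4.0.1 ./topo.yaml --user root
[root@lincloud259 tidb_cluster]# tiup cluster start jacky
[root@lincloud259 tidb_cluster]# tiup cluster display jacky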

3.2 Load Sample Data

Configure the ECS security group to allow public access to port 4000, then download and load the sample database files on a machine with the MySQL client installed.

[mysql@lincloud259 tidb_tmp]$ curl -L --remote-name-all https://s3.amazonaws.com/capitalbikeshare-data/2017-capitalbikeshare-tripdata.zip

[mysql@lincloud259 tidb_tmp]$ unzip \*-tripdata.zip

[mysql@lincloud259 tidb_tmp]$ mysql -h10.24.200.10 -P4000 -uroot

(root@bj4) [(none)]> CREATE DATABASE bikeshare;
Query OK, 0 rows affected (0.11 sec)

(root@bj4) [(none)]> USE bikeshare;
Database changed
(root@bj4) [bikeshare]> CREATE TABLE trips (
        trip_id bigint NOT NULL PRIMARY KEY AUTO_INCREMENT,
        duration integer not null,
        start_date datetime,
        end_date datetime,
        start_station_number integer,
        start_station varchar(255),
        end_station_number integer,
        end_station varchar(255),
        bike_number varchar(255),
        member_type varchar(255)
       );
Query OK, 0 rows affected (0.15 sec)

(root@bj4) [bikeshare]> LOAD DATA LOCAL INFILE '2017Q1-capitalbikeshare-tripdata.csv' INTO TABLE trips
         FIELDS TERMINATED BY ',' ENCLOSED BY '"'
         LINES TERMINATED BY '\r\n'
         IGNORE 1 LINES
       (duration, start_date, end_date, start_station_number, start_station,
       end_station_number, end_station, bike_number, member_type);
Query OK, 646510 rows affected (25.24 sec)
Records: 646510  Deleted: 0  Skipped: 0  Warnings: 0
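
A quick row-count check right after the load confirms the table contents match the rows-affected figure above:

(root@bj4) [bikeshare]> SELECT COUNT(*) FROM trips;   -- expected: 646510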

3.3 Add Pump

The Pump servers are responsible for recording the binlog, so do not enable the binlog option on the TiDB servers before Pump servers have been added to the existing cluster.

[Figure: scaling out the Pump servers]
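
The Pump nodes are added with a scale-out operation. The snippet below is a minimal sketch, assuming the three Pump ports (8250-8252) shown in the tiup cluster display output later in this section; the file name scale-out-pump.yaml is illustrative, and the data/deploy directories are inherited from the global settings in topo.yaml:

[root@lincloud259 tidb_cluster]# vi scale-out-pump.yaml
pump_servers:
  - host: 10.23.199.10
    port: 8250
  - host: 10.23.199.10
    port: 8251
  - host: 10.23.199.10
    port: 8252

[root@lincloud259 tidb_cluster]# tiup cluster scale-out jacky scale-out-pump.yaml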

After the Pump servers have been scaled out, enable binlog.

[root@lincloud259 tidb_cluster]# tiup cluster edit-config jacky
server_configs:
  tidb:
    binlog.enable: true
    binlog.ignore-error: true

Please check change highlight above, do you want to apply the change? [y/N]: y
Apply the change...
Apply change successfully, please use `tiup cluster reload jacky [-N ] [-R ]` to reload config.

[root@lincloud259 tidb_cluster]# su - tidb
[tidb@lincloud259 ~]$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/tidb/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/tidb/.ssh/id_rsa.
Your public key has been saved in /home/tidb/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:x/piOeorfeNWt8gEHUDWGR/mKg/+Z5/B9mO5t43xyg4 tidb@iZ2zefd13t0f76cgvidpzlZ
The key's randomart image is:
+---[RSA 2048]----+
|       .+o.oo    |
|       .  ++ .   |
|         . .o    |
|        ....     |
|        S.+      |
|       . *o ..   |
|     .  o=.oE.= .|
|    . . Ooo ++ @o|
|     o+B.+oo oO+O|
+----[SHA256]-----+
[tidb@lincloud259 ~]$ cd .ssh
[tidb@lincloud259 .ssh]$ cat id_rsa.pub  >> authorized_keys
[tidb@lincloud259 .ssh]$ ssh 10.23.199.10
[tidb@lincloud259 .ssh]$ ssh 10.26.33.96
The authenticity of host '10.26.33.96 (10.26.33.96)' can't be established.
ECDSA key fingerprint is SHA256:CrPbE+hWNud80v12LVq1t9KAWSx09fC7xw0+CTJEi9Y.
ECDSA key fingerprint is MD5:ae:30:87:d0:35:c0:9e:20:f2:77:2c:0c:55:f7:d9:11.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '10.26.33.96' (ECDSA) to the list of known hosts.

Welcome to Alibaba Cloud Elastic Compute Service !

[tidb@lincloud259 ~]$ exit

[tidb@lincloud259 .ssh]$ exit

[root@lincloud259 tidb_cluster]# tiup cluster reload jacky

[root@lincloud259 tidb_cluster]# tiup cluster display jacky
Starting component `cluster`: /root/.tiup/components/cluster/v1.0.6/tiup-cluster display jacky
TiDB Cluster: jacky
TiDB Version: v4.0.1
ID                  Role        Host          Ports                            OS/Arch       Status   Data Dir                                            Deploy Dir
--                  ----        ----          -----                            -------       ------   --------                                            ----------
10.23.199.10:3000   grafana     10.23.199.10  3000                             linux/x86_64  Up       -                                                   /data/tidb_cluster/tidb-deploy/jacky/grafana-3000
10.23.199.10:2379   pd          10.23.199.10  2379/2380                        linux/x86_64  Up|L|UI  /data/tidb_cluster/tidb-data/jacky/pd-2379          /data/tidb_cluster/tidb-deploy/jacky/pd-2379
10.23.199.10:9090   prometheus  10.23.199.10  9090                             linux/x86_64  Up       /data/tidb_cluster/tidb-data/jacky/prometheus-9090  /data/tidb_cluster/tidb-deploy/jacky/prometheus-9090
10.23.199.10:8250   pump        10.23.199.10  8250                             linux/x86_64  Up       /data/tidb_cluster/tidb-data/jacky/pump-8250        /data/tidb_cluster/tidb-deploy/jacky/pump-8250
10.23.199.10:8251   pump        10.23.199.10  8251                             linux/x86_64  Up       /data/tidb_cluster/tidb-data/jacky/pump-8251        /data/tidb_cluster/tidb-deploy/jacky/pump-8251
10.23.199.10:8252   pump        10.23.199.10  8252                             linux/x86_64  Up       /data/tidb_cluster/tidb-data/jacky/pump-8252        /data/tidb_cluster/tidb-deploy/jacky/pump-8252
10.23.199.10:4000   tidb        10.23.199.10  4000/10080                       linux/x86_64  Up       -                                                   /data/tidb_cluster/tidb-deploy/jacky/tidb-4000
10.23.199.10:4001   tidb        10.23.199.10  4001/10081                       linux/x86_64  Up       -                                                   /data/tidb_cluster/tidb-deploy/jacky/tidb-4001
10.23.199.10:4002   tidb        10.23.199.10  4002/10082                       linux/x86_64  Up       -                                                   /data/tidb_cluster/tidb-deploy/jacky/tidb-4002
10.23.199.10:9000   tiflash     10.23.199.10  9000/8123/3930/20170/20292/8234  linux/x86_64  Up       /data/tidb_cluster/tidb-data/jacky/tiflash-9000     /data/tidb_cluster/tidb-deploy/jacky/tiflash-9000
10.23.199.10:20160  tikv        10.23.199.10  20160/20180                      linux/x86_64  Up       /data/tidb_cluster/tidb-data/jacky/tikv-20160       /data/tidb_cluster/tidb-deploy/jacky/tikv-20160
10.23.199.10:20161  tikv        10.23.199.10  20161/20181                      linux/x86_64  Up       /data/tidb_cluster/tidb-data/jacky/tikv-20161       /data/tidb_cluster/tidb-deploy/jacky/tikv-20161
10.23.199.10:20162  tikv        10.23.199.10  20162/20182                      linux/x86_64  Up       /data/tidb_cluster/tidb-data/jacky/tikv-20162       /data/tidb_cluster/tidb-deploy/jacky/tikv-20162
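
The Pump nodes can also be checked from the SQL side with SHOW PUMP STATUS, run from any MySQL client connected to the cluster; each pump should report an online state:

(root@bj4) [(none)]> SHOW PUMP STATUS;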

3.4 Full Backup

3.4.1 Install the Backup Tools

[root@lincloud259 local]# mkdir -m 777 /data/soft/tidb
[root@lincloud259 local]# cd /data/soft/tidb
[root@lincloud259 tidb]# wget -c 'https://download.pingcap.org/tidb-enterprise-tools-nightly-linux-amd64.tar.gz' -O tidb-enterprise-tools-nightly-linux-amd64.tar.gz
[root@lincloud259 tidb]# cp -rp tidb-enterprise-tools-nightly-linux-amd64.tar.gz /usr/local/
[root@lincloud259 tidb]# cd /usr/local/
[root@lincloud259 local]# tar zxpf tidb-enterprise-tools-nightly-linux-amd64.tar.gz

[root@lincloud259 local]# rm -f tidb-enterprise-tools-nightly-linux-amd64.tar.gz
[root@lincloud259 local]# ln -s /usr/local/tidb-enterprise-tools-nightly-linux-amd64/ /usr/local/tidb_tools
[root@lincloud259 local]# ls
aegis  bin  etc  games  include  lib  lib64  libexec  sbin  share  src  tidb-enterprise-tools-nightly-linux-amd64  tidb_tools
[root@lincloud259 bin]# vi /etc/profile
# Add for /usr/local utilities
PATH=$PATH:/usr/local/tidb_tools/bin
export PATH

[root@lincloud259 bin]# . /etc/profile
[root@lincloud259 bin]# which mydumper
/usr/local/tidb_tools/bin/mydumper

3.4.2 Back Up the TiDB Cluster

[root@lincloud259 bin]# cd /data/tidb_cluster/
[root@lincloud259 tidb_cluster]# mkdir -p tidb_backup/jacky
[root@lincloud259 tidb_cluster]# cd tidb_backup/jacky/
[root@lincloud259 jacky]# mkdir `date '+%d-%m-%Y_%H_%M_%S'`
[root@lincloud259 jacky]# cd 22-03-2024_14_46_23
[root@lincloud259 22-03-2024_14_46_23]# mydumper -h 127.0.0.1 -P 4000 -u root -t 32 -F 256 -B bikeshare --skip-tz-utc -o ./
[root@lincloud259 22-03-2024_14_46_23]# ls
bikeshare-schema-create.sql  bikeshare.trips-schema.sql  bikeshare.trips.sql  metadata

[root@lincloud259 22-03-2024_14_46_23]# cat metadata
Started dump at: 2024-03-22 14:47:02
SHOW MASTER STATUS:
	Log: tidb-binlog
	Pos: 417545171224756227
	GTID:

Finished dump at: 2024-03-22 14:47:05

3.4.3 Generate Binlog

[mysql@lincloud259 tidb_tmp]$ ls
2017-capitalbikeshare-tripdata.zip  2017Q1-capitalbikeshare-tripdata.csv  2017Q2-capitalbikeshare-tripdata.csv  2017Q3-capitalbikeshare-tripdata.csv  2017Q4-capitalbikeshare-tripdata.csv
[mysql@lincloud259 tidb_tmp]$ mysql -h10.24.200.10 -P4000 -uroot
(root@bj4) [(none)]> use bikeshare
(root@bj4) [bikeshare]> load data local infile '2017Q2-capitalbikeshare-tripdata.csv' into table trips FIELDS TERMINATED BY ',' ENCLOSED BY '"'   LINES TERMINATED BY '\r\n'   IGNORE 1 LINES (duration, start_date, end_date, start_station_number, start_station, end_station_number, end_station, bike_number, member_type);
Query OK, 1104418 rows affected (44.30 sec)
Records: 1104418  Deleted: 0  Skipped: 0  Warnings: 0
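
Because this Q2 load happens after the full backup, its 1,104,418 rows exist only in the binlog captured by Pump; the Drainer added later will replicate them to the self-hosted MySQL instance. The row count on TiDB is now:

(root@bj4) [bikeshare]> SELECT COUNT(*) FROM trips;   -- expected: 1750928 (646510 + 1104418)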

3.5 Import Full Data

Because the TiDB cluster already existed and held data before Pump was added, the full data must be imported into the self-hosted MySQL instance first.

3.5.1 Create a Self-Hosted MySQL Instance on ECS

TiDB is highly compatible with MySQL 5.7 syntax, so MySQL 5.7 is used for the self-hosted instance.
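
This compatibility is also visible in the version string TiDB reports to MySQL clients (the exact string depends on the TiDB release):

(root@bj4) [(none)]> SELECT VERSION();   -- returns something like 5.7.25-TiDB-v4.0.1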

[root@lincloud259 22-03-2024_14_46_23]# cd /data/soft
[root@lincloud259 soft]# mkdir mysql && cd mysql
[root@lincloud259 mysql]# ls
mysql-5.7.24-linux-glibc2.12-x86_64.tar.gz
[root@lincloud259 mysql]# cp -rp mysql-5.7.24-linux-glibc2.12-x86_64.tar.gz /usr/local/
[root@lincloud259 mysql]# cd /usr/local
[root@lincloud259 local]# tar zxpf mysql-5.7.24-linux-glibc2.12-x86_64.tar.gz
[root@lincloud259 local]# ln -s /usr/local/mysql-5.7.24-linux-glibc2.12-x86_64/ /usr/local/mysql
[root@lincloud259 local]# vi /etc/profile
# Add for /usr/local utilities
PATH=$PATH:/usr/local/tidb_tools/bin:/usr/local/mysql/bin
export PATH
[root@lincloud259 local]# . /etc/profile
[root@lincloud259 local]# which mysql
/usr/local/mysql/bin/mysql

[root@lincloud259 local]# groupadd -g 500 mysql
[root@lincloud259 local]# useradd -u 500 -g 500 -G disk -d /home/mysql -s /bin/bash -m mysql
[root@lincloud259 local]# passwd mysql

[root@lincloud259 local]# cd /data
[root@lincloud259 data]# mkdir -p mysql_data/jacky
[root@lincloud259 data]# chown -R mysql:mysql mysql_data/

[root@lincloud259 data]# su - mysql
[mysql@lincloud259 ~]$ cd /data/mysql_data/
[mysql@lincloud259 mysql_data]$ vi my.cnf

# The MySQL server
[mysqld]
# Basic parameters
user	= mysql
port	= 3701
character_set_server			= utf8mb4
skip-character-set-client-handshake	= 1
basedir					= /usr/local/mysql
datadir					= /data/mysql_data/jacky
pid-file				= /data/mysql_data/jacky/mysql.pid
socket					= /data/mysql_data/jacky/mysql.sock
lower_case_table_names			= 1

# Logging
log_error				= error.log
log_queries_not_using_indexes		= 0
long_query_time				= 1
slow_query_log				= 1
slow_query_log_file			= mysql-slow.log

# Binary Logging
log_bin					= binlog
binlog_format				= row
binlog_row_image			= FULL
expire_logs_days			= 0
sync_binlog				= 1

# Replication
server_id				= 1819034970
master-info-repository			= file
relay-log-info_repository		= file
gtid-mode				= on
enforce-gtid-consistency		= true

# InnoDB
innodb_data_file_path=ibdata1:200M:autoextend
innodb_buffer_pool_size = 1G
innodb_file_per_table = ON
innodb_flush_log_at_trx_commit = 1
innodb_flush_method = O_DIRECT
innodb_log_files_in_group=2
innodb_log_file_size=1048576000
innodb_io_capacity = 1000
innodb_io_capacity_max = 2000
innodb_large_prefix = ON
innodb_undo_directory=./

# Caches & Limits
max_connections = 500
max_allowed_packet = 1048576000

[root@lincloud259 ~]# mysqld --defaults-file=/data/mysql_data/my.cnf --initialize

[root@lincloud259 ~]# cp -rp /data/mysql_data/my.cnf /data/mysql_data/jacky/my.cnf

[root@lincloud259 ~]# nohup mysqld_safe --defaults-file=/data/mysql_data/jacky/my.cnf &
[root@lincloud259 ~]# netstat -nlt | grep 3701
tcp6       0      0 :::3701                 :::*                    LISTEN

[root@lincloud259 ~]# su - mysql

[mysql@lincloud259 ~]$ cd /data/mysql_data/jacky/

[mysql@lincloud259 jacky]$ grep -i root ./error.log
2024-03-22T08:38:07.961728Z 1 [Note] A temporary password is generated for root@localhost: ZTy-48D,/r0k

[mysql@lincloud259 jacky]$ mysql -hlocalhost -uroot -p'ZTy-48D,/r0k' -S ./mysql.sock

mysql> set password=password('xxx');
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> grant all privileges on *.* to 'root'@'%' identified by 'xxx' with grant option;
Query OK, 0 rows affected, 1 warning (0.00 sec)

## In the ECS console, update the security group to allow public access on port 3701, then verify that remote access works

[mysql@lincloud259 tidb_tmp]$ mysql -h10.24.200.10 -uroot -pxxx -P3701
(root@bj4) [(none)]> quit

3.5.2 Import the Backup

[root@lincloud259 22-03-2024_14_46_23]# loader -L info -P 3701 -checkpoint-schema="tidb_loader" -d ./ -h 10.24.200.10 -t 32 -u root -p 'xxx'
2024/03/22 21:20:38 printer.go:52: [info] Welcome to loader
2024/03/22 21:20:38 printer.go:53: [info] Release Version: v1.0.0-79-gf505ab3
2024/03/22 21:20:38 printer.go:54: [info] Git Commit Hash: f505ab3ce55cd9cbb29e2346317164055a1b1c15
2024/03/22 21:20:38 printer.go:55: [info] Git Branch: master
2024/03/22 21:20:38 printer.go:56: [info] UTC Build Time: 2024-03-23 02:40:18
2024/03/22 21:20:38 printer.go:57: [info] Go Version: go version go1.13 linux/amd64
2024/03/22 21:20:38 main.go:51: [info] config: {"log-level":"info","log-file":"","status-addr":":8272","pool-size":32,"dir":"./","db":{"host":"10.24.200.10","user":"root","port":3701,"sql-mode":"@DownstreamDefault","max-allowed-packet":67108864},"checkpoint-schema":"tidb_loader","config-file":"","route-rules":null,"do-table":null,"do-db":null,"ignore-table":null,"ignore-db":null,"rm-checkpoint":false}
2024/03/22 21:20:38 loader.go:532: [info] [loader] prepare takes 0.000142 seconds
2024/03/22 21:20:38 checkpoint.go:207: [info] calc checkpoint finished. finished tables (map[])
2024/03/22 21:20:38 loader.go:715: [info] [loader][run db schema].//bikeshare-schema-create.sql[start]
2024/03/22 21:20:38 loader.go:720: [info] [loader][run db schema].//bikeshare-schema-create.sql[finished]
2024/03/22 21:20:38 loader.go:736: [info] [loader][run table schema].//bikeshare.trips-schema.sql[start]
2024/03/22 21:20:38 loader.go:741: [info] [loader][run table schema].//bikeshare.trips-schema.sql[finished]
2024/03/22 21:20:38 loader.go:773: [info] [loader] create tables takes 0.037831 seconds
2024/03/22 21:20:38 loader.go:788: [info] [loader] all data files have been dispatched, waiting for them finished
2024/03/22 21:20:38 loader.go:158: [info] [loader][restore table data sql].//bikeshare.trips.sql[start]
2024/03/22 21:20:38 loader.go:216: [info] data file bikeshare.trips.sql scanned finished.
2024/03/22 21:20:43 status.go:32: [info] [loader] finished_bytes = 32997729, total_bytes = GetAllRestoringFiles91120404, progress = 36.21 %
2024/03/22 21:20:48 status.go:32: [info] [loader] finished_bytes = 68995049, total_bytes = GetAllRestoringFiles91120404, progress = 75.72 %
2024/03/22 21:20:51 loader.go:165: [info] [loader][restore table data sql].//bikeshare.trips.sql[finished]
2024/03/22 21:20:51 loader.go:791: [info] [loader] all data files has been finished, takes 13.170611 seconds
2024/03/22 21:20:51 status.go:32: [info] [loader] finished_bytes = 91120404, total_bytes = GetAllRestoringFiles91120404, progress = 100.00 %
2024/03/22 21:20:51 main.go:88: [info] loader stopped and exits

[root@iZ2ze5ffbqqbeatcv1j01yZ 22-03-2024_14_46_23]# mysql -h10.24.200.10 -P3701 -uroot -p'xxx' bikeshare
mysql> show tables;
+---------------------+
| Tables_in_bikeshare |
+---------------------+
| trips               |
+---------------------+
1 row in set (0.00 sec)

mysql> select count(*) from trips;
+----------+
| count(*) |
+----------+
|   646510 |
+----------+
1 row in set (0.15 sec)
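
loader records its progress in the schema given by -checkpoint-schema (tidb_loader here) on the target instance. Once the import has been verified and no further loader runs are planned, that schema can be dropped as optional cleanup:

mysql> DROP DATABASE tidb_loader;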

3.6 Add Drainer

After the full backup has been imported, add a Drainer node to the cluster to replicate incremental data to the self-hosted MySQL instance.

| # | Service | Count | Host | Port | Data Dir | Deploy Dir | Commit_TS |
|---|---------|-------|------|------|----------|------------|-----------|
| 1 | Drainer | 1 | 10.23.199.10 | 8249 | /data/tidb_cluster/tidb-data/jacky/drainer-8249 | /data/tidb_cluster/tidb-deploy/jacky/drainer-8249 | The Pos value from the full backup's metadata file |
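
The Drainer is added with another scale-out. The snippet below is a sketch that uses the host, port, and directories from the table above; commit_ts is the Pos value (417545171224756227) recorded in the metadata file of the full backup, and the downstream connection points at the self-hosted MySQL instance. The file name scale-out-drainer.yaml and the password placeholder xxx are illustrative:

[root@lincloud259 tidb_cluster]# vi scale-out-drainer.yaml
drainer_servers:
  - host: 10.23.199.10
    port: 8249
    # start replicating from the TSO recorded as Pos in the full backup's metadata
    commit_ts: 417545171224756227
    config:
      syncer.db-type: "mysql"
      syncer.to.host: "10.24.200.10"
      syncer.to.port: 3701
      syncer.to.user: "root"
      syncer.to.password: "xxx"

[root@lincloud259 tidb_cluster]# tiup cluster scale-out jacky scale-out-drainer.yaml
[root@lincloud259 tidb_cluster]# tiup cluster display jacky

After the scale-out, SHOW DRAINER STATUS from a MySQL client connected to TiDB should show the Drainer online, with its maxCommitTS advancing as new transactions are written.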

3.7 Verify Data

Load the 2017 Q3 data file into TiDB, and verify on the self-hosted MySQL instance that the data is replicated.

3.7.1 Load Q3 Data into TiDB

[mysql@lincloud259 tidb_tmp]$ mysql -h10.24.200.10 -uroot -P4000

(root@bj4) [(none)]> use bikeshare
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
(root@bj4) [bikeshare]> select count(*) from trips;
+----------+
| count(*) |
+----------+
|  1750928 |
+----------+
1 row in set (0.40 sec)

(root@bj4) [bikeshare]> load data local infile '2017Q3-capitalbikeshare-tripdata.csv' into table trips FIELDS TERMINATED BY ',' ENCLOSED BY '"'   LINES TERMINATED BY '\r\n'   IGNORE 1 LINES (duration, start_date, end_date, start_station_number, start_station, end_station_number, end_station, bike_number, member_type);
Query OK, 1191585 rows affected (54.81 sec)
Records: 1191585  Deleted: 0  Skipped: 0  Warnings: 0

(root@bj4) [bikeshare]> select count(*) from trips;
+----------+
| count(*) |
+----------+
|  2942513 |
+----------+
1 row in set (0.64 sec)

3.7.2 Verify on the Self-Hosted MySQL Instance

[root@lincloud259 tidb_cluster]# mysql -h10.24.200.10 -P3701 -uroot -p'xxx' bikeshare

mysql> select count(*) from trips;
+----------+
| count(*) |
+----------+
|  2942513 |
+----------+
1 row in set (0.71 sec)

mysql>

This completes the data verification.
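
Row counts are only a coarse consistency check. Sync-diff-inspector (see the glossary) can compare the two sides chunk by chunk. Below is a minimal configuration sketch, assuming the legacy TOML format shipped with the tidb-enterprise-tools bundle installed in section 3.4.1; field names can differ between releases, so check the bundled documentation before running it. The file name diff-config.toml and the password placeholder xxx are illustrative:

[root@lincloud259 tidb_cluster]# vi diff-config.toml
# compare bikeshare.trips between the TiDB cluster (source) and the self-hosted MySQL instance (target)
log-level = "info"
chunk-size = 1000
check-thread-count = 4
use-checksum = true
fix-sql-file = "fix.sql"

[[check-tables]]
schema = "bikeshare"
tables = ["trips"]

[[source-db]]
host = "10.23.199.10"
port = 4000
user = "root"
password = ""
instance-id = "tidb-source"

[target-db]
host = "10.24.200.10"
port = 3701
user = "root"
password = "xxx"

[root@lincloud259 tidb_cluster]# sync_diff_inspector --config=./diff-config.toml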
