About the author: Zhang Jie, big data development engineer.
I. Exercise Background
By working through the process of migrating data from MySQL to OceanBase, get familiar with the various methods and tools for migrating business table structures and data into an OceanBase database.
II. Exercise Content
1. Use mysqldump to migrate MySQL table structures and data into an OceanBase MySQL tenant.
2. Use DataX to configure offline synchronization of at least one table from MySQL to an OceanBase MySQL tenant.
3. Use DataX to configure offline synchronization of at least one table from OceanBase to CSV and from CSV back to OceanBase.
4. Use Canal to configure incremental synchronization from MySQL to an OceanBase MySQL tenant.
III. Data Migration with mysqldump
1. Environment preparation:
· Install tpcc-mysql
wget http://imysql.com/wp-content/uploads/2014/09/tpcc-mysql-src.tgz
gunzip tpcc-mysql-src.tgz
tar xf tpcc-mysql-src.tar
cd tpcc-mysql/src
make
· Load the test data into the tpcc database in MySQL
[root@ob1 tpcc-mysql]# ./tpcc_load 172.20.10.10:3306 tpcc root 123456 1
· View the test business tables
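A quick sanity check (a sketch; tpcc-mysql creates the standard TPC-C tables such as warehouse, district, customer, orders, order_line, item and stock):
mysql -h 127.1 -uroot -P3306 -p123456 tpcc -e "SHOW TABLES;"
mysql -h 127.1 -uroot -P3306 -p123456 tpcc -e "SELECT COUNT(*) FROM warehouse;"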
2. Export the business table structures with mysqldump
mysqldump -h 127.1 -uroot -P3306 -p123456 -d tpcc --compact > tpcc_ddl.sql
3. Export the business table data with mysqldump
mysqldump -h 127.1 -uroot -P3306 -p123456 -t tpcc > tpcc_data.sql
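Before loading, it is worth confirming that both dump files were produced (a minimal check using the file names from the commands above):
ls -lh tpcc_ddl.sql tpcc_data.sql
grep -c "INSERT INTO" tpcc_data.sql   # rough count of INSERT lines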
4. Load the table structures into OceanBase
mysql -h 172.20.10.10 -u root@sys#obdemo -P 2883 -p123456 -c -A tpcc
source tpcc_ddl.sql
5. Load the table data into OceanBase
· In the same mysql session, run: source tpcc_data.sql
· Verify that the data has been fully loaded
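One way to verify is to compare row counts on both sides for a table (a sketch, reusing the connection parameters from the steps above):
mysql -h 127.1 -uroot -P3306 -p123456 tpcc -e "SELECT COUNT(*) FROM customer;"
mysql -h 172.20.10.10 -u root@sys#obdemo -P 2883 -p123456 tpcc -e "SELECT COUNT(*) FROM customer;"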
IV. Data Migration with DataX
1. Environment preparation
wget http://datax-opensource.oss-cn-hangzhou.aliyuncs.com/datax.tar.gz
tar -zxvf datax.tar.gz
2. Delete the hidden files under the plugin directories (they can cause plugin-loading errors)
cd /usr/local/soft/datax/datax/plugin/reader && rm -rf .*
cd /usr/local/soft/datax/datax/plugin/writer && rm -rf .*
3. Generate the configuration template for MySQL-to-OceanBase synchronization
python ./bin/datax.py -r mysqlreader -w oceanbasev10writer > job/mysql2ob.json
4. Edit the configuration file
vim job/mysql2ob.json

{
    "job": {
        "setting": {
            "errorLimit": {
                "record": 0,
                "percentage": 0.1
            }
        },
        "content": [
            {
                "reader": {
                    "name": "mysqlreader",
                    "parameter": {
                        "username": "root",
                        "password": "123456",
                        "column": ["*"],
                        "connection": [
                            {
                                "table": ["student"],
                                "jdbcUrl": ["jdbc:mysql://172.20.10.10:3306/test?useUnicode=true&characterEncoding=utf8"]
                            }
                        ]
                    }
                },
                "writer": {
                    "name": "oceanbasev10writer",
                    "parameter": {
                        "obWriteMode": "insert",
                        "column": ["*"],
                        "preSql": ["truncate table student"],
                        "connection": [
                            {
                                "jdbcUrl": "||_dsc_ob10_dsc_||obdemo:sys||_dsc_ob10_dsc_||jdbc:oceanbase://172.20.10.10:2883/test?useLocalSessionState=true&allowBatch=true&allowMultiQueries=true&rewriteBatchedStatements=true",
                                "table": ["student"]
                            }
                        ],
                        "username": "root",
                        "password": "123456",
                        "writerThreadCount": 10,
                        "batchSize": 1000,
                        "memstoreThreshold": "0.9"
                    }
                }
            }
        ]
    }
}
5. Run the synchronization job
· Start the synchronization job
python ./bin/datax.py ./job/mysql2ob.json
· The synchronization completes
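DataX prints a job summary (records read, failures, elapsed time) when it finishes. As an extra spot check on the target side (a sketch, reusing the OceanBase connection from the writer configuration):
mysql -h 172.20.10.10 -u root@sys#obdemo -P 2883 -p123456 test -e "SELECT COUNT(*) FROM student;"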
V. Export and Import between OceanBase and CSV with DataX
1. Export data from OceanBase to CSV
· Generate the DataX JSON configuration template
python bin/datax.py -r oceanbasev10reader -w txtfilewriter > job/oceanbase2csv.json
· Adjust permissions on the json file
chmod 777 job/oceanbase2csv.json
· Edit the json configuration file
vim job/oceanbase2csv.json

{
    "job": {
        "setting": {
            "speed": {
                "channel": 4
            },
            "errorLimit": {
                "record": 0,
                "percentage": 0.1
            }
        },
        "content": [
            {
                "reader": {
                    "name": "oceanbasev10reader",
                    "parameter": {
                        "column": ["stu_no", "stu_name", "stu_addr", "stu_class"],
                        "connection": [
                            {
                                "jdbcUrl": ["||_dsc_ob10_dsc_||obdemo:sys||_dsc_ob10_dsc_||jdbc:oceanbase://172.20.10.10:2883/test?useLocalSessionState=true&allowBatch=true&allowMultiQueries=true&rewriteBatchedStatements=true"],
                                "table": ["student"]
                            }
                        ],
                        "password": "123456",
                        "username": "root",
                        "where": ""
                    }
                },
                "writer": {
                    "name": "txtfilewriter",
                    "parameter": {
                        "dateFormat": "yyyy-MM-dd hh:mm:ss",
                        "fieldDelimiter": ",",
                        "fileName": "ob2csv",
                        "path": "/usr/local/soft/mysql_dump",
                        "writeMode": "truncate"
                    }
                }
            }
        ]
    }
}
· Run the job
python bin/datax.py job/oceanbase2csv.json
· Check the result
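The exported file is written to the path configured for txtfilewriter; a quick look (a sketch; txtfilewriter appends a random suffix to the configured fileName):
ls /usr/local/soft/mysql_dump/
head /usr/local/soft/mysql_dump/ob2csv*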
2. Import data from CSV into OceanBase
· Generate the DataX JSON configuration template
python bin/datax.py -w oceanbasev10writer -r txtfilereader > job/csv2oceanbase.json
· Adjust permissions on the json file
chmod 777 job/csv2oceanbase.json
· Edit the json configuration file
vim job/csv2oceanbase.json

{
    "job": {
        "setting": {
            "speed": {
                "channel": 4
            },
            "errorLimit": {
                "record": 0,
                "percentage": 0.1
            }
        },
        "content": [
            {
                "reader": {
                    "name": "txtfilereader",
                    "parameter": {
                        "path": ["/usr/local/soft/mysql_dump"],
                        "fileName": "ob2csv",
                        "encoding": "UTF-8",
                        "column": ["*"],
                        "dateFormat": "yyyy-MM-dd hh:mm:ss",
                        "nullFormat": "\\N",
                        "fieldDelimiter": ","
                    }
                },
                "writer": {
                    "name": "oceanbasev10writer",
                    "parameter": {
                        "obWriteMode": "insert",
                        "column": ["stu_no", "stu_name", "stu_addr", "stu_class"],
                        "preSql": ["truncate table student"],
                        "connection": [
                            {
                                "jdbcUrl": "||_dsc_ob10_dsc_||obdemo:sys||_dsc_ob10_dsc_||jdbc:oceanbase://172.20.10.10:2883/test?useLocalSessionState=true&allowBatch=true&allowMultiQueries=true&rewriteBatchedStatements=true",
                                "table": ["student"]
                            }
                        ],
                        "username": "root",
                        "password": "123456",
                        "writerThreadCount": 10,
                        "batchSize": 100,
                        "memstoreThreshold": "0.9"
                    }
                }
            }
        ]
    }
}
· Run the job
python bin/datax.py job/csv2oceanbase.json
· Check the result (a verification sketch follows the note below)
· Note:
path: the directory holding the file to be read. The directory must not contain other files, or the job will fail with an error.
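A verification sketch for the import, reusing the OceanBase connection from the writer configuration:
mysql -h 172.20.10.10 -u root@sys#obdemo -P 2883 -p123456 test -e "SELECT COUNT(*) FROM student; SELECT * FROM student LIMIT 5;"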
VI. Incremental Synchronization from MySQL to OceanBase with Canal
1. MySQL environment preparation
· Enable binlog
[mysqld]
log-bin=mysql-bin    # enable binlog
binlog-format=ROW    # use ROW format
server_id=1          # required for MySQL replication; must not clash with canal's slaveId
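Changes to my.cnf take effect only after a MySQL restart; a minimal check that binlog is really on:
mysql -uroot -P3306 -p123456 -e "SHOW VARIABLES LIKE 'log_bin'; SHOW VARIABLES LIKE 'binlog_format';"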
· Grant account privileges
Here I use the root account; alternatively, create a dedicated account and grant it the required privileges.
CREATE USER canal IDENTIFIED BY 'canal';
GRANT SELECT, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'canal'@'%';
-- GRANT ALL PRIVILEGES ON *.* TO 'canal'@'%';
FLUSH PRIVILEGES;
2. Deploy canal-admin
· Download and extract canal-admin
wget https://github.com/alibaba/canal/releases/download/canal-1.1.5/canal.admin-1.1.5.tar.gz
tar -zxvf canal.admin-1.1.5.tar.gz
· Edit the configuration file
cd canal-admin/conf
vim application.yml
· Initialize the metadata database
mysql -uroot -P3306 -p123456
source conf/canal_manager.sql
· Start canal-admin
cd canal/canal-admin/bin/ && ./startup.sh
· Access the canal-admin web console
http://127.0.0.1:8080/
Username: admin
Password: 123456
3. Deploy canal-deployer
· Download and extract canal-deployer
wget https://github.com/alibaba/canal/releases/download/canal-1.1.5/canal.deployer-1.1.5.tar.gz
tar -zxvf canal.deployer-1.1.5.tar.gz
· Edit the configuration file
When canal-admin manages the server and instances, conf/canal_local.properties replaces conf/canal.properties. Only the manager address in conf/canal.properties needs to be changed; the other parameters can keep their defaults.
mv canal_local.properties canal.properties && vim canal.properties
· Start the canal-deployer service
cd ~/canal/canal-deployer/bin/ && ./startup.sh
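To confirm the deployer started and registered with canal-admin, check its log (a sketch; logs/canal/canal.log is the default canal-deployer log location):
tail -f ~/canal/canal-deployer/logs/canal/canal.log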
4. Configure the canal server and canal instance through the canal-admin web console
· Create the canal server and start it
· Create the canal instance and start it
· Load the instance template and adjust the relevant parameters
#################################################
## mysql serverId , v1.0.26+ will autoGen
canal.instance.mysql.slaveId=1000

# enable gtid use true/false
canal.instance.gtidon=false

# position info
canal.instance.master.address=172.20.10.10:3306
canal.instance.master.journal.name=
canal.instance.master.position=
canal.instance.master.timestamp=
canal.instance.master.gtid=

# rds oss binlog
canal.instance.rds.accesskey=
canal.instance.rds.secretkey=
canal.instance.rds.instanceId=

# table meta tsdb info
canal.instance.tsdb.enable=true
#canal.instance.tsdb.url=jdbc:mysql://127.0.0.1:3306/canal_tsdb
#canal.instance.tsdb.dbUsername=canal
#canal.instance.tsdb.dbPassword=canal

#canal.instance.standby.address =
#canal.instance.standby.journal.name =
#canal.instance.standby.position =
#canal.instance.standby.timestamp =
#canal.instance.standby.gtid=

# username/password
canal.instance.dbUsername=canal
canal.instance.dbPassword=canal
canal.instance.connectionCharset = UTF-8
# enable druid Decrypt database password
canal.instance.enableDruid=false
#canal.instance.pwdPublicKey=MFwwDQYJKoZIhvcNAQEBBQADSwAwSAJBALK4BUxdDltRRE5/zXpVEVPUgunvscYFtEip3pmLlhrWpacX7y7GCMo2/JM6LeHmiiNdH1FWgGCpUfircSwlWKUCAwEAAQ==

# table regex
canal.instance.filter.regex=.*\\..*
# table black regex
canal.instance.filter.black.regex=
# table field filter(format: schema1.tableName1:field1/field2,schema2.tableName2:field1/field2)
#canal.instance.filter.field=test1.t_product:id/subject/keywords,test2.t_company:id/name/contact/ch
# table field black filter(format: schema1.tableName1:field1/field2,schema2.tableName2:field1/field2)
#canal.instance.filter.black.field=test1.t_product:subject/product_image,test2.t_company:id/name/contact/ch

# mq config
canal.mq.topic=example
# dynamic topic route by schema or table regex
#canal.mq.dynamicTopic=mytest1.user,mytest2\\..*,.*\\..*
canal.mq.partition=0
# hash partition config
#canal.mq.partitionsNum=3
#canal.mq.partitionHash=test.table:id^name,.*\\..*
#################################################
5. Deploy canal-adapter
· Download and extract canal-adapter
wget https://github.com/alibaba/canal/releases/download/canal-1.1.5/canal.adapter-1.1.5.tar.gz
tar -zxvf canal.adapter-1.1.5.tar.gz
· Edit the configuration file
cd ~/canal/canal-adapter/conf/ && vim application.yml
server:
  port: 8081
spring:
  jackson:
    date-format: yyyy-MM-dd HH:mm:ss
    time-zone: GMT+8
    default-property-inclusion: non_null

canal.conf:
  mode: tcp # tcp kafka rocketMQ rabbitMQ
  flatMessage: true
  zookeeperHosts:
  syncBatchSize: 1000
  retries: 0
  timeout:
  accessKey:
  secretKey:
  consumerProperties:
    # canal tcp consumer
    canal.tcp.server.host: 172.20.10.10:11111
    canal.tcp.zookeeper.hosts:
    canal.tcp.batch.size: 500
    canal.tcp.username: admin
    canal.tcp.password: 4ACFE3202A5FF5CF467898FC58AAB1D615029441
  canalAdapters:
  - instance: mysqldb # canal instance Name or mq topic name
    groups:
    - groupId: group1
      outerAdapters:
      - name: logger
      - name: rdb
        key: oceanbase1
        properties:
          jdbc.driverClassName: com.mysql.jdbc.Driver
          jdbc.url: jdbc:mysql://172.20.10.10:2883/test?useUnicode=true
          jdbc.username: root@sys#obdemo
          jdbc.password: 123456
· Configure the mapping file
cd ~/canal/canal-adapter/conf/rdb/ && vim mytest_user.yml
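The contents of the mapping file are not shown here. Since the sync test below relies on a newly created table being replicated, a whole-database mirror mapping is the likely shape; the following is a minimal sketch (dataSourceKey assumes a source datasource named defaultDS declared under srcDataSources in application.yml; destination, groupId and outerAdapterKey must match the values configured above):
dataSourceKey: defaultDS      # source datasource key from application.yml (assumed)
destination: mysqldb          # canal instance name, matching application.yml above
groupId: group1               # adapter group, matching application.yml above
outerAdapterKey: oceanbase1   # rdb adapter key, matching application.yml above
concurrent: true
dbMapping:
  mirrorDb: true              # mirror the whole schema, including DDL such as CREATE TABLE
  database: test              # source database name (assumed)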
· Start the canal-adapter service
cd ~/canal/canal-adapter/bin/ && ./startup.sh
6. Synchronization test
· Create a business table in MySQL (an example follows)
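For example (illustrative statements; the original post does not show the exact table used):
CREATE TABLE test.t_canal_demo (
  id INT PRIMARY KEY,
  name VARCHAR(50)
);
INSERT INTO test.t_canal_demo VALUES (1, 'hello from mysql');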
· The table appears in OceanBase, confirming the synchronization succeeded
· Check the logs
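The adapter writes its log under logs/adapter/ by default; a typical check (a sketch):
tail -f ~/canal/canal-adapter/logs/adapter/adapter.log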