OceanBase数据库合并出现checksum error

2024年 7月 15日 81.1k 0

点击上方“IT那活儿”公众号--专注于企业全栈运维技术分享,不管IT什么活儿,干就完了!!!

现象描述

4.x生产集群目前某租户合并出现checksum error,全局索引和主表数据不一致。

OceanBase数据库合并出现checksum error-1

排查步骤

2.1 sys租户查看全局合并信息

1006租户合并中存在CHECKSUM_ERROR select * from CDB_OB_MAJOR_COMPACTION;

OceanBase数据库合并出现checksum error-22.2 查询表和索引表(包括:全局索引和局部索引)与主表之间出现的列校验和不一致的信息

select * from CDB_OB_TABLET_CHECKSUM_ERROR_INFO;
select * from CDB_OB_COLUMN_CHECKSUM_ERROR_INFO;

OceanBase数据库合并出现checksum error-32.3 根据合并的版本号compaction_scn查询主表与索引表数据是否一致(结果不一致)

select sum(row_count) from 
__all_virtual_tablet_replica_checksum where tenant_id = 1006 
and compaction_scn = 1715191202784774818 and tablet_id in 
(select tablet_id from cdb_ob_table_locations where tenant_id = 1006 and table_id = 3696982); --114007773

select sum(row_count) from 
__all_virtual_tablet_replica_checksum where tenant_id = 1006 
and compaction_scn = 1715191202784774818 and tablet_id in 
(select tablet_id from cdb_ob_table_locations where tenant_id = 1006 and table_id = 3697015) ;--114007527

OceanBase数据库合并出现checksum error-42.4 查询主表的schema信息主表为无主键range分区表。

MySQL [oceanbase]> select ddl_stmt_str from __all_virtual_ddl_operation where table_id = 3696982 and ddl_stmt_str ' ';

CREATE TABLE REWARD_USER_ORDER_111202404 PARTITION BY RANGE(PARTITION_ID)
(PARTITION PAR_0 VALUES LESS THAN (1000),
PARTITION PAR_1 VALUES LESS THAN (2000),
PARTITION PAR_2 VALUES LESS THAN (3000),
PARTITION PAR_3 VALUES LESS THAN (4000),
PARTITION PAR_4 VALUES LESS THAN (5000),
PARTITION PAR_5 VALUES LESS THAN (6000),
PARTITION PAR_6 VALUES LESS THAN (7000),
PARTITION PAR_7 VALUES LESS THAN (8000),
PARTITION PAR_8 VALUES LESS THAN (9000),
PARTITION PAR_9 VALUES LESS THAN (MAXVALUE))
AS SELECT * FROM TEMPLATE_REWARD_USER_ORDER WHERE 1 1;

2.5 修改内部表状态

MySQL [oceanbase]> select tenant_name,tenant_id from dba_ob_tenants;
MySQL [oceanbase]> alter system change tenant tenant_id=1005;\\切换meta租户
Query OK, 0 rows affected (0.001 sec)

MySQL [oceanbase]> select * from __all_column_checksum_error_info;
+----------------------------+----------------------------+-----------+---------------------+------------+---------------+----------------+----------------+-----------------+-----------+-------------------+-------------------+
| gmt_create | gmt_modified | tenant_id | frozen_scn | index_type | data_table_id | index_table_id | data_tablet_id | index_tablet_id | column_id | data_column_ckm | index_column_ckm |
+----------------------------+----------------------------+-----------+---------------------+------------+---------------+----------------+----------------+-----------------+-----------+-------------------+-------------------+
| 2024-05-09 03:12:04.297722 | 2024-05-09 03:12:04.297722 |      1006 | 1715191202784774818 |          1 | 3696982 |        3697015 | 0 |               0 | 1 | 81609826643664396 | 81609668040138333 |
+----------------------------+----------------------------+-----------+---------------------+------------+---------------+----------------+----------------+-----------------+-----------+-------------------+-------------------+
1 row in set (0.003 sec)

MySQL [oceanbase]> select * from __all_merge_info;
+----------------------------+----------------------------+-----------+---------+---------------------+----------------------+----------------+---------------------+--------------+------------+-----------------+------------------+------------------+
| gmt_create | gmt_modified | tenant_id | cluster | frozen_scn | global_broadcast_scn | is_merge_error | last_merged_scn | merge_status | error_type | suspend_merging | merge_start_time | last_merged_time |
+----------------------------+----------------------------+-----------+---------+---------------------+----------------------+----------------+---------------------+--------------+------------+-----------------+------------------+------------------+
| 2023-12-22 16:55:24.023988 | 2024-05-09 02:03:14.332966 | 1005 |       0 | 1715191201458885386 |  1715191201458885386 | 0 | 1715191201458885386 | 0 |          0 | 0 | 1715191201495297 | 1715191394333195 |
| 2023-12-22 16:55:24.029360 | 2024-05-09 03:12:05.177653 | 1006 |       0 | 1715191202784774818 |  1715191202784774818 | 1 | 1715104801481078385 | 1 |          1 | 0 | 1715191203192248 | 1715110900059331 |
+----------------------------+----------------------------+-----------+---------+---------------------+----------------------+----------------+---------------------+--------------+------------+-----------------+------------------+------------------+
2 rows in set (0.006 sec)

MySQL [oceanbase]> select * from __all_column_checksum_error_info;
+----------------------------+----------------------------+-----------+---------------------+------------+---------------+----------------+----------------+-----------------+-----------+-------------------+-------------------+
| gmt_create | gmt_modified | tenant_id | frozen_scn | index_type | data_table_id | index_table_id | data_tablet_id | index_tablet_id | column_id | data_column_ckm | index_column_ckm |
+----------------------------+----------------------------+-----------+---------------------+------------+---------------+----------------+----------------+-----------------+-----------+-------------------+-------------------+
| 2024-05-09 03:12:04.297722 | 2024-05-09 03:12:04.297722 |      1006 | 1715191202784774818 |          1 | 3696982 |        3697015 | 0 |               0 | 1 | 81609826643664396 | 81609668040138333 |
+----------------------------+----------------------------+-----------+---------------------+------------+---------------+----------------+----------------+-----------------+-----------+-------------------+-------------------+
1 row in set (0.002 sec)

MySQL [oceanbase]> delete from __all_column_checksum_error_info;
Query OK, 1 row affected (0.007 sec)

MySQL [oceanbase]> select * from __all_column_checksum_error_info;
Empty set (0.001 sec)

MySQL [oceanbase]> select * from __all_merge_info;
+----------------------------+----------------------------+-----------+---------+---------------------+----------------------+----------------+---------------------+--------------+------------+-----------------+------------------+------------------+
| gmt_create | gmt_modified | tenant_id | cluster | frozen_scn | global_broadcast_scn | is_merge_error | last_merged_scn | merge_status | error_type | suspend_merging | merge_start_time | last_merged_time |
+----------------------------+----------------------------+-----------+---------+---------------------+----------------------+----------------+---------------------+--------------+------------+-----------------+------------------+------------------+
| 2023-12-22 16:55:24.023988 | 2024-05-09 02:03:14.332966 | 1005 |       0 | 1715191201458885386 |  1715191201458885386 | 0 | 1715191201458885386 | 0 |          0 | 0 | 1715191201495297 | 1715191394333195 |
| 2023-12-22 16:55:24.029360 | 2024-05-09 03:12:05.177653 | 1006 |       0 | 1715191202784774818 |  1715191202784774818 | 1 | 1715104801481078385 | 1 |          1 | 0 | 1715191203192248 | 1715110900059331 |
+----------------------------+----------------------------+-----------+---------+---------------------+----------------------+----------------+---------------------+--------------+------------+-----------------+------------------+------------------+
2 rows in set (0.002 sec)
MySQL [oceanbase]> update __all_merge_info set is_merge_error = 0, error_type = 0 where tenant_id = 1006;
Query OK, 1 row affected (0.009 sec)
Rows matched: 1  Changed: 1  Warnings: 0

MySQL [oceanbase]> commit;
Query OK, 0 rows affected (0.000 sec)

MySQL [oceanbase]>
MySQL [oceanbase]> select * from __all_merge_info;
+----------------------------+----------------------------+-----------+---------+---------------------+----------------------+----------------+---------------------+--------------+------------+-----------------+------------------+------------------+
| gmt_create | gmt_modified | tenant_id | cluster | frozen_scn | global_broadcast_scn | is_merge_error | last_merged_scn | merge_status | error_type | suspend_merging | merge_start_time | last_merged_time |
+----------------------------+----------------------------+-----------+---------+---------------------+----------------------+----------------+---------------------+--------------+------------+-----------------+------------------+------------------+
| 2023-12-22 16:55:24.023988 | 2024-05-09 02:03:14.332966 | 1005 |       0 | 1715191201458885386 |  1715191201458885386 | 0 | 1715191201458885386 | 0 |          0 | 0 | 1715191201495297 | 1715191394333195 |
| 2023-12-22 16:55:24.029360 | 2024-05-09 17:27:22.123728 | 1006 |       0 | 1715191202784774818 |  1715191202784774818 | 0 | 1715104801481078385 | 1 |          0 | 0 | 1715191203192248 | 1715110900059331 |
+----------------------------+----------------------------+-----------+---------+---------------------+----------------------+----------------+---------------------+--------------+------------+-----------------+------------------+------------------+
2 rows in set (0.002 sec)

2.6 重新查看合并信息,错误码已经消失当前还处于合并中,要想解决需要切换1006租户1号日志流的leader到其他节点。

select * from CDB_OB_MAJOR_COMPACTION;

OceanBase数据库合并出现checksum error-52.7 查看1006租户的 1号日志流信息

select * from __all_virtual_ls_info where tenant_id = 1006 and ls_id = 1;

OceanBase数据库合并出现checksum error-62.8 切换日志流,解决问题注:--查看是否有error,没有error后才可以进行切换日志流

select count(*) from __all_virtual_tablet_meta_table where tenant_id = 1006 and status = 1;

OceanBase数据库合并出现checksum error-7

ALTER SYSTEM SWITCH REPLICA leader LS = 1 SERVER = '10.161.243.22:2882' TENANT = cbss_reports;

--确认切换完毕:

select * from __all_virtual_ls_info where tenant_id = 1006 and ls_id = 1;

-- 确认是否还有小于1715191202784774818版本的数据(成功应该是0):

select count(*) from __all_virtual_tablet_meta_table where tenant_id = 1006 and report_scn < 1715191202784774818;

OceanBase数据库合并出现checksum error-8END

本文作者:熊文豪(上海新炬中北团队)

本文来源:“IT那活儿”公众号

OceanBase数据库合并出现checksum error-9

相关文章

Oracle如何使用授予和撤销权限的语法和示例
Awesome Project: 探索 MatrixOrigin 云原生分布式数据库
下载丨66页PDF,云和恩墨技术通讯(2024年7月刊)
社区版oceanbase安装
Oracle 导出CSV工具-sqluldr2
ETL数据集成丨快速将MySQL数据迁移至Doris数据库

发布评论