重建损坏的PostgreSQL备库
重建损坏的PostgreSQL备库
场景说明:
由于磁盘资源不足, 将node2节点暂时关闭,此时node1、node3都在正常运行,待 增加空间后(此时主节点增加了操作数据,新增了WAL文件,已备份到node3节点),开启node2节点,此时发现node报错:找不到 000000010000000000000009。
错误内容如下:
[postgres@Node2 /]$ pg2024-04-24 10:00:54.080 CST [113689] LOG: started streaming WAL from primary at 0/9000000 on timeline 1
2024-04-24 10:00:54.080 CST [113689] FATAL: could not receive data from WAL stream: ERROR: requested WAL segment 000000010000000000000009 has already been removed
2024-04-24 10:00:54.081 CST [94151] LOG: waiting for WAL to become available at 0/9002000
_ctl stop
@font-face{ font-family:"Times New Roman"; } @font-face{ font-family:"宋体"; } @font-face{ font-family:"Calibri"; } p.MsoNormal{ mso-style-name:正文; mso-style-parent:""; margin:0pt; margin-bottom:.0001pt; mso-pagination:none; text-align:justify; text-justify:inter-ideograph; font-family:Calibri; mso-fareast-font-family:宋体; mso-bidi-font-family:'Times New Roman'; font-size:10.5000pt; mso-font-kerning:1.0000pt; } span.msoIns{ mso-style-type:export-only; mso-style-name:""; text-decoration:underline; text-underline:single; color:blue; } span.msoDel{ mso-style-type:export-only; mso-style-name:""; text-decoration:line-through; color:red; } @page{mso-page-border-surround-header:no; mso-page-border-surround-footer:no;}@page Section0{ } div.Section0{page:Section0;}
1、恢复操作如下:
[postgres@Node2 /]$ pg_resetwal -l 00000001000000000000003F /postgres/data
pg_resetwal: error: lock file "postmaster.pid" exists
pg_resetwal: hint: Is a server running? If not, delete the lock file and try again.
[postgres@Node2 /]$
2、由于此时无法使用操作pg_resetwal 恢复,果断使用重建备库node2节点,操作如下:
p_ctl stop
cp -r /postgres/data/* /postgres/backup --建议清除前先备份
rm -r /postgres/data
--检查是否可以注册备节点
/postgres/server/bin/repmgr -h 192.168.4.71 -p5432 -U repmgr -d repmgr -f /postgres/repmgr/repmgr.conf standby clone --dry-run
--注意备库repmgr.conf文件中的目录要为空
/postgres/server/bin/repmgr -h 192.168.4.71 -p5432 -U repmgr -d repmgr -f /postgres/repmgr/repmgr.conf standby clone --force
--注册备库
/postgres/server/bin/repmgr -f /postgres/repmgr/repmgr.conf standby register
3、验证数据的完整性
cssd=>
cssd=> select relid,schemaname,relname from pg_stat_user_tables;
relid | schemaname | relname
-------+---------------+---------------------------------
17419 | schema_server | TInstrumentInLocation
18101 | schema_server | TSysFormFieldConfig
17072 | schema_server | TFlowRecycleAbnormalReturn
16658 | schema_server | TFlowCleanBatch
17941 | schema_server | TMasterTag
18173 | schema_server | TTraceClean
18234 | schema_server | TTrainDocument
18038 | schema_server | TReportAuth
18092 | schema_server | TSysConfiguration
17932 | schema_server | TMasterSupplier