How to handle the "not mark dirty" issue after enabling RTO with resource pooling

Problem

The issue occurs during failover: the standby holds the latest version of a page, and after the primary fetches that page from the standby it should be marked dirty, but it is not.

  • buffer dirty flag: bufferinfo->dirtyflag = false

  • on-disk LSN: bufDesc->lsn_on_disk = 2/419077B0

  • buffer LSN: PageGetLSN(bufferinfo->pageinfo.page) = 2/4198B4F0

  • xlog record LSN: bufferinfo->lsn = 2/419077B0

  • Summary: xlog LSN = on-disk LSN < buffer LSN

    2024-02-02 15:12:24.163 [unknown] [unknown] localhost 140190590953216 0[0:0#0]  0 [BACKEND] PANIC:  extreme_rto segment page not mark dirty:lsn 2/419077B0, lsn_disk 2/40187550,                                   lsn_page 2/4198B4F0, page 1663/15201/5004 60990

Error location

bufferinfo is not marked dirty, yet the page is the latest version.

In SSMarkBufferDirtyForERTO, the initial suspicion is that the BUF_ERTO_NEED_MARK_DIRTY flag is abnormally absent.

Related logs

MarkSegPageRedoChildPageDirty

lsn_on_disk is wrong

Analysis

1

Error message

Failing page: 1663/16388/5005/16384 0-3606. The problem is that this page's lsn_on_disk (0/DF6D60C8) is smaller than the buffer's LSN (0/E8C0D3B0), yet the page was not marked dirty.

    extreme_rto segment page not mark dirty:lsn 0/DDAACD58, lsn_disk 0/DF6D60C8, lsn_page 0/E8C0D3B0, page 1663/16388/5005 3606

      2024-02-06 14:57:16.998 [unknown] [unknown] localhost 281456656767520 0[0:0#0]  0 [BACKEND] PANIC:  extreme_rto segment page not mark dirty:lsn 0/DDAACD58, lsn_disk 0/DF6D60C8,                                   lsn_page 0/E8C0D3B0, page 1663/16388/5005 3606
      2024-02-06 14:57:16.998 [unknown] [unknown] localhost 281456656767520 0[0:0#0]  0 [BACKEND] CONTEXT:  xlog redo [segpage] segment head extend: relfilenode/fork:, nblocks[3606->3607], (phy loc 128/101203), reset_zero:1
      2024-02-06 14:57:16.998 [unknown] [unknown] localhost 281456656767520 0[0:0#0]  0 [BACKEND] BACKTRACELOG:  tid[2717562]'s backtrace:
             /home/zhoucong/work/openGauss-server-list/openGauss-server/dest/bin/gaussdb() [0x1163ac4]
             /home/zhoucong/work/openGauss-server-list/openGauss-server/dest/bin/gaussdb(_Z9errfinishiz+0x324) [0x1157c1c]
             /home/zhoucong/work/openGauss-server-list/openGauss-server/dest/bin/gaussdb(_Z29MarkSegPageRedoChildPageDirtyP14RedoBufferInfo+0x2a4) [0x22038ec]
             /home/zhoucong/work/openGauss-server-list/openGauss-server/dest/bin/gaussdb(_Z21SegPageRedoChildStateP17XLogRecParseState+0x84) [0x2203a04]
             /home/zhoucong/work/openGauss-server-list/openGauss-server/dest/bin/gaussdb(_Z21ProcSegPageCommonRedoP17XLogRecParseState+0xf0) [0x2203bf8]
             /home/zhoucong/work/openGauss-server-list/openGauss-server/dest/bin/gaussdb(_ZN11extreme_rto24RedoPageManagerDdlActionEP17XLogRecParseState+0x108) [0x20055b8]
             /home/zhoucong/work/openGauss-server-list/openGauss-server/dest/bin/gaussdb(_ZN11extreme_rto35PageManagerProcSegPipeLineSyncStateEP17XLogRecParseState+0x120) [0x20060ec]
             /home/zhoucong/work/openGauss-server-list/openGauss-server/dest/bin/gaussdb(_ZN11extreme_rto25PageManagerRedoParseStateEP17XLogRecParseState+0xcc) [0x20063d8]
             /home/zhoucong/work/openGauss-server-list/openGauss-server/dest/bin/gaussdb(_ZN11extreme_rto30PageManagerRedoDistributeItemsEP17XLogRecParseState+0xc0) [0x20066b0]
             /home/zhoucong/work/openGauss-server-list/openGauss-server/dest/bin/gaussdb(_ZN11extreme_rto19RedoPageManagerMainEv+0x12c) [0x20067f8]

Additional logging:

2024-02-06 14:57:12.829: "Mark need flush in flush copy" means the page needs to be flushed and is marked dirty here.

2024-02-06 14:57:16.984: XLogBlockRedoForExtremeRTO shows redo for this page with redoaction = 2 (REDO_DONE), meaning the page was not actually replayed.

2024-02-06 14:57:16.984: MarkBufferDirtyForETRO shows the dirty-marking attempt; at this point the buffer carries no dirty flag.

        postgresql-2024-02-06_145411.log:2024-02-06 14:54:57.448 [unknown] [unknown] localhost 281447834966560 0[0:0#0]  0 [BACKEND] LOG:  [FlushBuffer] FlushBuffer, lsn_on_disk: 0/ddaacd58, bufferinfo.lsn: 0/ddaaea88spc/db/rel/bucket fork-block: 1663/16388/5005/16384 0-3606.
        postgresql-2024-02-06_145659.log:2024-02-06 14:57:12.829 [unknown] [unknown] localhost 281458419489312 0[0:0#0]  0 [BACKEND] LOG:  [SS] Mark need flush in flush copy, spc/db/rel/bucket fork-block: 1663/16388/5005/16384 0-3606, page lsn (0xe8c0d3b0)
        postgresql-2024-02-06_145659.log:2024-02-06 14:57:16.984 [unknown] [unknown] localhost 281456656767520 0[0:0#0]  0 [BACKEND] WARNING:  [XLogBlockRedoForExtremeRTO] XLogBlockRedoForExtremeRTO, redoaction: 2spc/db/rel/bucket fork-block: 1663/16388/5005/16384 0-3606.
        postgresql-2024-02-06_145659.log:2024-02-06 14:57:16.984 [unknown] [unknown] localhost 281456656767520 0[0:0#0]  0 [BACKEND] WARNING:  [SSMarkBufferDirtyForERTO] MarkBufferDirtyForETRO, buf_ctl->state: 32spc/db/rel/bucket fork-block: 1663/16388/5005/16384 0-3606.
        postgresql-2024-02-06_145659.log:2024-02-06 14:57:16.984 [unknown] [unknown] localhost 281456656767520 0[0:0#0]  0 [BACKEND] WARNING:  [SS] clear BUF_DIRTY_NEED_FLUSH, spc/db/rel/bucket fork-block: 1663/16388/5005/16384 0-3606
        postgresql-2024-02-06_145659.log:2024-02-06 14:57:16.984 [unknown] [unknown] localhost 281456656767520 0[0:0#0]  0 [BACKEND] LOG:  [SS] find, spc/db/rel/bucket fork-block: 1663/16388/5005/16384 0-3606

2

Dirty-marking code

After SegPageRedoChildState replays a segment-page head page, it calls MarkSegPageRedoChildPageDirty to mark the page dirty. That first calls SSMarkBufferDirtyForERTO, which ultimately sets the dirty flag on bufferinfo->dirtyflag. dms_buf_ctrl_t is the DMS-side control information for a buffer. The page is marked dirty in the following cases:

  • buf_ctrl carries the BUF_ERTO_NEED_MARK_DIRTY flag

  • buf_ctrl carries the BUF_DIRTY_NEED_FLUSH flag

  • bufDesc->extra->lsn_on_disk == Invalid, meaning the page was fetched from the standby

The additional logging shows buf_ctl->state: 32 (BUF_IS_RELPERSISTENT), i.e. none of these buffer flags are set.

          void SSMarkBufferDirtyForERTO(RedoBufferInfo* bufferinfo)
          {
              if (!ENABLE_DMS || bufferinfo->pageinfo.page == NULL) {
                  return;
              }

              /* For buffer need flush, we need to mark dirty here */
              if (!IsRedoBufferDirty(bufferinfo)) {
                  dms_buf_ctrl_t* buf_ctrl = GetDmsBufCtrl(bufferinfo->buf - 1);
                  BufferDesc *bufDesc = GetBufferDescriptor(bufferinfo->buf - 1);
                  /* in this failure, buf_ctrl->state == BUF_IS_RELPERSISTENT, so neither branch is taken */
                  if (buf_ctrl->state & BUF_ERTO_NEED_MARK_DIRTY) {
                      MakeRedoBufferDirty(bufferinfo);
                  } else if ((buf_ctrl->state & BUF_DIRTY_NEED_FLUSH) || CheckPageNeedSkipInRecovery(bufferinfo->buf) ||
                          XLogRecPtrIsInvalid(bufDesc->extra->lsn_on_disk)) {
                      buf_ctrl->state |= BUF_ERTO_NEED_MARK_DIRTY;
                      MakeRedoBufferDirty(bufferinfo);
                  }
              }
          }

3

Page analysis

Run the following command to examine the page:

  ./pagehack -t heap -f +data/base/16388/3 -s 101203 -D -c UDS:/home/zhoucong/work/dss/dss0/.dss_unix_d_socket > ~/3

The actual page LSN on disk is 0/DDAAEA88, while lsn_on_disk records 0/DF6D60C8 and the buffer's LSN is 0/E8C0D3B0.

              page information of block 101203/114688
                     pd_lsn: 0/DDAAEA88
                     pd_checksum: 0x4C1E, verify success
                     pd_flags:
                     pd_lower: 1488, non-empty
                     pd_upper: 2312, old
                     pd_special: 8168, size 24
                     Page size & version: 8192, 5
                     pd_xid_base: 9182949318625544, pd_multi_base: 9182811879677896
                     pd_prune_xid: 9182949318625544

4

DMS logs

                _20240206145629759.dlog.gz:UTC+8 2024-02-06 14:56:09.124|DMS|2696417|INFO>[DMS][1663/16388/5005/16384/0 0-3606][proc claim owner]: src_id=1, src_sid=559, dst_id=0, dst_sid=65535, has_edp=0, req_mode=1 [dms_msg.c:1242]
                dms_20240206145629759.dlog.gz:UTC+8 2024-02-06 14:56:09.125|DMS|2696417|INFO>[DCS][1663/16388/5005/16384/0 0-3606][drc_claim_page_owner]: mode =1, claimed_owner=0, edp_map=0, copy_insts=2 [drc_page.c:497]
                dms.dlog:UTC+8 2024-02-06 14:57:06.008|DMS|2717224|INFO>[DRC rebuild][1663/16388/5005/16384/0 0-3606]remote_ditry: 0, lock_mode: 1, is_edp: 0, inst_id: 1, lsn: 3904951216, is_dirty: 0 [dms_reform_drc_rebuild.c:92]
                dms.dlog:UTC+8 2024-02-06 14:57:06.008|DMS|2717224|INFO>[DRC][1663/16388/5005/16384/0 0-3606]buf_res create successful [drc_res_mgr.c:311]
                dms.dlog:UTC+8 2024-02-06 14:57:11.340|DMS|2717284|INFO>[DRC repair][1663/16388/5005/16384/0 0-3606]1-1-0, CVT:255-0-0-0-0-18446744073709551615-65535, EDP:255-0-0, FLAG:0-1-0 [dms_reform_drc_repair.c:247]
                dms.dlog:UTC+8 2024-02-06 14:57:12.824|DMS|2717284|INFO>[DCS][1663/16388/5005/16384/0 0-3606][dcs request page enter] [dcs_page.c:373]
                dms.dlog:UTC+8 2024-02-06 14:57:12.824|DMS|2717284|INFO>[DMS][1663/16388/5005/16384/0 0-3606][ask master local]: src_id=0, req_mode=1, curr_mode=0, prep_ruid=77549 [dms_msg.c:558]
                dms.dlog:UTC+8 2024-02-06 14:57:12.824|DMS|2717284|INFO>[DMS][1663/16388/5005/16384/0 0-3606][dms_ask_master4res_l] result type=2 [dms_msg.c:588]
                dms.dlog:UTC+8 2024-02-06 14:57:12.824|DMS|2717284|INFO>[DMS]1663/16388/5005/16384/0 0-3606][ask owner for res]: send ok, src_id=0, src_sid=702, dst_id=1, dst_sid=65535, req_mode=1 [dms_msg.c:397]
                dms.dlog:UTC+8 2024-02-06 14:57:12.824|DMS|2717284|INFO>[DCS][1663/16388/5005/16384/0 0-3606][owner ack page ready]: lock mode=1, edp=0, src_id=1, src_sid=599, dest_id=0,dest_sid=702, mode=1, remote dirty=0, remote remote diry=0, page_lsn=0, page_scn=0,curr_page_lsn=3904951216, curr_global_lsn=0 [dcs_page.c:327]
                dms.dlog:UTC+8 2024-02-06 14:57:12.824|DMS|2717284|INFO>[DMS][1663/16388/5005/16384/0 0-3606][claim ownership req]: send ok, src_id=0, src_sid=702, dst_id=0, dst_sid=65535, has_edp=0, ruid=0 [dms_msg.c:294]
                dms.dlog:UTC+8 2024-02-06 14:57:12.824|DMS|2717231|INFO>[DMS][1663/16388/5005/16384/0 0-3606][proc claim owner]: src_id=0, src_sid=702, dst_id=0, dst_sid=65535, has_edp=0, req_mode=1 [dms_msg.c:1242]
                dms.dlog:UTC+8 2024-02-06 14:57:12.824|DMS|2717231|INFO>[DCS][1663/16388/5005/16384/0 0-3606][drc_claim_page_owner]: mode =1, claimed_owner=1, edp_map=0, copy_insts=1 [drc_page.c:497]
                dms.dlog:UTC+8 2024-02-06 14:57:12.824|DMS|2717284|INFO>[DCS][1663/16388/5005/16384/0 0-3606][dcs request page leave] ret: 0 [dcs_page.c:375]
                dms.dlog:UTC+8 2024-02-06 14:57:16.984|DMS|2717562|INFO>[DCS][1663/16388/5005/16384/0 0-3606][dcs request page enter] [dcs_page.c:373]
                dms.dlog:UTC+8 2024-02-06 14:57:16.984|DMS|2717562|INFO>[DMS][1663/16388/5005/16384/0 0-3606][ask master local]: src_id=0, req_mode=2, curr_mode=1, prep_ruid=201176 [dms_msg.c:558]
                dms.dlog:UTC+8 2024-02-06 14:57:16.984|DMS|2717562|INFO>[DMS][1663/16388/5005/16384/0 0-3606][dms_ask_master4res_l] result type=2 [dms_msg.c:588]
                dms.dlog:UTC+8 2024-02-06 14:57:16.984|DMS|2717562|INFO>[DMS]1663/16388/5005/16384/0 0-3606][ask owner for res]: send ok, src_id=0, src_sid=925, dst_id=1, dst_sid=65535, req_mode=2 [dms_msg.c:397]
                dms.dlog:UTC+8 2024-02-06 14:57:16.984|DMS|2717562|INFO>[DCS][1663/16388/5005/16384/0 0-3606][owner ack page ready]: lock mode=2, edp=0, src_id=1, src_sid=609, dest_id=0,dest_sid=925, dirty=0, remote diry=0, page_lsn=0, page_scn=0 [dcs_page.c:303]
                dms.dlog:UTC+8 2024-02-06 14:57:16.984|DMS|2717562|INFO>[DCS][1663/16388/5005/16384/0 0-3606][owner ack page ready]: lock mode=2, edp=0, src_id=1, src_sid=609, dest_id=0,dest_sid=925, mode=2, remote dirty=0, remote remote diry=0, page_lsn=0, page_scn=0,curr_page_lsn=3904951216, curr_global_lsn=0 [dcs_page.c:327]
                dms.dlog:UTC+8 2024-02-06 14:57:16.984|DMS|2717562|INFO>[DMS][1663/16388/5005/16384/0 0-3606][claim ownership req]: send ok, src_id=0, src_sid=925, dst_id=0, dst_sid=65535, has_edp=0, ruid=0 [dms_msg.c:294]
                dms.dlog:UTC+8 2024-02-06 14:57:16.984|DMS|2717230|INFO>[DMS][1663/16388/5005/16384/0 0-3606][proc claim owner]: src_id=0, src_sid=925, dst_id=0, dst_sid=65535, has_edp=0, req_mode=2 [dms_msg.c:1242]
                dms.dlog:UTC+8 2024-02-06 14:57:16.984|DMS|2717230|INFO>[DCS][1663/16388/5005/16384/0 0-3606][drc_claim_page_owner]: mode =2, claimed_owner=0, edp_map=0, copy_insts=0 [drc_page.c:497]
                dms.dlog:UTC+8 2024-02-06 14:57:16.984|DMS|2717562|INFO>[DCS][1663/16388/5005/16384/0 0-3606][dcs request page leave] ret: 0 [dcs_page.c:375]

5

XLOG analysis

Page analysis shows that the record at 0/DF6D60C8 modifies page 1663/16388/4986/16384/0, not the failing page 1663/16388/5005/16384. This page's lsn_on_disk is therefore bogus.

                  REDO @ 0/DF6D6058; LSN 0/DF6D60C8: prev 0/DF6D5FD0; xid 0; term 1; len 17; total 111; crc 11782719; desc: Heap2 - visible: cutoff xid 14253, blkref #0: rel 1663/16388/4986/16384/0, forknum:2 storage SEGMENT PAGE fork vm blk 0 (phy loc 8/4549) lastlsn 0/DF6D6058, blkref #1: rel 1663/16388/4986/16384/0, forknum:0 storage SEGMENT PAGE blk 41192 (phy loc 1024/136485) lastlsn 0/9C8773F0
                  REDO @ 0/DF6D60C8; LSN 0/DF6D6138: prev 0/DF6D6058; xid 0; term 1; len 17; total 111; crc 3273069089; desc: Heap2 - visible: cutoff xid 14253, blkref #0: rel 1663/16388/4986/16384/0, forknum:2 storage SEGMENT PAGE fork vm blk 0 (phy loc 8/4549) lastlsn 0/DF6D60C8, blkref #1: rel 1663/16388/4986/16384/0, forknum:0 storage SEGMENT PAGE blk 41193 (phy loc 1024/136486) lastlsn 0/9C87C1A8
                  REDO @ 0/DF6D6138; LSN 0/DF6D61A8: prev 0/DF6D60C8; xid 0; term 1; len 17; total 111; crc 3465733980; desc: Heap2 - visible: cutoff xid 14253, blkref #0: rel 1663/16388/4986/16384/0, forknum:2 storage SEGMENT PAGE fork vm blk 0 (phy loc 8/4549) lastlsn 0/DF6D6138, blkref #1: rel 1663/16388/4986/16384/0, forknum:0 storage SEGMENT PAGE blk 41194 (phy loc 1024/136487) lastlsn 0/9C880FE8

6

DMS logs

When this record was replayed, the primary requested the page from the standby via DMS; the page was therefore not read from disk but obtained from the standby through DMS.

                  UTC+8 2024-02-06 14:57:16.984|DMS|2717562|INFO>[DMS]1663/16388/5005/16384/0 0-3606][ask owner for res]: send ok, src_id=0, src_sid=925, dst_id=1, dst_sid=65535, req_mode=2 [dms_msg.c:397]


Conclusion: during replay, the page had already been flushed to disk. In a later replay step the latest page was forcibly fetched from the standby, but when it was swapped into the buffer, the old buffer's lsn_on_disk was not refreshed. The buffer thus ended up with lsn_on_disk != invalid && lsn_on_disk < page LSN while not being marked dirty, and the consistency check failed with a PANIC.

Fix

In TerminateReadPage, initialize lsn_on_disk of a page newly obtained from the standby to InvalidXLogRecPtr; the existing logic will then mark that page dirty.

                      @@ -326,6 +326,7 @@ Buffer TerminateReadPage(BufferDesc* buf_desc, ReadBufferMode read_mode, const X
                                 buf_desc->extra->seg_fileno == EXTENT_INVALID) {
                                 CalcSegDmsPhysicalLoc(buf_desc, buffer, !g_instance.dms_cxt.SSRecoveryInfo.in_flushcopy);
                             }
                       +       buf_desc->extra->lsn_on_disk = InvalidXLogRecPtr;
                         }
                         if (BufferIsValid(buffer)) {
                             buf_ctrl->been_loaded = true;

Miscellaneous

BufferDesc notes

                        (gdb) p *bufdesc
                        $5 = {tag = {rnode = {spcNode = 1664, dbNode = 0, relNode = 4161, bucketNode = 16384, opt = 0}, forkNum = 0, blockNum = 0}, state = 14390126709355577345, buf_id = 1, wait_backend_pid = 0, io_in_progress_lock = 0xfffb1963bb00,
                         content_lock = 0xfffb1963bb80, extra = 0xfffbad50a230, lsn_dirty = 0}
                        (gdb) p *bufdesc->extra
                        $4 = {seg_fileno = 0 '\000', seg_blockno = 4181, rec_lsn = 1086214936, dirty_queue_loc = 1, encrypt = false, lsn_on_disk = 1085681688, aio_in_progress = false}

bufdesc->tag->rnode->spcNode: tablespace OID

bufdesc->tag->rnode->dbNode: database OID

bufdesc->tag->rnode->relNode: relation OID

bufdesc->tag->rnode->bucketNode: used by openGauss hash-bucket tables to indicate the buffer type; for ordinary segment-page tables, bucketNode = 16384.

bufdesc->tag->forkNum: relation fork (file suffix) type

bufdesc->tag->blockNum: block number (ordinary tables use it directly; for segment-page tables this is the logical block, and the physical block is kept in bufdesc->extra, converted via seg_logic_to_physic_mapping when the buffer is read)

bufdesc->content_lock: buffer content lock

bufdesc->extra->seg_fileno: segment-page file number

bufdesc->extra->seg_blockno: physical block number in the segment-page files; each segment-page file holds 131072 blocks.

bufdesc->extra->rec_lsn:

bufdesc->extra->lsn_on_disk: latest LSN of this buffer's modifications that are on disk

                          typedef struct RelFileNode {
                             Oid spcNode; /* tablespace */
                             Oid dbNode;  /* database */
                             Oid relNode; /* relation */
                             int2 bucketNode; /* bucketid */
                             uint2 opt;
                          } RelFileNode;

                            /*
                            * The physical storage of a relation consists of one or more forks. The
                            * main fork is always created, but in addition to that there can be
                            * additional forks for storing various metadata. ForkNumber is used when
                            * we need to refer to a specific fork in a relation.
                            */
                            typedef int ForkNumber;


                            #define SEGMENT_EXT_8192_FORKNUM -8
                            #define SEGMENT_EXT_1024_FORKNUM -7
                            #define SEGMENT_EXT_128_FORKNUM -6
                            #define SEGMENT_EXT_8_FORKNUM -5


                            #define PAX_DFS_TRUNCATE_FORKNUM -4
                            #define PAX_DFS_FORKNUM -3
                            #define DFS_FORKNUM -2
                            #define InvalidForkNumber -1
                            #define MAIN_FORKNUM 0
                            #define FSM_FORKNUM 1
                            #define VISIBILITYMAP_FORKNUM 2
                            #define BCM_FORKNUM 3
                            #define INIT_FORKNUM 4
                            // used for data file cache, you can modify than as you like
                            #define PCA_FORKNUM 5
                            #define PCD_FORKNUM 6