openGauss 5.0.0 Enterprise Edition: Two-Node CM High Availability in Practice

March 22, 2024

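Introduction

CM supports VIP management

1. Applications can connect to the database through a configured VIP. When the primary host fails and a primary/standby switchover occurs, application connections automatically reconnect to the new primary (within milliseconds);

2. If the database ends up with two primaries, connecting through the VIP ensures that clients reach exactly one primary, which reduces the risk of data loss in a dual-primary situation.

CM supports two-node deployment

1. By introducing a third-party gateway IP, CM effectively solves the self-arbitration problem of a two-node CM cluster deployment; this is supported for both CMS and DN;

2. It also supports dynamic configuration of the CM cluster failover policy and the database cluster split-brain recovery policy, so that the integrity and consistency of cluster data are preserved as far as possible.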
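
The idea behind gateway-based arbitration can be illustrated with a small decision function. This is a simplified sketch of the general tie-breaker pattern and not CM's actual algorithm; the function name `arbitrate`, its arguments, and the returned labels are all hypothetical.

```shell
#!/usr/bin/env bash
# Simplified illustration of gateway-based arbitration in a two-node cluster.
# NOT CM's real implementation; names and rules here are assumptions.
#
# usage: arbitrate ROLE PEER_REACHABLE GATEWAY_REACHABLE
#   ROLE               primary | standby   (current local role)
#   PEER_REACHABLE     yes | no            (can this node reach the other node?)
#   GATEWAY_REACHABLE  yes | no            (can this node reach the gateway IP?)
arbitrate() {
    local role=$1 peer=$2 gateway=$3

    if [ "$peer" = yes ]; then
        # Both nodes see each other: nothing to arbitrate.
        echo "keep-$role"
    elif [ "$gateway" = yes ]; then
        # The peer is gone but the gateway answers: the peer is the likely
        # failure, so this node may hold (or take over) the primary role.
        echo "act-as-primary"
    else
        # Neither peer nor gateway answers: this node is probably the
        # isolated one and should not serve as primary.
        echo "act-as-standby"
    fi
}

arbitrate primary yes yes   # prints: keep-primary
arbitrate standby no  yes   # prints: act-as-primary
arbitrate primary no  no    # prints: act-as-standby
```

The third party matters because, with only two nodes, a node that loses contact with its peer cannot tell a peer crash from its own network isolation; the gateway check is what breaks that tie.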

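Installation preparation

The preparation work was already described in the article on the single-node x86 installation of openGauss 5.0.0 Enterprise Edition, so it is not repeated in detail here. The steps for the primary and standby hosts are as follows:

1. The CPU architecture is x86 and the operating system is CentOS 7.6. Download the database installation package that matches your operating system.

2. Disable the firewall and SELinux

3. Disable RemoveIPC

4. Set the time zone and time

5. Set the NIC MTU value

6. Allow remote login for root

7. Create the database user and user group

8. Configure core_pattern

9. Install Python 3.6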
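
Some of these settings live in plain `KEY=value` files and can be checked without changing anything. The sketch below is one possible read-only check: the helper name `conf_value` is hypothetical, and the demo runs on throwaway files. On CentOS 7 the real files would normally be `/etc/selinux/config` and `/etc/systemd/logind.conf`, and the values shown (`SELINUX=disabled`, `RemoveIPC=no`) are the ones these preparation steps aim for.

```shell
#!/usr/bin/env bash
# Read-only helper: print the last uncommented KEY=value setting in a file.
# usage: conf_value KEY FILE
conf_value() {
    local key=$1 file=$2
    # Match "KEY = value" at the start of a line (comment lines do not match),
    # print only the value, and keep the last occurrence.
    sed -n "s/^[[:space:]]*${key}[[:space:]]*=[[:space:]]*\([^[:space:]#]*\).*/\1/p" "$file" | tail -n 1
}

# Demo on temporary files so that the sketch runs anywhere.
demo_dir=$(mktemp -d)
printf '%s\n' '# comment' 'SELINUX=disabled' 'SELINUXTYPE=targeted' > "$demo_dir/selinux"
printf '%s\n' '[Login]' '#RemoveIPC=yes' 'RemoveIPC=no' > "$demo_dir/logind.conf"

echo "SELINUX   = $(conf_value SELINUX "$demo_dir/selinux")"
echo "RemoveIPC = $(conf_value RemoveIPC "$demo_dir/logind.conf")"
rm -rf "$demo_dir"
```

Running the same check on both hosts before `gs_preinstall` is a cheap way to catch a node that was prepared differently.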

Installation XML file description

    [opengauss@test2 dn1]$ cat /opt/software/cm2.xml

Installing openGauss

Pre-installation as root

      ./gs_preinstall -U opengauss -G dbgrp -X /opt/software/cm2.xml
      Parsing the configuration file.
      Successfully parsed the configuration file.
      Installing the tools on the local node.
      Successfully installed the tools on the local node.
      Are you sure you want to create trust for root (yes/no)?yes
      Please enter password for root
      Password:
      Password:
      Successfully created SSH trust for the root permission user.
      Setting host ip env
      Successfully set host ip env.
      Distributing package.
      Begin to distribute package to tool path.
      Successfully distribute package to tool path.
      Begin to distribute package to package path.
      Successfully distribute package to package path.
      Successfully distributed package.
      Are you sure you want to create the user[opengauss] and create trust for it (yes/no)? yes
      Please enter password for cluster user.
      Password:
      Please enter password for cluster user again.
      Password:
      Generate cluster user password files successfully.


      Successfully created [opengauss] user on all nodes.
      Preparing SSH service.
      Successfully prepared SSH service.
      Installing the tools in the cluster.
      Successfully installed the tools in the cluster.
      Checking hostname mapping.
      Successfully checked hostname mapping.
      Creating SSH trust for [opengauss] user.
      Please enter password for current user[opengauss].
      Password:
      Checking network information.
      All nodes in the network are Normal.
      Successfully checked network information.
      Creating SSH trust.
      Creating the local key file.
      Successfully created the local key files.
      Appending local ID to authorized_keys.
      Successfully appended local ID to authorized_keys.
      Updating the known_hosts file.
      Successfully updated the known_hosts file.
      Appending authorized_key on the remote node.
      Successfully appended authorized_key on all remote node.
      Checking common authentication file content.
      Successfully checked common authentication content.
      Distributing SSH trust file to all node.
      Distributing trust keys file to all node successfully.
      Successfully distributed SSH trust file to all node.
      Verifying SSH trust on all hosts.
      Successfully verified SSH trust on all hosts.
      Successfully created SSH trust.
      Successfully created SSH trust for [opengauss] user.
      Checking OS software.
      Successfully check os software.
      Checking OS version.
      Successfully checked OS version.
      Creating cluster's path.
      Successfully created cluster's path.
      Set and check OS parameter.
      Setting OS parameters.
      Successfully set OS parameters.
      Warning: Installation environment contains some warning messages.
      Please get more details by "/opt/software/openGauss/script/gs_checkos -i A -h ps-vbdb-test2,ps-vbdb-test3 --detail".
      Set and check OS parameter completed.
      Preparing CRON service.
      Successfully prepared CRON service.
      Setting user environmental variables.
      Successfully set user environmental variables.
      Setting the dynamic link library.
      Successfully set the dynamic link library.
      Setting Core file
      Successfully set core path.
      Setting pssh path
      Successfully set pssh path.
      Setting Cgroup.
      Successfully set Cgroup.
      Set ARM Optimization.
      No need to set ARM Optimization.
      Fixing server package owner.
      Setting finish flag.
      Successfully set finish flag.
      Preinstallation succeeded.

Switch to the regular user and install

        gs_install -X /opt/software/cm2.xml
        Parsing the configuration file.
        Check preinstall on every node.
        Successfully checked preinstall on every node.
        Creating the backup directory.
        Successfully created the backup directory.
        begin deploy..
        Installing the cluster.
        begin prepare Install Cluster..
        Checking the installation environment on all nodes.
        begin install Cluster..
        Installing applications on all nodes.
        Successfully installed APP.
        begin init Instance..
        encrypt cipher and rand files for database.
        Please enter password for database:
        Please repeat for database:
        begin to create CA cert files
        The sslcert will be generated in /home/opengauss/app/share/sslcert/om
        Create CA files for cm beginning.
        Create CA files on directory [/home/opengauss/app_a07d57c3/share/sslcert/cm]. file list: ['cacert.pem', 'server.key', 'server.crt', 'client.key', 'client.crt', 'server.key.cipher', 'server.key.rand', 'client.key.cipher', 'client.key.rand']
        Non-dss_ssl_enable, no need to create CA for DSS
        Cluster installation is completed.
        Configuring.
        Deleting instances from all nodes.
        Successfully deleted instances from all nodes.
        Checking node configuration on all nodes.
        Initializing instances on all nodes.
        Updating instance configuration on all nodes.
        Check consistence of memCheck and coresCheck on database nodes.
        Successful check consistence of memCheck and coresCheck on all nodes.
        Configuring pg_hba on all nodes.
        Configuration is completed.
        Starting cluster.
        ======================================================================
        Successfully started primary instance. Wait for standby instance.
        ======================================================================
        .
        Successfully started cluster.
        ======================================================================
        cluster_state      : Normal
        redistributing     : No
        node_count         : 2
        Datanode State
           primary           : 1
           standby           : 1
           secondary         : 0
           cascade_standby   : 0
           building          : 0
           abnormal          : 0
           down              : 0


        Successfully installed application.
        end deploy.

Querying the cluster status

          gs_om -t status --detail
          [  CMServer State   ]


          node             node_ip         instance                          state
          --------------------------------------------------------------------------
          1  test1 xx.x.xx.xx    1    /data/openGauss/cm/cm_server Primary
          2  test2 xx.x.xx.xx    2    /data/openGauss/cm/cm_server Standby


          [   Cluster State   ]


          cluster_state   : Normal
          redistributing  : No
          balanced        : Yes
          current_az      : AZ_ALL


          [  Datanode State   ]


          node             node_ip         instance                 state
          ---------------------------------------------------------------------------
          1  test1 xx.x.xx.xx    6001 /data/openGauss/dn1 P Primary Normal
          2  test2 xx.x.xx.xx    6002 /data/openGauss/dn2 S Standby Normal

Starting and stopping the database:


          [opengauss@test2 ~]$ gs_om -t stop
          Stopping cluster.
          =========================================
          Successfully stopped cluster.
          =========================================
          End stop cluster.
          [opengauss@test2 ~]$ gs_om -t start
          Starting cluster.
          ======================================================================
          Successfully started primary instance. Wait for standby instance.
          ======================================================================
          .
          Successfully started cluster.
          ======================================================================
          cluster_state      : Normal
          redistributing     : No
          node_count         : 2
          Datanode State
             primary           : 1
             standby           : 1
             secondary         : 0
             cascade_standby   : 0
             building          : 0
             abnormal          : 0
             down              : 0


          Successfully started cluster.
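
The summary block printed at the end of the start can also be checked by a script rather than by eye. The following is a minimal sketch, assuming the summary keeps the `key : value` layout shown above; the function name `cluster_healthy` and the chosen criteria (state Normal, one primary, one standby, nothing down) are assumptions made for this two-node setup.

```shell
#!/usr/bin/env bash
# Decide whether a two-node cluster came up healthy, based on the summary
# block that gs_om prints (read from stdin). Exit status 0 means healthy.
cluster_healthy() {
    awk -F':' '
        {
            # Normalise "  key   : value" into key and value without blanks.
            key = $1; val = $2
            gsub(/[[:space:]]/, "", key)
            gsub(/[[:space:]]/, "", val)
        }
        key == "cluster_state" { state = val }
        key == "primary"       { primary = val }
        key == "standby"       { standby = val }
        key == "down"          { down = val }
        END {
            ok = (state == "Normal" && primary == "1" && standby == "1" && down == "0")
            exit !ok
        }
    '
}

# Sample taken from the start output above.
sample='cluster_state      : Normal
redistributing     : No
node_count         : 2
Datanode State
   primary           : 1
   standby           : 1
   secondary         : 0
   cascade_standby   : 0
   building          : 0
   abnormal          : 0
   down              : 0'

if printf '%s\n' "$sample" | cluster_healthy; then
    echo "cluster is healthy"
else
    echo "cluster needs attention"
fi
```

On a live cluster the same function could be fed from the captured output of `gs_om -t start`.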

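Process information on the primary node:

Process information on the standby node:

After a successful installation, log in to the database and run operations:

Primary node:

Standby node:

Primary/standby switchover

Cluster information on the original primary before the switchover: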

            [opengauss@test1 ~]$ ps ux
            USER        PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
            opengau+ 154168  0.0  0.0  21956   832 ?        Ss   Aug03   0:00 ssh-agent -a /home/opengauss/gaussdb_tmp/gauss_socket_tmp
            opengau+ 166310  0.6  0.0  41724  8784 ?        S    00:00   8:48 /home/opengauss/app/bin/om_monitor -L /var/log/gaussdb_log/opengauss/cm/om_monitor
            opengau+ 168867 13.5  0.1 1509496 26928 ?       Sl   00:00 174:54 /home/opengauss/app/bin/cm_agent
            opengau+ 168885 12.5  2.8 6652180 471124 ?      Sl   00:00 162:34 /home/opengauss/app/bin/cm_server
            opengau+ 168905  0.0  0.2 1409964 41324 ?       Sl   00:00   0:00 gaussdb fenced UDF master process
            opengau+ 169254  5.1  7.6 7782296 1257508 ?     Ssl  00:00  66:18 /home/opengauss/app/bin/gaussdb -D /data/openGauss/dn1 -M standby
            gs_om -t status --detail
            [  CMServer State   ]


            node             node_ip         instance                          state
            --------------------------------------------------------------------------
            1  test1 xx.x.xx.xx    1    /data/openGauss/cm/cm_server Primary
            2  test2 xx.x.xx.xx    2    /data/openGauss/cm/cm_server Standby


            [   Cluster State   ]


            cluster_state   : Normal
            redistributing  : No
            balanced        : Yes
            current_az      : AZ_ALL


            [  Datanode State   ]


            node             node_ip         instance                 state
            ---------------------------------------------------------------------------
            1  test1 xx.x.xx.xx    6001 /data/openGauss/dn1 P Primary Normal                  ## the primary node shows P
            2  test2 xx.x.xx.xx    6002 /data/openGauss/dn2 S Standby Normal
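
When a script needs to know which node currently holds the primary datanode, the Datanode State rows can be filtered with a short awk program. This is a sketch that assumes the eight-column row layout seen in the listing above; the function name `primary_node` is hypothetical.

```shell
#!/usr/bin/env bash
# Print the name of the node whose datanode currently runs as Primary.
# Input: text of `gs_om -t status --detail` on stdin. Datanode rows are
# assumed to have eight whitespace-separated fields:
#   node_id node_name node_ip instance_id data_dir initial_role current_role state
# CMServer rows have fewer fields, so they are skipped by the NF test.
primary_node() {
    awk 'NF == 8 && $7 == "Primary" { print $2 }'
}

# Rows copied from the status listing above.
status='1  test1 xx.x.xx.xx    1    /data/openGauss/cm/cm_server Primary
2  test2 xx.x.xx.xx    2    /data/openGauss/cm/cm_server Standby
1  test1 xx.x.xx.xx    6001 /data/openGauss/dn1 P Primary Normal
2  test2 xx.x.xx.xx    6002 /data/openGauss/dn2 S Standby Normal'

printf '%s\n' "$status" | primary_node   # prints: test1
```

The sixth column (P or S) is the role assigned at installation time, while the seventh is the current role, which is why the filter looks at the seventh.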

After a successful switchover, the original primary becomes the standby node

              gs_om -t status --detail
              [  CMServer State   ]


              node             node_ip         instance                          state
              --------------------------------------------------------------------------
              1  test1 xx.x.xx.xx    1    /data/openGauss/cm/cm_server Primary
              2  test2 xx.x.xx.xx    2    /data/openGauss/cm/cm_server Standby


              [   Cluster State   ]


              cluster_state   : Normal
              redistributing  : No
              balanced        : No
              current_az      : AZ_ALL


              [  Datanode State   ]


              node             node_ip         instance                 state
              ---------------------------------------------------------------------------
              1  test1 xx.x.xx.xx    6001 /data/openGauss/dn1 P Standby Normal
              2  test2 xx.x.xx.xx    6002 /data/openGauss/dn2 S Primary Normal
              [opengauss@test1 ~]$ ps ux
              USER        PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
              opengau+ 154168  0.0  0.0  21956   832 ?        Ss   Aug03   0:00 ssh-agent -a /home/opengauss/gaussdb_tmp/gauss_socket_tmp
              opengau+ 166310  0.6  0.0  41724  8784 ?        S    00:00   9:10 /home/opengauss/app/bin/om_monitor -L /var/log/gaussdb_log/opengauss/cm/om_monitor
              opengau+ 181143  0.0  0.0 115544  2056 pts/1    S    21:33   0:00 -bash
              opengau+ 212240 13.6  0.1 1443956 26628 ?       Sl   22:21   0:47 /home/opengauss/app/bin/cm_agent
              opengau+ 212259 12.9  2.5 6391332 416212 ?      Sl   22:21   0:44 /home/opengauss/app/bin/cm_server
              opengau+ 212271  7.4  7.6 7730032 1251812 ?     Sl   22:21   0:25 /home/opengauss/app/bin/gaussdb -D /data/openGauss/dn1 -M pending
              opengau+ 212278  0.0  0.2 1409968 41272 ?       Sl   22:21   0:00 gaussdb fenced UDF master process
              opengau+ 216922  0.0  0.0 155460  1864 pts/1    R+   22:27   0:00 ps ux
              [opengauss@test1 ~]$ gsql -d postgres  -p 15400 -r
              gsql ((openGauss 5.0.0 build a07d57c3) compiled at 2023-03-29 03:07:56 commit 0 last mr  )
              Non-SSL connection (SSL connection is recommended when requiring high-security)
              Type "help" for help.


              openGauss=# insert into test values(1);
              ERROR:  cannot execute INSERT in a read-only transaction
              openGauss=# select * from test;
              a
              ---
              1
              1
              (2 rows)


              openGauss=# \q
              [opengauss@ps-vbdb-test2 ~]$

The original standby node is successfully promoted to primary:

                gs_ctl switchover -D /data/openGauss/dn2
                [2023-08-04 22:26:43.517][171430][][gs_ctl]: gs_ctl switchover ,datadir is /data/openGauss/dn2
                [2023-08-04 22:26:43.517][171430][][gs_ctl]: switchover term (1)
                [2023-08-04 22:26:43.525][171430][][gs_ctl]: waiting for server to switchover........
                [2023-08-04 22:26:48.567][171430][][gs_ctl]: done
                [2023-08-04 22:26:48.567][171430][][gs_ctl]: switchover completed (/data/openGauss/dn2)
                [opengauss@test2 dn2]$ gs_ctl status --detail
                gs_ctl: unrecognized option '--detail'
                Try "gs_ctl --help" for more information.
                [opengauss@test2 dn2]$ ps ux
                USER        PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
                opengau+  46514  0.0  0.0  72472   964 ?        Ss   Aug03   0:00 ssh-agent -a /home/opengauss/gaussdb_tmp/gauss_socket_tmp
                opengau+  46590  0.0  0.0  72472   776 ?        Ss   Aug03   0:00 ssh-agent -s
                opengau+  52674  0.0  0.0 115544  2084 pts/0    S    Aug03   0:00 -bash
                opengau+  54665  0.7  0.0  41728  8796 ?        S    00:00  10:31 /home/opengauss/app/bin/om_monitor -L /var/log/gaussdb_log/opengauss/cm/om_monitor
                opengau+ 167866 13.8  0.1 1443960 26636 ?       Sl   22:21   0:48 /home/opengauss/app/bin/cm_agent
                opengau+ 167884 11.8  2.5 6260128 415892 ?      Sl   22:21   0:41 /home/opengauss/app/bin/cm_server
                opengau+ 167897 11.6  7.7 7869340 1265916 ?     Sl   22:21   0:41 /home/opengauss/app/bin/gaussdb -D /data/openGauss/dn2 -M pending
                opengau+ 167904  0.0  0.2 1409968 41244 ?       Sl   22:21   0:00 gaussdb fenced UDF master process
                opengau+ 171967  0.0  0.0 155460  1860 pts/0    R+   22:27   0:00 ps ux
                [opengauss@test2 dn2]$ gsql -d postgres  -p 15400 -r
                gsql ((openGauss 5.0.0 build a07d57c3) compiled at 2023-03-29 03:07:56 commit 0 last mr  )
                Non-SSL connection (SSL connection is recommended when requiring high-security)
                Type "help" for help.


                openGauss=# \d+
                                                       List of relations
                Schema | Name | Type  |   Owner   |    Size    |             Storage              | Description
                --------+------+-------+-----------+------------+----------------------------------+-------------
                public | test | table | opengauss | 8192 bytes | {orientation=row,compression=no} |
                (1 row)


                openGauss=# insert into test values(1);
                INSERT 0 1
                openGauss=#
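
The timestamps in the `gs_ctl switchover` log above give a rough measure of how long the role change took. A minimal sketch for extracting that figure follows; the function name `elapsed_seconds` is hypothetical, and the calculation assumes the log does not cross midnight.

```shell
#!/usr/bin/env bash
# Print the seconds elapsed between the first and last timestamped line
# of a gs_ctl log read from stdin. Timestamps look like
# [2023-08-04 22:26:43.517]; only the time-of-day part is used.
elapsed_seconds() {
    awk '
        match($0, /[0-9][0-9]:[0-9][0-9]:[0-9][0-9]\.[0-9]+/) {
            split(substr($0, RSTART, RLENGTH), t, ":")
            now = t[1] * 3600 + t[2] * 60 + t[3]
            if (!seen) { first = now; seen = 1 }
            last = now
        }
        END { if (seen) printf "%.3f\n", last - first }
    '
}

# First, intermediate, and final log lines from the switchover above.
log='[2023-08-04 22:26:43.517][171430][][gs_ctl]: gs_ctl switchover ,datadir is /data/openGauss/dn2
[2023-08-04 22:26:43.525][171430][][gs_ctl]: waiting for server to switchover........
[2023-08-04 22:26:48.567][171430][][gs_ctl]: switchover completed (/data/openGauss/dn2)'

printf '%s\n' "$log" | elapsed_seconds   # prints: 5.050
```

For this run the planned switchover took about five seconds from the command being issued to the completion message.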

After a successful switchover, run gs_om -t refreshconf to save the primary/standby host information:

                  gs_om -t refreshconf
                  Generating dynamic configuration file for all nodes.
                  Successfully generated dynamic configuration file.

Forcibly stop the standby node; CM automatically brings it back up:

                    gs_ctl stop -D /data/openGauss/dn2
                    [2023-08-04 22:57:36.610][197800][][gs_ctl]: gs_ctl stopped ,datadir is /data/openGauss/dn2
                    waiting for server to shut down.... done
                    server stopped
                    [opengauss@test2 dn2]$ ps ux
                    USER        PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
                    opengau+  46514  0.0  0.0  72472   964 ?        Ss   Aug03   0:00 ssh-agent -a /home/opengauss/gaussdb_tmp/gauss_socket_tmp
                    opengau+  46590  0.0  0.0  72472   776 ?        Ss   Aug03   0:00 ssh-agent -s
                    opengau+  52674  0.0  0.0 115544  2116 pts/0    S    Aug03   0:00 -bash
                    opengau+  54665  0.7  0.0  41728  8796 ?        S    00:00  10:45 /home/opengauss/app/bin/om_monitor -L /var/log/gaussdb_log/opengauss/cm/om_monitor
                    opengau+ 193209 13.8  0.1 1443956 26632 ?       Sl   22:54   0:23 /home/opengauss/app/bin/cm_agent
                    opengau+ 193227 11.9  2.5 6325664 415728 ?      Sl   22:54   0:20 /home/opengauss/app/bin/cm_server
                    opengau+ 193247  0.0  0.2 1409968 41264 ?       Sl   22:54   0:00 gaussdb fenced UDF master process
                    opengau+ 197815  0.0  0.2 1344972 33648 ?       Sl   22:57   0:00 /home/opengauss/app/bin/gaussdb -D /data/openGauss/dn2 -M pending
                    opengau+ 197826  0.0  0.0 1196260 15560 ?       R    22:57   0:00 /home/opengauss/app/bin/gaussdb -V
                    opengau+ 197827  0.0  0.0 155460  1860 pts/0    R+   22:57   0:00 ps ux
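
One way to confirm that the datanode really was relaunched is to compare its PID across two `ps ux` listings. The sketch below assumes the standard `ps ux` column layout, where the command starts at the eleventh field; the function name `dn_pid` is hypothetical.

```shell
#!/usr/bin/env bash
# Print the PID of the gaussdb process serving DATADIR, taken from
# `ps ux` text on stdin. Columns: USER PID %CPU %MEM VSZ RSS TTY STAT
# START TIME COMMAND..., so the command begins at field 11.
dn_pid() {
    local datadir=$1
    awk -v dir="$datadir" '
        $11 ~ /\/gaussdb$/ && $12 == "-D" && $13 == dir { print $2; exit }
    '
}

# One row from the listing taken after the switchover and one from the
# listing taken right after the forced stop.
before='opengau+ 167897 11.6  7.7 7869340 1265916 ?     Sl   22:21   0:41 /home/opengauss/app/bin/gaussdb -D /data/openGauss/dn2 -M pending'
after='opengau+ 197815  0.0  0.2 1344972 33648 ?       Sl   22:57   0:00 /home/opengauss/app/bin/gaussdb -D /data/openGauss/dn2 -M pending'

old_pid=$(printf '%s\n' "$before" | dn_pid /data/openGauss/dn2)
new_pid=$(printf '%s\n' "$after"  | dn_pid /data/openGauss/dn2)

if [ -n "$new_pid" ] && [ "$new_pid" != "$old_pid" ]; then
    echo "datanode was relaunched: PID $old_pid -> $new_pid"
fi
```

In the listing above, the new process also has a START time of 22:57, the same minute as the `gs_ctl stop`, which is consistent with CM restarting the instance almost immediately.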

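Summary

This experiment verified the switchover operation of an openGauss two-node CM cluster and provided further familiarity with its high-availability features.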
