Installing and Uninstalling Greenplum MADlib

August 12, 2023

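Apache MADlib is an open-source library for scalable in-database analytics. The Greenplum MADlib extension makes it possible to run machine learning and deep learning workloads inside a Greenplum database.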

1. Installing MADlib

1.1 Installing the MADlib package

Download the appropriate version of the MADlib extension package from VMware Tanzu, upload it to the Greenplum master host, and extract it:

$ tar xzvf madlib-1.18.0+2-gp6-rhel7-x86_64.tar.gz

Install the package by running the gppkg command. For example:

[gpadmin@gpmdw opt]$ gppkg -i ./madlib-1.18.0+2-gp6-rhel7-x86_64/madlib-1.18.0+2-gp6-rhel7-x86_64.gppkg
20220209:10:16:19:002504 gppkg:gpmdw:gpadmin-[INFO]:-Starting gppkg with args: -i ./madlib-1.18.0+2-gp6-rhel7-x86_64/madlib-1.18.0+2-gp6-rhel7-x86_64.gppkg
20220209:10:16:19:002504 gppkg:gpmdw:gpadmin-[INFO]:-Installing package madlib-1.18.0+2-gp6-rhel7-x86_64.gppkg
20220209:10:16:19:002504 gppkg:gpmdw:gpadmin-[INFO]:-Validating rpm installation cmdStr='rpm --test -i /usr/local/greenplum-db-6.19.1/.tmp/madlib-1.18.0-1.x86_64.rpm --dbpath /usr/local/greenplum-db-6.19.1/share/packages/database --prefix /usr/local/greenplum-db-6.19.1'
20220209:10:16:20:002504 gppkg:gpmdw:gpadmin-[INFO]:-Installing madlib-1.18.0+2-gp6-rhel7-x86_64.gppkg locally
20220209:10:16:20:002504 gppkg:gpmdw:gpadmin-[INFO]:-Validating rpm installation cmdStr='rpm --test -i /usr/local/greenplum-db-6.19.1/.tmp/madlib-1.18.0-1.x86_64.rpm --dbpath /usr/local/greenplum-db-6.19.1/share/packages/database --prefix /usr/local/greenplum-db-6.19.1'
20220209:10:16:20:002504 gppkg:gpmdw:gpadmin-[INFO]:-Installing rpms cmdStr='rpm -i --force /usr/local/greenplum-db-6.19.1/.tmp/madlib-1.18.0-1.x86_64.rpm --dbpath /usr/local/greenplum-db-6.19.1/share/packages/database --prefix=/usr/local/greenplum-db-6.19.1'
20220209:10:16:21:002504 gppkg:gpmdw:gpadmin-[INFO]:-Completed local installation of madlib-1.18.0+2-gp6-rhel7-x86_64.gppkg.
20220209:10:16:21:002504 gppkg:gpmdw:gpadmin-[INFO]:-Please run the following command to deploy MADlib
usage: madpack install [-s schema_name] -p greenplum -c user@host:port/database
Example:
$ $GPHOME/madlib/bin/madpack install -s madlib -p greenplum -c gpadmin@mdw:5432/testdb
This will install MADlib objects into a Greenplum database named "testdb" running on server "mdw" on port 5432. Installer will try to login as "gpadmin" and will prompt for password. The target schema will be "madlib".
To upgrade to a new version of MADlib from version v1.0 or later, use option "upgrade", instead of "install"
For additional options run:
$ madpack --help
Release notes and additional documentation can be found at http://madlib.apache.org
20220209:10:16:21:002504 gppkg:gpmdw:gpadmin-[INFO]:-madlib-1.18.0+2-gp6-rhel7-x86_64.gppkg successfully installed.
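The path passed to gppkg -i mirrors the tarball name: in this example the archive unpacks into a directory with the same base name, and that directory holds a .gppkg file with the same base name again. A minimal sketch of deriving the path, assuming this naming pattern also holds for the version you downloaded. The variable names are illustrative, and the install itself only runs when gppkg is on the PATH and the file exists:

```shell
# Tarball name as used in this article; adjust for your download.
pkg_tarball="madlib-1.18.0+2-gp6-rhel7-x86_64.tar.gz"

# Strip the .tar.gz suffix to get the directory the archive unpacks into.
pkg_dir="${pkg_tarball%.tar.gz}"

# The .gppkg file inside that directory carries the same base name
# (an assumption based on the example in this article).
gppkg_file="./${pkg_dir}/${pkg_dir}.gppkg"
echo "gppkg file: ${gppkg_file}"

# Install only when gppkg is available and the file is really there.
if command -v gppkg >/dev/null 2>&1 && [ -f "${gppkg_file}" ]; then
    gppkg -i "${gppkg_file}"
else
    echo "skipping install: gppkg or ${gppkg_file} not found"
fi
```

Deriving the path from one variable avoids typing the long version string twice, which is where typos tend to creep in.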

2. Adding MADlib functions to a database

After installing the MADlib package, run the madpack command to add the MADlib functions to a Greenplum database. madpack is located in the $GPHOME/madlib/bin directory.
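madpack identifies the target database with a single connection string of the form user@host:port/database, passed with -c. A small sketch of assembling that string from its parts. The helper madpack_conn is hypothetical (not something MADlib ships), and the values are the ones used in this article:

```shell
# Build the user@host:port/database string that madpack's -c option expects.
# madpack_conn is an illustrative helper, not part of MADlib.
madpack_conn() {
    # $1 = user, $2 = host, $3 = port, $4 = database
    printf '%s@%s:%s/%s\n' "$1" "$2" "$3" "$4"
}

conn="$(madpack_conn gpadmin gpmdw 5432 test)"

# Print the full command for review instead of running it directly.
echo "\$GPHOME/madlib/bin/madpack install -s madlib -p greenplum -c ${conn}"
```

Printing the command first makes it easy to confirm that the database name at the end of the string is the one you intend to install into.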

[gpadmin@gpmdw ~]$ $GPHOME/madlib/bin/madpack -s madlib -p greenplum -c gpadmin@gpmdw:5432/test install
madpack.py: INFO : Detected Greenplum DB version 6.19.1.
madpack.py: INFO : *** Installing MADlib ***
madpack.py: INFO : MADlib tools version = 1.18.0 (/usr/local/greenplum-db-6.19.1/madlib/Versions/1.18.0/bin/../madpack/madpack.py)
madpack.py: INFO : MADlib database version = None (host=gpmdw:5432, db=postgres, schema=madlib)
madpack.py: INFO : Testing PL/Python environment...
madpack.py: INFO : > PL/Python environment OK (version: 2.7.12)
madpack.py: INFO : > Preparing objects for the following modules:
madpack.py: INFO : > - array_ops
madpack.py: ERROR : Failed executing m4 on /usr/local/greenplum-db/madlib/Versions/1.18.0/ports/greenplum/modules/array_ops/array_ops.sql_in
madpack.py: ERROR : Building database objects failed. Before retrying: drop madlib schema OR install MADlib into a different schema.

Note: If the installation fails with the error shown above, the m4 dependency is missing. Install m4 as the root user first.

[root@gpmdw ~]$ yum install m4 -y
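To catch the missing dependency before madpack fails halfway through, a quick preflight check can help. This is a minimal sketch; require_cmd is an illustrative helper name, and the check only looks at the PATH of the current user on the current host:

```shell
# Return 0 if the named command is on the PATH, non-zero otherwise.
# require_cmd is an illustrative helper name.
require_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if require_cmd m4; then
    echo "m4 found: $(command -v m4)"
else
    echo "m4 missing: install it as root (for example, yum install m4 -y) before running madpack"
fi
```

Run it as gpadmin on the host where you will run madpack, since that is the environment in which the m4 call failed in the log above.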

Once m4 is installed, switch back to the gpadmin user and run the installation again:

[gpadmin@gpmdw ~]$ $GPHOME/madlib/bin/madpack install -s madlib -p greenplum -c gpadmin@gpmdw:5432/test
madpack.py: INFO : Detected Greenplum DB version 6.19.1.
madpack.py: INFO : *** Installing MADlib ***
madpack.py: INFO : MADlib tools version = 1.18.0 (/usr/local/greenplum-db-6.19.1/madlib/Versions/1.18.0/bin/../madpack/madpack.py)
madpack.py: INFO : MADlib database version = None (host=gpmdw:5432, db=test, schema=madlib)
madpack.py: INFO : Testing PL/Python environment...
madpack.py: INFO : > PL/Python environment OK (version: 2.7.12)
madpack.py: INFO : > Preparing objects for the following modules:
madpack.py: INFO : > - array_ops
madpack.py: INFO : > - bayes
madpack.py: INFO : > - crf
madpack.py: INFO : > - elastic_net
madpack.py: INFO : > - linalg
madpack.py: INFO : > - pmml
madpack.py: INFO : > - prob
madpack.py: INFO : > - sketch
madpack.py: INFO : > - svec
madpack.py: INFO : > - svm
madpack.py: INFO : > - tsa
madpack.py: INFO : > - stemmer
madpack.py: INFO : > - conjugate_gradient
madpack.py: INFO : > - knn
madpack.py: INFO : > - lda
madpack.py: INFO : > - stats
madpack.py: INFO : > - svec_util
madpack.py: INFO : > - utilities
madpack.py: INFO : > - assoc_rules
madpack.py: INFO : > - convex
madpack.py: INFO : > - dbscan
madpack.py: INFO : > - deep_learning
madpack.py: INFO : > - glm
madpack.py: INFO : > - graph
madpack.py: INFO : > - linear_systems
madpack.py: INFO : > - recursive_partitioning
madpack.py: INFO : > - regress
madpack.py: INFO : > - sample
madpack.py: INFO : > - summary
madpack.py: INFO : > - kmeans
madpack.py: INFO : > - pca
madpack.py: INFO : > - validation
madpack.py: INFO : Installing MADlib:
madpack.py: INFO : > Created madlib schema
madpack.py: INFO : > Created madlib.MigrationHistory table
madpack.py: INFO : > Wrote version info in MigrationHistory table
madpack.py: INFO : MADlib 1.18.0 installed successfully in madlib schema.

MADlib has now been added to the test database.
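The article stops at the success message, but it can be worth confirming the result from the database side. The sketch below is one way to do that, assuming the schema name madlib and the database test used above, and assuming psql can connect as the current user. It queries the madlib.version() function and then runs madpack's install-check action, which exercises the installed modules and can take a while:

```shell
schema="madlib"
db="test"

# SQL that asks the installed library for its version string.
version_sql="SELECT ${schema}.version();"
echo "${version_sql}"

# Run the checks only on a host where the Greenplum client tools exist.
if command -v psql >/dev/null 2>&1 && [ -n "${GPHOME:-}" ]; then
    psql -d "${db}" -c "${version_sql}"
    "${GPHOME}/madlib/bin/madpack" install-check -s "${schema}" -p greenplum \
        -c "gpadmin@gpmdw:5432/${db}"
else
    echo "psql or GPHOME not available; skipping live checks"
fi
```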

3. Uninstalling MADlib

As the gpadmin user, first confirm that Greenplum is running normally.

[gpadmin@gpmdw ~]$ gpstate
20220209:10:01:46:001062 gpstate:gpmdw:gpadmin-[INFO]:-Starting gpstate with args:
20220209:10:01:46:001062 gpstate:gpmdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 6.19.1 build commit:0e314744a460630073b46cea7b7cf20a81e3da63'
20220209:10:01:46:001062 gpstate:gpmdw:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 9.4.26 (Greenplum Database 6.19.1 build commit:0e314744a460630073b46cea7b7cf20a81e3da63) on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 6.4.0, 64-bit compiled on Jan 18 2022 13:41:23'
20220209:10:01:46:001062 gpstate:gpmdw:gpadmin-[INFO]:-Obtaining Segment details from master...
20220209:10:01:46:001062 gpstate:gpmdw:gpadmin-[INFO]:-Gathering data from segments...
20220209:10:01:46:001062 gpstate:gpmdw:gpadmin-[INFO]:-Greenplum instance status summary
20220209:10:01:46:001062 gpstate:gpmdw:gpadmin-[INFO]:-----------------------------------------------------
20220209:10:01:46:001062 gpstate:gpmdw:gpadmin-[INFO]:- Master instance = Active
20220209:10:01:46:001062 gpstate:gpmdw:gpadmin-[INFO]:- Master standby = No master standby configured
20220209:10:01:46:001062 gpstate:gpmdw:gpadmin-[INFO]:- Total segment instance count from metadata = 12
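If you script the uninstall, the gpstate check can be automated by looking for the "Master instance = Active" line. A minimal sketch, assuming the summary line keeps the wording shown in the gpstate output in this article. master_is_active is an illustrative helper name, and the pattern tolerates variable spacing around the equals sign:

```shell
# Succeed if gpstate output on stdin reports an active master.
# master_is_active is an illustrative helper name.
master_is_active() {
    grep -Eq 'Master instance[[:space:]]*=[[:space:]]*Active'
}

# Sample line in the format gpstate printed in this article.
sample='20220209:10:01:46:001062 gpstate:gpmdw:gpadmin-[INFO]:- Master instance = Active'
if printf '%s\n' "${sample}" | master_is_active; then
    echo "master is active"
fi

# On a live system the same check would be: gpstate | master_is_active
```

This only checks the master line. A fuller health check would also look at the segment status lines that gpstate prints further down.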

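If no database uses the MADlib functions, use the Greenplum gppkg utility with the -r option to uninstall the MADlib package. When removing a package, you must specify the package name and version. This example uninstalls MADlib package version 1.18.0: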

[gpadmin@gpmdw ~]$ gppkg -r madlib-1.18.0+2-gp6-rhel7-x86_64
20220209:10:02:15:000737 gppkg:gpmdw:gpadmin-[INFO]:-Starting gppkg with args: -r madlib-1.18.0+2-gp6-rhel7-x86_64
20220209:10:02:15:000737 gppkg:gpmdw:gpadmin-[INFO]:-Uninstalling package madlib-1.18.0+2-gp6-rhel7-x86_64.gppkg
20220209:10:02:15:000737 gppkg:gpmdw:gpadmin-[INFO]:-Validating rpm uninstallation cmdStr='rpm --test -e madlib-1.18.0-1 --dbpath /usr/local/greenplum-db-6.19.1/share/packages/database'
20220209:10:02:16:000737 gppkg:gpmdw:gpadmin-[INFO]:-Validating rpm uninstallation cmdStr='rpm --test -e madlib-1.18.0-1 --dbpath /usr/local/greenplum-db-6.19.1/share/packages/database'
20220209:10:02:16:000737 gppkg:gpmdw:gpadmin-[INFO]:-Uninstalling rpms cmdStr='rpm -e madlib-1.18.0-1 --dbpath /usr/local/greenplum-db-6.19.1/share/packages/database'
20220209:10:02:16:000737 gppkg:gpmdw:gpadmin-[INFO]:-Completed local uninstallation of madlib-1.18.0+2-gp6-rhel7-x86_64.gppkg.
20220209:10:02:16:000737 gppkg:gpmdw:gpadmin-[INFO]:-madlib-1.18.0+2-gp6-rhel7-x86_64.gppkg successfully uninstalled.
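The argument to gppkg -r in this example is the .gppkg file name without its suffix. The sketch below derives it, then optionally lists databases that still contain a MADlib schema, as a rough way to check the "no database uses MADlib" precondition. The pre-check is an assumption-laden illustration: it presumes psql can connect as the current user and that MADlib was installed into a schema named madlib, as in this article:

```shell
# gppkg -r takes the package name and version without the .gppkg suffix.
gppkg_file="madlib-1.18.0+2-gp6-rhel7-x86_64.gppkg"
remove_arg="$(basename "${gppkg_file}" .gppkg)"
echo "gppkg -r ${remove_arg}"

# Optional pre-check: report databases that still have a "madlib" schema.
# Skipped entirely when psql is not installed; prints nothing if psql
# cannot connect.
if command -v psql >/dev/null 2>&1; then
    for db in $(psql -d postgres -Atc \
        "SELECT datname FROM pg_database WHERE datallowconn" 2>/dev/null); do
        found="$(psql -d "${db}" -Atc \
            "SELECT count(*) FROM pg_namespace WHERE nspname = 'madlib'" 2>/dev/null)"
        if [ "${found}" = "1" ]; then
            echo "MADlib schema still present in database: ${db}"
        fi
    done
fi
```

A schema that is still present does not prove anything depends on it, but it is a reasonable prompt to look before removing the package.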


After the uninstall completes, restart the database:

[gpadmin@gpmdw ~]$ gpstop -r

Shenzhen Jinxinquan Technology Co., Ltd. (深圳市金鑫泉科技有限公司) is a global Greenplum partner and offers Greenplum services; feel free to get in touch.

Source: https://blog.csdn.net/SZJXQ2021/article/details/123628512
