分布式编译系统的搭建

2023年 8月 15日 98.4k 0

准备工作

数据库版本

GreatSQL-8.0.25-17

生成数据

使用 TPC-H 生成数据

#TPC-H Population Generator (Version 3.0.0)

#生成10G的数据
$ ./dbgen -vf -s 10

修改my.cnf

vim /etc/my.cnf

#设置IPB为8G
innodb_buffer_pool_size = 8G

#设置并行查询的使用最大内存(此处为8G,根据具体配置设置)
parallel_memory_limit= 8G

#打开并行查询
force_parallel_execute=1

#设置双1(方便导入数据)
innodb_flush_log_at_trx_commit = 1
sync_binlog = 1

#关闭binlog
skip-log_bin

datadir = /data/GreatSQL
socket = mysql.sock

启动数据库后,可以检查配置是否生效

mysql> show variables like '%double%';
mysql> show variables like 'log_bin';
mysql> show variables like 'sync_binlog';
mysql> show variables like 'innodb_flush_log_at_trx_commit';
mysql> show variables like 'innodb_buffer_pool_size';

并行查询相关参数

mysql> show global variables like '%parall%';
+----------------------------------+----------------+
| force_parallel_execute | ON |
| innodb_parallel_dblwr_encrypt | OFF |
| innodb_parallel_doublewrite_path | xb_doublewrite |
| innodb_parallel_read_threads | 4 |
| parallel_cost_threshold | 1000 |
| parallel_default_dop | 4 |
| parallel_max_threads | 64 |
| parallel_memory_limit | 8589934592 |
| parallel_queue_timeout | 0 |
| slave_parallel_type | LOGICAL_CLOCK |
| slave_parallel_workers | 2 |
+----------------------------------+----------------+
11 rows in set (0.01 sec)

启动数据库

启动数据库:

$ systemctl start greatsql.service

文件准备

本次的工作在/data/tpch

可执行程序为dbgen,依赖一个数据分布文件dists.dss。可以将dbgen和dists.dss拷贝到同一目录使用

<strong>dss.ddl</strong><strong>dss.ri</strong> 文件

准备表结构和索引文件 dss.ddldss.ri 到工作目录

$ cd /data/tpch/tpch_2.18.0/dbgen
$ cp dss.ri /data/tpch/
$ cp dss.ddl /data/tpch/

load.sql

修改 load.sql 文件 修改文件的路径

$ cd /data/tpch/
$ cp load.sql loadfix.sql
$ vim loadfix.sql

导入数据

$ /usr/localGreatSQL-8.0.25-17/bin/mysql -uroot -S /data/GreatSQL/mysql.sock

#创建数据库
mysql> create database tpch;
mysql> use tpch;

#导入表结构
mysql> source /data/tpch/dss.ddl;

#导入数据(文件见附录)
mysql> sh loadfile

#导入索引、外键等
mysql> source /data/tpch/dssfix.ri

注:binlog要关再导入,否则binlog会爆

Multi-statement transaction required more than 'max_binlog_cache_size' bytes of storage; increase this mysqld variable and try again

查看导入的表(20G数据)

mysql> select table_name,table_rows
from information_schema.tables
where table_name in
('customer','lineitem','nation','orders','part','partsupp','region','supplier');

+------------+------------+
| TABLE_NAME | TABLE_ROWS |
+------------+------------+
| region | 5 |
| nation | 25 |
| part | 3860136 |
| supplier | 197853 |
| customer | 2884322 |
| partsupp | 17084176 |
| orders | 29678499 |
| lineitem | 87786966 |
+------------+------------+
8 rows in set (0.00 sec)

编写并运行测试脚本

#测试脚本见附录
#这里是在tmux中运行,避免因为终端关闭导致测试终止
$ sh auto.sh&

#测试结束后在当前脚本的目录查看生成的日志
$ cat tpch-PQ-******.log

测试运行时,观察相关指标。

mysql> show global status like '%PQ%';
+--------------------+-------+
| Variable_name | Value |
+--------------------+-------+
| PQ_memory_refused | 0 |
| PQ_memory_used | 0 |
| PQ_threads_refused | 0 |
| PQ_threads_running | 0 |
+--------------------+-------+
4 rows in set (0.00 sec)

mysql> show processlist;
mysql> explain for connection **;

测试结果

开启并行查询(16线程)的执行时间,与不开启并行查询的执行时间如下:

SQL1 SQL3 SQL5 SQL6 SQL10 SQL12 SQL19
PQ16 1m25.645s 1m5.514s 8m56.306s 35.451s 44.564s 59.115s 5.771s
NOPQ 6m1.724s 5m19.083s 37m42.078s 2m16.331s 1m57.998s 2m39.672s 24.907s

注:本文章重点讲测试过程,具体的测试结果就不展开了。

**
**

附录-相关文件

导入脚本

$ cat loadfile
/usr/localGreatSQL-8.0.25-17/bin/mysql -S /data/GreatSQL/mysql.sock -Dtpch -f -e "set session foreign_key_checks=0;load data infile '/data/tpch/data/region.tbl' into table region FIELDS TERMINATED BY '|';" tpch &
/usr/localGreatSQL-8.0.25-17/bin/mysql -S /data/GreatSQL/mysql.sock -Dtpch -f -e "set session foreign_key_checks=0;load data infile '/data/tpch/data/nation.tbl' into table nation FIELDS TERMINATED BY '|';" tpch &
/usr/localGreatSQL-8.0.25-17/bin/mysql -S /data/GreatSQL/mysql.sock -Dtpch-f -e "set session foreign_key_checks=0;load data infile '/data/tpch/data/supplier.tbl' into table supplier FIELDS TERMINATED BY '|';" tpch &
/usr/localGreatSQL-8.0.25-17/bin/mysql -S /data/GreatSQL/mysql.sock -Dtpch -f -e "set session foreign_key_checks=0;load data infile '/data/tpch/data/part.tbl' into table part FIELDS TERMINATED BY '|';" tpch &
/usr/localGreatSQL-8.0.25-17/bin/mysql -S /data/GreatSQL/mysql.sock -Dtpch -f -e "set session foreign_key_checks=0;load data infile '/data/tpch/data/customer.tbl' into table customer FIELDS TERMINATED BY '|';" tpch &
/usr/localGreatSQL-8.0.25-17/bin/mysql -S /data/GreatSQL/mysql.sock -Dtpch -f -e "set session foreign_key_checks=0;load data infile '/data/tpch/data/partsupp.tbl' into table partsupp FIELDS TERMINATED BY '|';" tpch &
/usr/localGreatSQL-8.0.25-17/bin/mysql -S /data/GreatSQL/mysql.sock -Dtpch-f -e "set session foreign_key_checks=0;load data infile '/data/tpch/data/orders.tbl' into table orders FIELDS TERMINATED BY '|';" tpch &
/usr/localGreatSQL-8.0.25-17/bin/mysql -S /data/GreatSQL/mysql.sock -Dtpch-f -e "set session foreign_key_checks=0;load data infile '/data/tpch/data/lineitem.tbl' into table lineitem FIELDS TERMINATED BY '|';" tpch &

测试脚本:

脚本是东拼西凑的,写的不好,希望有大佬能指导一下

$ cat auto.sh
#include TH=$PATH:/usr/local/bin
export PATH
#set -u
#set -x
#set -e
. ~/.bash_profile > /dev/null 2>&1
exec 3>&1 4>&2 1>> tpch-PQ-`date +'%Y%m%d%H%M%S'`.log 2>&1

# 定义要执行的SQL文件存放的目录
SQL_DIR="/data/tpch/SQLs"

# 判断目录是否存在
if [ ! -d "$SQL_DIR" ]; then
echo "SQL文件目录不存在!"
exit 1
fi

# 进入SQL文件目录
cd $SQL_DIR

I=1
II=3
while [ $I -le $II ]
do
# 执行SQL文件
for file in `ls *.sql`
do
echo "正在执行:$file"
time /usr/localGreatSQL-8.0.25-17/bin/mysql -uroot -S /data/GreatSQL/mysql.sock -Dtpch

dss.ddl

-- Sccsid: @(#)dss.ddl 2.1.8.1
drop database tpch;
create database tpch;
use tpch;
CREATE TABLE NATION ( N_NATIONKEY INTEGER NOT NULL,
N_NAME CHAR(25) NOT NULL,
N_REGIONKEY INTEGER NOT NULL,
N_COMMENT VARCHAR(152));

CREATE TABLE REGION ( R_REGIONKEY INTEGER NOT NULL,
R_NAME CHAR(25) NOT NULL,
R_COMMENT VARCHAR(152));

CREATE TABLE PART ( P_PARTKEY INTEGER NOT NULL,
P_NAME VARCHAR(55) NOT NULL,
P_MFGR CHAR(25) NOT NULL,
P_BRAND CHAR(10) NOT NULL,
P_TYPE VARCHAR(25) NOT NULL,
P_SIZE INTEGER NOT NULL,
P_CONTAINER CHAR(10) NOT NULL,
P_RETAILPRICE DECIMAL(15,2) NOT NULL,
P_COMMENT VARCHAR(23) NOT NULL );

CREATE TABLE SUPPLIER ( S_SUPPKEY INTEGER NOT NULL,
S_NAME CHAR(25) NOT NULL,
S_ADDRESS VARCHAR(40) NOT NULL,
S_NATIONKEY INTEGER NOT NULL,
S_PHONE CHAR(15) NOT NULL,
S_ACCTBAL DECIMAL(15,2) NOT NULL,
S_COMMENT VARCHAR(101) NOT NULL);

CREATE TABLE PARTSUPP ( PS_PARTKEY INTEGER NOT NULL,
PS_SUPPKEY INTEGER NOT NULL,
PS_AVAILQTY INTEGER NOT NULL,
PS_SUPPLYCOST DECIMAL(15,2) NOT NULL,
PS_COMMENT VARCHAR(199) NOT NULL );

CREATE TABLE CUSTOMER ( C_CUSTKEY INTEGER NOT NULL,
C_NAME VARCHAR(25) NOT NULL,
C_ADDRESS VARCHAR(40) NOT NULL,
C_NATIONKEY INTEGER NOT NULL,
C_PHONE CHAR(15) NOT NULL,
C_ACCTBAL DECIMAL(15,2) NOT NULL,
C_MKTSEGMENT CHAR(10) NOT NULL,
C_COMMENT VARCHAR(117) NOT NULL);

CREATE TABLE ORDERS ( O_ORDERKEY INTEGER NOT NULL,
O_CUSTKEY INTEGER NOT NULL,
O_ORDERSTATUS CHAR(1) NOT NULL,
O_TOTALPRICE DECIMAL(15,2) NOT NULL,
O_ORDERDATE DATE NOT NULL,
O_ORDERPRIORITY CHAR(15) NOT NULL,
O_CLERK CHAR(15) NOT NULL,
O_SHIPPRIORITY INTEGER NOT NULL,
O_COMMENT VARCHAR(79) NOT NULL);

CREATE TABLE LINEITEM ( L_ORDERKEY INTEGER NOT NULL,
L_PARTKEY INTEGER NOT NULL,
L_SUPPKEY INTEGER NOT NULL,
L_LINENUMBER INTEGER NOT NULL,
L_QUANTITY DECIMAL(15,2) NOT NULL,
L_EXTENDEDPRICE DECIMAL(15,2) NOT NULL,
L_DISCOUNT DECIMAL(15,2) NOT NULL,
L_TAX DECIMAL(15,2) NOT NULL,
L_RETURNFLAG CHAR(1) NOT NULL,
L_LINESTATUS CHAR(1) NOT NULL,
L_SHIPDATE DATE NOT NULL,
L_COMMITDATE DATE NOT NULL,
L_RECEIPTDATE DATE NOT NULL,
L_SHIPINSTRUCT CHAR(25) NOT NULL,
L_SHIPMODE CHAR(10) NOT NULL,
L_COMMENT VARCHAR(44) NOT NULL);

dss.ri

-- Sccsid: @(#)dss.ri 2.1.8.1
-- tpch Benchmark Version 8.0

-- For table REGION
ALTER TABLE tpch.REGION
ADD PRIMARY KEY (R_REGIONKEY);

-- For table NATION
ALTER TABLE tpch.NATION
ADD PRIMARY KEY (N_NATIONKEY);

ALTER TABLE tpch.NATION
ADD FOREIGN KEY NATION_FK1 (N_REGIONKEY) references tpch.REGION(R_REGIONKEY);

COMMIT WORK;

-- For table PART
ALTER TABLE tpch.PART
ADD PRIMARY KEY (P_PARTKEY);

COMMIT WORK;

-- For table SUPPLIER
ALTER TABLE tpch.SUPPLIER
ADD PRIMARY KEY (S_SUPPKEY);

ALTER TABLE tpch.SUPPLIER
ADD FOREIGN KEY SUPPLIER_FK1 (S_NATIONKEY) references tpch.NATION(N_NATIONKEY);

COMMIT WORK;

-- For table PARTSUPP
ALTER TABLE tpch.PARTSUPP
ADD PRIMARY KEY (PS_PARTKEY,PS_SUPPKEY);

COMMIT WORK;

-- For table CUSTOMER
ALTER TABLE tpch.CUSTOMER
ADD PRIMARY KEY (C_CUSTKEY);

ALTER TABLE tpch.CUSTOMER
ADD FOREIGN KEY CUSTOMER_FK1 (C_NATIONKEY) references tpch.NATION(N_NATIONKEY);

COMMIT WORK;

-- For table LINEITEM
ALTER TABLE tpch.LINEITEM
ADD PRIMARY KEY (L_ORDERKEY,L_LINENUMBER);

COMMIT WORK;

-- For table ORDERS
ALTER TABLE tpch.ORDERS
ADD PRIMARY KEY (O_ORDERKEY);

COMMIT WORK;

-- For table PARTSUPP
ALTER TABLE tpch.PARTSUPP
ADD FOREIGN KEY PARTSUPP_FK1 (PS_SUPPKEY) references tpch.SUPPLIER(S_SUPPKEY);

COMMIT WORK;

ALTER TABLE tpch.PARTSUPP
ADD FOREIGN KEY PARTSUPP_FK2 (PS_PARTKEY) references tpch.PART(P_PARTKEY);

COMMIT WORK;

-- For table ORDERS
ALTER TABLE tpch.ORDERS
ADD FOREIGN KEY ORDERS_FK1 (O_CUSTKEY) references tpch.CUSTOMER(C_CUSTKEY);

COMMIT WORK;

-- For table LINEITEM
ALTER TABLE tpch.LINEITEM
ADD FOREIGN KEY LINEITEM_FK1 (L_ORDERKEY) references tpch.ORDERS(O_ORDERKEY);

COMMIT WORK;

ALTER TABLE tpch.LINEITEM
ADD FOREIGN KEY LINEITEM_FK2 (L_PARTKEY,L_SUPPKEY) references
tpch.PARTSUPP(PS_PARTKEY,PS_SUPPKEY);

COMMIT WORK;

SQL语句

--SQL1
select /*+ PQ(16) */
l_returnflag,
l_linestatus,
sum(l_quantity) as sum_qty,
sum(l_extendedprice) as sum_base_price,
sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,
sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,
avg(l_quantity) as avg_qty,
avg(l_extendedprice) as avg_price,
avg(l_discount) as avg_disc,
count(*) as count_order
from
lineitem
where
l_shipdate date '1995-03-01'
group by
l_orderkey,
o_orderdate,
o_shippriority
order by
revenue desc,
o_orderdate limit 10;

--SQL6
select /*+ PQ(16) */
sum(l_extendedprice * l_discount) as revenue
from
lineitem
where
l_shipdate >= date '1993-01-01'
and l_shipdate = date '1994-05-01'
and o_orderdate '1-URGENT'
and o_orderpriority '2-HIGH'
then 1
else 0
end) as low_line_count
from
orders,
lineitem
where
o_orderkey = l_orderkey
and l_shipmode in ('TRUCK', 'FOB')
and l_commitdate = date '1996-01-01'
and l_receiptdate = 10 and l_quantity = 15 and l_quantity = 22 and l_quantity

相关问题

参考资料

https://imysql.com/2012/12/21/tpch-for-mysql-manual.html

相关文章

Oracle如何使用授予和撤销权限的语法和示例
Awesome Project: 探索 MatrixOrigin 云原生分布式数据库
下载丨66页PDF,云和恩墨技术通讯(2024年7月刊)
社区版oceanbase安装
Oracle 导出CSV工具-sqluldr2
ETL数据集成丨快速将MySQL数据迁移至Doris数据库

发布评论