今天序风给我发了几个论坛里用户提的问题,其中一个感觉挺有意思,所以顺手在这里写些东西记录下。
问题描述
先描述一下这个用户提出的问题:创建一张表 t1,表中的 id 和 user_id 满足 id / 1000000 = user_id(向下取整),例如 23999999 / 1000000 = 23。
create table t1(
id int,
user_id int)
partition by hash(user_id) partitions 128;
insert into t1 values(1000000, 1);
insert into t1 values(1000001, 1);
insert into t1 values(2000000, 2);
insert into t1 values(2000001, 2);
insert into t1 values(23000000, 23);
insert into t1 values(23999999, 23);
select * from t1;
+----------+---------+
| id | user_id |
+----------+---------+
| 1000000 | 1 |
| 1000001 | 1 |
| 2000000 | 2 |
| 2000001 | 2 |
| 23000000 | 23 |
| 23999999 | 23 |
+----------+---------+
6 rows in set (0.272 sec)
但是用户业务的查询条件是 where id = 23000000,不包含分区键 user_id ,默认会查询所有分区。在下面这个计划里,大家也可以看到 3 号算子 table scan 扫描了 partitions(p[0-127]) 这所有 128 个分区。
explain select * from t1 where id = 23000000;
+------------------------------------------------------------------------------------+
| Query Plan |
+------------------------------------------------------------------------------------+
| ============================================================= |
| |ID|OPERATOR |NAME |EST.ROWS|EST.TIME(us)| |
| ------------------------------------------------------------- |
| |0 |PX COORDINATOR | |1 |485 | |
| |1 |└─EXCHANGE OUT DISTR |:EX10000|1 |485 | |
| |2 | └─PX PARTITION ITERATOR| |1 |484 | |
| |3 | └─TABLE FULL SCAN |t1 |1 |484 | |
| ============================================================= |
| Outputs & filters: |
| ------------------------------------- |
| 0 - output([INTERNAL_FUNCTION(t1.id, t1.user_id)]), filter(nil), rowset=16 |
| 1 - output([INTERNAL_FUNCTION(t1.id, t1.user_id)]), filter(nil), rowset=16 |
| dop=1 |
| 2 - output([t1.user_id], [t1.id]), filter(nil), rowset=16 |
| force partition granule |
| 3 - output([t1.user_id], [t1.id]), filter([t1.id = 23000000]), rowset=16 |
| access([t1.user_id], [t1.id]), partitions(p[0-127]) |
| is_index_back=false, is_global_index=false, filter_before_indexback[false], |
| range_key([t1.__pk_increment]), range(MIN ; MAX)always true |
+------------------------------------------------------------------------------------+
19 rows in set (0.142 sec)
如果想让计划做分区裁剪,就得人肉把上面这个计划的过滤条件里增加上一个 user_id = 23。例如下面这个计划,虽然新增的这个过滤条件并不能过滤掉更多数据,但是在执行前可以利用分区键 user_id = 23,裁剪出需要进行 table scan 扫描的只有一个分区 partitions(p23)。
explain select * from t1 where id = 23000000 and user_id = 23;
+-----------------------------------------------------------------------------------------------+
| Query Plan |
+-----------------------------------------------------------------------------------------------+
| =============================================== |
| |ID|OPERATOR |NAME|EST.ROWS|EST.TIME(us)| |
| ----------------------------------------------- |
| |0 |TABLE FULL SCAN|t1 |1 |4 | |
| =============================================== |
| Outputs & filters: |
| ------------------------------------- |
| 0 - output([t1.id], [t1.user_id]), filter([t1.id = 23000000], [t1.user_id = 23]), rowset=16 |
| access([t1.user_id], [t1.id]), partitions(p23) |
| is_index_back=false, is_global_index=false, filter_before_indexback[false,false], |
| range_key([t1.__pk_increment]), range(MIN ; MAX)always true |
+-----------------------------------------------------------------------------------------------+
11 rows in set (0.007 sec)
有时候过滤还不只是等值查询, 可能是 IN 查询,比如 where id IN (23000000, 2000001…),如果不想让查询遍历所有分区,还需要人肉在过滤条件中增加 where user_id IN (23, 2…) 这样对应的过滤条件,十分麻烦。
-- 计划中可以看到扫描了 partitions(p[0-127]) 全部 128 个分区
explain select * from t1 where id IN (23000000, 2000001);
+----------------------------------------------------------------------------------------+
| Query Plan |
+----------------------------------------------------------------------------------------+
| ============================================================= |
| |ID|OPERATOR |NAME |EST.ROWS|EST.TIME(us)| |
| ------------------------------------------------------------- |
| |0 |PX COORDINATOR | |2 |486 | |
| |1 |└─EXCHANGE OUT DISTR |:EX10000|2 |485 | |
| |2 | └─PX PARTITION ITERATOR| |2 |484 | |
| |3 | └─TABLE FULL SCAN |t1 |2 |484 | |
| ============================================================= |
| Outputs & filters: |
| ------------------------------------- |
| 0 - output([INTERNAL_FUNCTION(t1.id, t1.user_id)]), filter(nil), rowset=16 |
| 1 - output([INTERNAL_FUNCTION(t1.id, t1.user_id)]), filter(nil), rowset=16 |
| dop=1 |
| 2 - output([t1.user_id], [t1.id]), filter(nil), rowset=16 |
| force partition granule |
| 3 - output([t1.user_id], [t1.id]), filter([t1.id IN (23000000, 2000001)]), rowset=16 |
| access([t1.user_id], [t1.id]), partitions(p[0-127]) |
| is_index_back=false, is_global_index=false, filter_before_indexback[false], |
| range_key([t1.__pk_increment]), range(MIN ; MAX)always true |
+----------------------------------------------------------------------------------------+
19 rows in set (0.579 sec)
-- 需要人肉推算分区键和非分区键之间的关系,并且人肉加上包含分区键的过滤条件,之后才能进行分区裁剪
explain select * from t1 where id IN (23000000, 2000001) and user_id IN (23, 2);
+-----------------------------------------------------------------------------------------------------------------+
| Query Plan |
+-----------------------------------------------------------------------------------------------------------------+
| ============================================================= |
| |ID|OPERATOR |NAME |EST.ROWS|EST.TIME(us)| |
| ------------------------------------------------------------- |
| |0 |PX COORDINATOR | |2 |10 | |
| |1 |└─EXCHANGE OUT DISTR |:EX10000|2 |10 | |
| |2 | └─PX PARTITION ITERATOR| |2 |8 | |
| |3 | └─TABLE FULL SCAN |t1 |2 |8 | |
| ============================================================= |
| Outputs & filters: |
| ------------------------------------- |
| 0 - output([INTERNAL_FUNCTION(t1.id, t1.user_id)]), filter(nil), rowset=16 |
| 1 - output([INTERNAL_FUNCTION(t1.id, t1.user_id)]), filter(nil), rowset=16 |
| dop=1 |
| 2 - output([t1.user_id], [t1.id]), filter(nil), rowset=16 |
| force partition granule |
| 3 - output([t1.user_id], [t1.id]), filter([t1.id IN (23000000, 2000001)], [t1.user_id IN (23, 2)]), rowset=16 |
| access([t1.user_id], [t1.id]), partitions(p2, p23) |
| is_index_back=false, is_global_index=false, filter_before_indexback[false,false], |
| range_key([t1.__pk_increment]), range(MIN ; MAX)always true |
+-----------------------------------------------------------------------------------------------------------------+
19 rows in set (0.074 sec)
说明:
为了方便大家理解,上面的描述对问题做了一定程度的简化,和用户原本的描述有一定出入,原问题链接详见:https://ask.oceanbase.com/t/topic/35608904。
利用生成列
如果希望能够利用和分区键有一定关系(一般是等值关系或者子串关系)的非分区键进行分区裁剪,显然就需要让 OceanBase 数据库先明确知道这两个列之间确实满足这种关系。
其中一种方法就是把其中的分区键写成基于另一个非分区键的生成列(substr 或者等值),这样另一个非分区键就也可以用来进行分区裁剪了,下面就简单举一个例子:
create table t2(
id int,
user_id int generated always as (substr(id, 1, length(id) - 6)) virtual)
partition by hash(user_id) partitions 128;
insert into t2(id) values(1000000);
insert into t2(id) values(1000001);
insert into t2(id) values(2000000);
insert into t2(id) values(2000001);
insert into t2(id) values(23000000);
insert into t2(id) values(23999999);
select * from t2;
+----------+---------+
| id | user_id |
+----------+---------+
| 1000000 | 1 |
| 1000001 | 1 |
| 2000000 | 2 |
| 2000001 | 2 |
| 23000000 | 23 |
| 23999999 | 23 |
+----------+---------+
6 rows in set (0.016 sec)
-- 分区键 user_id 做过滤条件,会做分区裁剪,table scan 只需要扫描一个分区 partitions(p23)
explain select * from t2 where user_id = 23;
+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Query Plan |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| =============================================== |
| |ID|OPERATOR |NAME|EST.ROWS|EST.TIME(us)| |
| ----------------------------------------------- |
| |0 |TABLE FULL SCAN|t2 |1 |4 | |
| =============================================== |
| Outputs & filters: |
| ------------------------------------- |
| 0 - output([t2.id], [column_conv(INT,PS:(11,0),NULL,cast(substr(cast(t2.id, VARCHAR(1048576)), 1, length(cast(t2.id, VARCHAR(1048576))) - 6), INT(-1, |
| 0)))]), filter([column_conv(INT,PS:(11,0),NULL,cast(substr(cast(t2.id, VARCHAR(1048576)), 1, length(cast(t2.id, VARCHAR(1048576))) - 6), INT(-1, 0))) = |
| 23]), rowset=16 |
| access([t2.id]), partitions(p23) |
| is_index_back=false, is_global_index=false, filter_before_indexback[false], |
| range_key([t2.__pk_increment]), range(MIN ; MAX)always true |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
13 rows in set (0.008 sec)
-- 非分区键 id 做过滤条件,也做了分区裁剪,table scan 也只扫描了一个分区:partitions(p23)
explain select * from t2 where id = 23000000;
+----------------------------------------------------------------------------------------------------------------------------------------------------------+
| Query Plan |
+----------------------------------------------------------------------------------------------------------------------------------------------------------+
| =============================================== |
| |ID|OPERATOR |NAME|EST.ROWS|EST.TIME(us)| |
| ----------------------------------------------- |
| |0 |TABLE FULL SCAN|t2 |1 |4 | |
| =============================================== |
| Outputs & filters: |
| ------------------------------------- |
| 0 - output([t2.id], [column_conv(INT,PS:(11,0),NULL,cast(substr(cast(t2.id, VARCHAR(1048576)), 1, length(cast(t2.id, VARCHAR(1048576))) - 6), INT(-1, |
| 0)))]), filter([t2.id = 23000000]), rowset=16 |
| access([t2.id]), partitions(p23) |
| is_index_back=false, is_global_index=false, filter_before_indexback[false], |
| range_key([t2.__pk_increment]), range(MIN ; MAX)always true |
+----------------------------------------------------------------------------------------------------------------------------------------------------------+
12 rows in set (0.008 sec)
具体的代码这里就不细说了,大概位置应该在 ObTableLocation::add_partition_columns 这里,可以看到添加可以用来进行分区裁剪的 partition_columns 时,也对相关的生成列 gen_cols 进行了特殊处理。
利用 check 约束
除了生成列,大家肯定会想到 check 约束看上去也能约束两个列之间满足某种特定的关系,和生成列的约束十分类似。那么 check 约束是否也可以让非分区键也可以进行分区裁剪?下面顺手做一个简单的测试:
create table t4(
id varchar(10), user_id varchar(10),
constraint cst check(user_id = substr(id, 1, 2)))
partition by list columns(user_id) (
partition p0 values in ('23'),
partition p1 values in ('45'),
partition p2 values in ('67'));
insert into t4 values(230000, 23);
insert into t4 values(450000, 45);
insert into t4 values(670000, 67);
-- 分区键 user_id 做过滤条件,会做分区裁剪,table scan 只需要扫描一个分区 partitions(p1)
explain select * from t4 where user_id = '45';
+------------------------------------------------------------------------------------+
| Query Plan |
+------------------------------------------------------------------------------------+
| =============================================== |
| |ID|OPERATOR |NAME|EST.ROWS|EST.TIME(us)| |
| ----------------------------------------------- |
| |0 |TABLE FULL SCAN|t4 |1 |4 | |
| =============================================== |
| Outputs & filters: |
| ------------------------------------- |
| 0 - output([t4.id], [t4.user_id]), filter([t4.user_id = '45']), rowset=16 |
| access([t4.user_id], [t4.id]), partitions(p1) |
| is_index_back=false, is_global_index=false, filter_before_indexback[false], |
| range_key([t4.__pk_increment]), range(MIN ; MAX)always true |
+------------------------------------------------------------------------------------+
11 rows in set (0.010 sec)
-- 非分区键 id 做过滤条件,也做了分区裁剪,table scan 也只扫描了一个分区:partitions(p1)
-- 在计划里我们可以看到 filter 中除了 id = '450000',优化器还利用 check 约束,额外地增加了一个 user_id = substr(cast('450000', VARCHAR(10)), 1, 2) 的新过滤条件
explain select * from t4 where id = '450000';
+------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Query Plan |
+------------------------------------------------------------------------------------------------------------------------------------------------------------+
| =============================================== |
| |ID|OPERATOR |NAME|EST.ROWS|EST.TIME(us)| |
| ----------------------------------------------- |
| |0 |TABLE FULL SCAN|t4 |1 |4 | |
| =============================================== |
| Outputs & filters: |
| ------------------------------------- |
| 0 - output([t4.id], [t4.user_id]), filter([t4.id = '450000'], [t4.user_id IS NULL OR t4.user_id = substr(cast('450000', VARCHAR(10)), 1, 2)]), rowset=16 |
| access([t4.user_id], [t4.id]), partitions(p1) |
| is_index_back=false, is_global_index=false, filter_before_indexback[false,false], |
| range_key([t4.__pk_increment]), range(MIN ; MAX)always true |
+------------------------------------------------------------------------------------------------------------------------------------------------------------+
11 rows in set (0.010 sec)
具体的代码这里就不细说了,大概位置应该在 ObTransformConstPropagate::replace_check_constraint_exprs 这里。不过看上去利用 check 约束的条件可能会比生成列更严格些,比如需要让列类型是字符串等。因为已经提供了相关代码,所以大家如果对此感兴趣的话,可以自行研究下~
总结
OceanBase 支持利用和分区键有一定关系(一般是等值关系或者子串关系)的非分区键进行分区裁剪,但是需要通过生成列(generated column)或者 check 约束(check constraint),让数据库明确知道这两个列之间确实满足这种关系。
OceanBase 的论坛问答区真是写博客的灵感源泉,总是能看到各种有意思的问题,看来以后要多逛一逛社区论坛了~