前文《OceanBase SQL 执行计划解读(二)──── 表连接和子查询》介绍了子查询的执行计划特点,还没有完全说完。本文继续介绍子查询的执行计划以及分析函数(窗口函数)的执行计划特点。
熟悉常用 SQL 的执行计划是为了反过来快速解读分析复杂 SQL 的执行计划。
子查询
本文不讨论非相关子查询。
标量子查询表达式(Scalar Subquery Expression
)是一类从一行返回一列值的子查询。标量子查询表达式的值是子查询的查询列的值。如果子查询返回 0 行,则标量子查询表达式的值是 NULL。如果子查询返回多行,则标量子查询表达式返回一个错误。
SUBPLAN FILTER
和 SCALAR GROUP BY
EXPLAIN extended_noaddr SELECT (SELECT w_name FROM BMSQL_WAREHOUSE w WHERE w.w_id = c.C_W_ID) ware_name , c.C_D_ID ,c.C_FIRST ,c.C_LAST , (SELECT count(*) FROM BMSQL_OORDER o WHERE o.O_C_ID =c.C_ID ) order_cnt , (SELECT sum(o.O_OL_CNT) FROM BMSQL_OORDER o WHERE o.O_C_ID =c.C_ID ) item_cnt FROM BMSQL_CUSTOMER c ; ================================================================== |ID|OPERATOR |NAME |EST. ROWS|COST | ------------------------------------------------------------------ |0 |SUBPLAN FILTER | |30000000 |2.236362e+12 | |1 | TABLE SCAN |C |30000000 |34332847 | |2 | TABLE GET |W |1 |36 | |3 | SCALAR GROUP BY| |1 |5159 | |4 | TABLE SCAN |O(BMSQL_OORDER_IDX4)|12106 |2847 | |5 | SCALAR GROUP BY| |1 |69350 | |6 | TABLE SCAN |O(BMSQL_OORDER_IDX4)|12106 |67037 | ================================================================== Outputs & filters: ------------------------------------- 0 - output([subquery(1)], [C.C_D_ID], [C.C_FIRST], [C.C_LAST], [subquery(2)], [subquery(3)]), filter(nil), exec_params_([C.C_W_ID], [C.C_ID], [C.C_ID]), onetime_exprs_(nil), init_plan_idxs_(nil) 1 - output([C.C_W_ID], [C.C_D_ID], [C.C_FIRST], [C.C_LAST], [C.C_ID]), filter(nil), access([C.C_W_ID], [C.C_D_ID], [C.C_FIRST], [C.C_LAST], [C.C_ID]), partitions(p0), is_index_back=false, range_key([C.C_W_ID], [C.C_D_ID], [C.C_ID]), range(MIN,MIN,MIN ; MAX,MAX,MAX)always true 2 - output([W.W_NAME]), filter(nil), access([W.W_NAME]), partitions(p0), is_index_back=false, range_key([W.W_ID]), range(MIN ; MAX)always true, range_cond([W.W_ID = ?]) 3 - output([T_FUN_COUNT(*)]), filter(nil), group(nil), agg_func([T_FUN_COUNT(*)]) 4 - output([1]), filter(nil), access([O.O_C_ID]), partitions(p0), is_index_back=false, range_key([O.O_C_ID], [O.O_W_ID], [O.O_D_ID], [O.O_ID]), range(MIN,MIN,MIN,MIN ; MAX,MAX,MAX,MAX)always true, range_cond([O.O_C_ID = ?]) 5 - output([T_FUN_SUM(O.O_OL_CNT)]), filter(nil), group(nil), agg_func([T_FUN_SUM(O.O_OL_CNT)]) 6 - output([O.O_OL_CNT]), filter(nil), access([O.O_OL_CNT]), partitions(p0), is_index_back=true, range_key([O.O_C_ID], [O.O_W_ID], [O.O_D_ID], [O.O_ID]), range(MIN,MIN,MIN,MIN ; MAX,MAX,MAX,MAX)always true, range_cond([O.O_C_ID = ?])
SUBPLAN FILTER
用于驱动表达式中的子查询,OceanBase 会以 NESTED-LOOP
算法来执行 SUBPLAN FILTER
算子。即循环遍历左边的记录集,然后去右边结果集中取数据。所以,子查询是否能命中索引,对性能影响很大。
说明:
- 标量子查询要求只返回一笔记录。可以直接取列,也可以用统计函数(
count
、min
、max
、sum
)。 - 算子 0 是
SUBPLAN FILTER
。output
表示输出列,后面包括 3 个子查询结果。filter
表示算子上的过滤条件,这里是空(nil
)。exec_params_
表示左表(结果集)传递给右表(结果集)的参数,一般关联子查询这里都是连接条件,如果是非关联子查询,这里就是空(nil
)。onetime_exprs_
表示只计算一次的对象(如子查询1),通常非关联的子查询结果集只需要计算一次。这里是关联子查询,所以值是空(nil
)。 - 算子 2 是第一个子查询,直接主键访问,用
TABLE GET
. - 算子 3 和 4 是第二个子查询,扫描索引(
TABLE SCAN
),然后再聚合 。不过这里没有分组逻辑,所以 group 参数是空。
算子 SCALAR GROUP BY
是聚合函数生成标量结果常用的算法,用在没有 GROUP BY
语句的时候。当有GROUP BY
语句时,使用的就是 HASH GROUP BY
或者 MERGE GROUP BY
算子。
EXPLAIN extended_noaddr SELECT c.C_W_ID , count(*) FROM BMSQL_CUSTOMER c GROUP BY c.C_W_ID HAVING count(*) > 1000; =========================================== |ID|OPERATOR |NAME|EST. ROWS|COST | ------------------------------------------- |0 |MERGE GROUP BY| |50 |41251449| |1 | TABLE SCAN |C |30000000 |33009350| =========================================== Outputs & filters: ------------------------------------- 0 - output([C.C_W_ID], [T_FUN_COUNT(*)]), filter([T_FUN_COUNT(*) > 1000]), group([C.C_W_ID]), agg_func([T_FUN_COUNT(*)]) 1 - output([C.C_W_ID]), filter(nil), access([C.C_W_ID]), partitions(p0), is_index_back=false, range_key([C.C_W_ID], [C.C_D_ID], [C.C_ID]), range(MIN,MIN,MIN ; MAX,MAX,MAX)always true
说明:
- 如上,有明显的
GROUP BY
子句,使用的是MERGE GROUP BY
算子, 分组表达式是C.C_W_ID
。(group([C.C_W_ID])
) - 有
HAVING
子句,会在算子MERGE GROUP BY
上产生一个filter
。
MERGE GROUP BY
和 HASH GROUP BY
有时候,没有 GROUP BY
子句,也会用到 MERGE GROUP BY
算子。
EXPLAIN extended_noaddr SELECT c.C_W_ID ,c.C_D_ID ,c.C_FIRST ,c.C_LAST ,c.C_BALANCE ,c.C_PAYMENT_CNT FROM BMSQL_CUSTOMER c WHERE (SELECT count(*) FROM BMSQL_OORDER o WHERE o.O_C_ID=c.C_ID) > 10; ================================================================== |ID|OPERATOR |NAME |EST. ROWS|COST | ------------------------------------------------------------------ |0 |HASH RIGHT OUTER JOIN| |10000000 |84866769| |1 | SUBPLAN SCAN |VIEW1 |3008 |18473351| |2 | MERGE GROUP BY | |3008 |18472936| |3 | TABLE SCAN |O(BMSQL_OORDER_IDX4)|36414597 |8468513 | |4 | TABLE SCAN |C |30000000 |35656345| ================================================================== Outputs & filters: ------------------------------------- 0 - output([C.C_W_ID], [C.C_D_ID], [C.C_FIRST], [C.C_LAST], [C.C_BALANCE], [C.C_PAYMENT_CNT]), filter([CASE WHEN (T_OP_IS_NOT, VIEW1.O.O_C_ID, NULL, 0) THEN VIEW1.COUNT(*) ELSE 0 END > 10]), equal_conds([VIEW1.O.O_C_ID = C.C_ID]), other_conds(nil) 1 - output([VIEW1.COUNT(*)], [VIEW1.O.O_C_ID]), filter(nil), access([VIEW1.COUNT(*)], [VIEW1.O.O_C_ID]) 2 - output([T_FUN_COUNT(*)], [O.O_C_ID]), filter(nil), group([O.O_C_ID]), agg_func([T_FUN_COUNT(*)]) 3 - output([O.O_C_ID]), filter(nil), access([O.O_C_ID]), partitions(p0), is_index_back=false, range_key([O.O_C_ID], [O.O_W_ID], [O.O_D_ID], [O.O_ID]), range(MIN,MIN,MIN,MIN ; MAX,MAX,MAX,MAX)always true 4 - output([C.C_ID], [C.C_W_ID], [C.C_D_ID], [C.C_FIRST], [C.C_LAST], [C.C_BALANCE], [C.C_PAYMENT_CNT]), filter(nil), access([C.C_ID], [C.C_W_ID], [C.C_D_ID], [C.C_FIRST], [C.C_LAST], [C.C_BALANCE], [C.C_PAYMENT_CNT]), partitions(p0), is_index_back=false, range_key([C.C_W_ID], [C.C_D_ID], [C.C_ID]), range(MIN,MIN,MIN ; MAX,MAX,MAX)always true
说明:
- 没有
GROUP BY
子句,但是针对WHERE
条件中的关联子查询,优化器改写了算法为HASH RIGHT OUTER JOIN
,事先将子查询结果分组统计出来 (按o.o_c_id
做GROUP BY
),所以有算子 3MERGE GROUP BY
。使用MERGE
是利用了索引的有序性。 - 算子 1
SUBPLAN SCAN
从子查询视图扫描数据。在算子 0HASH RIGHT OUTER JOIN
使用filter
应用子查询的过滤条件 (>10
) .
如果子查询中结果集没有好的索引可以使用,优化器算法会调整为使用 HASH GROUP BY
。
EXPLAIN extended_noaddr SELECT c.C_W_ID ,c.C_D_ID ,c.C_FIRST ,c.C_LAST ,c.C_BALANCE ,c.C_PAYMENT_CNT FROM BMSQL_CUSTOMER c WHERE (SELECT count(*) FROM BMSQL_HISTORY h WHERE h.H_C_ID = c.C_ID) > 100; =================================================== |ID|OPERATOR |NAME |EST. ROWS|COST | --------------------------------------------------- |0 |HASH RIGHT OUTER JOIN| |10000000 |88584899| |1 | SUBPLAN SCAN |VIEW1|3008 |22191481| |2 | HASH GROUP BY | |3008 |22191066| |3 | TABLE SCAN |H |36189654 |8391834 | |4 | TABLE SCAN |C |30000000 |35656345| =================================================== Outputs & filters: ------------------------------------- 0 - output([C.C_W_ID], [C.C_D_ID], [C.C_FIRST], [C.C_LAST], [C.C_BALANCE], [C.C_PAYMENT_CNT]), filter([CASE WHEN (T_OP_IS_NOT, VIEW1.H.H_C_ID, NULL, 0) THEN VIEW1.COUNT(*) ELSE 0 END > 100]), equal_conds([VIEW1.H.H_C_ID = C.C_ID]), other_conds(nil) 1 - output([VIEW1.COUNT(*)], [VIEW1.H.H_C_ID]), filter(nil), access([VIEW1.COUNT(*)], [VIEW1.H.H_C_ID]) 2 - output([T_FUN_COUNT(*)], [H.H_C_ID]), filter(nil), group([H.H_C_ID]), agg_func([T_FUN_COUNT(*)]) 3 - output([H.H_C_ID]), filter(nil), access([H.H_C_ID]), partitions(p0), is_index_back=false, range_key([H.__pk_increment]), range(MIN ; MAX)always true 4 - output([C.C_ID], [C.C_W_ID], [C.C_D_ID], [C.C_FIRST], [C.C_LAST], [C.C_BALANCE], [C.C_PAYMENT_CNT]), filter(nil), access([C.C_ID], [C.C_W_ID], [C.C_D_ID], [C.C_FIRST], [C.C_LAST], [C.C_BALANCE], [C.C_PAYMENT_CNT]), partitions(p0), is_index_back=false, range_key([C.C_W_ID], [C.C_D_ID], [C.C_ID]), range(MIN,MIN,MIN ; MAX,MAX,MAX)always true
分析函数
分析函数(某些数据库下也叫做窗口函数)与聚集函数类似,计算总是基于一组行的集合,不同的是,聚集函数一组只能返回一行,而分析函数每组可以返回多行,组内每一行都是基于窗口的逻辑计算的结果。分析函数可以显著优化需要 self-join
的查询。有些分析函数也可以当聚集函数使用。
分析函数包括:
MAX
、MIN
、AVG
COUNT
、SUM
GROUP_CONCAT
、LISTAGG
ROW_NUMBER
、RANK
、DENSE_RANK
、PERCENT_RANK
CUME_DIST
FIRST_VALUE
、LAST_VALUE
NTH_VALUE
、NTILE
LEAD
、LAG
算子 WINDOW_FUNCTION
如下面示例,统计各个仓库下的各个区的销量在本仓库内的排名。
EXPLAIN extended_noaddr SELECT d.D_W_ID , d.D_ID , d.D_NAME , d.D_YTD ,ROW_NUMBER () OVER (PARTITION BY d.D_W_ID ORDER BY d.D_YTD DESC ) rn FROM BMSQL_DISTRICT d ORDER BY rn ; ; ========================================== |ID|OPERATOR |NAME|EST. ROWS|COST | ------------------------------------------ |0 |SORT | |10000 |82278| |1 | WINDOW FUNCTION| |10000 |32980| |2 | SORT | |10000 |31070| |3 | TABLE SCAN |D |10000 |4022 | ========================================== Outputs & filters: ------------------------------------- 0 - output([D.D_W_ID], [D.D_ID], [D.D_NAME], [D.D_YTD], [T_WIN_FUN_ROW_NUMBER()]), filter(nil), sort_keys([T_WIN_FUN_ROW_NUMBER(), ASC]) 1 - output([D.D_W_ID], [D.D_ID], [D.D_NAME], [D.D_YTD], [T_WIN_FUN_ROW_NUMBER()]), filter(nil), win_expr(T_WIN_FUN_ROW_NUMBER()), partition_by([D.D_W_ID]), order_by([D.D_YTD, DESC]), window_type(RANGE), upper(UNBOUNDED PRECEDING), lower(UNBOUNDED FOLLOWING) 2 - output([D.D_W_ID], [D.D_ID], [D.D_NAME], [D.D_YTD]), filter(nil), sort_keys([D.D_W_ID, ASC], [D.D_YTD, DESC]), prefix_pos(1) 3 - output([D.D_W_ID], [D.D_ID], [D.D_NAME], [D.D_YTD]), filter(nil), access([D.D_W_ID], [D.D_ID], [D.D_NAME], [D.D_YTD]), partitions(p0), is_index_back=false, range_key([D.D_W_ID], [D.D_ID]), range(MIN,MIN ; MAX,MAX)always true
说明:
- 分析函数对应的算子是
WINDOW FUNCTION
,依赖下层算子的有序输出,有分区表达式和排序表达式。 output
是算子的输出表达式,包含分析函数的结果,filter
固定为nil
。win_expr
表示在窗口中使用哪个窗口函数 ,partition_by
表示窗口内的分组表达式,order_by
表示窗口每组内部统计时的排序表达式 。window_type
window_type
表示窗口类型,有两种:range
和rows
。range
表示按照逻辑位置偏移进行计算窗口上下界限,rows 表示按照实际物理位置偏移进行计算窗口上下界限;默认使用range
方式。upper
和lower
分别定义窗口的上限和下限。UNBOUNDED
表示无边界,按照最大的选择(默认)。CURRENT ROW
表示从当前行开始,如果出现数字则表示移动的行数。PRECEDING
表示向前取边界,FOLLOWING
则表示向后取边界。- 算子 2 的
SORT
会包含分区窗口表达式和窗口内的排序表达式。 - 算子 0 的
SORT
是最外层的排序表达式。
下面可以看看不同分析函数下执行计划里算子 WINDOW_FUNCTION
的各个参数。
EXPLAIN extended_noaddr SELECT w.W_STATE, w_id, w_name, w_ytd, max(w_ytd) over (partition by W_STATE order by w_ytd desc rows between 1 preceding and 1 following) max_ytd_in_3 from BMSQL_WAREHOUSE w ORDER BY w.W_STATE ; ========================================== |ID|OPERATOR |NAME|EST. ROWS|COST | ------------------------------------------ |0 |WINDOW FUNCTION| |1000 |148632| |1 | SORT | |1000 |148441| |2 | TABLE SCAN |W |1000 |141926| ========================================== Outputs & filters: ------------------------------------- 0 - output([W.W_STATE], [W.W_ID], [W.W_NAME], [W.W_YTD], [T_FUN_MAX(W.W_YTD)]), filter(nil), win_expr(T_FUN_MAX(W.W_YTD)), partition_by([W.W_STATE]), order_by([W.W_YTD, DESC]), window_type(ROWS), upper(1 PRECEDING), lower(1 FOLLOWING) 1 - output([W.W_STATE], [W.W_ID], [W.W_NAME], [W.W_YTD]), filter(nil), sort_keys([W.W_STATE, ASC], [W.W_YTD, DESC]) 2 - output([W.W_STATE], [W.W_ID], [W.W_NAME], [W.W_YTD]), filter(nil), access([W.W_STATE], [W.W_ID], [W.W_NAME], [W.W_YTD]), partitions(p0), is_index_back=false, range_key([W.W_ID]), range(MIN ; MAX)always true
说明:
window_type
是ROWS
,upper
是同一个分组内向前一笔,lower
是同一个分组内向后一笔。如果同一个分组内没有向前或向后一笔,那就是空。
下面输出各个仓库的销量以及包括前2笔在内的最大销量。
EXPLAIN extended_noaddr SELECT w.W_STATE, w_id, w_name, w_ytd, max(w_ytd) over (order by w_ytd desc rows between 2 PRECEDING AND current row ) max_ytd_in_3 from BMSQL_WAREHOUSE w ORDER BY w.W_YTD DESC ; ========================================== |ID|OPERATOR |NAME|EST. ROWS|COST | ------------------------------------------ |0 |WINDOW FUNCTION| |1000 |288860| |1 | SORT | |1000 |288669| |2 | TABLE SCAN |W |1000 |285169| ========================================== Outputs & filters: ------------------------------------- 0 - output([W.W_STATE], [W.W_ID], [W.W_NAME], [W.W_YTD], [T_FUN_MAX(W.W_YTD)]), filter(nil), win_expr(T_FUN_MAX(W.W_YTD)), partition_by(nil), order_by([W.W_YTD, DESC]), window_type(ROWS), upper(2 PRECEDING), lower(CURRENT ROW) 1 - output([W.W_STATE], [W.W_ID], [W.W_NAME], [W.W_YTD]), filter(nil), sort_keys([W.W_YTD, DESC]) 2 - output([W.W_STATE], [W.W_ID], [W.W_NAME], [W.W_YTD]), filter(nil), access([W.W_STATE], [W.W_ID], [W.W_NAME], [W.W_YTD]), partitions(p0), is_index_back=false, range_key([W.W_ID]), range(MIN ; MAX)always true
说明:
- 分区表达式
partition_by
并不是必须的,可以为空。
下面看看行转列函数的执行计划。
EXPLAIN extended_Noaddr SELECT d.D_W_ID , d.D_ID ,d.d_name, d.D_YTD , listagg(d.D_NAME,',') WITHIN GROUP (ORDER BY d.D_YTD DESC ) OVER (PARTITION BY d.D_W_ID) d_names FROM BMSQL_DISTRICT d WHERE d.D_W_ID = 10 ; ======================================== |ID|OPERATOR |NAME|EST. ROWS|COST| ---------------------------------------- |0 |WINDOW FUNCTION| |10 |40 | |1 | TABLE SCAN |D |10 |39 | ======================================== Outputs & filters: ------------------------------------- 0 - output([D.D_W_ID], [D.D_ID], [D.D_NAME], [D.D_YTD], [T_FUN_GROUP_CONCAT(D.D_NAME, ',') order_items(D.D_YTD)]), filter(nil), win_expr(T_FUN_GROUP_CONCAT(D.D_NAME, ',') order_items(D.D_YTD)), partition_by([D.D_W_ID]), order_by(nil), window_type(RANGE), upper(UNBOUNDED PRECEDING), lower(UNBOUNDED FOLLOWING) 1 - output([D.D_W_ID], [D.D_ID], [D.D_NAME], [D.D_YTD]), filter(nil), access([D.D_W_ID], [D.D_ID], [D.D_NAME], [D.D_YTD]), partitions(p0), is_index_back=false, range_key([D.D_W_ID], [D.D_ID]), range(10,MIN ; 10,MAX), range_cond([D.D_W_ID = 10])
说明:
- 使用
LISTAGG
语法时,窗口函数表达式是T_FUN_GROUP_CONCAT
。
再看一个 窗口类型为 RANGE
的示例。
EXPLAIN extended_noaddr SELECT w.W_STATE, w_id, w_name, w_ytd, max(w_ytd) over (order by w_ytd desc range between 1000000 PRECEDING AND current row ) max_ytd_in_3 from BMSQL_WAREHOUSE w ORDER BY w.W_YTD DESC ; ========================================== |ID|OPERATOR |NAME|EST. ROWS|COST | ------------------------------------------ |0 |WINDOW FUNCTION| |1000 |191810| |1 | SORT | |1000 |191619| |2 | TABLE SCAN |W |1000 |188120| ========================================== Outputs & filters: ------------------------------------- 0 - output([W.W_STATE], [W.W_ID], [W.W_NAME], [W.W_YTD], [T_FUN_MAX(W.W_YTD)]), filter(nil), win_expr(T_FUN_MAX(W.W_YTD)), partition_by(nil), order_by([W.W_YTD, DESC]), window_type(RANGE), upper(1000000 PRECEDING), lower(CURRENT ROW) 1 - output([W.W_STATE], [W.W_ID], [W.W_NAME], [W.W_YTD], [W.W_YTD + 1000000], [W.W_YTD - 1000000]), filter(nil), sort_keys([W.W_YTD, DESC]) 2 - output([W.W_STATE], [W.W_ID], [W.W_NAME], [W.W_YTD]), filter(nil), access([W.W_STATE], [W.W_ID], [W.W_NAME], [W.W_YTD]), partitions(p0), is_index_back=false, range_key([W.W_ID]), range(MIN ; MAX)always true
说明:
window_type
是RANGE
。是按实际值计算窗口大小,不是按行数固定窗口大小。算子 1 的output
里多了两列 ([W.W_YTD + 1000000], [W.W_YTD - 1000000]
)。
分析函数的代价
分析函数看起来很酷,不过也有代价,那就是每次调用分析函数都可能会有一次排序,排序需要内存,可能需要增大内部参数 _sort_area_size
的值。为了性能还建议使用并行( OB 的并行会在下篇文章介绍)。
下面这个示例会涉及到一次全表扫描和三次排序。
EXPLAIN extended_noaddr SELECT c_w_id, c_d_id, c_id, c.C_LAST ,c.C_FIRST , C_YTD_PAYMENT , rank() OVER (PARTITION BY C_W_ID, c_d_id ORDER BY C_YTD_PAYMENT DESC ) rank_in_district, rank() OVER (PARTITION BY c_w_id ORDER BY C_YTD_PAYMENT DESC ) rank_in_warehouse, rank() OVER (ORDER BY C_YTD_PAYMENT DESC ) rank_in_all FROM BMSQL_CUSTOMER c WHERE c.C_YTD_PAYMENT >= 1000000 ORDER BY c.C_YTD_PAYMENT DESC ; ; ================================================= |ID|OPERATOR |NAME|EST. ROWS|COST | ------------------------------------------------- |0 |WINDOW FUNCTION | |15488448 |314812916| |1 | SORT | |15488448 |311854430| |2 | WINDOW FUNCTION | |15488448 |185456913| |3 | SORT | |15488448 |182498427| |4 | WINDOW FUNCTION| |15488448 |118046685| |5 | SORT | |15488448 |115088200| |6 | TABLE SCAN |C |15488448 |50636458 | ================================================= Outputs & filters: ------------------------------------- 0 - output([C.C_W_ID], [C.C_D_ID], [C.C_ID], [C.C_LAST], [C.C_FIRST], [C.C_YTD_PAYMENT], [T_WIN_FUN_RANK()], [T_WIN_FUN_RANK()], [T_WIN_FUN_RANK()]), filter(nil), win_expr(T_WIN_FUN_RANK()), partition_by(nil), order_by([C.C_YTD_PAYMENT, DESC]), window_type(RANGE), upper(UNBOUNDED PRECEDING), lower(UNBOUNDED FOLLOWING) 1 - output([C.C_W_ID], [C.C_D_ID], [C.C_ID], [C.C_LAST], [C.C_FIRST], [C.C_YTD_PAYMENT], [T_WIN_FUN_RANK()], [T_WIN_FUN_RANK()]), filter(nil), sort_keys([C.C_YTD_PAYMENT, DESC]) 2 - output([C.C_W_ID], [C.C_D_ID], [C.C_ID], [C.C_LAST], [C.C_FIRST], [C.C_YTD_PAYMENT], [T_WIN_FUN_RANK()], [T_WIN_FUN_RANK()]), filter(nil), win_expr(T_WIN_FUN_RANK()), partition_by([C.C_W_ID]), order_by([C.C_YTD_PAYMENT, DESC]), window_type(RANGE), upper(UNBOUNDED PRECEDING), lower(UNBOUNDED FOLLOWING) 3 - output([C.C_W_ID], [C.C_D_ID], [C.C_ID], [C.C_LAST], [C.C_FIRST], [C.C_YTD_PAYMENT], [T_WIN_FUN_RANK()]), filter(nil), sort_keys([C.C_W_ID, ASC], [C.C_YTD_PAYMENT, DESC]), prefix_pos(1) 4 - output([C.C_W_ID], [C.C_D_ID], [C.C_ID], [C.C_LAST], [C.C_FIRST], [C.C_YTD_PAYMENT], [T_WIN_FUN_RANK()]), filter(nil), win_expr(T_WIN_FUN_RANK()), partition_by([C.C_W_ID], [C.C_D_ID]), order_by([C.C_YTD_PAYMENT, DESC]), window_type(RANGE), upper(UNBOUNDED PRECEDING), lower(UNBOUNDED FOLLOWING) 5 - output([C.C_W_ID], [C.C_D_ID], [C.C_ID], [C.C_LAST], [C.C_FIRST], [C.C_YTD_PAYMENT]), filter(nil), sort_keys([C.C_W_ID, ASC], [C.C_D_ID, ASC], [C.C_YTD_PAYMENT, DESC]), prefix_pos(2) 6 - output([C.C_YTD_PAYMENT], [C.C_W_ID], [C.C_D_ID], [C.C_ID], [C.C_LAST], [C.C_FIRST]), filter([C.C_YTD_PAYMENT >= 1000000]), access([C.C_YTD_PAYMENT], [C.C_W_ID], [C.C_D_ID], [C.C_ID], [C.C_LAST], [C.C_FIRST]), partitions(p0), is_index_back=false, filter_before_indexback[false], range_key([C.C_W_ID], [C.C_D_ID], [C.C_ID]), range(MIN,MIN,MIN ; MAX,MAX,MAX)always true
执行结果如下图:
说明:
- 执行的顺序,首先算子 6 是针对表的扫描,先执行过滤条件(
filter
)。 - 算子 5 是第一次排序,排序列是分区列加上排序列(
sort_keys([C.C_W_ID, ASC], [C.C_D_ID, ASC], [C.C_YTD_PAYMENT, DESC])
)。 - 算子 4 是第一个窗口函数,分区列是
partition_by([C.C_W_ID], [C.C_D_ID])
, 排序列是order_by([C.C_YTD_PAYMENT, DESC])
。 - 算子 3 是一个优化,利用了第一个窗口函数的结果继续进行排序。
- 算子 1 在算子 2 结果集基础上进一步排序。
上面示例 3 个分析函数使用的窗口函数算子都是 T_WIN_FUN_RANK
,只是分区列不同所以还是有三次排序。如果分区列和排序列一样的话,是可以规避多次排序的。如下面示例。
EXPLAIN extended_noaddr SELECT * FROM ( SELECT c_w_id, c_d_id, c_id, c.C_LAST ,c.C_FIRST , C_YTD_PAYMENT ,rank() OVER (PARTITION BY c_w_id,c_d_id ORDER BY C_YTD_PAYMENT) ytd_rank ,first_value(C_YTD_PAYMENT) OVER (PARTITION BY c_w_id,c_d_id ORDER BY C_YTD_PAYMENT) first_ytd -- ,last_value(C_YTD_PAYMENT) OVER (PARTITION BY C_W_ID,c_d_id ORDER BY C_YTD_PAYMENT ) last_ytd ,last_value(C_YTD_PAYMENT) OVER (PARTITION BY C_W_ID,c_d_id ORDER BY C_YTD_PAYMENT rows between unbounded preceding and unbounded following) last_ytd_all FROM BMSQL_CUSTOMER c WHERE c.C_YTD_PAYMENT >= 1000000 ) t WHERE t.c_w_id = 23 ; =========================================== |ID|OPERATOR |NAME|EST. ROWS|COST | ------------------------------------------- |0 |SUBPLAN SCAN |T |1533 |147862| |1 | WINDOW FUNCTION| |15477 |144430| |2 | SORT | |15477 |141474| |3 | TABLE SCAN |C |15477 |41003 | =========================================== Outputs & filters: ------------------------------------- 0 - output([T.C_W_ID], [T.C_D_ID], [T.C_ID], [T.C_LAST], [T.C_FIRST], [T.C_YTD_PAYMENT], [T.YTD_RANK], [T.FIRST_YTD], [T.LAST_YTD_ALL]), filter([T.C_D_ID = 10]), access([T.C_W_ID], [T.C_D_ID], [T.C_ID], [T.C_LAST], [T.C_FIRST], [T.C_YTD_PAYMENT], [T.YTD_RANK], [T.FIRST_YTD], [T.LAST_YTD_ALL]) 1 - output([C.C_W_ID], [C.C_D_ID], [C.C_ID], [C.C_LAST], [C.C_FIRST], [C.C_YTD_PAYMENT], [T_WIN_FUN_RANK()], [T_WIN_FUN_NTH_VALUE(C.C_YTD_PAYMENT,1)], [T_WIN_FUN_NTH_VALUE(C.C_YTD_PAYMENT,1)]), filter(nil), win_expr(T_WIN_FUN_RANK()), partition_by([C.C_W_ID]), order_by([C.C_YTD_PAYMENT, ASC]), window_type(RANGE), upper(UNBOUNDED PRECEDING), lower(UNBOUNDED FOLLOWING) win_expr(T_WIN_FUN_NTH_VALUE(C.C_YTD_PAYMENT,1)), partition_by([C.C_W_ID]), order_by([C.C_YTD_PAYMENT, ASC]), window_type(RANGE), upper(UNBOUNDED PRECEDING), lower(CURRENT ROW) win_expr(T_WIN_FUN_NTH_VALUE(C.C_YTD_PAYMENT,1)), partition_by([C.C_W_ID]), order_by([C.C_YTD_PAYMENT, ASC]), window_type(ROWS), upper(UNBOUNDED PRECEDING), lower(UNBOUNDED FOLLOWING) 2 - output([C.C_W_ID], [C.C_D_ID], [C.C_ID], [C.C_LAST], [C.C_FIRST], [C.C_YTD_PAYMENT]), filter(nil), sort_keys([C.C_YTD_PAYMENT, ASC]) 3 - output([C.C_YTD_PAYMENT], [C.C_W_ID], [C.C_D_ID], [C.C_ID], [C.C_LAST], [C.C_FIRST]), filter([C.C_YTD_PAYMENT >= 1000000]), access([C.C_YTD_PAYMENT], [C.C_W_ID], [C.C_D_ID], [C.C_ID], [C.C_LAST], [C.C_FIRST]), partitions(p0), is_index_back=false, filter_before_indexback[false], range_key([C.C_W_ID], [C.C_D_ID], [C.C_ID]), range(23,MIN,MIN ; 23,MAX,MAX), range_cond([C.C_W_ID = 23])

说明:
- 这里
where
条件有两个(t.c_w_id = 23 AND t.c_d_id = 10
),但实际条件下推到子查询里只有t.c_w_id=23
。 这是因为子查询使用的分析函数里的分区列只包含列t.c_w_id
。看算子 3 的range_cond
和算子 1的win_expr
。 最后一部的filter
才是条件t.c_d_id = 10
。 - 算子 2 的
sort_keys([C.C_YTD_PAYMENT, ASC])
只包含了列c.c_ytd_payment
这是因为算子 3 返回的数据在列c_w_id
上已经是有序的。 - 算子 1 的一共用了 3 个窗口函数表达式(
win_expr
),分别是:T_WIN_FUN_RANK()
、T_WIN_FUN_NTH_VALUE(C.C_YTD_PAYMENT,1)
、T_WIN_FUN_NTH_VALUE(C.C_YTD_PAYMENT,1)
,它们的order_by
条件都是一样的,只是窗口的下限不一样,可以共用一个SORT
操作。 - 从这个例子还看出,默认的窗口范围是
upper(UNBOUNDED PRECEDING), lower(CURRENT ROW)
。被注释掉的last_value
使用默认的窗口范围,只会返回当前行值。
参考
更多执行计划总结,请参考前面文章。下一篇会重点介绍 OB 特有的并行和分布式执行计划。