点击上方“IT那活儿”公众号--专注于企业全栈运维技术分享,不管IT什么活儿,干就完了!!!
oracle Height-balanced直方图无法准确评估字段值出现频率的相关研究.数据库:oracle 19.12
一
理论知识
oracle 12c前height-balanced直方图存在不能准确评估表中某字段出现频率的缺陷,因此oracle 12c引进top frequency和hybrid直方图解决该问题,以下为实验记录。
二
建测试数据
2.1 建表并insert数据
LEO@rmlis> create table book (book_id number);
Table created.
insert into book values(1);
insert into book values(1);
insert into book values(1);
insert into book values(5);
insert into book values(5);
insert into book values(5);
insert into book values(10);
insert into book values(10);
insert into book values(25);
insert into book values(25);
insert into book values(25);
insert into book values(25);
insert into book values(50);
insert into book values(100);
insert into book values(100);
LEO@rmlis> commit;
Commit complete.
2.2 数据查询
select count(*) as num_of_rows, book_id
from book
group by book_id
order by 1 desc;
NUM_OF_ROWS BOOK_ID
----------- ----------
4 25
3 1
3 5
2 10
2 100
1 50
6 rows selected.
三
height-balanced直方图
3.1 生成height-balanced直方图
--以下生成12c之前的height-balanced直方图:
LEO@rmlis> begin
2 dbms_stats.gather_table_stats(ownname => 'LEO',
3 tabname => 'BOOK',
4 method_opt => 'FOR COLUMNS BOOK_ID SIZE 3',
5 estimate_percent => 100);
6 end;
7 /
PL/SQL procedure successfully completed.
说明:指定estimate_percent参数,bucket设置为3。
3.2 查height-balanced直方图
LEO@rmlis> select a.column_name,
2 a.table_name,
3 b.num_rows,
4 a.num_distinct Cardinality,
5 round(a.num_distinct b.num_rows * 100, 2) selectivity,
6 a.histogram,
7 a.num_buckets
8 from dba_tab_col_statistics a, dba_tables b
9 where a.owner = b.owner
10 and a.table_name = b.table_name
11 and a.owner = 'LEO'
12 and a.table_name = 'BOOK';
COLUMN_NAME TABLE_NAME NUM_ROWS CARDINALITY SELECTIVITY HISTOGRAM NUM_BUCKETS
------------------------- ------------------------- ---------- ----------- ----------- --------------- -----------
BOOK_ID BOOK 15 6 40 HEIGHT BALANCED 3
说明:如上所示,该直方图为HEIGHT BALANCED类型。
3.3 查执行计划
LEO@rmlis> select * from book where book_id=25;
Execution Plan
----------------------------------------------------------
Plan hash value: 3071066230
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | 9 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| BOOK | 3 | 9 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("BOOK_ID"=25)
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
7 consistent gets
0 physical reads
0 redo size
613 bytes sent via SQL*Net to client
397 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
4 rows processed
说明:此处优化器评估book_id=25为3行,实际为4行。
LEO@rmlis> select * from book where book_id=100;
Execution Plan
----------------------------------------------------------
Plan hash value: 3071066230
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | 9 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| BOOK | 3 | 9 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("BOOK_ID"=100)
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
7 consistent gets
0 physical reads
0 redo size
603 bytes sent via SQL*Net to client
398 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
2 rows processed
说明:优化器评估book_id=100也为3行,实际为2行。如上所示,直方图height-balanced并不能准确评估表中某列值的实际记录数。
四
Hybrid直方图
4.1 生成Hybrid直方图
--因数据库版本为oracle 19.12,现对该表生成Hybrid直方图:
LEO@rmlis> begin
2 dbms_stats.gather_table_stats(ownname => 'LEO',
3 tabname => 'BOOK',
4 method_opt => 'FOR COLUMNS BOOK_ID SIZE 3');
5 end;
6 /
PL/SQL procedure successfully completed.
说明:只需将estimate_percent参数配置删除,默认为auto_sample_size。
4.2 查Hybrid直方图
LEO@rmlis> select a.column_name,
2 a.table_name,
3 b.num_rows,
4 a.num_distinct Cardinality,
5 round(a.num_distinct / b.num_rows * 100, 2) selectivity,
6 a.histogram,
7 a.num_buckets
8 from dba_tab_col_statistics a, dba_tables b
9 where a.owner = b.owner
10 and a.table_name = b.table_name
11 and a.owner = 'LEO'
12 and a.table_name = 'BOOK';
COLUMN_NAME TABLE_NAME NUM_ROWS CARDINALITY SELECTIVITY HISTOGRAM NUM_BUCKETS
------------------------- ------------------------- ---------- ----------- ----------- --------------- -----------
BOOK_ID BOOK 15 6 40 HYBRID 3
说明:如上所示,成功将height-balanced直方图修改为Hybrid直方图。
4.3 查执行计划
--为避免共享池中的数据影响实验结果,将共享池数据进行flush:
LEO@rmlis> conn / as sysdba
Connected.
SYS@rmlis> alter system flush shared_pool;
System altered.
SYS@rmlis> conn leo/leo;
Connected.
LEO@rmlis> set autotrace traceonly;
LEO@rmlis> select * from book where book_id=25;
Execution Plan
----------------------------------------------------------
Plan hash value: 3071066230
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 4 | 12 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| BOOK | 4 | 12 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("BOOK_ID"=25)
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
7 consistent gets
0 physical reads
0 redo size
613 bytes sent via SQL*Net to client
397 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
4 rows processed
LEO@rmlis> select * from book where book_id=100;
Execution Plan
----------------------------------------------------------
Plan hash value: 3071066230
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 6 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| BOOK | 2 | 6 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("BOOK_ID"=100)
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
7 consistent gets
0 physical reads
0 redo size
603 bytes sent via SQL*Net to client
398 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
2 rows processed
说明:如上所示,优化器预估的返回值与实际值吻合,正确评估表中数据有利于优化器做出更优的执行计划。
END