一、如何利用函数索引优化查询
函数索引是一种优化技术,主要用途是加速包含函数调用的查询语句。当查询中包含函数调用时,数据库需要对每行都做函数计算,这会导致查询变慢。使用函数索引可以在查询时快速定位匹配的函数值,从而避免重复计算,提高查询效率。下面是一个利用函数索引来优化查询的示例。
假设有一张表t1,t1中有一个数据类型为date的列date_col:
create table t1(date_col date, sales_col int, name_col varchar(10));
如果需要获取3月份的数据,那么可以使用Month()函数来查询:
select * from t1 where month(date_col) = 3;
这种场景下,数据库需要为表中每行计算month(date_col),过滤掉不符合"month(date_col) = 3"的行。如果需要频繁使用月份信息来过滤数据,每次都需要重新计算month(date_col),就会造成大量开销。因此可以在date_col上建一个函数索引,将month(date_col)存储到了索引表中,来加速查询,创建函数索引的语句如下:
create index i1 on t1((month(date_col)));
建立索引后,查询时就可以直接使用该索引,避免了对每个行进行函数计算,提高了查询效率。
explain select * from t1 where month(date_col) = 3;
+-------------------------------------------------------------------------------------------------+
| Query Plan |
+-------------------------------------------------------------------------------------------------+
| ================================================== |
| |ID|OPERATOR |NAME |EST.ROWS|EST.TIME(us)| |
| -------------------------------------------------- |
| |0 |TABLE RANGE SCAN|t1(i1)|1 |7 | |
| ================================================== |
| Outputs & filters: |
| ------------------------------------- |
| 0 - output([t1.date_col], [t1.sales_col], [t1.name_col]), filter(nil), rowset=16 |
| access([t1.__pk_increment], [t1.date_col], [t1.sales_col], [t1.name_col]), partitions(p0) |
| is_index_back=true, is_global_index=false, |
| range_key([t1.SYS_NC19$], [t1.__pk_increment]), range(3,MIN ; 3,MAX), |
| range_cond([t1.SYS_NC19$ = 3]) |
+-------------------------------------------------------------------------------------------------+
二、OceanBase 4.2在MySQL模式下支持函数索引功能
OceanBase 4.1 以及之前的版本中,已在Oracle模式下支持了函数索引功能。OceanBase 4.2 在MySQL模式支持函数索引功能,兼容MySQL 8.0。
示例1:使用create index语句创建函数索引。
语法如下:
CREATE [UNIQUE | FULLTEXT | SPATIAL] INDEX index_name
ON tbl_name (expr,...)
[index_option] ...
key_part: (expr) [ASC | DESC]
expr是一个合法的函数索引表达式,且允许是布尔表达式,例如"c1=c1"。与MySQL不同的是,OceanBase禁止在函数索引的定义中引用生成列。
例如以下语句在t1_func表上创建了一个索引定义是c1+c2 < 1的函数索引i1。
create table t1_func(c1 int, c2 int);
create index i1 on t1_func ((c1+c2 < 1));
示例2:使用alter table语句创建函数索引。
语法如下:
ALTER TABLE tbl_name
[alter_option [, alter_option] ...]
[partition_options]
alter_option: {
table_options
| ADD {INDEX | KEY} [index_name]
[index_type] (key_part,...) [index_option] ...
| ADD SPATIAL [INDEX | KEY] [index_name]
(key_part,...) [index_option] ...
| ...
key_part: (expr) [ASC | DESC]
例如以下语句在t1_func上添加了3个函数索引,其中一个名字是i2,另外两个由系统自动生成的名称,格式为‘functional_index’前缀加编号。
alter table t1_func add index ((concat(c1,'a')));
alter table t1_func add index ((c1+1));
alter table t1_func add index i2 ((concat(c1,'a')));
示例3:使用create table语句在建表时创建函数索引。
语法如下:
CREATE [TEMPORARY] TABLE [IF NOT EXISTS] tbl_name
(create_definition,...)
[table_options]
[partition_options]
create_definition: {
col_name column_definition
| {INDEX | KEY} [index_name] [index_type] (key_part,...)
[index_option] ...
| SPATIAL [INDEX | KEY] [index_name] (key_part,...)
[index_option] ...
| [CONSTRAINT [symbol]] UNIQUE [INDEX | KEY]
[index_name] [index_type] (key_part,...)
[index_option] ...
...
}
key_part: (expr) [ASC | DESC]
例如以下语句在创建表t2_func时,创建了一个函数索引i1。
create table t2_func (c1 int, c2 int, index i1 ((c1+1)), unique key ((c1+c2)));
可以使用show create table语句查看创建的函数索引:
show create table t1_func;
+---------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+---------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| t1_func | CREATE TABLE `t1_func` (
`c1` int(11) DEFAULT NULL,
`c2` int(11) DEFAULT NULL,
KEY `i1` (((`c1` + `c2`) < 1)) BLOCK_SIZE 16384 LOCAL,
KEY `functional_index` (concat(`c1`,'a')) BLOCK_SIZE 16384 LOCAL,
KEY `functional_index_2` ((`c1` + 1)) BLOCK_SIZE 16384 LOCAL,
KEY `i2` (concat(`c1`,'a')) BLOCK_SIZE 16384 LOCAL
) DEFAULT CHARSET = utf8mb4 ROW_FORMAT = DYNAMIC COMPRESSION = 'lz4_1.0' REPLICA_NUM = 1 BLOCK_SIZE = 16384 USE_BLOOM_FILTER = FALSE TABLET_SIZE = 134217728 PCTFREE = 0 |
+---------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
show create table t2_func;
+---------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+---------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| t2_func | CREATE TABLE `t2_func` (
`c1` int(11) DEFAULT NULL,
`c2` int(11) DEFAULT NULL,
UNIQUE KEY `functional_index` ((`c1` + `c2`)) BLOCK_SIZE 16384 LOCAL,
KEY `i1` ((`c1` + 1)) BLOCK_SIZE 16384 LOCAL
) DEFAULT CHARSET = utf8mb4 ROW_FORMAT = DYNAMIC COMPRESSION = 'lz4_1.0' REPLICA_NUM = 1 BLOCK_SIZE = 16384 USE_BLOOM_FILTER = FALSE TABLET_SIZE = 134217728 PCTFREE = 0 |
+---------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
也可以使用show index命令来查看已创建的函数索引,例如:
show index from t1_func;
+---------+------------+--------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+---------------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression |
+---------+------------+--------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+---------------------+
| t1_func | 1 | i1 | 1 | SYS_NC18$ | A | NULL | NULL | NULL | YES | BTREE | available | | YES | ((`c1` + `c2`) < 1) |
| t1_func | 1 | functional_index | 1 | SYS_NC19$ | A | NULL | NULL | NULL | YES | BTREE | available | | YES | concat(`c1`,'a') |
| t1_func | 1 | functional_index_2 | 1 | SYS_NC20$ | A | NULL | NULL | NULL | YES | BTREE | available | | YES | (`c1` + 1) |
| t1_func | 1 | i2 | 1 | SYS_NC19$ | A | NULL | NULL | NULL | YES | BTREE | available | | YES | concat(`c1`,'a') |
+---------+------------+--------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+---------------------+
show index from t2_func;
+---------+------------+------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression |
+---------+------------+------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+---------------+
| t2_func | 1 | i1 | 1 | SYS_NC18$ | A | NULL | NULL | NULL | YES | BTREE | available | | YES | (`c1` + 1) |
| t2_func | 0 | functional_index | 1 | SYS_NC19$ | A | NULL | NULL | NULL | YES | BTREE | available | | YES | (`c1` + `c2`) |
+---------+------------+------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+---------------+
三、OceanBase 4.2禁止非确定性函数用于函数索引
OceanBase 4.2版本禁止了一些非确定性的系统函数被用于创建函数索引和生成列,以提升稳定性。这些系统函数的结果会随着系统或用户环境的变化而变化的函数,例如:
select current_time();
+----------------+
| current_time() |
+----------------+
| 20:44:22 |
+----------------+
select current_time();
+----------------+
| current_time() |
+----------------+
| 20:44:24 |
+----------------+
MySQL模式
以下函数被禁止用于生成列和函数索引:
aes_decrypt | from_unixtime | statement_digest |
aes_encrypt | host_ip | sys_privilege_check |
benchmark | icu_version | sysdate |
connection_id | is_serving_tenant | time_to_usec |
current_date | last_excution_id | unix_timestamp |
current_time | last_insert_id | usec_to_time |
current_timestamp | lnnvl | user |
current_user | localtimestamp | utc_date |
current_user_priv | mysql_port | utc_time |
database | name_const | utc_timestamp |
des_decrypt | ob_version | uuid |
des_encrypt | rand | uuid_short |
des_hex_str | random_bytes | validate_password_strength |
dump | row_count | version |
encrypt | rpc_port | weight_string |
found_rows | sleep |
一些时间相关的系统函数禁止在参数为TIME类型时用于函数索引,例如:
create table time_func(c1 time, c2 date);
create index i1 on time_func((date(c1)));
ERROR 3758 (HY000): Expression of functional index contains a disallowed function.
涉及以下函数:
adddate | dayofyear | timestampdiff |
date | last_day | to_days |
date_sub | month | week |
datediff | monthname | weekday |
day | quarter | weekofyear |
dayname | timestamp | year |
dayofmonth | timestamp_nvl | yearweek |
dayofweek | timestampadd |
此外,cast函数将time类型的参数转换成非time的其他时间类型,或者将timestamp类型的参数转换为非timestamp的其他数据类型时,也被禁止用于生成列和函数索引,例如:
create index i1 on time_func((cast(c1 as DATE)));
ERROR 3758 (HY000): Expression of functional index contains a disallowed function.
Oracle模式
与之前的版本相比,4.2版本在Oracle模式下新增禁止了以下函数用于生成列和函数索引:
dump | ob_version | rpc_port |
host_ip | regexp_count | scn_to_timestamp |
is_serving_tenant | regexp_instr | timestamp_to_scn |
last_trace_id | regexp_substr |
以下系统函数在参数是字符串类型时禁止用于函数索引:
add_months | next_day |
last_day | timestamp_nvl |
months_between |
例如:
create table t1(c1 date, c2 varchar(40), c3 timestamp);
create index i1 on t1(ADD_MONTHS(c2, 1));
ORA-01743: only pure functions can be indexed
此外,cast函数在将字符串转为时间类型、将时间类型转为字符串、将不带timezone信息的时间类型转为带timezone信息的时间类型的情况下禁止用于函数索引,例如:
create table t2(c1 varchar(20), c2 date, c3 TIMESTAMP WITH TIME ZONE, c4 number);
create index i1 on t2(cast(c1 as DATE));
ORA-01743: only pure functions can be indexed
create index i1 on t2(cast(c2 as TIMESTAMP WITH TIME ZONE));
ORA-01743: only pure functions can be indexed
create index i1 on t2(cast(c3 as VARCHAR(10)));
ORA-01743: only pure functions can be indexed
四、总结
Oceanbase 4.2版本支持了在MySQL模式下创建和使用函数索引,并且禁止了部分非确定性函数用于创建函数索引和生成列以提升稳定性。但是其中部分函数在MySQL和Oracle中是允许用于函数索引的,例如:
(Mysql 8.0.31)
mysql> create table time_func(c1 timestamp, c2 date);
Query OK, 0 rows affected (0.04 sec)
mysql> create index i1 on time_func((date(c1)));
Query OK, 0 rows affected (0.02 sec)
Records: 0 Duplicates: 0 Warnings: 0
Oceanbase后续版本将会放开对这些函数的限制,进一步提升兼容性。