Oracle主机CPU使用率突增问题排查

2024年 4月 28日 77.6k 0

点击上方“IT那活儿”公众号--专注于企业全栈运维技术分享,不管IT什么活儿,干就完了!!!    

故障介绍

突然收到数据库主机CPU使用率突增告警,最终查到原因还是SQL语句执行的问题,具体原因和解决办法由以下的模拟故障来展示。

环境构建

环境介绍:

操作系统是Redhat 7.8,数据库版本是19.3单实例版本。2.1 先创建空表

SQL> create table tobj (
  2  id number,
  3  name varchar2(128)
  4  )
  5  partition by range(id)
  6  (
  7  partition p01 values less than(500),
  8  partition p02 values less than(1000),
  9  partition p03 values less than(1500),
 10  partition p04 values less than(2000),
 11  partition p05 values less than(2500),
 12  partition p06 values less than(3000),
 13  partition p07 values less than(4000),
 14  partition p08 values less than(5000)
 15  );

Table created.

再插入数据:

SQL> insert into tobj (id,name) select object_id,object_name from dba_objects where object_id commit;

Commit complete.

2.2 创建local索引

SQL> create index ind_id on tobj(id) local;

Index created.

查看分区索引状态:

select index_owner,index_name,partition_name,status,tablespace_name
from dba_ind_partitions where index_name='IND_ID';

2.3 正常情况下SQL语句的执行

查询指定id的数据,看执行计划是否走索引。

SQL> select * from tobj where id =3009;

                  ID NAME
-------------------- --------------------------------------------------------------------------------------------------------------------------------
                3009 GV_$RSRC_PLAN_HISTORY

Execution Plan
----------------------------------------------------------
Plan hash value: 15562560

---------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | |     1 | 79 |     2   (0)| 00:00:01 |       | |
| 1 |  PARTITION RANGE SINGLE | |     1 | 79 |     2   (0)| 00:00:01 |     7 | 7 |
| 2 |   TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| TOBJ |     1 | 79 |     2   (0)| 00:00:01 |     7 | 7 |
|* 3 |    INDEX RANGE SCAN | IND_ID |     1 | |     1   (0)| 00:00:01 |     7 | 7 |
---------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("ID"=3009)

可以看出正常走了索引扫描。

2.4 异常情况下SQL语句的执行

1)创建一张表并插入数据

SQL> create table tobj_sw (
  2 id number,
  3 name varchar2(128)
  4 );

Table created.

SQL> insert into tobj_sw (id,name) select object_id,object_name from dba_objects where object_id>5000 and object_id commit;

Commit complete.

2)给分区表添加一个新分区

SQL> alter table tobj add partition p09 values less than('8000');

Table altered.

SQL> select partition_name from user_tab_partitions where table_name='TOBJ';

PARTITION_NAME
--------------------------------------------------------------------------------
P01
P02
P03
P04
P05
P06
P07
P08
P09

9 rows selected.

3)查询分区索引的状态

select 
index_owner,index_name,partition_name,status,tablespace_name
from dba_ind_partitions where index_name='IND_ID';

4)分区交换

SQL> alter table tobj exchange partition p09 with table tobj_sw;

Table altered.

5)查询普通表的数据

SQL> select count(*) from tobj_sw;

  COUNT(*)
----------
         0

6)查询分区表的数据

SQL> select count(*) from tobj;

  COUNT(*)
----------
      7984

7)查询分区索引的状态

select 
index_owner,index_name,partition_name,status,tablespace_name
from dba_ind_partitions where index_name='IND_ID';

发现新的分区的索引状态失效了。8)查询新分区的数据,看执行计划是否走索引

SQL> set autotrace on
SQL> select * from tobj where id =5009;

                  ID NAME
-------------------- --------------------------------------------------------------------------------------------------------------------------------
                5009 DBA_LOCKDOWN_PROFILES

Execution Plan
----------------------------------------------------------
Plan hash value: 4166245901

-----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
-----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | |     1 | 79 |     5   (0)| 00:00:01 |       | |
| 1 |  PARTITION RANGE SINGLE| |     1 | 79 |     5   (0)| 00:00:01 |     9 | 9 |
|* 2 |   TABLE ACCESS FULL | TOBJ |     1 | 79 |     5   (0)| 00:00:01 |     9 | 9 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("ID"=5009)

可以看到查询新分区里的数据,走的是全表扫描。9)查询旧分区的数据,看执行计划是否走索引

SQL> select * from tobj where id =4009;

                  ID NAME
-------------------- --------------------------------------------------------------------------------------------------------------------------------
                4009 GV$HANG_STATISTICS

Execution Plan
----------------------------------------------------------
Plan hash value: 15562560

---------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | |     1 | 79 |     2   (0)| 00:00:01 |       | |
| 1 |  PARTITION RANGE SINGLE | |     1 | 79 |     2   (0)| 00:00:01 |     8 | 8 |
| 2 |   TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| TOBJ |     1 | 79 |     2   (0)| 00:00:01 |     8 | 8 |
|* 3 |    INDEX RANGE SCAN | IND_ID |     1 | |     1   (0)| 00:00:01 |     8 | 8 |
---------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("ID"=4009)

看结果,不在新分区里的数据,走的是索引扫描。

解决办法

3.1 创建一张表并插入数据

SQL> create table tobj_sw2 (
  2 id number,
  3 name varchar2(128)
  4 );

Table created.

SQL> insert into tobj_sw2 (id,name) select object_id,object_name from dba_objects where object_id>8000 and object_id commit;

Commit complete.

3.2 给分区表添加一个新分区

SQL> alter table tobj add partition p10 values less than('10000');

Table altered.

SQL> create index ind_id2 on tobj_sw2(id);

Index created.

SQL> select partition_name from user_tab_partitions where table_name='TOBJ';

PARTITION_NAME
--------------------------------------------------------------------------------
P01
P02
P03
P04
P05
P06
P07
P08
P09
P10

10 rows selected.

3.3 查询分区索引的状态

select 
index_owner,index_name,partition_name,status,tablespace_name
from dba_ind_partitions where index_name='IND_ID';

3.4 分区交换

SQL> alter table tobj exchange partition p10 with table tobj_sw2 including indexes;

Table altered.

3.5 查询分区索引的状态

select index_owner,index_name,partition_name,status,tablespace_name
from dba_ind_partitions where index_name='IND_ID';

3.6 查询新分区的数据,看执行计划是否走索引

SQL> set autotrace on
SQL> select * from tobj where id =8009;

                  ID NAME
-------------------- --------------------------------------------------------------------------------------------------------------------------------
                8009 ALL_RULES

Execution Plan
----------------------------------------------------------
Plan hash value: 15562560

---------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | |     1 | 79 |     2   (0)| 00:00:01 |       | |
| 1 |  PARTITION RANGE SINGLE | |     1 | 79 |     2   (0)| 00:00:01 |    10 | 10 |
| 2 |   TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| TOBJ |     1 | 79 |     2   (0)| 00:00:01 |    10 | 10 |
|* 3 |    INDEX RANGE SCAN | IND_ID |     1 | |     1   (0)| 00:00:01 |    10 | 10 |
---------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("ID"=8009)

可以看到查询新分区里的数据,走的是索引扫描。

总 结:

虽然分区表使用的是local 索引,但是经过分区交换,若不加“including indexes”,那么该涉及分区的SQL语句无法使用索引,只能全表扫描。此次数据库主机CPU使用率高就是这个原因,业务在分区表做交换分区时,未添加“including indexes”,导致执行计划走了全表扫描,从而引发CPU使用率突增,最终通过重建索引解决该问题。
END

本文作者:胡 伟(上海新炬中北团队)

本文来源:“IT那活儿”公众号

相关文章

Oracle如何使用授予和撤销权限的语法和示例
Awesome Project: 探索 MatrixOrigin 云原生分布式数据库
下载丨66页PDF,云和恩墨技术通讯(2024年7月刊)
社区版oceanbase安装
Oracle 导出CSV工具-sqluldr2
ETL数据集成丨快速将MySQL数据迁移至Doris数据库

发布评论