其他2

2023年 9月 2日 200.9k 0

锁 单个SESSION锁等待队列超过20个 #应用#

一、数据库状态检查的步骤/命令(通过本步骤可以确定数据库当前的状态是否OK)

检查连接情况,确保能正常登录、有应用连接、连接数使用率正常

col PROCESS_LIMIT for a20;
select process_current,value process_limit,round(process_current / value, 2) * 100 || '%' as process_pct
from (select count(*) as process_current from v$process) a, v$parameter where name = 'processes';

二、请描述告警出现的可能原因

锁等待,可能是应用逻辑或者语句效率问题

三、预计多久不处理就可以出现UIOC(P1)

持续等待则影响业务语句正常执行,需15分钟内处理正常

四、告警检查的步骤(通过本步骤确定告警是否是误告)

--单个会话的锁等待队列长度

select /*+rule*/'lock_num:'||nvl(max(count(aa.holding_session)),0) lock_num  from dba_waiters aa group by aa.holding_session;

五、需要保存现场的步骤/命令

create table dbmgr.ash_20220000 as select tt.* from gv$active_session_history tt;
create table dbmgr.lock_20220000 as select *from gv$lock;
create table dbmgr.locked_object_20220000 as select *from GV$LOCKED_OBJECT;

六、需要反馈上级以及运营的信息

将堵塞的情况反馈运营/DA,同时确认业务的影响情况,运营确认是否kill holder会话

七、建议处理的流程,步骤和对应命令

1、检查堵塞情况,找到holder,发给运营评估能否kill

a.
set line 1000 pagesize 999
col EVENT for a40
select 
s.username,
 decode(lc.block, 0, 'waiter: ', 'holder: ') || lc.sid sid,
 decode(lc.lmode,
        0,
        'none',
        1,
        'null',
        2,
        'row share',
        3,
        'row excl.',
        4,
        'share',
        5,
        's/row excl.',
        6,
        'exclusive',
        lmode,
        ltrim(to_char(lc.lmode, '990'))) lmode,
 decode(lc.request,
        0,
        'none',
        1,
        'null',
        2,
        'row share',
        3,
        'row excl.',
        4,
        'share',
        5,
        's/row excl.',
        6,
        'exclusive',
        request,
        ltrim(to_char(lc.request, '990'))) request,
 lc.type,
decode( s.sql_id,null,s.PREV_SQL_ID,s.sql_id) sql_id,
s.EVENT,s.BLOCKING_SESSION_STATUS,s.BLOCKING_SESSION,s.BLOCKING_INSTANCE,
 'alter system kill session ' || '''' || S.SID || ',' || S.SERIAL# || ',@' ||
       s.INST_ID || ''''||' immediate;'  kill_command
  from gv$lock lc, gv$session s
 where (lc.id1, lc.id2, lc.type) in
       (select t.id1, t.id2, t.type from gv$lock t where t.request > 0)
   and lc.sid = s.sid
   and lc.inst_id = s.inst_id
 order by lc.id1, lc.request;

b.
with l as(
select /*+materialize*/ decode(request,0,'Holder:','    Waiter:') role, l.inst_id,l.sid, id1,id2,lmode,request,l.type from gv$lock l
where (id1,id2,l.type) in (select id1,id2,type from gv$lock where request>0 ) 
order by id1,request)
select l.*, s.SQL_ID, s.PREV_SQL_ID from l, gv$session s where l.sid = s.sid ;

with l as (select/*+materialize*/ sid from gv$lock where block = 1)
select 'alter system kill session '''||s.sid||','||serial#||',@'||inst_id||''' immediate;' as sql from gv$session s, l where s.sid = l.sid 
order by s.sid,s.serial#;

2、反馈运营会话详情,评估能否kill

--查看sid对应会话详情

set linesize 150 pagesize 999
col EVENT for a30
col USERNAME for a10
col MACHINE for a10
col OSUSER for a10
col PROGRAM for a10
select inst_id,sid,serial#,sql_id,event,blocking_instance,blocking_session,blocking_session_status,final_blocking_instance,final_blocking_session,final_blocking_session_status,username,status,LAST_CALL_ET,machine,osuser,program,sql_hash_value,to_char(logon_time,'yyyy-mm-dd hh24:mi:ss'),prev_sql_id from gv$session where sid=&sid;

--查看sql_id对应具体语句sql_text

set linesize 150 pagesize 999
select a.sql_id,a.child_number,a.sql_text,a.parsing_schema_name,a.first_load_time,a.last_load_time,a.last_active_time,a.plan_hash_value,a.sql_plan_baseline from gv$sql a where sql_id='&sql_id' order by last_active_time desc;

3、杀会话(注意核对会话信息)

执行步骤1生成的kill session语句

八、验证问题已经恢复的步骤和命令

--单个会话的锁等待队列长度,恢复为0则无堵塞

select /*+rule*/'lock_num:'||nvl(max(count(aa.holding_session)),0) lock_num  from dba_waiters aa group by aa.holding_session;

LOCK_NUM

-------------------------------------------------

lock_num:0

oracle行锁等待处理

1、接到告警,有几个SQL大量行锁等待

2、查询top event sql

set linesize 1000
select inst_id,sql_id,EVENT,sum(DELTA_TIME),count(1)
from gv$active_session_history ash
where ash.SAMPLE_TIME >= sysdate - 1/24
group by inst_id,sql_id,EVENT
order by count(1) desc,sum(DELTA_TIME) desc
OFFSET 0 ROWS FETCH NEXT 10 ROWS ONLY;

3、查SQL文本,确定被锁的表为 beneficiary_info

select

distinct sql_id,sql_text from gv$sqlarea t where t.sql_id in

('3cp53mvc1wvwj','brwrfn81as7nk');

4、查表锁情况及阻塞源

select sess.inst_id,sess.sid,sess.serial#, lo.oracle_username,lo.os_user_name, ao.object_name,lo.locked_mode,
sess.prev_sql_id,sess.sql_id,sess.status,sess.LAST_CALL_ET,sess.LOGON_TIME,sess.event,
sess.ROW_WAIT_OBJ#,sess.BLOCKING_SESSION,sess.FINAL_BLOCKING_SESSION,sess.BLOCKING_INSTANCE,
(select s.sql_text from gv$sql s where s.sql_id=sess.sql_id and rownum=1) sql_text,
(select s.sql_text from gv$sql s where s.sql_id=sess.prev_sql_id and rownum=1) sql_text1
from gv$locked_object lo,dba_objects ao,gv$session sess
where ao.object_id=lo.object_id
and lo.session_id=sess.sid
and lo.inst_id = sess.inst_id
and ao.object_name=upper('beneficiary_info');

发现有大量阻塞,且没有明确的阻塞源

5、sys登录数据库做hang分析

SYS@lucd01> oradebug setmypid
Statement processed.
SYS@lucd01> oradebug -g all hanganalyze 3
Hang Analysis in /paic/app/oracle/rdbms/diag/rdbms/lucd0/lucd01/trace/lucd01_diag_409518.trc
SYS@lucd01> exit
Disconnected from Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.5.0.0.0

6、快速截取hang分析关键信息

cnsz083184:lucd01 > grep -n "HANG ANALYSIS:" /paic/app/oracle/rdbms/diag/rdbms/lucd0/lucd01/trace/lucd01_diag_409518.trc

2445:HANG ANALYSIS:

cnsz083184:lucd01 > vi /paic/app/oracle/rdbms/diag/rdbms/lucd0/lucd01/trace/lucd01_diag_409518.trc

vi后esc :2440 跳转到相应行

从HANG ANALYSIS:开始截取,截取到主要阻塞,重点关注 Chains

most likely to have caused the hang 内容

*** 2022-07-28T14:39:38.300811+08:00
===============================================================================
HANG ANALYSIS:
instances (db_name.oracle_sid): lucd0.lucd01, lucd0.lucd02, lucd0.lucd03
oradebug_node_dump_level: 3
analysis initiated by oradebug
os thread scheduling delay history: (sampling every 1.000000 secs)
0.000000 secs at [ 14:39:37 ]
NOTE: scheduling delay has not been sampled for 0.630678 secs
0.000000 secs from [ 14:39:33 - 14:39:38 ], 5 sec avg
0.000000 secs from [ 14:38:38 - 14:39:38 ], 1 min avg
0.000000 secs from [ 14:34:38 - 14:39:38 ], 5 min avg
vktm time drift history
===============================================================================

Chains most likely to have caused the hang:
[a] Chain 1 Signature: 'SQL*Net message from client'

相关文章

Oracle如何使用授予和撤销权限的语法和示例
Awesome Project: 探索 MatrixOrigin 云原生分布式数据库
下载丨66页PDF,云和恩墨技术通讯(2024年7月刊)
社区版oceanbase安装
Oracle 导出CSV工具-sqluldr2
ETL数据集成丨快速将MySQL数据迁移至Doris数据库

发布评论