关于shared pool的深入探讨(六)
原文链接:
http://www.eygle.com/internal/shared_pool-6.htm
研究了几天shared pool,没想到忽然就撞到问题上来了.
作为一个案例写出来给大家参考一下吧.
问题起因是公司做短信群发,就是那个18万买的4000字的短信小说.
群发的时候每隔一段时间就会发生一次消息队列拥堵的情况
在数据库内部实际上是向一个数据表中记录发送日志.
我们介入来检查数据库的问题,在一个拥堵时段我开始诊断:
SQL> select sid,event,p1,p1raw from v$session_wait;
SID EVENT P1 P1RAW
---------- ---------------------------------------------------------------- ---------- --------
76 latch free 2147535824 8000CBD0
83 latch free 2147535824 8000CBD0
148 latch free 3415346832 CB920E90
288 latch free 2147535824 8000CBD0
285 latch free 2147535824 8000CBD0
196 latch free 2147535824 8000CBD0
317 latch free 2147535824 8000CBD0
2 pmon timer 300 0000012C
1 rdbms ipc message 300 0000012C
4 rdbms ipc message 300 0000012C
6 rdbms ipc message 180000 0002BF20
SID EVENT P1 P1RAW
---------- ---------------------------------------------------------------- ---------- --------
18 rdbms ipc message 6000 00001770
102 rdbms ipc message 6000 00001770
311 rdbms ipc message 6000 00001770
194 rdbms ipc message 6000 00001770
178 rdbms ipc message 6000 00001770
3 log file parallel write 1 00000001
13 log file sync 2705 00000A91
16 log file sync 2699 00000A8B
104 log file sync 2699 00000A8B
308 log file sync 2694 00000A86
262 log file sync 2705 00000A91
SID EVENT P1 P1RAW
---------- ---------------------------------------------------------------- ---------- --------
172 log file sync 2689 00000A81
169 log file sync 2705 00000A91
108 log file sync 2694 00000A86
38 log file sync 2707 00000A93
34 db file scattered read 63 0000003F
5 smon timer 300 0000012C
27 SQLNet message to client 1413697536 54435000
60 SQLNet message to client 1413697536 54435000
239 SQLNet message to client 1413697536 54435000
...ignore some idle waiting here...
11 SQLNet message from client 675562835 28444553
12 SQL*Net message from client 1413697536 54435000
170 rows selected.
在这次查询中,我发现大量的latch free等待,再次查询时这些等待消失,应用也恢复了正常.
SQL> select sid,event,p1,p1raw from v$session_wait where event not like 'SQL*Net%';
SID EVENT P1 P1RAW
---------- ---------------------------------------------------------------- ---------- --------
2 pmon timer 300 0000012C
1 rdbms ipc message 300 0000012C
4 rdbms ipc message 300 0000012C
6 rdbms ipc message 180000 0002BF20
18 rdbms ipc message 6000 00001770
102 rdbms ipc message 6000 00001770
178 rdbms ipc message 6000 00001770
194 rdbms ipc message 6000 00001770
311 rdbms ipc message 6000 00001770
3 log file parallel write 1 00000001
148 log file sync 2547 000009F3
SID EVENT P1 P1RAW
---------- ---------------------------------------------------------------- ---------- --------
273 log file sync 2544 000009F0
190 log file sync 2545 000009F1
5 smon timer 300 0000012C
14 rows selected.
接下来我们来看这些latch free等待的是哪些latch
SQL> select addr,latch#,name,gets,spin_gets from v$latch order by spin_gets;
ADDR LATCH# NAME GETS SPIN_GETS
-------- ---------- ---------------------------------------------------------------- ---------- ----------
80001398 3 session switching 111937 0
80002010 6 longop free list 37214 0
800023A0 7 cached attr list 0 0
80002628 10 event group latch 2391668 0
.....
80003F3C 28 message pool operations parent latch 3 0
.....
80006030 60 mostly latch-free SCN 19 0
80005F8C 59 file number translation table 68 0
80005F14 58 dlm cr bast queue latch 0 0
80005E8C 57 name-service request 0 0
80005E14 56 name-service memory objects 0 0
80005DA0 55 name-service namespace bucket 0 0
ADDR LATCH# NAME GETS SPIN_GETS
-------- ---------- ---------------------------------------------------------------- ---------- ----------
80005D2C 54 name-service pending queue 0 0
80005CB4 53 name-service request queue 0 0
80004E08 52 name-service entry 0 0
80008AB0 76 KCL lock element parent latch 0 0
80008A48 75 KCL instance latch 0 0
80007F18 73 redo copy 816 0
80007BBC 71 archive process latch 0 0
80007B54 70 archive control 1 0
80006A10 68 Active checkpoint queue latch 2003308 0
800064B0 66 large memory latch 0 0
80006448 65 cache protection latch 0 0
ADDR LATCH# NAME GETS SPIN_GETS
-------- ---------- ---------------------------------------------------------------- ---------- ----------
800060EC 61 batching SCNs 0 0
8000CAB0 96 global transaction 6833807 0
8000CA48 95 global tx free list 58258 0
8000C238 93 cost function 0 0
80009FCC 91 temp lob duration state obj allocation 0 0
8000995C 87 ktm global data 8118 0
80009228 84 transaction branch allocation 282388 0
80008EC4 80 begin backup scn array 6968 0
80008D54 79 loader state object freelist 42712 0
80008B80 78 KCL freelist latch 0 0
80008B18 77 KCL name table latch 0 0
ADDR LATCH# NAME GETS SPIN_GETS
-------- ---------- ---------------------------------------------------------------- ---------- ----------
8000D484 118 presentation list 0 0
8000D41C 117 session timer 855944 0
.....
8000E9D0 129 process queue 44 0
8000E900 127 query server freelists 66 0
8000FC84 140 AQ Propagation Scheduling System Load 0 0
8000E898 126 query server process 10 0
8000E27C 125 job_queue_processes parameter latch 111937 0
8000DA1C 124 NLS data objects 2 0
ADDR LATCH# NAME GETS SPIN_GETS
-------- ---------- ---------------------------------------------------------------- ---------- ----------
8000D95C 123 ncodef allocation latch 111937 0
8000D674 122 virtual circuits 0 0
8000D60C 121 virtual circuit queues 159877 0
8000D5A4 120 virtual circuit buffers 0 0
8000D4EC 119 address list 2 0
.....
8000CD70 102 Direct I/O Adaptor 2 0
.....
80002408 8 GDS latch 30 0
800092E4 85 sort extent pool 69834 1
8000EC38 132 parallel query alloc buffer 80 1
8000E968 128 error message lists 22 1
80001400 4 process group creation 2615542 2
8000EAA0 131 parallel query stats 14 2
ADDR LATCH# NAME GETS SPIN_GETS
-------- ---------- ---------------------------------------------------------------- ---------- ----------
8000CD08 101 Token Manager 1151107 2
8000CB18 97 global tx hash mapping 507846 2
80006378 63 cache buffer handles 315924 4
8000EA38 130 process queue reference 190993 5
80003E3C 26 channel handle pool latch 2391680 18
80003EAC 27 channel operations parent latch 4783425 24
80009B90 89 intra txn parallel recovery 32654 33
8000FCF8 141 fixed table rows for x$hs_session 161368 41
800012C8 1 process allocation 2391688 154
80009B28 88 parallel txn reco latch 174519 271
8000CCA0 100 library cache load lock 14947545 5958
ADDR LATCH# NAME GETS SPIN_GETS
-------- ---------- ---------------------------------------------------------------- ---------- ----------
8000C8D0 94 user lock 13086412 6078
8000914C 82 list of block allocation 120650357 12024
80006A78 69 Checkpoint queue latch 154361751 17686
80009D34 90 sequence cache 64611720 32027
80009090 81 dml lock allocation 234465024 45351
800091C0 83 transaction allocation 214227648 48345
800096AC 86 undo global data 188271244 49641
800028A0 13 enqueue hash chains 373244264 131322
80007E04 72 redo allocation 439389808 201498
80001468 5 session idle bit 2039097976 204969
80002838 12 enqueues 471338482 273695
ADDR LATCH# NAME GETS SPIN_GETS
-------- ---------- ---------------------------------------------------------------- ---------- ----------
80001330 2 session allocation 261826230 428312
800063E0 64 multiblock read objects 1380614923 1366278
800026B8 11 messages 207935758 1372606
80001218 0 latch wait list 203479569 1445342
80006310 62 cache buffers chains 3.8472E+10 2521699
8000A17C 92 row cache objects 1257586714 2555872
80007F80 74 redo writing 264722932 4458044
80006700 67 cache buffers lru chain 5664313769 30046921
8000CBD0 98 shared pool 122433688 59070585
8000CC38 99 library cache 4414533796 1037032730
142 rows selected.
SQL> select startup_time from v$instance;
STARTUP_T
---------
13-AUG-04
检查数据库启动时间
我们注意到,在当前数据库中竞争最严重的两个latch是shared pool和library cache.
显然这极有可能是SQL的过度解析造成的.
进一步我们检查v$sqlarea发现:
SQL> select sql_text,VERSION_COUNT,INVALIDATIONS,PARSE_CALLS,OPTIMIZER_MODE,PARSING_USER_ID,PARSING_SCHEMA_ID,ADDRESS,HASH_VALUE
from v$sqlarea where version_count >1000;
2
SQL_TEXT
------------------------------------------------------------------------------------------------------------------------
VERSION_COUNT INVALIDATIONS PARSE_CALLS OPTIMIZER_MODE PARSING_USER_ID PARSING_SCHEMA_ID ADDRESS HASH_VALUE
------------- ------------- ----------- ------------------------- --------------- ----------------- -------- ----------
insert into sms_log (MSGDATE,MSGTIME,MSGID,MSGKIND,MSGTYPE,MSGTYPE_MOMT,MSGLEN,MSGSTATUS,AREAID,IFIDDEST,IFIDSRC,ADDRSRC
,ADDRDEST,ADDRFEE,ADDRUSER,SERVICECODE,PLANID,FEETY