Author:
fuyuncat
Source:
www.HelloDBA.com
1. Parallel Trace Settings
In OLAP or DSS systems, or when processing large batches of data in an OLTP system, we often rely on parallel execution to make the most of the hardware and return results in less time. Because many factors affect parallelization, such as the number of CPUs, system parameter settings, and partitioning, unexpected problems can arise during parallel execution: a statement may fail to run in parallel, or the degree of parallelism (DOP, Degree Of Parallelism) may differ from what was expected. In such cases we can use a parallel trace to analyze the parallel execution and locate the problem.
Before 9i, parallel sessions could be traced by setting events 10384 through 10399; from 9i onward, this can be done by setting "_px_trace" in the session:
alter session set "_px_trace"=[[Verbosity,]area],[[Verbosity,]area],..,[time];
For example:
alter session set "_px_trace"=high,execution,medium,execution,time;
The parameter accepts the following values:
Verbosity:
- high
- medium
- low
Area:
- scheduling (same as some of the 10384 and 10390 events)
- execution (same as some of the 10390 events)
- granule (same as some of the 10390 and 10391 events)
- messaging (same as some of the 10392 and 10393 events)
- buffer (same as the 10399 event)
- compilation
- all - all of the above
- none - none of the above
Timing:
- time
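Different areas can be combined at different verbosity levels in a single statement. A sketch following the syntax above (note that "_px_trace" is an undocumented parameter, so try it on a test system first):

```sql
-- Trace scheduling in detail, messaging only lightly, and add timestamps.
ALTER SESSION SET "_px_trace" = "high","scheduling","low","messaging","time";

-- Turn tracing off again when finished:
ALTER SESSION SET "_px_trace" = "none";
```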
2. Trace File Analysis
Now let us generate a trace file and go through it section by section (the inline annotations are the explanations):
SQL> ALTER SESSION FORCE PARALLEL DML;
Session altered.
SQL> alter session set "_px_trace"= "all";
Session altered.
SQL> UPDATE /*+ PARALLEL(CT,10) */ CS2_CT_MVMT CT
2 SET CT.LAST_MOD_DT = NVL((SELECT TL.MSG_GMT_DT
3 FROM CS2_TXN_LOG TL
4 WHERE CT.MSG_ID = TL.MSG_ID),CT.LAST_MOD_DT)
5 WHERE CT.MSG_ID > 9000000000000000000;
ERROR at line 1:
ORA-01013: user requested cancel of current operation
SQL> alter session set "_px_trace"=”none”;
Session altered.
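The trace is written on the server under user_dump_dest, with the OS pid of the shadow process in the file name (as in the header below). A query along these lines, using only standard v$ views, finds both for the current session:

```sql
-- OS pid of the current server process plus the trace directory;
-- the trace file is named <instance>_ora_<spid>.trc.
SELECT p.spid, par.value AS trace_dir
  FROM v$process p, v$session s, v$parameter par
 WHERE p.addr = s.paddr
   AND s.sid = SYS_CONTEXT('USERENV', 'SID')
   AND par.name = 'user_dump_dest';
```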
· File header, with basic information about the Oracle DB server:
/home/oracle/admin/cs2prd/udump/cs2prd1_ora_27073.trc
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP and Data Mining options
ORACLE_HOME = /home/oracle/RAC/10.2.0
System name: Linux
Node name: pmrac01.cargosmart.com
Release: 2.6.9-55.ELsmp
Version: #1 SMP Fri Apr 20 16:36:54 EDT 2007
Machine: x86_64
Instance name: cs2prd1
Redo thread mounted by this instance: 1
Oracle process number: 153
Unix process pid: 27073, image: oracle@pmrac01.cargosmart.com (TNS V1-V3)
· The following is the degree-of-parallelism information:
*** ACTION NAME:() 2009-03-16 17:31:25.122
*** MODULE NAME:(SQL*Plus) 2009-03-16 17:31:25.122
*** SERVICE NAME:(SYS$USERS) 2009-03-16 17:31:25.122
*** SESSION ID:(1802.17915) 2009-03-16 17:31:25.122
kkfdapdml --- function name
pgadep:0 pdml mode:2 PQ allowed DML allowed not autonomous => allowed
--> This information shows that parallel DML has been enabled
The table : 65109 ---> object_id of the object involved
kkfdPaPrm
DOP = 10 (computed from hint/dictionary) ---> the DOP and where it comes from; here it is from the HINT or the data dictionary (i.e., the object's attribute settings)
use hint DOP on table ---> the DOP from the HINT is what is finally used
kkfdPaForcePrm force DML d = 32767 ---> when PDML is forced, the maximum possible DOP is 32767
The table : 64796 ---> object_id of the next object involved
… …
kxfrDefaultDOP --> function that computes the default DOP
DOP Trace -- compute default DOP
# CPU = 8 ---> number of CPUs
Threads/CPU = 2 ("parallel_threads_per_cpu") ---> value of parallel_threads_per_cpu
default DOP = 16 (# CPU * Threads/CPU) --> default DOP of a single instance = number of CPUs * parallel_threads_per_cpu
default DOP = 32 (DOP * # instance) --> default DOP for the RAC = single-instance default DOP * number of RAC instances
… …
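The default-DOP arithmetic in the kxfrDefaultDOP lines above can be reproduced directly. A minimal sketch using the numbers from this trace (8 CPUs, parallel_threads_per_cpu = 2, 2 RAC instances):

```shell
#!/bin/bash
# Default DOP as computed in the kxfrDefaultDOP trace lines above.
cpus=8               # "# CPU" in the trace
threads_per_cpu=2    # parallel_threads_per_cpu
instances=2          # number of RAC instances

inst_dop=$((cpus * threads_per_cpu))   # per-instance default DOP
rac_dop=$((inst_dop * instances))      # cluster-wide default DOP

echo "default DOP = $inst_dop (# CPU * Threads/CPU)"
echo "default DOP = $rac_dop (DOP * # instance)"
```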
The table : 65109
kkfdPaPrm
DOP = 10 (computed from hint/dictionary)
use hint DOP on table
kkfdtip
serial - DOP = 1 (from kkfdPaPrm()) or index not part'd ---> the object is not partitioned
The table : 65109
… …
kkfdPiAlo
DOP = 4 (kkfdpideg)
Part. No. = 0 (kkfdpiPnum)
No Pruning Info
Pararllel Operation Type = 6 (kkfdpiOpType)
Flags = 48 (kkfdpiflg)
min. tranx free list = 65535 (kkfdpimgi) --> minimum transaction freelists, which caps the maximum DOP
… …
· The following shows the request for and allocation of parallel processes:
kxfralo
DOP trace -- requested thread from best ref obj = 10 (from kxfrIsBestRef()) ---> threads are requested according to the DOP of the best reference object
kxfralo
threads requested = 10 (from kxfrComputeThread()) ---> number of threads requested
kxfralo
adjusted no. threads = 10 (from kxfrAdjustDOP()) ---> number of threads after adjustment by the relevant algorithms
kxfralo
about to allocate 10 slaves --> number of slaves finally to be allocated
kxfrAllocSlaves
DOP trace -- call kxfpgsg to get 10 slaves ---> call the function kxfpgsg to acquire the slave processes
kxfpgsg
num server requested = 10
kxfplist
Getting instance info for open group
kxfpiinfo
inst[cpus:mxslv] --- number of CPUs and parallel_max_servers for each instance; this is a two-instance RAC environment
1[8:160] 2[8:0] -- instance 1 has 8 CPUs and parallel_max_servers=160; instance 2 has 8 CPUs and parallel_max_servers=0
kxfpclinfo
inst(load:user:pct:fact)aff ---> load information for each instance; Oracle uses it for load balancing when allocating parallel processes
1(6:0:100:133) 2(7:0:100:114)
---> Instance 1: load 6, 0 parallel users, 100% of system resources usable (i.e., no resource limit), load factor 133
---> Instance 2: load 7, 0 parallel users, 100% of system resources usable (no resource limit), load factor 114
---> The key figure here is the load factor, computed from the first three values; it indicates how much additional load the machine can accept: the larger the number, the more load it can take
kxfpAdaptDOP
Requested=10 Granted=10 Target=128 Load=13 Default=32 users=0 sets=2
---> 10 processes were requested, and all 10 were granted
---> Given the resources of the whole RAC, a total load of 128 is considered the optimal target. If this value were smaller than the number requested, an ORA error might be raised or the number of processes reduced automatically (e.g., when PARALLEL_AUTOMATIC_TUNING or PARALLEL_ADAPTIVE_MULTI_USER is set to true)
---> The current load is 13 (6+7)
---> The default DOP is 32
---> The current number of parallel users is 0
---> The number of process sets requested is 2
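The load factor in the kxfpclinfo line is not documented, but every factor shown in this article (133 and 114 here; 80 and 160 in the case study below) is consistent with #CPU * pct / load, truncated. A sketch of that guess; the formula is an assumption, not Oracle's published algorithm:

```shell
#!/bin/bash
# Hypothetical reconstruction of the kxfpclinfo load factor:
# factor = cpus * pct / load (integer division). This is only a guess
# that happens to match every factor printed in this article's traces.
load_factor() {  # usage: load_factor <cpus> <load> <pct>
  echo $(( $1 * $3 / $2 ))
}

load_factor 8 6 100   # instance 1, load 6: prints 133
load_factor 8 7 100   # instance 2, load 7: prints 114
```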
kxfpgsg
load adapt num servers requested to = 10 (from kxfpAdaptDOP())
kxfpsori
Sorted: 1(6:133) 2(7:114) ---> the instances are sorted by load; processes are allocated preferentially from the instances that sort first
kxfrDefaultDOP
DOP Trace -- compute default DOP
# CPU = 8
Threads/CPU = 2 ("parallel_threads_per_cpu")
default DOP = 16 (# CPU * Threads/CPU)
default DOP = 16 (DOP * # instance)
kxfpldbl
compute default DOP = 64 (from kxfrDefaultDOP()) for loadbalancing
kxfpunit
var=1 limit=64 use_aff=0 aff_num=0 unit=5 ---> allocation requests for processes are issued in units of 5 at a time
kxfpgsg
getting 2 sets of 10 threads, client parallel query execution flg=0x30 ---> the coordinator requests 2 slave sets (see the statement's execution plan)
Height=0, Affinity List Size=0, inst_total=2, coord=1
Insts 1 2 ---> the number of processes each slave set intends to request from each instance
Threads 10 0 ---> 10 requested from instance 1, 0 from instance 2
--- Start allocating processes for the first slave set
kxfpg1srv
trying to get slave P000 on instance 1 ---> trying to acquire slave process P000 on instance 1
kxfpg1sg
Got It. 1 so far. ---> a process was acquired; 1 process allocated so far
……
kxfpg1sg
Got It. 10 so far. ---> all 10 processes for the first slave set have been allocated
kxfpg1sg
got 10 servers (sync), returning...
--- Start allocating processes for the second slave set
kxfpg1srv
trying to get slave P000 on instance 1
kxfpg1srv
trying to get slave P001 on instance 1
kxfpg1srv
trying to get slave P002 on instance 1
… …
kxfpg1sg
Got It. 10 so far. ---> all 10 processes for the second slave set have been allocated
kxfpg1sg
got 10 servers (sync), returning...
GROUP GET
Acquired 20 slaves in 2 sets q serial 3454977 --- a total of 20 slave processes were allocated for the 2 sets
P000 inst 1 spid 27075
P001 inst 1 spid 27077
P002 inst 1 spid 27084
P003 inst 1 spid 27086
P004 inst 1 spid 27088
P005 inst 1 spid 27090
P006 inst 1 spid 27092
P007 inst 1 spid 27094
P008 inst 1 spid 27096
P009 inst 1 spid 27098
P010 inst 1 spid 27100
P011 inst 1 spid 27102
P012 inst 1 spid 27108
P013 inst 1 spid 27110
P014 inst 1 spid 27112
P015 inst 1 spid 27114
P016 inst 1 spid 27116
P017 inst 1 spid 27118
P018 inst 1 spid 27120
P019 inst 1 spid 27122
kxfpgsg
got 10 threads on 1 instance(s), avg height=10 ---> each slave set was allocated 10 processes from a single instance, so the average height is 10. "Height" here is the number of processes a slave set has on one instance: 10 processes were allocated from a single instance, hence 10. If a set had instead been given 5 processes on each of 2 instances, the height would be 5. Of course, this number cannot be lower than the unit size shown earlier.
Insts 1 2
Svrs 20 0 ---> in the end, the 2 slave sets were allocated 20 processes in total from instance 1 and 0 from instance 2 (parallel_max_servers on instance 2 is 0, so no parallel process can be allocated there)
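While the statement is running, the slaves listed under GROUP GET above can also be watched from another session through the standard gv$ views, for example:

```sql
-- Which slave processes exist on each instance and their state.
SELECT inst_id, server_name, status, pid, spid
  FROM gv$px_process
 ORDER BY inst_id, server_name;

-- Map the slaves back to their query coordinator and slave set.
SELECT inst_id, server_set, sid, qcsid, qcinst_id, degree, req_degree
  FROM gv$px_session
 ORDER BY inst_id, server_set, sid;
```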
· The following part shows how the optimizer parses the parallelized query:
--- Parsing in slave set 1
kxfxcPutSession --> this part exists only in 10g and later; it shows the optimizer parameters that influence parallel execution under the CBO.
Send Compilation environment:
optimizer_mode_hinted = false
optimizer_features_hinted = 0.0.0
parallel_execution_enabled = true
parallel_query_forced_dop = 0
parallel_dml_forced_dop = default
parallel_ddl_forced_degree = 0
… …
--- The parsing itself follows
flashback session env:scn: 0xffff.ffffffff
kxfxpnd
i: 0, cnt: 10
kxfxpnd
calling kxfxmdmp
enqueueing:
MSG( -->, KXFXOparse, STREAM_HINT )
kxfxmh at addr: 0x16368a760
kxfxmh->kxfxmhh.kxfmhtyp: KXFMHFSX
kxfxmh->kxfxmhmsz: 1912
kxfxmh->kxfxmhflg: 0x5
kxfxmh->kxfxmhotyp: , KXFXOparse
kxfxmp->kxfxmppxid: 1
kxfxmp->kxfxmptqidb: 3583324
kxfxmp->kxfxmpssno: 1
kxfxmp->kxfxmpsno: 0
kxfxmp->kxfxmpnslv: 10
kxfxmp->kxfxmpntqd: 2
kxfxmp->kxfxmpnpqd: 10
kxfxmp->kxfxmpnss: 0
kxfxmp->kxfxmpndfo: 2
kxfxmp->kxfxmpsqllen: 251
kxfxmp->kxfxmpgvsqllen: 0
kxfxmp->kxfxmpoct: 6
kxfxmp->kxfxmpflg: TOP/NO PEEK/TRANS/CVM OFF/PDML ON/KKOISRECUR OFF
kxfxmp->kxfxmpqcxid: xid: 0x004a.002.00087961
kxfxmp->kxfxmpbuf: 0x16368a858
… …
kxfxcp1
Sending parse to slave set 1: ---> the statement parsed in slave set 1
User sqllen sent from QC = 251
UPDATE /*+ PARALLEL(CT,10) */ CS2_CT_MVMT CT
SET CT.LAST_MOD_DT = NVL((SELECT TL.MSG_GMT_DT
FROM CS2_TXN_LOG TL
WHERE CT.MSG_ID = TL.MSG_ID),CT.LAST_MOD_DT)
WHERE CT.MSG_ID > 9000000000000000000
……
Next is the parsing process in slave set 2
kxfxcPutSession
Send Compilation environment:
optimizer_mode_hinted = false
optimizer_features_hinted = 0.0.0
parallel_execution_enabled = true
parallel_query_forced_dop = 0
parallel_dml_forced_dop = default
parallel_ddl_forced_degree = 0
parallel_ddl_forced_instances = 0
_query_rewrite_fudge = 90
optimizer_features_enable = 10.2.0.3
_optimizer_search_limit = 5
……
3. Case Study
While tuning a massive data update (see the article "Making the Statement Run Sideways: Parallel Optimization of Massive Data Updates"), something puzzling happened: sometimes no parallel session could be found in the session information at all, as if parallelization had "disappeared". So I added the following statement to the test script to turn on parallel tracing.
alter session set "_px_trace"= "all";
At last the situation occurred again, and the trace file was captured successfully. Let us open the file and see what happened.
First, look at the DOP section:
kkfdapdml
pgadep:0 pdml mode:2 PQ allowed DML allowed not autonomous => allowed
The table : 65109
kkfdPaPrm
DOP = 6 (computed from hint/dictionary)
use hint DOP on table
……
kxfralo
DOP trace -- requested thread from best ref obj = 6 (from kxfrIsBestRef())
kxfralo
threads requested = 6 (from kxfrComputeThread())
kxfralo
adjusted no. threads = 6 (from kxfrAdjustDOP())
PDML is clearly enabled, and the DOP is computed correctly. The problem shows up next:
kxfpiinfo
inst[cpus:mxslv]
1[8:160] 2[8:160]
Our environment is a two-node RAC with parallel_max_servers normally set to 0 on both nodes. For the test I temporarily changed parallel_max_servers to 160 in the test script, and I ran the test on instance 1 only. Yet here both nodes show 160.
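In a RAC, a quick cross-instance check of the parameter would have revealed this immediately, for example:

```sql
-- parallel_max_servers on every instance.
SELECT inst_id, value
  FROM gv$parameter
 WHERE name = 'parallel_max_servers'
 ORDER BY inst_id;
```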
Continue with the trace information:
kxfpclinfo
inst(load:user:pct:fact)aff
1(10:0:100:80) 2(5:0:100:160)
This shows that instance 1's remaining load capacity is far lower than instance 2's, and the information that follows confirmed my suspicion: all the parallel processes had been allocated on instance 2!
kxfpgsg
getting 2 sets of 6 threads, client parallel query execution flg=0x30
Height=0, Affinity List Size=0, inst_total=2, coord=1
Insts 2 1
Threads 6 0
… …
GROUP GET
Acquired 12 slaves in 2 sets q serial 3618817
P000 inst 2 spid 27299
P001 inst 2 spid 27305
P002 inst 2 spid 27307
P003 inst 2 spid 27309
P004 inst 2 spid 27311
P005 inst 2 spid 27396
P006 inst 2 spid 27398
P007 inst 2 spid 27400
P008 inst 2 spid 27402
P009 inst 2 spid 27404
P010 inst 2 spid 29654
P011 inst 2 spid 29656
kxfpgsg
got 6 threads on 1 instance(s), avg height=6
Insts 2 1
Svrs 12 0
The trace file makes it easy to see why no parallel processes were visible on instance 1: its load was high, so load balancing placed the parallel processes on instance 2.
The root cause is that parallel_max_servers on instance 2 had been changed. But how could that parameter have been modified? Look at the test script:
Test.sh:
… …
sqlplus -s /nolog << EOF
connect cs2_party_owner/cs2_party_owner@test
@test.sql
EOF
Test.sql:
ALTER SYSTEM SET parallel_max_servers=160 scope=memory;
…..
UPDATE /*+ parallel(CT, 6) */ ……
ALTER SYSTEM SET parallel_max_servers=0 scope=memory;
The script temporarily changes parallel_max_servers and sets it back when it finishes. We connect to the database through the TNS alias test, which I had configured to connect to instance 1. Could the contents of test have been modified? Checking them showed they had indeed been changed...
test =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = rac01)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = rac02)(PORT = 1521))
(LOAD_BALANCE = yes)
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = prd)
)
)
Someone had changed it to load-balancing mode. So whenever the load on rac01 was high, the script was sent to rac02 to run. On top of that, during testing the script sometimes failed or was killed manually with Ctrl+C, so the final statement that sets the parameter back to 0 never ran, and that is how the parameter on rac02 ended up modified.
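One way to harden such a test script against failures and Ctrl+C is a shell EXIT trap, so the restore statement always runs. A sketch; run_sql is a hypothetical stand-in for the sqlplus here-document in Test.sh:

```shell
#!/bin/bash
# run_sql is a placeholder: the real script would pipe the statement
# into sqlplus; here it just echoes it for illustration.
run_sql() { echo "SQL> $1"; }

restore() {
  run_sql "ALTER SYSTEM SET parallel_max_servers=0 scope=memory;"
}
# An EXIT trap fires on normal completion, on errors, and when the
# shell is killed by an untrapped Ctrl+C (SIGINT), so the parameter
# is always set back to 0.
trap restore EXIT

run_sql "ALTER SYSTEM SET parallel_max_servers=160 scope=memory;"
run_sql "UPDATE /*+ parallel(CT, 6) */ ..."
```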