DB2问题诊断与解决: 一个由于历史文件过大,导致LOAD慢的问题

问题描述:


 客户发现数据库上所有表的LOAD操作都很慢,即使只有一行数据,也需要一分多钟。


load from S40_F_LN_LNP_TEST.ixf of ixf REPLACE INTO ODS.S40_F_LN_LNP_TEST NONRECOVERABLE
SQL3109N  The utility is beginning to load data from file
"/home/db2inst1/S40_F_LN_LNP_TEST.ixf".

SQL3500W  The utility is beginning the "LOAD" phase at time "01/14/2016
10:55:14.580838".

SQL3150N  The H record in the PC/IXF file has product "DB2    02.00",
date
"20160114", and time "105044".

SQL3153N  The T record in the PC/IXF file has name
"S40_F_LN_LNP_TEST.ixf",
qualifier "", and source "            ".

SQL3519W  Begin Load Consistency Point. Input record count = "0".

SQL3520W  Load Consistency Point was successful.

SQL3110N  The utility has completed processing.  "96" rows were read
from the
input file.

SQL3519W  Begin Load Consistency Point. Input record count = "96".

SQL3520W  Load Consistency Point was successful.

SQL3515W  The utility has finished the "LOAD" phase at time "01/14/2016
10:55:14.719068".

Number of rows read         = 96
Number of rows skipped      = 0
Number of rows loaded       = 96
Number of rows rejected     = 0
Number of rows deleted      = 0
Number of rows committed    = 96

real 1m32.16s
user 0m0.02s
sys 0m0.01s


问题诊断:

在LOAD完成之前,收集了两次如下数据:

db2pd -stack all
db2pd -eve >> db2pd.eve
db2 "list utilities show detail" >> list.uti

1.) 

db2pd -uti & db2 list utilites 均显示没有正在进行的load操作


2.)

诊断日志也显示,从load开始到load结束也只用了不到1S的时间

2016-01-14-11.02.58.024481+480 I205449A470        LEVEL: Warning
PID     : 13566098             TID  : 47797       PROC : db2sysc 0
INSTANCE: db2inst1             NODE : 000         DB   : AWPDB
APPHDL  : 0-34671              APPID: *LOCAL.db2inst1.160318130448
AUTHID  : DB2INST1
EDUID   : 47797                EDUNAME: db2agent (AWPDB) 0
FUNCTION: DB2 UDB, database utilities, sqluvtld_route_in, probe:839
DATA #1 : <preformatted>
Starting LOAD operation (S) (3) (R).

2016-01-14-11.02.58.509794+480 I205920A498        LEVEL: Warning
PID     : 13566098             TID  : 57577       PROC : db2sysc 0
INSTANCE: db2inst1             NODE : 000         DB   : AWPDB
APPHDL  : 0-34671              APPID: *LOCAL.db2inst1.160318130448
AUTHID  : DB2INST1
EDUID   : 57577                EDUNAME: db2lrid 0
FUNCTION: DB2 UDB, database utilities, DIAG_NOTE, probe:0
DATA #1 : String, 81 bytes
LOADID: 47797.2016-01-14-11.02.58.024466.0 (6;256)
Load CPU parallelism is: 15, 0

2016-01-14-11.02.58.514353+480 I206419A565        LEVEL: Warning
PID     : 13566098             TID  : 985         PROC : db2sysc 0
INSTANCE: db2inst1             NODE : 000         DB   : AWPDB
APPHDL  : 0-34671              APPID: *LOCAL.db2inst1.160318130448
AUTHID  : DB2INST1
EDUID   : 985                  EDUNAME: db2lfrm0 0
FUNCTION: DB2 UDB, database utilities, sqlulPrintPhaseMsg, probe:314
DATA #1 : String, 136 bytes
LOADID: 47797.2016-01-14-11.02.58.024466.0 (6;256)
Starting LOAD phase at 01/14/2016 11:02:58.514111. Table ODS
.S40_F_LN_LNP_TEST

2016-01-14-11.02.58.677115+480 I206985A532        LEVEL: Warning
PID     : 13566098             TID  : 57577       PROC : db2sysc 0
INSTANCE: db2inst1             NODE : 000         DB   : AWPDB
APPHDL  : 0-34671              APPID: *LOCAL.db2inst1.160318130448
AUTHID  : DB2INST1
EDUID   : 57577                EDUNAME: db2lrid 0
FUNCTION: DB2 UDB, database utilities, sqlulPrintPhaseMsg, probe:314
DATA #1 : String, 104 bytes
LOADID: 47797.2016-01-14-11.02.58.024466.0 (6;256)
Completed LOAD phase at 01/14/2016 11:02:58.676938.

3.)那么剩下的一分多钟时间它在做什么呢?为什么迟迟没有返回,于是查看stack,stack显示它正在写历史文件

shows it's writing to history file.

13566098.47797.000.stack.txt

<StackTrace>
-------Frame------ ------Function + Offset------
0x0900000000029F60 lseek64 + 0x20
0x090000000C6B62B8 lseek64@glue7B0 + 0x98
0x090000000ACD3A0C sqloseek + 0x128
0x090000000D0039C0
sqluhReadEntry__F12SQLO_FHANDLEP11SQLUH_ENTRYP14SQLUH_WORKAREAPcN34PUiC1
9SQLUH_ENTRY_VERSION + 0x154
0x090000000D0037BC sqluhUpdate + 0xCCC
0x090000000CC39ABC
sqluCommitLoadEntryInHistoryFile__FP8sqeAgentP17SQLU_LOAD_HF_INFO + 0xF4
0x090000000CC316B0
sqlu_process_pending_operation__FP8sqeAgentiPcP9SQLP_LSN8P12SQLP_PENLIST
PUiP15SQLD_RECOV_INFO + 0x1024
0x090000000CC7BBA0 sqlptppl__FP8sqeAgent + 0x490
0x090000000CC7B660 sqlptppl__FP8sqeAgent@glue15F6 + 0x74
0x090000000ACF4274 sqlpxcm1__FP8sqeAgentP15SQLXA_CALL_INFOi + 0x224
0x090000000E6BD8A4 sqlrrcom_dps__FP8sqlrr_cbiT2P15SQLXA_CALL_INFO +
0x16C
0x090000000E5F27BC sqlrrcom__FP8sqlrr_cbiT2 + 0x1C8
0x090000000E5F341C sqlrr_commit__FP14db2UCinterface + 0xD8
0x090000000CC8FAA8 sqluv_commit__FP8sqeAgentPiP5sqlcai + 0x450
0x090000000D093440
sqlu_register_table_load__FP26sqluCLoadRequestDescriptorP24sqlusCLoadMPP
CoordinatorP15sql_static_dataPb + 0x27C
0x090000000D0AB8F8 sqluvtld_route_in__FP12SQLE_DB2RA_T + 0x889C
0x090000000D0CE118 sqluvtld_route_in__FP12SQLE_DB2RA_T@glue17E3 + 0x74
0x090000000B10632C sqlesrvr__FP14db2UCinterface + 0x4F4
0x090000000E3FB6E8 sqleMappingFnServer__FP5sqldaP5sqlca + 0x1A4
0x090000000E3FAFE8
sqlerKnownProcedure__FiPcPiP5sqldaT4P13sqlerFmpTableP8sqeAgentP5sqlca +
0x308
0x090000000E3F8C48 sqlerCallDL__FP14db2UCinterfaceP9UCstpInfo + 0x51C
0x090000000E3F92B8
.sqljs_ddm_excsqlstt.fdpr.clone.215__FP14db2UCinterfaceP13sqljDDMObject
+ 0xE4
0x090000000E5F92BC
sqljsParseRdbAccessed__FP13sqljsDrdaAsCbP13sqljDDMObjectP14db2UCinterfac
e + 0xFC
0x090000000E5F971C
.sqljsParse.fdpr.clone.77__FP13sqljsDrdaAsCbP14db2UCinterfaceP8sqeAgentb
+ 0x240
0x090000000E5E1D44 @63@sqljsSqlam__FP14db2UCinterfaceP8sqeAgentb + 0xAA8
0x090000000E3D6658 @63@sqljsDriveRequests__FP8sqeAgentP14db2UCconHandle
+ 0xA0
0x090000000E3D62CC @63@sqljsDrdaAsInnerDriver__FP18SQLCC_INITSTRUCT_Tb +
0x260
0x090000000E3D5D94 sqljsDrdaAsDriver__FP18SQLCC_INITSTRUCT_T + 0x1EC
0x090000000E4B43D8 RunEDU__8sqeAgentFv + 0x298
0x090000000E4A9D58 EDUDriver__9sqzEDUObjFv + 0xE4
0x090000000E4B1CFC sqloEDUEntry + 0x260
</StackTrace>


问题解决:

 prune history文件之后,问题得到解决

猜你喜欢

转载自blog.csdn.net/qingsong3333/article/details/50676290