How to capture and dump a task that is interrupting the CPU
Symptom:
Occasionally, the CPU on a Dialogic® Integrated Media Gateway (IMG) may reach an interrupted state of 80% or higher. The IMG Gateway then rejects all calls until that figure falls back below 80%.
This article explains the diagnostic steps required to capture and dump the task that is interrupting the CPU, and also describes how to induce a 'fault' on the IMG Gateway in order to provide potentially useful additional information on the root cause of the CPU overload.
Reasons for the issue:
The CPU can enter this state for several reasons, including:
1. GCL congestion: in a call trace you will see 'RCVD Clear, cause= 42' from GCL.
2. Host congestion: the log shows 'zNET(W) SERV: serv_add_entry_to_queue failed exstat 55' and 'zNET(W) SERV: opening socket failure 0x00000037'.
These errors mean that the host machine is congested and is not responding with ACKs. The TCP window closes and the IMG starts buffering messages.
Eventually the IMG Gateway runs out of buffers, which causes the 'opening socket failure 0x00000037' error.
3. A task is stuck in a 'spinlock' and never clears to become idle.
4. '** FAILED Enqueuing Message to Queue qModm, Rtn: x3d0002': a CLI indication that the CPU is in overload.
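The log signatures listed above can be searched for mechanically in a saved trace or log capture. The following is a minimal Python sketch, not a Dialogic tool: the function name `classify_log_lines` and the cause labels are hypothetical, and the patterns are simply the message fragments quoted in this article. Reason 3 (a task in 'spinlock') has no log signature here and is found through the CPU dump described under Diagnosis.

```python
import re

# Message fragments quoted in this article, mapped to the cause they suggest.
# The labels are illustrative wording, not official Dialogic terminology.
SIGNATURES = [
    (re.compile(r"RCVD Clear, cause= ?42"), "GCL congestion"),
    (re.compile(r"serv_add_entry_to_queue failed exstat 55"), "host congestion"),
    (re.compile(r"opening socket failure 0x00000037"), "host congestion (out of buffers)"),
    (re.compile(r"FAILED Enqueuing Message to Queue"), "CPU overload"),
]


def classify_log_lines(lines):
    """Return (line_number, label, text) for each line matching a known signature."""
    hits = []
    for number, line in enumerate(lines, start=1):
        for pattern, label in SIGNATURES:
            if pattern.search(line):
                hits.append((number, label, line.strip()))
    return hits


if __name__ == "__main__":
    sample = [
        "zNET(W) SERV: serv_add_entry_to_queue failed exstat 55",
        "zNET(W) SERV: opening socket failure 0x00000037",
        "** FAILED Enqueuing Message to Queue qModm, Rtn: x3d0002",
    ]
    for number, label, text in classify_log_lines(sample):
        print(f"{number}: [{label}] {text}")
```

A match only suggests which of the reasons above to investigate first; it does not replace the diagnostic steps that follow.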
Diagnosis:
1. Connect to the IMG management IP address using Telnet (IMG 1004 / IMG 1010) or SSH (IMG 2020), and log in with admin / password (IMG 1004 / IMG 1010) or dialogic / Dial0gic (IMG 2020). Then enter the advanced debug menu by typing d and log in with the credentials admin / password:
username: admin
Password:
*** Clearing GEL Print Buffer ***
WELCOME TO DIALOGIC IMG 1010 DIAGNOSTICS
===> IMG IP Addr : 10.129.52.146
===> IMG MAC Addr : 00201C111713
===> IMG Name : DS3_IMG
===> IMG ID : 0
===> IMG FileSvr IP Addr : 10.129.52.133
===> Press 'h' for diag menu options
DS3_IMG>d
username: admin
password:
Advanced Debug>
2. Enter the OS menu and dump the CPU tasks and their states with the o -> C commands:
Advanced Debug> o
os> C
NAME ENTRY TID PRI total % (ticks) delta % (ticks)
-------- -------- ----- --- --------------- ---------------
tExcTask 6585008 0 97% ( 0) 0% ( 0)
tJobTask d2dd730 0 0% ( 0) 0% ( 0)
tLogTask logTask d2ddcc0 0 0% ( 0) 0% ( 0)
tNbioLog d321020 0 0% ( 0) 0% ( 0)
tWDT zOsWdt_tas d7ec790 0 0% ( 0) 0% ( 0)
tWvRBuffMgr wvRBuffMgr e4e6620 0 0% ( 0) 0% ( 0)
tM256poll M256_poll f587d90 0 0% ( 1) 0% ( 1)
tOSMON zOsMon_tas f35ad90 1 0% ( 0) 0% ( 0)
tWdbTask wdbTask d2eb010 3 0% ( 0) 0% ( 0)
tErfTask d328c70 10 0% ( 0) 0% ( 0)
tNetTask netTask d32eaf0 50 0% ( 0) 0% ( 0)
tTelnetd telnetd d2e5cd0 55 0% ( 0) 0% ( 0)
tSYSCTL sysMain ea4ed90 55 0% ( 0) 0% ( 0)
tTelnetOut_e telnetOutT f09a670 55 0% ( 0) 0% ( 0)
tTelnetIn_ef telnetInTa f248d00 55 0% ( 0) 0% ( 0)
tCciTask cciTask d2eb650 75 0% ( 0) 0% ( 0)
tPRCMON procMonMai efe6d90 100 0% ( 0) 0% ( 0)
tCHIPMON chipMonMai f0ccd90 100 0% ( 0) 0% ( 0)
pwrMon zHwPower_m 1fdfa590 100 0% ( 0) 0% ( 0)
TDSPx dsp_DSPcSe efb4b80 104 0% ( 1) 0% ( 1)
tNET e83bd90 105 0% ( 0) 0% ( 0)
tNETNFS e869d90 105 0% ( 0) 0% ( 0)
TCPS tcpSendTas e8f6d90 105 0% ( 0) 0% ( 0)
TCPR tcpRcvTask e925d90 105 0% ( 0) 0% ( 0)
tEXS z_hmiExs_m e9ced90 105 0% ( 0) 0% ( 0)
tRCOMM z_rcomm_ma e9d8d90 105 0% ( 0) 0% ( 0)
tRRCVR z_rcommRcv e9e2d90 105 0% ( 0) 0% ( 0)
tFILE ea1ed90 105 0% ( 0) 0% ( 0)
tSYSNTP sysCtlNtp_ ea7ed90 105 0% ( 0) 0% ( 0)
tSysRM sysRm_main eb0ad90 105 0% ( 0) 0% ( 0)
tSS7Ctl ss7ctl_mai eb66d90 105 0% ( 0) 0% ( 0)
tNFSMON nfsMonMain efb8d90 105 0% ( 0) 0% ( 0)
TDSPr dsp_DSPcRe d2d2af0 105 0% ( 1) 0% ( 1)
tDCC dccMain ef8ad90 154 0% ( 0) 0% ( 0)
tL4CFG cpL4Cfg_ma f5cfd90 154 0% ( 0) 0% ( 0)
EVT0 smon_event 1226ed90 154 0% ( 0) 0% ( 0)
tUTPortTmr e8f14a0 154 0% ( 0) 0% ( 0)
SERV tcpServerT e898d90 155 0% ( 0) 0% ( 0)
CLNT tcpClientT e8c7d90 155 0% ( 0) 0% ( 0)
tFPanel z_hmifp_ma e958d90 155 0% ( 2) 0% ( 2)
tDNS zNetDns_ma e987d90 155 0% ( 0) 0% ( 0)
tCLI z_cli_main ea17d90 155 0% ( 0) 0% ( 0)
tCFG cfgMain eaaed90 155 0% ( 0) 0% ( 0)
tL3 l3c_main eb38d90 155 0% ( 0) 0% ( 0)
tSCFG scfg_main eb94d90 155 0% ( 0) 0% ( 0)
tMUN munMain ec3ed90 155 0% ( 0) 0% ( 0)
tCFC cfcMain ec6cd90 155 0% ( 0) 0% ( 0)
tRXS rxsMain ec9ad90 155 0% ( 0) 0% ( 0)
tTIMC timcMain ecc8d90 155 0% ( 0) 0% ( 0)
tL4 l4Main ecf6d90 155 0% ( 0) 0% ( 0)
txTsi xTsi_main ed24d90 155 0% ( 0) 0% ( 0)
tfacPfc facPfc_mai ed52d90 155 0% ( 0) 0% ( 0)
tModm modmMain ed80d90 155 0% ( 0) 0% ( 0)
tVcfg vcfgMain edaed90 155 0% ( 0) 0% ( 0)
tVppl vpplMain eddcd90 155 0% ( 0) 0% ( 0)
tsigH323 sigH323_ma ee0cd90 155 0% ( 0) 0% ( 0)
tH323CFG sigH323_cf ee3cd90 155 0% ( 0) 0% ( 0)
tH323L3P sigH323_l3 ee70d90 155 0% ( 0) 0% ( 0)
tDSP dspMain ee9ed90 155 0% ( 0) 0% ( 0)
tDCFG dspCfgMain eed2d90 155 0% ( 0) 0% ( 0)
tTC tcMain ef2ed90 155 0% ( 0) 0% ( 0)
tPTC ptcMain ef5cd90 155 0% ( 0) 0% ( 0)
tRAC racMain f014d90 155 0% ( 0) 0% ( 0)
tECH echoMain f042d90 155 0% ( 0) 0% ( 0)
tFAX faxMain f070d90 155 0% ( 0) 0% ( 0)
tPVDAMD pvdAmdMain f09ed90 155 0% ( 0) 0% ( 0)
tSUBR subrateMai f0fcd90 155 0% ( 0) 0% ( 0)
tXNG vxngMain f12ad90 155 0% ( 0) 0% ( 0)
tsigSIP sip_rfc326 f166d90 155 0% ( 0) 0% ( 0)
tISDN sigISDN_ma f194d90 155 0% ( 0) 0% ( 0)
tICFG isdn_cfg_m f1c2d90 155 0% ( 0) 0% ( 0)
tIPRI ipri_main f1f0d90 155 0% ( 0) 0% ( 0)
tIX25 sigISDN_x2 f21ed90 155 0% ( 0) 0% ( 0)
tIL2 l2_main f24cd90 155 0% ( 0) 0% ( 0)
tIMgmt sigISDN_mg f27ad90 155 0% ( 0) 0% ( 0)
tsigCAS sigCAS_mai f2a8d90 155 0% ( 0) 0% ( 0)
tCASCFG cas_cfg_ma f2d6d90 155 0% ( 0) 0% ( 0)
tCASL3 cas_ppl_ma f304d90 155 0% ( 0) 0% ( 0)
tGCL gcl_main f339d90 155 0% ( 0) 0% ( 0)
tCfcMiscISRH miscInt_mi f594d90 155 0% ( 0) 0% ( 0)
tSysTime ltimc_ISRH eec8900 155 0% ( 0) 0% ( 0)
tMCC mcc_main f5fdd90 155 0% ( 0) 0% ( 0)
tDS3Loopback 12086d90 155 0% ( 0) 0% ( 0)
tLBx 1208bd90 155 0% ( 0) 0% ( 0)
TDSPMsgDisp dsp_MSG_Di ee665e0 155 0% ( 0) 0% ( 0)
tCsmeRx f010a40 155 0% ( 0) 0% ( 0)
tGtlTx f010d60 155 0% ( 0) 0% ( 0)
tGtlTimer f03e1d0 155 0% ( 0) 0% ( 0)
tGtlRx f03eb80 155 0% ( 0) 0% ( 0)
tMIS misMain ef00d90 156 0% ( 0) 0% ( 0)
tSnmpTmr e80b4b0 204 0% ( 0) 0% ( 0)
tSnmpd e80bb40 204 0% ( 0) 0% ( 0)
tSysSts sysSts_mai eab5d90 205 0% ( 0) 0% ( 0)
tLOCO loco ec10d90 205 0% ( 0) 0% ( 0)
tFanPoll zHwFan_pol ee36c10 250 0% ( 0) 0% ( 0)
tTmpMon zHwTemp_Mo ee66d00 250 0% ( 0) 0% ( 0)
KERNEL 0% ( 0) 0% ( 0)
INTERRUPT 97% ( 0) 0% ( 0)
IDLE 3% ( 333) 98% ( 333)
TOTAL 98% ( 338) 98% ( 338)
os>
Notice at the bottom of the CPU dump that only 3% of the CPU is idle and 97% is interrupted, which indicates that a task is using the majority of the CPU.
Scroll through the tasks and check the total % field: the task tExcTask (task ID 6585008, priority 0) is using 97%.
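When the dump is long, the busiest task can also be picked out of a saved capture with a short script. The following is a minimal Python sketch that assumes the column layout shown in the dump above (NAME, optional ENTRY, TID, PRI, total %, delta %). The names `parse_cpu_dump` and `busiest_task` are hypothetical helpers, not part of the IMG software.

```python
import re

# One task row: NAME [ENTRY] TID PRI total% (ticks) delta% (ticks).
# ENTRY is optional because some rows in the dump omit it.
TASK_RE = re.compile(
    r"^(?P<name>\S+)\s+(?:(?P<entry>\S+)\s+)?(?P<tid>[0-9a-fA-F]+)\s+(?P<pri>\d+)\s+"
    r"(?P<total>\d+)%\s*\(\s*\d+\)\s+(?P<delta>\d+)%\s*\(\s*\d+\)\s*$"
)
# Summary rows at the bottom of the dump.
SUMMARY_RE = re.compile(r"^(?P<name>KERNEL|INTERRUPT|IDLE|TOTAL)\s+(?P<total>\d+)%")

SAMPLE = """\
os> C
NAME ENTRY TID PRI total % (ticks) delta % (ticks)
-------- -------- ----- --- --------------- ---------------
tExcTask 6585008 0 97% ( 0) 0% ( 0)
tLogTask logTask d2ddcc0 0 0% ( 0) 0% ( 0)
tFPanel z_hmifp_ma e958d90 155 0% ( 2) 0% ( 2)
KERNEL 0% ( 0) 0% ( 0)
INTERRUPT 97% ( 0) 0% ( 0)
IDLE 3% ( 333) 98% ( 333)
TOTAL 98% ( 338) 98% ( 338)
"""


def parse_cpu_dump(text):
    """Split a captured dump into task rows and the summary totals."""
    tasks, summary = [], {}
    for raw in text.splitlines():
        line = raw.strip()
        match = SUMMARY_RE.match(line)
        if match:
            summary[match["name"]] = int(match["total"])
            continue
        match = TASK_RE.match(line)
        if match:
            tasks.append({
                "name": match["name"],
                "tid": match["tid"],
                "pri": int(match["pri"]),
                "total": int(match["total"]),
            })
    return tasks, summary


def busiest_task(tasks):
    """Return the task row with the highest total %, or None if there are no rows."""
    return max(tasks, key=lambda task: task["total"], default=None)


if __name__ == "__main__":
    tasks, summary = parse_cpu_dump(SAMPLE)
    print("summary:", summary)
    print("busiest:", busiest_task(tasks))
```

The `tid` value of the busiest row is the task ID to enter in the next step.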
3. Once the task has been identified, dump its Task Stack Information by entering t, then the task ID, then pressing ENTER. Here the task ID is 6585008, as shown in the previous step:
os> t
Task ID: 6585008
0x00dc5d1c excJobAdd +0x18 : semBTake ()
os>
4. The next step is to induce a fault, which provides additional information on the root cause of the CPU overload. Do this from the top-level Advanced Debug menu by entering the "!" command:
Advanced Debug> !
Please Confirm Fault by typing YES: YES
A .flt file is created in the location from which the IMG loads its firmware file: either the SD card or the FTP location on the GCEMS (IMG 1004 / IMG 1010) or EMS (IMG 2020) server. Send this file to Dialogic support for analysis.
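If the firmware location on the server already holds several fault files, the one just generated is normally the most recently modified. The following is a minimal Python sketch for finding it. The helper name `newest_fault_file` and the directory passed to it are hypothetical; substitute the actual firmware or FTP directory used by your GCEMS or EMS server.

```python
from pathlib import Path


def newest_fault_file(directory):
    """Return the most recently modified *.flt file in directory, or None."""
    candidates = [path for path in Path(directory).glob("*.flt") if path.is_file()]
    return max(candidates, key=lambda path: path.stat().st_mtime, default=None)


if __name__ == "__main__":
    # "." is a placeholder; point this at the real firmware directory.
    print(newest_fault_file("."))
```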
Product List
Dialogic® IMG 1010 Integrated Media Gateway
Dialogic® IMG 1004 Integrated Media Gateway
Dialogic® IMG 2020 Integrated Media Gateway (IMG 2020), formerly referred to as Dialogic® BorderNet™ 2020 Session Border Controller
Glossary of Acronyms / Terms
GCL = Gateway Control Layer
CLI = command line interface / Telnet
First published: 24-Jun-2009
Last published: 06-Jan-2014