MD signalling card failed on congestion


Developer Group

Dialogic SS7 and SIGTRAN Signalling

  • I'm testing my SMSC application and have run into a problem with the MD card when congestion occurs.

    I'm running some stress tests and wanted to see how the system behaves when congestion occurs.

    When congestion occurs on the MTP links, this is the behavior I get (an excerpt from my application's log). The end result is a board failure:

    processQ752EventIndication: linkset 0x0, linkID 0x1 --> Signaling link congestion
    parseMTPStatus: Signaling Network Congestion, congestion Status --> 1
        M-I0000-t8745-i0008-r00-f33-def-s00-p058200C80001
        -----------------------------------------------
    parseSCCPManagement: PC_STATE_IND PC=200, status=CONGESTION
        M-I0000-t8745-i0008-r00-f33-def-s00-p058200C80001
        -----------------------------------------------
    parseSCCPManagement: PC_STATE_IND PC=200, status=CONGESTION
    parseMTPStatus: Signaling Network Congestion, congestion Status --> 1
    parseMTPStatus: Signaling Network Congestion, congestion Status --> 1
    ...
    parseMTPStatus: Signaling Network Congestion, congestion Status --> 1
    parseMTPStatus: Signaling Network Congestion, congestion Status --> 1
    parseMTPStatus: Signaling Network Congestion, congestion Status --> 1
    processQ752EventIndication: linkset 0x0, linkID 0x1 --> Congestion cleared
    processQ752EventIndication: linkset 0x0, linkID 0x3 --> Signaling link congestion
    processQ752EventIndication: linkset 0x0, linkID 0x0 --> Signaling link congestion
    processQ752EventIndication: linkset 0x0, linkID 0x1 --> Signaling link congestion
    processQ752EventIndication: linkset 0x0, linkID 0x2 --> Signaling link congestion
    parseMTPStatus: Signaling Network Congestion, congestion Status --> 1
    parseMTPStatus: Signaling Network Congestion, congestion Status --> 1
    ...
    parseMTPStatus: Signaling Network Congestion, congestion Status --> 1
    parseMTPStatus: Signaling Network Congestion, congestion Status --> 1
    parseMTPStatus: Signaling Network Congestion, congestion Status --> 1
        M-I0000-t8745-i0008-r00-f33-def-s00-p058200C80000
        -----------------------------------------------
    parseSCCPManagement: PC_STATE_IND PC=200, status=CONGESTION
        M-I0000-t8745-i0008-r00-f33-def-s00-p058200C80000
        -----------------------------------------------
    parseSCCPManagement: PC_STATE_IND PC=200, status=CONGESTION
    processQ752EventIndication: linkset 0x0, linkID 0x1 --> Congestion cleared
    processQ752EventIndication: linkset 0x0, linkID 0x2 --> Congestion cleared
    processQ752EventIndication: linkset 0x0, linkID 0x3 --> Congestion cleared
    processQ752EventIndication: linkset 0x0, linkID 0x0 --> Congestion cleared
    processMTP3ErrorIndication: !!!! MTP3 Error Indication arrived !!!!
    MTP_RRT_OVRFLW: Messages discarded due to overflow of Re-Routing buffer.MTP3 has discarded messages during forced or controlled rerouting as a result of an excessive number of messages queued internally. Operation resumes normally, although some messages are lost.
    processQ752EventIndication: PC 0xc8 Destination unavailable
        M-I0000-t8745-i0008-r00-f33-def-s00-p020208C80001
        -----------------------------------------------
    parseSCCPManagement: N_STATE_IND PC=8, SSN=200, status=UOS
        M-I0000-t8745-i0008-r00-f33-def-s00-p020206C80001
        -----------------------------------------------
    parseSCCPManagement: N_STATE_IND PC=6, SSN=200, status=UOS
        M-I0000-t8745-i0008-r00-f33-def-s00-p058100C80000
        -----------------------------------------------
    parseSCCPManagement: PC_STATE_IND PC=200, status=INACCESSIBLE
        M-I0000-t8745-i0008-r00-f33-def-s00-p058100C80000
        -----------------------------------------------
    parseSCCPManagement: PC_STATE_IND PC=200, status=INACCESSIBLE
    processMTP3ErrorIndication: !!!! MTP3 Error Indication arrived !!!!
    MTP_BSNT_FAIL (link_id=0): MTP3 failed to receive a BSNT from MTP2 during a changeover cycle. This may be as a result of a failure of the MTP2 board. MTP3 uses the Emergency Changeover procedure.
    processQ752EventIndication: linkset 0x0, linkID 0x0 --> Changeover
    processMTP3ErrorIndication: !!!! MTP3 Error Indication arrived !!!!
    MTP_BSNT_FAIL (link_id=1): MTP3 failed to receive a BSNT from MTP2 during a changeover cycle. This may be as a result of a failure of the MTP2 board. MTP3 uses the Emergency Changeover procedure.
    processQ752EventIndication: linkset 0x0, linkID 0x1 --> Changeover
    processMTP3ErrorIndication: !!!! MTP3 Error Indication arrived !!!!
    MTP_BSNT_FAIL (link_id=2): MTP3 failed to receive a BSNT from MTP2 during a changeover cycle. This may be as a result of a failure of the MTP2 board. MTP3 uses the Emergency Changeover procedure.
    processQ752EventIndication: linkset 0x0, linkID 0x2 --> Changeover
    processMTP3ErrorIndication: !!!! MTP3 Error Indication arrived !!!!
    MTP_BSNT_FAIL (link_id=3): MTP3 failed to receive a BSNT from MTP2 during a changeover cycle. This may be as a result of a failure of the MTP2 board. MTP3 uses the Emergency Changeover procedure.
    processQ752EventIndication: linkset 0x0, linkID 0x3 --> Changeover
    processSCCPMaintenanceEventIndication: start
    processSCCPMaintenanceEventIndication: SCPEV_RTF_NET_FAIL --> Routing failed, network failure.Cause Value: DPC Prohibited, opc= 200, ssn = 8
    processSCCPMaintenanceEventIndication: start
    processSCCPMaintenanceEventIndication: SCPEV_RTF_NET_FAIL --> Routing failed, network failure.Cause Value: DPC Prohibited, opc= 200, ssn = 8
    processSCCPMaintenanceEventIndication: start
    processSCCPMaintenanceEventIndication: SCPEV_RTF_NET_FAIL --> Routing failed, network failure.Cause Value: DPC Prohibited, opc= 200, ssn = 8
    processSCCPMaintenanceEventIndication: start
    processSCCPMaintenanceEventIndication: SCPEV_RTF_NET_FAIL --> Routing failed, network failure.Cause Value: DPC Prohibited, opc= 200, ssn = 8
    ...

     

    Board failed:

    S7_MGT Rx: M-I0000-t06a0-i0000-f20-dcf-s62-r0000-p001000d1
    S7_MGT Tx: M-I0000-t06a0-i0000-f20-def-s62-r0000-p001000d1
    ssdm[0]: failed (00d1)


  • Hi Sanja,

    Under a 'normal' situation, the only scenario in which I would expect a board to report failure is if it is overloaded from the host and messages have backed up at the SSD process, waiting to be sent from the host to the board. I believe that once 4000 messages are queued, SSD will presume the board has failed. What load is the system running at, and what does gctload -t1 report following the failure? Also, how many link sets and links does your system have?

    I would also suggest that you contact your support channel about this. Board failures are not normal or expected, and if they do happen we would like to work closely with you to resolve the issue.

    Best Regards
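    To illustrate the failure mode described above (a toy sketch, not Dialogic's actual implementation: the 4000-message threshold is taken from this post, and all names in the code are hypothetical):

```python
from collections import deque

SSD_FAIL_THRESHOLD = 4000  # queue depth at which SSD presumes board failure (from the post)

class HostToBoardQueue:
    """Toy model: messages back up between host and board under overload."""

    def __init__(self):
        self.queue = deque()
        self.board_failed = False

    def send(self, msg):
        self.queue.append(msg)
        if len(self.queue) > SSD_FAIL_THRESHOLD:
            self.board_failed = True  # too many messages backed up: board presumed failed

    def board_drain(self, n):
        # The board normally drains the queue; under overload it cannot keep up.
        for _ in range(min(n, len(self.queue))):
            self.queue.popleft()

q = HostToBoardQueue()
for i in range(4001):  # host sends faster than the board can drain
    q.send(i)
print(q.board_failed)  # True
```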

  • Hi Howard,

    Thanks for your answer.

    I have two servers that communicate over one linkset containing 4 MTP links.

    The simulated traffic load is 140 SMS/s. The SMS messages are relatively small because they contain no text.

    When I run gctload -t1 after the board failure I get this:

    # ./gctload -t1

    GCTLOAD System status:
      Congestion module Id:     0x21
      GCTLIB library:           V1.43
      Internal system error:    0
      GCTLIB Atomic:            Enabled
      Partition[0]
        Parameter size:         320
        MSGs in partition:      5000
        MSGs allocated:         0
        MSGs free:              5000
        Maximum MSGs allocated: 1001
        Out of MSG count:       0
        Congestion onset:       2500
        Congestion abate:       500
        Congestion status:      0
        Congestion count:       0
      Partition[1] - Not defined.

     

    But in the end, I think I have solved the problem. It looks like the maximum message count setting was incorrect. It was set to 5000, so the congestion onset threshold was too high.

    When I reduced the maximum message count to 1000, the behavior was much better and no board failure occurred. When I run the test with these settings, congestion is cleared after some time, the link set recovers and the adjacent PC becomes accessible again.

    Can you tell me if there is a way to manipulate this setting from the application? Is there a message that can be sent to the framework to change this value, so that if I add more MTP links the congestion threshold can be increased?
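    A quick sanity check on the numbers above (the 50%/10% ratios are an assumption inferred from this particular gctload output, not documented behavior):

```python
def congestion_thresholds(num_msgs):
    """Assumed rule inferred from the gctload output above:
    onset at 50% of the partition size, abate at 10%.
    Not documented Dialogic behavior."""
    return num_msgs // 2, num_msgs // 10

# Matches the output above: 5000 messages -> onset 2500, abate 500
print(congestion_thresholds(5000))  # (2500, 500)
# A smaller partition lowers the onset, so congestion is signalled (and cleared) sooner
print(congestion_thresholds(1000))  # (500, 100)
```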

  • MD boards support message rates well in excess of this, in systems with many more messages present. I suspect the changes you have made have worked around the issue rather than resolved it.

    To answer your specific question, I don't believe the number of messages in the system can be changed dynamically. You would need to restart the system with the new configuration in system.txt.

    Are you using the latest DPK from the web site? What OS are you using, and are you disabling the GCT environment's system start-up health check ('VERIFY') in system.txt?

    Best Regards
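    For reference, the message count is normally fixed at start-up in system.txt; a minimal sketch, assuming a NUM_MSGS directive and '*' comment syntax (both assumptions here; the exact keyword and syntax may differ between DPK releases, so verify against the Software Environment Programmer's Manual for your version):

```text
* system.txt fragment (sketch only; directive name and comment syntax assumed)
* Reduce the partition 0 message count so congestion onset triggers earlier
NUM_MSGS 1000
```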

  • Hi Howard,

    Regarding the OS, the servers run Oracle Solaris 10 8/11 (s10x_u10wos_17b, x86). The health check is performed on every start of gctload:

    (26609)gctload: Verification started.

    (26609)gctload: Verification complete.

    I'm using the latest DPK version, from March 2012, release 5.2.1.

    I will perform some tests today to see how the system behaves with various settings. I will also replicate the same test on the other server with the other MD card and check whether the behavior is the same.

     

    Kind regards