
MD Plus standby controller continuously resets

In DeltaV version 11.3, one of our standby MD Plus controllers continuously resets itself and tries to load the program. This cycle repeats indefinitely. The active and standby controllers run the same firmware. In DeltaV Diagnostics, the NVM of the standby controller shows as good, but the cold restart status shows invalid.

When we use the standby controller as simplex, it works well; however, the cold restart status remains invalid. When it is configured as simplex, cycling power requires downloading the program manually, and the NVM does not hold the program. We flashed the controller with DeltaV version 10 to see whether the response would change, but the behavior remained the same.

We would appreciate any help in resolving this problem.

11 Replies

  • I've seen this kind of behavior a couple of times in the past, and the reasons were different in each case. In one instance, it was a software-related issue that did not allow the standby controller to come online; in the other, it was hardware related (a problem with the backplane).
    It's not clear how far you have gone with your troubleshooting or which possibilities have been eliminated, but I suggest it would be best to work with the GSC on this one. They can advise you on where to look given your installation, configuration, etc., and even help you telnet into the standby to understand what is happening. Perhaps all you have is bad controller hardware that can be replaced easily.
  • Hi John,
    This is a hardware problem. At our site we have changed all the left- and right-side extenders.
    It could be a short in a pin or a "tin whiskers" issue. Emerson is fully aware of this, so please raise a call with the GSC.
    Regards
    Syed Mavvaj Hussaini 
  • In reply to Syed Mavaj Hussaini:

    The information provided does not support a conclusion of hardware fault or an issue with Tin Whiskers. As John suggests, engaging the GSC is your best recourse.

    Andre Dicaire

  • In reply to Andre Dicaire:

    Dear All,

    Thanks for the replies. We replaced the problematic standby controller with a new one, and the problem was solved. We do not know what causes the faulty controller to reset continuously when it is configured as redundant. Also, DeltaV Diagnostics reports that the NVM is good, so we do not understand why the cold restart program is not loading into the NVM.
  • In reply to B.Senthilkumar:

    Hello All,

    I am facing the same issue with v11.3.1.2465.xr and MD controllers rev 5.12, 5.17, 5.21 and an MD+ rev 5.50. The Error and Stby LEDs on my standby controller keep blinking; all of the above hardware works fine in a simplex configuration. After some testing I have concluded that redundancy works perfectly if there is no I/O configuration, but as soon as I configure any I/O module and download, the standby never returns to a synchronized state and stays in limbo. I am stuck, as a controller is of no use without I/O :-). Any suggestions? Thanks.

    BR,
    Riz
  • In reply to Rizwan:

    I wouldn't say it's the same issue as in the original post; they replaced the controller to get things resolved. You've listed four different controllers based on hardware revisions...

    First, this is a product issue that should be reported to the GSC so they can assist you in resolving this in a timely fashion.

    In a redundant pair, the Active controller polls the IO bus continuously to update the IO map in the controller. In each poll, it allows the Standby controller to poll one card so that the Standby can verify its connectivity to the IO bus. If you have 20 cards, the Standby will poll each card once every 20 complete polls from the Active.

    If either controller is unable to poll the IO, it will defer to its partner and either switch over or disable itself as being unavailable.
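The polling arithmetic above can be illustrated with a minimal sketch. It assumes the Standby simply walks the cards in round-robin order, one card per complete Active poll; the actual scheduling inside the controller firmware is not documented in this thread, and the function names are hypothetical.

```python
def standby_poll_schedule(num_cards, num_active_polls):
    """Card index the Standby checks on each complete Active poll.

    Assumption (not confirmed by the thread): plain round-robin order.
    """
    return [poll % num_cards for poll in range(num_active_polls)]


def polls_between_standby_visits(num_cards, card):
    """Number of Active polls between two Standby checks of the same card."""
    schedule = standby_poll_schedule(num_cards, 3 * num_cards)
    visits = [i for i, c in enumerate(schedule) if c == card]
    return visits[1] - visits[0]


# With 20 cards, any given card is re-checked by the Standby every 20 Active polls.
print(polls_between_standby_visits(20, 7))  # 20
```

Under this assumption, a larger card count means the Standby takes proportionally longer to re-verify any single card.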

    If the issue is with the LocalBus and only affects the Standby, that would mean the left-hand 2-wide carrier is not properly seated or has an issue connecting to the LocalBus, while the right-hand carrier is functioning properly. That allows the right-hand controller to work, but the left-hand controller is unable to connect to IO cards.

    The issue could be in the controller connector on either the left or right carrier, causing that position to not be able to connect to IO cards.

    If the issue is happening with different controllers installed, and all controllers are known to function in Simplex mode, it sounds unlikely that it is the controllers, and more likely the carriers.

    Do the controllers work in simplex when installed in either the right or the left carrier, or have you only tried this using one carrier (the right-side one connected to the IO carrier)?

    The carriers have the main connector in the center, which provides access to the IO bus and card/bank selection lines as well as 12 VDC power distribution to the IO cards, and the redundancy connector at the top, which is used for detecting the presence of the standby controller and passing redundancy data. If these are not securely seated, the controller in the left carrier might not be able to communicate with IO cards.

    When there are no IO cards configured, the controllers have no reason to poll or detect IO cards. The active will poll the bus so it can discover a new card, but the standby would expect to find no cards based on the configuration. So adding all that up, it sounds like the Standby is unable to successfully read the IO bus, and with multiple controllers exhibiting the same behavior, this would indicate the issue is in the carrier connectors.

    You should check that the carriers are properly seated, then swap them to see if the problem follows the carrier (this could cause both controllers to fail to read IO, which would confirm the issue is in the carrier now installed on the right). If the problem follows the carrier, it is likely an internal connector failure, so replace the carrier. If the issue disappears, it could be in the left connector of the carrier that is now on the left; since that connector is no longer used, it no longer causes an issue. I'd confirm by swapping the carriers back to see if the issue is reproducible. You can also simply swap out both carriers and see if that fixes it, then troubleshoot the carriers separately.

    Document precisely what you test; if this does not resolve it, you'll want to be able to share your actions to date.
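The swap test described above can be summarized as a small decision helper. This is only a sketch of the reasoning in this post: the outcome names and the function are hypothetical, and the "unchanged" branch is an assumption, since the post does not say what to conclude in that case.

```python
from enum import Enum


class SwapOutcome(Enum):
    BOTH_FAIL = "both controllers fail to read IO after the swap"
    DISAPPEARS = "the issue disappears after the swap"
    UNCHANGED = "standby still fails, active still works"


def interpret_carrier_swap(outcome):
    """Map the observed result of swapping the two carriers to a next step."""
    if outcome is SwapOutcome.BOTH_FAIL:
        # The suspect carrier now sits on the right, between both controllers and the IO.
        return "Problem followed the carrier: likely an internal connector failure, replace that carrier."
    if outcome is SwapOutcome.DISAPPEARS:
        # The faulty connector may be the unused left connector of the carrier now on the left.
        return "Suspect the unused left connector: swap the carriers back to confirm the issue is reproducible."
    # Assumption: this case is not covered by the post.
    return "Inconclusive: document what was tested and share it with the GSC."


print(interpret_carrier_swap(SwapOutcome.DISAPPEARS))
```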

    Oh, and make sure your controllers are actually all flashed to the same revision of firmware. If they are not, you should flash them and make sure the issue is present with consistent Firmware.

    Andre Dicaire

  • Hello All,

    I have faced the same issue. An MD Plus controller installed in an offline test system continuously resets after a full CPU download, regardless of whether I/O cards are installed.

    Controller system information:
    DeltaV Controller Type MD Plus
    Hardware Revision Level 5.52
    FRS Part Number: 12P3439X032
    Software Revision Level 13.3.1.6350.xr
    Carrier Type Standard Carrier (SIS Compatible)
    Application Autostart YES
    Application Disable NO

    The telnet log of the event is listed below:
    ----------------------------------
    Record = 1 of 40
    Crash Time = Oct 03, 2019 07:29:27 Role = Active
    Current Task = Idle Task, Prior Task = Roc SOE Service
    Sw Build Date = Jul 30,2019 21:27 Software Rev = 13.3.1.6350.xr
    Hardware Rev = 5.52 FRS Part Number = 12P3439X032
    Ctlr Uptime = 0 yrs 0 days 0 hrs 0 mins 56 secs
    Free Time = 100 Free Mem = 50619282
    P0 Index = 0, P1 Index = 0, P2 Index = 0, PDebug Index= 0, LastRes = 0

    Registers = R0: FF68A998 R1: FFF04CF0 R2: FFE957C8 R3: 504C4F47
    R4: FF167770 R5: 000000C4 R6: FF140807 R7: 00000003
    R8: FFF1A25C R9: 00003972 R10: 00000001 R11: 000000C4
    R12: 00000000 R13: FF171758 R14: 00000000 R15: 00000000
    R16: 00000000 R17: 00000000 R18: 00000000 R19: 00000000
    R20: FF160428 R21: 00000000 R22: 00000002 R23: FFCE1230
    R24: FF140808 R25: 00001A29 R26: 00000005 R27: 00000000
    R28: FF140000 R29: FF140000 R30: FF171A70 R31: 00000000

    CR: 20000000 PVR: 80811014 XER: 20000000 LR: FF68A998
    CTR: 00000000 PC: FF68A9D0 MSR: 00023972 DAR: 00000000
    DSISR: 00000000 TESR:00004000

    EXCEPTION VECTOR: FFF00700 'Program'
    CAUSE: 'PLOG'
    RtProgLog, Severity = RtError
    Role = Active
    Task = STST
    File = RtcRailbusDriver.cpp Line = 6697
    Message = Railbus DMA Machine failed. I/O Scanning on Active Controller Stopped.
    status = 00000005

    Stack:
    FFF04CF0 = FFF04D00 FF140000 FF171A70 00000000 ..M........p....
    FFF04D00 = FFF04DC0 FF68A998 FFD28698 00001A29 ..M..h.........)
    FFF04D10 = 00000002 00000005 FFD286B0 0F010000 ................
    FFF04D20 = 00000003 0056709C 00000000 FFBBB858 .....Vp........X
    FFF04D30 = FF160428 00000000 00000001 000000C0 ...(............
    FFF04D40 = 0F000000 00000000 00000000 00000000 ................
    FFF04D50 = FFF04D60 FF0B9E7C 00000001 000000C0 ..M`...|........
    FFF04D60 = FFF04DA0 FF84BC74 05000000 FFF04DC8 ..M....t......M.
    FFF04D70 = FFF04D08 53545354 00015180 FF140000 ..M.STST..Q.....
    FFF04D80 = FF171A88 00000000 00000000 00000000 ................
    FFF04D90 = FFF04DA0 FF05A6A4 00015180 FF140000 ..M.......Q.....
    FFF04DA0 = FF171A88 00000000 00000002 00000001 ................
    FFF04DB0 = FF160428 0F000000 00000000 FF0B9278 ...(...........x
    FFF04DC0 = FFF04DD0 FF84C760 00000000 FF0B9278 ..M....`.......x
    FFF04DD0 = FFF04E00 FF84F070 80000000 FF028308 ..N....p........
    FFF04DE0 = FF171A88 00000000 00000001 00000001 ................
    FFF04DF0 = FF160428 FF171758 00000A9C 00000000 ...(...X........
    FFF04E00 = FFF04E10 FF69310C 00003072 FF16758C ..N..i1...0r..u.
    FFF04E10 = FFF04E20 FF681140 FFE957C8 0000B972 ..N .h.@..W....r
    FFF04E20 = FFF04ED8 FF68FEB8 FFF04E50 FFBBE9E0 ..N..h....NP....
    FFF04E30 = 504E4144 FF6A57C0 FFFF6CCC FFF03ACC PNAD.jW...l...:.
    FFF04E40 = FFF04EE8 FF171758 FFF04E58 FFBBE178 ..N....X..NX...x
    FFF04E50 = FFBD2F90 FFF0AFF4 FFF04E78 FFBBE6B0 ../.......Nx....
    FFF04E60 = FFF04E78 FF6329C8 FFFF6CCC FF635684 ..Nx.c)...l..cV.
    FFF04E70 = FFF04EE8 00000000 FFF04EA8 FFBC1F60 ..N.......N....`
    FFF04E80 = 00010000 FFBD2F90 00030000 00000002 ....../.........
    FFF04E90 = 20000000 00000000 FFF1D6B8 FF6A5950 ............jYP
    FFF04EA0 = FFF04EE8 FF6A5950 00003072 FFBC2024 ..N..jYP..0r.. $
    -------------------------------

    Any help would be highly appreciated.
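When reading a dump like the one above, it can help to decode individual 32-bit words as text. The sketch below assumes each word is four big-endian ASCII bytes and prints non-printable bytes as ".", mirroring the right-hand column of the stack dump. The helper name is hypothetical and is not part of any DeltaV tool.

```python
def hex_word_to_ascii(word):
    """Decode an 8-digit hex word into 4 characters, '.' for non-printable bytes."""
    raw = bytes.fromhex(word)
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


# Words taken from the log above.
print(hex_word_to_ascii("504C4F47"))  # PLOG (register R3)
print(hex_word_to_ascii("53545354"))  # STST (stack, matches "Task = STST")
print(hex_word_to_ascii("504E4144"))  # PNAD (stack)
```

Register R3 decoding to "PLOG" appears consistent with the reported CAUSE: 'PLOG', and "STST" matches the task named in the RtProgLog entry. What these tags mean internally is not stated in this thread.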
  • In reply to Anton Saitov:

    As with the previous suggestions for the other issues, you should log a call with the GSC for your particular issue, as it may not be "the same" issue and may only appear that way.

    Go to www3.emersonprocess.com/.../ to find out how to do this, if you are not already aware.
  • In reply to Andre Dicaire:

    Hello Andre, thanks a lot for the detailed reply; I appreciate it.

    I tested the above scenario with 2 x 2-wide carriers and with a 4-wide vertical legacy carrier, so a carrier issue is ruled out. The issue was resolved by upgrading the firmware. Besides the firmware, I also had to do a couple of other things to completely resolve it:
    1. Disable network redundancy, because I only had a single connection to my controllers. This changed my error state from "Failed Component" to "Stby not Ready".
    2. Looking at the error message "Stby not Ready", I remembered your advice about the standby polling the I/O bus and not finding any modules, and hence not being ready to take charge. So I removed all configured I/O modules, and the status changed to GOOD with redundancy established.

    Once again, thanks for your valuable input.

    BR,
    Riz.
  • In reply to Anton Saitov:

    Hello Anton,

    In Diagnostics, what are the values of PExist, Pavail, RedEn and Status? Make sure you have the same firmware running in both controllers. DeltaV redundancy works differently: to establish redundancy, you first have to provide the controller with everything you have configured. During lab testing we usually cut corners, which then becomes a bottleneck for establishing redundancy.

    Thanks,
    Riz