• Not Answered

Failure of redundant controller automatic switch-over

One of my redundant controllers failed earlier, but it did not switch over to the standby as it should have. When I pulled out the failed unit, the remaining unit immediately went into active mode. I'm not sure what constitutes a failure, but the failed unit was showing power and the secondary CN light was flashing. The primary was showing no activity on the controller or on the switch. Has anybody run across this before?

5 Replies

  • Why do you say that your controller failed? Was the power LED green or red? A green power LED plus one working network connection is enough for the controller to operate. Did you observe any modules not working?
  • In reply to LaurentB:

    Because we lost all communication with it and it was x-ed out in DV Explorer.
  • In reply to harnettw:

    Try to post the events from Event Chronicle recorded during the failure. If you lost comms, the CN light should not blink.
  • In reply to harnettw:

    Interesting. The active controller sees the world differently than we do. When a node is reported as not communicating in DeltaV Explorer, that means Explorer is not seeing it. That could be an issue on the network, or it could be that the controller itself is not communicating. The controller may stop communicating to protect itself from abnormal traffic on the network, such as a broadcast storm or a cybersecurity attack. The system version and the controller type are both important for knowing what to expect from that controller.

    Anyway, the controller will have stored some events in its local buffers, which can be retrieved over telnet. You'll need to log a call with the GSC so they can help you retrieve these and determine what the controller has to say about this event. Note the times you observed this, though they should also be in the Ejournal file, where the communication failure alarms are recorded.

    Also, as Laurent asked, what was the state of the LEDs on the controller when you removed it? You'll want to pass this on to the GSC engineer as a data point.

    Andre Dicaire

  • In reply to Andre Dicaire:

    The event journal recorded a whole bunch of I/O input failures, input transfer errors, and communications errors. The controllers are type MD and date from circa 2005. They are being replaced with MQ controllers in October, when we upgrade to DV V14.3. The current version is V11.3.