Module Error Module Status Interpretation

For some reason, I have recently started getting random floods of Module Error and Module Status alarms, red boxes around the objects on the graphic, and equipment shutting off.  Is there a decoder ring to help interpret the Module Error and Module Status codes?

These come in and then go inactive about 30 seconds later. The equipment is functional and has worked for 6 months or more. The devices are linked to the DCS through an EIOC and a Rockwell Stratix switch to the ENBT in a single ControlLogix PLC.  All of my devices using the EIOC are affected, and it appears to span different MCC sections.

In addition to the Error/Status decoder, are there any txt files or similar buried behind the curtain to see any additional info that might be driving these alarms?

Thanks in Advance-

[Attachments: Alarm screen; Process History View showing a value of 1088]

5 Replies

  • Check your settings on the ENBT for CIP connections. If you have multiple "Masters" talking through it and you exceed that number... somebody won't get to talk and their connections periodically drop. You can have up to 128 CIP connections on an ENBT card; you just have to configure it for that. BTW, the more you have, the slower it runs.

    Mike Brannon // Project Manager
    Revere Control Systems

  • In reply to MICHAEL BRANNON:

    MODBAD is a custom alarm. It looks like it evaluates MERROR and MSTATUS as non-zero and sends integer values in the alarm message. It normally would be tied to BAD_ACTIVE, which is set via the MERROR_MASK and MSTATUS_MASK.

    These are binary-weighted bit strings, and the bit descriptions are in BOL. Search for the Module level parameters to find the bit position of each Error and Status condition.

    An MSTATUS of 1024 is bit position 10, or FB abnormal. 1088 is bit positions 10 and 6, so FB Abnormal and Unresolved Reference.
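    If it helps, here is a minimal sketch (Python) of decoding a binary-weighted status word into its set bit positions. The only bit names filled in are the two mentioned above; the full bit definitions are in BOL, so treat the rest as placeholders:

        # Minimal sketch: decode a binary-weighted status word (e.g. MSTATUS)
        # into its set bit positions. Only the two bit names from this thread
        # are filled in; look up the rest of the Module level bit definitions in BOL.

        BIT_NAMES = {6: "Unresolved reference", 10: "FB abnormal"}  # partial list

        def decode_status(value: int) -> list[str]:
            """Return a description for each bit set in the status word."""
            bits = [pos for pos in range(32) if value & (1 << pos)]
            return [f"bit {pos}: {BIT_NAMES.get(pos, 'see BOL')}" for pos in bits]

        print(decode_status(1024))  # -> ['bit 10: FB abnormal']
        print(decode_status(1088))  # -> ['bit 6: Unresolved reference', 'bit 10: FB abnormal']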

    FB ABNORMAL is set in MSTATUS when an FB's BLOCK_ERR has a condition true that is not masked to the block's BAD_ACT via BAD_MASK. If the status bit is masked to BAD_ACT, then it would have set MERROR instead.

    The signal status going BAD likely explains this, but it can be hard to find which block is having the issue. I suspect an input block. You can trend one or more FB/BLOCK_ERR parameters and confirm which block(s) are flagging and what the error is (Bad PV?).

    Also confirm the PDT status value. This can indicate if an RPI error or loss of comms has occurred. You can historize this too to help capture the transient errors.

    Unfortunately, you may need to Wireshark the segment to get more insight as to exactly what is happening on the integration network, assuming you are seeing IO related errors. In Diagnostics, you can see LDT values and their status.

    Andre Dicaire

  • In reply to Andre Dicaire:

    Andre gave a great answer, but I just want to add one thing. It seems odd to see an unresolved reference in your MSTATUS show up periodically just because of a loss of EIOC communications (though the FB Bad Active is fully expected). If you are using dynamic references, this might be related, but if you are using standard references, you might have some other fixes that should be made.

    Though that's unlikely to fix the loss of comms itself, Wireshark is probably the best diagnostic tool you have for figuring out why the communications are dropping.
  • In reply to Matt Forbis:

    Thanks for all the info. It has helped a lot.
    Update and some more history: I didn't think it was related, but looking back now I believe it was. Before my post, power had been removed from the DCS and PLC controllers for some UPS maintenance. Upon power up, I had RPI faults on some of my LDTs for EIOC devices. This isn't a new problem, supposedly a known issue in 13.3.1. I bumped the config to force a download and they started talking. My main PLC, the one throwing the errors above, didn't have any faults, so I didn't download it.

    Yesterday, after several attempts at other solutions and diagnostics, I took a shot, bumped the LDTs on my main PLC, and downloaded the EIOC for it. Since then, I haven't seen any errors. As I've been downloading the LDTs, I've been slowing them down too. They were all set around 1000 ms and we don't need that speed, so I set most of them to 2000 ms.

    I spoke with Rockwell and am going to run a task monitor tool to check the performance of the ENBT/EN2F, but I don't think that is a problem, yet. It seems like maybe the EIOC had some sort of bug on the LDTs associated with my main PLC that needed to be flushed out. I don't have warm fuzzies, since I didn't find a real smoking gun, but I've seen crazier things in the past. Hope this was a fix.
  • In reply to s_brwn:

    I assume you are using UCMM for Logix tags?

    There is a KBA that discusses some best practices. Basically, each PDT creates a connection to a PLC and schedules the configured LDTs, one every 20 ms. If the number of LDTs * 20 ms exceeds the RPI, then you will see RPI faults. This LDT scheduler does not load balance based on the RPI settings of the LDTs. The best practice is to keep the number of LDTs below the limit for the fastest RPI, and add additional PDTs to add throughput.
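    As a rough illustration of that scheduler math (Python; the 20 ms per-LDT spacing is as described above, but the LDT counts and RPI values are made-up examples, not from the KBA):

        # Rough sketch of the 20 ms LDT scheduler rule described above.
        # Example numbers are illustrative only.

        LDT_SLOT_MS = 20  # the scheduler services one LDT every 20 ms per PDT

        def max_ldts_per_pdt(fastest_rpi_ms: int) -> int:
            """Largest LDT count one PDT can service without exceeding the fastest RPI."""
            return fastest_rpi_ms // LDT_SLOT_MS

        def schedule_ok(ldt_count: int, fastest_rpi_ms: int) -> bool:
            """True if ldt_count * 20 ms still fits inside the fastest configured RPI."""
            return ldt_count * LDT_SLOT_MS <= fastest_rpi_ms

        print(max_ldts_per_pdt(1000))  # 50 LDTs max at a 1000 ms RPI
        print(schedule_ok(60, 1000))   # False -> 60 * 20 ms = 1200 ms > 1000 ms, expect RPI faults
        print(schedule_ok(60, 2000))   # True  -> slowing the RPI to 2000 ms gives headroom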

    Adding PDTs with the same IP address will create warnings of duplicate IP addresses, but it will download and work. The PDTs interleave their schedules to increase throughput.

    This also requires additional Device licenses.

    If you use a bidirectional LDT to write and read values, you should only do so with manually changed values. When you write a value, the entire LDT is written, and if you have continuous writes to a value, this can prevent updates from the PLC. Continuous data exchange should be in an output LDT where DeltaV is the “owner” of the data.

    Also, these bidirectional LDTs might benefit from an RPI that is twice as fast as the module(s) writing. This ensures the readback can happen before the next write. However, consider that if multiple modules write to an LDT, these writes can prevent the readback, since the entire LDT is sent on each write.

    For operator-initiated writes, you would rarely see multiple writes that would impact data exchange. If you have SP or alarm limits, an operator could possibly make a series of changes fast enough to cause an issue. But if you programmatically send a change continuously, like a watchdog counter, don't send it in an LDT with readback.

    To summarize:

    - Add PDTs to increase throughput without slowing the update rate.

    - Don't exceed the LDT-per-PDT limit based on the 20 ms scheduler.

    - Move continuous data outputs, such as watchdog or balance-of-plant PVs, into a marshalling module/LDT combo so that each LDT is written to by one module. This avoids triggering an LDT transfer between module executions.

    - Set the RPI twice as fast as the module if readback is enabled, so the module sees updated PLC data on its next scan, or use separate input and output LDTs where readback is not enabled (a quick sketch follows below).
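    And a small sketch of that last readback rule of thumb (Python; the scan and RPI values are hypothetical examples):

        # Rough check of the readback rule of thumb: the LDT RPI should be at
        # least twice as fast as the writing module's execution rate so the
        # readback completes before the module's next write.
        # The scan/RPI values below are hypothetical examples.

        def rpi_supports_readback(rpi_ms: int, module_scan_ms: int) -> bool:
            """True if the RPI is at least twice as fast as the module scan."""
            return 2 * rpi_ms <= module_scan_ms

        print(rpi_supports_readback(500, 1000))   # True  -> readback lands before the next write
        print(rpi_supports_readback(1000, 1000))  # False -> readback can miss the next write
        print(rpi_supports_readback(2000, 1000))  # False -> RPI slower than the module scan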

    Andre Dicaire