For quite some time now we have been having an issue where PLM Watchdog Failures put our Batches into Hold. They appear to occur when Requests are being sent to the Batch Executive, generally for Report parameters. The number of PLM Watchdog Failures has recently increased significantly, and they appear to come in sets of 2 on back-to-back days, with the next set of 2 arriving roughly 2 weeks later. They also appear to come around the same time each day (i.e. between 2am and 3am). Of course, sending Requests to the Batch Executive is a standard part of the system, and we have been doing so since the system was built with no issues.
The only thing I can think of that has changed recently is that we Commissioned quite a few new Units with CIOCs on the system, but those Units were Disconnected without first being Decommissioned. This led to an increase in Device Connection Failures that we're reading in Process History View, which all seem to come in a cluster every 2 minutes or so. I am quite suspicious that the Device Connection Failures could have something to do with the increase in PLM Watchdog Failures, but I wanted to get a second opinion before trying to narrow it down.
We did end up Decommissioning those CIOCs and Downloading Changed Setup Data to the associated Controllers, but rather than eliminating the Device Connection Failures, it just removed the associated CIOC Tag we had created in the system and replaced it with a random ID# (i.e. before, the message said "Device Connection Failure - [Unit#]CIOCnn", and following the Decommissioning and Changed Setup Data Download for the CIOC, it now says "Device Connection Failure - ID = #xxxxxxxx").
Interestingly enough, we were able to eliminate the Device Connection Failures for a few of our CIOCs, all on the same Controller, by following the same procedure as for the others, but there are still Decommissioned CIOCs on that Controller that continue to show Device Connection Failures.
I should also point out that we are using DeltaV v13.3.2 with a combination of SQ and SX Controllers, and the Units that appear to be causing the PLM Watchdog Errors (ones with an active Batch that is sending Requests to the Batch Executive) have included Units on both SQ and SX Controllers.
In short, we are basically trying to figure out:
1. Could these Device Connection Failures be a factor in causing the PLM Watchdog Errors? Are there any other factors anyone can think of that we should be considering?
2. If these Device Connection Failures are a significant factor in causing the PLM Watchdog Errors, does anyone know how to eliminate them?
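One rough way to check question 1 is to compare timestamps from an exported event log and see whether Device Connection Failures pile up just before each PLM Watchdog Failure. Below is a minimal Python sketch under stated assumptions: the events are already available as (timestamp, description) pairs, and the function name, the 5-minute window, and the sample events are all hypothetical and made up for illustration, not anything provided by DeltaV.

```python
from datetime import datetime, timedelta


def failures_before_watchdog(events, window=timedelta(minutes=5)):
    """For each PLM Watchdog Failure, count the Device Connection Failures
    logged within `window` before it (hypothetical helper, not a DeltaV API)."""
    watchdogs = [t for t, desc in events if "PLM Watchdog" in desc]
    conn_fails = [t for t, desc in events if "Device Connection Failure" in desc]
    return {
        w: sum(1 for c in conn_fails if timedelta(0) <= w - c <= window)
        for w in watchdogs
    }


# Made-up sample events, for illustration only
events = [
    (datetime(2024, 1, 1, 2, 10), "Device Connection Failure - ID = #xxxxxxxx"),
    (datetime(2024, 1, 1, 2, 12), "Device Connection Failure - ID = #xxxxxxxx"),
    (datetime(2024, 1, 1, 2, 13), "PLM Watchdog Failure"),
    (datetime(2024, 1, 1, 14, 0), "Device Connection Failure - ID = #xxxxxxxx"),
]
counts = failures_before_watchdog(events)
```

Because the connection failures recur about every 2 minutes, the raw count matters less than whether it is noticeably higher around the 2am to 3am watchdog events than at other times of day.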
Thanks everyone, any help will be greatly appreciated!
Doug