Emerson Exchange 365

Answered

Watchdog problem at customer site

Dear All,

We are facing a watchdog failure at one of our customer sites. As soon as the watchdog timer fails, the entire batch goes to Hold, which stops production. The system relies on dynamic referencing to issue different commands. Controller free time and free memory look good when the issue occurs. If anyone has come across this problem, or has found a fix or a workaround, please share. Thank you very much.

Youssef.El-Bahtimy
- 13 Mar 2014 5:08 PM
As the general community does not have access to the GSC call tracking system, I would suggest you post the details of your issue here.

If the problem pertains to a customer that you are the service provider for (based on the title of the posting), please make sure not to post any sensitive information.

Youssef El-Bahtimy | Systems Integration Technologist
PROCONEX | 103 Enterprise Drive | Royersford, PA 19468 USA
Proconex Office: 610 495 2970 | Cell: 267 275 7513

Lun.Raznik
- 13 Mar 2014 10:51 PM
In reply to Youssef.El-Bahtimy:

I agree with Youssef.

But in general terms, a watchdog failure at the node level can be caused by several things.

Off the top of my head:

- Loading on the node: for the controller, is it within the specified limits? For example:

Controller free time minimum: 20%

Controller free memory minimum:

MD - 1.4 MB

SD Plus/MD Plus - 4.8 MB

SX/MX - 9.6 MB

- For workstations: check that you have enough CPU idle time and free memory, and that disk I/O load is acceptable.

- At the network level: is it clean traffic-wise? Is the network equipment in good shape?

On the control level, it is possible that improvements/fixes have exposed issues in your configuration, or there might be a regression in functionality.
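The loading limits listed above could be checked with something like the following. This is only a minimal sketch: the thresholds are the figures quoted in this post, while the function name, the dictionary, and the idea of feeding it values read from diagnostics are assumptions made for illustration, not a DeltaV API.

```python
# Minimum free memory (MB) per controller type, as quoted in the post above.
# Verify these against the documentation for your own system version.
MIN_FREE_MEMORY_MB = {
    "MD": 1.4,
    "SD Plus": 4.8,
    "MD Plus": 4.8,
    "SX": 9.6,
    "MX": 9.6,
}

# Minimum controller free time (percent), as quoted in the post above.
MIN_FREE_TIME_PCT = 20.0


def check_controller_load(controller_type, free_time_pct, free_memory_mb):
    """Return a list of warnings; an empty list means both limits are met.

    Hypothetical helper: the caller supplies values read manually from
    controller diagnostics.
    """
    warnings = []
    if free_time_pct < MIN_FREE_TIME_PCT:
        warnings.append(
            f"free time {free_time_pct}% is below the {MIN_FREE_TIME_PCT}% minimum"
        )
    min_memory = MIN_FREE_MEMORY_MB.get(controller_type)
    if min_memory is None:
        warnings.append(f"unknown controller type {controller_type!r}")
    elif free_memory_mb < min_memory:
        warnings.append(
            f"free memory {free_memory_mb} MB is below the {min_memory} MB minimum"
        )
    return warnings


if __name__ == "__main__":
    # Example values only; not taken from any real site.
    for problem in check_controller_load("MD", 15.0, 1.0):
        print(problem)
```

An empty result only says the node is inside these two limits; it does not rule out the workstation or network causes listed above.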
Ashish P
- 13 Mar 2014 11:07 PM
Dear Youssef/Raznik,

The data is critical and important to the client, which is why I didn't post it here. I thought some of you might have access to SMS to check the details of the problem.

Thank you for the reminder. I will try to summarize the problem faced at site in general terms, without sharing any critical information.

The main concern is not the watchdog failure itself, but the trip of the entire batch because of the watchdog failure.

I will get back with details as soon as I can.

Thanks a lot
Ashish P
- 16 Apr 2014 1:36 AM
In reply to Ashish P:

Dear All,

Sorry for the delay. I just want to add some information on this problem:

The issue we are facing is a different one. THIS IS NOT THE SERIAL WATCHDOG; this is the batch watchdog.

The phase is loaded to the controller. While it is running, if it loses its connection with the batch executive, then after a certain time the batch fails with a reason such as 'Phase Logic Failure: PLM Watchdog Failed 1' or 'Device connection Error'.

Has anyone observed this kind of situation before? Any details or information would be really helpful.

Thank you for your time, gentlemen!
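To make the behaviour described above concrete, here is a generic heartbeat watchdog sketch. It is not DeltaV's actual PLM watchdog: the class name, the function name, and the 10-second timeout are all hypothetical, and time is passed in explicitly so the example is deterministic.

```python
class HeartbeatWatchdog:
    """Generic watchdog: expires if no heartbeat arrives within timeout_s."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self.last_heartbeat = None  # no heartbeat received yet

    def heartbeat(self, now):
        """Record that the peer (here, the batch executive) was seen at `now`."""
        self.last_heartbeat = now

    def expired(self, now):
        """True once more than timeout_s has passed since the last heartbeat."""
        if self.last_heartbeat is None:
            return False  # nothing to time out until the first heartbeat
        return (now - self.last_heartbeat) > self.timeout_s


def phase_state(watchdog, now):
    """Hypothetical phase reaction: go to HELD once the watchdog has expired."""
    return "HELD" if watchdog.expired(now) else "RUNNING"


if __name__ == "__main__":
    wd = HeartbeatWatchdog(timeout_s=10.0)  # timeout value is an assumption
    wd.heartbeat(now=0.0)
    print(phase_state(wd, now=5.0))   # heartbeat still fresh
    print(phase_state(wd, now=10.5))  # heartbeat older than the timeout
```

The point of the sketch is that a delayed heartbeat is enough to trip the watchdog, even if the link was never fully lost.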
Youssef.El-Bahtimy
- 18 Apr 2014 4:13 AM
In reply to Ashish P:

You mentioned that you use a lot of dynamic referencing. I have had experience with modules that have dynamic references embedded in phases causing problems, because the module execution would start before the references were bound.

Perhaps take a look at networking during phase loading to see whether utilization spikes enough, due to excessive dynamic reference binding, to interfere with the batch executive's watchdog function.
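The general guard against that kind of problem is to hold off the logic until every reference reports as bound. A minimal sketch of the idea, with hypothetical names throughout (this is plain Python for illustration, not DeltaV expression syntax or any DeltaV API):

```python
def all_references_bound(reference_status):
    """reference_status maps a reference name to True once it is bound."""
    return all(reference_status.values())


def run_module_scan(reference_status, logic):
    """Run `logic` for this scan only if every dynamic reference is bound.

    Returns None when the scan is skipped, so the caller can retry on the
    next scan instead of acting on unresolved references.
    """
    if not all_references_bound(reference_status):
        return None
    return logic()


if __name__ == "__main__":
    # Hypothetical reference names, for illustration only.
    status = {"VALVE_REF": True, "PUMP_REF": False}
    print(run_module_scan(status, lambda: "commands issued"))  # skipped
    status["PUMP_REF"] = True
    print(run_module_scan(status, lambda: "commands issued"))  # runs
```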

I would look into whether you can modify the watchdog time out. For soft phases I believe you can, but I don't think that applies to controller phases.
Ashish P
- 22 Apr 2014 7:32 AM
In reply to Youssef.El-Bahtimy:

Dear Youssef,

You mentioned that modules with dynamic references embedded in phases can cause problems because the module execution starts before the references are bound. However, this problem is observed only in one particular phase; not all the phases cause the PLM watchdog failure.

Does controller memory fragmentation have any impact on watchdog failure?
Ashish P
- 2 Jun 2014 3:22 AM
In reply to Ashish P:

The client considered a hardware upgrade as the way to solve this problem, and will now replace all the existing controllers with MX controllers.