
DeltaV Communication Issue

Hi,

We have 10 controllers (MX, v11.3.1), 10 operator stations and 4 application stations on a DeltaV network with DeltaV smart switches (RM100).

All controllers, operator stations and application stations generate random events of the form

"Switching to primary ACN network <controller / operator station / application station name>" or "Switching to secondary ACN network <controller / operator station / application station name>".

e.g. Switching to primary ACN network ES.

A few seconds later it generates another event:

e.g. Switching to secondary ACN network ES.

Events like these keep being generated for all controllers, operator stations and application stations.

Please suggest a solution.

For reference, I have attached a photo of the events being generated.

12 Replies

  • Hi,
    Check Diagnostics -> Network for more information (e.g. collisions), and also check the switch diagnostics. Did the error start by itself, or did you change something?
    Is the controller C3 failure related to the error?
    It sounds like you have a broken switch...

    Niklas Flykt 

    Klinkmann Oy

    Key Account Manager safety products

    nikfly@gmail.com

  • What did the problem end up being? We are experiencing a similar issue.
  • In reply to jwiersma:

    You probably want to pull up Emerson KBA NK-1200-0277, "Understanding Area Control Network (ACN) Switchover Events".
    A number of things can cause ACN switchovers. A few common ones:
    - Cabling
    - Grounding
    - Noise
    - Bad / mis-configured / unsupported switch hardware

    In the Event Chronicle, the Desc2 column tells you which destination node the source was attempting to communicate with when the switchover occurred. If it is usually the same destination node, start with the network cables to that node. If most nodes show up, look at the switches and trunk cables.
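    As a rough illustration of that check, here is a minimal Python sketch. It assumes the Event Chronicle has been exported to CSV and that the Desc2 column holds just the destination node name; the export format, the other column names and the sample rows are hypothetical.

```python
import csv
import io
from collections import Counter


def count_destination_nodes(csv_text, column="Desc2"):
    """Count how often each destination node appears in the given column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[column].strip() for row in reader if row.get(column))


# Hypothetical export; real column names and contents may differ.
sample = """Node,Desc1,Desc2
CTRL-01,Switching to secondary ACN network,ES
CTRL-02,Switching to secondary ACN network,ES
OP-01,Switching to primary ACN network,ES
OP-01,Switching to secondary ACN network,CTRL-03
"""

counts = count_destination_nodes(sample)
for node, n in counts.most_common():
    print(node, n)
```

    If one node dominates the count, its cables and switch port are the first place to look. If the counts are spread across most nodes, the switches and trunk cables are the more likely suspects.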
  • So what causes an ACN switch over?

    DeltaV uses a redundant network, two completely separate networks. All the data management on these networks is handled by the end devices, so the networks do not make any decision as to which is used or not.

    Each DeltaV node communicates with other nodes using unicast messages over the UDP transport protocol. A UDP packet is highly efficient. The DeltaV communication layer takes care of confirming that data is received and has timeouts built in for retries. By preference, all DeltaV "process data" uses the primary network. Data is sent to each destination in separate packets. When a packet is sent, a response is expected. If a response is not received within the timeout period, the parameter values are resent (using the latest available value). After several retries, if no confirmation is received, the network between that node and the destination is marked bad and the packet is sent on the secondary. This is called an ACN switchover.
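    The logic described above can be sketched roughly as follows. This is only an illustration of the per-link retry and switchover idea, not DeltaV's actual implementation: the `Link` class, the `deliver` callback and the attempt count of 3 are assumptions made for the example.

```python
from dataclasses import dataclass

MAX_ATTEMPTS = 3  # assumed value for illustration


@dataclass
class Link:
    """State of one source-to-destination link (illustrative only)."""
    destination: str
    active: str = "primary"


def send(link, deliver, max_attempts=MAX_ATTEMPTS):
    """Send on the active network; return an event string on switchover.

    `deliver(network, destination)` is a hypothetical callback that returns
    True when a confirmation arrives before the timeout.
    """
    for _ in range(max_attempts):
        if deliver(link.active, link.destination):
            return None  # confirmed, nothing to report
    if link.active == "primary":
        # No confirmation after all attempts: mark the primary path bad.
        # Later packets for this destination go out on the secondary.
        link.active = "secondary"
        return f"Switching to secondary ACN network {link.destination}"
    return None


def probe_primary(link, deliver):
    """Diagnostic check on the primary; switch back when it answers again."""
    if link.active == "secondary" and deliver("primary", link.destination):
        link.active = "primary"
        return f"Switching to primary ACN network {link.destination}"
    return None


# Simulate a primary cable that is unplugged and later restored.
primary_up = {"ok": False}


def deliver(network, destination):
    return network == "secondary" or primary_up["ok"]


link = Link("ES")
event1 = send(link, deliver)
primary_up["ok"] = True
event2 = probe_primary(link, deliver)
print(event1)
print(event2)
```

    Note that the state is kept per destination, not per node, which is why a single fault can produce one event for every link that crosses it.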

    The source node does not know how or why the packet is not confirmed, just that it is not. So, if you disconnect the primary cable from a console, all nodes sending data to that console will experience an ACN Switchover to the Secondary. Diagnostic messages are still attempted to the console on the primary so that when it returns, an ACN switchover to the Primary will occur.

    If an intermediary switch is turned off, all traffic through the switch will stop and any affected nodes will report an ACN switchover between it and its destinations. As you can see, one event can generate a significant number of ACN Switchovers. If this point of failure is intermittent, say a bad cable, improperly seated connector, bad Port on a switch, bad NIC in a computer, and traffic is intermittently getting through, the ACN messages can toggle. If it is a hard failure, you will get the switchover to the secondary and it will stay there till the problem is addressed.
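    To tell a toggling (intermittent) fault from a hard failure in a list of events, something like the sketch below could be used. It assumes the events have already been reduced to time-ordered `(node, direction)` tuples; the labels and the sample data are made up for illustration.

```python
from collections import defaultdict


def classify_links(events):
    """Label each node by its switchover pattern.

    `events` is a time-ordered iterable of (node, direction) tuples, where
    direction is "primary" or "secondary" (the network switched to).
    """
    history = defaultdict(list)
    for node, direction in events:
        history[node].append(direction)

    labels = {}
    for node, directions in history.items():
        if directions.count("secondary") >= 2:
            labels[node] = "toggling"            # repeated drops: intermittent fault
        elif directions[-1] == "secondary":
            labels[node] = "stuck on secondary"  # hard failure, not yet fixed
        else:
            labels[node] = "recovered"           # single outage, back on primary
    return labels


# Hypothetical event sequence.
events = [
    ("ES", "secondary"), ("ES", "primary"),
    ("ES", "secondary"), ("ES", "primary"),
    ("CTRL-03", "secondary"),
    ("OP-02", "secondary"), ("OP-02", "primary"),
]

labels = classify_links(events)
for node, label in labels.items():
    print(node, label)
```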

    To isolate the problem, you can look to see if there is a common device involved. One device has lost connection on the Primary to all others, and all others have lost connection to one device. This will point to that device's primary port, network cable or Smartswitch Port. If the Switch shows issues on that port, it could be the switch, and changing ports can confirm this.
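    A small sketch of the "common device" check, assuming each switchover event has been reduced to a `(source, destination)` pair. The node names are invented for the example.

```python
def common_nodes(pairs):
    """Return the nodes that appear in every (source, destination) pair."""
    pairs = list(pairs)
    if not pairs:
        return set()
    common = set(pairs[0])
    for source, destination in pairs[1:]:
        common &= {source, destination}
    return common


# Every event involves ES, so its primary port, cable or switch port is suspect.
single_device = common_nodes([("CTRL-01", "ES"), ("ES", "OP-01"), ("CTRL-02", "ES")])

# No node is common to all events, so look for a shared switch or trunk instead.
no_device = common_nodes([("CTRL-01", "OP-01"), ("CTRL-02", "OP-02")])

print(single_device)
print(no_device)
```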

    If there are multiple devices that lose connection to all devices, one has to look for a common point in the network that sees all affected traffic.

    Some say that a normal, healthy DeltaV network will still see some ACN switchover events and that a few events per day are of no concern. But every ACN switchover is the result of a failure to receive confirmation of receipt of a DeltaV packet three successive times, resulting in a timeout and switchover. Early on, when DeltaV used hubs and common collision domains, this was more likely because a 10 Mbps half-duplex network could become saturated. Today, all Smart Switches provide a separate full-duplex connection to each controller and IO card, virtually eliminating collisions. 100 Mbps bandwidth at full duplex eliminates network pinch points. What is left is either corrupted data packets due to EMI noise, damaged cables, bad connections, bad network ports, high dB loss on fiber or bad NICs at the devices, or a software issue that prevents a device from responding. Other normal causes of ACN messages are activities such as rebooting a console or cycling power on a switch. If these events occur, the ACN messages are expected and can be ignored. The system will stabilize once the nodes are restored.

    In this case, sorting the Event Chronicle messages by Source device, you are able to determine if there is one common device or a common network point. If a cable is disconnected, and ACN messages stop because of the hard fault, it is likely that cable, and restoring connection with a good cable should result in one last ACN flurry as things switch back to Primary.

    Using the Smart Switch Command Center, check whether the switches involved are reporting any error counts, such as bad packets. Remember, if the packet is corrupt, it cannot be delivered and a confirmation is impossible. If the confirmation is corrupt, it cannot get back to the source.

    If a console has a cyber security threat that is disrupting network traffic, this too could result in lost confirmation packets. The latest Smart Switch firmware limits broadcast and multicast storms, but the affected Computer may still have issues.

    Hope this is useful information.

    Andre Dicaire

  • In reply to Andre Dicaire:

    Hello Everybody!

    I had a similar issue with DeltaV communications. All controllers connected to the same Hirschmann switch experienced a Device Connection Failure with all workstations. It is important to know that I have a DeltaV Firewall-IPD on the Hirschmann side (primary and secondary). It looks like the nodes tried to switch to the redundant network but were not successful, and all controllers lost communication with the operator workstations for about 1 minute (graphics data went to magenta). Would you have any idea what could cause the issue? The primary and secondary switches have independent power sources, as do the firewalls, and each firewall is connected to a different workstation switch (primary and secondary Cisco 2950). I wonder if the firewall firmware could affect the system. I would appreciate your comments very much. Thanks, Maria Isabel.
  • In reply to Maria Portillo:

    Maria Isabel,

    my advice is: please log a call with our Global Support Center (GSC) if you haven't opened one yet. It will be much easier if a technical expert helps you with this troubleshooting effort as you have many components that could be causing the issue.

    For clarity, we never released a firmware upgrade for DeltaV Firewall-IPDs since its launch in 2016 (KBA # NK-1600-0290 has additional information about this product). Therefore firmware versions cannot be the issue as you have one option available for this device. One thing to note about Firewall-IPDs is that control communications (CIOCs to Controllers, or Controller to Controller) should not traverse the firewalls.

    Our communications default to the primary network, and only go to the secondary when the primary route is not available. This is not done at the node level, but instead per communication link. A complete network switchover is likely to be caused by a complete segment (primary or secondary) issue.

    The issue could be caused by the network equipment or by a storm of data generated by many different conditions, including loops. You also have equipment that went EoL a while ago, so support may be limited (for example, Cisco 2950 switches were never tested with Firewall-IPDs). Sorry for not helping much, but in my opinion a call with GSC is the most appropriate route for dealing with such network conditions.

    I hope this helps!

    Regards,

    Alexandre Peixoto
  • In reply to Alexandre Peixoto:

    Alexandre,

    Thanks so much for your comments. You are right about the firewall not affecting control communications on the controller side. When we experienced the issue, all controllers kept running and talking properly, but the controllers had no communication with the workstations. We will check the logs with Emerson next week, but to me it is very unusual to lose communication with both Hirschmann switches, so redundancy failed or something didn't work properly. They don't have a power source or networking in common. My suspicion is that it is something related to the firewall: it was not happy with some data and just lost communication. I am probably wrong, but it was a weird issue that lasted about 1 minute. Thanks again for your comments.
  • In reply to Maria Portillo:

    Maria, It is weird to lose both networks at the same time. But your issue is different than the initial call and so we have nothing to review. The event journal records shown in the original post show that a node called ES is involved in a series of ACN switchover messages. In that case, it looks like that node has lost primary communication with all other nodes, and for each connection you get a message from each node, provided the secondary network is working.

    If the secondary network is somehow not working, a loss of the primary would result in total loss of communications between nodes sharing that path. So we have to confirm if we lost the secondary network prior to the primary network event. As Alex points out, when available, DeltaV uses the primary network. Diagnostic messages are passed on the secondary to confirm its health, and any issues there would be logged in the Event Journal, indicating affected nodes. Remove the secondary network on a Workstation and review the resulting events to understand the expected behavior.

    Some say that even on a healthy network, you can see some ACN errors. I'm of the opinion that there is always an underlying cause of an ACN switchover, and that these can occur normally, such as when rebooting a workstation, doing a total download to a controller, or even during a controller switchover. DeltaV communications are based on unicast traffic, where data from one node is sent explicitly to all other nodes that request data. Loss of communication caused by a switch or faulty cable would affect all nodes that share that path. Loss of communication between two specific nodes is likely the result of an event on one or the other of the nodes, like a reboot, or some overload condition that results in timeouts at the communication layer. The root cause of the ACN switchover is a timeout in responses. The switch back to the primary happens once successful communication is seen on the primary.

    But you should most definitely log a call with the GSC. Good luck.

    Andre Dicaire

  • In reply to Andre Dicaire:

    I have the same issue with the ACN switching over between primary and secondary, followed by an I/O failure of remote CHARMs for about 0.001 second, which then recovered. Later I found that one of my Emerson Smart Switches in the primary network is commissioned but shows a red cross in the DeltaV Smart Switch Command Center. I've checked the LEDs on this switch and they are running and healthy. I've tried to use telnet to log in but don't know the user name and password. My questions are: what happened to my smart switch, and how can I fix it? Is it possible that the ACN switchover is causing the remote CHARM I/O failure alarm? I'm using DeltaV v. 13.3.1.

    Thank you in advance.

  • In reply to Jitomit:

    Jitomit, an ACN switchover is not an issue. It is a symptom, and there are many different reasons an ACN switchover is recorded. Your situation does not seem at all similar to the original question. I would also suggest you log this with the GSC.

    Andre Dicaire

  • In reply to Andre Dicaire:

    Andre Dicaire,
    Yes, I'll try to log this issue with GSC. But one more question. Following the instructions in the previous reply, I've found that all switchovers involve only one SQ controller (as source or destination of the switchover alarm), followed by an I/O failure alarm for signals from this SQ controller. How can I identify which part or controller has the problem, if the I/O failure appears and then disappears after 0.0011 second?
  • In reply to Jitomit:

    It would be preferred to have this in a different Thread as this has nothing to do with original question. Open a new discussion based on the I/O Failure and we can discuss this in more detail.

    Andre Dicaire