Virtualization recovery time for HA cluster

How can I determine the recovery time in case of server failure for a DeltaV virtualization High Availability (HA) cluster configuration? Is there any calculation tool available? Any advice on how to judge this is welcome.

  • For HA failover within a cluster, Microsoft would be the better data source, as the functionality is based on their technology.

    For HA failover across clusters (i.e., starting a replica VM), that is a cold failover. Recovery time depends on how long it takes a user to notice the failure and start the replica manually.

    Most MTTR requirements I've worked with recently specify a maximum of 4 hours until the process can be restarted.
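    For the cold-failover case, the manual start maps to a short Hyper-V PowerShell sequence on the replica host. A minimal sketch, assuming Hyper-V Replica is already configured; the VM name 'DVAPP01' is a placeholder, and the wall-clock MTTR is dominated by detection and decision time, not by these commands:

    ```powershell
    # Run on the Replica host. 'DVAPP01' is a placeholder VM name.

    # Check replication state and health for the replica VMs on this host
    Get-VMReplication | Format-Table VMName, State, Health, Mode, PrimaryServer

    # Unplanned failover: stage the most recent recovery point, then boot it
    Start-VMFailover -VMName 'DVAPP01'
    Start-VM -Name 'DVAPP01'

    # After confirming the VM is healthy, commit the failover
    # (this discards the remaining recovery points)
    Complete-VMFailover -VMName 'DVAPP01'
    ```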
  • In reply to Ray Emerson:

    What would be the time frame if a recovery server is added to the HA cluster? If servers 1 and 2 in the configuration fail and there is a 3rd server that acts as a recovery server, can the VM be shifted automatically to avoid operational downtime? Will the transfer be automatic or manually triggered? The requirements are for an upstream application.
  • In reply to Prasenjit Dasgupta:

    I think you'd benefit from looking into Microsoft Hyper-V Failover Clustering. There are many resources available that will give you the details you need to generate your design.

    DeltaV Virtual Studio uses a subset of Failover Clustering.
    - HA: VMs may be configured for automated failover within the same logical cluster. The VM reboots after the failover.
    - Replication/Recovery: The recovery image must be started manually.
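    To see how those two behaviors look from the FailoverClusters PowerShell module that Virtual Studio builds on, here is a minimal sketch; the cluster, node, and VM names are placeholders, and an unplanned node failure simply cold-boots the VM on a surviving node (the reboot noted above):

    ```powershell
    Import-Module FailoverClusters

    # Which nodes are up, and which node currently owns each VM role?
    Get-ClusterNode -Cluster 'DVCLUSTER1'
    Get-ClusterGroup -Cluster 'DVCLUSTER1' |
        Where-Object GroupType -eq 'VirtualMachine' |
        Format-Table Name, OwnerNode, State

    # Restrict which hosts a VM may fail over to (preferred owners)
    Set-ClusterOwnerNode -Group 'DVAPP01' -Owners 'NODE1','NODE2'

    # Planned move (live migration) to verify the failover path works
    Move-ClusterVirtualMachineRole -Name 'DVAPP01' -Node 'NODE2'
    ```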
  • In reply to Ray Emerson:

    Thanks Ray, I shall look it up.
  • In reply to Ray Emerson:

    As a non-virtual site with 3 servers and 24 stations, we are having a hard time justifying virtualization, primarily because of the additional licensing costs, the hardware sizing necessary to have full dual functionality, the “one more thing to support” factor, and the “I can replace my current hardware numerous times for what this costs, since I still have to update my Vertex hardware” argument. Where is the ROI? For those of you who are virtual, help me “drink the Kool-Aid”: what are the benefits vs. the added cost? What do you love and hate about it? And lastly, do you feel like you are saving time/money by being virtual?

  • In reply to Jason.Brumfield:

    I second this ... my site is a little larger, and we are primarily batch, so we have redundancy for the Batch Executive, which, 2 releases after we paid for it, finally works as originally advertised. I looked at virtualization vs. physical and could not make the virtualization case pay out. The best case I can make for virtualization is if you have a software configuration that is no longer supported by current Intel hardware; in that case, virtualization can keep your old OS/system working on modern silicon that would otherwise be unsupported. I also greatly prefer VMware ESXi to Microsoft Hyper-V.
  • In reply to Jason.Brumfield:

    Jason - Having had this discussion MANY times in the last 10 years, the short answer is "It depends where you're coming from, and where you want to go." I don't know your plant, so here are a couple of cases from my own experience.

    Example 1: Large system, FDA-regulated. Two DeltaV systems with 20 physical servers and 65 operator terminals (multiple generations of OptiPlexes). They opted to virtualize PART of these systems in 2016, including half the servers and 50 of the terminals, replacing them with multiple terminal servers tying into thin clients. They virtualized the remaining hardware in 2018 as part of a shutdown, which also included moving half the infrastructure to a different building on the same campus.

    The original drivers? Operator terminals had short lifecycles and couldn't be washed down. Environmental (heating/cooling) in the datacenter for all the servers was getting more expensive. The directive was to save money on utilities.

    What additionally came out of it was that the site was able to meet its disaster recovery metric (return to operations in 4 hrs or less) for the first time in almost 20 years.

    Example 1a: We completed a major version upgrade in late 2018 on this same hardware, including an OS change at the hypervisor level to support Server 2016 in the VMs (thanks, M$!). Having upgraded this site multiple times in the past, I knew it would take 2 weeks of 24-hour coverage to complete.

    I got 4 days, so I required that we run the plant on half the infrastructure for 3 months while I completed the upgrade on the other half. We did not get a code freeze, and another project was in charge of tracking all the changes in the live system to re-integrate into the upgraded system. We also didn't get a domain freeze, so reconciling users and passwords was also needed.

    _BUT_ we cut over both systems and were ready to return to operations in 3 days.

    Example 2: Very small system (ProPLUS, 1 App, 4 operator terminals, 4 controllers), BUT running a critical utility feeding 26 points of use on civilian and military property with 20,000+ end users. 90% of the control and switching hardware and 100% of the workstation interfaces lived in one room.

    Anecdotally, this was originally a Micro-PROVOX install that we migrated to DVOP back in 2007, then to DeltaV migration controllers in 2010. It was virtualized in 2014, and they were still running on PROVOX I/O until 2016.

    My single design criterion: If a bomb hits the control room, the critical utility must remain operable from a different location.

    We designed in replication between multiple physical locations (a rough sketch of that setup follows at the end of this post). We also added view-only thin clients in additional buildings for increased accessibility and disaster recovery capabilities, and we are currently planning an upgrade with migration to DeltaV Live, plus integration of one of their other legacy physical systems into the same infrastructure.

    In this new project, we ARE replacing the hardware, but only because of the customer requirement to replace everything that spins every 5 years.
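    For reference, the site-to-site replication design in Example 2 looks roughly like the following with Hyper-V Replica; the host names, VM name, and storage path are placeholders, and your Virtual Studio release may drive some of this for you:

    ```powershell
    # On the recovery-site host: accept inbound replication
    Set-VMReplicationServer -ReplicationEnabled $true `
        -AllowedAuthenticationType Kerberos `
        -ReplicationAllowedFromAnyServer $true `
        -DefaultStorageLocation 'D:\Replica'

    # On the primary-site host: replicate the VM to the other building,
    # sending a recovery point every 5 minutes
    Enable-VMReplication -VMName 'DVPRO01' `
        -ReplicaServerName 'HV-BLDG2' -ReplicaServerPort 80 `
        -AuthenticationType Kerberos -ReplicationFrequencySec 300

    Start-VMInitialReplication -VMName 'DVPRO01'
    ```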
  • In reply to Pat Grider:

    I've spent a lot of years with both VMware and Hyper-V now, and I agree.

    Hyper-V can do most of the things that ESXi can, but it can be painful. Depending on what you're doing, you need four applications (PowerShell, DeltaV Virtual Studio, Hyper-V Manager, Failover Cluster Manager) to do it all; see the sketch at the end of this post.

    At the end of the day, however, ANY virtualization is a good answer, because it can be made to do (almost) whatever you want it to. The challenge is that you have to choose your ROI metric (reduced server footprint, increased system availability, fewer assets to replace, etc.) and THEN design the system to that. The ROI comes from how you put the pieces together, and (as in my post above) there are a lot of ancillary things that come later on.
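    As an illustration of the tooling point, here is a minimal sketch (the cluster name is a placeholder) of the kind of health check that can be folded into a single PowerShell pass instead of separate clicks through Hyper-V Manager and Failover Cluster Manager:

    ```powershell
    Import-Module Hyper-V, FailoverClusters

    $nodes = Get-ClusterNode -Cluster 'DVCLUSTER1'

    # Per-host VM state, gathered via the Hyper-V module
    foreach ($node in $nodes) {
        Get-VM -ComputerName $node.Name |
            Select-Object @{n='Host';e={$node.Name}}, Name, State, Uptime
    }

    # Replication health for the recovery copies, in the same session
    Get-VMReplication -ComputerName $nodes.Name |
        Format-Table VMName, Mode, State, Health
    ```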