Microsoft Hyper-V 2012 High Availability and Live Migration

I’m about 75% of the way through the Introduction to Hyper-V Jump Start in the Microsoft Virtual Academy. There is a lot of great information in the videos and .ppt downloads. Below are some highlights from the material I am less familiar with: high availability and clustering. Microsoft Hyper-V in Server 2012 provides “Complete Redundancy In the Box.”

Virtualization can cause problems: if you take 10 or 20 servers and virtualize them all onto a single piece of hardware (your host server), you’ve created a single point of failure. If that single host goes down, you lose the whole workload. Because Microsoft realizes that the value of these hosts rises sharply as more workloads are consolidated onto them, they’ve worked hard to create complete redundancy and protection in all core services, from the bottom of the stack to the top. Below are the five levels, from the physical to the virtual, where Server 2012 can protect your data.

1. Hardware Fault:

  • Windows Hardware Error Architecture (WHEA)
  • Reliability, Availability, Serviceability (RAS)

Server 2012 provides RAS hardware fault detection capabilities. For example, if a memory controller detects that an address is failing, Hyper-V is notified and the affected address space is taken offline. Information about the hardware fault is written to the BCD (Boot Configuration Data) store so it survives reboots, and the server never allocates that area of memory again. This is a nice feature that isolates hardware errors without user intervention.
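To make the "taken offline and remembered across reboots" idea concrete, here is a minimal Python sketch. It is only a toy model under my own assumptions: the PageAllocator class, the JSON file standing in for the persistent store, and the page numbers are all hypothetical and are not how Windows or Hyper-V actually implement this.

```python
import json
import os
import tempfile


class PageAllocator:
    """Toy allocator that never hands out pages recorded as bad (hypothetical)."""

    def __init__(self, total_pages, store_path):
        self.total_pages = total_pages
        self.store_path = store_path  # JSON file standing in for a persistent store
        self.bad_pages = self._load_bad_pages()
        self.allocated = set()

    def _load_bad_pages(self):
        # Reload the bad-page list so it survives a simulated reboot.
        if os.path.exists(self.store_path):
            with open(self.store_path) as f:
                return set(json.load(f))
        return set()

    def report_fault(self, page):
        # Take the page offline and persist that fact to disk.
        self.bad_pages.add(page)
        self.allocated.discard(page)
        with open(self.store_path, "w") as f:
            json.dump(sorted(self.bad_pages), f)

    def allocate(self):
        # Return the lowest-numbered free page that is not marked bad.
        for page in range(self.total_pages):
            if page not in self.bad_pages and page not in self.allocated:
                self.allocated.add(page)
                return page
        raise MemoryError("no healthy pages left")


def demo():
    store = os.path.join(tempfile.mkdtemp(), "bad_pages.json")
    first_boot = PageAllocator(total_pages=4, store_path=store)
    first_boot.report_fault(0)
    # Simulated reboot: a fresh allocator reads the persisted list.
    second_boot = PageAllocator(total_pages=4, store_path=store)
    return [second_boot.allocate() for _ in range(3)]
```

Calling demo() allocates three pages after the simulated reboot, and page 0 is skipped because it was recorded as bad before the restart.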

2. Physical Node Redundancy:

  • Live Migration for Planned Downtime
  • Failover Cluster for Unplanned Downtime

If a server goes down unexpectedly, we want the workloads to fail over without any user intervention. For planned downtime, we can live-migrate the workloads to another host, perform maintenance on the original host, then live-migrate the workloads back to the repaired host. Hyper-V 2012 provides this physical node redundancy.
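The planned-downtime flow (move everything off a host, service it, move things back) can be sketched in a few lines of Python. This is an illustration only: drain_host and the host/VM names are hypothetical, and the "fewest VMs" placement rule is my own assumption, not a description of how Hyper-V chooses a migration target.

```python
def drain_host(placement, host):
    """Return a new placement with every VM moved off `host`.

    `placement` maps host name -> list of VM names. Each VM goes to the
    remaining host that currently runs the fewest VMs (an assumed policy).
    """
    others = {h: list(vms) for h, vms in placement.items() if h != host}
    if not others:
        raise ValueError("need at least one other host to migrate to")
    for vm in placement[host]:
        # Pick the least-loaded remaining host for this VM.
        target = min(others, key=lambda h: len(others[h]))
        others[target].append(vm)
    others[host] = []  # the drained host is now empty and safe to service
    return others


before = {"hostA": ["vm1", "vm2", "vm3"], "hostB": ["vm4"], "hostC": []}
after = drain_host(before, "hostA")
```

After maintenance, the same function could be pointed at the other hosts to move selected VMs back; the original placement dictionary is left untouched so it can serve as the record of where things used to run.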

3. I/O Redundancy:

  • Network Load Balancing & Failover via Windows NIC Teaming
  • Storage Multi-Path I/O (MPIO)
  • SMB Multichannel (Server 2012 file server)

Storage Multi-Path I/O provides redundancy for storage connections such as iSCSI and Fibre Channel, at both the host level and the guest level. If you’re using a Server 2012 file server as back-end storage, you get redundancy here as well.

4. Application/Service Failover:

  • Non-Cluster Aware Apps: Hyper-V App Monitoring
  • VM Guest Cluster: iSCSI, Fibre Channel
  • VM Guest Teaming of SR-IOV NICs

If you’re running VMs and want to provide failover for applications inside the virtual machine, cluster-aware applications can already be clustered. For non-cluster-aware (legacy) apps, Server 2012 provides App Monitoring, which performs light health monitoring, such as automatically restarting processes and sending notifications.

5. Disaster Recovery:

  • Hyper-V Replica for Asynchronous Replication
  • CSV 2.0 Integration with Storage Arrays for Synchronous Replication

If you want to fail over to another site completely using Cluster Shared Volumes, you need redundancy: two of everything. All nodes simultaneously monitor each other through a heartbeat network. Every node keeps track of every other node in the cluster (its states and properties) in a registry database. If a node crashes, all the other nodes know which workloads and VMs were running on the crashed node. A surviving node connects to the appropriate VHDs on the network and picks up that workload. There are two types of clustering that we can consider: Host Clustering and Guest Clustering.

Host Clustering: The most common type of clustering, where we cluster the physical servers and can move apps and VMs between them.

  • Avoids a single point of failure when consolidating
  • VMs can survive a host crash because the VM is restarted on another node. VMs can also be restarted on the same node when the VM OS crashes or hangs.
  • Zero-downtime maintenance and patching (live-migrate VMs to other hosts)
  • Mobility and load distribution: live-migrate VMs to different servers to load balance.
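The heartbeat monitoring and restart-on-another-node behavior described above can be modeled with a small Python sketch. Everything here is hypothetical: the Cluster class, the three-second timeout, the node and VM names, and the least-loaded placement rule are my own assumptions for illustration, not the actual Windows Failover Clustering logic.

```python
class Cluster:
    """Toy model: node health and workloads are tracked in shared tables."""

    def __init__(self, nodes, timeout=3.0):
        self.timeout = timeout                    # seconds allowed without a heartbeat
        self.last_seen = {n: 0.0 for n in nodes}  # node -> time of last heartbeat
        self.workloads = {n: [] for n in nodes}   # node -> VMs it is running

    def place(self, node, vm):
        self.workloads[node].append(vm)

    def heartbeat(self, node, now):
        self.last_seen[node] = now

    def failed_nodes(self, now):
        # A node is considered failed once its heartbeat is older than the timeout.
        return [n for n, t in self.last_seen.items() if now - t > self.timeout]

    def fail_over(self, now):
        """Move VMs from nodes that missed heartbeats to surviving nodes."""
        dead = self.failed_nodes(now)
        alive = [n for n in self.last_seen if n not in dead]
        if not alive:
            raise RuntimeError("no surviving nodes")
        moved = {}
        for node in dead:
            for vm in self.workloads[node]:
                # Restart the VM on the surviving node with the fewest VMs.
                target = min(alive, key=lambda n: len(self.workloads[n]))
                self.workloads[target].append(vm)
                moved[vm] = target
            self.workloads[node] = []
        return moved


def demo():
    cluster = Cluster(["node1", "node2", "node3"])
    cluster.place("node1", "sql-vm")
    cluster.place("node1", "web-vm")
    cluster.place("node2", "file-vm")
    # node2 and node3 keep sending heartbeats; node1 goes silent.
    cluster.heartbeat("node2", now=5.0)
    cluster.heartbeat("node3", now=5.0)
    return cluster.fail_over(now=5.0)
```

In demo(), node1 stops sending heartbeats, so its two VMs are spread across node2 and node3. Because every node's workload list is kept in the shared table, the survivors know exactly what needs to be restarted.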

Guest Cluster: Two virtual machines running Windows Server form a cluster themselves for high availability. If one of the two needs to be patched, we can fail over to the other VM. If one of the two crashes, the workload (for example, a SQL Server instance) fails over to the surviving VM. The storage options for Guest Clusters differ from those for physical clusters because they require virtualized HBAs: Virtual Fibre Channel, Fibre Channel over Ethernet, and iSCSI (not Serial Attached SCSI).

Combining Host and Guest Clustering:

It’s recommended to combine Host and Guest clustering for flexibility and protection. You can combine them for all VMs as long as the VMs pass the cluster’s Best Practices Analyzer, known as “Validate”.

Kudos to Microsoft Virtual Academy, Symon Perriman, and Jeff Woolsey