Introduction and Configuration of HA (High Availability) in VMware vSphere 5.0
November 11, 2011
What is VMware HA:
VMware High Availability (HA) provides a simple and cost-effective clustering solution to increase uptime for virtual machines. HA uses a heartbeat mechanism to detect a host or virtual machine failure. In the event of a host failure, affected virtual machines are automatically restarted on other production hosts within the cluster that have spare capacity. In the case of a failure caused by the guest OS, HA restarts the failed virtual machine on the same host. This feature is called VM Monitoring, but it is sometimes also referred to as VM HA.
How Does VMware HA Work?
VMware HA continuously monitors all virtualized servers in a resource pool and detects physical server and operating system failures. To monitor physical servers, an agent on each server maintains a heartbeat with the other servers in the resource pool such that a loss of heartbeat automatically initiates the restart of all affected virtual machines on other servers in the resource pool.
VMware HA leverages shared storage and, for Fibre Channel and iSCSI SAN storage, the VMware vStorage Virtual Machine File System (VMFS) to enable the other servers in the resource pool to safely access the virtual machine for failover. When used with VMware Distributed Resource Scheduler (DRS), VMware HA automates the optimal placement of virtual machines on other servers in the resource pool after a server failure.
To monitor operating system failures, VMware HA monitors heartbeat information provided by the VMware Tools package installed in each virtual machine in the VMware HA cluster. Failures are detected when no heartbeat is received from a given virtual machine within a user-specified time interval.
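The VM Monitoring check described above comes down to a timeout comparison. The following is a minimal Python sketch of that idea, not VMware's implementation. The names VmHeartbeat and find_failed_vms, the timestamps, and the 30-second interval are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class VmHeartbeat:
    """Hypothetical record of the last VMware Tools heartbeat seen for a VM."""
    name: str
    last_heartbeat: float  # timestamp in seconds


def find_failed_vms(vms, now, failure_interval):
    """Return the names of VMs with no heartbeat within failure_interval seconds."""
    return [vm.name for vm in vms if now - vm.last_heartbeat > failure_interval]


# Assumed sample data: web01 reported 5 seconds ago, db01 60 seconds ago.
vms = [VmHeartbeat("web01", 95.0), VmHeartbeat("db01", 40.0)]
failed = find_failed_vms(vms, now=100.0, failure_interval=30.0)
print(failed)
```

With a 30-second interval, only the VM that has been silent for 60 seconds is flagged as failed.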
VMware HA ensures that sufficient resources are available in the resource pool at all times to be able to restart virtual machines on different physical servers in the event of server failure. VMware HA is easily configured for a resource pool through VMware vCenter™ Server.
How is VMware HA Used?
Ensuring the availability of virtual machines within an environment is of paramount concern to administrators. VMware HA alleviates these concerns by providing protection from failures within the following three key layers:
• The infrastructure layer
At this layer, VMware HA monitors the health of the virtual machine and will attempt to restart the virtual machine when a failure, such as the loss of a physical host, occurs. This protection is independent of the OS used within the virtual machine.
• The OS layer
Through the use of VMware Tools installed within the OS, VMware HA can monitor the OS for proper operation. This protects against such failures as an unresponsive OS.
• The application layer
With some customization or with a third-party tool, an administrator can also monitor the application running within the OS for proper operation. In the event of a failure of the application, HA can be triggered to restart the virtual machine hosting the application.
In this section, you will learn how to enable, configure, and test the operation of HA to provide basic high availability services for your virtual machines at the infrastructure layer.
Before continuing, it is important that the environment be configured properly with the following:
Ensure that you have a working management network with all hosts in the environment.
Verify that all of the virtual machines are online.
Have at least one virtual machine running on each host.
Validate that you have access to VMware vCenter™ utilizing the vSphere Client.
Provide shared storage for the virtual machines (NFS, SAN, or iSCSI).
Enabling HA is a straightforward process that simply entails editing the properties for the cluster. The following steps will guide you through this process.
Step 1 – Connect to vCenter Server
Step 2 – Go to Cluster Summary
Step 3 – Turn on HA
In the cluster summary screen, select the Edit Settings option. This will bring up a wizard that you can use to modify the settings of the cluster. Click the check box next to Turn On vSphere HA and select OK. This will close the wizard and the system will initialize VMware HA.
Under the Recent Tasks pane of the vSphere Client, you can observe the progress of the initialization of HA on the systems within the cluster. You’ll notice that the configuration tasks occur in parallel among all the hosts within the cluster.
Step 4 – Verifying VMware HA Enablement
After enabling HA, you will notice that a section for HA is now shown under the cluster summary screen. This will show you general information about the configuration of HA. There is also an option for Cluster Status here.
Click this to bring up the HA Cluster Status screen.
On this screen, you will notice three tabs: Hosts, VMs, and Heartbeat Datastores. On the Hosts tab, you will see the system that is acting as the Master node. You will also see the number of hosts that are currently connected to this Master. The number shown should equal the number of hosts contained within your cluster, minus one for the Master.
Under the VMs tab, a summary of the virtual machine protection states is displayed. The virtual machines that were powered on when VMware HA was enabled are in the Protected state.
Clicking the Heartbeat Datastores tab will display information about the datastores that were selected as heartbeat datastores. Heartbeat datastores allow a secondary means of communication between the hosts in case of a loss of the management network. By selecting a particular datastore, you will display a list of all the hosts that are using the selected datastore as a heartbeat datastore.
Click OK to exit the cluster status screen.
Step 5 – Configuring VMware HA Advanced Options
VMware HA lets you change various options to suit your individual needs. This section provides an overview of the most commonly used options.
Select the cluster and click Edit Settings:
This brings up the wizard that allows you to edit the cluster settings. Once VMware HA is enabled, additional settings are displayed that allow for the configuration of VMware HA.
1. vSphere HA tab
In the cluster settings dialog box, select vSphere HA from the navigation tree on the left. This allows you to edit the Host Monitoring Status and Admission Control attributes.
Host monitoring enables VMware HA to take action if a host fails to send heartbeats over the management network. During maintenance operations on the management network, it is possible that the hosts will not be able to send heartbeats. When this occurs, you should deselect this option to prevent VMware HA from believing the hosts are isolated.
Admission control is used to ensure that adequate resources within the cluster are available to facilitate failover if needed. It also serves to ensure that the virtual machine reservations are respected. Three options are available to specify the desired admission control policy. These include the following:
- Host failures
This option attempts to reserve enough capacity within the cluster to provide for the failure of any host within the cluster.
- Percentage
As with the host failures option, this also attempts to reserve enough capacity within the cluster. However, this option allows you to specify a percentage of CPU and memory that you want reserved.
- Failover hosts
Alternately, you can specify particular hosts within the cluster that will be used as preferred target hosts to start any virtual machines that were protected on a failed host. In the event of a failure, vSphere HA will first attempt to restart the protected VMs on these hosts before trying others. Additionally, vSphere HA prevents VMs from being moved to these hosts, or powered on by the user or vSphere Distributed Resource Scheduler (DRS) on these hosts.
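To make the percentage-based option more concrete, here is a rough Python sketch of how a power-on request could be checked against a reserved percentage of CPU and memory. This is a simplified model under stated assumptions, not the calculation vSphere HA actually performs. The function admits_power_on, the resource keys, and all capacity figures are hypothetical.

```python
def admits_power_on(capacity, reserved, request, failover_pct):
    """Return True if the new reservation fits within every resource.

    capacity, reserved, and request map a resource name to an amount;
    failover_pct maps a resource name to the percentage kept free for failover.
    """
    for resource, total in capacity.items():
        # Integer arithmetic avoids floating-point rounding at the boundary.
        usable_x100 = total * (100 - failover_pct[resource])
        if (reserved[resource] + request[resource]) * 100 > usable_x100:
            return False
    return True


# Assumed cluster totals and existing VM reservations.
capacity = {"cpu_mhz": 20000, "memory_mb": 65536}
reserved = {"cpu_mhz": 12000, "memory_mb": 40000}
failover_pct = {"cpu_mhz": 25, "memory_mb": 25}

small_vm = {"cpu_mhz": 2000, "memory_mb": 4096}
large_vm = {"cpu_mhz": 2000, "memory_mb": 16384}

print(admits_power_on(capacity, reserved, small_vm, failover_pct))
print(admits_power_on(capacity, reserved, large_vm, failover_pct))
```

In this example the small virtual machine fits within the 75 percent of capacity left for normal use, while the large one would eat into the memory held back for failover and is refused.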
2. Virtual Machine Options tab:
Select Virtual Machine Options from the left-hand navigation pane. Here, you can define the behavior of virtual machines for VMware HA. The two settings you can edit are the VM restart priority and the Host Isolation response.
The VM restart priority enables you to specify the order in which virtual machines will be started in the event of a failure. In cases where there might not be enough resources available within the cluster to accommodate the restart of a series of virtual machines, this setting provides a level of prioritization, so that the most important virtual machines are restarted first. Note that this can be set on a per-virtual machine basis as well.
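The effect of the restart priority can be pictured as a simple ordering problem. The Python sketch below is only an illustration of that idea. The priority level names (High, Medium, Low, Disabled) are what I would expect to see in the vSphere Client but are an assumption here, and the function restart_order and the VM names are hypothetical.

```python
# Assumed priority levels; lower rank means restarted earlier.
PRIORITY_RANK = {"High": 0, "Medium": 1, "Low": 2}


def restart_order(vms):
    """Return VM names in restart order, skipping VMs whose priority is Disabled.

    vms is a list of (name, priority) tuples. sorted() is stable, so VMs
    with the same priority keep their original relative order.
    """
    eligible = [(name, prio) for name, prio in vms if prio != "Disabled"]
    ranked = sorted(eligible, key=lambda vm: PRIORITY_RANK[vm[1]])
    return [name for name, _ in ranked]


vms = [
    ("test01", "Low"),
    ("db01", "High"),
    ("web01", "Medium"),
    ("lab01", "Disabled"),
    ("db02", "High"),
]
order = restart_order(vms)
print(order)
```

If the cluster runs out of capacity partway through the list, the virtual machines at the end of the order are the ones left waiting.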
Host Isolation Response specifies the action that HA takes when a host is determined to be isolated. Host isolation occurs when a host loses the ability to communicate over the management network with the other hosts in the environment and is also unable to ping its configured isolation addresses (by default, the default gateway). In this event, the host is still functioning, although it is not able to communicate. The default setting is Leave powered on.
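The isolation test described above combines two conditions, which the following Python sketch spells out. It is a simplified model for illustration only. The functions is_isolated and isolation_action and the sample address 192.168.1.1 are hypothetical, and the real HA agent uses timers and checks that are not modeled here.

```python
def is_isolated(receives_host_traffic, ping_results):
    """Return True only when both isolation conditions hold.

    receives_host_traffic: True if the host still hears other hosts on the
    management network.
    ping_results: maps each isolation address to True if the ping succeeded.
    """
    return (not receives_host_traffic) and not any(ping_results.values())


def isolation_action(isolated, response="Leave powered on"):
    """Return the configured response if the host is isolated, else no action."""
    return response if isolated else "No action"


# Assumed scenario: no traffic from other hosts and the gateway ping fails.
isolated = is_isolated(False, {"192.168.1.1": False})
action = isolation_action(isolated)
print(isolated, action)
```

Note that a host that loses contact with its peers but can still ping an isolation address is not treated as isolated in this model.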
3. Virtual Machine Monitoring tab:
Selecting VM Monitoring from the left-hand navigation pane enables you to change settings related to the monitoring of the OS or application running within a virtual machine. In order to use this feature, you must have VMware Tools installed within the virtual machine.
By selecting the Custom option, you can exert a fine level of control over the various parameters involved. You can also specify these settings on a per-virtual machine basis.
4. Datastore Heartbeating tab:
Storage heartbeats provide a secondary communication path in the event of a failure of the management network. This is advantageous, because it provides another level of redundancy and allows for the determination of failure between a network and a host failure. By default, two datastores will be chosen based on the connectivity they have to other hosts and the type of storage. This attempts to provide protection against array failures and allows for the highest number of hosts to utilize the heartbeat datastore. The datastores utilized can be manually specified if desired.
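The way a second heartbeat path helps tell a network failure from a host failure can be sketched as a small decision function. The Python below is a simplified illustration, not the logic of the HA agent itself. The function classify_host and its return strings are hypothetical.

```python
def classify_host(network_heartbeat, datastore_heartbeat):
    """Classify a host from the two heartbeat signals seen for it.

    network_heartbeat: True if heartbeats arrive over the management network.
    datastore_heartbeat: True if the host is still updating its heartbeat
    datastore.
    """
    if network_heartbeat:
        return "healthy"
    if datastore_heartbeat:
        # The host is alive but unreachable over the management network.
        return "network failure"
    # Neither path shows signs of life, so the host itself is presumed down.
    return "host failure"


print(classify_host(True, True))
print(classify_host(False, True))
print(classify_host(False, False))
```

Without the datastore signal, the second and third cases would look identical, which is why the extra heartbeat path is useful.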
Step 6 – How VMs are restarted in the case of a host failure:
The most common failure case involves the failure of a physical host. This can be for a variety of reasons, such as a loss of power to the host or a motherboard failure.
When this event occurs, VMware HA will identify the failure of the host and will attempt to restart the protected virtual machines on a functional host.
First, use the vSphere Client to examine the virtual machines hosted within the cluster. In this example, we are going to cause the system tm-pod1-esx01.tmsb.local to fail. You need to check the virtual machines in your environment and ensure that at least one is online on the host that you are going to fail.
Next, remove the power from one of your hosts. By looking at the hosts within the cluster, you will see that VMware HA will detect the failure of the host and generate an alert.
By examining the events, you will see messages similar to the ones demonstrated in the preceding figure validating that VMware HA has detected the failure.
After a failure of a host has been detected, HA will attempt to restart the virtual machines that were running on the failed host on other available hosts within the cluster. Go back to the virtual machine view of your cluster and notice that the virtual machines that were previously on the failed host are now online on other hosts.
You can also examine the events for a host to see the log messages denoting that VMware HA has attempted to restart the virtual machine.
By selecting the Summary tab for the failed host, you will notice that the issue is displayed in two places. The first is at the top of the screen, and the second is the vSphere HA State field. At this point, reapply power to the failed host and allow it to boot. Once it completes this process, you will see that it rejoins the cluster and continues to function as before.
VMware High Availability (HA) provides easy-to-use, cost-effective high availability for applications running in virtual machines. In the event of physical server failure, affected virtual machines are automatically restarted on other production servers with spare capacity. In the case of operating system failure, VMware HA restarts the affected virtual machine on the same physical server. The combination of VMware HA and the other availability features of the VMware vSphere™ platform gives organizations the ability to select and easily deliver the level of availability required for all of their important applications.