vCenter HA (VCHA)

Introduction:

Whatever the nature of your business applications, there is a good chance they are powered by VMware vSphere virtualization hosted on your organization’s private, public, or hybrid cloud. The availability of the underlying infrastructure is therefore important for keeping your application workloads available, monitored, and manageable.

Today, with public cloud giants working round the clock to beat the competition in cloud offerings, many enhancements are arriving that result in greater workload flexibility and more agile datacenters. If you have already got your hands on VMware Cloud on AWS or the bare-metal offerings on IBM SoftLayer, I can see you nodding already.

With that said, I am sure the availability of your applications and the security of your data are the most important aspects, but manageability and monitoring of applications and workloads play an important part as well. vCenter Server, being a management plane component, does not directly affect your production workloads when it is unavailable, but it plays an important part in monitoring, self-service provisioning, and providing the management plane for NSX.

VMware introduced the vCenter High Availability (VCHA) feature in VMware vCenter Server® 6.5 to protect vCenter Server against various hardware and software failures. Nothing in this world comes without a cost; here the cost is some resource overhead, plus a performance penalty on high-latency networks. That is a trade-off between what you get and what you pay. Whether you should enable VCHA therefore depends on your SLAs and on how reliable your management plane needs to be. Irrespective of that, let’s dig into the feature in more detail in the coming sections.

 

Pre-Requisites:

Here is a list of prerequisites for enabling vCenter HA:

1) vCenter Server Appliance (VCHA is supported only on the appliance)
2) A cluster of at least three hosts on which the VCSA nodes will run
3) A dedicated port group (on a different VLAN from the management network) for the VCHA cluster network IPs

vCenter Server HA Cluster Management Network

 

Functionality:

The vCenter High Availability architecture uses a three-node cluster to provide availability against multiple types of hardware and software failures. A vCenter HA cluster consists of one Active node that serves client requests, one Passive node that takes over the role of the Active node in the event of failure, and one quorum node called the Witness node. Any Active/Passive architecture that supports automatic failover relies on a quorum or tie-breaking entity to solve the classic split-brain problem, which refers to data or availability inconsistencies caused by network failures within distributed systems that maintain replicated data. Traditional architectures use some form of shared storage to solve the split-brain problem. However, in order to support a vCenter HA cluster spanning multiple datacenters, the VCHA design does not assume a shared storage–based deployment. As a result, one node in the vCenter HA cluster is permanently designated as the quorum node, or Witness node. The other two nodes in the cluster dynamically assume the roles of Active and Passive nodes.

vCenter Server availability is assured as long as two nodes are running inside the cluster. However, a cluster with only two running nodes is considered to be in a degraded state. A subsequent failure in a degraded cluster means vCenter services are no longer available.
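The three cluster states described above can be sketched in a few lines of Python. This is only an illustration of the rule "three nodes is healthy, two is degraded, fewer means vCenter is unavailable"; the function name `cluster_state` and the state labels are my own and are not part of any VMware API.

```python
# Illustrative model only: names and labels are hypothetical, not a VMware API.
VCHA_NODES = {"active", "passive", "witness"}


def cluster_state(running):
    """Classify a VCHA cluster from the set of nodes that are currently running."""
    running = set(running)
    unknown = running - VCHA_NODES
    if unknown:
        raise ValueError(f"unknown node(s): {sorted(unknown)}")
    if len(running) == 3:
        return "healthy"      # all three nodes are up
    if len(running) == 2:
        return "degraded"     # still available, but one more failure is fatal
    return "unavailable"      # quorum is lost, vCenter services are down


if __name__ == "__main__":
    for nodes in (VCHA_NODES, {"active", "witness"}, {"passive"}):
        print(sorted(nodes), "->", cluster_state(nodes))
```

Note that any two nodes are enough to keep the service available, including the Passive node together with the Witness node, which is exactly the case where a failover takes place.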

vCenter Server HA
vCenter Server HA Replication

A vCenter Server appliance is stateful and requires a strong, consistent state for it to work correctly. The appliance state (configuration state or runtime state) is mainly composed of:
• Database data (stored in the embedded PostgreSQL database)
• Flat files (for example, configuration files).

For the state to be stored inside the PostgreSQL database, we use the PostgreSQL native replication mechanism to keep the database data of the primary and secondary in sync. For flat files, a Linux native solution, rsync, is used for replication. Because the vCenter Server appliance requires strong consistency, it is a strong requirement to utilize a synchronous form of replication to replicate the appliance state from the Active node to the Passive node.

A vCenter HA cluster requires a vCenter HA network that is separate from the management network of the vCenter Server appliance. As such, three static IP addresses (or FQDNs) are required, one for each node, to carry VCHA cluster traffic on the isolated VCHA network. Clients access the Active vCenter Server appliance via the management network interface, which is public.
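Before running the VCHA wizard, it can help to sanity-check the address plan. The sketch below uses Python’s standard `ipaddress` module to confirm that there is one distinct VCHA address per node and that none of them falls inside the management network. The function name `check_vcha_addresses` and all addresses shown are hypothetical examples (taken from documentation-only ranges), not values from a real deployment.

```python
import ipaddress

REQUIRED_ROLES = {"active", "passive", "witness"}


def check_vcha_addresses(management_cidr, vcha_addresses):
    """Return a list of problems found in a VCHA address plan (empty if none).

    management_cidr: management network, e.g. "192.0.2.0/24" (example value)
    vcha_addresses:  dict mapping node role -> VCHA network IP address
    """
    problems = []
    mgmt = ipaddress.ip_network(management_cidr)

    # One VCHA address is needed for each of the three nodes.
    missing = REQUIRED_ROLES - set(vcha_addresses)
    if missing:
        problems.append(f"missing VCHA address for: {sorted(missing)}")

    parsed = {role: ipaddress.ip_address(ip) for role, ip in vcha_addresses.items()}

    # Each node needs its own address.
    if len(set(parsed.values())) != len(parsed):
        problems.append("duplicate VCHA addresses")

    # The VCHA network must be separate from the management network.
    for role, addr in sorted(parsed.items()):
        if addr in mgmt:
            problems.append(f"{role} address {addr} is inside the management network")

    return problems


if __name__ == "__main__":
    plan = {
        "active": "198.51.100.11",
        "passive": "198.51.100.12",
        "witness": "198.51.100.13",
    }
    print(check_vcha_addresses("192.0.2.0/24", plan) or "address plan looks fine")
```

This only checks the plan on paper; it does not verify VLAN isolation or reachability between the nodes.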

Different Nodes and their roles:

vCenter Server HA Nodes

• Active Node:
– Node that runs the active instance of vCenter Server.
– Enables and uses the public IP address of the cluster.
• Passive Node:
– Node that runs as the passive instance of vCenter Server.
– Constantly receives state updates from the Active node in synchronous mode.
– Equivalent to the Active node in terms of resources.
– Takes over the role of Active Node in the event of failover.
• Witness Node:
– Serves as a quorum node.
– Used to break a tie in the event of a network partition causing a situation where the Active and Passive nodes cannot communicate with each other.
– A light-weight VM utilizing minimal hardware resources.
– Does not take over the role of the Active or Passive node.

Availability of the vCenter Server appliance behaves as follows under different failure conditions:

1. Active node fails:
– As long as the Passive node and the Witness node can communicate with each other, the Passive node will promote itself to Active and start serving client requests.
2. Passive node fails:
– As long as the Active node and the Witness node can communicate with each other, the Active node will continue to operate as Active and continue to serve client requests.
3. Witness node fails:
– As long as the Active node and the Passive node can communicate with each other, the Active node will continue to operate as Active and continue to serve client requests. The Passive node will continue to watch the Active node for failover.
4. More than one node fails or is isolated:
– This means the three nodes (Active, Passive, and Witness) cannot communicate with each other. This is more than a single failure; when it happens, the cluster is considered non-functional and availability is impacted, because VCHA is not designed to survive multiple failures.
5. Isolated node behavior:
– When a single node gets isolated from the cluster, it is automatically taken out of the cluster and all services are stopped. For example, if an Active node is isolated, all services are stopped to ensure that the Passive node can take over as long as it is connected to the Witness node.
– Isolated node detection takes into consideration intermittent network glitches and resolves to an isolated state only after all retry attempts have been exhausted.
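The failure conditions above can be captured in a small decision function. This is a simplified model of the behaviour described in this post, not VMware’s actual failover algorithm: a failed or isolated node is represented as a node with no working links, and the function names `link` and `serving_node` are my own.

```python
# Simplified model of the failure cases above; not VMware's real algorithm.

def link(a, b):
    """An undirected, working connection between two nodes."""
    return frozenset((a, b))


def serving_node(links):
    """Return which node serves client requests, or None if vCenter is down.

    links: set of link(a, b) pairs that can currently communicate.
    A failed or isolated node simply has no links in the set.
    """
    def up(a, b):
        return link(a, b) in links

    # The Active node keeps serving while it can reach at least one other node.
    if up("active", "passive") or up("active", "witness"):
        return "active"
    # The Active node is gone or isolated: the Passive node promotes itself
    # only if it can still reach the Witness node.
    if up("passive", "witness"):
        return "passive"
    # No two nodes can communicate: the cluster is non-functional.
    return None


if __name__ == "__main__":
    cases = {
        "all healthy": {link("active", "passive"), link("active", "witness"),
                        link("passive", "witness")},
        "active fails": {link("passive", "witness")},
        "passive fails": {link("active", "witness")},
        "witness fails": {link("active", "passive")},
        "all isolated": set(),
    }
    for name, links in cases.items():
        print(f"{name}: served by {serving_node(links)}")
```

The model ignores the retry logic mentioned above; in practice a node is declared isolated only after all retry attempts have been exhausted.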

 

vCenter Server HA Cluster Health

It is worth noting that an anti-affinity rule is created automatically to keep all three nodes on separate hosts, so that a single hardware failure does not bring down multiple cluster nodes simultaneously.

VCHA Anti-Affinity Rules

You can easily check cluster health and node status by selecting the vCenter Server in the left navigation pane and opening the ‘Monitor’ tab, as shown in the figure below:

VCHA nodes health
