During my migration from vSphere 4.1 to vSphere 5.5, I ran into an issue I had never experienced before. When your VMware HA network is partitioned, vCenter will not let you perform VMotions. At first I was surprised but after some searching I learned the reasons behind it and it makes complete sense now.
I had a cluster running ESX 4.0/4.1 hosts with a very basic network config. Service Console was in VLAN100, VMotion in VLAN110 each in different IP ranges. The new ESXi 5.5 hosts would get a completely different network config. First vmkernel port in VLAN200, second vmkernel port in VLAN210 both enabled for management traffic. And there were two VMotion vmkernel ports in VLAN220. When adding these new hosts to the same clusters as the ESX 4.0/4.1 hosts, I received a number of HA error messages:
- The vSphere HA Agent on the host is alive and has management network connectivity, but the management network has been partitioned.
- This state is reported by a vSphere HA Master Agent that is in a partition other than the one containing this host.
- The vSphere HA protected VMs running on the host are monitored by one or more vSphere HA Master Agents, and the agents will attempt to restart the VMs after a failure.
This seemed logical since the old hosts had their Service Console interface in a different IP range than the new ESXi 5.5 hosts. Shouldn’t be a problem, since this situation would only last one day during working hours, while I was working on that cluster and my plan was to VMotion the VMs in the cluster from the old hosts to the new hosts. And that is where I was stopped. I was unable to free an 4.1 host because VMotion was not allowed to a new ESXi 5.5 host. Fortunately, KB 1033634 “vSphere HA and FT Error Messages” came to help and explained why I could not VMotion.
And as in many situations before, it is completely logical that VMware decided to treat a partition HA situation in this way. Taken from the KB Article: “The powered-on virtual machine you are attempting to migrate will not be protected by vSphere HA after the vMotion operation completes because the vSphere HA master agent is not currently responsible for it. vSphere HA will not restart the virtual machine if it subsequently fails. To restore vSphere HA protection, resolve any network partitions or disk accessibility issues.“.
Makes perfect sense and now you know too or even better, you already knew