Strange DRS behavior during vSphere 5.1 upgrade

This week I upgraded a customer from vSphere 4.0 Update 1 to vSphere 5.1 and ran into some strange DRS behavior. My upgrade path was to first install a fresh vCenter 5.1 and then move the ESX 4.0 hosts into vCenter 5.1. The next step was to put the first host into maintenance mode, shut it down, remove it from the cluster, perform a fresh install of ESXi 5.1 and then add it back to the cluster.
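For anyone who prefers to script this per-host evacuation instead of clicking through the vSphere Client, here is a minimal pyVmomi sketch of the two steps I did for each host: enter maintenance mode (DRS evacuates the running VMs) and then remove the host from the inventory before reinstalling it with ESXi 5.1. The connection details and the ESX_HOSTNAME value are placeholders, not anything from my customer's environment.

```python
# Minimal sketch of the per-host evacuation steps, using pyVmomi.
# VC_HOST, VC_USER, VC_PASS and ESX_HOSTNAME are placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVim.task import WaitForTask
from pyVmomi import vim

VC_HOST, VC_USER, VC_PASS = "vcenter.example.local", "administrator", "secret"
ESX_HOSTNAME = "vmw15.example.local"

ctx = ssl._create_unverified_context()  # lab only: skip certificate checks
si = SmartConnect(host=VC_HOST, user=VC_USER, pwd=VC_PASS, sslContext=ctx)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.HostSystem], True)
    host = next(h for h in view.view if h.name == ESX_HOSTNAME)
    view.Destroy()

    # Enter maintenance mode; with DRS in fully automated mode this
    # triggers the vMotion evacuation of all running VMs.
    WaitForTask(host.EnterMaintenanceMode_Task(timeout=0))

    # Remove the host from the cluster/inventory so it can be
    # reinstalled with ESXi 5.1 and added back afterwards.
    WaitForTask(host.Destroy_Task())
finally:
    Disconnect(si)
```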

After the first host was installed with ESXi 5.1 and added to the cluster, I suddenly saw DRS moving virtual machines to that first host. Not just a few, but a lot of VMs. The host reached 96% memory load, and only then did DRS stop migrating VMs.

Since performance was still good, I decided to put host number two into maintenance mode and continue the upgrade. DRS migrated the VMs from host number two to the remaining hosts and, surprisingly, also managed to place some more VMs on host number one, which had just dropped from 96% to 94% memory load. Adding those VMs pushed it back to 96%. During the upgrade of host number two I saw the memory load drop a few percent again, and at that point DRS moved yet more VMs to host number one. When host number two had been installed with ESXi 5.1 and was a member of the cluster again, DRS started loading that host too until it hit 96%. In the image below, host number three (vmw15) had just been installed with 5.1 and only just added to the cluster, and VMs were already starting to move to it.
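To keep an eye on this loading pattern without staring at the vSphere Client, something like the following sketch can report per-host memory utilization for the cluster, the same figure DRS kept driving towards roughly 96% on the upgraded hosts. The cluster name is a placeholder and it assumes an existing pyVmomi ServiceInstance `si` as in the earlier snippet.

```python
# Hypothetical monitoring sketch: print memory utilization per host in a cluster.
from pyVmomi import vim

CLUSTER_NAME = "Production"  # placeholder cluster name

def report_memory_load(si, cluster_name):
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.ClusterComputeResource], True)
    cluster = next(c for c in view.view if c.name == cluster_name)
    view.Destroy()

    for host in cluster.host:
        used_mb = host.summary.quickStats.overallMemoryUsage        # in MB
        total_mb = host.summary.hardware.memorySize // (1024 * 1024)
        print(f"{host.name}: {100.0 * used_mb / total_mb:.0f}% memory used")
```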

Another strange thing was that, of course, the cluster didn't reach a good balance at all while it was migrating all those VMs to the 5.1 hosts.

I was convinced that once all hosts had been upgraded my problem would be solved, so I continued the upgrade of the remaining hosts. Since I was also curious to know what caused this behavior, I created a support request with VMware. The engineer told me they hadn't seen this behavior before and would try to reproduce it in their own test lab; I hope to hear more in a few days. After finishing all the upgrades the cluster became stable and I experienced no further issues.