Hyper-V, not in my datacenter (part 2: Guest OS and Memory overcommit)

Guest OS

Once you have installed your hypervisor, you want to virtualize your physical servers on it. While doing the research for my presentation, I was very surprised to find some strange limitations on which guests are supported on Hyper-V. Can you believe that not all Windows Server versions are supported? Well, neither could I.

The guest OS support of Hyper-V:

  • Windows 2000 Server is only supported with SP4 and 1 CPU.
  • Windows 2003 Server is only supported with SP2 and 1 or 2 CPUs.
  • Windows 2008 Server is supported with 1, 2 and 4 CPUs.
  • and there is Linux support for SUSE Linux Server 10 SP1 / SP2, with only 1 CPU.

This looks quite reasonable at first glance, because these are current versions. But looking at the ESX environment I work in now, we would not be able to virtualize as many servers on Hyper-V as we have running on ESX today. Let’s see:

  • Customer has 721 VMs,
  • 4 of them are running Linux (RedHat EL),
  • then we have 2 running on Windows NT4 (YES you read it correctly, NT4),
  • 8 on Windows 2000, mostly with 2 CPUs,
  • 15 Windows 2003 guests with 4 CPUs,
  • about 100 VMs are running Windows 2003 SP1,
  • and the remainder are on SP2 with 2 or fewer CPUs.

Now, I admit, we are not the quickest at upgrading our systems, and SP2 has been around for almost a year now, but we have many financial applications whose suppliers are also quite slow to certify their applications for new service packs. This customer deals with millions of euros per day. They cannot afford downtime for this set of financial apps and will only upgrade to an OS version or service pack if the supplier supports it.

Looking at this environment, there are quite a number of systems that would not be supported on Hyper-V. Big deal, you say? Actually, yes, it is a big deal. Wasn’t cost saving by getting rid of a lot of physical hardware one of the main drivers to go virtual? Leaving a bunch of servers (around 130 in our case) on physical hardware is quite a lot, especially because these will probably be somewhat older servers that are getting more expensive in their support contracts and more vulnerable to failure.
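The figure of around 130 can be reproduced with a rough sketch. The support rules below are the ones listed earlier in this post. The inventory is a simplification of the customer numbers above: the grouping, the CPU counts for the RedHat and NT4 guests, and the service packs marked "assumed" are my own placeholders, not measured data.

```python
# Hyper-V guest support as listed in this post:
# OS name -> (allowed service packs, or None for "any"; allowed CPU counts).
HYPERV_SUPPORT = {
    "Windows 2000 Server": ({"SP4"}, {1}),
    "Windows 2003 Server": ({"SP2"}, {1, 2}),
    "Windows 2008 Server": (None, {1, 2, 4}),
    "SUSE Linux Server 10": ({"SP1", "SP2"}, {1}),
}


def is_supported(os_name, service_pack, cpus):
    """Return True if the guest matches a rule in HYPERV_SUPPORT."""
    rule = HYPERV_SUPPORT.get(os_name)
    if rule is None:
        return False  # the OS is not on the list at all
    allowed_sps, allowed_cpus = rule
    if allowed_sps is not None and service_pack not in allowed_sps:
        return False
    return cpus in allowed_cpus


# Simplified inventory: (OS, service pack, CPUs, number of VMs).
# Values marked "assumed" are not stated in the post.
inventory = [
    ("RedHat EL", None, 1, 4),               # CPUs assumed
    ("Windows NT4", None, 1, 2),             # CPUs assumed
    ("Windows 2000 Server", "SP4", 2, 8),    # "mostly 2 CPU", SP assumed
    ("Windows 2003 Server", "SP2", 4, 15),   # SP assumed
    ("Windows 2003 Server", "SP1", 2, 100),  # "about 100", CPUs assumed
    ("Windows 2003 Server", "SP2", 2, 592),  # remainder of the 721 VMs
]

total_vms = sum(count for *_, count in inventory)
unsupported = sum(
    count
    for os_name, sp, cpus, count in inventory
    if not is_supported(os_name, sp, cpus)
)
print(f"{unsupported} of {total_vms} VMs fall outside the support list")
```

With these assumptions the sketch counts 129 of the 721 VMs as unsupported, which is where "around 130" comes from.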

VMware supports quite a broad range of OSes, all with full capabilities and no strange CPU limitations. There are about 12 flavors of Linux, there is Sun Solaris, FreeBSD and of course Windows. Looking at the list, it seems ALL Windows Server versions are supported, from Windows NT4 to Windows 2008.

I can’t fully understand Microsoft on this. Of course, I’m not a software engineer who understands software limitations and what effect 1, 2 or 4 CPUs have on an OS and its service pack level, but still… why these support limits? I remember how hard it often is to get companies to run Exchange or SQL on VMware when they don’t get an official support statement from Microsoft. Now why is Microsoft doing this to themselves? Is it really such an effort to get Windows 2003 supported on all service packs and on 1, 2 and 4 CPUs?

Memory over-commitment

One major point that Microsoft always mentions in its campaign is that VMware’s transparent page sharing and memory over-commit techniques are not suited for the enterprise environment. Because, who would make VMs share memory? Well, let me tell you, at first I could agree with Microsoft on this. Surely I would never let my VMs starve for memory and never over-commit memory. I would constantly monitor my ESX environment, check how much memory my guests really need, and make sure that the right amount of memory is present. Never let your ESX host (or other hypervisor) swap memory to disk, never!

But wait. After reading some MS articles and VMware articles on this, I think the definition of memory over-commit is not equally clear to everybody. What is memory over-commitment?
Memory over-commitment is when you assign more memory to your VMs than the amount of memory present in your host.

Microsoft says this is not a smart thing to do and tells its customers that they therefore don’t need this feature. Well, to be honest, VMware also thinks this is not smart IF your VMs really need this amount of memory. And there is the big difference: IF.

Now, we all know those physical servers, the ones running with 2, 3 or 4 GB of RAM because the application manual says that application X really needs 4 GB. We also all know that, when monitoring these applications, the physical box never used more than half of its total memory, even on a busy day. This is the point where VMware steps in.
If you have 20 VMs, each with 2 GB assigned, but none of them uses more than 1 GB, then you would be throwing away 20 GB of RAM in the host that never gets used. Now, this example is hypothetical, but you get the point. In a real-world scenario I wouldn’t draw the hard line at 20 GB: some VMs will stay way below their assigned memory and some will use all of it. After running for a couple of weeks, however, you will find a stable average and peak memory usage, which enables you to scale your host’s memory profile.
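To make the sizing idea concrete, here is a minimal sketch of comparing assigned memory with observed peak usage. The VM names, the numbers and the 25% headroom are all hypothetical; in practice you would feed in what your own monitoring reports after a few weeks.

```python
# Hypothetical monitoring data: VM name -> (assigned GB, observed peak GB).
observed = {
    "vm-a": (4.0, 1.5),
    "vm-b": (2.0, 2.0),  # this one really uses everything it was given
    "vm-c": (4.0, 2.0),
    "vm-d": (2.0, 0.5),
}


def sizing_summary(vms, headroom=0.25):
    """Return (total assigned, total peak, peak plus headroom) in GB.

    `headroom` is an assumed safety margin on top of the summed peaks.
    """
    assigned = sum(a for a, _ in vms.values())
    peak = sum(p for _, p in vms.values())
    target = peak * (1 + headroom)
    return assigned, peak, target


assigned, peak, target = sizing_summary(observed)
print(f"assigned: {assigned} GB, peak: {peak} GB, sizing target: {target} GB")
```

Summing the individual peaks is a cautious choice, since it assumes every VM hits its peak at the same moment.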

I’ve taken a quick snapshot of the environment of one of my customers. They now have 16 ESX hosts in one cluster. The Host column shows the amount of memory that is present in the host. The Assigned column shows the amount of memory that is assigned to the VMs. The Over-Commit column is the amount of memory over-commitment.

Name    Host (GB)  Assigned (GB)  Over-commit (GB)
esx-01         40             38                 0
esx-02         40             46                 6
esx-03         40             33                 0
esx-04         40             48                 8
esx-05         40             35                 0
esx-06         40             49                 9
esx-07         40             42                 2
esx-08         40             37                 0
esx-09         40             45                 5
esx-10         40             52                12
esx-11         40             48                 8
esx-12         40             37                 0
esx-13         40             42                 2
esx-14         40             46                 6
esx-15         64             87                23
esx-16         64             85                21
Total         688                              102

What does this table tell us? The total physical memory in all their hosts is 688 GB of RAM, but on some hosts they have assigned more RAM to VMs than is physically present. A total of 102 GB more! That is no typo, that really is one hundred and two gigabytes of RAM. VMware ESX saved them from buying 102 GB by allowing memory over-commit. To me, that is quite a saving, a saving you won’t get using Hyper-V. Actually, there is more.
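For anyone who wants to check the totals, this short sketch recomputes them from the table. The per-host over-commit is taken as assigned minus physical, floored at zero, which matches the values in the table. The variable names are my own.

```python
# (host name, physical GB, assigned GB), copied from the table above.
hosts = [
    ("esx-01", 40, 38), ("esx-02", 40, 46), ("esx-03", 40, 33),
    ("esx-04", 40, 48), ("esx-05", 40, 35), ("esx-06", 40, 49),
    ("esx-07", 40, 42), ("esx-08", 40, 37), ("esx-09", 40, 45),
    ("esx-10", 40, 52), ("esx-11", 40, 48), ("esx-12", 40, 37),
    ("esx-13", 40, 42), ("esx-14", 40, 46), ("esx-15", 64, 87),
    ("esx-16", 64, 85),
]

# Over-commit per host: only hosts with more assigned than physical count.
over_commit = {
    name: max(0, assigned - physical) for name, physical, assigned in hosts
}

total_physical = sum(physical for _, physical, _ in hosts)
total_assigned = sum(assigned for _, _, assigned in hosts)
total_over_commit = sum(over_commit.values())

print(f"physical: {total_physical} GB")
print(f"assigned: {total_assigned} GB")
print(f"over-commit (sum over hosts): {total_over_commit} GB")
```

This gives 688 GB physical and 102 GB of over-commit, as in the table, plus a cluster-wide assigned total of 770 GB, which the table does not list. Note that the 102 GB is summed per host: hosts with spare capacity do not offset the ones that are over-committed.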

Transparent page sharing

Say we have 10 VMs running Windows 2003 Server and in each VM there is about 500 MB of identical data, because it is just the basics of Windows 2003. So you have 4.5 GB (5 GB minus 500 MB) of redundant data in your expensive RAM. Not with VMware transparent page sharing, you haven’t. So there is another great memory saver. Now look at the bigger picture again and see how much memory would be saved if you have 400 Windows VMs, and add the savings of memory over-commitment. I think you get the point and why you can really benefit from these techniques.
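The arithmetic behind that number is simple: one copy of the identical data has to stay in RAM, every other copy is redundant. A minimal sketch, using the 500 MB from my example (an illustrative figure, not a measurement) and a helper name of my own:

```python
def redundant_mb(vm_count, identical_mb):
    """Memory in MB held by duplicate copies of identical data.

    One copy must remain, so only (vm_count - 1) copies are redundant.
    """
    return max(vm_count - 1, 0) * identical_mb


print(redundant_mb(10, 500))  # 4500 MB, i.e. 4.5 GB
```

Keep in mind that pages are shared between VMs on the same host, so for a whole cluster you would apply this per host and add up the results rather than plug in the total number of VMs.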

Thank you Tom Howarth for checking this post before publishing.

Hyper-V, not in my datacenter (part 1: Hardware)
Hyper-V, not in my datacenter (part 2: Guest OS and Memory overcommit)
Hyper-V, not in my datacenter (part 3: Motions and storage)