Hyper-V, not in my datacenter (part 2: Guest OS and Memory overcommit)

28 November, 2008

Guest OS

Once you have installed your hypervisor, you want to virtualize your physical servers on it. While doing the research for my presentation, I was very surprised to find some strange limitations on which guests are supported with Hyper-V. Can you believe that not all Windows Server versions are supported? Well, neither could I.

The guest OS support of Hyper-V:

  • Windows 2000 Server is only supported with SP4 and 1 CPU.
  • Windows 2003 Server is only supported with SP2 and 1 or 2 CPUs.
  • Windows 2008 Server is supported with 1, 2 and 4 CPUs.
  • And there is Linux support for SUSE Linux Server 10 SP1 / SP2, with only 1 CPU.

This does look quite reasonable at first glance, because these are current versions. True, but when looking at the ESX environment I work in now, we would not be able to virtualize as many servers as we have running on ESX right now. Let’s see:

  • Customer has 721 VMs,
  • 4 of them are running Linux (RedHat EL),
  • then we have 2 running on Windows NT4 (YES you read it correctly, NT4),
  • 8 on Windows 2000, mostly with 2 CPUs,
  • 15 Windows 2003 guests with 4 CPUs,
  • about 100 VMs running Windows 2003 SP1,
  • and the rest are on SP2 with 2 or fewer CPUs.

Now, I admit, we are not the quickest at upgrading our systems, and SP2 has been around for almost a year now, but we have many financial applications whose suppliers are also quite slow to certify their applications for new service packs. This customer deals with millions of euros per day. They cannot afford downtime for this set of financial apps and will only upgrade to OS versions / service packs that the supplier supports.

Looking at this environment, there are quite a number of systems that would not be supported on Hyper-V. Big deal, you say? Actually, yes, it is a big deal. Wasn’t saving costs by getting rid of a lot of physical hardware one of the main drivers for going virtual? Leaving a bunch of servers (around 130 in our case) running on physical hardware is quite a lot, especially because these will probably be somewhat older servers that are getting more expensive in their support contracts and more vulnerable to failure.
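To show where the "around 130" comes from, here is a rough Python sketch that checks the inventory above against the support list. It is only a back-of-the-envelope check under a few assumptions of mine: the seven groups add up to all 721 VMs, all 8 Windows 2000 guests have 2 CPUs (the post says "mostly"), "about 100" is taken as exactly 100, and the service pack and CPU values not mentioned in the post are placeholders. The names `SUPPORTED`, `VmGroup` and `is_supported` are made up for this example.

```python
from dataclasses import dataclass
from typing import Optional

# Support list from the post: OS -> (allowed service packs, allowed CPU counts).
# None for service packs means the post names no service pack requirement.
SUPPORTED = {
    "Windows 2000 Server": ({"SP4"}, {1}),
    "Windows 2003 Server": ({"SP2"}, {1, 2}),
    "Windows 2008 Server": (None, {1, 2, 4}),
    "SUSE Linux Server 10": ({"SP1", "SP2"}, {1}),
}


@dataclass(frozen=True)
class VmGroup:
    os: str
    service_pack: Optional[str]
    cpus: int
    count: int


def is_supported(group: VmGroup) -> bool:
    """True if the group's OS, service pack and CPU count are all on the list."""
    rule = SUPPORTED.get(group.os)
    if rule is None:
        return False  # OS is not on the list at all (e.g. NT4, Red Hat)
    packs, cpu_counts = rule
    if packs is not None and group.service_pack not in packs:
        return False
    return group.cpus in cpu_counts


# Inventory from the post. Values marked "assumed" are not given in the post.
INVENTORY = [
    VmGroup("Red Hat Enterprise Linux", None, 1, 4),   # CPU count assumed
    VmGroup("Windows NT 4", None, 1, 2),               # CPU count assumed
    VmGroup("Windows 2000 Server", "SP4", 2, 8),       # SP assumed, all 2 CPU assumed
    VmGroup("Windows 2003 Server", "SP2", 4, 15),      # SP assumed
    VmGroup("Windows 2003 Server", "SP1", 2, 100),     # CPU count assumed
    VmGroup("Windows 2003 Server", "SP2", 2, 592),     # the remainder of the 721
]

total_vms = sum(g.count for g in INVENTORY)
unsupported_vms = sum(g.count for g in INVENTORY if not is_supported(g))
print(f"{unsupported_vms} of {total_vms} VMs fall outside the support list")
```

With these assumptions the count comes out at 129 of 721, which is where the "around 130" figure comes from.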

VMware does support quite a broad range of OSes, all with full capabilities and no strange CPU limitations. There are about 12 flavors of Linux, there is Sun Solaris, FreeBSD and of course Windows. Looking at the list, it seems ALL Windows Server versions are supported, from Windows NT4 to Windows 2008.

I can’t fully understand Microsoft on this. Of course, I’m not a software engineer who understands anything about software limitations and what effect 1, 2 or 4 CPUs have on an OS and its service pack level, but still… why these support limits? I remember how hard it often is to get companies to run Exchange or SQL on VMware when they don’t get an official support statement from Microsoft. Now why is Microsoft doing this to themselves? Is it really such an effort to get Windows 2003 supported on all service packs and on 1, 2 and 4 CPUs?

Memory over-commitment

One major point that Microsoft always mentions in its campaign is that VMware’s transparent page sharing and memory over-commit techniques are not suited for the enterprise environment. Because who would make VMs share memory? Well, let me tell you, at first I could agree with Microsoft on this. Surely I would never let my VMs starve for memory and never over-commit memory. I would constantly monitor my ESX environment, check how much memory my guests really need and make sure that the right amount of memory is present. Never let your ESX host (or other hypervisor) swap memory to disk, never!!!

But wait. After reading some MS articles and VMware articles on this, I think the definition of memory over-commit is not equally clear to everybody. What is memory over-commitment?
Memory over-commitment is when you assign more memory to your VMs than the amount of memory physically present in your host.

Microsoft says this is not a smart thing to do, and they tell their customers that they therefore don’t need this feature. Well, to be honest, VMware too thinks this is not smart IF your VMs really need this amount of memory. And there is the big difference: IF.

Now, we all know those physical servers, the ones running with 2, 3 or 4 GB of RAM because the application manual says that application X really needs 4 GB. We also all know that, when you monitor these applications, the physical box never uses more than half of its total memory, even on a busy day. This is the point where VMware steps in.
If you have 20 VMs, each with 2 GB assigned, but none of them uses more than 1 GB, then you would be throwing away 20 GB of RAM in the host that never gets used. Now, this example is hypothetical, and in a real-world scenario I wouldn’t draw the hard line at 20 GB, but you get the point. Some VMs will stay way below their assigned memory and some will use all of it. After running for a couple of weeks, however, you will find a stable average and peak memory usage, which enables you to size your hosts’ memory accordingly.
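For those who like to see the arithmetic, here is the hypothetical example as a few lines of Python. Taking the numbers literally (20 VMs, 2 GB assigned each, at most 1 GB used each), the memory that is assigned but never touched works out to 20 GB. The function name `idle_memory_gb` is just something I made up for this sketch.

```python
def idle_memory_gb(vm_count: int, assigned_gb: float, peak_used_gb: float) -> float:
    """Memory that is assigned to the VMs but never actually used, in GB."""
    return vm_count * (assigned_gb - peak_used_gb)


vm_count = 20
assigned_gb = 2      # what the application manual asked for
peak_used_gb = 1     # what monitoring shows on a busy day

total_assigned_gb = vm_count * assigned_gb   # what you would buy without over-commit
total_used_gb = vm_count * peak_used_gb      # what the VMs really touch
idle_gb = idle_memory_gb(vm_count, assigned_gb, peak_used_gb)

print(f"assigned: {total_assigned_gb} GB, used: {total_used_gb} GB, idle: {idle_gb} GB")
```

In real life you would of course size the host on measured peak usage plus some headroom, not on the bare minimum.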

I’ve taken a quick snap of the environment of one of my customers. They now have 16 ESX hosts in one cluster. In the first column you see the amount of memory that is present in the host. Second column shows the amount of memory that is assigned to the VMs. Third column is the amount of memory over-commitment.

Name      Host (GB)   Assigned (GB)   Over-commit (GB)
esx-01       40            38                0
esx-02       40            46                6
esx-03       40            33                0
esx-04       40            48                8
esx-05       40            35                0
esx-06       40            49                9
esx-07       40            42                2
esx-08       40            37                0
esx-09       40            45                5
esx-10       40            52               12
esx-11       40            48                8
esx-12       40            37                0
esx-13       40            42                2
esx-14       40            46                6
esx-15       64            87               23
esx-16       64            85               21
Total       688                            102
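If you want to check the last column and the totals yourself, this small Python sketch recomputes them from the first two columns. The per-host rule (over-commit is assigned minus physical, never below zero) is how I read the table; the names `HOSTS` and `overcommit_gb` are just for this example.

```python
# (name, physical memory in GB, memory assigned to VMs in GB), copied from the table
HOSTS = [
    ("esx-01", 40, 38), ("esx-02", 40, 46), ("esx-03", 40, 33), ("esx-04", 40, 48),
    ("esx-05", 40, 35), ("esx-06", 40, 49), ("esx-07", 40, 42), ("esx-08", 40, 37),
    ("esx-09", 40, 45), ("esx-10", 40, 52), ("esx-11", 40, 48), ("esx-12", 40, 37),
    ("esx-13", 40, 42), ("esx-14", 40, 46), ("esx-15", 64, 87), ("esx-16", 64, 85),
]


def overcommit_gb(host_gb: int, assigned_gb: int) -> int:
    """Memory assigned beyond what the host physically has; 0 if it still fits."""
    return max(0, assigned_gb - host_gb)


total_host_gb = sum(host for _, host, _ in HOSTS)
total_overcommit_gb = sum(overcommit_gb(host, assigned) for _, host, assigned in HOSTS)

for name, host, assigned in HOSTS:
    print(f"{name}  {host:>3}  {assigned:>3}  {overcommit_gb(host, assigned):>3}")
print(f"Total physical: {total_host_gb} GB, total over-commit: {total_overcommit_gb} GB")
```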

What does this table tell us? The total physical memory across all their hosts is 688 GB of RAM, but on some hosts they have assigned more RAM to VMs than is physically present. A total of 102 GB more!!! That is no typo, that really is one hundred and two gigabytes of RAM. VMware ESX saved them from buying 102 GB by allowing memory over-commit. To me, that is quite a saving, a saving you won’t get using Hyper-V. Actually, there is more.

Transparent page sharing

Say we have 10 VMs running Windows 2003 Server, and in each VM there is about 500 MB of identical data, because it is just the basics of Windows 2003. So you have 4.5 GB (5 GB − 500 MB) of redundant data in your expensive RAM. Not with VMware transparent page sharing, you haven’t. So there is another great memory saver. Now look at the bigger picture again and see how much memory would be saved if you have 400 Windows VMs, and add those savings to the ones from memory over-commitment… I think you get the point and why you can really benefit from these techniques.
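To give a feel for the idea, here is a toy Python model of sharing identical pages, plus the arithmetic from the example above. This is only my own simplified illustration and certainly not how ESX implements it: pages are plain byte strings, identical pages are found by hashing their content, and things a real hypervisor has to deal with (comparing candidate pages in full, breaking the sharing when a VM writes to a page) are left out. All names and the page counts per VM are made up for the sketch.

```python
import hashlib

PAGE_SIZE = 4096  # bytes per page, assumed for this toy model


def redundant_mb(vm_count: int, common_mb: int) -> int:
    """Identical data held more than once: every copy except the first."""
    return (vm_count - 1) * common_mb


def count_pages(vms):
    """Return (total pages, unique pages) over all VMs, matching pages by content hash."""
    unique = {}
    total = 0
    for pages in vms:
        for page in pages:
            total += 1
            unique.setdefault(hashlib.sha256(page).digest(), page)
    return total, len(unique)


# Every VM holds the same 5 "base OS" pages plus 3 pages of its own.
os_pages = [bytes([i]) * PAGE_SIZE for i in range(5)]
vms = [
    os_pages + [f"vm{v}-page{p}".encode().ljust(PAGE_SIZE, b"\0") for p in range(3)]
    for v in range(10)
]

total_pages, unique_pages = count_pages(vms)
print(f"{total_pages} pages in total, {unique_pages} after sharing")
print(f"redundant data in the post's example: {redundant_mb(10, 500)} MB")
```

In this toy run the 10 VMs need 35 pages instead of 80, because the 5 base pages are kept only once. The same reasoning gives the 4.5 GB from the example.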

Thank you Tom Howarth for checking this post before publishing.

Series:
Hyper-V, not in my datacenter (part 1: Hardware)
Hyper-V, not in my datacenter (part 2: Guest OS and Memory overcommit)
Hyper-V, not in my datacenter (part 3: Motions and storage)

  • Great article for people that want to understand memory over-commitment and transparent page sharing!
  • Both ESX and Hyper-V are based on a hypervisor running on bare metal, but they use 2 different architectures. The main difference, as Andrew states, is that once ESX is up and running, the RHEL-based service console can actually be unavailable and the VMs on the ESX host will still run, even though they will be unmanageable.

    In Hyper-V the VMs, or child partitions, are dependent on both the parent partition and the hypervisor. If the parent partition is not available and working, then the guest VMs will not work at all.
  • Ah ok that's cool, thanks for the explanation. I was about to post that I didn't understand. So there's a difference in architecture design btw the two?
    George
  • Kukulkan
    The biggest hole in your review is ignoring the cost of ESX licensing.
  • Hi, no I don't ignore the cost of ESX licensing. But you can't just say ESX costs (for example) $1000 and Hyper-V is free. If you can run 40 VMs on 1 ESX host and only 10 VMs on that same host when it is running Hyper-V, then you need to spend more on hardware with Hyper-V. So yes, the license cost of ESX is higher, but don't you care about the total cost too?

    Please, for good discussion, also supply your e-mail address when entering new comment.
    Gabe
  • penman96
    What the memory over-commitment table tells me is that 3 or 4 of these ESX host servers may start disk paging if there is a utilization spike. DRS wouldn't help either, as DRS isn't memory overcommitment aware. As you correctly said at first, memory overcommitment in production is a risky idea. VMware best practice discourages it too - even if their marketing materials do not. ESX 15 has a 25% memory overcommitment.

    Not in my datacenter ;-)
  • Fantastic post. For the last couple of months I have needed information about the guest OS support of Hyper-V. And finally I got all the information. The post is very informative.
  • Miquel Àngel
    We have the same problem. In our CPD 90% of the products are Microsoft, but when we wanted to virtualize we did it with ESXi, because we still have Windows NT on some servers.... Microsoft rules!
  • Hi Gabe,

    It does become evident that Microsoft has failed to get the concept of "Sharing Resources" in order to maximize utilization of the hardware layer. I view Microsoft as an infant with attitude in the world of virtualization services.

    Regards,

    Mike
  • @Bridget: To be honest, I'm not always very consistent in these kinds of things. I did mean Gigabytes of memory, so GB it is then :-)

    And I'm honored to be quoted :-)

    Please remember there is a lot of confusion about what "overcommit" really is. I think it is essential to explain this to your audience.

    Regards
    Gabrie
  • Hi Gabe,

    I'm a reporter with SearchServervirtualization.com, writing an article about VMware vs. Hyper-V, and I found your input on memory overcommit useful. In fact, I am going to quote you and link to this blog in the article.
    One question though - when you talk about RAM here: "If you have 20 VMs, with each 2Gb assigned but they all use no more than 1Gb, then you would be throwing away 10Gb of RAM in the host that never gets used..."
    You mean Gigabytes (GB), not gigabits (Gb), correct? Just making sure the Gb is only a typo...

    Thanks
  • Great article, good to get such independent reviews. As a VMware instructor, I hear VMware implementation stories from many different companies. 19 out of 20 couldn't care less about things like DRS and memory overcommitment - they say 'it is cheaper to add another server than to find out how DRS should be configured'. The cost argument is not a good choice for tipping the bias towards VMware - just add up VMware licensing costs and compare them with a Hyper-V solution and you can pay for way more than 102GB of memory...... But VMware is still better (-;
  • No, I'm not saying that at all. As far as I know all those distros run perfectly well under Hyper-V using emulated hardware (as does Windows NT 4). You can run them all you like. It's just not supported by Microsoft if your guest OS breaks because of the hypervisor layer. If Debian or Ubuntu want to get their distro officially supported under Hyper-V I'm sure they could arrange that somehow (I don't know what the process is exactly).
  • So what you are actually saying is that MS will never ever allow distros like BSD, Ubuntu, Debian etc. to run on their hypervisor without breaking support?

    I know what it means, but as a result you'll have a very limited set of OSes you can run on your hypervisor.
  • Duncan

    What that means is that if you are running a supported OS under Hyper-V and something goes wrong, there are agreements in place to avoid finger pointing and to ensure the issue gets resolved, regardless of whether the problem is with the hypervisor or with the guest OS. That's what supported means to Microsoft. What happens if VMware simply says "Yeah, Windows NT crashed, talk to Microsoft"? That's the difference between providing some VMware Tools and actually supporting.

    Cheers

    Stu