Hyper-V, not in my datacenter (part 2: Guest OS and Memory overcommit)

Guest OS

After you have installed your hypervisor, you want to virtualize your physical servers on it. While doing the research for my presentation I was very surprised to find some strange limitations in which guests are supported on Hyper-V. Can you believe that not all Windows Server versions are supported? Well, neither could I.

The guest OS support of Hyper-V:

  • Windows 2000 Server is only supported with SP4 and 1 CPU.
  • Windows 2003 Server is only supported with SP2 and 1 or 2 CPUs.
  • Windows 2008 Server is supported with 1, 2 and 4 CPUs.
  • and there is Linux support for SUSE Linux Server 10 SP1 / SP2, with only 1 CPU.

This looks quite reasonable at first glance, because these are current versions. True, but in the ESX environment I work in now, we would not be able to virtualize as many servers on Hyper-V as we have running on ESX right now. Let’s see:

  • Customer has 721 VMs,
  • 4 of them are running Linux (RedHat EL),
  • then we have 2 running on Windows NT4 (YES you read it correctly, NT4),
  • 8 on Windows 2000, mostly with 2 CPUs,
  • 15 Windows 2003 guests with 4 CPUs,
  • about 100 VMs are running Windows 2003 SP1,
  • and the rest are on SP2 with 2 or fewer CPUs.

Now, I admit, we are not the quickest at upgrading our systems, and SP2 has been around for almost a year now, but we have many financial applications whose suppliers are also quite slow in certifying their applications for new service packs. This customer deals with millions of euros per day. They cannot afford downtime for this set of financial apps and will only upgrade to OS versions / service packs if the supplier supports it.

Looking at this environment, there are quite a number of systems that would not be supported on Hyper-V. Big deal, you say? Actually, yes, it is a big deal. Wasn’t cost saving by getting rid of a lot of physical hardware one of the main drivers to go virtual? Leaving around 130 servers (in our case) running on physical hardware is quite a lot, especially because these will probably be somewhat older servers that are getting more expensive in their support contracts and more vulnerable to failure.
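To make the "around 130" figure easier to follow, here is a rough sketch in Python that checks the inventory above against the support rules listed earlier in this post. The rules are only the ones quoted here, not an official Microsoft support matrix. Where the post does not state a service pack or CPU count (for example for the Windows 2000 and the 4-CPU Windows 2003 guests), the values are assumptions chosen for illustration, and "about 100" is treated as exactly 100. The names SUPPORT_RULES, INVENTORY and is_supported are made up for this example.

```python
# Hyper-V guest support as listed in this post (not an official support matrix).
# os name -> (allowed service packs, or None for "any"; allowed CPU counts)
SUPPORT_RULES = {
    "Windows 2000": ({"SP4"}, {1}),
    "Windows 2003": ({"SP2"}, {1, 2}),
    "Windows 2008": (None, {1, 2, 4}),
    "SUSE Linux Server 10": ({"SP1", "SP2"}, {1}),
}

# (os name, service pack, CPUs, number of VMs)
# Service packs and CPU counts that the post does not state are assumptions.
INVENTORY = [
    ("Red Hat EL", None, 1, 4),
    ("Windows NT4", None, 1, 2),
    ("Windows 2000", "SP4", 2, 8),     # "mostly 2 CPU", treated as all 2 CPU
    ("Windows 2003", "SP2", 4, 15),    # service pack level assumed
    ("Windows 2003", "SP1", 1, 100),   # "about 100", CPU count assumed
    ("Windows 2003", "SP2", 2, 592),   # the remainder of the 721 VMs
]


def is_supported(os_name, service_pack, cpus):
    """Return True if the guest matches one of the rules listed above."""
    rule = SUPPORT_RULES.get(os_name)
    if rule is None:
        return False
    allowed_sps, allowed_cpus = rule
    sp_ok = allowed_sps is None or service_pack in allowed_sps
    return sp_ok and cpus in allowed_cpus


total_vms = sum(count for _, _, _, count in INVENTORY)
unsupported_vms = sum(
    count
    for os_name, sp, cpus, count in INVENTORY
    if not is_supported(os_name, sp, cpus)
)

print(total_vms)        # 721
print(unsupported_vms)  # 129
```

Under these assumptions 129 of the 721 VMs fall outside the listed rules, which is where the "around 130" comes from.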

VMware does support quite a broad range of OSes, all with full capabilities and no strange CPU limitations. There are about 12 flavors of Linux, there is Sun, FreeBSD and of course Windows. Looking at the list, it seems ALL Windows Server versions are supported, from Windows NT4 to Windows 2008.

I can’t fully understand Microsoft on this. Of course I’m not a software engineer who understands anything about software limitations and what effect 1, 2 or 4 CPUs have on an OS and its service pack level, but still… why these support limits? I remember how hard it often is to get companies to run Exchange or SQL on VMware when they don’t get an official support statement from Microsoft. Now why is Microsoft doing this to themselves? Is it really such an effort to get Windows 2003 supported on all service packs and on 1, 2 and 4 CPUs?

Memory over-commitment

One major point that Microsoft always mentions in its campaign is that VMware’s transparent page sharing and memory over-commit techniques are not suited for the enterprise environment. Because, who would make VMs share memory? Well, let me tell you, at first I could agree with Microsoft on this. Surely I would never let my VMs starve for memory and never over-commit memory. I will constantly monitor my ESX environment, check how much memory my guests really need and make sure that the right amount of memory is present. Never let your ESX host (or other hypervisor) swap memory to disk, never!!!

But wait. After reading some MS articles and VMware articles on this, I think the definition of memory over-commit is not equally clear to everybody. What is memory over-commitment???
Memory over-commitment is when you assign more memory to your VMs than the amount of memory present in your host.

Microsoft says this is not a smart thing to do and they tell their customers that you therefore don’t need this feature. Well, to be honest, VMware too thinks this is not smart IF your VMs really need this amount of memory. And there is the big difference: IF.

Now, we all know those physical servers, the ones running with 2, 3 or 4 GB of RAM because the application manual says that application X really needs 4 GB. We also all know that, while monitoring these applications, the physical box never reached more than half of its total memory on any given busy day. This is the point where VMware steps in.
If you have 20 VMs, each with 2 GB assigned, but they all use no more than 1 GB, then you would be throwing away 20 GB of RAM in the host that never gets used. Now, this example is hypothetical, but you get the point. In a real-world scenario I wouldn’t draw the hard line at 20 GB: some VMs will stay way below their assigned memory and some will use all of it. After running for a couple of weeks, however, you will find a stable average and peak memory usage, which enables you to scale your host’s memory profile.

I’ve taken a quick snapshot of the environment of one of my customers. They now have 16 ESX hosts in one cluster. For each host, the first figure is the amount of memory present in the host, the second is the amount of memory assigned to the VMs, and the third is the resulting memory over-commitment.

Name     Host (GB)   Assigned (GB)   Over-commit (GB)
esx-01   40          38              0
esx-02   40          46              6
esx-03   40          33              0
esx-04   40          48              8
esx-05   40          35              0
esx-06   40          49              9
esx-07   40          42              2
esx-08   40          37              0
esx-09   40          45              5
esx-10   40          52              12
esx-11   40          48              8
esx-12   40          37              0
esx-13   40          42              2
esx-14   40          46              6
esx-15   64          87              23
esx-16   64          85              21
Total    688                         102

What does this table tell us? The total physical memory across all their hosts is 688 GB of RAM, but on some hosts they have assigned more RAM to VMs than is physically present. A total of 102 GB more!!! That is no typo, that really is one hundred and two gigabytes of RAM. VMware ESX saved them buying 102 GB by allowing memory over-commit. To me, that is quite a saving, a saving you won’t get using Hyper-V. Actually, there is more.
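The totals in the table can be re-derived with a few lines of Python. This is only a sketch of the arithmetic: the host figures are copied from the table, while the names HOSTS and overcommit_gb are made up for the example. Over-commit per host is taken as the assigned memory minus the physical memory, floored at zero, which is how the table appears to count it.

```python
# (host name, physical GB, GB assigned to VMs), copied from the table
HOSTS = [
    ("esx-01", 40, 38), ("esx-02", 40, 46), ("esx-03", 40, 33),
    ("esx-04", 40, 48), ("esx-05", 40, 35), ("esx-06", 40, 49),
    ("esx-07", 40, 42), ("esx-08", 40, 37), ("esx-09", 40, 45),
    ("esx-10", 40, 52), ("esx-11", 40, 48), ("esx-12", 40, 37),
    ("esx-13", 40, 42), ("esx-14", 40, 46), ("esx-15", 64, 87),
    ("esx-16", 64, 85),
]


def overcommit_gb(physical_gb, assigned_gb):
    """GB assigned beyond physical memory; zero if the host is not over-committed."""
    return max(0, assigned_gb - physical_gb)


total_physical = sum(physical for _, physical, _ in HOSTS)
total_assigned = sum(assigned for _, _, assigned in HOSTS)
total_overcommit = sum(overcommit_gb(p, a) for _, p, a in HOSTS)

print(total_physical)    # 688
print(total_assigned)    # 770
print(total_overcommit)  # 102
```

Note that the 102 GB is the sum over the over-committed hosts only. Hosts with headroom (esx-01, for example, has 2 GB unassigned) count as zero rather than being netted against it, so the cluster-wide difference between assigned and physical memory is 770 - 688 = 82 GB.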

Transparent Page Sharing

Say we have 10 VMs running Windows 2003 Server and in each VM there is about 500 MB of identical data, because it is just the basics of Windows 2003. So you have 4.5 GB (5 GB – 500 MB) of redundant data in your expensive RAM. Not with VMware transparent page sharing, you haven’t. So there is another great memory saver. Now look at the bigger picture again and see how much memory would be saved if you have 400 Windows VMs, and add the savings from memory over-commitment… I think you get the point and why you can really benefit from these techniques.
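The page-sharing arithmetic is simple enough to sketch in Python. This is an idealized upper bound, not a model of how ESX actually shares pages: it assumes the 500 MB per VM really is byte-for-byte identical and stays that way, so that only one copy has to be kept. The function name redundant_mb is made up, and applying the same 500 MB to 400 VMs is an assumption for illustration.

```python
def redundant_mb(vm_count, identical_mb_per_vm):
    """MB spent on duplicate copies if a single shared copy would be enough."""
    if vm_count < 1:
        return 0
    return (vm_count - 1) * identical_mb_per_vm


print(redundant_mb(10, 500))   # 4500 MB, the 4.5 GB from the example
print(redundant_mb(400, 500))  # 199500 MB with the same 500 MB assumption
```

Real-world savings will be lower, since guests diverge as they run, but it shows why the effect grows with the number of similar VMs on a host.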

Thank you Tom Howarth for checking this post before publishing.

Series:
Hyper-V, not in my datacenter (part 1: Hardware)
Hyper-V, not in my datacenter (part 2: Guest OS and Memory overcommit)
Hyper-V, not in my datacenter (part 3: Motions and storage)

  • It also depends on what the purpose of your ESX Cluster is. It’s not uncommon to use overcommitment for test environments and for instance VDI. It hardly ever gets fully utilized and it would cost you a lot of money to fully equip all servers.

    Transparent Page Sharing is called “transparent” for a good reason. A lot of VMware customers are using this feature without realizing they are using it. That’s the cool thing about the feature. It also works intra-VM. So if you have a virtualized Citrix environment and have 30 users running the same apps all the duplicate memory blocks get deduped by TPS.

    So what’s not to like about these features? I like having the possibility to use TPS and over-commitment. It’s still the sysadmin who decides whether they actually want to use it or not.

    Great series so far!

  • Andrew Storrs

    I see Microsoft limiting support for guest functionality to their newer operating systems as just another “incentive” to encourage people to upgrade to newer versions (e.g. Windows Server 2008). They are infamous for this practice and although you’re right (in that it makes no sense from a technical perspective), I can’t see them dropping the practice.

    Typical MS B.S. ;)

    Looking forward to #3.

  • Great post again, can’t wait to see the next part!

    About the overcommitment: a few weeks ago I read an article, I can’t remember by whom, with a poll about who is using memory overcommitment. Just 2% of the customers were really using overcommitment. My question is why? Are they too scared, thinking: let’s play it safe? Maybe the environment is too small relative to the price of memory, so they can just buy a lot of it. Or is memory so cheap that it isn’t an issue anymore? I think a lot of customers and engineers want to play it safe and buy more memory than they really need. Any comments on this?

  • @Tomas: I have no idea how many other companies use over-commit, but I do hear from several colleagues that they are using it. Maybe it’s an idea to start a poll :-) (Be sure to explain memory over-commit first and that it is not over-allocating when resources are exhausted).

  • Gabe,

    There was already a poll done by VMware of a bunch of customers for memory overcommit. You can find it here: http://blogs.vmware.com/virtualreality/2008/10/memory-overcomm.html. There was also one done by another blogger but I can’t honestly remember who that was.

    As for older guest support, I see that all the time in shops of all sizes. I was just working with my Uncle in Atlanta who wanted to virtualize 10 servers for the city he lives in. Turns out 5 of them were Windows NT, 2 of them were Windows 2000, and the other 3 were Windows 2003 SP1. It’s not uncommon to come across older versions of Windows in just about any size environment. That’s usually the first question I ask customers when they’re looking at different solutions – will your ISV support it and is your OS supported on the virtualization solution. The answer leads nearly everyone towards VMware.

  • @tomas. I have not seen a lot of customers using the over commit features. The reason though is that most people seem to be assigning VM’s the amount of required RAM and not what the app calls for regardless of usage. That is to say, if an app says it requires a minimum of 4 GB but only has been using 1 GB, my customers typically assign it 1 GB and then watch for performance issues and increase memory allocation as needed.

  • So if there is a problem with the so called “supported” OS who do you call? VMware, or Microsoft? If it’s an OS like Windows NT 4, and you call Microsoft guess what the answer is going to be? Windows NT 4 is unsupported. That’s why the Windows OS support list looks like it does on Hyper-V – we support currently supported Windows operating systems, with support meaning something different than what it means to VMware. What VMware really means is that the OS can run on their platform and they provide the drivers for it. They don’t however “support” the OS.

    Cheers

    Stu

    Disclaimer: I work for Microsoft NZ

  • So, that doesn’t change the fact that VMware supports running a large variety of operating systems on their hypervisor and Hyper-V is limited. Or are you saying that if I run FreeBSD on Hyper-V and for some weird reason Hyper-V crashes, it’s still supported although it’s not on the supported guests list? I don’t think so.

  • Duncan

    What that means is that if you are running a supported OS under Hyper-V and something goes wrong, there are agreements in place to avoid finger pointing and to ensure the issue gets resolved, regardless of whether the problem is with the hypervisor or with the guest OS. That’s what supported means to Microsoft. What happens if VMware simply says “Yeah, Windows NT crashed, talk to Microsoft”? That’s the difference between providing some VMware tools and actually supporting.

    Cheers

    Stu

  • So what you are actually saying is that MS will never ever allow distros like BSD, Ubuntu, Debian etc. to run on their hypervisor without breaking support?

    I know what it means, but as a result you’ll have a very limited set of OSes you can run on your hypervisor.

  • No, I’m not saying that at all. As far as I know all those distros run perfectly well under Hyper-V using emulated hardware (as does Windows NT 4). You can run them all you like. It’s just not supported by Microsoft if your guest OS breaks because of the hypervisor layer. If Debian or Ubuntu want to get their distro officially supported under Hyper-V I’m sure they could arrange that somehow (I don’t know what the process is exactly).

  • Great article, good to get such independent reviews. As a VMware instructor, I hear VMware implementation stories from many different companies. 19 out of 20 couldn’t care less about things like DRS and memory overcommitment. They say “it is cheaper to add another server than to find out how DRS should be configured”. The argument of cost is not a good choice for tipping the balance towards VMware: just add up the VMware licensing costs, compare them with a Hyper-V solution, and you can pay for way more than 102 GB of memory… But VMware is still better (-;

  • Hi Gabe,

    I’m a reporter with SearchServervirtualization.com, writing an article about VMware vs. Hyper-V, and I found your input on memory overcommit useful. In fact, I am going to quote you and link to this blog in the article.
    One question though – when you talk about RAM here: “If you have 20 VMs, with each 2Gb assigned but they all use no more than 1Gb, then you would be throwing away 10Gb of RAM in the host that never gets used…”
    You mean Gigabytes (GB), not gigabits (Gb), correct? Just making sure the Gb is only a typo…

    Thanks

  • @Bridget: To be honest, I’m not always very consistent in these kinds of things. I did mean gigabytes of memory, so GB it is then :-)

    And I’m honored to be quoted :-)

    Please remember there is a lot of confusion about what “overcommit” really is. I think it is essential to explain this to your audience.

    Regards
    Gabrie

  • Hi Gabe,

    It does become evident that Microsoft has failed to get the concept of “Sharing Resources” in order to maximize utilization of the hardware layer. I view Microsoft as an infant with attitude in the world of virtualization services.

    Regards,

    Mike

  • Miquel Àngel

    We have the same problem. In our datacenter 90% of the products are Microsoft, but when we wanted to virtualize we did it with ESXi, because we still have Windows NT on some servers… Microsoft rules!

  • Fantastic post. For the last couple of months I have been looking for information about the guest OS support of Hyper-V, and I finally got all of it. The post is very informative.

  • penman96

    What the memory over-commitment table tells me is that 3 or 4 of these ESX host servers may start disk paging if there is a utilization spike. DRS wouldn't help either, as DRS isn't memory overcommitment aware. As you correctly said at first, memory overcommitment in production is a risky idea. VMware best practice discourages it too – even if their marketing materials do not. ESX 15 has a 25% memory overcommitment.

    Not in my datacenter ;-)

  • Kukulkan

    The biggest hole in your review is ignoring the cost of ESX licensing.

  • Hi, no, I don't ignore the cost of ESX licensing. But you can't just say ESX costs (for example) $1000 and Hyper-V is free. If you can run 40 VMs on 1 ESX host and only 10 VMs on that same host when it is running Hyper-V, then you need to spend more on hardware with Hyper-V. So yes, the license cost of ESX is higher, but don't you care about the total cost too?

    Please, for good discussion, also supply your e-mail address when entering new comment.
    Gabe

  • Both ESX and Hyper-V are based on a hypervisor running on bare metal, but they use two different architectures. The main difference, as Andrew states, is that once ESX is up and running, the RHEL-based service console can actually be unavailable and the VMs on the ESX host will still run, even though they will be unmanageable.

    In Hyper-V the VMs, or child partitions, are dependent on both the parent partition and the hypervisor. If the parent partition is not available and working, then the guest VMs will not work at all.

  • Ah ok, that's cool, thanks for the explanation. I was about to post that I didn't understand. So there's a difference in architecture design between the two?
    George

  • Great article for people who want to understand memory over-commitment and transparent page sharing!
