Unbelievable Hyper-V performance on Dell R900

Today I read an article on virtualization.info that really surprised me. See here: Benchmarks: Hyper-V performance on Dell R900 with Quad-Core and Six-Core Intel Xeon
Looking at the PDF in which the benchmark test is explained, I first couldn’t believe the conclusions they drew, so I read it thoroughly and came to the same conclusion.

Performance is not good in this test. I don’t know if it is Hyper-V, bad configuration of Hyper-V, bad testing or anything else that went wrong. But the results are surely not good.

Their test configuration was a VM with 2 GB RAM and 1 vCPU, running Windows Server 2008 x64 and SQL Server 2005 x64, including Hyper-V integration services so the drivers are optimized for Hyper-V. This VM was cloned a number of times and this set of VMs was used to hit the host hard. To simulate how VMs run in a typical enterprise environment, they increased the number of VMs with the SQL application until the host CPU hit 80%.

The results at 80% CPU were as follows:

HP DL585 G2 – 4x Quad Core – 128 GB RAM – 26 VMs -> 1.625 VMs per core

Dell R900 – 4x Quad Core – 128 GB RAM – 30 VMs -> 1.875 VMs per core

Dell R900 – 4x Six Core – 128 GB RAM – 40 VMs -> 1.667 VMs per core
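For anyone who wants to check the math, here is a minimal Python sketch of the VMs-per-core calculation. The VM counts come from the results above. The core counts assume 4 sockets times the cores per socket, and the function name is just my own illustration.

```python
# VM counts are from the benchmark results listed above.
# Core counts assume 4 sockets x cores per socket.
results = [
    ("HP DL585 G2, 4x quad-core", 4 * 4, 26),
    ("Dell R900, 4x quad-core", 4 * 4, 30),
    ("Dell R900, 4x six-core", 4 * 6, 40),
]


def vms_per_core(vms, cores):
    """Return the average number of VMs hosted per physical core."""
    return vms / cores


for name, cores, vms in results:
    density = vms_per_core(vms, cores)
    print(f"{name}: {vms} VMs / {cores} cores = {density:.3f} VMs per core")
```

Note that the six-core box hosts more VMs in total but fewer per core than the quad-core R900.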

I think these results are unbelievable. Unbelievably low, that is. The first time I read it, I thought “What? Such a big machine with just 26 VMs?”. And after doing the math and counting the VMs per core, I think this is way too low.

Looking at the site of my current customer, we have HP DL585 G2 and G5 servers running VMware ESX 3.0.2 and 3.5u2. We easily get 3.5 VMs per core and sometimes even 4.5 VMs per core. “OK, but this is a VM under heavy load,” you will say. That is true, but don’t think our VMs are doing nothing. We run SQL 2000, SQL 2005, Citrix, file servers, etc., and it is all production. A total of 650 VMs at this moment, and we are still busy P2Ving more and more servers.

So where did it go wrong? First, I think that testing SQL 2005 with just 2 GB of RAM and 1 vCPU might not be the best test. And how hard was this VM with SQL hit? Or is it irrelevant what the VM itself was doing?

On the VMware website you can find the VMmark results of the same Dell R900 server. In this test the Dell R900 with 24 cores does a whopping 14 tiles. Each tile is 6 VMs, which gives us 84 VMs on that same Dell R900. That is more than double and comes to 3.5 VMs per core.
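The tile arithmetic can be sketched the same way. This is only a rough illustration using the figures quoted in this post (14 tiles, 6 VMs per tile, 24 cores, 40 VMs in the Hyper-V test). Since each tile includes one idle machine, the sketch also shows the count when only the five loaded VMs per tile are considered. That second figure is my own assumption about what would be a fairer count, not something from the VMmark report.

```python
TILES = 14            # VMmark tiles reported for the 24-core Dell R900
VMS_PER_TILE = 6      # each tile holds six VMs
IDLE_PER_TILE = 1     # one of those six is an idle machine
CORES = 24
HYPERV_VMS = 40       # VMs on the same 24-core box in the Hyper-V test


def total_vms(tiles, vms_per_tile):
    """Total VM count for a given number of tiles."""
    return tiles * vms_per_tile


all_vms = total_vms(TILES, VMS_PER_TILE)
active_vms = total_vms(TILES, VMS_PER_TILE - IDLE_PER_TILE)

print(f"All VMs:    {all_vms} -> {all_vms / CORES:.2f} VMs per core")
print(f"Loaded VMs: {active_vms} -> {active_vms / CORES:.2f} VMs per core")
print(f"VMmark vs Hyper-V VM count: {all_vms / HYPERV_VMS:.2f}x")
```

Even when the idle VMs are left out, the per-core density stays well above the Hyper-V figures, although the workloads are not the same.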

So I’m wondering, is Hyper-V that bad? Although I’m a VMware fan and hope VMware ESX is much faster, I find this difference hard to believe. OK, they ran different workloads, so why not run VMmark under Hyper-V? From what I learned, VMmark is just a bunch of VMs that can be run under any hypervisor. That way we can compare apples to apples. Maybe Microsoft themselves should step in and perform a really good performance test with VMmark, because judging by these results, Hyper-V doesn’t come out well.

(PS: Did you see that in the Hyper-V test the Dell R900 with 16 cores runs 4 more VMs than the DL585 G2 with 16 cores, although in the VMmark test we see the DL585 G5 (not G2) perform slightly better than the Dell R900? Strange...)

What is a tile?
A tile is a collection of six diverse workloads concurrently executing specific software. Running on one of two separate operating systems, each workload runs in its own virtual machine and executes applications found in all the world’s datacenters. Included in a single tile are a web server, file server, mail server, database, java server, as well as an idle machine. Each virtual machine in a tile is tuned to use only a fraction of the system’s total resources. As a tile, the aggregate of all six workloads normally utilizes less than the full capacity of modern servers. Therefore, the complete saturation of a system’s resources and accurate measurement of server performance with VMmark require the execution of multiple tiles simultaneously.

9 thoughts on “Unbelievable Hyper-V performance on Dell R900”

  1. Thanks for reading and reviewing my whitepaper. I really like to get into these discussions of testing methodology, so let me try to explain some additional details about the testing that I did in response to your comments.

    The number of VMs that a host can support is entirely dependent on what those VMs are and what they are doing. In a previous test that I did with the same DVD Store workload on SQL Server 2005 on W2K3 VMs I was able to run 32 VMs on a dual socket quad-core (8 cores total) 2950 III server- http://www.dell.com/downloads/global/power/dell2socket_vs_hp4socket_vmware.pdf . In this new test I was able to get 30 SQL VMs on a quad-core 4-socket server (16-cores). The difference is that each VM was doing more work in the second test. If you look at the Orders Per Minute for these two tests you will see that the 8-core system achieved about 32K OPM and the 16-core system achieved about 63K OPM.

    The test in the whitepaper shows the relative performance difference between the servers that were all tested under the same conditions. VMmark is a standardized benchmark that shows what could be achieved following the rules and conditions of that benchmark. The two tests are different and really can’t be directly compared in terms of number of VMs.

    In terms of VMmark – we did submit an R900 24-core result on ESX. As far as I know nobody has submitted a VMmark result on anything other than ESX.

    Thanks – Todd

  2. Hi Todd,

    Thank you for taking the time to respond to my blog. I do understand that I can’t compare your results against those of a VMmark test because there is a difference in workload. But on the other hand, those VMmark VMs are surely not sitting idle all the time; I bet they are also being hit very hard. So, although not scientifically comparable, my gut feeling tells me the difference is too BIG.

    Therefore I hope someone with enough resources will step up and use a testing method that is equal in all situations, can be easily reproduced by different people, and will give us insight into Hyper-V performance against ESX.

    The way I see it now, VMmark would be a great testing method for any hypervisor.

    I do not question your testing method for comparing performance between Dell and HP !!! For your goal I’m sure it’s fine.

    I’m searching for the Hyper-V / ESX comparison.

    Hope to hear from you.

    (Should VMmark not be suited for an all-hypervisor test, maybe together we can come up with specs for a good testing method).

    Gabrie

  3. Todd’s right that you really can’t compare VMmark scores or scaling with these Dell scores using this Microsoft datastore driver. But beyond that, I fail to see what value this paper offers Dell. They may have found a workload that shows a performance advantage for their 4-core R900 over the 4-core HP, but based on a quick check of the VMmark scores for similar systems, I’d suspect that in the real world these two systems are about on par with each other.

    Given that, I’m at a loss to understand why Dell would use Hyper-V to showcase a mere 19% performance increase in their 6-core result over their 4-core when — again extrapolated from published VMmark scores — I would expect that in VMmark the performance delta using ESX would be closer to 30%. Sure they showed a performance-per-watt savings with the 6-core result even with that modest scaling, but wouldn’t you expect the performance-per-watt benefit to have been even greater with 30% improvement? (Shrug. Guess that’s why I’m not a Marketeer.)

    And as much as there is a legitimate basis for capping CPU at 80% for these measurements, I’d still want to see the 100%-utilized scores. Otherwise, I’d be suspicious that this rationale was just a pretext for hiding Hyper-V performance issues at higher loads. So it really does behoove the tester to offer that 100% data point, even if it is not the point used for the primary comparison.

    And lastly, based on the specs, I believe the HP system was a G5, not a G2.

  4. I just converted a VMware box that ran Windows Server 2008 Enterprise x64 and two Linux server VMs over to Hyper-V with the same two Linux server VMs.

    Sorry to say that VMware is MUCH faster than Hyper-V (and yes, I have fine-tuned everything).

    I did find this:

    http://support.microsoft.com/kb/961661

    and I have a high-end graphics card in this system. Yet VMware does not choke with a high-end graphics card.

    Next up is to try only MS VMs on this machine and then rip it out and try only MS VMs on the same machine (I have two sets of hard drives that can boot, so it is easy to switch between the two).

    Also, simply opening up a web browser is slower in Hyper-V on the HOST than in VMware. Since I run Folding@home, I can see that the same configuration is producing 25% fewer units per day.

    My goal on my 8-way was to run 4 VMs (Office Communications Server 2007 R2, Exchange, and two Linux VMs), but that will not happen under Hyper-V. Under VMware it was no problem.

    If anyone has any ideas, that would be great as I do want Hyper-V to work. Maybe R2 will fix these issues.

  5. OK – Hyper-V on a low-end graphics card does give fairly decent performance. The issue is around WDDM on high-performance cards. For a 1.0, Hyper-V is decent, but it does not quite give me what I am looking for. I was hoping it would be a cross between VMware ESX for virtualization with the abilities of either VMware Server or VMware Workstation, so I could:

    a) Use for development
    b) Install on laptop and demo various MS programs

    While I really wanted this to work, I have now gone back to VMware for my virtualization needs. Apparently, there are many others who wish they could do a and b, but this will be a low priority. Even the trick of installing Windows XP 64 drivers to get at least some performance and multi-monitor support no longer works on Hyper-V R2.

    So would I recommend Hyper-V?

    Yes, but only when you keep the server in a closet and administer it with SCVMM, primarily for Server Core running IIS. I still do not like to virtualize databases, as they really are the backbone of companies and require a lot of horsepower.

    I also tried and liked the feature of load balancing VMs across machines. Again, many cool features, just not exactly what I need at home right now.

  6. Performance Advantage of Dell PowerEdge R900 over HP DL585 Running Microsoft Hyper-V

    Looks like Gabe never read the PDF title.
