Unbelievable Hyper-V performance on Dell R900

Today I read an article on virtualization.info that really surprised me. See here: Benchmarks: Hyper-V performance on Dell R900 with Quad-Core and Six-Core Intel Xeon
When I looked at the PDF that explains the benchmark, I couldn’t believe its conclusions at first, so I read it thoroughly and ended up at the same conclusion.

Performance is not good in this test. I don’t know if it is Hyper-V, bad configuration of Hyper-V, bad testing or anything else that went wrong. But the results are surely not good.

Their test configuration was a VM with 2 GB RAM and 1 vCPU, running Windows Server 2008 x64 and SQL Server 2005 x64, including the Hyper-V integration services so the drivers are optimized for Hyper-V. This VM was cloned a number of times and this set of VMs was used to hit the host hard. To simulate how VMs run in a typical enterprise environment, they increased the number of VMs with the SQL application until the host CPU hit 80%.

The results at 80% CPU were as follows:

HP DL585 G2 – 4x Quad Core – 128 GB RAM – 26 VMs -> 1.625 VMs per core

Dell R900 – 4x Quad Core – 128 GB RAM – 30 VMs -> 1.875 VMs per core

Dell R900 – 4x Six Core – 128 GB RAM – 40 VMs -> 1.667 VMs per core
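For anyone who wants to check the per-core numbers, here is a minimal Python sketch of the arithmetic. The socket counts, cores per socket and VM counts come from the results above; the function and variable names are my own, purely for illustration.

```python
# Host configurations as listed above:
# (label, sockets, cores per socket, VMs running at 80% host CPU)
hosts = [
    ("HP DL585 G2", 4, 4, 26),
    ("Dell R900 (quad-core)", 4, 4, 30),
    ("Dell R900 (six-core)", 4, 6, 40),
]


def vms_per_core(sockets, cores_per_socket, vms):
    """Return the number of VMs per physical core."""
    return vms / (sockets * cores_per_socket)


for label, sockets, cores_per_socket, vms in hosts:
    density = vms_per_core(sockets, cores_per_socket, vms)
    print(f"{label}: {density:.3f} VMs per core")
```

Note that the six-core box runs more VMs in total but fewer per core than the quad-core R900.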

I think these results are unbelievable. Unbelievably low, that is. The first time I read them, I thought “What? Such a big machine with just 26 VMs?” And after doing the math and counting the VMs per core, I still think this is way too low.

Looking at the site of my current customer, we have HP DL585 G2 and G5 servers running VMware ESX 3.0.2 and 3.5u2. We easily get 3.5 VMs per core and sometimes even 4.5 VMs per core. “OK, but the test VM is under heavy load,” you will say. That is true, but don’t think our VMs are doing nothing. We run SQL 2000, SQL 2005, Citrix, file servers, etc., and it is all production. A total of 650 VMs at this moment, and we are still busy P2Ving more and more servers.

So where did it go wrong? First, I think that testing SQL 2005 with just 2 GB of RAM and 1 vCPU might not be the best test. And how hard was this SQL VM hit? Or is it irrelevant what the VM itself was doing?

On the VMware website you can find the VMmark results of the same Dell R900 server. In this test the Dell R900 with 24 cores does a whopping 14 tiles. Each tile is 6 VMs, which gives us 84 VMs on that same Dell R900. That is more than double the 40 VMs in the Hyper-V test and comes to 3.5 VMs per core.
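The tile math is easy to reproduce. Below is a small Python sketch using only the figures quoted in this post (14 tiles, 6 VMs per tile, 24 cores, and 40 VMs for the six-core R900 in the Hyper-V test); the names are made up for the example. Keep in mind that the two benchmarks run different workloads, so the ratio is only a rough indication.

```python
VMS_PER_TILE = 6  # a VMmark tile consists of six VMs


def vmmark_density(tiles, cores):
    """Return (total VMs, VMs per core) for a given tile count."""
    total_vms = tiles * VMS_PER_TILE
    return total_vms, total_vms / cores


# Dell R900 with 24 cores scoring 14 tiles, as quoted above
total_vms, per_core = vmmark_density(tiles=14, cores=24)

# 40 VMs is the six-core Dell R900 result from the Hyper-V test
hyperv_vms = 40
ratio = total_vms / hyperv_vms

print(f"VMmark: {total_vms} VMs, {per_core:.1f} VMs per core")
print(f"That is {ratio:.1f}x the VM count of the Hyper-V test")
```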

So I’m wondering, is Hyper-V that bad? Although I’m a VMware fan and hope VMware ESX is much faster, I find this difference hard to believe. OK, they ran different workloads, so why not run VMmark under Hyper-V? From what I have learned, VMmark is just a bunch of VMs that can be run under any hypervisor. That way we can compare apples to apples. Maybe Microsoft themselves should step in and perform a really good performance test with VMmark, because judging by these results, Hyper-V doesn’t come out well.

(PS: Did you see that in the Hyper-V test, the Dell R900 with 16 cores runs 4 more VMs than the DL585 G2 with 16 cores, although in the VMmark test we see the DL585 G5 (not G2) perform slightly better than the Dell R900... strange...)

What is a tile?
A tile is a collection of six diverse workloads concurrently executing specific software. Running on one of two separate operating systems, each workload runs in its own virtual machine and executes applications found in all the world’s datacenters. Included in a single tile are a web server, file server, mail server, database, java server, as well as an idle machine. Each virtual machine in a tile is tuned to use only a fraction of the system’s total resources. As a tile, the aggregate of all six workloads normally utilizes less than the full capacity of modern servers. Therefore, the complete saturation of a system’s resources and accurate measurement of server performance with VMmark require the execution of multiple tiles simultaneously.