Harlem shuffle with VMDKs

Today I was brainstorming with Arnim van Lieshout about how to place all VMDKs on our LUNs, because right now we are facing two problems in our environment:

  • No balancing of IO workloads across LUNs
  • VMDKs often don’t fit nicely into the remaining LUN space, so we lose a lot of GBs to unused space.

We tried coming up with an algorithm that would deploy the VMDKs over our LUNs based on I/O and VMDK size. What we have now is purely an algorithm; we have no idea yet how to build a tool around it, but I bet there are others that can make nice tools around this :-)

First we set the following requirements:

  • Each LUN is of equal size; in our environment this is 500GB.
  • Balancing of disk paths is out of scope; we expect this to already be in place.
  • We try to fill the LUNs as much as possible, but leave 50GB of free space for snapshots etc.

What we thought of was the following (the example is of one of our clusters):

  • Have a list of all VMs (300) with their average I/O over, let’s say, one week, sorted with the highest I/O on top.
  • Take the number of LUNs (37) and split the VM list into sections of that size. In other words, we would get 300 / 37 = 8.1, rounded up to 9 sections of VMs. The first section would then hold the top I/O consumers.
  • Take the total average I/O and divide it by the number of LUNs to get an average I/O per LUN.
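The bookkeeping in those steps is simple enough to sketch in a few lines of Python (the total I/O figure below is made up purely for illustration):

```python
import math

# Example figures from the post: 300 VMs spread over 37 LUNs.
num_vms = 300
num_luns = 37

# Each section holds one VM per LUN, so the number of sections is the
# VM count divided by the LUN count, rounded up: 300 / 37 = 8.1 -> 9.
num_sections = math.ceil(num_vms / num_luns)

# The per-LUN I/O budget is the total average I/O divided by the LUN count.
total_avg_io = 74_000  # hypothetical total average IOPS across all VMs
avg_io_per_lun = total_avg_io / num_luns  # 2000.0 in this made-up example
```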

Now to divide the VMs over the LUNs we would follow these steps:

  • Take the first (or later on, the next) section of VMs and place the first VM of that section on LUN1, the next VM on LUN2, the next on LUN3, etc.
  • When placing a VM on a LUN, check the following: do we still meet the minimal free space requirement, and are we still below the max I/O we would allow on a single LUN (= average I/O per LUN + x%)?
  • For the next run of placing VMs, start placing on the LUNs in reverse order. In our example VM-37 would be on LUN37 and VM-38 would also be on LUN37, while LUN36 would hold VM-36 and VM-39.
  • Then when that section is complete, you start with section 3 top-down again, and so on.
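The steps above could be sketched roughly like this in Python. This is a minimal sketch, not the tool itself: the constants and the `place_vms` helper are our own naming, and a VM that fails the free-space or I/O check is simply set aside rather than retried on another LUN.

```python
import math

LUN_SIZE_GB = 500
MIN_FREE_GB = 50          # leave room for snapshots etc.
IO_HEADROOM = 1.10        # average I/O per LUN + 10% (the "x%")

def place_vms(vms, num_luns):
    """vms: list of (name, size_gb, avg_io), pre-sorted by avg_io descending."""
    avg_io_per_lun = sum(io for _, _, io in vms) / num_luns
    max_io = avg_io_per_lun * IO_HEADROOM
    luns = [{"vms": [], "used_gb": 0, "io": 0} for _ in range(num_luns)]
    unplaced = []

    num_sections = math.ceil(len(vms) / num_luns)
    for section in range(num_sections):
        chunk = vms[section * num_luns:(section + 1) * num_luns]
        order = range(num_luns)
        if section % 2:                      # every other section runs in reverse
            order = reversed(range(num_luns))
        for (name, size_gb, io), i in zip(chunk, order):
            lun = luns[i]
            fits = lun["used_gb"] + size_gb <= LUN_SIZE_GB - MIN_FREE_GB
            if fits and lun["io"] + io <= max_io:
                lun["vms"].append(name)
                lun["used_gb"] += size_gb
                lun["io"] += io
            else:
                unplaced.append(name)        # needs a second pass / manual fix
    return luns, unplaced
```

With four toy VMs over two LUNs, the serpentine order pairs the biggest I/O consumer with the smallest: the first pass fills LUN1, LUN2 and the second pass fills LUN2, LUN1. A real tool would retry the `unplaced` VMs on the next LUN instead of giving up.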

These are our initial thoughts on how to strike a balance between spreading I/O load and not wasting too much disk space. We would like to hear your thoughts on this, so please let us know in the comments.

Arnim van Lieshout & Gabrie van Zanten

Edit: Also read Arnim’s post about this:   http://www.van-lieshout.com/2009/02/vmware-storage-sudoku/

11 thoughts on “Harlem shuffle with VMDKs”

  1. Imagine having them all on one big volume that grew only when needed to do so, on the fly ;)

    As I said on Twitter, might be time to go NFS

  2. The difficulty, I think, is not in how to move them, but in building the intelligence into the script. Would love to see if PowerShell can do this as easily as it does other jobs. Oh, and if you really want to try it, it would be great if the PowerShell script first shows a list of what it suggests would be the ideal situation and then gives the commands you can run to do the Storage VMotions :-) But now I’m pushing it, I think :-) :-)

  3. Great post guys!

    You could indeed script this with PowerShell, Alan. But it will be a lot, let me repeat this, a lot of work.

    A couple of questions and remarks:
    How often would you run this script?
    How long do you think it would take to reshuffle your complete environment (let’s say a small environment of 4TB)?
    What about the SCSI reservations/locks this reshuffling will trigger?
    What’s the risk of moving VMs around (vs. the benefits)?

    During the move-around you would probably need an additional LUN that can be used as temporary storage for the VMs that are being moved. In some, and maybe even most, cases you will need to move VMs off before you can start moving others back in. And the app/script would need to take disk space into account during moving (Storage VMotion also uses snapshots, for instance).

    I think it will be one gigantic puzzle. And especially for large environments it will be really really tough to get this done within off hours.

    But then again, I wouldn’t be surprised if, for instance, EMC is working on a vStorage variant of this: http://virtualgeek.typepad.com/virtual_geek/2008/09/so-what-does-vs.html

  4. Gabe, good stuff. I use a candidate spreadsheet with a column for each Datastore. I then place the disks into the datastores spreading them out on a very similar basis to the one you have mentioned. Some simple formulas for each column/datastore summarise the used space and number of VMs, making allocation easy and validation quick.

    In addition to the considerations you mention, I also do the following. Separate out any VMs that need special attention to their storage design, e.g. an Exchange or SQL server. Often there may be multiple performance tiers of underlying storage, so the VMs and datastores are categorised into the appropriate tiers. Lastly, consider keeping some VMs’ disks away from each other; this is essentially like your I/O limit, but even though some VMs may not break the metrics you may still want them separate, for example different tiers in an application stack, development boxes, etc.


  5. Like Duncan said, I do not see the benefit of a script that can perform the needed Storage VMotions.
    I wouldn’t dare to run such a script!

    I’m just interested in an algorithm that can sort this puzzle out.

    Where are the geeks that can do the math?


  6. Pingback: Arnim van Lieshout
  7. We have the following rule: a maximum of 7 VMs per LUN, and keep 15% free space.

    I deploy VMs from a .csv; from there I can choose the best place for each VM. Furthermore I make sure that the LUNs are balanced across the HBAs.

  8. Our problem right now is that we are tech refreshing our old “shared” filers and moving our entire infrastructure (1500 VMs) to a new dedicated fabric.

    We have been trying to come up with some logic to both take size of the VMs and I/O into account. But we don’t have a set standard on LUN size anymore since the storage is now dedicated to our clusters, so we are also trying to come up with the best LUN size for each cluster individually.

    I would love to see a script or spreadsheet where you could input 1) the known metrics (i.e. VM sizes & disk I/O for the past month) and 2) the known caps (i.e. Max VMs per LUN, & LUN size if you know it) and it would output the most efficient use of space taking into account disk I/O and map VMs to the LUNs. It would also be awesome if the tool could tell you what the most efficient LUN size would be based off your VM sizes if you didn’t already know it.

    Another reason why we don’t have a set LUN size anymore is that as our offering progresses more into production applications, users want more storage on the VM. So the average VM sizes on an older, fully populated cluster and on a newer cluster are drastically different. So we figured we would create a standard LUN size per cluster instead of an overall standard LUN size.

    I don’t have the scripting skills to pull this one off, but it would make my life a lot easier during our huge storage migration project. ;)


Comments are closed.