Don’t we all know the moment you realize you forgot to monitor the free space on your LUNs? The call at 3 a.m. from the monitoring room: a LUN has no free space left and VMs are crashing (the VMs with active snapshots). It happened to us too. We have now configured adequate monitoring that alerts when less than 20 GB of free space is left. But despite this, a LUN could still run out of space.
The problem is that when a LUN is full and you have to clean up snapshots, you need extra space to remove them. This is not an easy task, and certainly not for the average admin who looks after Windows servers, web servers, SQL servers and ESX servers as just one part of the job and gets called in the middle of the night.
My colleague Arnim came up with the great idea of placing a 5 GB dummy vmdk on each LUN. Now when a LUN runs out of space, you can delete the dummy vmdk, clean up the mess and free up space. After that, of course, don’t forget to recreate the dummy for the next time.
For the admin who gets called in the middle of the night, there is now an option to just delete the dummy vmdk and postpone solving the real problem until first thing in the morning.
(We will also be monitoring for the absence of the dummy file!)
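To make the two checks concrete, here is a minimal Python sketch of what such a monitor could evaluate per LUN: free space below 20 GB, and a missing dummy file. It is only an illustration, not our actual monitoring setup. The `Datastore` class, the `check_datastore` function and the file name `dummy.vmdk` are hypothetical names, and how you collect the free space and file listing from your environment is left out entirely.

```python
from __future__ import annotations

from dataclasses import dataclass, field

GB = 1024 ** 3
FREE_SPACE_THRESHOLD = 20 * GB  # alert below 20 GB free, as described above
DUMMY_NAME = "dummy.vmdk"       # assumption: hypothetical name of the placeholder file


@dataclass
class Datastore:
    """Snapshot of one LUN as seen by the monitor (hypothetical structure)."""
    name: str
    free_bytes: int
    files: set[str] = field(default_factory=set)


def check_datastore(ds: Datastore) -> list[str]:
    """Return alert messages for one datastore; an empty list means healthy."""
    alerts = []
    if ds.free_bytes < FREE_SPACE_THRESHOLD:
        alerts.append(f"{ds.name}: only {ds.free_bytes / GB:.1f} GB free")
    if DUMMY_NAME not in ds.files:
        alerts.append(f"{ds.name}: {DUMMY_NAME} is missing, recreate it")
    return alerts


if __name__ == "__main__":
    # Made-up inventory purely for demonstration.
    inventory = [
        Datastore("LUN01", 50 * GB, {"dummy.vmdk", "web01.vmdk"}),
        Datastore("LUN02", 3 * GB, {"sql01.vmdk"}),
    ]
    for ds in inventory:
        for alert in check_datastore(ds):
            print(alert)
```

A missing dummy is reported separately from low free space on purpose: after a night-time emergency the LUN may look healthy again, while the safety buffer for the next incident is gone.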
Love to hear your thoughts on this.