Lately I have seen a number of VMs going down after the backup has finished and the snapshots are removed. At some point during the snapshot cleanup, the ESXi host decides something is wrong and brings down the VM. After this we’re unable to power on the VM and receive the error: “Cannot open the disk ‘/vmfs/volumes/xxxxxxxxxx/xxxxx.vmdk’ or one of the snapshot disks it depends on”. When you then check the event details of the VM, you’ll see two messages:
– “Cannot open the disk ‘/vmfs/volumes/xxxxxxxxxx/xxxxx.vmdk’ or one of the snapshot disks it depends on”.
– “Could not open/create change tracking file”.
The fix to make the VM power on again is simple, but we still haven’t been able to solve the root cause. VMware Support has suggested reducing the number of LUNs connected to a single host and upgrading to vSphere 5.5.
The fix:
– Make sure the VM is down
– Go to Edit Settings -> Options -> Advanced / General -> Press the “Configuration Parameters” button.
– Search the list for the name: “ctkEnabled” and change the value to “false”.
– Search the list for the per-disk entries: “scsi0:0.ctkEnabled”, “scsi0:1.ctkEnabled”, “scsi1:0.ctkEnabled”, etc. and set these values to “false” as well.
– Click OK a few times.
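If SSH is already enabled on the host, a read-only check of the VM's .vmx file can show which of these entries exist before you change them in the client. This is only a sketch: it assumes the Configuration Parameters are stored as name = "value" lines in the .vmx file, and list_ctk_settings and VMName.vmx are placeholder names I made up for illustration, not part of any VMware tooling.

```shell
# Hypothetical helper: print every ctkEnabled entry found in a .vmx file.
# Read-only; it does not modify the file.
list_ctk_settings() {
    grep -i 'ctkEnabled' "$1"
}

# Example (VMName.vmx is a placeholder for your VM's configuration file):
# list_ctk_settings VMName.vmx
```

The sketch only reads the file; the actual change is still made through Edit Settings as described above.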
The last step is to remove the reference to the change tracking file from the VMDK descriptors.
– Enable SSH on your ESXi host and log in through SSH
– Go to the directory of the VM that holds the VMDK file. If your VM has multiple VMDKs, maybe spread over multiple datastores, you’ll have to repeat this for each VMDK.
– List all VMDK files: ls -l *.vmdk
– Check which VMDK file still has a reference to the change tracking files (CBT): grep changeTrackPath VMName.vmdk
– You should see something like this: changeTrackPath="SPVSQ001-ctk.vmdk"
– If the reference is still present, edit the VMDK file using the vi editor and place a # at the start of the changeTrackPath line. (Go to the start of the line, press i for insert, type #, press <ESC>, then type :wq to save the VMDK.)
– Check the other VMDKs as well but leave the “-flat” and “-ctk” vmdk files alone.
– Now try to Power On the VM.
When the VM is running, don’t forget to exit your SSH session and disable the SSH service on the host.
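For a VM with many disks, the manual vi edit described above could be sketched as a small shell function. Treat this as an illustration under assumptions, not a tested VMware procedure: disable_ctk_reference is a name I made up, the .bak backup is my own addition, and I am assuming the shell on your host provides grep, cp and sed. Run it only while the VM is powered off.

```shell
# Hypothetical helper: comment out the changeTrackPath line in one VMDK descriptor.
# A copy of the original descriptor is kept next to it as <file>.bak.
disable_ctk_reference() {
    descriptor="$1"
    case "$descriptor" in
        # Leave the -flat and -ctk files alone, as in the manual steps.
        *-flat.vmdk|*-ctk.vmdk) return 0 ;;
    esac
    if grep -q '^changeTrackPath' "$descriptor"; then
        cp "$descriptor" "$descriptor.bak"
        # Rewrite the descriptor from the backup with a # in front of the line.
        sed 's/^changeTrackPath/#changeTrackPath/' "$descriptor.bak" > "$descriptor"
    fi
}

# Example: run from the VM's directory, once per datastore that holds its disks.
# for f in *.vmdk; do disable_ctk_reference "$f"; done
```

Afterwards, grep changeTrackPath on each descriptor should show the line starting with a #. If anything looks wrong, the .bak copy can be moved back in place.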
I think this may happen when LUN snapshot clones are created for backup purposes. They are temporarily connected to your VM for in-VM file backup. When they are deleted after the backup, all paths to those LUNs disappear and this causes ESX to get upset. It may then bring down your VM. ESX 5 does not power down a VM under the same circumstances.
We don’t snapshot LUNs, only VMs at the VMware level.
I’ve been seeing something similar when trying to do a SRM failover.
Resetting CBT fixes it, but only temporarily; the problem comes back.
Does anyone know if VMware had a fix for this in the end? We had this problem and following these instructions resolved the issue, but we haven’t had to do this before. We recently upgraded our vCenter from vSphere 5 to vSphere 6, so we are unsure if that is the cause. Our ESXi hosts are still on vSphere 5, but they will shortly be upgraded to 6 as well. Any advice on this would be most appreciated. I will also log this with VMware support and see what they say.
Duuuuude! This just saved a half-dozen of my production VMs after running through the Update Manager VM Version upgrade (ESXi 5.5 from v7 to v10). I first thought it was a descriptor issue but couldn’t imagine that the update could’ve screwed up the .vmdk structure that badly.
After I calmed down and looked around a bit I came across this post and it worked, yay!
I wish I understood why that occurred.
Thanks man saved my ass today! :)
Glad to help. Thank you for your comment.
Great post Gabe – hope all is well !
I had the same error when I wanted to power on a virtual machine that is a node of an Oracle RAC. I commented out the line changeTrackPath = "-ctk.vmdk" and got the virtual machine up, but now I want to know what the effect will be while the two virtual machines of the Oracle RAC are running. And what would happen if I re-enabled the line changeTrackPath = "-ctk.vmdk"? Would I have a problem with any of the virtual machines not powering on?
Thanks friends for your help !!!
Hi. Not sure if I fully understand your question. Can you rephrase it?
Gabrie
Thanks for your reply Gabrie
I have two VMs that are part of an Oracle RAC. I did a Storage vMotion of the .vmdk that contains the OS of one of the RAC VMs, and when I wanted to power on that VM I got the following error: “Cannot open the disk ‘/vmfs/volumes/xxxxxxxxxx/xxxxx.vmdk’ or one of the snapshot disks it depends on”. So I followed your advice and commented out the line changeTrackPath = "-ctk.vmdk" (it now reads # changeTrackPath = "-ctk.vmdk"), and with that the VM could be powered on. Will there be any inconsistency in the shared .vmdk files because of this change?
No, there will not be any inconsistencies. The CTK file is just for record keeping of which blocks have changed. The biggest impact will be that the next backup will need a full scan of the disk and take a little more time.
Gabrie
Thanks for your answer, that takes a weight off my mind!
I’ve been reading the VMware documentation about how to share disks for an Oracle RAC environment. I have to present additional disks to the RAC, so I’m going to follow the procedure they recommend. In this case, do you think I should power off the two VMs that form the RAC to re-enable the line changeTrackPath = "-ctk.vmdk", or do you think it is not necessary to re-enable it?
Your support has been very valuable.
Regards,
Rafael
The CTK will be enabled again by your backup product.
Gabrie
Thank you!!!
Ran into this issue when upgrading the virtual hardware from 7 to 10. Thanks for saving me.