You run Commvault or another backup product on a Hyper-V cluster, and the VSS snapshots of your virtual machines fail from time to time.
Symptoms:
- If you run the command
vssadmin list writers
in cmd.exe on the Hyper-V cluster node you see this:
Writer name: 'Microsoft Hyper-V VSS Writer'
   Writer Id: {66841cd4-6ded-4f4b-8f17-fd23f8ddc3de}
   Writer Instance Id: {ef683ba9-5fb2-488e-aeee-2e6e0b1d720f}
   State: [5] Waiting for completion
   Last error: Retryable error
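If you have to check many cluster nodes, reading this output by eye gets tedious. The following is a minimal sketch, not part of the original troubleshooting steps, that parses the text of vssadmin list writers and reports every writer that is not in the "[1] Stable" state or that shows a last error. The function names and the SAMPLE text are made up for illustration; the sample simply mirrors the output shown above.

```python
import re

# Sample text mirroring the writer output shown above (illustrative only).
SAMPLE = """\
Writer name: 'Microsoft Hyper-V VSS Writer'
   Writer Id: {66841cd4-6ded-4f4b-8f17-fd23f8ddc3de}
   Writer Instance Id: {ef683ba9-5fb2-488e-aeee-2e6e0b1d720f}
   State: [5] Waiting for completion
   Last error: Retryable error
"""


def parse_vss_writers(text):
    """Turn `vssadmin list writers` output into a list of dicts, one per writer."""
    writers = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        match = re.match(r"Writer name: '(.*)'$", line)
        if match:
            # A "Writer name" line starts a new writer record.
            current = {"name": match.group(1)}
            writers.append(current)
            continue
        if current is None or ":" not in line:
            continue  # banner lines and blank lines
        key, _, value = line.partition(":")
        current[key.strip().lower()] = value.strip()
    return writers


def unhealthy_writers(writers):
    """Return writers that are not stable or that report a last error."""
    return [
        w for w in writers
        if not w.get("state", "").startswith("[1]")
        or w.get("last error", "No error") != "No error"
    ]


if __name__ == "__main__":
    for writer in unhealthy_writers(parse_vss_writers(SAMPLE)):
        print(writer["name"], "|", writer["state"], "|", writer["last error"])
```

To use it against a live node, you would feed it the captured output of vssadmin list writers from an elevated prompt instead of SAMPLE.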
- In the Commvault console the backup job is pending and shows the following error status:
Error Code: [91:9], Description: Volume Shadow Copy Service (VSS) error. VSS service or writers may be in a bad state. Please check vsbkp.log and Windows Event Viewer for VSS related messages. Or run vssadmin list writers from command prompt to check state of the VSS writers.
- A server reboot, as is often recommended, does not help. The next backup fails in the same way.
Reason:
- One or more disks inside a virtual machine (VM) are full.
- One or more disks inside a virtual machine (VM) have less than 1-2 GB of free space left.
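Since VSS does not tell you which disk is the culprit, it helps to check the free space inside each VM yourself. The sketch below is one possible way to do that with Python's standard library; it is meant to run inside the guest, and the 2 GB threshold (the upper end of the range above) and the drive letters are assumptions you should adapt.

```python
import shutil

# Assumed threshold: 2 GB, the upper end of the 1-2 GB range mentioned above.
MIN_FREE_BYTES = 2 * 1024**3


def is_low_on_space(free_bytes, min_free_bytes=MIN_FREE_BYTES):
    """True if the free space is below the threshold."""
    return free_bytes < min_free_bytes


def check_volumes(paths, min_free_bytes=MIN_FREE_BYTES):
    """Return {path: free_bytes} for every existing volume below the threshold."""
    low = {}
    for path in paths:
        try:
            free = shutil.disk_usage(path).free
        except OSError:
            continue  # drive letter does not exist in this VM
        if is_low_on_space(free, min_free_bytes):
            low[path] = free
    return low


if __name__ == "__main__":
    # Example drive letters; list the volumes that exist in your VM.
    for path, free in check_volumes(["C:\\", "D:\\", "E:\\"]).items():
        print(f"{path} has only {free / 1024**3:.2f} GB free")
```

Run as a scheduled task in each VM, something like this could warn you before the next backup fails rather than after.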
Details:
The documentation at CommVault Virtual iDataAgent using Microsoft VSS and Hyper-V Installation, Backup and Restore problems describes several reasons for this type of failure. It points to a number of missing Microsoft hotfixes (which you really should install) and a disabled automount, but in my experience the most common cause is simply that one of the VMs on the Cluster Shared Volume is running out of disk space.
You cannot expect your users to know that they should not fill a file server's disk to below 1 GB of free space, which is one of the key limitations when Shadow Copies are used for backup. With less than 1 GB of free disk space, the VSS snapshots fail without any helpful error message in the event log.
This issue is not limited to Commvault. At the end of the day, VSS needs to log more verbosely to the event log. If VSS logged the reason (disk full inside a VM), this issue would not be so extremely difficult to identify, so I see this as a Microsoft reporting bug of the kind that often gets called a feature.
Workaround:
A reasonable workaround, which I use on all servers, is to move the Shadow Copies storage of all disks to a storage area that should always have enough free space. Right-click a disk, open the Shadow Copies tab, select each volume in turn, and relocate its storage area to drive C:\, or create an extra drive in the VM that is reserved for VSS snapshots only. I expect that my C:\ drive will never fill up, because nobody can reach it from outside over shares. Make sure you never create any shares on drive C:\, where your operating system resides. The file server shares on D:\ or other drives, however, may well be filled until the disks are full. Whatever disk size you give your drives, the users will fill it with data. :-)
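If you have many volumes or many VMs, clicking through the Shadow Copies tab becomes slow. As a rough sketch only, the script below prints the vssadmin add shadowstorage command lines that would place the shadow storage for each data volume on C:. It only builds and prints the commands and runs nothing. The function names, the drive letters and the 10GB maximum size are placeholders I chose for illustration, not values from this article. Note also that vssadmin add shadowstorage exists on Windows Server, and that it will likely refuse to add an association for a volume that already has one, in which case the GUI route above is the safer choice.

```python
def add_shadowstorage_command(for_volume, on_volume, max_size="10GB"):
    """Build (not run) a `vssadmin add shadowstorage` command line.

    max_size is a placeholder value; pick one that fits your storage volume.
    """
    return [
        "vssadmin", "add", "shadowstorage",
        f"/for={for_volume}",
        f"/on={on_volume}",
        f"/maxsize={max_size}",
    ]


def relocation_commands(data_volumes, storage_volume="C:", max_size="10GB"):
    """One command per data volume, skipping the storage volume itself."""
    return [
        add_shadowstorage_command(volume, storage_volume, max_size)
        for volume in data_volumes
        if volume.upper() != storage_volume.upper()
    ]


if __name__ == "__main__":
    # Example volumes; replace with the data volumes of your file server.
    for command in relocation_commands(["D:", "E:"]):
        print(" ".join(command))
```

Review the printed lines and run them by hand in an elevated cmd.exe inside the VM; vssadmin list shadowstorage shows the resulting associations.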
History:
08/03/2012: Documentation created.