SWY's technical notes

Relevant mostly to OS X admins

Alternate way to deal with XenServer storage issues

I’ve administered a small pool of two physical XenServers with shared storage for my employer’s Windows and Linux virtual servers for a few years. One issue that has come up under both XenServer 5.6 and 6.x is XS failing to properly remove the disk files when a snapshot is deleted. This matters when you use a VM backup tool such as PHDVirtual, where the backup server is itself a VM that, on a schedule, has the hypervisor:

  1. Take a snapshot of another running VM
  2. Attach that snap to the backup VM
  3. Read that drive, back up to the configured storage
  4. Detach and delete the snap

The result is a buildup of undeleted snapshot files that take up storage space on your SR (storage repository). One way to confirm whether this is happening is to run the following line on the XenServer console:

xe vdi-list is-a-snapshot=true | grep name-label | sort

If you don’t expect any snapshots to exist, every entry listed here is a leftover that shouldn’t be there.
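
To see at a glance which VMs are accumulating leftovers, the listing above can be tallied per name-label. This is a minimal sketch: `count_snapshots` is a hypothetical helper name, and it assumes the usual `key ( RW): value` layout of `xe vdi-list` output, so a label that itself contains ": " would be cut short.

```shell
# Hypothetical helper: tally snapshot VDIs per name-label.
# Reads `xe vdi-list` output on stdin and prints "<count><TAB><name-label>",
# highest count first.
count_snapshots() {
    awk -F': ' '
        /name-label/ { count[$2]++ }
        END { for (name in count) printf "%d\t%s\n", count[name], name }
    ' | sort -rn
}

# Usage on the XenServer console:
#   xe vdi-list is-a-snapshot=true | count_snapshots
```

A VM with a high count and no snapshots visible in XenCenter is a likely candidate for cleanup.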

Citrix offers a command-line tool to address this, outlined in Knowledge Base article CTX123400. It is called an Offline Coalesce, and it takes the form:

xe host-call-plugin host-uuid=<UUID of the pool master Host> plugin=coalesce-leaf fn=leaf-coalesce args:vm_uuid=<uuid of the VM you want to coalesce>
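
Filling in the two UUIDs by hand is error-prone, so here is a rough sketch of a wrapper that looks them up first. The function name `coalesce_vm` is hypothetical, and the two lookups (`xe pool-list params=master --minimal` and `xe vm-list name-label=... params=uuid --minimal`) are my additions rather than part of the Citrix article; the final command is the one shown above.

```shell
# Hypothetical wrapper around the offline coalesce command above.
# Argument: the VM's name-label.
coalesce_vm() {
    vm_name=$1

    # Look up the pool master's host UUID.
    master_uuid=$(xe pool-list params=master --minimal) || return 1

    # Look up the VM's UUID by name-label.
    vm_uuid=$(xe vm-list name-label="$vm_name" params=uuid --minimal) || return 1

    # --minimal returns nothing for no match and a comma-separated
    # list for several matches; refuse to guess in either case.
    case "$vm_uuid" in
        "")  echo "no VM named '$vm_name'" >&2; return 1 ;;
        *,*) echo "more than one VM named '$vm_name'" >&2; return 1 ;;
    esac

    xe host-call-plugin host-uuid="$master_uuid" \
        plugin=coalesce-leaf fn=leaf-coalesce \
        args:vm_uuid="$vm_uuid"
}

# Usage on the XenServer console:
#   coalesce_vm Windev2
```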

Earlier this week, I ran the above hoping to fix the large discrepancy between the used and allocated values on my SR, caused by snapshots that were never deleted yet don’t show up in XenCenter.

Problem is, it didn’t work. When I tailed /var/log/SMlog, the standout clue was “no space to coalesce”. So great… a bug in snapshots takes up all your space, and when you try to use the tool to fix it, it can’t because there’s no space.
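
If you want to check for the same symptom without watching the log scroll by, a small filter will do. A minimal sketch: `coalesce_space_errors` is a hypothetical name, and the match string is simply the phrase quoted above, matched case-insensitively since I have not verified the exact wording SMlog uses.

```shell
# Hypothetical helper: print log lines reporting that a coalesce
# was skipped for lack of space. Defaults to the standard SMlog path.
coalesce_space_errors() {
    logfile=${1:-/var/log/SMlog}
    grep -i 'no space to coalesce' "$logfile"
}

# Usage on the XenServer console:
#   coalesce_space_errors
#   coalesce_space_errors /var/log/SMlog.1
```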

A day later, this became quite concerning.  I don’t know what Xen will do when an SR hits 100% capacity, but I doubt I’ll enjoy it.
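
To keep an eye on how close an SR is to full from the console, the SR's `physical-utilisation` and `physical-size` parameters (both in bytes) can be turned into a percentage. This is only a sketch; `sr_percent_used` is a hypothetical helper, and `SR_UUID` in the usage comment is a placeholder you would fill in from `xe sr-list`.

```shell
# Hypothetical helper: percentage of an SR in use.
# Arguments: bytes used, total bytes. Fails if the total is not positive.
sr_percent_used() {
    awk -v used="$1" -v size="$2" 'BEGIN {
        if (size <= 0) exit 1
        printf "%.1f\n", 100 * used / size
    }'
}

# Usage on the XenServer console (SR_UUID is a placeholder):
#   used=$(xe sr-param-get uuid="$SR_UUID" param-name=physical-utilisation)
#   size=$(xe sr-param-get uuid="$SR_UUID" param-name=physical-size)
#   sr_percent_used "$used" "$size"
```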

I decided to take a gamble and see what would happen if I moved a disk to a different SR. Lacking Storage XenMotion, I shut down a non-critical VM (Windev2), detached its C: drive, and asked Xen to move it to a different SR. It took much longer than expected (about 30 minutes for 48 gigs), but in the end I regained much more than 48 gigs of storage on my SR:

557.6 - 281.9 = 275.7, so about 275 gigs of snapshot space wasted by XenServer bugs were reclaimed by moving the disk to a different SR. My nightly PHDVirtual backups were able to take a snapshot last night and perform a proper backup.
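
For anyone who prefers the console, the move could be approximated with `xe vdi-copy` followed by re-pointing the VM at the copy. This is a rough sketch under several assumptions, not the exact steps I took: `move_vdi_to_sr` is a hypothetical name, the disk is assumed to be attached to exactly one VM, and that VM must already be halted. It deliberately leaves the original VDI in place so you can confirm the VM boots before destroying it with `xe vdi-destroy`.

```shell
# Hypothetical helper: copy a halted VM's disk to another SR and
# attach the copy in place of the original.
# Arguments: UUID of the VDI to move, UUID of the destination SR.
move_vdi_to_sr() {
    vdi_uuid=$1
    dest_sr_uuid=$2

    # Find the VBD linking the disk to its VM, and remember its settings.
    vbd_uuid=$(xe vbd-list vdi-uuid="$vdi_uuid" params=uuid --minimal) || return 1
    [ -n "$vbd_uuid" ] || { echo "VDI is not attached to a VM" >&2; return 1; }
    vm_uuid=$(xe vbd-param-get uuid="$vbd_uuid" param-name=vm-uuid) || return 1
    device=$(xe vbd-param-get uuid="$vbd_uuid" param-name=userdevice) || return 1
    bootable=$(xe vbd-param-get uuid="$vbd_uuid" param-name=bootable) || return 1

    # Only proceed when the VM is shut down.
    state=$(xe vm-param-get uuid="$vm_uuid" param-name=power-state) || return 1
    [ "$state" = "halted" ] || { echo "VM must be halted first" >&2; return 1; }

    # vdi-copy prints the UUID of the new VDI on the destination SR.
    new_vdi=$(xe vdi-copy uuid="$vdi_uuid" sr-uuid="$dest_sr_uuid") || return 1

    # Swap the VM's disk over to the copy.
    xe vbd-destroy uuid="$vbd_uuid" || return 1
    xe vbd-create vm-uuid="$vm_uuid" vdi-uuid="$new_vdi" device="$device" \
        bootable="$bootable" mode=RW type=Disk >/dev/null || return 1

    # The original VDI is left untouched; remove it only after testing:
    #   xe vdi-destroy uuid=<original VDI UUID>
    echo "$new_vdi"
}
```

The space on the original SR only comes back once the old VDI is destroyed, so check the used value again after that step.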

Lessons learned:

  • When leaf-coalesce fails, moving a disk to different storage can clean up the wasted storage space
  • If I had to do it all over, I think I’d go VMware. XenServer has had problems with snapshot management for a long time, and they still haven’t figured it out.