Xenserver Space Woes
- 
 @momurda said in Xenserver Space Woes: Ok, well then, I am just about out of ideas.
 Is your Unitrends doing a backup now? At what point do your backups fail? Quickly? During backup? During cleanup after backup (deleting snapshots, etc.)?
 I also use Unitrends and had some problems at the beginning.
 Perhaps I can help fix the Unitrends problem before you go about deleting 2TB VDIs that you may or may not need.
 The problem comes in when it goes to attach the snapshot to the backup appliance: it fails. And this is not because of the maximum number of allowed VDIs on a VM. I ruled this out when I re-did the jobs so that no more than 8 VDIs are attached at any given time (plus the 4 backup targets and system drive, totalling 13, which is less than the 16 max that is allowed). Not sure what it does after that, though given that I don't have a million orphaned snapshots I have to assume that it is deleting them after this failure. Something must've gone wrong with these 2 though. Here is an example of a failed log (I get these for almost every backup attempt):
 Files to backup:
 -=-=-==-=-=-=-=
 Client Name....: localhost.localdomain [127.0.0.1]
 Starting Dir...: /
 Command-line...: /usr/bp/bin/xprotect/xprotect --server 10.1.1.12 --username root --password ******** --backup --uuid <redacted>
 Exit Code......: 0
 Exit Status....: Failed
 ----- XProtect Messages ----
 attaching the disk to vm failed
 ----- End XProtect Messages ----
 No client messages due to log level 3
 See client log in /usr/bp/logs.dir/bpserver-5.log
 The database update failed. (Attempt 1 of 3)
 The database update failed.
 The command was "/usr/bp/bin/updatedb -b 1283 -f 1283 -a remove -l /tmp/file8JppsG > /dev/null 2>&1 ".
- 
 I was looking for more ways to look at virtual storage in Xenserver 
 http://devsops.blogspot.com/2013/02/how-to-use-vhd-util-tool-in-xenserver.html
 This post may be useful; it uses vgs and vhd-util, which could give you a bit of insight into the storage.
- 
 Also, with regard to your backups, you may have to look at bpserver-5.log to actually see the error in the backup. The "attaching the disk to vm failed" message seems to me to be a generic error for Unitrends. I've always found more useful info in the bpserver log file.
- 
 http://discussions.citrix.com/topic/324186-too-many-base-copy-when-xe-vdi-list/ 
 More fun with vhd-util.
 This guy on the Citrix forums says moving the disk to a new SR, then running Reclaim Free Space, may do something.
- 
 Well, something changed overnight, because now when I look at the storage usage it says: Used: 6505.4Gb and Allocated: 6734.8Gb. So why on earth has my allocated amount gone up by nearly 2587Gb? *sigh* This is very, very odd...
 EDIT: Just did a rescan, and now it says Used: 6505.4 Allocated: 4147.8. So it's back to "normal".
- 
 What does your storage server say about all this? 
- 
 @momurda said in Xenserver Space Woes: What does your storage server say about all this?
 Nada. It's happily running as intended.
- 
 Did you try using vgs and vhd-util to see if hidden=1 is set for your troubled VHD?
 There is also an XS6.5 update that was just released (well, it just showed up in my XC the other day), XS65ESP1029, with the description "fixes to Storage and Dom0 kernel modules".
 I also just noticed your post yesterday in this thread about "Run out of space while coalescing". I still think this has something to do with your Unitrends failures and snapshotting.
 At this point I think all you can really do is reboot and/or delete those snapshots-but-not-snapshots, or just move the disk to another SR. However, before you do any of this, you should do something to save the data that could go missing.
 I would robocopy /mir /copy:DATSO the shares on that virtual disk drive somewhere else, just in case you wipe out the virtual disk. I keep a 4TB USB3 drive hooked up to my desktop just for things like this.
- 
 @momurda said in Xenserver Space Woes: Did you try using vgs and vhd-util to see if hidden=1 is set for your troubled VHD?
 There is also an XS6.5 update that was just released (well, it just showed up in my XC the other day), XS65ESP1029, with the description "fixes to Storage and Dom0 kernel modules".
 I also just noticed your post yesterday in this thread about "Run out of space while coalescing". I still think this has something to do with your Unitrends failures and snapshotting.
 At this point I think all you can really do is reboot and/or delete those snapshots-but-not-snapshots, or just move the disk to another SR. However, before you do any of this, you should do something to save the data that could go missing.
 I would robocopy /mir /copy:DATSO the shares on that virtual disk drive somewhere else, just in case you wipe out the virtual disk. I keep a 4TB USB3 drive hooked up to my desktop just for things like this.
 Unfortunately I don't have a place to dump 2Tb of home folders, and even if I did, I don't have the luxury of taking everyone's folder offline for the amount of time that would take. The vgs and vhd-util tools did yield an entire list of VHDs and showed me their relationships, but I can't seem to correlate the 2 VHDs to any of the vhd-util output, though they do appear to be gone. I am thinking that Unitrends is definitely behind this odd issue. So I have turned off all the backup jobs for now, and once the job currently running (backup of this VM with the 2TB drive) is complete, I'll run all these commands again and see what it looks like. Plus I'll reboot and update all the Xen hosts before I turn the jobs back on.
- 
 So this weekend I updated my Xen hosts and rebooted them both, so everything is 100% up to date. The reboot did not, however, fix my space issue. I am still using 2Tb more than I should. I also turned off Unitrends for the last few days, so I am 100% sure that I do not have any snapshots on there, at least none that show up in XenCenter. XenCenter reports that I am using 6173.8Gb but only have 4115.7Gb allocated, 2Tb more used than there should be. Now when I run xe vdi-list is-a-snapshot=true I get:
 uuid ( RO) : 5535a3db-da4f-4211-afa8-077241f63221
 name-label ( RW): Staff Home
 name-description ( RW): VDI for staff home folders
 sr-uuid ( RO): 4558cecd-d90d-3259-7ea5-09478d0e386c
 virtual-size ( RO): 2,193,654,546,432
 sharable ( RO): false
 read-only ( RO): true
 So I tried to delete this VDI with xe vdi-destroy uuid=5535a3db-da4f-4211-afa8-077241f63221, and I get:
 This operation cannot be performed because the system does not manage this VDI
 vdi: 5535a3db-da4f-4211-afa8-077241f63221 (Staff Home)
 A Reclaim Freed Space does not make a difference (and only takes about 5 seconds to run). So any suggestions on where I can go from here? Moving this VM to other storage and then back is not really an option, since I can't take this VM down for the time it would take to move all 2TB (hours and hours); this is where all my users' home folders are.
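As a rough sanity check on the numbers in this post, the short Python sketch below converts the snapshot's virtual-size to GiB and compares it with the Used/Allocated gap reported by XenCenter. It assumes XenCenter's "Gb" figures are binary gigabytes (GiB), which is an assumption and not something confirmed in this thread; the variable names are illustrative only.

```python
# Figures copied from the post above; names are illustrative, not from any tool.
used_gb = 6173.8                     # "Used" as reported by XenCenter
allocated_gb = 4115.7                # "Allocated" as reported by XenCenter
virtual_size_bytes = 2193654546432   # virtual-size of the "Staff Home" snapshot VDI

GIB = 1024 ** 3  # assumption: XenCenter's "Gb" means GiB

# Gap between what the SR says is used and what is allocated to VDIs.
gap_gb = round(used_gb - allocated_gb, 1)

# Size of the suspect VDI expressed in the same (assumed) unit.
virtual_size_gib = virtual_size_bytes / GIB

print(f"Used - Allocated : {gap_gb}")
print(f"Snapshot VDI size: {virtual_size_gib:.1f}")
print(f"Difference       : {gap_gb - virtual_size_gib:.1f}")
```

Under that assumption the two figures land within about 1% of each other, which fits the observation that the extra VDI is roughly the size of the discrepancy. It does not by itself prove the VDI is safe to remove.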
- 
 You'll have to change the VDI from read-only: true to false before you can edit it. Is that disk not supposed to be there though? 
- 
 @DustinB3403 said in Xenserver Space Woes: You'll have to change the VDI from read-only: true to false before you can edit it. Is that disk not supposed to be there though?
 I am increasingly of the opinion that it should not be there. I have 2 VDIs called "Staff Home": one is the actual VDI connected to the VM, the other is this one. This VDI is also the exact size of the discrepancy between space used and space allocated. But I am completely open to any and all suggestions on how I can confirm that this VDI is in fact just wasted space.
- 
 Can you post the results of xe vdi-list showing both VDIs? I'm wondering if one VDI is acting as a base copy for the other.
- 
 @Danp said in Xenserver Space Woes: Can you post the results of xe vdi-list showing both VDIs? I'm wondering if one VDI is acting as a base copy for the other.
 Bad one:
 uuid ( RO): 5535a3db-da4f-4211-afa8-077241f63221
 name-label ( RW): Staff Home
 name-description ( RW): VDI for staff home folders
 sr-uuid ( RO): 4558cecd-d90d-3259-7ea5-09478d0e386c
 virtual-size ( RO): 2193654546432
 sharable ( RO): false
 read-only ( RO): true
 Good one:
 uuid ( RO): 6255caa0-e7d4-4d27-a257-b33aaf3a7507
 name-label ( RW): Staff Home
 name-description ( RW): VDI for staff home folders
 sr-uuid ( RO): 4558cecd-d90d-3259-7ea5-09478d0e386c
 virtual-size ( RO): 2193654546432
 sharable ( RO): false
 read-only ( RO): false
 EDIT: Maybe I am barking up the wrong SR-VDI here, since I ran the vhd-util command and got:
 vhd=VHD-f832866c-1bb4-48d5-81e7-4dd468b2618b capacity=2,193,654,546,432 size=2,197,689,466,880 hidden=1 parent=none
 vhd=VHD-5535a3db-da4f-4211-afa8-077241f63221 capacity=2,193,654,546,432 size=14,424,211,456 hidden=1 parent=VHD-f832866c-1bb4-48d5-81e7-4dd468b2618b
 vhd=VHD-6255caa0-e7d4-4d27-a257-b33aaf3a7507 capacity=2,193,654,546,432 size=2,197,945,319,424 hidden=0 parent=VHD-5535a3db-da4f-4211-afa8-077241f63221
 That seems to imply that the "good" VDI is a child of the "bad" VDI, which is itself a child of the base copy. That would seem to be "normal", but still, where is that extra 2Tb going? And why can't I free it up?
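To make the parent/child relationships in that vhd-util output easier to see, here is a minimal Python sketch that parses the three lines exactly as posted (including the thousands separators), walks the chain from the visible leaf up to the base copy, and adds up the `size` values. The function names `parse_scan` and `chain_to_root` are made up for this illustration; they are not part of vhd-util or any XenServer tooling, and reading `size` as space consumed on the SR is an assumption.

```python
import re

# The three lines reported by vhd-util in the post above (commas as posted).
SCAN_OUTPUT = """\
vhd=VHD-f832866c-1bb4-48d5-81e7-4dd468b2618b capacity=2,193,654,546,432 size=2,197,689,466,880 hidden=1 parent=none
vhd=VHD-5535a3db-da4f-4211-afa8-077241f63221 capacity=2,193,654,546,432 size=14,424,211,456 hidden=1 parent=VHD-f832866c-1bb4-48d5-81e7-4dd468b2618b
vhd=VHD-6255caa0-e7d4-4d27-a257-b33aaf3a7507 capacity=2,193,654,546,432 size=2,197,945,319,424 hidden=0 parent=VHD-5535a3db-da4f-4211-afa8-077241f63221
"""


def parse_scan(text):
    """Return {vhd_name: {"size": int, "hidden": bool, "parent": str or None}}."""
    vhds = {}
    for line in text.splitlines():
        # Each line is a series of key=value pairs separated by spaces.
        fields = dict(re.findall(r"(\w+)=(\S+)", line))
        if "vhd" not in fields:
            continue
        parent = fields["parent"]
        vhds[fields["vhd"]] = {
            "size": int(fields["size"].replace(",", "")),
            "hidden": fields["hidden"] == "1",
            "parent": None if parent == "none" else parent,
        }
    return vhds


def chain_to_root(vhds, leaf):
    """Walk parent links from a leaf VHD up to the base copy."""
    chain = []
    node = leaf
    while node is not None:
        chain.append(node)
        node = vhds[node]["parent"]
    return chain


vhds = parse_scan(SCAN_OUTPUT)
# The leaf is the only VHD that is not hidden.
leaf = next(name for name, v in vhds.items() if not v["hidden"])
chain = chain_to_root(vhds, leaf)
hidden_bytes = sum(vhds[n]["size"] for n in chain if vhds[n]["hidden"])
total_bytes = sum(vhds[n]["size"] for n in chain)

for name in chain:
    print(name, vhds[name]["size"], "hidden" if vhds[name]["hidden"] else "visible")
print("bytes held by hidden parents:", hidden_bytes)
print("bytes held by whole chain   :", total_bytes)
```

On these three lines, the two hidden parents together account for roughly 2.2 TB on top of the visible leaf. If `size` does reflect space consumed on the SR, that is about the size of the gap being chased here, which would be consistent with a chain that has not been coalesced.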
- 
 This issue is fascinating. 
 Here is an article from Citrix, the answer is probably here, though at this time it is a bit over my head.
 http://support.citrix.com/article/CTX201296
 This discusses coalescing, and reasons for failure and steps to troubleshoot and fix the coalescing issues.
 There seem to be 8 possible issues for this happening automatically.
 /var/log/SMlog probably has more info about the problem according to this.
 Also, are you able to move the SR (which will automatically get rid of snapshot chains), or export the VM, delete it, and then import it?
 I also think that any of these solutions require you to have sufficient free space on the SR.
- 
 @momurda said in Xenserver Space Woes: This issue is fascinating.
 Here is an article from Citrix, the answer is probably here, though at this time it is a bit over my head.
 http://support.citrix.com/article/CTX201296
 This discusses coalescing, and reasons for failure and steps to troubleshoot and fix the coalescing issues.
 There seem to be 8 possible issues for this happening automatically.
 /var/log/SMlog probably has more info about the problem according to this.
 Also, are you able to move the SR (which will automatically get rid of snapshot chains), or export the VM, delete it, and then import it?
 I also think that any of these solutions require you to have sufficient free space on the SR.
 Browsing through /var/log/SMlog does not really show anything obvious. I can see where it is doing something with the three VDIs previously mentioned, but it looks like that was a success. Yet I continue to be using 2Tb more than is virtually assigned. I am going to dig through that support doc you linked and see if I can work anything out.
- 
 I think I may have worked it out. It would appear that the online coalesce for the VM in question keeps timing out on the specific VDI in question (the 6255... one). The support doc goes on to say this might be due to heavy load on the storage at the time it tries. I do not think this is the case here, but the suggested solution is to shut the VM down and do an offline coalesce with the command:
 xe host-call-plugin host-uuid=<UUID of the pool master Host> plugin=coalesce-leaf fn=leaf-coalesce args:vm_uuid=<uuid of the VM you want to coalesce>
 I am going to try this tonight and see what happens.
 A side question: how does one work out 1. whether your storage is too slow, and 2. what IOPS your storage is capable of?
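For anyone scripting this, a small Python sketch of how that command line could be assembled without mistyping a key is below. It only builds and prints the argument list from the command quoted in this post and does not run anything. The helper name `build_offline_coalesce_cmd` and the placeholder UUID strings are hypothetical, and the VM is assumed to be shut down before the real command is run, as the support doc suggests.

```python
import shlex


def build_offline_coalesce_cmd(host_uuid, vm_uuid):
    """Build the argument list for the offline coalesce command quoted above.

    Nothing is executed here; the caller decides whether and how to run it.
    """
    return [
        "xe", "host-call-plugin",
        f"host-uuid={host_uuid}",      # UUID of the pool master host
        "plugin=coalesce-leaf",
        "fn=leaf-coalesce",
        f"args:vm_uuid={vm_uuid}",     # UUID of the (shut down) VM to coalesce
    ]


# Placeholder values, not real UUIDs.
cmd = build_offline_coalesce_cmd("<pool-master-host-uuid>", "<vm-uuid>")
print(shlex.join(cmd))
```

Keeping the arguments as a list means the same value can later be handed to something like `subprocess.run` on the pool master without any shell quoting issues.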
- 
 In XenCenter, if your XenServer is up to date with all hotfixes, you can use the Performance tab in XC on the XS host to measure disk performance (read/write/total IOPS, queue length for each SR or virtual disk) and you should get accurate results. If you don't have the hotfixes installed, you probably will not get accurate results. In general, longer queue lengths mean the disk can't keep up with what it is being asked to do.
 You can also query performance from the CLI using iostat.
- 
 @momurda said in Xenserver Space Woes: In XenCenter, if your XenServer is up to date with all hotfixes, you can use the Performance tab in XC on the XS host to measure disk performance (read/write/total IOPS, queue length for each SR or virtual disk) and you should get accurate results. If you don't have the hotfixes installed, you probably will not get accurate results. In general, longer queue lengths mean the disk can't keep up with what it is being asked to do.
 You can also query performance from the CLI using iostat.
 Cool, I created a graph and added Disk IO Wait and Disk Queue Size, but there appears to be no data (the hosts are completely up to date as of this weekend). I do note that on the standard Disk Performance graph there is not too much activity; over the last few days it's topped out at around 0.33MBps. I guess I'll check in on it over the next few days and see what it looks like, but I don't think I'm having disk performance issues.
- 
 @momurda said: In XenCenter, if your Xenserver is up to date with all hotfixes,
 Is it the hotfixes, or the XS Tools? I know the tools have to be installed to run some of the stuff. (Like memory.)




