Xenserver Space Woes
-
@momurda said in Xenserver Space Woes:
This issue is fascinating.
Here is an article from Citrix, the answer is probably here, though at this time it is a bit over my head.
http://support.citrix.com/article/CTX201296
This discusses coalescing, and reasons for failure and steps to troubleshoot and fix the coalescing issues.
There seem to be 8 possible issues for this happening automatically.
/var/log/SMlog probably has more info about the problem according to this.
Also, are you able to move the SR (which will automatically get rid of snapshot chains), or export the VM, delete it, then import it?
I also think that any of these solutions require you to have sufficient free space on the SR.
Browsing through /var/log/SMlog does not really show anything obvious. I can see where it is doing something with the three VDIs previously mentioned, but it looks like that succeeded. Yet I am still using 2 TB more than is virtually assigned.
I am going to dig through that support doc you linked and see if I can work anything out.
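For anyone following along, a quick way to cut SMlog down to the relevant entries is to filter on the word coalesce. This is only a sketch: the coalesce_log function name and the 50-line default are made up for illustration, and it assumes the relevant entries contain the string "coalesce" (the exact message text varies between versions).

```shell
# Sketch: show the most recent coalesce-related lines from an SM log.
# Usage: coalesce_log [logfile] [max_lines]
# Assumption: relevant entries mention "coalesce" somewhere in the line.
coalesce_log() {
    logfile=${1:-/var/log/SMlog}
    max_lines=${2:-50}
    # Case-insensitive match, then keep only the newest matches.
    grep -i 'coalesce' "$logfile" | tail -n "$max_lines"
}

# Example: coalesce_log /var/log/SMlog 20
```

Rotated logs would need to be passed in by name the same way.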
-
I think I may have worked it out. It appears that the online coalesce for the VM keeps timing out on one specific VDI (the 6255... one). The article says this can be due to heavy load on the storage at the time it tries. I do not think that is the case here, but the suggested solution is to shut the VM down and do an offline coalesce with the command:
xe host-call-plugin host-uuid=<UUID of the pool master Host> plugin=coalesce-leaf fn=leaf-coalesce args:vm_uuid=<uuid of the VM you want to coalesce>
I am going to try this tonight and see what happens.
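The offline procedure described above could be wrapped up roughly like this. It is a sketch, not something taken from the Citrix article: the offline_coalesce and run helper names and the DRY_RUN switch are invented for illustration, and the final vm-start step is an assumption about what you would want afterwards. The host-call-plugin line is the same command as quoted above.

```shell
# Sketch only: wraps the offline coalesce steps in one function.
# DRY_RUN=1 prints the commands instead of running them.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: offline_coalesce <VM UUID> <pool master host UUID>
offline_coalesce() {
    vm_uuid=$1
    master_uuid=$2
    # The VM has to be halted before an offline coalesce.
    run xe vm-shutdown uuid="$vm_uuid" || return 1
    # Same plugin call as in the post above.
    run xe host-call-plugin host-uuid="$master_uuid" \
        plugin=coalesce-leaf fn=leaf-coalesce args:vm_uuid="$vm_uuid" || return 1
    # Assumption: bring the VM back up once the coalesce returns.
    run xe vm-start uuid="$vm_uuid"
}

# Example (prints the three commands without touching anything):
# DRY_RUN=1 offline_coalesce <VM UUID> <pool master host UUID>
```

Running it with DRY_RUN=1 first is a cheap way to check the UUIDs before taking the VM down.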
A side question: how does one work out (1) whether your storage is too slow, and (2) how many IOPS your storage is capable of?
-
In XenCenter, if your XenServer is up to date with all hotfixes, you can use the Performance tab on the XS host to measure disk performance (read/write/total IOPS and queue length for each SR or VDI), and you should get accurate results. If you don't have the hotfixes installed, you probably will not get accurate results.
In general, longer queue lengths mean the disk can't keep up with what it is being asked to do.
You can also query performance from the CLI using iostat.
-
@momurda said in Xenserver Space Woes:
In XenCenter, if your XenServer is up to date with all hotfixes, you can use the Performance tab on the XS host to measure disk performance (read/write/total IOPS and queue length for each SR or VDI), and you should get accurate results. If you don't have the hotfixes installed, you probably will not get accurate results.
In general, longer queue lengths mean the disk can't keep up with what it is being asked to do.
You can also query performance from the CLI using iostat.
Cool, I created a graph and added Disk IO Wait and Disk Queue Size, but there appears to be no data (the hosts are completely up to date as of this weekend). I do note that the standard Disk Performance graph does not show much activity; over the last few days it has topped out at around 0.33 MBps.
I guess I'll check in on it over the next few days and see what it looks like, but I don't think I'm having disk performance issues.
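If the XenCenter graphs stay empty, the iostat route mentioned above is a fallback for checking queue size from the host's console. A rough sketch, assuming the sysstat version of iostat is available there; busy_devices is a hypothetical helper name, and the queue column is called avgqu-sz in older sysstat releases and aqu-sz in newer ones, so the script looks for either.

```shell
# Sketch: read `iostat -x` output on stdin and print the devices whose
# average queue size is above a threshold (default 1).
busy_devices() {
    threshold=${1:-1}
    awk -v t="$threshold" '
        # CPU summary block: forget the column until the next device header.
        $1 == "avg-cpu:" { col = 0; next }
        # Header row: find the queue-size column by name.
        $1 ~ /^Device/ {
            col = 0
            for (i = 1; i <= NF; i++)
                if ($i == "avgqu-sz" || $i == "aqu-sz") col = i
            next
        }
        # Data rows: print device name and queue size when over the threshold.
        col && NF >= col && $col + 0 > t { print $1, $col }
    '
}

# Example: extended stats, three reports five seconds apart.
# The first report is an average since boot, so the later ones matter more.
# iostat -x 5 3 | busy_devices 1
```

No output means no device had a queue above the threshold in any report, which would match the "not a disk problem" hunch.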
-
@momurda said
In XenCenter, if your Xenserver is up to date with all hotfixes,
Is it the hotfixes, or the XS Tools? I know the tools have to be installed to run some of the stuff. (Like memory.)
-
@BRRABill said in Xenserver Space Woes:
@momurda said
In XenCenter, if your Xenserver is up to date with all hotfixes,
Is it the hotfixes, or the XS Tools? I know the tools have to be installed to run some of the stuff. (Like memory.)
Good point. The tools are not up to date. So I'll need to update them tonight, though I am looking at historical data from before I applied SP1 and the other updates.
-
You can also throw some IO at a disk by copying a large file or lots of small files to a VM (do it twice at the same time if you want to see whether you max out) to test your IOPS. Or reboot a few VMs at the same time. My storage array hits 1500 or so IOPS before it starts to peak, IIRC from some tests I did back in the winter. Though I do wonder whether some of that is bound by us using a 1 Gb network rather than 10 Gb.
![iscsi iops for my XS001 Xenserver host]( image url)
This shows the last ten minutes of IOPS for all SRs attached to my XS001 host. The purple iscsi3 is an SR; I booted a VM that lives there that nobody ever uses.
-
So my IOPs seem to be jumping between 0 and 900k fairly quickly. But the Queue size seems to stay between 0 and 1, with the latency very low (near zero) as well. Network traffic is well under 1MBps. This is from the performance meters on the Xen master host.
-
@jrc said in Xenserver Space Woes:
So my IOPs seem to be jumping between 0 and 900k fairly quickly. But the Queue size seems to stay between 0 and 1, with the latency very low (near zero) as well. Network traffic is well under 1MBps. This is from the performance meters on the Xen master host.
Basically what that is telling me is that you have plenty of IOPS in reserve and you are never demanding more from it than it can provide. Those numbers are basically showing your storage as "idle" and ready for whatever you want to throw at it.
-
@scottalanmiller said in Xenserver Space Woes:
@jrc said in Xenserver Space Woes:
So my IOPs seem to be jumping between 0 and 900k fairly quickly. But the Queue size seems to stay between 0 and 1, with the latency very low (near zero) as well. Network traffic is well under 1MBps. This is from the performance meters on the Xen master host.
Basically what that is telling me is that you have plenty of IOPS in reserve and you are never demanding more from it than it can provide. Those numbers are basically showing your storage as "idle" and ready for whatever you want to throw at it.
OK, so my gut on that was right. Then I need to work out why the leaf coalesce is timing out, since it appears not to be a disk IO thing.
-
I fixed it! I shut down the VM, then ran an offline coalesce, and that did it:
xe host-call-plugin host-uuid=<Host UUID> plugin=coalesce-leaf fn=leaf-coalesce args:vm_uuid=<VM UUID>
It did take about 45 minutes, but once it was done the space was freed. XenCenter is now happily reporting the used space as 4127 GB and the virtually allocated space as 4115 GB. It's not perfect, but I'll take it!
-
Awesome, glad that that fixed things.