XenServer 6.5 Troubleshoot HBA SR Disk Space



  • I need help troubleshooting one of my SRs.

    I have a ~2 TB HBA SR that has two virtual disks (20 GB and 1024 GB respectively) attached to one VM. XenCenter reports that the SR has 1793.2 GB used. There are no current snapshots for said VM as far as XenCenter can see. I suspect there is some back-end cleanup that needs to happen. Screen shots for those who are visual:

    [Screenshots: SRtrouble1.JPG, SRtrouble2.JPG, SRtrouble3.JPG]

    Now, in the world of thinly provisioned EXT4 SRs, I'd simply cd to the SR and delete virtual disks that I knew were no longer needed. However, in this case, it doesn't look the same.

    Any suggestions?
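
To put a number on the gap described in the post above, here is a minimal Python sketch of the arithmetic. It uses only the figures quoted in the post, and it assumes each virtual disk occupies roughly its full virtual size on the SR, which is typical for thick-provisioned LVM-based SRs but is not confirmed in the thread. The function name `unaccounted_gb` is made up for illustration.

```python
# Figures quoted in the post above (all in GB, as XenCenter reports them).
SR_USED_GB = 1793.2
VDI_SIZES_GB = [20, 1024]  # the two virtual disks attached to the VM


def unaccounted_gb(used_gb, vdi_sizes_gb):
    """Return the SR usage that the visible virtual disks do not explain."""
    return used_gb - sum(vdi_sizes_gb)


gap = unaccounted_gb(SR_USED_GB, VDI_SIZES_GB)
print(f"Visible VDIs: {sum(VDI_SIZES_GB)} GB")
print(f"Unaccounted:  {gap:.1f} GB")
```

Under that assumption, roughly 749 GB of the reported usage is not explained by the two visible disks, which is consistent with leftover snapshot data that has not been cleaned up.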



  • Reboot the hypervisor. I had an issue like this on my home lab, the files just can't get cleaned out, but are marked for removal.



  • @DustinB3403 said in XenServer 6.5 Troubleshoot HBA SR Disk Space:

    Reboot the hypervisor. I had an issue like this on my home lab, the files just can't get cleaned out, but are marked for removal.

    Really?? This is a 16 host pool....I can't just reboot. lol



  • I had the same thing happen with my home lab (granted single host) but what happened was I was out of space on the array for the hypervisor, from snapshotting and VM space (also non-thin provisioned).

    And was using the total array, even after removing snapshots. I checked with the XS forums, they said to reboot the hypervisor.



  • @DustinB3403 said in XenServer 6.5 Troubleshoot HBA SR Disk Space:

    I had the same thing happen with my home lab (granted single host) but what happened was I was out of space on the array for the hypervisor, from snapshotting and VM space (also non-thin provisioned).

    And was using the total array, even after removing snapshots. I checked with the XS forums, they said to reboot the hypervisor.

    insert expletive here

    So, this is a problem because I cannot take snapshots of said VM. And because I cannot take snapshots, I cannot (easily) back it up.

    I wonder if I can cycle through the hosts and reboot them one at a time...if that would work.

    Or, I can take the shortcut and make the SR bigger...but then that's just wasted disk space.

    Hmmm...



  • @anthonyh said in XenServer 6.5 Troubleshoot HBA SR Disk Space:

    I wonder if I can cycle through the hosts and reboot them one at a time...if that would work.

    Can you not live migrate the systems from the host with the issue to another host, and then reboot it?



  • @dafyre said in XenServer 6.5 Troubleshoot HBA SR Disk Space:

    Can you not live migrate the systems from the host with the issue to another host, and then reboot it?

    How do I know which host is causing the locks?



  • @dafyre said:

    Can you not live migrate the systems from the host with the issue to another host, and then reboot it?

    Not sure that would work in this case.

    @anthonyh said:

    This is a 16 host pool... How do I know which host is causing the locks?

    So, you have 16 XS hosts connected to a network storage array?



  • @Danp said in XenServer 6.5 Troubleshoot HBA SR Disk Space:

    So, you have 16 XS hosts connected to a network storage array?

    Well, I have 16 hosts connected to a 3PAR FibreChannel SAN. So more-or-less yes.


  • @anthonyh said in XenServer 6.5 Troubleshoot HBA SR Disk Space:

    So, you have 16 XS hosts connected to a network storage array?

    Well, I have 16 hosts connected to a 3PAR FibreChannel SAN. So more-or-less yes.

    Ah okay, thank goodness it's a 3PAR. I was very afraid that we were going to hear something like MSA, MD or EqualLogic.



  • I have to say it's been a decent setup for what it is. I would prefer a vSAN setup of some sort utilizing local storage on the hosts, but it does the job for now. 🙂


  • @anthonyh said in XenServer 6.5 Troubleshoot HBA SR Disk Space:

    I have to say it's been a decent setup for what it is. I would prefer a vSAN setup of some sort utilizing local storage on the hosts, but it does the job for now. 🙂

    At sixteen hosts that's not trivial. You could do something like a Scale HC3 at that size, but most solutions are not meant to scale up like that. Around the dozen server mark (more or less) a good SAN solution starts to be a pretty obvious choice just because it is so cost effective to consolidate to that degree. 3PAR is one of the three best names in the SAN game, IMHO, and I'd likely stick with that unless you are really looking to redesign heavily. Now, two 3PARs that are synced, yeah, that would be the way to go. But, budgets.



  • @scottalanmiller said in XenServer 6.5 Troubleshoot HBA SR Disk Space:

    At sixteen hosts that's not trivial. You could do something like a Scale HC3 at that size, but most solutions are not meant to scale up like that. Around the dozen server mark (more or less) a good SAN solution starts to be a pretty obvious choice just because it is so cost effective to consolidate to that degree. 3PAR is one of the three best names in the SAN game, IMHO, and I'd likely stick with that unless you are really looking to redesign heavily. Now, two 3PARs that are synced, yeah, that would be the way to go. But, budgets.

    Good to know it's not a completely crazy setup. 😃

    I really have no complaints about the 3PAR setup we have. Support is good. I have an InformOS upgrade scheduled for Thursday. The upgrade will give us data deduplication which I'm excited to try.


  • HPE 3PAR is one of the big boys. That's basically HPE Integrity technology under the hood. Way better reliability than a ProLiant, which is pretty rocking as it is. Between the high-end hardware and high-end support it's a pretty darn reliable bit of kit. Right up there with the high-end EMC and HDS stuff. Of course, having only one unit, there is no escaping the SPOF fears, but that SPOF is really reliable.

    With sixteen hosts connected to it, the value is primarily in the cost savings. It's a cost-effective strategy for standard to slightly high reliability. I wouldn't quite call it HA; it likely doesn't make it that far. It is almost certainly more reliable than a single ProLiant server setup, but not as reliable as a single Integrity server. It's all on a scale.



  • So, are you saying I shouldn't migrate to a Netgear NAS?

    All joking aside, it sounds like the consensus thus far is a reboot? I can cycle the hosts one by one...I suppose it's worth a try...
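
For anyone wanting to script the one-host-at-a-time cycle mentioned above, here is a rough Python sketch that only builds the command list and does not run anything. It assumes the usual `xe` host subcommands (`host-disable`, `host-evacuate`, `host-reboot`, `host-enable`) with a `uuid=` selector; check them against the XenServer 6.5 documentation before use. The function name and the placeholder UUIDs are hypothetical, and nothing in this thread confirms that a rolling reboot reclaims the space.

```python
def rolling_reboot_plan(host_uuids):
    """Build, but do not run, the xe commands for a one-host-at-a-time reboot."""
    plan = []
    for uuid in host_uuids:
        plan.append([
            f"xe host-disable uuid={uuid}",   # stop new VMs from starting on the host
            f"xe host-evacuate uuid={uuid}",  # live migrate running VMs to other hosts
            f"xe host-reboot uuid={uuid}",    # reboot the now-empty host
            f"xe host-enable uuid={uuid}",    # run once the host is back up
        ])
    return plan


# Hypothetical placeholder UUIDs; real ones would come from `xe host-list`.
hosts = ["host-uuid-1", "host-uuid-2"]
for steps in rolling_reboot_plan(hosts):
    print("\n".join(steps), end="\n\n")
```

Keeping the plan as plain data makes it easy to review before anything touches a production pool, and to pause after each host to check whether the SR usage has changed.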


  • Sounds reasonable.



  • I'll start by rebooting the host the VM is running on and see if I get lucky...



  • How many hosts have that particular LUN mounted? If it's only one, that should be the only host that needs to be rebooted.

    You did mention this was a 2 TB LUN, right?



  • @Dashrender Yes, 2 TB LUN. All 16 hosts are attached to said LUN.

    Hey, perhaps I can detach/reattach the LUN? Shut down the VM first, of course.

    Thoughts?? Would this trigger cleanup?



  • I have absolutely no experience at this... Good luck!



  • Welp, rebooting the host that the VM resided on did not clear up any disk space. Hmmm...

