Xenserver and Storage
-
Look at this link.
255 disks per VM, with a maximum of 4096 disks per host at the limits. Not 16.
-
@jrc said in Xenserver and Storage:
EDIT: Apparently the 16 VHD limit is a Xenserver 6.5 and earlier thing. Xenserver 7 is 255. I need to upgrade to 7....
Yeah you'd better.
-
Which means you can have an absolutely insane amount of storage space attached to the VSAN VM, and then pass that out to your other guests.
@scottalanmiller do any single server units support up to 521,220 TB worth of raw space?
-
@jrc said in Xenserver and Storage:
@dustinb3403 said in Xenserver and Storage:
@jrc said in Xenserver and Storage:
Ok, let me take a step back here.
On the host you install a VSAN controller VM, you then attach storage to this VM, which it will then use as its VSAN storage space and map that over to the host via iSCSI. Am I correct so far?
All right, so with Xenserver, the max attached VHDs would be 16, at 2 TB each. And since this VSAN VM is to be used to home all your VMs on the host, you'd need it to have a fair bit of space. So having roughly a 32 TB limit could be a problem. Does this mean you'd need to have a second VSAN VM in that host, therefore upping that limit to 64 TB? And does the VSAN OS handle spanning the data across all 16 VHDs?
Where are you getting your limits from? I literally just posted them and did the math on what you could provide as storage to a single VM.
Xenserver has a limit of 16 VHDs per VM, I know this from experience and it is buried in their docs. I think the 255 you mention is for pure Xen, which I am not running.
EDIT: Apparently the 16 VHD limit is a Xenserver 6.5 and earlier thing. Xenserver 7 is 255. I need to upgrade to 7....
Ah good, so not perfect, but there is a solution. Many VDIs to one VM is just like small LUNs in the SAN world. Annoying, but fine.
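The capacity figures in this exchange can be checked with a few lines of arithmetic. This is only a sketch: the 16 and 255 disk limits and the 2 TB disk size come from the posts above, while the per-VDI ceiling of 2 TB minus 4 GB (2044 GB) is an assumption that is not stated in the thread, and the function name is made up for illustration.

```python
def max_attached_capacity(disks_per_vm: int, disk_size: int) -> int:
    """Upper bound on storage attachable to one VM: disk count times disk size."""
    return disks_per_vm * disk_size


# XenServer 6.5 and earlier: 16 VHDs per VM at 2 TB each (figures from the thread).
legacy_tb = max_attached_capacity(16, 2)
# A second VSAN VM on the same host doubles that ceiling.
two_vsan_vms_tb = 2 * legacy_tb

# XenServer 7: 255 disks per VM, still counting 2 TB per disk.
xs7_tb = max_attached_capacity(255, 2)
# Assumption (not from the thread): each VDI tops out at 2044 GB.
xs7_gb = max_attached_capacity(255, 2044)

print(legacy_tb, two_vsan_vms_tb, xs7_tb, xs7_gb)
```

If the per-VDI assumption holds, 255 disks come to 521,220 GB, which is a little over 500 TB for a single VM.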
-
@dustinb3403 said in Xenserver and Storage:
@scottalanmiller do any single server units support up to 521,220 TB worth of raw space?
With DAS units, maybe. Some things like large Oracle and IBM chassis can have a single server spanning dozens of racks. So... it's theoretically possible.
-
@dustinb3403 said in Xenserver and Storage:
@dbeato You can completely skip Windows and use the Linux VSAN controllers.
https://www.starwindsoftware.com/announcing-new-linux-based-starwind-virtual-storage-appliance-video
Any idea on how I can download this? When I try all I can seem to find is the Windows installer.
-
@jrc said in Xenserver and Storage:
@dustinb3403 said in Xenserver and Storage:
@dbeato You can completely skip Windows and use the Linux VSAN controllers.
https://www.starwindsoftware.com/announcing-new-linux-based-starwind-virtual-storage-appliance-video
Any idea on how I can download this? When I try all I can seem to find is the Windows installer.
Use the request demo portion here.
Otherwise I'd hit up the folks @StarWind_Software to point you in the right direction.
-
Additionally, @Oksana might be able to point you in the right direction.
-
@dustinb3403 said in Xenserver and Storage:
@jrc said in Xenserver and Storage:
@dustinb3403 said in Xenserver and Storage:
@dbeato You can completely skip Windows and use the Linux VSAN controllers.
https://www.starwindsoftware.com/announcing-new-linux-based-starwind-virtual-storage-appliance-video
Any idea on how I can download this? When I try all I can seem to find is the Windows installer.
Use the request demo portion here.
Yeah, I did. It just sent me the link to the Windows installer.
Otherwise I'd hit up the folks @StarWind_Software to point you in the right direction.
I'll give that a go.
-
Hmm, I see a lot of VSAN advice, which is the correct way to go, but I also wonder: can't he do something simple like a GlusterFS VM as well? Would that work in this case, and be a simpler route?
-
@emad-r said in Xenserver and Storage:
Hmm, I see a lot of VSAN advice, which is the correct way to go, but I also wonder: can't he do something simple like a GlusterFS VM as well? Would that work in this case, and be a simpler route?
Yes and no. Would it work? Yes. Would it be easier to set up, manage, and maintain? No. @olivier can speak more to GlusterFS.
-
@emad-r said in Xenserver and Storage:
Hmm, I see a lot of VSAN advice, which is the correct way to go, but I also wonder: can't he do something simple like a GlusterFS VM as well? Would that work in this case, and be a simpler route?
GlusterFS is still RLS (replicated local storage). The advice is not really to use a VSAN, but to use RLS. People used to be sloppy and use VSA to refer to RLS; now they use VSAN. Neither is correct, as RLS is more than any one connection technology.
GlusterFS will work here, but it requires more nodes and is not practical at this scale. It would be slow and problematic. No advantages that I can think of.
-
Gluster on 2 nodes won't be slow or problematic (which problems?), just a bit complicated without a turnkey deployment method (i.e. XOSAN).
-
@emad-r said in Xenserver and Storage:
Hmm, I see a lot of VSAN advice, which is the correct way to go, but I also wonder: can't he do something simple like a GlusterFS VM as well? Would that work in this case, and be a simpler route?
Not simpler if it's not understood, or without a turnkey "layer" on top.
Gluster is not that complicated, but you still need to grasp some concepts. In short, it's like Xen vs XenServer. The second is turnkey, so you don't need to assemble all the pieces yourself, versus learning Xen "alone" on your distro.
-
@olivier official gluster docs say a 2 node config will go readonly if 1 node dies... You need at least an arbiter node afaik
-
@matteo-nunziati This is why we have an extra arbiter VM in a 2-node setup. One node gets 2 VMs (1x normal and 1x arbiter), and the other one just a normal VM.
This way, if you lose the host with one gluster VM, it will still work and you can't have a split-brain scenario.
An arbiter node costs very few resources (it just works with metadata).
-
@olivier said in Xenserver and Storage:
@matteo-nunziati This is why we have an extra arbiter VM in a 2-node setup. One node gets 2 VMs (1x normal and 1x arbiter), and the other one just a normal VM.
This way, if you lose the host with one gluster VM, it will still work and you can't have a split-brain scenario.
An arbiter node costs very few resources (it just works with metadata).
Wow, blowing my mind! I always pictured physical gluster nodes where gluster was installed on dom0. x-D.
But what if the node w/ volume AND arbiter goes down? I'm still missing this... Is the arbiter replicated in any way on the xen nodes?
-
The Gluster client is installed in Dom0 (the client to access data). But the Gluster servers are in VMs, so you get more flexibility.
If the node with the arbiter goes down, yes, you are in RO. But you won't enter a split-brain scenario (which is the worst case in a 2-node setup).
E.g. using DRBD with 2 nodes in multi-master, if you just lose the replication link and you wrote on both sides, you are basically f***ed (you'll need to discard data on one node).
There is no miracle: play defensive (RO if one node is down) or risky (split brain). We chose the "intermediate" way: safe, with a 50% chance that the node you lose is the "right" one, so you stay out of RO.
Obviously, 3 nodes is the sweet spot when you decide to use hyperconvergence at small scale, because the 3rd physical server that was previously used for storage can now also be a "compute" node (hypervisor) with storage, and you could lose any one of the 3 hosts without going read-only (disperse 3).
edit: XOSAN allows you to go from 2 to 3 nodes while your VMs are running, i.e. without any service interruption. So you can start with 2 and extend later.
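The trade-off described here can be sketched as a simple majority vote. This is a deliberately simplified model and not Gluster's actual quorum code: it assumes every data VM and the arbiter VM carries one vote and that the volume stays writable only while a strict majority of votes is reachable. The host names and function names are hypothetical.

```python
def volume_state(votes_up: int, votes_total: int) -> str:
    """Writable only while a strict majority of votes is still reachable."""
    return "read-write" if 2 * votes_up > votes_total else "read-only"


def state_after_losing(layout: dict, host: str) -> str:
    """State of the volume after one physical host (and all its VMs) goes down."""
    total = sum(layout.values())
    remaining = total - layout[host]
    return volume_state(remaining, total)


# 2-node layout from the post: host_a runs a data VM plus the arbiter VM,
# host_b runs a single data VM. Values are the number of votes per host.
two_nodes = {"host_a": 2, "host_b": 1}

# 3-node layout: one voting VM per host.
three_nodes = {"host_a": 1, "host_b": 1, "host_c": 1}

print(state_after_losing(two_nodes, "host_b"))    # 2 of 3 votes left
print(state_after_losing(two_nodes, "host_a"))    # 1 of 3 votes left
print(state_after_losing(three_nodes, "host_c"))  # 2 of 3 votes left
```

In this model the 2-node layout survives the loss of the host without the arbiter and drops to read-only when the arbiter's host is lost, while the 3-node layout stays writable whichever single host fails.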
-
@dbeato said in Xenserver and Storage:
@olivier I would not do HA Lizard, it is problematic with XenServer. You can ask @StorageNinja . I have gone through many SW posts having issues with this. I did recommend it once but it was not worth it. XOSAN will be much better
https://xen-orchestra.com/blog/xenserver-hyperconverged-with-xosan/
or if you can afford two more hosts with Windows Server and StarWind VSAN then it would be good too.
Note, XOSAN is just Gluster under the hood. You do NOT WANT TO RUN GLUSTER WITH 2 nodes. IT IS NOT SUPPORTED. (You can run a 3rd metadata-only node, but you need SOMETHING out there to provide quorum.)
It requires a proper stateful quorum of a 3rd node. Also, for maintenance, you really likely want 4 nodes at a minimum so you can do patching and still take a failure. You'll also need to consider having enough free capacity on the cluster to maintain healthy slack on the bricks (20-30%) AND take a failure, so work that math into your overhead. Also, for reasons I'll get into in a moment, you REALLY want to run local RAID on Gluster nodes.
Also note, Gluster's local drive failure handling is very... binary... Red Hat (who owns Gluster) refuses to issue a general support statement for JBOD mode with their HCI product, and directs you to use RAID 6 for 7.2K drives (no RAID 10). Given the unpredictable latency issues with SSDs (garbage collection triggering failure detection, etc.), their deployment guide completely skips SSDs (as I would expect until they can fix the failure detection code to be more dynamic, or they can build an HCL). JBOD, because of these risks, is a "Contact your Red Hat representative for details." (Code for "we think this is a bad idea, but might do a narrowly tested RPQ type process.")
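The overhead math mentioned above (keep 20-30% slack on the bricks AND be able to take a failure) can be roughed out as follows. This is a back-of-the-envelope sketch under stated assumptions: equal-sized nodes, capacity counted before any replication overhead, and the data of a failed node having to fit on the survivors. The function name and the example numbers are made up.

```python
def fill_budget_tb(nodes: int, capacity_per_node_tb: float,
                   slack: float = 0.25, failures: int = 1) -> float:
    """Raw TB that can be filled so the surviving nodes still keep `slack` free.

    Ignores replica count: divide the result by the replication factor
    to estimate how much user data that represents.
    """
    surviving = nodes - failures
    if surviving <= 0:
        raise ValueError("cannot tolerate that many failures with so few nodes")
    if not 0.0 <= slack < 1.0:
        raise ValueError("slack must be a fraction between 0 and 1")
    return surviving * capacity_per_node_tb * (1.0 - slack)


# Hypothetical cluster: 4 nodes with 10 TB each, 25% slack, one node may fail.
raw_total = 4 * 10.0
budget = fill_budget_tb(nodes=4, capacity_per_node_tb=10.0)
print(f"raw: {raw_total} TB, safe to fill: {budget} TB")
```

With these example numbers only 22.5 TB of the 40 TB raw should be filled, which is the kind of overhead worth budgeting for before buying hardware.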
-
@olivier said in Xenserver and Storage:
The Gluster client is installed in Dom0 (the client to access data). But the Gluster servers are in VMs, so you get more flexibility.
This architecture has a few limitations vs. something running against bare metal on a hypervisor, or a 3 tier storage.
-
You are adding latency to the back-end disk path unless you are running SR-IOV passthrough of the HBA/RAID controller.
-
You are adding TCP overhead (CPU, and 10us of latency) to the front end EVEN if/when the data is local, if you are using NFS to present Gluster to the hosts (the supported/tested method).
-
Unless you've invented a native client for Xen, you destroy the primary thing I liked about gluster (local DRAM on the client side being used for ultra-low latency reads) as you are adding 10us and TCP overhead (Well I guess you could do NFS RDMA, but that's even more non-standard/unstable than pNFS)
-
The above hairpins (BACK and front end) burn a lot of extra compute. As you scale (especially on the network transport side) this gets ugly on wasting CPU cores. If you have any applications licensed per core or socket this becomes a nasty "VSA TAX" on your environment vs. a traditional 3 tier storage array deployment or something more efficient.
I do agree with you that 2-node multi-master DRBD is hilariously dangerous. I've personally had to fix split brains multiple times from people doing this, and a stateful system (like what Gluster uses) is 1000x safer to use. The challenge with DRBD is that the people smart enough to deploy it correctly are generally smart enough to do something else instead....
-