Designing for tech startup: Network, AD, Backup etc
-
To get into the petabyte range you are into "specialty everything" no matter how you slice it. If you are going to build it yourself you are pretty much stuck with Ceph or Gluster. Those aren't fast, and you'll need a storage expert to manage them. It's not like buying a hardware RAID controller and just letting it handle everything for you. This is a significant engineering effort.
Realistically, even if you are looking for 100TB of production storage, you are going to want to bring in vendors like EMC or Nimble, who build and manage systems like this specifically.
-
@scottalanmiller said in Designing for tech startup: Network, AD, Backup etc:
To get into the petabyte range you are into "specialty everything" no matter how you slice it. If you are going to build it yourself you are pretty much stuck with Ceph or Gluster. Those aren't fast, and you'll need a storage expert to manage them. It's not like buying a hardware RAID controller and just letting it handle everything for you. This is a significant engineering effort.
Realistically, even if you are looking for 100TB of production storage, you are going to want to bring in vendors like EMC or Nimble, who build and manage systems like this specifically.
Is this sort of scale something @scale could deal with? (Had to drop the pun.)
-
@travisdh1 said in Designing for tech startup: Network, AD, Backup etc:
Is this sort of scale something @scale could deal with? (Had to drop the pun.)
They haven't made storage systems for a long time.
-
It sounds like they probably don't know how much space they need. Somebody probably told them storage is cheap and they threw out some insane number. They are likely doing something very wrong to need that much storage.
Can you get some clarity on why they think they need that much storage? How much storage are they currently using?
-
@scottalanmiller said in Designing for tech startup: Network, AD, Backup etc:
To get into the petabyte range you are into "specialty everything" no matter how you slice it. If you are going to build it yourself you are pretty much stuck with Ceph or Gluster. Those aren't fast, and you'll need a storage expert to manage them. It's not like buying a hardware RAID controller and just letting it handle everything for you. This is a significant engineering effort.
Realistically, even if you are looking for 100TB of production storage, you are going to want to bring in vendors like EMC or Nimble, who build and manage systems like this specifically.
It seems like many use Lustre with ZFS for huge high performance storage (tens of GB/s).
Supermicro, Dell EMC, HPE and others have reference solutions for it.
"... is ideal for organizations that are able to self-support such as universities, National Labs, and others with such capabilities."
Supermicro solution allows you to expand with 0.5PB or 1PB at a time.
From what I can see you need 18U rack space for a 1 PB solution.
If you fill up an entire rack you have 3PB usable storage.
https://supermicro.com/en/solutions/lustre
-
@IRJ said in Designing for tech startup: Network, AD, Backup etc:
It sounds like they probably don't know how much space they need. Somebody probably told them storage is cheap and they threw out some insane number. They are likely doing something very wrong to need that much storage.
That matches all the rest of the setup... all the needs, vendors, etc. don't make much sense and sound like a non-technical person throwing out words overheard in an airport.
-
@IRJ said in Designing for tech startup: Network, AD, Backup etc:
Can you get some clarity on why they think they need that much storage? How much storage are they currently using?
The background is that the system design is all politically motivated and not rooted in business or technical needs. We spent some time on that. There's not even a known use case for the storage (it's unclear if it will be object, block/SAN, file/NAS, etc.). The need is to have "petabyte storage" without any other specification.
-
Hi @gjacobse, consider something like a tiered approach to the problem. 1 PB is a lot of data.
Maybe 5-10 TB of fast SSD for caching, 50-100 TB of spinning disks for caching/capacity, and the rest goes to the cloud.
For instance, a single AWS Storage Gateway appliance could be the solution if you have a good internet uplink.
Another solution could be Azure Stack.
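To put rough numbers on that split, here is a minimal sketch in Python. It assumes decimal units (1 PB = 1000 TB) and takes the upper end of each range above; the function name and the figures are illustrative only, not a sizing recommendation.

```python
# Hypothetical sizing sketch: tier sizes are the upper ends of the ranges
# suggested above, and 1 PB is treated as 1000 TB (decimal units).
TOTAL_TB = 1000


def tier_split(total_tb, ssd_tb, hdd_tb):
    """Return TB per tier; whatever does not fit on local tiers goes to cloud."""
    if ssd_tb + hdd_tb > total_tb:
        raise ValueError("local tiers exceed the total capacity")
    cloud_tb = total_tb - ssd_tb - hdd_tb
    return {"ssd": ssd_tb, "hdd": hdd_tb, "cloud": cloud_tb}


split = tier_split(TOTAL_TB, ssd_tb=10, hdd_tb=100)
for tier, tb in split.items():
    # Show each tier's size and its share of the total.
    print(f"{tier}: {tb} TB ({tb / TOTAL_TB:.1%})")
```

Even with the larger local tiers, roughly nine tenths of the data would sit in the cloud, which is why the uplink matters so much here.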
Feel free to contact me if you need advice about that kind of setup.
-
If this is something that just has to happen for whatever reason, political, technical, whatever, then the only real way to do it is as Scott mentioned... RAIN. It's the only way to do it that is truly scalable and manageable to sizes like that, even if starting out with much lower storage capacity. You can start with a few nodes of several hundred TBs, and add more nodes to scale as required.
Maybe look at something like DataOn.
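As a rough illustration of "start small and add nodes", here is a minimal sketch in Python. The 300 TB per node and the three copies of every piece of data are assumptions picked for the example, not figures from any vendor; real RAIN systems may use erasure coding instead of full copies and need free-space headroom on top.

```python
import math


def usable_tb(node_count, node_raw_tb, copies):
    """Usable capacity when every piece of data is stored `copies` times."""
    return node_count * node_raw_tb / copies


def nodes_needed(target_usable_tb, node_raw_tb, copies):
    """Nodes required for a usable target; never fewer nodes than copies."""
    raw_needed = target_usable_tb * copies
    return max(copies, math.ceil(raw_needed / node_raw_tb))


NODE_RAW_TB = 300  # assumption: "several hundred TB" per node
COPIES = 3         # assumption: three copies spread across nodes

# Start with a few nodes, then see how many a full petabyte would take.
print("3 nodes usable TB:", usable_tb(3, NODE_RAW_TB, COPIES))
print("nodes for 1 PB usable:", nodes_needed(1000, NODE_RAW_TB, COPIES))
```

The point is only that capacity grows by adding nodes, and that the protection scheme multiplies the raw capacity you have to buy.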
-
RAIN over RAID
This is likely the answer. I found this: Death of RAID and am reading it.
I don’t see many diagrams on it, or much on it really; maybe I’m not searching the right term(s).
-
@gjacobse said in Designing for tech startup: Network, AD, Backup etc:
RAIN over RAID
This is likely the answer. I found this: Death of RAID and am reading it.
I don’t see many diagrams on it, or much on it really; maybe I’m not searching the right term(s).
Of course there are no diagrams. ML is for discussion, not a handbook on how to set up an exact system. Everything you're being asked to do is going to require RAIN, but the specifics of how it's set up are going to be completely unique to this environment.
-
Reading the documentation for Gluster, there is nothing particularly difficult to understand here. Since RAIN is what you're going to be using, it might be worth reading. https://docs.gluster.org/en/latest/Quick-Start-Guide/Quickstart/
-
@DustinB3403 said in Designing for tech startup: Network, AD, Backup etc:
Reading the documentation for Gluster, there is nothing particularly difficult to understand here. Since RAIN is what you're going to be using, it might be worth reading. https://docs.gluster.org/en/latest/Quick-Start-Guide/Quickstart/
Doesn’t meet the requested OS.
They only want Windows.
-
@gjacobse said in Designing for tech startup: Network, AD, Backup etc:
@DustinB3403 said in Designing for tech startup: Network, AD, Backup etc:
Reading the documentation for Gluster, there is nothing particularly difficult to understand here. Since RAIN is what you're going to be using, it might be worth reading. https://docs.gluster.org/en/latest/Quick-Start-Guide/Quickstart/
Doesn’t meet the requested OS.
They only want Windows.
Well, they clearly don't know what they want or how it works.
My assumption is they want a Windows File Server. They shouldn't care how the underlying environment is set up, so long as they are presented with the interface to "manage the files" that they are used to.
-
I suppose you could use Storage Spaces Direct (all Windows across the entire thing), but I wouldn't consider S2D at all mature or production ready, especially at this scale.
-
@DustinB3403 said in Designing for tech startup: Network, AD, Backup etc:
I suppose you could use Storage Spaces Direct (all Windows across the entire thing), but I wouldn't consider S2D at all mature or production ready, especially at this scale.
Thanks, had not heard of this.
-
@gjacobse said in Designing for tech startup: Network, AD, Backup etc:
@DustinB3403 said in Designing for tech startup: Network, AD, Backup etc:
I suppose you could use Storage Spaces Direct (all Windows across the entire thing), but I wouldn't consider S2D at all mature or production ready, especially at this scale.
Thanks, had not heard of this.
You'd not heard of Storage Spaces Direct?
-
@gjacobse said in Designing for tech startup: Network, AD, Backup etc:
RAIN over RAID
This is likely the answer. I found this: Death of RAID and am reading it.
I don’t see many diagrams on it, or much on it really; maybe I’m not searching the right term(s).
RAIN is used pretty much everywhere. But it's not something you normally implement yourself. So you aren't going to find much on it because almost no one works with it.
-
@gjacobse said in Designing for tech startup: Network, AD, Backup etc:
@DustinB3403 said in Designing for tech startup: Network, AD, Backup etc:
I suppose you could use Storage Spaces Direct (all Windows across the entire thing), but I wouldn't consider S2D at all mature or production ready, especially at this scale.
Thanks, had not heard of this.
DataOn solutions fully support this and vice versa. They are experienced with this kind of scale and much larger.