Designing for tech startup: Network, AD, Backup etc
- 
 @scottalanmiller said in Designing for tech startup: Network, AD, Backup etc: @gjacobse said in Designing for tech startup: Network, AD, Backup etc: has anyone used Free RAID Calculator before? Yeah, but the math is very straightforward, not much call for it. while I would agree,... when you're dealing with that Petabyte, it's nice to know your math is right - 
- 
 @gjacobse said in Designing for tech startup: Network, AD, Backup etc: has anyone used Free RAID Calculator before? Problems with it include.... The parts that it gets right are insanely simple and can be done in your head faster than you can type the details in. The other parts are just wrong. The performance part is wrong, as is the risk part. The only part it gets right is the capacity. 
- 
 @gjacobse said in Designing for tech startup: Network, AD, Backup etc: @scottalanmiller said in Designing for tech startup: Network, AD, Backup etc: @gjacobse said in Designing for tech startup: Network, AD, Backup etc: has anyone used Free RAID Calculator before? Yeah, but the math is very straightforward, not much call for it. while I would agree,... when you're dealing with that Petabyte, it's nice to know your math is right - Can't really consider a petabyte on RAID. So not useful for storage at that scale. 
- 
 @scottalanmiller said in Designing for tech startup: Network, AD, Backup etc: @gjacobse said in Designing for tech startup: Network, AD, Backup etc: has anyone used Free RAID Calculator before? Problems with it include.... The parts that it gets right are insanely simple and can be done in your head faster than you can type the details in. The other parts are just wrong. The performance part is wrong, as is the risk part. The only part it gets right is the capacity. good thing I am ignoring that aspect. for any type of performance, I would go with a full hybrid system. 
- 
 @gjacobse said in Designing for tech startup: Network, AD, Backup etc: @scottalanmiller said in Designing for tech startup: Network, AD, Backup etc: @gjacobse said in Designing for tech startup: Network, AD, Backup etc: has anyone used Free RAID Calculator before? Problems with it include.... The parts that it gets right are insanely simple and can be done in your head faster than you can type the details in. The other parts are just wrong. The performance part is wrong, as is the risk part. The only part it gets right is the capacity. good thing I am ignoring that aspect. for any type of performance, I would go with a full hybrid system. Hybrid isn't going to fix the fundamental issue of RAID not being viable at even a fraction of this size. 
- 
 As RAID arrays get large, you have to move more and more towards RAID 10. Using roughly the largest drives broadly available on the market (12TB), a single petabyte would be 180 drives in a single RAID array. This is way, way larger than is practical to have in a single array, from both a spindle count and a storage volume standpoint. 
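
 For anyone who wants to check the drive count, here is a minimal sketch of the RAID 10 arithmetic in Python. The function name and constants are hypothetical, and it assumes plain mirrored pairs with no hot spares or filesystem overhead. The exact result depends on whether "petabyte" is read as decimal (10^15 bytes) or binary (2^50 bytes), so the two cases land a bit either side of the rough 180 figure.

```python
def raid10_drive_count(usable_bytes: int, drive_bytes: int) -> int:
    """Drives needed for RAID 10: every data drive needs one mirror partner."""
    # Ceiling division in pure integers, so there is no float rounding.
    data_drives = -(-usable_bytes // drive_bytes)
    return 2 * data_drives


DRIVE_12TB = 12 * 10**12   # drive vendors quote decimal terabytes
PB_DECIMAL = 10**15        # 1 PB, decimal
PIB_BINARY = 2**50         # 1 PiB, binary

print(raid10_drive_count(PB_DECIMAL, DRIVE_12TB))  # 168
print(raid10_drive_count(PIB_BINARY, DRIVE_12TB))  # 188
```

 Hot spares and formatting overhead are left out on purpose, so treat the output as a floor rather than a design number.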
- 
 @scottalanmiller said in Designing for tech startup: Network, AD, Backup etc: As RAID arrays get large, you have to move more and more towards RAID 10. Using roughly the largest drives available broadly on the market (12TB), a single petabyte would be 180 drives in a single RAID array. This is way, way larger than is practical to have in a single array from both a spindle count, and a storage volume number. Considering the 16TB is so new - I wouldn't recommend them. I need to go back and re-re-read the IPOD of yours..... 
- 
 @gjacobse said in Designing for tech startup: Network, AD, Backup etc: Considering the 16TB is so new - I wouldn't recommend them. 16TB is an SSD. 
- 
 @gjacobse said in Designing for tech startup: Network, AD, Backup etc: I need to go back and re-re-read the IPOD of yours..... That's a separate issue, but also huge. But as this is purely a storage question, we need to focus there. RAID essentially stops being viable around 100TB - 200TB. If you are dealing with really slow, low priority archival storage maybe slightly larger. For a large petabyte scale storage system, RAIN is really your only option. 
- 
 To get into petabyte range you are into "specialty everything" no matter how you slice it. If you are going to build it yourself you are pretty much stuck with Ceph or Gluster. Those aren't fast, and you'll need a storage expert managing them. It's not like buying a hardware RAID controller and just letting it handle everything for you. This is a significant engineering effort. Realistically, even if you are looking for 100TB of production storage, you are going to want to bring in vendors like EMC or Nimble, who build and manage systems like this specifically. 
- 
 @scottalanmiller said in Designing for tech startup: Network, AD, Backup etc: To get into petabyte range you are into "specialty everything" no matter how you slice it. If you are going to build it yourself you are pretty much stuck with CEPH or Gluster. And those aren't fast and you'll expect a storage expert to be managing them. It's not like buying a hardware RAID controller and just letting it handle everything for you. This is a significant engineering effort. Realistically, even if you are looking for 100TB of production storage, you are going to want to be bringing in vendors like EMC or Nimble where they build and manage systems like this specifically. Is this sort of scale something @scale could deal with? (Had to drop the pun.) 
- 
 @travisdh1 said in Designing for tech startup: Network, AD, Backup etc: Is this sort of scale something @scale could deal with? (Had to drop the pun.) They haven't made storage systems for a long time. 
- 
 It sounds like they probably don't know how much space they need. Somebody probably told them storage is cheap and they threw out some insane number. They are likely doing something very wrong to need that much storage. Can you get some clarity of why they think they need that much storage? How much storage are they currently using? 
- 
 @scottalanmiller said in Designing for tech startup: Network, AD, Backup etc: To get into petabyte range you are into "specialty everything" no matter how you slice it. If you are going to build it yourself you are pretty much stuck with CEPH or Gluster. And those aren't fast and you'll expect a storage expert to be managing them. It's not like buying a hardware RAID controller and just letting it handle everything for you. This is a significant engineering effort. Realistically, even if you are looking for 100TB of production storage, you are going to want to be bringing in vendors like EMC or Nimble where they build and manage systems like this specifically. It seems like many use Lustre with ZFS for huge high performance storage (tens of GB/s). 
 Supermicro, Dell EMC, HPE and others have reference solutions for it: "... is ideal for organizations that are able to self-support such as universities, National Labs, and others with such capabilities." The Supermicro solution allows you to expand by 0.5PB or 1PB at a time. 
 From what I can see you need 18U rack space for a 1 PB solution.
 If you fill up an entire rack you have 3PB usable storage.
 https://supermicro.com/en/solutions/lustre
- 
 @IRJ said in Designing for tech startup: Network, AD, Backup etc: It sounds like they probably don't know how much space they need. Somebody probably told them storage is cheap and they threw out some insane number. They are likely doing something very wrong to need that much storage. That matches all the rest of the setup.... all needs, vendors, etc. that don't make much sense and sound like a non-technical person throwing out words overheard in an airport. 
- 
 @IRJ said in Designing for tech startup: Network, AD, Backup etc: Can you get some clarity of why they think they need that much storage? How much storage are they currently using? The background is that the system design is all politically motivated and not rooted in business or technical needs. We spent some time on that. There's not even a known use case for the storage (it's unclear if it will be object, block/SAN, file/NAS, etc.). The need is to have "petabyte storage" without any other specification. 
- 
 Hi @gjacobse , consider something like a tiered approach to the problem. 1PB is a lot of data. 
 Maybe 5-10TB of fast SSD for caching, 50-100TB of spinning disks for caching/capacity, and the rest will go to the cloud.
 For instance, a single AWS Storage Gateway appliance could be the solution if you have a good internet uplink.
 Another solution could be Azure Stack.
 Feel free to contact me if you need advice about that kind of setup.
- 
 If this is something that just has to happen for whatever reason, political, technical, whatever, then the only real way to do it is as Scott mentioned... RAIN. It's the only way to do it that is truly scalable and manageable to sizes like that, even if starting out with much lower storage capacity. You can start with a few nodes of several hundred TBs, and add more nodes to scale as required. Maybe look at something like DataOn. 
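
 To make the "start with a few nodes and add more" idea concrete, here is a rough node-count sketch in Python. Everything in it is an assumption for illustration: the function name, 300TB of raw disk per node (standing in for "several hundred TBs"), 3-way replication, and a 3-node minimum. Real RAIN products differ (erasure coding changes the overhead a lot), so this only shows the shape of the math.

```python
import math


def rain_nodes_needed(usable_tb: float, node_raw_tb: float,
                      replicas: int, min_nodes: int = 3) -> int:
    """Estimate node count for a replicated RAIN cluster (illustrative only)."""
    # Every usable TB is stored `replicas` times across the cluster.
    raw_needed_tb = usable_tb * replicas
    # Never go below the assumed minimum cluster size.
    return max(min_nodes, math.ceil(raw_needed_tb / node_raw_tb))


# Growing from a modest start up to a full petabyte (1000TB usable).
for target_tb in (200, 500, 1000):
    nodes = rain_nodes_needed(target_tb, node_raw_tb=300, replicas=3)
    print(f"{target_tb}TB usable -> {nodes} nodes")
```

 Under those assumptions, the same cluster design that starts at three nodes reaches a petabyte at ten, which is the scaling behaviour a single RAID array can't offer.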
- 
 RAIN over RAID This is likely the answer. I found this: Death of RAID and am reading it. I don’t see many diagrams on it, or much on it really; maybe I’m not searching the right term(s). 
- 
 @gjacobse said in Designing for tech startup: Network, AD, Backup etc: RAIN over RAID This is likely the answer. I found this: Death of RAID and am reading it. I don’t see many diagrams on it, or much on it really; maybe I’m not searching the right term(s). Of course there are no diagrams; ML is for discussion, not a handbook on how to set up an exact system. Everything you're being asked to do is going to require RAIN, but the specifics of how it's set up are going to be completely unique to this environment. 





