Cross Posting - Storage Spaces Conundrum



  • Hello Spiceheads (from here),

    I am currently looking at implementing a large file server. I have a Lenovo server with 70x 1.8 TB 10K SAS drives attached via DAS. This server will be used as a file server, serving roughly 80% small files (1-2 MB) and 20% large files (10 GB+).

    What I am not sure about is how to provision the drives. Do I use RAID? Should I use Storage Spaces? Or should I go with something else like ScaleIO, OpenIO, StarWind, etc.?

    I am looking for a solution that is scalable, so that I can grow the volume later. I was also thinking about a little future-proofing: setting this up so that I could scale it out if I wanted to.

    This does need to be resilient, with a quick turnaround should a disk go down, and it also needs to be scalable.

    Looking forward to hearing your views.



  • So immediately I think the OP needs some guidance with regard to this project. It sounds as if he's out of his realm of expertise.

    Then the follow-up questions are:

    • Why are there so many low-capacity drives?
    • How is this DAS storage configured?
    • Who configured this system? Isn't there documentation from the deployment?
    • Is this system virtual?
    • How much storage do you have today?
    • How much storage do you need?


  • How do you build something this huge (did you really say 70, as in seven-zero, drives?) and not have a hired consultant design it for you?

    This thing has to be massive. I'm trying to remember: I think Scott told us he once had a SamSD that held 24-26 drives? So we're definitely talking about DAS trays connected to some kind of adapter inside the server. I can't imagine this was less than $30K.



  • @DustinB3403 said in Cross Posting - Storage Spaces Conundrum:

    I am currently looking at implementing a large file server. I have a Lenovo server with 70x 1.8 TB 10K SAS drives attached via DAS. This server will be used as a file server, serving roughly 80% small files (1-2 MB) and 20% large files (10 GB+).

    How were the number of drives chosen without knowing the other answers? You have to know EVERYTHING about the RAID, RAIN, scale out, etc. before even having the first clue about which drives to choose, and how many. Something is dreadfully wrong. The cart is driving the horse.



  • @Dashrender said in Cross Posting - Storage Spaces Conundrum:

    This thing has to be massive. I'm trying to remember: I think Scott told us he once had a SamSD that held 24-26 drives? So we're definitely talking about DAS trays connected to some kind of adapter inside the server. I can't imagine this was less than $30K.

    You can get to about 48 without going to an external tray. In theory.



  • @DustinB3403 said in Cross Posting - Storage Spaces Conundrum:

    What I am not sure about is how to provision the drives. Do I use RAID? Should I use Storage Spaces?

    Storage Spaces IS RAID. It's just Windows software RAID. So no, it's not well tested, and it's certainly not smart to be a guinea pig at this scale!



  • At this size, 99% of the time you are looking at Scale Out. So products like Exablox or OpenIO. Once you talk about adding DAS... stop. You should almost never do that. Go out, not up.

    @SeanExablox @GuillaumeDelaporte



  • @DustinB3403 said in Cross Posting - Storage Spaces Conundrum:

    I am currently looking at implementing a large file server. I have a Lenovo server with 70x 1.8 TB 10K SAS drives attached via DAS. This server will be used as a file server, serving roughly 80% small files (1-2 MB) and 20% large files (10 GB+).

    So the server is already purchased? How did we get to this point?



  • If going RAID, nothing is going to be great. Only RAID 10 is viable at this scale with any real protection.
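
    A back-of-the-envelope comparison of what these layouts cost in capacity. This is a sketch in Python purely for illustration; it assumes one big array of all 70 drives, which nobody should actually build:

```python
# Usable capacity for 70 x 1.8 TB drives under two RAID layouts.
# Illustration only: a real deployment would split this into
# several smaller arrays.

N_DRIVES = 70
DRIVE_TB = 1.8

def raid10_usable(n, size_tb):
    """RAID 10 mirrors drive pairs: half the raw capacity is usable."""
    return (n // 2) * size_tb

def raid6_usable(n, size_tb):
    """RAID 6 gives up two drives' worth of capacity to parity."""
    return (n - 2) * size_tb

print(f"Raw:     {N_DRIVES * DRIVE_TB:.1f} TB")
print(f"RAID 10: {raid10_usable(N_DRIVES, DRIVE_TB):.1f} TB usable")
print(f"RAID 6:  {raid6_usable(N_DRIVES, DRIVE_TB):.1f} TB usable")
```

    RAID 10 leaves about 63 TB usable out of 126 TB raw; the extra usable capacity of parity layouts is what tempts people into them at this scale, despite the weaker protection.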



  • @DustinB3403 said in Cross Posting - Storage Spaces Conundrum:

    This does need to be resilient, with a quick turnaround should a disk go down, and it also needs to be scalable.

    So the hardware that you have is not resilient or scalable; that's why you need a different approach. The more disks and trays you add to your system, the worse it will get with this scale-up design.



  • @DustinB3403 said in Cross Posting - Storage Spaces Conundrum:

    I am looking for a solution that is scalable, so that I can grow the volume later. I was also thinking about a little future-proofing: setting this up so that I could scale it out if I wanted to.

    You need three servers to scale out, and the DAS is just wasted; you'll want to remove that. You have to plan for scale out from the beginning, not buy and build a scale-up solution and then try to switch. You are lacking the hardware and software for scale out. You'll have to replace a lot of what you have.



  • @scottalanmiller Thanks for the feedback. My long-term goal was to purchase more of these, thus providing scale out. RAID 10 is the plan at this scale. I have read a lot of threads with your comments about the different scale-out options you use or suggest. Buying 3x servers with 24 disks each is a possibility for sure. What would be your suggestion going down the three-server route?



  • @DustinB3403 Hopefully some answers to your questions.

    • I need speed over capacity, so I had to go for the biggest 10K disks, as the budget doesn't allow for 15K.
    • It is not configured at the moment, but it would consist of a RAID controller and 3x expansion units.
    • The system has been designed by our technical partner's specialists, but I am not sure it is the best way forward or that it meets our needs long term, so that is why I am asking for assistance!
    • No, the system is not virtual.
    • I have an old HP P2000 SAN with Near-Line SAS 7K disks that is struggling with the demands. The volume is 30 TB.
    • I need at least 50 TB to start, but I anticipate that to double in a year.

    Hope this helps.
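
    Setting those numbers against the proposed build: 70x 1.8 TB drives in RAID 10 (the plan stated elsewhere in the thread) yield roughly 63 TB usable, which covers the 50 TB starting point but not the anticipated doubling. A minimal sanity check, assuming RAID 10 and ignoring filesystem overhead:

```python
# Headroom check: stated need is 50 TB now, doubling within a year.
DRIVES = 70
DRIVE_TB = 1.8
NEED_NOW_TB = 50
NEED_NEXT_YEAR_TB = 100

usable_tb = (DRIVES // 2) * DRIVE_TB  # RAID 10: half of raw capacity
print(f"Usable (RAID 10): {usable_tb:.0f} TB")
print(f"Meets 50 TB today: {usable_tb >= NEED_NOW_TB}")
print(f"Meets 100 TB in a year: {usable_tb >= NEED_NEXT_YEAR_TB}")
```

    On these assumptions the array is full within about a year, which is why the scale-out question matters from day one.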



  • Is SSD just outside your cost reach?

    Scott would have to say whether RAID 5 or 6 is doable with SSD at this scale.

    Where is your current bottleneck? The drives? The drive interface (SAS)?



  • @Dashrender Yes, I think that SSD for the amount of storage we need is going to be too expensive. I have £39k to spend on the entire solution at this time. That is not to say we won't have more money available later, but I also need to think of backups, etc., from that same pot of cash.
    The bottleneck is the drives; they just can't read and write fast enough.



  • @munderhill said in Cross Posting - Storage Spaces Conundrum:

    @Dashrender Yes, I think that SSD for the amount of storage we need is going to be too expensive. I have £39k to spend on the entire solution at this time. That is not to say we won't have more money available later, but I also need to think of backups, etc., from that same pot of cash.
    The bottleneck is the drives; they just can't read and write fast enough.

    So then you have to jump to SSD or a hybrid drive.

    7200 RPM is slow; yes, 10K is faster, but SSD or hybrid will be way faster.

    If the business needs speed, SSD would be where it's at.
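
    To put rough numbers behind that comparison, here is a sketch using commonly cited ballpark random-IOPS figures per device. The per-device numbers are assumptions for illustration, not benchmarks:

```python
# Rough random-IOPS comparison. Per-device figures below are
# assumed ballpark values, not measurements.
PER_DEVICE_IOPS = {
    "7.2K NL-SAS": 75,    # assumed, like the old P2000's disks
    "10K SAS": 140,       # assumed, the proposed drives
    "SATA SSD": 50_000,   # assumed, a typical enterprise SSD
}

def raid10_iops(n_drives, per_drive):
    """Approximate RAID 10 random IOPS: reads can hit every
    spindle; writes pay a mirroring penalty of 2."""
    return {"read": n_drives * per_drive,
            "write": n_drives * per_drive // 2}

# 70 spinning drives vs. just 4 SSDs:
print("70x 10K SAS:", raid10_iops(70, PER_DEVICE_IOPS["10K SAS"]))
print("4x SATA SSD:", raid10_iops(4, PER_DEVICE_IOPS["SATA SSD"]))
```

    Even on these rough assumptions, a handful of SSDs outruns the entire 70-spindle array by an order of magnitude, which is the core of the SSD argument here.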





  • Here's the drive from CDW



  • @DustinB3403 said in Cross Posting - Storage Spaces Conundrum:

    Here's the drive from CDW

    He'd need 4 of those just to get 60 TB, and that is with no RAID. I think that would blow his budget. Frankly, his budget might be too low for what he is trying to accomplish.

    Of course, a storage consultant would need to be involved who knows what the current IOPS are, what they need to be for customers to be happy, and then what it will take to get there.

    I saw talk of small files and huge files. Can those all be on the same array and still allow for the needed/wanted performance? I have no clue.
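
    To make the budget worry concrete: the sketch below uses hypothetical figures (about 15 TB per SSD, since "four drives for 60 TB" implies that class of drive, and an invented per-drive price). Substitute real quotes before drawing conclusions:

```python
import math

# Hypothetical inputs -- replace with real quotes.
BUDGET_GBP = 39_000
DRIVE_TB = 15.36          # assumed capacity per large SSD
DRIVE_PRICE_GBP = 10_000  # invented price for illustration

target_tb = 60
drives_needed = math.ceil(target_tb / DRIVE_TB)  # no RAID overhead at all
cost_gbp = drives_needed * DRIVE_PRICE_GBP

print(f"{drives_needed} drives, about £{cost_gbp:,}")
print(f"Within the £{BUDGET_GBP:,} budget: {cost_gbp <= BUDGET_GBP}")
```

    With no redundancy and nothing left over for backups or the server itself, even the invented price overruns the budget, which is the point being made above.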



  • @Dashrender Oh, I know, but 70 drives seems like it would blow a lot of money on old technology.



  • I cannot imagine a file server needing that kind of speed. Are these files being read and written constantly or something?

    A DB or logging server I could see, but not a normal file server.



  • @munderhill said in Cross Posting - Storage Spaces Conundrum:

    @DustinB3403 Hopefully some answers to your questions.

    • I need speed over capacity, so I had to go for the biggest 10K disks, as the budget doesn't allow for 15K.
    • It is not configured at the moment, but it would consist of a RAID controller and 3x expansion units.
    • The system has been designed by our technical partner's specialists, but I am not sure it is the best way forward or that it meets our needs long term, so that is why I am asking for assistance!
    • No, the system is not virtual.
    • I have an old HP P2000 SAN with Near-Line SAS 7K disks that is struggling with the demands. The volume is 30 TB.
    • I need at least 50 TB to start, but I anticipate that to double in a year.

    Hope this helps.

    Do you have IOPS numbers? How is the old SAN configured?

    Too many questions to do more than speculate.



  • @munderhill said in Cross Posting - Storage Spaces Conundrum:

    @scottalanmiller Thanks for the feedback. My long-term goal was to purchase more of these, thus providing scale out. RAID 10 is the plan at this scale. I have read a lot of threads with your comments about the different scale-out options you use or suggest. Buying 3x servers with 24 disks each is a possibility for sure. What would be your suggestion going down the three-server route?

    The Dell R730xd is really nice for building your own storage. The HPE ProLiant DL380 Gen9 is quite nice, too, although for scale out I'd often lean toward SuperMicro. Talk to OpenIO; their product likely fits your needs very well for build-your-own scale out of this nature. You need their enterprise product with the network file system option to do what you want.



  • @munderhill said in Cross Posting - Storage Spaces Conundrum:

    @Dashrender Yes, I think that SSD for the amount of storage we need is going to be too expensive. I have £39k to spend on the entire solution at this time. That is not to say we won't have more money available later, but I also need to think of backups, etc., from that same pot of cash.
    The bottleneck is the drives; they just can't read and write fast enough.

    I think that might give you the ability to move to Exablox and have a totally built-out, totally supported solution within your price envelope. If not, it should be close. In the US, the price would be $30K for the three base units, with the drives added on top of that. You would use NL-SAS drives at a fraction of the cost per TB of what you are looking at now.



  • @Dashrender said in Cross Posting - Storage Spaces Conundrum:

    Scott would have to say whether RAID 5 or 6 is doable with SSD at this scale.

    RAID 6, yes.


  • @munderhill said in Cross Posting - Storage Spaces Conundrum:

    The bottleneck is the drives; they just can't read and write fast enough.

    @munderhill said

    • The system has been designed by our technical partner's specialists, but I am not sure it is the best way forward or that it meets our needs long term, so that is why I am asking for assistance!
    • No, the system is not virtual.

    Are you talking about the current system or the new system in terms of not being virtual?


    "Serving up 80% small files 1 - 2mb and 20% large files 10GB+. "

    How many users?


    If you want a resilient system and performance, why not go for 2-4 storage nodes with the data replicated between them, so that the workload is shared between the devices? It is transparent to the users and programs, but in the background it is working.

    And even if an entire RAID controller dies, or the server spontaneously fails, the second one carries on.

    There are lots of ways of doing it, and many different providers, but I would be hesitant to put all my eggs in one basket with a single server. Think about a different approach.



  • @Breffni-Potter said in Cross Posting - Storage Spaces Conundrum:

    @munderhill said in Cross Posting - Storage Spaces Conundrum:

    The bottleneck is the drives; they just can't read and write fast enough.

    @munderhill said

    • The system has been designed by our technical partner's specialists, but I am not sure it is the best way forward or that it meets our needs long term, so that is why I am asking for assistance!
    • No, the system is not virtual.

    Are you talking about the current system or the new system in terms of not being virtual?


    "Serving up 80% small files 1 - 2mb and 20% large files 10GB+. "

    How many users?


    If you want a resilient system and performance, why not go for 2-4 storage nodes with the data replicated between them, so that the workload is shared between the devices? It is transparent to the users and programs, but in the background it is working.

    And even if an entire RAID controller dies, or the server spontaneously fails, the second one carries on.

    There are lots of ways of doing it, and many different providers, but I would be hesitant to put all my eggs in one basket with a single server. Think about a different approach.

    Doesn't this double the cost, or more? You'd need at least two times the storage.


  • @Dashrender said

    Doesn't this double the cost, or more? You'd need at least two times the storage.

    Depends on the model you use and how you do it. If you look at a Scale system, you have 3 nodes which combine to give you more usable storage than a single node.

    Yes, it could drive the cost of the storage higher, but if speed and reliability are the primary goals, one node won't cut it.
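
    As a sketch of why replication across nodes need not simply double the cost: with three nodes and two copies of every piece of data, usable capacity still comes out at 1.5x a single node. The 40 TB node size below is an arbitrary placeholder:

```python
def usable_capacity_tb(nodes, per_node_tb, replication_factor):
    """Raw capacity across all nodes divided by the number of
    copies kept of each piece of data."""
    return nodes * per_node_tb / replication_factor

single = usable_capacity_tb(1, 40, 1)   # one box, no replicas
cluster = usable_capacity_tb(3, 40, 2)  # 3 nodes, every block stored twice
print(single, cluster)  # 40.0 60.0
```

    So the cluster buys both redundancy and extra usable space for 3x the hardware of one node, rather than the 2x-for-nothing-extra that a simple mirrored pair implies.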



  • @munderhill said in Cross Posting - Storage Spaces Conundrum:

    @DustinB3403 Hopefully some answers to your questions.

    • I need speed over capacity, so I had to go for the biggest 10K disks, as the budget doesn't allow for 15K.
    • It is not configured at the moment, but it would consist of a RAID controller and 3x expansion units.
    • The system has been designed by our technical partner's specialists, but I am not sure it is the best way forward or that it meets our needs long term, so that is why I am asking for assistance!
    • No, the system is not virtual.
    • I have an old HP P2000 SAN with Near-Line SAS 7K disks that is struggling with the demands. The volume is 30 TB.
    • I need at least 50 TB to start, but I anticipate that to double in a year.

    Hope this helps.

    Well, I think one part of the performance problem is that SAN. How is it connected to the hosts? Going to that many drive shelves doesn't make sense to me; at some point the external connections (even if they are SAS/SATA) become a bottleneck. Stick to what people here are recommending rather than what a vendor is telling you to buy!



  • @Breffni-Potter said in Cross Posting - Storage Spaces Conundrum:

    @Dashrender said

    Doesn't this double the cost, or more? You'd need at least two times the storage.

    Depends on the model you use and how you do it. If you look at a Scale system, you have 3 nodes which combine to give you more usable storage than a single node.

    Yes, it could drive the cost of the storage higher, but if speed and reliability are the primary goals, one node won't cut it.

    Why won't one node do it? Where is the bottleneck in one node? We really don't know enough from the OP to know where the bottleneck really is; we've only been told that it's the disk throughput, but that's not enough information. Number of users simultaneously accessing, how much data, how it's accessed, etc. Maybe the real bottleneck is the network; we just don't have enough information.

    OK, reliability is an issue. But you mentioned that losing a RAID card wouldn't remove access to data. That only happens if a) you have redundant RAID cards in front of that storage, or b) you have two copies of the data (meaning at least two times the needed storage).

    Since the general consensus around these parts is that RAID cards don't fail often, it's not something you make redundant within a single box. So that only leaves b: two copies of the data.

