Application clustering VS RAID with modern SSD



  • I have a POF setup of an enterprise server (IBM x3550m4) running on just one Intel P3600 1.6 TB in PCIe card form factor, hosting an ERP with an MS SQL database and a file server with ~400 GB of regular office files. I have run several benchmarks over the last few months and everything is running just fine.

    I will soon put this into production, and I have a spare x3550m4 (new, with 1.2 TB of 10k spindles). Within my budget (~1500 euro, less is better), I’m considering two architectural patterns to get better reliability than the single-node POF:

    • Buy another enterprise SSD (maybe the Samsung 1725a, also 1.6 TB) for the other node, set up SQL replication for the DB and a file server sync daemon (DRFS, Syncthing, etc.), and fail over the DNS if the first node fails (in any way). We can tolerate some hours of downtime;
    • Buy a couple of smaller SAS SSDs and RAID 1 them, use those as the primary storage, and use the P3600 or the spindles for the replicas.

    A couple of considerations about that:

    • I think that today’s enterprise-class PCIe SSDs (like the ones that I mentioned), in the first 5 years from deployment and with the right overprovisioning, are almost as reliable as a RAID controller, because they have a full solid-state storage controller, no moving parts and a very reassuring declared MTBF;
    • Those services can tolerate even a day of downtime every 3-5 years without major impact on revenues;
    • I don’t see the point of RAID, or of node-level reliability in general, if I can rely on better application clustering. This idea was inspired by hyperscaler/Open Compute machines, which run on a single PSU etc. because their reliability is achieved at a higher level.

    What do you think about it? Any hints are welcome!



  • @scottalanmiller I’m sure you’ve written something about application clustering somewhere.



  • @francesco-provino said in Application clustering VS RAID with modern SSD:

    @scottalanmiller I’m sure you’ve written something about application clustering somewhere.

    Something, but not anything covering this specifically.



  • There are multiple layers that can address needs and remove a need for RAID. Essentially, all of them are things that are "more than" RAID.



  • In many cases, such as a two node application cluster, you can certainly do without RAID but typically do so by effectively replicating RAID in a different way.

    So let's take a MySQL two node cluster as an example. Doing MySQL clustering is fine but for all intents and purposes requires a combination of manual mirroring and the application clustering acting like Network RAID 1. It's not actually RAID, but is acting like it.

    The reason you use RAID normally, local RAID that is, is to avoid node rebuilds. Whether it's Network RAID, or application mirroring or whatever, there is an impact for a node rebuild. If you have no local RAID, those rebuilds become incredibly frequent rather than rare to non-existent.
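
To make the "rebuild frequency" point above concrete, here is a minimal sketch. It is not from the thread: the failure probabilities are hypothetical placeholders, failures are assumed independent, and the function names are made up for illustration. It only compares how often a node would need a full rebuild with and without a local mirror.

```python
def p_node_rebuild_no_raid(p_drive: float, p_other: float) -> float:
    """Annual probability that a single-drive node needs a full rebuild.

    Either the one drive fails or some other fatal component fails.
    """
    return 1.0 - (1.0 - p_drive) * (1.0 - p_other)


def p_node_rebuild_raid1(p_drive: float, p_other: float) -> float:
    """Annual probability that a node with a two-drive mirror needs a rebuild.

    Crude simplification: the storage only counts as lost if both drives fail
    independently within the same year. Prompt replacement of a failed drive
    and correlated failures are both ignored.
    """
    p_both_drives = p_drive * p_drive
    return 1.0 - (1.0 - p_both_drives) * (1.0 - p_other)


if __name__ == "__main__":
    # Hypothetical annual failure probabilities, for illustration only.
    P_DRIVE = 0.02   # assumed: one SSD failing within a year
    P_OTHER = 0.01   # assumed: board/CPU/PSU/controller failing within a year

    no_raid = p_node_rebuild_no_raid(P_DRIVE, P_OTHER)
    raid1 = p_node_rebuild_raid1(P_DRIVE, P_OTHER)
    print(f"rebuild probability without local RAID: {no_raid:.4%}")
    print(f"rebuild probability with local RAID 1:  {raid1:.4%}")
```

With real numbers for your own hardware the gap may be larger or smaller; the sketch only shows which term the local mirror removes from the rebuild rate.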



  • For some things, like stateless app servers, rebuilds might not affect the app at all. So skipping RAID might be totally feasible with little to no impact. But for something like a database, where other nodes have to recreate data over the network, it might be pretty negative.



  • Thanks @scottalanmiller . I think I’ll start with DRBD over two single PCIe NVMe cards (one in each node) synced through an InfiniBand link (InfiniBand is cheap today!), and I will slowly move every capable workload to application clustering.



  • @francesco-provino said in Application clustering VS RAID with modern SSD:

    Thanks @scottalanmiller . I think I’ll start with DRBD over two single PCIe NVMe cards (one in each node) synced through an InfiniBand link (InfiniBand is cheap today!), and I will slowly move every capable workload to application clustering.

    Just remember, application clustering at that level is normally a large cost, whereas RAID is a low cost. And application clustering is slow, while RAID is fast.

    But DRBD cannot be part of application clustering. DRBD is a RAID system. So you can, of course, use DRBD and application clustering, but they are two very different things.

    Local RAID is about speed and low cost. DRBD is Network RAID, so high cost and introduces latency. Application clustering doesn't need RAID of any sort, as it is already clustered. You would use totally independent storage for application clustering.



  • But consider cost and risk of traditional RAID 1 vs. mirrored application clustering for a workload like MariaDB (just as a sample.)

    Base Server: $10K


    Application Clustering: You need two servers, so your cost is $20,000. And that's assuming application clustering is available for the workload, and free. It is free with MariaDB, so this is a good use case.

    Traditional RAID: You need an extra SSD for your one server, so say add $500 onto your base cost for a total of $10,500. That's a fraction of the cost of the application clustering.
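
The arithmetic in the post above can be written out as a small sketch. The dollar figures are the ones from the post; the function names and the optional per-node license parameter are illustrative assumptions, not anything the poster specified.

```python
BASE_SERVER_USD = 10_000  # base server price from the post
EXTRA_SSD_USD = 500       # second SSD for the RAID 1 mirror, from the post


def clustering_cost(base: int, nodes: int = 2, license_per_node: int = 0) -> int:
    """Cost of an application cluster: every node is a complete server.

    license_per_node is a hypothetical knob; it is 0 for the MariaDB case,
    where the post says clustering is free.
    """
    return nodes * (base + license_per_node)


def raid1_cost(base: int, extra_drive: int) -> int:
    """Cost of a single server plus the extra drive needed for RAID 1."""
    return base + extra_drive


cluster_usd = clustering_cost(BASE_SERVER_USD)
raid_usd = raid1_cost(BASE_SERVER_USD, EXTRA_SSD_USD)
print(f"clustering: ${cluster_usd:,}  RAID 1: ${raid_usd:,}")
print(f"clustering costs {cluster_usd / raid_usd:.2f}x the RAID 1 setup")
```

Setting `license_per_node` to a non-zero value shows how quickly the gap widens for workloads where clustering is a paid add-on.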



  • Performance:

    Application Clustering: Because data has to be synced over the network, there is a performance hit from application clustering. For enough money, you can minimize this greatly, but it just costs more and more to do so.

    Traditional RAID: RAID 1 is faster than no-RAID. And moving to things like RAID 10 can speed you up even more. So rather than taking a performance hit, RAID for protection of this nature will speed you up.



  • Reliability:

    Application Clustering requires everything be duplicated, even CPUs and RAM, so there are some benefits to reliability improvements from the high cost of redundancy. But typically these are minor, as the extra redundancy is typically around pieces that rarely fail. It's a brute force redundancy, rather than a finesse redundancy.

    RAID targets the pieces of the system that are most fragile and critical - the storage. It is the drives failing alone that causes full data loss, and drives represent the majority of hardware failures. So you get 99% of the protection, at a fraction of the price.

    Because RAID is so mature and reliable, there is an argument that that combined with its insane speed, cache protection options and such will actually be safer than application layer protections that are comparable.



  • Effort:

    Application Clustering requires a lot of expertise, and unique expertise to each and every workload, which must then be monitored, maintained, and updated to keep working. This often triples or quadruples the effort to build and maintain a workload and in extreme cases can be far worse. This is an ongoing effort requiring expertise around maintaining clustering and dealing with edge situations, software changes, and so forth. This is generally outside of the skill set of many IT shops, depending on the workloads. Some clustering, like Windows AD is really simple, some like many databases, is very hard.

    RAID: zero effort. Tell it to turn on, ignore it. There is nothing to know or do and the system can be safely turned over even to non-technical staff to maintain.



  • Cost of Licensing:

    Application Clustering: this is often a costly add-on to many software products (and is not always available), and often requires extra software purchases. For example, with MS SQL Server it generally requires more Windows Server and SQL Server licenses, plus additional cost for the application clustering layer. So for many workloads, and any on Windows, the licensing cost soars rapidly.

    RAID: no known products have any licensing costs tied to block storage redundancy. So there is no cost of this in the real world.



  • @scottalanmiller I see your points, but let me just add some additional information about the current configuration:

    • we already have three identical servers and one NVMe PCIe card;
    • I want to use DRBD replication only for stuff that cannot be made highly available without upgrades like SQL Server Standard etc. I’m thinking of using Syncthing for file replication.


  • @scottalanmiller said in Application clustering VS RAID with modern SSD:

    But consider cost and risk of traditional RAID 1 vs. mirrored application clustering for a workload like MariaDB (just as a sample.)

    Base Server: $10K


    Application Clustering: You need two servers, so your cost is $20,000. And that's assuming application clustering is available for the workload, and free. It is free with MariaDB, so this is a good use case.

    Traditional RAID: You need an extra SSD for your one server, so say add $500 onto your base cost for a total of $10,500. That's a fraction of the cost of the application clustering.

    I already have servers, they are the same spec and out of vendor support.



  • @scottalanmiller said in Application clustering VS RAID with modern SSD:

    Performance:

    Application Clustering: Because data has to be synced over the network, there is a performance hit from application clustering. For enough money, you can minimize this greatly, but it just costs more and more to do so.

    Traditional RAID: RAID 1 is faster than no-RAID. And moving to things like RAID 10 can speed you up even more. So rather than taking a performance hit, RAID for protection of this nature will speed you up.

    Async replication has almost NO performance hit on the master.



  • @scottalanmiller said in Application clustering VS RAID with modern SSD:

    Reliability:

    Application Clustering requires everything be duplicated, even CPUs and RAM, so there are some benefits to reliability improvements from the high cost of redundancy. But typically these are minor, as the extra redundancy is typically around pieces that rarely fail. It's a brute force redundancy, rather than a finesse redundancy.

    RAID targets the pieces of the system that are most fragile and critical - the storage. It is the drives failing alone that causes full data loss, and drives represent the majority of hardware failures. So you get 99% of the protection, at a fraction of the price.

    Because RAID is so mature and reliable, there is an argument that that combined with its insane speed, cache protection options and such will actually be safer than application layer protections that are comparable.

    This is unfair, you are really comparing apples to oranges: in one case you have a completely shared-nothing cluster, in the other you are just protected from storage disk failure. What if the CPU/mobo/controller/PSU/etc. fails?



  • @scottalanmiller said in Application clustering VS RAID with modern SSD:

    Effort:

    Application Clustering requires a lot of expertise, and unique expertise to each and every workload, which must then be monitored, maintained, and updated to keep working. This often triples or quadruples the effort to build and maintain a workload and in extreme cases can be far worse. This is an ongoing effort requiring expertise around maintaining clustering and dealing with edge situations, software changes, and so forth. This is generally outside of the skill set of many IT shops, depending on the workloads. Some clustering, like Windows AD is really simple, some like many databases, is very hard.

    RAID: zero effort. Tell it to turn on, ignore it. There is nothing to know or do and the system can be safely turned over even to non-technical staff to maintain.

    Mostly true, but very different stuff.



  • @scottalanmiller said in Application clustering VS RAID with modern SSD:

    Cost of Licensing:

    Application Clustering: this is often a costly add-on to many software products (and is not always available), and often requires extra software purchases. For example, with MS SQL Server it generally requires more Windows Server and SQL Server licenses, plus additional cost for the application clustering layer. So for many workloads, and any on Windows, the licensing cost soars rapidly.

    RAID: no known products have any licensing costs tied to block storage redundancy. So there is no cost of this in the real world.

    This is true, and I’m trying to avoid that cost via DRBD replication.



  • @francesco-provino said in Application clustering VS RAID with modern SSD:

    • I want to use DRBD replication only for stuff that cannot be made highly available without upgrades like SQL Server Standard etc. I’m thinking of using Syncthing for file replication.

    That can work, but things like rsync are often better for that. Less latency.



  • @francesco-provino said in Application clustering VS RAID with modern SSD:

    @scottalanmiller said in Application clustering VS RAID with modern SSD:

    But consider cost and risk of traditional RAID 1 vs. mirrored application clustering for a workload like MariaDB (just as a sample.)

    Base Server: $10K


    Application Clustering: You need two servers, so your cost is $20,000. And that's assuming application clustering is available for the workload, and free. It is free with MariaDB, so this is a good use case.

    Traditional RAID: You need an extra SSD for your one server, so say add $500 onto your base cost for a total of $10,500. That's a fraction of the cost of the application clustering.

    I already have servers, they are the same spec and out of vendor support.

    That's very different. If the goal is to use "whatever hardware is lying around" rather than designing for a specific use case, then anything that fits the needs of the hardware might make sense.



  • @francesco-provino said in Application clustering VS RAID with modern SSD:

    @scottalanmiller said in Application clustering VS RAID with modern SSD:

    Performance:

    Application Clustering: Because data has to be synced over the network, there is a performance hit from application clustering. For enough money, you can minimize this greatly, but it just costs more and more to do so.

    Traditional RAID: RAID 1 is faster than no-RAID. And moving to things like RAID 10 can speed you up even more. So rather than taking a performance hit, RAID for protection of this nature will speed you up.

    Async replication has almost NO performance hit on the master.

    HA Application Clustering is always sync, though, not async. The application needs to wait for confirmation from its peers before unlocking, or else it is not competing with RAID for data protection.
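
The sync versus async trade-off being argued here can be sketched with a toy latency model. Everything in it is an assumption for illustration: the timings are made up, the names are hypothetical, and real replication stacks behave in more complicated ways. It only shows why async costs the primary almost nothing while sync pays for the network round trip, and what async gives up in exchange.

```python
from dataclasses import dataclass


@dataclass
class ReplicationLink:
    local_write_ms: float    # assumed time to persist a write on the local drive
    network_rtt_ms: float    # assumed round trip to the peer node
    remote_write_ms: float   # assumed time for the peer to persist the write


def sync_commit_ms(link: ReplicationLink) -> float:
    """Synchronous: the commit returns only once the peer has confirmed.

    Local and remote writes are modelled as running in parallel, so the
    commit waits for whichever path is slower.
    """
    return max(link.local_write_ms, link.network_rtt_ms + link.remote_write_ms)


def async_commit_ms(link: ReplicationLink) -> float:
    """Asynchronous: the commit returns as soon as the local write is durable."""
    return link.local_write_ms


def async_writes_at_risk(writes_per_second: float, replication_lag_ms: float) -> float:
    """Writes acknowledged locally but not yet on the peer.

    These are the writes that would be lost if the primary died right now.
    """
    return writes_per_second * replication_lag_ms / 1000.0


if __name__ == "__main__":
    # Hypothetical numbers: fast NVMe, low-latency interconnect.
    link = ReplicationLink(local_write_ms=0.1, network_rtt_ms=0.5, remote_write_ms=0.1)
    print("sync commit latency (ms): ", sync_commit_ms(link))
    print("async commit latency (ms):", async_commit_ms(link))
    print("async writes at risk:     ", async_writes_at_risk(2000, 50))
```

In this model both posters are right about their own case: async adds no commit latency on the master, and sync is the only mode in which no acknowledged write can be missing on the peer.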



  • @francesco-provino said in Application clustering VS RAID with modern SSD:

    @scottalanmiller said in Application clustering VS RAID with modern SSD:

    Reliability:

    Application Clustering requires everything be duplicated, even CPUs and RAM, so there are some benefits to reliability improvements from the high cost of redundancy. But typically these are minor, as the extra redundancy is typically around pieces that rarely fail. It's a brute force redundancy, rather than a finesse redundancy.

    RAID targets the pieces of the system that are most fragile and critical - the storage. It is the drives failing alone that causes full data loss, and drives represent the majority of hardware failures. So you get 99% of the protection, at a fraction of the price.

    Because RAID is so mature and reliable, there is an argument that that combined with its insane speed, cache protection options and such will actually be safer than application layer protections that are comparable.

    This is unfair, you are really comparing apples to oranges: in one case you have a completely shared-nothing cluster, in the other you are just protected from storage disk failure. What if the CPU/mobo/controller/PSU/etc. fails?

    It may be apples and oranges, but that's where it starts. It's two very different things and under normal circumstances you never consider application replication unless you have RAID. RAID is cheap and really effective.

    Although it might seem like apples and oranges, it's like turbo charging or getting a bigger engine - very different techniques, same goal. Here it is two reliability techniques, one goal. The point is, of the two, 90% of the time RAID is actually more effective. It might SEEM like having all those other parts with extra redundancy would do a lot for you, but in the real world it doesn't do all that much. And the risks you take on by avoiding the RAID will rarely be offset by all that extra redundancy.

    Think of it like an airplane... the thing you want redundant is the engine. Sure extra seats, steering wheels, wings, wheels, etc. all sound great, and if they are free then sure, but 95% of the time it is the engine that fails, not any of those things. So you can't get distracted by the "what if X happens", you have to remain focused on the resultant reliability and I think that you will find that RAID is either around a break even or even safer than a non-RAID general redundancy approach at the same redundancy level (single mirror, double mirror, etc.)



  • @francesco-provino said in Application clustering VS RAID with modern SSD:

    @scottalanmiller said in Application clustering VS RAID with modern SSD:

    Effort:

    Application Clustering requires a lot of expertise, and unique expertise to each and every workload, which must then be monitored, maintained, and updated to keep working. This often triples or quadruples the effort to build and maintain a workload and in extreme cases can be far worse. This is an ongoing effort requiring expertise around maintaining clustering and dealing with edge situations, software changes, and so forth. This is generally outside of the skill set of many IT shops, depending on the workloads. Some clustering, like Windows AD is really simple, some like many databases, is very hard.

    RAID: zero effort. Tell it to turn on, ignore it. There is nothing to know or do and the system can be safely turned over even to non-technical staff to maintain.

    Mostly true, but very different stuff.

    That it is different really doesn't matter. Remember, the goal is resulting reliability. So while it is different, we don't care, we care that the better reliability or equal reliability is cheaper, easier, and faster.



  • @francesco-provino said in Application clustering VS RAID with modern SSD:

    @scottalanmiller said in Application clustering VS RAID with modern SSD:

    Cost of Licensing:

    Application Clustering: this is often a costly add-on to many software products (and is not always available), and often requires extra software purchases. For example, with MS SQL Server it generally requires more Windows Server and SQL Server licenses, plus additional cost for the application clustering layer. So for many workloads, and any on Windows, the licensing cost soars rapidly.

    RAID: no known products have any licensing costs tied to block storage redundancy. So there is no cost of this in the real world.

    This is true, and I’m trying to avoid that cost via DRBD replication.

    DRBD replication can only avoid licensing costs (in most cases) if you don't have the workloads running in the second location. Which, of course, you can do. But that's really not how it is designed to be used. But it's certainly valid. But this results in something drastically different as you pointed out about the app vs RAID earlier. In your DRBD non-licensed model, if one drive fails, your application fails. Giving you dramatically less protection than regular RAID.

    So again, if this is about reusing what you have, then cost savings might trump everything else. If it is about planning for alternative high availability approaches, I think the "don't consider anything until you have RAID locally" mantra remains valid for all but the rarest cases.



  • Again, valid point. The alternative is to put another PCIe SSD in the first node and RAID it (mdadm). And, of course, buy another TWO of them and put them in the other node, in case the first one fails. This is going to cost much more…



  • @francesco-provino said in Application clustering VS RAID with modern SSD:

    Again, valid point. The alternative is to put another PCIe SSD in the first node and RAID it (mdadm). And, of course, buy another TWO of them and put them in the other node, in case the first one fails. This is going to cost much more…

    That's misleading because it's not the real alternative. If you were okay with no RAID, but two nodes, you are okay with one node and RAID. Your leap to needing a second node doesn't make sense, it's a level of reliability you don't require. So that's an apple to the orange.

    It's a second SSD in the single node and NO second node that is comparable, and easily safer, than two nodes without RAID. Even if it isn't safer, it's REALLY close.

    So you can't use the "need a second node with RAID" scenario as a comparison for anything, it's outside of the scope and not roughly comparable. So ignore it, it's not relevant.

    Your options are... one node with RAID, or two nodes with network RAID. Single node with regular RAID is faster, simpler, cheaper, and easily comparable if not better for reliability. A second node with no RAID is just vastly impracticable unless it is somehow free while having RAID is not.



  • @scottalanmiller said in Application clustering VS RAID with modern SSD:

    @francesco-provino said in Application clustering VS RAID with modern SSD:

    Again, valid point. The alternative is to put another PCIe SSD in the first node and RAID it (mdadm). And, of course, buy another TWO of them and put them in the other node, in case the first one fails. This is going to cost much more…

    That's misleading because it's not the real alternative. If you were okay with no RAID, but two nodes, you are okay with one node and RAID. Your leap to needing a second node doesn't make sense, it's a level of reliability you don't require. So that's an apple to the orange.

    It's a second SSD in the single node and NO second node that is comparable, and easily safer, than two nodes without RAID. Even if it isn't safer, it's REALLY close.

    So you can't use the "need a second node with RAID" scenario as a comparison for anything, it's outside of the scope and not roughly comparable. So ignore it, it's not relevant.

    Your options are... one node with RAID, or two nodes with network RAID. Single node with regular RAID is faster, simpler, cheaper, and easily comparable if not better for reliability. A second node with no RAID is just vastly impracticable unless it is somehow free while having RAID is not.

    Uhm, I’m sorry but I don’t agree with you.
    The network RAID will have the same cost (always one other SSD) and BETTER reliability, because even if the mainboard/CPU/etc. fails in the first node, I will have another ready-to-go host with all I need to start my environment.

    There is also the possibility of creating TWO DRBD replica sets, one active on the first node and the other active on the second; that way, I can easily double the total CPU count and RAM available for the VMs… sort of hyperconvergence on the cheap!



  • @francesco-provino said in Application clustering VS RAID with modern SSD:

    The network RAID will have the same cost (always one other SSD) and BETTER reliability, because even if the mainboard/CPU/etc. fails in the first node, I will have another ready-to-go host with all I need to start my environment.

    I explained earlier why that isn't how reliability is measured, though. Stating a "what if" doesn't make it more reliable. Motherboards and CPUs rarely fail, so just because you protect against that, at the cost of making the storage far less reliable, doesn't make the system more reliable. It makes failover and recovery of the majority of failures far worse and more risky.

    This approach is protecting against the unlikely because it "sounds bad" rather than protecting better against the most likely because it's boring.



  • @francesco-provino said in Application clustering VS RAID with modern SSD:

    There is also the possibility of creating TWO DRBD replica sets, one active on the first node and the other active on the second; that way, I can easily double the total CPU count and RAM available for the VMs… sort of hyperconvergence on the cheap!

    It's hyperconverged whether you do that or not. HC is free, even with far more robust systems like Starwind. HC doesn't imply that you have HA or can move workloads around. Most people do that, but it's HC from the moment you go with the design here. But DRBD isn't saving you anything over normal baseline. So while this is cheap, it's not special or cheaper, and it's a well known model that under normal circumstances you would never do without local RAID because it's been analyzed heavily for decades and it just doesn't provide a logical protection versus simpler, cheaper approaches.

