Examples of proper utilization of SAN

DustinB3403

@Obsolesce said in Examples of proper utilization of SAN:

What's wrong with NVMe in a non-San shared nothing setup, or even a serverless database?

What?

There is nothing wrong with it, besides that it doesn't scale easily.

Edit: Also it doesn't fit the subject of "proper utilization of SAN", because that's just a standalone server.

1337

@travisdh1 said in Examples of proper utilization of SAN:

@Pete-S said in Examples of proper utilization of SAN:

If you look at it a SAN is very desirable when you're running large databases, typical of the enterprise, because you need low latency block storage.

SAN always increases latency. I'd love it to be magic, but whenever you add more connections, you add latency. There is no getting around it. If you need low latency, you always go local.

When I wrote "large database" I wasn't talking about a WordPress installation. So it's implied that if we are talking about SAN we are talking about shared block storage - meaning local storage is out.

DustinB3403

@Pete-S said in Examples of proper utilization of SAN:

@travisdh1 said in Examples of proper utilization of SAN:

@Pete-S said in Examples of proper utilization of SAN:

If you look at it a SAN is very desirable when you're running large databases, typical of the enterprise, because you need low latency block storage.

SAN always increases latency. I'd love it to be magic, but whenever you add more connections, you add latency. There is no getting around it. If you need low latency, you always go local.

When I wrote "large database" I wasn't talking about a WordPress installation. So it's implied that if we are talking about SAN we are talking about shared block storage - meaning local storage is out.

Talking about SAN at all means local storage is out. Why are you and @Obsolesce talking about local storage at all?

The only reasonable use case for SAN is with massive scale out storage requirements.

DustinB3403

Talking about vSAN, which this topic very clearly isn't based on:

@EddieJennings said in Examples of proper utilization of SAN:

Yes. I'm intending the scope of the discussion to be about a physical SAN (storage devices and their network).

A vSAN would of course take advantage of the local storage on each server and create a SAN with it. But it still functions as a SAN.

So when talking about physical products, presumably things like Dell EMC SAN products (to throw a name out), there is no reasonable need for them unless you are discussing scale-out storage requirements.

DustinB3403

@EddieJennings what conversation is going on that has you looking for more information regarding SAN (products, I assume)? A SAN isn't something you can purchase; it's something you have to build.

1337

@DustinB3403 said in Examples of proper utilization of SAN:

The only reasonable use case for SAN is with massive scale out storage requirements.

Wrong. Low latency shared block storage for OLTP applications doesn't have to be massive to make sense. It just needs high performance requirements. Also, for instance, an HPC cluster might fit in one rack but need a high performance storage solution.

scottalanmiller

Just Google: When to Consider a SAN

Et voilà, first hit.

scottalanmiller

@DustinB3403 said in Examples of proper utilization of SAN:

Large scale out storage is the only logical use. Storage way above what could be fit in a single server.

SAN is scale up, not scale out.

scottalanmiller

@Dashrender said in Examples of proper utilization of SAN:

@DustinB3403 said in Examples of proper utilization of SAN:

@davide-bonavita said in Examples of proper utilization of SAN:

We deployed a StarWind vSAN in HA to store some critical VMs, and it works quite well (cf. the "StarWind Virtual SAN Installation and Configuration of HyperConverged 2 Nodes with Hyper-V Cluster" technical paper).

While that is a good example of a use case for deploying a SAN solution, I think @EddieJennings is referring to physical SAN products and not the logical vSAN solutions that we know about today.

OK, that brings up a good point - are physical SANs even really worth it much anymore today considering the abilities of vSANs? I mean, I'm sure there are times where it can be worthwhile - but likely not for anyone really hanging out on these forums.

vSANs were there first. Physical SAN came later, by definition. We get all weird when talking about SANs, but in the real world, everything is software first and appliance later. NAS is an appliance of a file server. SAN is an appliance of a block storage server. SAN became "so famous" and so treated as a magic black box, that people had to go back and rename the original product a vSAN so that people would know what it was. Would be the same as calling a normal file server a vNAS today. Sounds stupid, but that's how stupid vSAN is.

We've never had a time when vSAN wasn't everywhere.

scottalanmiller

@Pete-S said in Examples of proper utilization of SAN:

If you look at it a SAN is very desirable when you're running large databases, typical of the enterprise, because you need low latency block storage.

Actually that's where you avoid it. Specifically for that reason. Because for databases the additional latency and risk of the SAN doesn't normally make sense. That's why high performance databases were one of the first places to abandon SAN: they needed something faster.

Remember, SAN is the slow option, not the fast one. Simple physics says that a SAN has to be slower than its local equivalent. Maybe not a lot, but it's physically impossible for it to be as fast or faster. It has to be at least a tiny bit slower. SAN is always chosen despite performance losses.

scottalanmiller

@DustinB3403 said in Examples of proper utilization of SAN:

@Pete-S said in Examples of proper utilization of SAN:

If you look at it a SAN is very desirable when you're running large databases, typical of the enterprise

The part quoted is the only bit that makes sense.

Not really. Databases tend to need speed and reliability, they all replicate at the application layer, and they can't be replicated blindly at the storage layer. Nor is there a real use case for multiple RDBMS instances seeing a single pool of storage - the locking problems would be terrible for performance. So for myriad reasons, we would expect databases to be among the worst use cases for SAN. And in the real world, that's exactly where we saw SANs avoided first. Databases were where we exposed the needs for local storage options first.

Of course, the problem is, most places just throw money at solutions until even the worst option works. Then people who see that working assume it was a good choice, instead of a bad one, because they see it "in use" rather than see the evaluation, cost and risk involved.

scottalanmiller

@Pete-S said in Examples of proper utilization of SAN:

So it's implied that if we are talking about SAN we are talking about shared block storage - meaning local storage is out.

Local storage is the best way to share block storage. In no way whatsoever does needing shared block imply that a SAN is a need.

https://smbitjournal.com/2013/07/replicated-local-storage/

Much of the world runs on shared local. Some do it by using block protocols like a vSAN, some do it without block protocols through things like Gluster or SCRIBE. But RLS is the best way to get high performance shared block, if you need shared.

scottalanmiller

@Pete-S said in Examples of proper utilization of SAN:

@DustinB3403 said in Examples of proper utilization of SAN:

The only reasonable use case for SAN is with massive scale out storage requirements.

Wrong. Low latency shared block storage for OLTP applications doesn't have to be massive to make sense. It just needs high performance requirements. Also, for instance, an HPC cluster might fit in one rack but need a high performance storage solution.

As Travis had pointed out, SAN guarantees more latency, not less. So any low latency requirements make SAN less desirable, not more. Your belief that SAN provides more performance than local storage does is causing you to think SAN would solve problems where it doesn't (or doesn't as well as more obvious solutions.)

This is basic computing physics.... the same storage local or distant has to be faster when local. Maybe not much faster, but it can't be slower or the same. There is just less latency - less wire, fewer hops.

Using SAN implies only two things.... distant, and block. Anything else assumed about SAN is just incorrect, it's not part of SAN.

How most people approach it is that they assume their SAN is crazy expensive and their local is cheap and then use that to show that SAN is "faster" by comparing apples and oranges. But the same NVMe drive, when hooked to a separate server and shared over even NVMe-oF, is a tiny bit slower than when it is local.
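To make the "less wire, fewer hops" argument concrete, here is a minimal sketch of the comparison: the same device, once local and once behind a fabric. The function name and every number in it are illustrative assumptions, not measurements from any real hardware.

```python
# Toy latency model: same device, local vs. reached over a storage fabric.
# All figures are hypothetical placeholders, not benchmark results.

def access_latency_us(device_us: float, hops: int = 0, per_hop_us: float = 0.0) -> float:
    """Total latency = device latency + latency added by each network hop."""
    return device_us + hops * per_hop_us

DEVICE_US = 80.0    # assumed read latency of the drive itself
PER_HOP_US = 10.0   # assumed cost added by each hop (NIC, switch, target)

local = access_latency_us(DEVICE_US)
remote = access_latency_us(DEVICE_US, hops=2, per_hop_us=PER_HOP_US)

print(f"local:  {local:.0f} us")
print(f"remote: {remote:.0f} us")
```

As long as each hop costs more than zero, the remote path can only tie or lose against the local one on identical hardware. The gap may be small, but it never goes negative.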

1337

@scottalanmiller said in Examples of proper utilization of SAN:

@Pete-S said in Examples of proper utilization of SAN:

@DustinB3403 said in Examples of proper utilization of SAN:

The only reasonable use case for SAN is with massive scale out storage requirements.

Wrong. Low latency shared block storage for OLTP applications doesn't have to be massive to make sense. It just needs high performance requirements. Also, for instance, an HPC cluster might fit in one rack but need a high performance storage solution.

As Travis had pointed out, SAN guarantees more latency, not less. So any low latency requirements make SAN less desirable, not more. Your belief that SAN provides more performance than local storage does is causing you to think SAN would solve problems where it doesn't (or doesn't as well as more obvious solutions.)

This is basic computing physics.... the same storage local or distant has to be faster when local. Maybe not much faster, but it can't be slower or the same. There is just less latency - less wire, fewer hops.

Using SAN implies only two things.... distant, and block. Anything else assumed about SAN is just incorrect, it's not part of SAN.

How most people approach it is that they assume their SAN is crazy expensive and their local is cheap and then use that to show that SAN is "faster" by comparing apples and oranges. But the same NVMe drive, when hooked to a separate server and shared over even NVMe-oF, is a tiny bit slower than when it is local.

You assumed I made assumptions I didn't make.

Yes, local is always faster but local is not shared. So then it all becomes just a question of how we share and access the data. If we put the shared storage on dedicated servers we have a SAN. If we put the storage on the same servers that we are running compute on, we have hyperconverged storage.

In the first case we can optimize both hardware and software and it's only running this single task. In the second case we usually run both compute and storage on the same hardware. By pure logic the first case, the SAN, has to be the higher performing option.

Looking at replicated local storage though, that implies that we can fit the storage on one server. Of course this is almost as fast as local storage (assuming synchronous replication). But it also means that SAN advantage of consolidating storage is lost.

So in order of performance on equal hardware we have:

1. local storage
2. replicated local storage
3. SAN
4. vSAN and similar

Of course then we have local storage cache for vSAN solutions and other things to mess this up. Also, of course, in real life it's the cost of it all that determines what is the best solution.

When I said that the SAN is the low latency option for OLTP or HPC, it's compared to things like Gluster or vSAN - as they are comparable when it comes to storage consolidation. But you need enough workloads and servers for it to make sense to consolidate. Consolidation increases utilization (lower cost) by sacrificing some performance and increasing complexity.
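The utilization side of that trade-off can be sketched with simple arithmetic. This is only an illustration: the per-server usage figures and capacities below are made up, and the helper name is hypothetical.

```python
# Toy utilization comparison: per-server local storage vs. one consolidated pool.
# All capacities and usage figures are hypothetical.

def utilization(used_tb: list[float], provisioned_tb: float) -> float:
    """Fraction of provisioned capacity that is actually in use."""
    return sum(used_tb) / provisioned_tb

used = [2.0, 6.0, 1.0, 3.0]   # assumed TB in use on each of four servers

# Local: every server is sized for the largest expected need.
PER_SERVER_TB = 8.0
local_util = utilization(used, PER_SERVER_TB * len(used))

# Consolidated: one pool sized for the total, with some headroom.
POOL_TB = 16.0
pooled_util = utilization(used, POOL_TB)

print(f"local utilization:  {local_util:.1%}")
print(f"pooled utilization: {pooled_util:.1%}")
```

The pool wins on utilization only because spare capacity is shared across workloads. The sketch says nothing about the latency or complexity paid for that sharing, which is the other half of the trade-off.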

scottalanmiller

@Pete-S said in Examples of proper utilization of SAN:

Yes, local is always faster but local is not shared

But local CAN be shared. SAN is not inherently shared either, but CAN be shared. Both can be either shared or not shared.

scottalanmiller

@Pete-S said in Examples of proper utilization of SAN:

In the first case we can optimize both hardware and software and it's only running this single task. In the second case we usually run both compute and storage on the same hardware. By pure logic the first case, the SAN, has to be the higher performing option.

That is in no way logical. That actually is both incredibly unlikely by logic, and totally not true in the real world. I have no idea what kind of logic would make you think that making them far away from each other would be fast and local slow based on "dedicated hardware" when the resource needs of storage are so tiny that they're of no consequence in modern devices.

1337

@scottalanmiller said in Examples of proper utilization of SAN:

@Pete-S said in Examples of proper utilization of SAN:

Yes, local is always faster but local is not shared

But local CAN be shared. SAN is not inherently shared either, but CAN be shared. Both can be either shared or not shared.

Local or not local has to be a question of where the data is used. It's always local somewhere.

scottalanmiller

@Pete-S said in Examples of proper utilization of SAN:

Looking at replicated local storage though, that implies that we can fit the storage on one server. Of course this is almost as fast as local storage (assuming synchronous replication). But it also means that SAN advantage of consolidating storage is lost.

RLS does imply that, yes. But SAN does, too. SAN and RLS both have the "fit it in one server" limitation.

In almost all cases, you need SAN to replicate, too. So any replication overhead in 99.9% of cases is the same RLS vs SAN. If your SAN doesn't need to be replicated, then chances are your local storage does not either. There are extreme cases where you need shared storage that isn't replicated for reliability, where SAN has a consolidation advantage for lower criticality workloads.

scottalanmiller

@Pete-S said in Examples of proper utilization of SAN:

@scottalanmiller said in Examples of proper utilization of SAN:

@Pete-S said in Examples of proper utilization of SAN:

Yes, local is always faster but local is not shared

But local CAN be shared. SAN is not inherently shared either, but CAN be shared. Both can be either shared or not shared.

Local or not local has to be a question of where the data is used. It's always local somewhere.

Yes, local to the computer or local to a remote dedicated storage server.

RLS means that the data is LOCAL to multiple locations.

scottalanmiller

@Pete-S said in Examples of proper utilization of SAN:

So in order of performance on equal hardware we have:

1. local storage
2. replicated local storage
3. SAN
4. vSAN and similar

I would not break it down that way. RLS can be as fast as any other local storage, if you don't require full sync. Async is an option and can have no performance overhead.

vSAN is not slower than SAN. A SAN and vSAN are the same speed. And in the real world, since vSAN options are more flexible, they are actually faster.
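The sync versus async point about RLS can be sketched as a rough model of when a write is acknowledged. It assumes the local and remote writes are issued in parallel. The function name and all timings are hypothetical, chosen only to show the shape of the difference.

```python
# Toy model of write acknowledgement latency for replicated local storage.
# Assumes local and remote writes are issued in parallel; numbers are made up.

def write_ack_latency_us(local_write_us: float, link_rtt_us: float,
                         remote_write_us: float, synchronous: bool) -> float:
    """Time until the application sees the write as complete."""
    if synchronous:
        # Sync: wait for the slower of the local write and the remote round trip.
        return max(local_write_us, link_rtt_us + remote_write_us)
    # Async: acknowledge after the local write; replication happens later.
    return local_write_us

LOCAL_US, RTT_US, REMOTE_US = 20.0, 50.0, 20.0   # assumed timings

sync_us = write_ack_latency_us(LOCAL_US, RTT_US, REMOTE_US, synchronous=True)
async_us = write_ack_latency_us(LOCAL_US, RTT_US, REMOTE_US, synchronous=False)

print(f"sync ack:  {sync_us:.0f} us")
print(f"async ack: {async_us:.0f} us")
```

In this model async replication adds nothing to the acknowledged write latency, at the cost of a window in which the replica lags behind. Whether that window is acceptable depends on the workload.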