ZFS Based Storage for Medium VMWare Workload
-
@donaldlandru said in ZFS Based Storage for Medium VMWare Workload:
Ok, so a little background. The storage situation at my organization is the weakest link in our network. Currently we have a single HP MSA P2000 with 12 spindles (7200 rpm) serving two separate ESXi clusters.
This is not a lot of IOPS. I easily have 100x more IOPS in the laptop I'm typing this on than in this disk configuration.
It is not uncommon for us to max out the disk i/o on 12 spindles sharing the load of almost 150 virtual machines and everyone is on board that something needs to be changed.
Yep, go all flash.
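For context, a back-of-the-envelope estimate of what those 12 spindles can actually deliver. The ~80 random IOPS per 7.2k spindle figure is a rule-of-thumb assumption, not a measured number:

```python
# Rough random-IOPS budget for the current 12-spindle MSA P2000 setup.
# Assumption: ~80 random IOPS per 7.2k NL spindle (common rule of thumb).
SPINDLES = 12
IOPS_PER_SPINDLE = 80
VMS = 150

raw_read_iops = SPINDLES * IOPS_PER_SPINDLE      # best case: 100% random reads
iops_per_vm = raw_read_iops / VMS                # average share per VM

print(f"Aggregate read IOPS: {raw_read_iops}")       # 960
print(f"IOPS per VM at {VMS} VMs: {iops_per_vm:.1f}")  # 6.4
```

At roughly six IOPS per VM before any RAID write penalty, it is no surprise the array is maxed out; a single consumer SSD delivers orders of magnitude more.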
Here is what the business cares about in the solution: a reliable solution that provides the necessary resources for the development environments to operate effectively (read: we do not do performance testing in-house, as by its very nature it is very much a your-mileage-may-vary depending on your deployment situation).
If you're a VMware shop doing QA testing, there are some workflows you can build with Linked Clones and Instant Clones (no I/O overhead; ~400 ms to clone a VM with zero memory or disk footprint, as it runs a journal log for both) to reduce disk load, speed up processes, and in general make everyone's life easier.
In addition to the business requirements, I have added my own requirements that my boss agrees with and blesses.
- Operations and Development must be on separate storage devices
This isn't necessary at 150 VMs. Just get something all-flash, and if it's that big of a deal that people are running I/O burners, get something with QoS as an option to smack them down.
- Storage systems must be built of business class hardware (no RED drives -- although I would allow this in a future Veeam backup storage target)
Think hard before using Reds with Veeam. Reverse incrementals and rollups use random I/O, and your backup windows will hate you.
Requirements for development storage
- 9+ TiB of usable storage
- Support a minimum of 1100 random IOPS (what our current system is peaking at)
- Disks must be in some kind of array (ZFS, RAID, mdadm, etc.)
Proposed solutions:
#1 a.k.a the safe option
HP StoreVirtual 4530 with 12 TB (7.2k) spindles in RAID6 -- this is our vendor recommendation. This is an HP renew quote with 3 years 5x9 support, next-day on-site, for ~$15,000
Wait, is this a single-node StoreVirtual? Also, the IOPS on this will be awful (7.2k drives are slow).
Less performance than solution #2 out of the box
More expensive to upgrade later (additional shelves and drives at HP prices)
All used hardware
It's worse than that, as you have to buy not just HP parts but licensing.
#2 ZFS Solution ~$10,000
24 spindles of 900 GB (7.2k SAS) in 12 mirrored vdevs
To my knowledge no one makes 900 GB 7.2k SAS drives.
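Worth sanity-checking that layout against the 9+ TiB and 1100 IOPS requirements. A rough sketch, with the per-spindle IOPS figure again a rule-of-thumb assumption:

```python
# 24 x 900 GB drives as 12 mirrored vdevs: usable space is 12 drives' worth.
GB = 1000**3           # drive vendors quote decimal gigabytes
TIB = 1024**4

usable_tib = 12 * 900 * GB / TIB          # before ZFS metadata/fill overhead
print(f"Usable: {usable_tib:.1f} TiB")    # ~9.8 TiB -- barely over the 9 TiB floor

# Mirrors: reads can be serviced by either side; each write costs one op per side.
IOPS_PER_SPINDLE = 80                     # 7.2k rule of thumb
read_iops  = 24 * IOPS_PER_SPINDLE        # 1920
write_iops = 12 * IOPS_PER_SPINDLE        # 960

# 12-drive RAID 6 for comparison: ~6 back-end ops per random write.
raid6_write_iops = 12 * IOPS_PER_SPINDLE // 6    # 160
print(read_iops, write_iops, raid6_write_iops)
```

So the mirrored layout clears 1100 IOPS on reads and roughly 6x the random-write throughput of the RAID6 option, but usable capacity is tight once you account for keeping a ZFS pool below ~80% full.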
Based on Supermicro SC216E16 chassis
X9SRH-7F Motherboard
Intel E5-1620v2 CPU
64 GB of RAM
No L2ARC or ZIL planned
Then why the hell would you use ZFS?
Dual 10gig NICs
Pros
Better performance out of the box (twice the spindle count)
Non-vendor-specific parts mean upgrades require less investment
Cons
Alright, tear me apart, tell me I am wrong, or provide any other useful feedback. The biggest concerns I have exist in both platforms (drives fail, controllers fail, data goes bad, etc.) and have to be mitigated either way. That is what we have backups for -- in my opinion the HP gets me the following things:
- The "ability" to purchase a support contract
- Next-day on-site of a tech or parts if needed
Be careful with NBD parts: a failure on Thursday afternoon really means Monday afternoon is within SLA, as it's based on when the fault was isolated.
With the $4000 saved from not buying the HP support contract I can buy a duplicate Supermicro system, and a couple extra hard drives, and have the same level of protection.
Note: this is my first time posting an actual give me feedback topic, I tried to include all information I felt was relevant. If more is needed I can provide.
You're a VMware shop; curious if you looked at using VSAN? You could go all flash and get inline dedupe and compression, which makes all flash cheaper than 10k drives at that point.
-
@Dashrender IF you put enterprise-grade gear in a REAL datacenter, it's scary how long it will run without failure. Now if this stuff is going in some no-name telco data-slum, or his closet, yeah, stuff dies all the time.
-
@Dashrender said in ZFS Based Storage for Medium VMWare Workload:
@dafyre said:
@donaldlandru said:
The politics are likely to be harder to play as we just renewed our SnS for both Essentials and Essentials plus in January for three years.
<snip>
Another important piece of information with the local storage is that everything is based on 2.5" disks -- and all but two servers only have two bays each; getting any real kind of local storage without going external direct-attached (non-shared) is going to be a challenge.
He brings up a good point about the 2 bays and 2.5" drives... Do they even make 4 / 6 TB drives in 2.5" form yet?
If not, would it be worth getting an external DAS shelf for each of the servers?
It's been 15 years, but I've seen DAS shelves that can be split between two hosts. Assuming those are still made, and there are enough disk slots, that would save a small amount.
They make flash drives in that size. Actual large capacity small format magnetic drives? No.
The bigger issue is that with this level of VM density you run out of IOPS before you run out of capacity.
Shared DAS shelves, though, don't handle RAID; they are designed for some type of controller to manage them, handle disk locking, etc.
BTW, you can limit IOPS on a VM in vSphere if you just have a noisy neighbor problem.
https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1038241
-
@scottalanmiller said in ZFS Based Storage for Medium VMWare Workload:
That $1200 number was based off of Essentials. Just saw that you have Essentials Plus. What is that for? Eliminating that will save you many thousands of dollars! This just went from a "little win" to a major one!
Essentials Plus SnS includes 24/7 flat-rate support. Getting that on other platforms is a hell of a lot more expensive. It's ~$1200 a year (Essentials is only like $600, and its software-updates-only renewal is like $100).
-
VSAN is more than high availability.
- Dedupe and compression, combined with distributed erasure codes, make all flash cheaper than hybrid in most cases.
- Distributed RAID/RAIN, so you don't have to mirror mirrors (doubles capacity efficiency vs. a P4xxx-type design).
- On-the-fly ability to change availability. Have a test/dev workload you don't care about? FTT=0 and that sucker is RAID 0. Is it moving to production? Change to FTT=1 and, without disruption, it will now be protected with either RAID 1 or distributed RAID 5 (all flash). Has it become really damn important? Change to FTT=2 and get triple mirror or RAID 6. You can even adjust stripe width etc. on the fly. Moving ephemeral workloads to FTT=0 saves a lot of space.
- Non-disruptive expansion (no RAID expansion limits; just add drives or hosts as you need).
- Self-healing. When you have 4 nodes and 1 node fails, it will use free capacity on the other hosts to re-protect mirrored data.
It may be overkill for what he's doing, but the product has some opex benefits (no managing LUNs, you get 24/7 support with it, etc.).
Also I saw a comment about 2 hosts. You can technically deploy VSAN in a 2 node configuration (It just needs a witness VM to run elsewhere).
One thing I will say: for a shop with 100+ VMs and a dev-heavy environment, you're bound to be hemorrhaging money on wasted labor, with people waiting on things because of how IOPS-starved this environment is. To show the value of fast storage, grab a good flash drive, put it in one of the hosts (behind a proper Smart Array or HBA, not some B-model garbage), put a few VMs on it, and see if people notice the difference. They'll be fighting each other to get you budget for more flash...
-
@donaldlandru Cuts licensing for VSAN in half (single CPU)
-
@scottalanmiller said in ZFS Based Storage for Medium VMWare Workload:
WD makes RE and Red drives. Don't call them RED, it is hard to tell if you are meaning to say RE or Red. The Red Pro and SE drives fall between the Red and the RE drives in the lineup. Red and RE drives are not related. RE comes in SAS, Red is SATA only.
They just re-branded the consumer side (IronWolf, and brought back Barracuda with a FireCuda cache drive). I was looking at them and realized they have like a 0.05 DWPD rating, which actually makes their write endurance worse than the cheapest enterprise TLC drives.
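To put that 0.05 DWPD figure in perspective, lifetime write endurance is roughly DWPD x capacity x warranty days. The capacities, DWPD values, and warranty lengths below are illustrative assumptions, not any vendor's spec:

```python
def lifetime_writes_tb(dwpd, capacity_tb, warranty_years):
    """Total terabytes writable over the warranty period (TBW)."""
    return dwpd * capacity_tb * 365 * warranty_years

# Hypothetical 2 TB consumer drive rated at 0.05 DWPD over 5 years
consumer = lifetime_writes_tb(0.05, 2.0, 5)      # 182.5 TBW
# Hypothetical 2 TB entry-level enterprise TLC SSD at 0.3 DWPD over 5 years
enterprise = lifetime_writes_tb(0.3, 2.0, 5)     # 1095 TBW

print(f"consumer: {consumer:.0f} TBW, enterprise: {enterprise:.0f} TBW")
```

Even a modest 0.3 DWPD enterprise rating buys roughly six times the total writes, which is why 0.05 DWPD is alarming for anything behind a busy cache or backup target.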
-
@John-Nicholson said in ZFS Based Storage for Medium VMWare Workload:
@Dashrender IF you put enterprise-grade gear in a REAL datacenter, it's scary how long it will run without failure. Now if this stuff is going in some no-name telco data-slum, or his closet, yeah, stuff dies all the time.
It's true, enterprise servers are generally insanely solid. The idea that servers "just die" is either from abused equipment or from the early 1990s.
-
@John-Nicholson said in ZFS Based Storage for Medium VMWare Workload:
@donaldlandru Cuts licensing for VSAN in half (single CPU)
Can you fill in the background on this comment for the rest of us?
-
@donaldlandru said in ZFS Based Storage for Medium VMWare Workload:
My next biggest concern, like any technology, is how do I get there from here. I have enough budget for a storage node, and we are going to run out of space within the next 60 days. I do not have, and will not receive, additional funding this year for new servers. So some form of "in-place" style of upgrade has to occur. Obviously, this is a server-down, convert-VM, bring-it-back-up type of process that has an unknown LoE.
So when someone says "You will not get more funding" you need to reply "You will not get more IT for no spend". Apply quotas to the file servers, implement aggressive archiving, cut back vCPU and memory allocations so your cluster can fail over, quota mailboxes on Exchange, and in general "enforce" their "no new budget" policy. Turn on FSRM reporting and post reports that show who's using the most space (dump similar reports from Exchange). Don't let things hit a wall and crash; start putting the brakes on growth. IT is not a MacGyver episode where you're expected to conjure IOPS, capacity, and RAM from thin air.
Trying to not paint a picture of a rock and a hard place, but realistically where else am I at right now?
You're not between a rock and a hard place; they are. Follow the rules above, and they will either find you capital to invest so they can continue to use more storage and compute, or they will agree that it's not worth the spend. THIS IS NOT YOUR PROBLEM. Your problem is to quantify what it will cost to deliver x, and to deploy it if funded. It is not your job to be given an arbitrary budget and produce y. I remember when I realized this, and my job became a lot more Zen-like. "Do what you say you will do" was my old office motto. Because of that, people said no a hell of a lot.
-
@scottalanmiller You say free storage migrations?
-
@scottalanmiller said in ZFS Based Storage for Medium VMWare Workload:
@John-Nicholson said in ZFS Based Storage for Medium VMWare Workload:
@donaldlandru Cuts licensing for VSAN in half (single CPU)
Can you fill in the background on this comment for the rest of us?
He said he only had single sockets deployed on one of his clusters. VSAN is licensed by socket (well, among other options but this would be the most common in his case)
-
@bhershen said in ZFS Based Storage for Medium VMWare Workload:
Hi Scott,
Donald mentioned SM and referenced generic ZFS (could be Oracle, OpenIndiana, FreeBSD, etc.) solutions which have uncoordinated HW, SW and support. Nexenta is packaged to compete with EMC, NetApp, etc. as primary storage in the Commercial market.
If you would like to get an overview, please feel free to ping me.
Best.
Weird, I've seen it packaged as software only (as a virtual NAS piece to run on top of HCI).
-
@John-Nicholson said in ZFS Based Storage for Medium VMWare Workload:
@bhershen said in ZFS Based Storage for Medium VMWare Workload:
Hi Scott,
Donald mentioned SM and referenced generic ZFS (could be Oracle, OpenIndiana, FreeBSD, etc.) solutions which have uncoordinated HW, SW and support. Nexenta is packaged to compete with EMC, NetApp, etc. as primary storage in the Commercial market.
If you would like to get an overview, please feel free to ping me.
Best.
Weird, I've seen it packaged as software only (as a virtual NAS piece to run on top of HCI).
Yes, for a long time that was all that they had, I'm pretty sure. Maybe they phased that out as I could imagine that it was difficult to support and not a big money maker while the appliances were a clearer product line.
-
@John-Nicholson said in ZFS Based Storage for Medium VMWare Workload:
@scottalanmiller said in ZFS Based Storage for Medium VMWare Workload:
@John-Nicholson said in ZFS Based Storage for Medium VMWare Workload:
@donaldlandru Cuts licensing for VSAN in half (single CPU)
Can you fill in the background on this comment for the rest of us?
He said he only had single sockets deployed on one of his clusters. VSAN is licensed by socket (well, among other options but this would be the most common in his case)
Oh okay, cool. I figured but wanted to be sure. Doesn't that cause VSAN some license disparity with Essentials Plus users, but line up well with Standard users?
-
@John-Nicholson said in ZFS Based Storage for Medium VMWare Workload:
@scottalanmiller You say free storage migrations?
I just saw that last night
-
@John-Nicholson said in ZFS Based Storage for Medium VMWare Workload:
You're not between a rock and a hard place; they are. Follow the rules above, and they will either find you capital to invest so they can continue to use more storage and compute, or they will agree that it's not worth the spend. THIS IS NOT YOUR PROBLEM.
This can't be overstated. Do the best that you can, of course, but don't feel that you have to deliver the impossible. If that were really the case, every business would require that all IT run on zero budget.
-
@scottalanmiller said in ZFS Based Storage for Medium VMWare Workload:
@John-Nicholson said in ZFS Based Storage for Medium VMWare Workload:
@scottalanmiller said in ZFS Based Storage for Medium VMWare Workload:
@John-Nicholson said in ZFS Based Storage for Medium VMWare Workload:
@donaldlandru Cuts licensing for VSAN in half (single CPU)
Can you fill in the background on this comment for the rest of us?
He said he only had single sockets deployed on one of his clusters. VSAN is licensed by socket (well, among other options but this would be the most common in his case)
Oh okay, cool. I figured but wanted to be sure. Doesn't that cause VSAN some license disparity with Essentials Plus users, but line up well with Standard users?
Actually it works fine with Essentials Plus (I've deployed it). Note VSAN includes a vDS license, so you'll get that and NIOC thrown in with it.
-
@John-Nicholson said in ZFS Based Storage for Medium VMWare Workload:
@scottalanmiller said in ZFS Based Storage for Medium VMWare Workload:
@John-Nicholson said in ZFS Based Storage for Medium VMWare Workload:
@scottalanmiller said in ZFS Based Storage for Medium VMWare Workload:
@John-Nicholson said in ZFS Based Storage for Medium VMWare Workload:
@donaldlandru Cuts licensing for VSAN in half (single CPU)
Can you fill in the background on this comment for the rest of us?
He said he only had single sockets deployed on one of his clusters. VSAN is licensed by socket (well, among other options but this would be the most common in his case)
Oh okay, cool. I figured but wanted to be sure. Doesn't that cause VSAN some license disparity with Essentials Plus users, but line up well with Standard users?
Actually it works fine with Essentials Plus (I've deployed it). Note VSAN includes a vDS license, so you'll get that and NIOC thrown in with it.
Oh, I know it works. I meant that if you get single-socket servers you pay for double with the EP license, but with VSAN you only pay for what you use.
-
@donaldlandru said in ZFS Based Storage for Medium VMWare Workload:
@scottalanmiller said:
@donaldlandru said:
Back to the original requirements list. HA and FT are not listed as needed for the development environment. This conversation went sideways when we started digging into the operations side (where there should be HA) and I have a weak point, the storage.
Okay, so we are looking exclusively at the non-production side?
But production completely lacks HA today. It should be a different thread, but your "actions" say you don't need HA in production even if you feel that you do. Either what you have today isn't good enough and has to be replaced there, or HA isn't needed, since you've happily been without it for so long. This can't be overlooked -- you are stuck with either falling short of a need or not being clear on the needs for production.
Ahh -- there is the detail I missed. Just re-read my post and that doesn't make this clear. Yes, the discussion was supposed to pertain to the non-production side. My apologies.
I agree we do lack true HA in the production side as there is a single weak link (one storage array), the solution here depends on our move to Office 365 as that would take most of the operations load off of the network and change the requirements completely.
We have quasi-HA with the current solution, but now, based on new enlightenment, I would agree it is not fully HA.
To be clear, Exchange (2010 on) shouldn't be putting out that much load. What can happen is you're arbitraging disk I/O for CPU (CPU threads waiting on disk I/O) and memory (cache). If you're running cached mode for your users, a single reasonably sized Exchange server can serve thousands of users....