5 things to think about with Hyperconverged Infrastructure

  • 1. Simplicity

    A Hyperconverged infrastructure (or HCI) should take no more than 30 minutes to go from out of the box to creating VM’s. Likewise, an HCI should not require that the systems admin be a VCP, a CCNE, and a SNIA certified storage administrator to effectively manage it. Any properly designed HCI should be able to be administered by an average windows admin with nearly no additional training needed. It should be so easy that even a four-year-old should be able to use it…

    2. VSA vs. HES

    In many cases, rather than handing disk subsystems with SAN flexibility built in at a block level directly to production VMs, you see HCI vendors choosing to simply virtualize a SAN controller into each node in their architectures and pull the legacy SAN + storage protocols up into the servers as a separate VM, causing several I/O path loops to happen with IO’s having to pass multiple times through VMs in the system and adjacent systems. This approach of using Storage Controller VMs (sometimes called VSAs or Virtual Storage Appliances) consume so much CPU and RAM that they redefine inefficient – especially in the mid-market. In one case I can think of, a VSA running on each server (or node) in a vendor’s architecture BEGINS its RAM consumption at 16GB and 8 vCores per node, then grows that based on how much additional feature implementation, IO loading and maintenance it is having to do. With a different vendor, the VSA reserves around 50GB RAM per node on their entry point offering, and over 100GB of RAM per node on their most common platform – a 3 node cluster reserving over 300 GB RAM just for IO path overhead. An average SMB to mid-market customer could run their entire operation in just the CPU and RAM resources these VSA’s consume.

    There is a better alternative called the HES approach. It eliminates the dedicated servers, storage protocol overhead, resource consumption, multi-layer object files, filesystem nesting, and associated gear by moving the hypervisor directly into the OS of a clustered platform as a set of kernel modules with the block level storage function residing alongside the kernel in userspace, completely eliminating the SAN and storage protocols (not just virtualizing them and replicating copies of them over and over on each node in the platform). This approach simplifies the architecture dramatically while regaining the efficiency originally promised by Virtualization.

    3. Stack Owners versus Stack Dependents

    Any proper HCI should not be stack dependent on another company for it’s code. To be efficient, self-aware, self-healing, and self-load balancing, the architecture needs to be holistically implemented rather than piecemealed together by using different bits from different vendors. By being a stack owner, an HCI vendor is able to do things that weren’t feasible or realistic with legacy virtualization approaches. Things like hot and rolling firmware updates at every level, 100% tested rates on firmware vs customer configurations, 100% backwards and forwards compatibility between different hardware platforms – that list goes on for quite a while.

    4. Using flash properly instead of as a buffer

    Several HCI vendors are using SSD and Flash only (or almost only) as a cache buffer to hide the very slow IO path’s they have chosen to build based on VSAs and Erasure Coding (formerly known as software RAID 5/6/X) used between Virtual Machines and their underlying disks – creating what amounts to a Rube Goldberg machine for an IO path (one that consumes 4 to 10 disk IO’s or more for every IO the VM needs done) rather than using Flash and SSD as proper tiers with an AI based heat mapping and QOS-like mechanism in place to automatically put the right workloads in the right place at the right time with the flexibility to move those workloads fluidly between tiers and dynamically allocate flash as needed on the fly to workloads that demand it (up to putting the entire workload in flash). Any architecture that REQUIRES the use of flash to function at an acceptable speed has clearly not been architected efficiently. If turning off the flash layer results in IO speeds best described as Glacial, then the vendor is hardly being efficient in their use of Flash or Solid State. Flash is not meant to be the curtain that hides the efficiency issues of the solution.

    5. Future proofing against the “refresh everything every 5 years” spiral

    Proper HCI implements self-aware bi-directional live migration across dissimilar hardware. This means that the administrator is not boat anchored to a technology “point in time of acquisition”, but rather, they can avoid over buying on the front end, and take full advantage of Moore’s law and technical advances as they come and the need arises. As lower latency and higher performance technology comes to the masses, attaching it to an efficient software stack is crucial in eliminating the need the “throw away and start over ” refresh cycle every few years.

    Bonus number 6. Price –

    Hyperconvergence shouldn’t come at a 1600+% price premium over the cost of the hardware it runs on. Hyperconvergence should be affordable – more so than the legacy approach was and VSA based approach is by far.

    These are just a few points to keep in mind as you investigate which Hyperconverged platform is right for your needs

    This weeks blog is brought to you by @Aconboy