New Infrastructure to Replace Scale Cluster
-
Your list looks like it is almost entirely Linux, I'm guessing. But I see DCs mentioned; are those Windows AD DCs? Are those the only Windows workloads, or are there more?
-
Yeah, I am mostly all Linux. Not a fan of Microsoft. The DCs were some of my old computers that I had on a domain; I just haven't had time to migrate things off and demote the DCs. I also try to stay current on Windows, but there is not enough time. I am running Server 2012, which I originally figured I would use to create three Hyper-V servers. That turned into a cluster nightmare for me.
-
@mroth911 said in New Infrastructure to Replace Scale Cluster:
Yeah, I am mostly all Linux. Not a fan of Microsoft. The DCs were some of my old computers that I had on a domain; I just haven't had time to migrate things off and demote the DCs. I also try to stay current on Windows, but there is not enough time. I am running Server 2012, which I originally figured I would use to create three Hyper-V servers. That turned into a cluster nightmare for me.
So if I can assume that in this move you can either... 1) finish the demotion and eliminate the Windows machines, or 2) leave them behind on the Scale HC3 to deal with later...
Then I'd recommend looking into LXC containers (with the LXD front end; it just makes things easy). It might be so fast and easy to automate that you want to go this route.
An oVirt / KVM / Gluster cluster could work here, but it feels heavy. It might, however, be the simplest to set up (without throwing money at it).
Long term, though, LXC will give you more capacity and what I feel is an easier time automating things.
The oVirt path has built-in failover if you go with Gluster, DRBD, or CEPH, whereas with LXC you are a bit more on your own for that. Really rapid recovery might be trivial to script, but it is still a little "making it yourself."
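To give a sense of how little scripting that automation really takes, here is a minimal sketch, assuming the stock `lxc` client is installed and LXD is initialized, and that an `ubuntu:18.04` image alias is reachable. The container name and resource limits are placeholders, not anything from this thread.

```python
#!/usr/bin/env python3
"""Rough sketch: deploy an LXC container through the LXD CLI.

Assumes the `lxc` client is installed and LXD is initialized (`lxd init`),
and that the `ubuntu:18.04` image alias is available. The container name
and resource limits below are placeholders.
"""
import subprocess


def deploy(name: str, cpus: int = 2, memory: str = "4GB") -> None:
    # Launch a fresh container from a stock image.
    subprocess.run(["lxc", "launch", "ubuntu:18.04", name], check=True)
    # Cap CPU and RAM so one workload can't starve the node.
    subprocess.run(["lxc", "config", "set", name, "limits.cpu", str(cpus)], check=True)
    subprocess.run(["lxc", "config", "set", name, "limits.memory", memory], check=True)


if __name__ == "__main__":
    deploy("web01")
```

The same handful of calls could be wired into whatever provisioning tooling you already run.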
-
LXC is "Linux only", in case that wasn't clear as a limitation.
-
@scottalanmiller said in New Infrastructure to Replace Scale Cluster:
@mroth911 said in New Infrastructure to Replace Scale Cluster:
... I am in the process of building my OWN WHMCS...
Like this?
This bit makes me think that LXC might be a really good choice for you. You don't necessarily even need a cluster in the traditional sense. Each LXC node could be standalone, and you could build "simple" logic into your WHMCS clone that looks at average or peak loads and chooses the "least" used node for the next deployment.
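Purely as an illustration of that "least used node" idea, here is a sketch; the node names, SSH access, and the 1-minute load-average heuristic are all assumptions on my part, not anything you've described.

```python
#!/usr/bin/env python3
"""Sketch: pick the least-loaded standalone LXD node and deploy there.

Assumes each node has already been added as an LXD remote
(`lxc remote add node1 ...`) and is reachable over SSH for the load check.
Node names, the SSH user, and the load heuristic are placeholders.
"""
import subprocess

NODES = ["node1", "node2", "node3"]  # hypothetical standalone LXD hosts


def load_average(node: str) -> float:
    # Read the 1-minute load average from the node over SSH.
    out = subprocess.run(
        ["ssh", f"root@{node}", "cat", "/proc/loadavg"],
        capture_output=True, text=True, check=True,
    )
    return float(out.stdout.split()[0])


def deploy_on_least_loaded(container: str) -> str:
    # Choose the quietest node and launch on it via its LXD remote.
    target = min(NODES, key=load_average)
    subprocess.run(
        ["lxc", "launch", "ubuntu:18.04", f"{target}:{container}"], check=True
    )
    return target


if __name__ == "__main__":
    print("deployed on", deploy_on_least_loaded("customer-ct-001"))
```

Swap the load-average check for whatever metric the billing panel already tracks (RAM in use, container count, etc.).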
-
LXD does support clustering with failover, if you use CEPH, Gluster, etc.
-
Now using CEPH or Gluster might not prove to be worth it. Local RAID is normally faster and easier during operational times, just not as nice during a failure.
But sometimes "simple, well understood, and easy to support" matters more than automated failover. It is worth considering.
With solid local RAID and LXD management of the nodes, you could just have a good backup and restore system to get a failed node or single VM back up and running in the event of a big failure.
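A minimal sketch of that backup-and-restore path, assuming a reasonably current LXD where `lxc export` and `lxc import` are available; the container name and backup directory are placeholders.

```python
#!/usr/bin/env python3
"""Sketch: back up an LXD container to a tarball and restore it elsewhere.

Assumes an LXD version that provides `lxc export` / `lxc import` (3.x and
later). The container name and backup directory are placeholders.
"""
import subprocess
from datetime import date


def backup(container: str, backup_dir: str = "/srv/backups") -> str:
    # Export the container (config + filesystem) to a single tarball.
    tarball = f"{backup_dir}/{container}-{date.today()}.tar.gz"
    subprocess.run(["lxc", "export", container, tarball], check=True)
    return tarball


def restore(tarball: str) -> None:
    # Import the tarball on a rebuilt or replacement node; the container
    # comes back under its original name and can then be started.
    subprocess.run(["lxc", "import", tarball], check=True)


if __name__ == "__main__":
    print("wrote", backup("web01"))
```

Run the export on a schedule and copy the tarballs off the node, and a dead node becomes a reinstall plus a handful of imports.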
-
@scottalanmiller said in New Infrastructure to Replace Scale Cluster:
@mroth911 said in New Infrastructure to Replace Scale Cluster:
Scale specs are 24 cores, 188GB of RAM, 10TB.
Sorry, not being familiar at all with Scale, what does that mean? Are the cores/RAM/storage in one node or spread across several? Is this the config for each node or the total for the cluster?
-
@Pete-S From my understanding, this is all bundled together. These are the total resources that you can use.
@scottalanmiller I was on the phone with Red Hat today about getting Red Hat subscriptions with their virtualization manager.
-
@Pete-S said in New Infrastructure to Replace Scale Cluster:
@scottalanmiller said in New Infrastructure to Replace Scale Cluster:
@mroth911 said in New Infrastructure to Replace Scale Cluster:
Scale specs are 24 cores, 188GB of RAM, 10TB.
Sorry, not being familiar at all with Scale, what does that mean? Are the cores/RAM/storage in one node or spread across several? Is this the config for each node or the total for the cluster?
That's his "cluster spec". He has HC1150 nodes, if I remember correctly. They are single-CPU nodes.
So this is 3x Dell R310 servers, each with one 8-core Intel CPU, 64GB RAM, and 3.3TB of storage, which lines up with the 24 core / ~188GB / 10TB totals above.
-
@mroth911 said in New Infrastructure to Replace Scale Cluster:
@scottalanmiller I was on the phone with Red Hat today about getting Red Hat subscriptions with their virtualization manager.
RHEV, their enterprise virtualization cluster tech with fully supported KVM, oVirt and Gluster? Not free, but it's an excellent choice with the entire thing supported end to end.
-
And I use my own equipment. I know it's not free.
-
@Pete-S
This is another Scale cluster. This is what it looks like in the system; it's a total across however many nodes you have.
Scale really treats your cluster as a single unit. You can see a little bit on a node-by-node basis, but very little. It auto-balances, auto-fails over, and storage is truly a fluid pool across all nodes.
-
@scottalanmiller I like Scale for what it does. However, I think at a certain point, if a client wants to manage their own equipment after so many years, they should know how to do it.
-
OK, I understand. So it's three homogeneous nodes in a cluster with a common "control panel".
If you have the knowledge, why the need for a support contract? If the servers are standard, then any hardware failure would be easy to solve, no? And isn't the software proven and stable as is? Or would it be too dangerous to run them without patching?
-
@mroth911 said in New Infrastructure to Replace Scale Cluster:
@scottalanmiller I like Scale for what it does. However, I think at a certain point, if a client wants to manage their own equipment after so many years, they should know how to do it.
This is not a Scale "issue" but one of appliances versus non-appliances. It's not about knowing how to do it; it's a black box, there isn't anything to know, and it's not accessible. Same with any appliance. The thing that makes it powerful for its support and features is also what makes it unable to be managed in other ways. The idea with the appliance model is that when the support agreement expires, the equipment is EOL and automatically retired. Similar to Meraki, Unitrends, etc.
Nothing wrong with that approach, but it means deciding whether appliances as a product category are something that you want to work with.
-
@Pete-S said in New Infrastructure to Replace Scale Cluster:
OK, I understand. So it's three homogeneous nodes in a cluster with a common "control panel".
If you have the knowledge, why the need for a support contract? If the servers are standard, then any hardware failure would be easy to solve, no? And isn't the software proven and stable as is? Or would it be too dangerous to run them without patching?
In theory you can run without patching. But... eek. It's incredibly stable and really well tested. The biggest issue is hardware replacements. It's all specially managed drivers and firmware. We aren't sure, if he loses a drive, whether there is anything that he can do to replace it himself, for example. No one has tried this, but we are pretty sure that a third-party drive that isn't from Scale can't be put into the cluster.
-
@scottalanmiller Agreed. I went with this based on a recommendation and haven't had any major issues with it. But I feel it's the calm before the storm, and I need to be proactive about not having all my eggs in one basket. I would like to have bought another Scale cluster, a beefed-up one that I could run Nextcloud on, but I am not there.
-
@scottalanmiller Correction: I was able to put in that same model of drive, but the system didn't detect it. They had to remote in and enable the port for the drive. The drive that I bought off of Amazon worked, though.
They wanted me to spend $350 on the drive that I paid 98 bucks for. It's a 1TB SAS drive.
-
@mroth911 said in New Infrastructure to Replace Scale Cluster:
@scottalanmiller Correction: I was able to put in that same model of drive, but the system didn't detect it. They had to remote in and enable the port for the drive. The drive that I bought off of Amazon worked, though.
They wanted me to spend $350 on the drive that I paid 98 bucks for. It's a 1TB SAS drive.
Right, I know that THEY can force the acceptance manually. But the cluster itself will not do it, and they likely had to do a manual change of the firmware to get it to work, firmware that you likely don't have access to.