ESXi cluster, advice needed
-
@stacksofplates said in ESXi cluster, advice needed:
My point is, you can't just say "this is available through open source tools" and expect people to be able to do a setup like that from scratch with no experience.
This point is correct. What we know is that there are tools (open, free, low cost, closed, high cost, etc.) that can do this and plenty of money to hire the IT expertise to do it. There are alternate approaches that make everything easier (going with Scale, for example) and ones that are cheaper (hiring IT experts to dig into needs and do what needs to be done), etc.
The real underlying point that I think is important is that getting good IT is cheap. Nothing is cheaper. Anything that is done that deviates from having the right IT resources is going to be more costly in the long run. Maybe more costly because things take longer to get done, maybe because money is spent where it doesn't need to be, maybe by there being risks that didn't need to be there. IT's one purpose is to make the company money. Intentionally "not making money" is a crazy business approach.
-
@IRJ said in ESXi cluster, advice needed:
First of all, you need to move every service you can to SaaS and PaaS solutions. Get rid of your database servers on prem and put them on a PaaS solution. Why do you want to manage infrastructure, especially if you are short staffed in IT?
This is public sector, and that is not allowed yet.
-
@travisdh1 said in ESXi cluster, advice needed:
@rtfm said in ESXi cluster, advice needed:
hi everybody!
first of all thank you for your contribution. Keep it simple...
However, i did not mention (intentionally, no offence, i will explain myself) the following facts:
- we already have these VMs hosted in a 4-node flexpod environment (if vmware enterprise plus is an overkill for us, then how would you judge flexpod???).
- our organization is rich in terms of money but poor in terms of IT intellectual capital. Therefore we need outsourced support. in our place it is hard to find that, so we usually address ourselves to certified solutions.
- we have invested time and money on vmware hypervisor and our poor IT would not like to throw this away.
- the initial question was supposed to refer to a DRS solution based on vmware SRM (VM based replication, not array based), however years after studying your recommendations i would like to try something more simple.
sorry for wasting your time. i am thankful to you for your recommendations. BTW what happens if a single node, or a node with local storage is lost? isn't that a potential cause for filesystem corruption?
- Just a waste of money
- What region of the world are you located in?
- This is just plain bad thinking. IT by its nature is always changing. Learning something different should be very quick; resisting change just because you already know something is the opposite of what IT should be doing.
- Also just a waste of money
Your statement about competitive environments doesn't make any sense. Many of the solutions mentioned are open source and available to anyone with an internet connection.
Assuming you are running with at least the 3-node minimum and a single node is lost, nothing happens. Once the node is put back online, everything is automatically handled in the background for you (Starwind, Gluster, Ceph, Scale).
No need for a "maintenance mode". Updates are handled without the need of a reboot, but we still recommend power cycling everything on a regular basis.
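The "3-node minimum" above can be illustrated with simple majority-quorum arithmetic. This is only a sketch of the general idea, assuming a plain majority rule with no witness or tiebreaker; the function name `has_quorum` is hypothetical, and StarWind, Gluster, Ceph, and Scale each apply their own rules, so check the product documentation before relying on it.

```python
def has_quorum(total_nodes: int, failed_nodes: int) -> bool:
    """Return True if the surviving nodes form a strict majority.

    Assumption: plain majority voting, no witness node or tiebreaker.
    """
    if total_nodes <= 0 or not 0 <= failed_nodes <= total_nodes:
        raise ValueError("invalid node counts")
    surviving = total_nodes - failed_nodes
    # A strict majority means more than half of the configured nodes.
    return surviving * 2 > total_nodes


if __name__ == "__main__":
    for total in (2, 3, 4):
        print(total, "nodes, 1 lost -> quorum kept:", has_quorum(total, 1))
```

Under this rule a 3-node cluster keeps a majority after losing one node, while a 2-node cluster does not, which is one common reason replicated storage is sized at three nodes rather than two.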
hello,
- agree. but have you ever thought that money might not be a problem at all in some cases?
- balkan peninsula...europe.
- what if you are responsible for the services and the infrastructure and you don't have other IT staff and you are not allowed to hire?
- see "1"
As far as "competitive" is concerned, let's just say that IT integration without vendor support in greece is not much different than gambling in terms of reliability.
About the technical part solely: i thought you had proposed one server, now they have become three?
-
@Dashrender said in ESXi cluster, advice needed:
@rtfm said in ESXi cluster, advice needed:
hi everybody!
first of all thank you for your contribution. Keep it simple...
However, i did not mention (intentionally, no offence, i will explain myself) the following facts:
- we already have these VMs hosted in a 4-node flexpod environment (if vmware enterprise plus is an overkill for us, then how would you judge flexpod???).
- our organization is rich in terms of money but poor in terms of IT intellectual capital. Therefore we need outsourced support. in our place it is hard to find that, so we usually address ourselves to certified solutions.
- we have invested time and money on vmware hypervisor and our poor IT would not like to throw this away.
- the initial question was supposed to refer to a DRS solution based on vmware SRM (VM based replication, not array based), however years after studying your recommendations i would like to try something more simple.
i understand that all above mentioned arguments are usually trivial in competitive environments, but unfortunately this is not our case.
sorry for wasting your time. i am thankful to you for your recommendations.
BTW what happens if a single node, or a node with local storage is lost? isn't that a potential cause for filesystem corruption?
Moreover, how do i put the host in maintenance mode (Hmmm, and why should i do that if i only have one host, especially with let's say free esxi?)?
- ?
- Why is your organization poor on IT capital? Why not hire consultants to do it? No reason to have them be on staff, is there?
- This is the sunk cost fallacy. That money is already spent; consider it gone and move forward with the most cost-effective solution. That said, sometimes where you already are is the most cost-effective option when all aspects are considered, at least until a full overhaul is required.
- ?
- it's plain old fashioned public sector, the last IT hired was 15 years ago...
-
@scottalanmiller said in ESXi cluster, advice needed:
@rtfm said in ESXi cluster, advice needed:
our organization is rich in terms of money but poor in terms of IT intellectual capital. Therefore we need outsourced support. in our place it is hard to find that, so we usually address ourselves to certified solutions.
This is a misunderstanding of markets. There is no such thing as a place where IT is hard to get. IT has no location and there are essentially unlimited numbers of available excellent resources ready to assist any business. Businesses simply choose not to look for or hire them and instead hire sales people who screw them and hide their costs in "products" rather than honest or qualified advice. Every business should have outsourced support; almost no company is big enough to have all the right people internally. But no business is in a location or situation where it can't get good people.
Certified solutions is really just a way to say "expensive products that are focused on resellers" or, another way, bad solutions that cost you far more to operate. They are channel products designed beginning to end to take advantage of this mindset and to get as much money out of companies that believe this as possible. It's an extremely common and effective game that they play.
If your company doesn't know how to achieve this, then the first thing that they need is a real outsourced CIO. A good CIO will save you a fortune in hours. Running without one is financially reckless.
i agree 100% and i am looking forward to converting your theory into practice.
-
@scottalanmiller said in ESXi cluster, advice needed:
@rtfm said in ESXi cluster, advice needed:
we already have these VMs hosted in a 4-node flexpod environment (if vmware enterprise plus is an overkill for us, then how would you judge flexpod???).
This is unfortunate. Kind of the "worst of the worst". There is no way to really sugar coat this. It's the worst hardware on the market (Cisco), with a rather poor storage layer (NetApp), with an unnecessarily expensive hypervisor (VMware) where you end up with really, really high cost and hardware/setup that many of us would want to just throw in the trash.
Does it work? Well, kinda. Chances are the cost of this one purchase alone would have paid to hire an IT department to solve the bigger problems and implement a simpler, more efficient, more reliable solution that addresses your needs, rather than just empties your coffers.
This is, unfortunately, a setup designed specifically to prey on companies that think that they have to buy "products" instead of expertise and that they can skip IT. But it doesn't work that way. To quote VMware themselves "high availability is something you do, not something that you buy." Even the companies that sell this product don't believe that this is in any way a substitute for getting access to IT resources that are going to look at your needs and engineer a solution based around them.
We all understand that this means that you have now already purchased all of this and that there is no way to fix that. The only thing you can do now is look at the scale of this mistake and use it as a learning exercise to go back to your company and try to address the broken thought processes that brought them to what should have been an obvious "never do this" scenario. This suggests that they likely do the textbook "never do this in business" things of engaging sales people, resellers and the vendors, asking how they can spend money rather than getting business experts to actually figure out what the needs are and what would address them. The goal had to be "how do we spend money", not "how do we solve a business need." There is a massive opportunity for improvement here, but it won't help until "next time." But there will be a next time, so this lesson is insanely important to learn.
That said, though, you are asking how to fix this. You've figured out that this setup isn't good. That's the first step. You know that you need reliable storage instead of the RAID 4 NAS device single point of failure, that's good. I would step all the way back and consider all of it a waste and look at "how best to move forward" based on what you own, and try to remove IT's emotions from it because it is what it is, and those emotions will only hurt the company (and IT itself) long term.
hi,
apart from the tremendous contribution from a generic point of view, would you suggest a 2-node setup with local datastores and a fast network and that's it, given that we take as proper backups as possible? BTW we also have Veeam B&R standard edition. what about the vmware SRM? what do you think of it?
-
@rtfm said in ESXi cluster, advice needed:
@Dashrender said in ESXi cluster, advice needed:
@rtfm said in ESXi cluster, advice needed:
hi everybody!
first of all thank you for your contribution. Keep it simple...
However, i did not mention (intentionally, no offence, i will explain myself) the following facts:
- we already have these VMs hosted in a 4-node flexpod environment (if vmware enterprise plus is an overkill for us, then how would you judge flexpod???).
- our organization is rich in terms of money but poor in terms of IT intellectual capital. Therefore we need outsourced support. in our place it is hard to find that, so we usually address ourselves to certified solutions.
- we have invested time and money on vmware hypervisor and our poor IT would not like to throw this away.
- the initial question was supposed to refer to a DRS solution based on vmware SRM (VM based replication, not array based), however years after studying your recommendations i would like to try something more simple.
i understand that all above mentioned arguments are usually trivial in competitive environments, but unfortunately this is not our case.
sorry for wasting your time. i am thankful to you for your recommendations.
BTW what happens if a single node, or a node with local storage is lost? isn't that a potential cause for filesystem corruption?
Moreover, how do i put the host in maintenance mode (Hmmm, and why should i do that if i only have one host, especially with let's say free esxi?)?
- ?
- Why is your organization poor on IT capital? Why not hire consultants to do it? No reason to have them be on staff, is there?
- This is the sunk cost fallacy. That money is already spent; consider it gone and move forward with the most cost-effective solution. That said, sometimes where you already are is the most cost-effective option when all aspects are considered, at least until a full overhaul is required.
- ?
- it's plain old fashioned public sector, the last IT hired was 15 years ago...
Ah, that puts it in perspective.
-
@rtfm said in ESXi cluster, advice needed:
would you suggest a 2-node setup with local datastores and a fast network and that's it, given that we take as proper backups as possible
Under most conditions, yes. Fast, easy, effective. If you need absolute uptime, do so higher in the stack at the application level. If you just need really good uptime, take excellent backups, be able to restore really quickly, and probably keep a recent (like hours old) "copy" on the second host so that you can spin up in minutes.
Two stand-alone hosts with good backups is often a "couple of minutes of downtime during a crisis" solution, instead of a few milliseconds. Sure, a few minutes is relatively long compared to a few milliseconds, but to most businesses (and certainly most governments) a few minutes of downtime every few years doesn't matter at all.
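To put "a few minutes of downtime every few years" into numbers, here is a rough availability calculation. It is a sketch only: the figures used (5 minutes of downtime over 3 years) are illustrative assumptions, not measurements from this thread, and the helper names `availability` and `nines` are made up for the example.

```python
import math

MINUTES_PER_YEAR = 365.25 * 24 * 60  # average year, including leap days


def availability(downtime_minutes: float, period_years: float) -> float:
    """Fraction of the period during which the service was up."""
    if period_years <= 0 or downtime_minutes < 0:
        raise ValueError("period must be positive and downtime non-negative")
    total_minutes = period_years * MINUTES_PER_YEAR
    return 1.0 - downtime_minutes / total_minutes


def nines(avail: float) -> float:
    """Express availability as a 'number of nines' (0.999 is about 3)."""
    if avail >= 1.0:
        return math.inf  # no downtime at all
    return -math.log10(1.0 - avail)


if __name__ == "__main__":
    # Assumed example: 5 minutes of downtime across 3 years.
    a = availability(downtime_minutes=5, period_years=3)
    print(f"availability: {a:.7%}, nines: {nines(a):.2f}")
```

Even with these assumed figures the result sits above "five nines", which is the point being made: for most organizations the gap between milliseconds and minutes of failover is not worth a large extra spend.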
-
@scottalanmiller said in ESXi cluster, advice needed:
@rtfm said in ESXi cluster, advice needed:
would you suggest a 2-node setup with local datastores and a fast network and that's it, given that we take as proper backups as possible
Under most conditions, yes. Fast, easy, effective. If you need absolute uptime, do so higher in the stack at the application level. If you just need really good uptime, take excellent backups, be able to restore really quickly, and probably keep a recent (like hours old) "copy" on the second host so that you can spin up in minutes.
Two stand-alone hosts with good backups is often a "couple of minutes of downtime during a crisis" solution, instead of a few milliseconds. Sure, a few minutes is relatively long compared to a few milliseconds, but to most businesses (and certainly most governments) a few minutes of downtime every few years doesn't matter at all.
From the requirements perspective i totally agree with you! Considering our poor in-house human resources, an RPO of even one day is totally satisfactory and acceptable! In that case i understand that there is no actual need to mess with HA, FA, SRM etc...
BTW your comment about how "system integrators" try to sell bare metal and certifications instead of brains was more than appropriate!
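The one-day RPO mentioned above can be checked against a backup schedule with a few lines of arithmetic. This is a minimal sketch, assuming you have a list of completed backup timestamps from whatever tool you use; the function names and the sample timestamps are hypothetical, and it does not call Veeam or any other product's API.

```python
from datetime import datetime, timedelta


def worst_case_data_loss(backup_times, now):
    """Largest gap between consecutive backups, including last backup to now.

    Assumption: `now` is not earlier than the most recent backup.
    """
    if not backup_times:
        raise ValueError("no backups recorded")
    points = sorted(backup_times) + [now]
    return max(later - earlier for earlier, later in zip(points, points[1:]))


def meets_rpo(backup_times, now, rpo):
    """True if no window of data loss would have exceeded the RPO."""
    return worst_case_data_loss(backup_times, now) <= rpo


if __name__ == "__main__":
    # Hypothetical nightly backups at 01:00.
    backups = [datetime(2024, 1, day, 1, 0) for day in (1, 2, 3)]
    now = datetime(2024, 1, 3, 18, 0)
    print("worst case loss:", worst_case_data_loss(backups, now))
    print("meets 1-day RPO:", meets_rpo(backups, now, timedelta(days=1)))
```

A single missed nightly run doubles the worst-case gap, so a check like this is mostly useful for catching skipped or failed backup jobs.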
-
@rtfm said in ESXi cluster, advice needed:
BTW your comment about how "system integrators" try to sell bare metal and certifications instead of brains was more than appropriate!
It's where all the money is. As a consultancy... I can sell you a full IT department for a year. But I have to pay those people and my profits are small, but they can provide you with a wealth of work, advice, etc. building, maintaining, and supporting whatever you need.
But for the same money, I could just sell you a product that you don't need, that sounds good but doesn't do a good job for you, and earn easily triple the profits because I don't need to pay staff.
So the challenge is, if I resell those products, how do I make myself "do the right thing" when the customer is literally paying me only if I screw them over? The customer effectively demands, through how they pay and choose solutions, to only get the bad solutions. As a company, it's all but impossible to resist selling the product because the customer never knows the difference, and you earn so much more money as a salesperson than in providing sound IT. That's why companies like NTG and Bundy Associates simply don't sell any product at all, so that the incentive to do so isn't there. Because if it was, it would be all but irresistible.