Replacing the Dead IPOD, SAN Bit the Dust
-
Also, the SAN in question has been retired. We have 2 others in our datacenter that have had their data pulled off, and those SANs have been taken offline. I'll go and pull them out of the datacenter tomorrow.
-
@NerdyDad said in Replacing the Dead IPOD, SAN Bit the Dust:
- Management (if you are listening): Put your IT department on a 5-7 year refresh cycle.
That's not at all the issue here. There are three real issues, and none of them is related to the age of the equipment; this could have happened on day one with new gear.
- Using low end gear that isn't designed for high reliability when high reliability is needed (you said that $2,400 for data recovery was a drop in the bucket, and yet they chose gear that doesn't reflect that financial reality). Your SAN is around the home line; it's not something I would use in any production scenario.
- Using an appliance without support. This is way below the home line.
- Using an architecture that is designed to be ultra risky without benefit. (You addressed this; just pointing it out again.)
Fix any one of those three mistakes and the issue would have been avoided.
-
@NerdyDad said in Replacing the Dead IPOD, SAN Bit the Dust:
My management has decided not to check out hyperconvergence and is sticking with the IPOD scheme for now. We are going to be reusing one of our EQLs for replication of data from the Compellent SAN. However, I want to note a point from one of @scottalanmiller's videos: added complexity does not increase resiliency in the network, it just gives Murphy's Law more to work with. If it can fail, it will fail.
Wow, so at this point, they are committed to the idea that their systems aren't valuable. If I were the CEO, this is where I'd be investigating to see what is going on: where is the money flowing for these SANs, and why would someone be spending so much money to put the company at risk?
-
Moving to the Compellent definitely helps, but it retains all of the core problems.
-
The big thing that I would ask.... why didn't they do a post mortem to determine what went wrong? This was pretty huge and it sounds like they learned nothing from it and are burying their heads in the sand.
There should be a team investigation to determine how things got so bad. Finding the technical issues that I listed above would get them to a proximate tech failure point. But then questions need to be asked about how those mistakes happened. There is a decision making problem somewhere in management, and it sounds like it is being ignored completely. It's known to exist, but I'm guessing that no one is checking on it at all. How do they expect to improve as a company if they ignore these problems? Not only does that avoid improvement, but in a way it rewards bad practices. There's no risk in screwing over the company by doing a bad job; no one will even mention it, I'm guessing.
In a healthy environment, there should be a team probing to figure out how things got this bad. Was it because someone in management doesn't know tech but injected an opinion? Did a tech person make a mistake? Was someone not doing their job and hoping that a sales guy would do it for free for them? Did someone get a kickback (more common than you'd think)?
-
@scottalanmiller One of those was a technician mistake: neglecting the alerts from the SAN. As said before, the SAN was throwing disk failure errors. 2 disks had already failed and the SAN was trying to rebuild onto the spares that it had. During this rebuild, 2 other disks were also wanting to fail, but the SAN controllers were not allowing them to fail.
I'm trying to start better practices myself by checking in on these systems on a daily basis to make sure there are no actions that need to be taken before alerts lead to issues.
We're only a 4-man team covering these 3 locations: IT Manager (Boss), SysAdmin (Me), and 2 other guys in helpdesk. Not trying to promote laziness or anything, but I also can't monitor systems 24/7 or I'll find myself divorced and crazy real quick. I suppose there is a way to have a system monitor other systems and alert me if certain conditions arise? I assume it would be based on something like SNMPv3? Any recommendations?
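To make the "system monitoring other systems" idea concrete, here is a minimal sketch of the decision logic such a watchdog might run on each poll. Everything in it is an assumption for illustration: the `ArrayStatus` fields, the `severity` rules, and the array names are hypothetical, and the values would have to be filled in from whatever the SAN actually exposes (SNMP, a vendor CLI, or an API). It only shows how a state like "rebuilding with more disks predicted to fail" could be escalated automatically instead of waiting for someone to log in and look.

```python
from dataclasses import dataclass


@dataclass
class ArrayStatus:
    """Hypothetical snapshot of one storage array, filled in by a poller."""
    name: str
    failed_disks: int
    predicted_failures: int  # disks reporting errors but not yet failed
    spares_left: int
    rebuilding: bool


def severity(s: ArrayStatus) -> str:
    """Classify an array as 'ok', 'warning', or 'critical' (assumed rules)."""
    # Failed disks with no spare left to rebuild onto.
    if s.failed_disks > 0 and s.spares_left == 0:
        return "critical"
    # A rebuild in progress while more disks are predicted to fail.
    if s.rebuilding and s.predicted_failures > 0:
        return "critical"
    # Any single sign of trouble still deserves a look.
    if s.failed_disks > 0 or s.predicted_failures > 0 or s.rebuilding:
        return "warning"
    return "ok"


def needs_attention(statuses):
    """Return (name, severity) for every array that is not 'ok'."""
    return [(s.name, severity(s)) for s in statuses if severity(s) != "ok"]
```

Run from a scheduler every few minutes, the result of `needs_attention` is what would be handed to an email or paging step, so nobody has to watch a console around the clock.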
-
Can the SANs fire off email alerts or SNMP traps or anything?
-
@dafyre Typically yes, but the storage consultant advised that we not connect the storage to the house network as it poses a security issue. My thinking is that if attackers are already inside the network, they are going to get to the data and through to the virtual environment anyway. If they are already in your network, then they are probably using either an admin account or a service account. Either way, they're getting in.
-
@NerdyDad said in Replacing the Dead IPOD, SAN Bit the Dust:
@dafyre Typically yes, but the storage consultant advised that we not connect the storage to the house network as it poses a security issue. My thinking is that if attackers are already inside the network, they are going to get to the data and through to the virtual environment anyway. If they are already in your network, then they are probably using either an admin account or a service account. Either way, they're getting in.
Typical recommendations I've seen are for there to be a management VLAN, and a separate VLAN for the actual storage traffic... But as you say, when hackers get in, you have bigger problems anyhow.
My 2c worth would be to set up the email alerts anyway... it will save you this pain later on down the road. I'd set it up on any SAN you have that has the option, lol.
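Most SANs can send these emails themselves once a mail relay is configured in the management interface. Where that isn't available, an external script can do the same job. The following is a rough sketch using only the Python standard library; the function names, addresses, subject format, and relay host are all made up for illustration, and it assumes an internal SMTP relay that accepts unauthenticated mail from the management network.

```python
import smtplib
from email.message import EmailMessage


def build_alert(san_name: str, problem: str, sender: str, recipient: str) -> EmailMessage:
    """Compose a plain-text alert email (hypothetical subject format)."""
    msg = EmailMessage()
    msg["Subject"] = f"[SAN ALERT] {san_name}: {problem}"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(
        f"Storage array {san_name} reported a problem:\n\n"
        f"  {problem}\n\n"
        "Check the array before the condition gets worse."
    )
    return msg


def send_alert(msg: EmailMessage, relay_host: str, relay_port: int = 25) -> None:
    """Hand the message to an SMTP relay (assumed to allow unauthenticated mail)."""
    with smtplib.SMTP(relay_host, relay_port, timeout=10) as smtp:
        smtp.send_message(msg)
```

A call such as `send_alert(build_alert("eql1", "2 disks failed", "san@example.com", "admin@example.com"), "mail.example.com")` would send the alert; pointing the recipient at a shared mailbox or distribution list keeps the alert from depending on one person.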
-
@dafyre said in Replacing the Dead IPOD, SAN Bit the Dust:
Can the SANs fire off email alerts or SNMP traps or anything?
Pretty much any device can do that.
-
@dafyre said in Replacing the Dead IPOD, SAN Bit the Dust:
Typical recommendations I've seen are for there to be a management VLAN, and a separate VLAN for the actual storage traffic... But as you say, when hackers get in, you have bigger problems anyhow.
Storage should always be a true physical SAN, not a VLAN SAN. VLAN is fine for security, but you want a physically separate SAN to make sure that the backplane does not get overloaded. Performance and reliability are why you keep the SAN physically separate.
-
@StrongBad said in Replacing the Dead IPOD, SAN Bit the Dust:
Storage should always be a true physical SAN, not a VLAN SAN. VLAN is fine for security, but you want a physically separate SAN to make sure that the backplane does not get overloaded. Performance and reliability are why you keep the SAN physically separate.
The recommendations I saw were to keep the actual SAN storage traffic separate from the rest of the network to improve performance and security.
-
@dafyre said in Replacing the Dead IPOD, SAN Bit the Dust:
The recommendations I saw were to keep the actual SAN storage traffic separate from the rest of the network to improve performance and security.
I've seen this too, mostly here and on SW. And by separate, I've read that to mean its own equipment with no VLANing. Heck, I'm pretty sure I've seen @scottalanmiller suggest Netgear layer 2 equipment because it's fast and cheap, with no bells and whistles to get in the way.
-
@dafyre said in Replacing the Dead IPOD, SAN Bit the Dust:
The recommendations I saw were to keep the actual SAN storage traffic separate from the rest of the network to improve performance and security.
Really separate, not VLAN separate. VLAN traffic is commingled.
-
@Dashrender said in Replacing the Dead IPOD, SAN Bit the Dust:
I've seen this too, mostly here and on SW. And by separate, I've read that to mean its own equipment with no VLANing. Heck, I'm pretty sure I've seen @scottalanmiller suggest Netgear layer 2 equipment because it's fast and cheap, with no bells and whistles to get in the way.
Yes, it's been a long time, but Netgear Prosafe unmanaged in lab tests was the fastest on the market like six or seven years ago. $300 switches outperforming $10,000 switches.
-
Also the needs of a SAN are different than the needs of a LAN. So you likely want different switches. I'd love Netgear Prosafe unmanaged on my SAN but would generally prefer Ubiquiti EdgeSwitches on my LAN.
-
@scottalanmiller said in Replacing the Dead IPOD, SAN Bit the Dust:
Also the needs of a SAN are different than the needs of a LAN. So you likely want different switches. I'd love Netgear Prosafe unmanaged on my SAN but would generally prefer Ubiquiti EdgeSwitches on my LAN.
Any opinion on Unifi Switches yet?
-
@Dashrender said in Replacing the Dead IPOD, SAN Bit the Dust:
Any opinion on Unifi Switches yet?
We use one in the lab and it's been great, but we aren't pushing its limits or anything.