XenServer 6.2 servers down. I have no Xen skill. Most likely networking? Help!
-
Your predecessor definitely pulled this on you: https://mangolassi.it/topic/11852/why-it-builds-a-house-of-cards
-
Looks like, on top of other problems, the SAN has died. It's hard to tell from this, but it looks like those are the LUNs that hold all of your VMs?
-
So 2 drives failed at once? You should be able to go into the server room and see some sort of blinky light pattern that indicates what/how many drives are gone.
Did you lose a RAID Controller?
-
Dear God I pray that you have backups outside of the environment. Please tell me that you do. Another NAS, tapes, diskettes, something?
-
@momurda said in XenServer 6.2 servers down. I have no Xen skill. Most likely networking? Help!:
So 2 drives failed at once? You should be able to go into the server room and see some sort of blinky light pattern that indicates what/how many drives are gone.
Did you lose a RAID Controller?
It's a dual controller device. So in theory it should fail over. But in reality, they rarely do.
-
@NerdyDad said in XenServer 6.2 servers down. I have no Xen skill. Most likely networking? Help!:
Dear God I pray that you have backups outside of the environment. Please tell me that you do. Another NAS, tapes, diskettes, something?
At this point, recovering from backup to a new cluster might be the best way to go. The SAN is worthless if the arrays have failed. And the local servers probably don't have the necessary storage to run without it. If the array is really lost, the old hardware has probably dropped to a zero value level. Time to get something new in and recover to that ASAP.
-
@scottalanmiller said in XenServer 6.2 servers down. I have no Xen skill. Most likely networking? Help!:
@momurda said in XenServer 6.2 servers down. I have no Xen skill. Most likely networking? Help!:
So 2 drives failed at once? You should be able to go into the server room and see some sort of blinky light pattern that indicates what/how many drives are gone.
Did you lose a RAID Controller?
It's a dual controller device. So in theory it should fail over. But in reality, they rarely do.
But if drives are lost, that won't help.
-
Isn't this saying the virtual drives for each failed? This should be different than a physical drive failure, right? Or am I reading something wrong?
-
@scottalanmiller said in XenServer 6.2 servers down. I have no Xen skill. Most likely networking? Help!:
@CitrixNewbJD said in XenServer 6.2 servers down. I have no Xen skill. Most likely networking? Help!:
Having been through this once before, and learning the hard way, I do normally have a physical DC.
This is absolutely the wrong response. You should never have a physical DC, ever. There are zero issues here with virtualization. There are two problems....
- Zero AD redundancy
- An inverted pyramid of doom (single storage for all systems)
Fixing either of those anti-practices would have saved you. Physical would have zero benefit and is the polar opposite of the reaction that you should have.
Having a physical in this situation would have probably saved him. That said, I agree it's not the solution. If you really wanted to have a DC outside this cluster, fine, but you still virtualize that third server, then install a DC on that.
-
@seal said in XenServer 6.2 servers down. I have no Xen skill. Most likely networking? Help!:
Isn't this saying the virtual drives for each failed? This should be different than a physical drive failure, right? Or am I reading something wrong?
Well, yes and no. You are correct. The warning is that the LDs have failed. But the LDs fail when their underlying array fails. That underlying array is built on physical drives. So for the LDs to fail, it means that the array(s) that they share has failed, which means that the drives it has in its pool have failed. Or that both controllers have failed. In this case, since two utility LUNs are still hanging around, we are guessing that the controller(s) are intact and only the array has failed.
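The reasoning above can be written as a small decision rule. This is only a sketch of that inference, assuming all you can observe is which LUNs are still visible; the function name and the LUN names are hypothetical and nothing here talks to a real SAN.

```python
from typing import Dict


def diagnose(lun_visible: Dict[str, bool]) -> str:
    """Guess what failed from which LUNs are still visible.

    lun_visible maps a LUN name to True if the SAN still presents it.
    This mirrors the reasoning in the post; it is not a vendor diagnostic.
    """
    if not lun_visible:
        return "no data"
    if all(lun_visible.values()):
        return "all LUNs visible: no array or controller failure indicated"
    if not any(lun_visible.values()):
        # Nothing is presented, so a dead array and dead controllers look the same.
        return "no LUNs visible: cannot tell array failure from controller failure"
    # Something is still being served, so at least one controller is working.
    return ("some LUNs visible: controller(s) likely intact, "
            "array behind the missing LUNs likely failed")


# Hypothetical names standing in for the failed LDs and the two utility LUNs.
status = {"vm-lun-1": False, "vm-lun-2": False, "utility-1": True, "utility-2": True}
print(diagnose(status))
```

The mixed case is the one described in this thread: the VM LUNs are gone but the utility LUNs still answer, which points at the array rather than the controllers.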
-
@Dashrender said in XenServer 6.2 servers down. I have no Xen skill. Most likely networking? Help!:
Having a physical in this situation would have probably saved him.
Don't feed the crazy. Physical can never save you. You are mixing assumptions to come to the wrong conclusion. Physical will never help. What helps is separate storage.
Physical with shared storage = fail just the same.
Physical with separate storage = just fine.
Virtual with shared storage = fail just the same.
Virtual with separate storage = just fine.
As you can see, physical vs virtual is unrelated. It's all about the storage separation and nothing else.
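The four cases above work like a truth table, and a few lines of Python make the point explicit. This is a minimal sketch; the function name and the two boolean inputs are assumptions made for illustration, not a model of any real environment.

```python
from itertools import product


def survives_san_failure(virtual: bool, separate_storage: bool) -> bool:
    """Return True if the DC stays up when the shared SAN dies.

    `virtual` is accepted but deliberately unused: per the four cases
    above, only storage separation affects the outcome.
    """
    return separate_storage


# Walk all four combinations of physical/virtual and shared/separate storage.
for virtual, separate in product([False, True], repeat=2):
    platform = "Virtual" if virtual else "Physical"
    storage = "separate" if separate else "shared"
    outcome = "just fine" if survives_san_failure(virtual, separate) else "fail just the same"
    print(f"{platform} with {storage} storage = {outcome}")
```

Flipping `virtual` never changes the result, which is the whole argument.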
-
@Dashrender said in XenServer 6.2 servers down. I have no Xen skill. Most likely networking? Help!:
That said, I agree it's not the solution. If you really wanted to have a DC outside this cluster, fine, but you still virtualize that third server, then install a DC on that.
While I generally agree that "outside the cluster" is good in extreme cases where you have extreme levels of AD dependencies, that's not necessary. Same cluster with different storage is all that is needed. Same scenario on a Scale cluster, for example, would not have a problem even being on a single cluster. Having "inter-cluster" protection is good, but a whole level beyond what is needed here.
-
@scottalanmiller said in XenServer 6.2 servers down. I have no Xen skill. Most likely networking? Help!:
@Dashrender said in XenServer 6.2 servers down. I have no Xen skill. Most likely networking? Help!:
That said, I agree it's not the solution. If you really wanted to have a DC outside this cluster, fine, but you still virtualize that third server, then install a DC on that.
While I generally agree that "outside the cluster" is good in extreme cases where you have extreme levels of AD dependencies, that's not necessary. Same cluster with different storage is all that is needed. Same scenario on a Scale cluster, for example, would not have a problem even being on a single cluster. Having "inter-cluster" protection is good, but a whole level beyond what is needed here.
Right, just don't set up circular requirements and you should be fine - sure it means having an extra set of credentials, but compared to everything else you need if you don't do that, probably not worth it.
-
@Dashrender said in XenServer 6.2 servers down. I have no Xen skill. Most likely networking? Help!:
@scottalanmiller said in XenServer 6.2 servers down. I have no Xen skill. Most likely networking? Help!:
@Dashrender said in XenServer 6.2 servers down. I have no Xen skill. Most likely networking? Help!:
That said, I agree it's not the solution. If you really wanted to have a DC outside this cluster, fine, but you still virtualize that third server, then install a DC on that.
While I generally agree that "outside the cluster" is good in extreme cases where you have extreme levels of AD dependencies, that's not necessary. Same cluster with different storage is all that is needed. Same scenario on a Scale cluster, for example, would not have a problem even being on a single cluster. Having "inter-cluster" protection is good, but a whole level beyond what is needed here.
Right, just don't set up circular requirements and you should be fine - sure it means having an extra set of credentials, but compared to everything else you need if you don't do that, probably not worth it.
That would protect against the one issue of circular dependencies. Obviously don't do that. But there is also the "single point of failure" risk that the SAN creates. A single cluster doesn't necessarily carry that risk either. An HC cluster (like Scale, Starwinds, Nutanix, Simplivity) doesn't have the single SAN dependency problem either. Both are major risks here, and both are 100% outage issues here. We just appear to have hit both at once.
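The circular dependency half of this can be checked mechanically if you write the dependencies down. Below is a rough sketch using a depth-first search for a cycle. The component names and the dependency map are hypothetical examples, not the actual layout of the environment in this thread, and it does not cover the single-point-of-failure half.

```python
from typing import Dict, List, Optional


def find_cycle(deps: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one dependency cycle as a list of names, or None if there is none."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    state = {node: WHITE for node in deps}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        state[node] = GREY
        path.append(node)
        for dep in deps.get(node, []):
            if state.get(dep, WHITE) == GREY:
                # dep is already on the current path, so we have looped back.
                return path[path.index(dep):] + [dep]
            if state.get(dep, WHITE) == WHITE:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        state[node] = BLACK
        return None

    for node in deps:
        if state[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None


# Hypothetical: SAN management logs in via AD, and AD lives on the SAN.
circular = {
    "san_login": ["active_directory"],
    "active_directory": ["dc_vm"],
    "dc_vm": ["san"],
    "san": ["san_login"],
}
# Same layout, but the SAN uses its own local credentials.
with_local_creds = dict(circular, san=[])

print(find_cycle(circular))
print(find_cycle(with_local_creds))
```

The second map is the "extra set of credentials" option: once the SAN no longer needs AD to let you in, the loop is gone, though everything still sits on one SAN.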
-
A lack of a backup and DR strategy was one of the things I was brought in to remediate. No. There are no backups.
-
Looks like I'm building a bunch of new servers.
-
@CitrixNewbJD said in XenServer 6.2 servers down. I have no Xen skill. Most likely networking? Help!:
A lack of a backup and DR strategy was one of the things I was brought in to remediate. No. There are no backups.
Oh no.
-
@CitrixNewbJD said in XenServer 6.2 servers down. I have no Xen skill. Most likely networking? Help!:
Looks like I'm building a bunch of new servers.
I'd say you're almost starting from scratch.
-
@NerdyDad On the plus side, this environment was so effed up, I'm not THAT terribly upset about starting from scratch. It'll be a beating to get a lot of that data back, however. Anyone know of any services able to accomplish such a thing?
-
@CitrixNewbJD said in XenServer 6.2 servers down. I have no Xen skill. Most likely networking? Help!:
@NerdyDad On the plus side, this environment was so effed up, I'm not THAT terribly upset about starting from scratch. It'll be a beating to get a lot of that data back, however. Anyone know of any services able to accomplish such a thing?
I ran into a similar issue with a failed drive almost 2 months ago. Almost lost the farm on it. I had a RAID6 on an EqualLogic. I called ACE Data Recovery here in Dallas. They told me to bring my last failed drive and a brand new drive of the same model and size to them. Overnight and $2,400 later, my SAN was limping along enough for me to get the data off of it and retire the SAN. I don't know where in Dallas you are, but it might be worth checking out. They may be able to help.
Ace Data Recovery
17778 Preston Rd
Dallas, TX 75252
(972) 528-6580
http://www.datarecovery.net/
They close at 7:00 tonight.