XenServer 6.2 servers down. I have no Xen skill. Most likely networking? Help!
-
@scottalanmiller said in XenServer 6.2 servers down. I have no Xen skill. Most likely networking? Help!:
Can you get to the console of the SAN (PowerVault) to see what it thinks is going on? Is it healthy itself and this is just a XenServer issue... or has storage failed?
To answer this question: Xen2 was working fine, and the VMs hosted by Xen2 were working fine for most of yesterday, until I was commanded to shut everything down and bring it back up. It's not just the iSCSI storage that's unplugged. Local storage is also unplugged, as are DVD Drives and Removable Storage. See attached screenshot.
-
Oh okay, that doesn't sound like a SAN problem then.
-
I am logged into the SAN. I can easily see that there's been a virtual disk failure, but that's been there for a while. I'm not hugely familiar with this interface, but nothing immediately pops out as an issue.
-
Can you tell what has changed between working conditions and now? I know you said that you're at 6.2, but have any updates been applied?
Also, would it be possible to back up the configs for the hosts and reinstall XS from scratch, then restore the configs?
-
Also, have you tried shutting both down and bringing up Xen2, since it's the pool master? Maybe Xen1 is trying to take over, thinking that Xen2 is down.
-
@NerdyDad said in XenServer 6.2 servers down. I have no Xen skill. Most likely networking? Help!:
Can you tell what has changed between working conditions and now? I know you said that you're at 6.2, but have any updates been applied?
Also, would it be possible to back up the configs for the hosts and reinstall XS from scratch, then restore the configs?
@NerdyDad To my knowledge, from the time it was working, last Thursday/Friday, until I walked in the door yesterday morning and found issues, there had been no changes. I'm positive no updates have been done.
With regards to the backups, I'm sure I could try that. Do you have a link handy with a procedure for that, by any chance?
@NerdyDad said in XenServer 6.2 servers down. I have no Xen skill. Most likely networking? Help!:
Also, have you tried shutting both down and bringing up Xen2, since it's the pool master? Maybe Xen1 is trying to take over, thinking that Xen2 is down.
That's the last thing I'd tried before posting here. I couldn't bring Xen2 out of maintenance mode, nor reconnect storage.
-
The network issue concerns me the most. That's going to cause everything to fail on its own.
-
Can you ping the SAN from either XS?
-
I think that we need to look at the logs. Do you have access to get onto the machines themselves via SSH?
-
@Dashrender said in XenServer 6.2 servers down. I have no Xen skill. Most likely networking? Help!:
Can you ping the SAN from either XS?
I can from Xen2, SAN IPs of .100 and .101 on the mgmt network. I have console control of Xen2 from XenCenter, but Xen1 requires me to run up to the server room. Bear with me.
-
@scottalanmiller said in XenServer 6.2 servers down. I have no Xen skill. Most likely networking? Help!:
I think that we need to look at the logs. Do you have access to get onto the machines themselves via SSH?
I haven't tried yet. Let me get hold of PuTTY.
-
@scottalanmiller said in XenServer 6.2 servers down. I have no Xen skill. Most likely networking? Help!:
I think that we need to look at the logs. Do you have access to get onto the machines themselves via SSH?
I'm logged into Xen2 with SSH now.
-
@CitrixNewbJD said in XenServer 6.2 servers down. I have no Xen skill. Most likely networking? Help!:
@scottalanmiller said in XenServer 6.2 servers down. I have no Xen skill. Most likely networking? Help!:
I think that we need to look at the logs. Do you have access to get onto the machines themselves via SSH?
I'm logged into Xen2 with SSH now.
Great. Let's get as many logs as we can. Likely what we need to know is going to be in there. The logs are under /var/log
-
The messages log and any log with "xen" in the name will be the most useful.
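As a rough sketch of how to pick those files out of the directory listing, something like the following could be used. The function name `list_useful_logs` is made up for illustration; it only wraps `ls` and `grep`, and it assumes the advice above means "messages" (including rotated copies such as messages.1) plus anything with "xen" in the name.

```shell
# Hypothetical helper: list the log files worth looking at first.
# Matches names starting with "messages" or containing "xen" (any case).
list_useful_logs() {
    dir="${1:-/var/log}"    # directory to scan; defaults to /var/log
    ls -1 "$dir" | grep -i -E '^messages|xen'
}
```

Run it as `list_useful_logs` on the host, or pass another directory as the first argument.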
-
@scottalanmiller said in XenServer 6.2 servers down. I have no Xen skill. Most likely networking? Help!:
@CitrixNewbJD said in XenServer 6.2 servers down. I have no Xen skill. Most likely networking? Help!:
@scottalanmiller said in XenServer 6.2 servers down. I have no Xen skill. Most likely networking? Help!:
I think that we need to look at the logs. Do you have access to get onto the machines themselves via SSH?
I'm logged into Xen2 with SSH now.
Great. Let's get as many logs as we can. Likely what we need to know is going to be in there. The logs are under /var/log
Forgive me for the basic knowledge, how do I access logs from here? The first time I did anything with the Xen CLI was yesterday.
-
@CitrixNewbJD said in XenServer 6.2 servers down. I have no Xen skill. Most likely networking? Help!:
@scottalanmiller said in XenServer 6.2 servers down. I have no Xen skill. Most likely networking? Help!:
@CitrixNewbJD said in XenServer 6.2 servers down. I have no Xen skill. Most likely networking? Help!:
@scottalanmiller said in XenServer 6.2 servers down. I have no Xen skill. Most likely networking? Help!:
I think that we need to look at the logs. Do you have access to get onto the machines themselves via SSH?
I'm logged into Xen2 with SSH now.
Great. Let's get as many logs as we can. Likely what we need to know is going to be in there. The logs are under /var/log
Forgive me for the basic knowledge, how do I access logs from here? The first time I did anything with the Xen CLI was yesterday.
This is a standard BASH environment. Nothing "Xen" needs to be known.
To get to the log directory:
cd /var/log
ls
That will list out the available logs.
-
Then use a command like this to get just the end of the main log file:
tail -n 100 messages
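If the last 100 lines turn out to be noisy, one option is to keep only the lines that mention an error or a failure. This is a minimal sketch, not a XenServer-specific tool: `tail_errors` is a hypothetical name, and the "error|fail" pattern is just a guess at what is interesting, so widen or narrow it as needed.

```shell
# Hypothetical helper: show only the error/failure lines from the end of a log.
tail_errors() {
    logfile="${1:-/var/log/messages}"   # log to read; defaults to the main log
    lines="${2:-100}"                   # how many trailing lines to inspect
    tail -n "$lines" "$logfile" | grep -i -E 'error|fail'
}
```

For example, `tail_errors /var/log/messages 500` looks at the last 500 lines instead of 100. It prints nothing if no line matches.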
-
@scottalanmiller said in XenServer 6.2 servers down. I have no Xen skill. Most likely networking? Help!:
@CitrixNewbJD said in XenServer 6.2 servers down. I have no Xen skill. Most likely networking? Help!:
@scottalanmiller said in XenServer 6.2 servers down. I have no Xen skill. Most likely networking? Help!:
@CitrixNewbJD said in XenServer 6.2 servers down. I have no Xen skill. Most likely networking? Help!:
@scottalanmiller said in XenServer 6.2 servers down. I have no Xen skill. Most likely networking? Help!:
I think that we need to look at the logs. Do you have access to get onto the machines themselves via SSH?
I'm logged into Xen2 with SSH now.
Great. Let's get as many logs as we can. Likely what we need to know is going to be in there. The logs are under /var/log
Forgive me for the basic knowledge, how do I access logs from here? The first time I did anything with the Xen CLI was yesterday.
This is a standard BASH environment. Nothing "Xen" needs to be known.
To get to the log directory:
cd /var/log
ls
That will list out the available logs.
Yeah, I've heard of BASH, but I've never used it. I've been fortunate enough (or unfortunate, as the case seems to be today) to have been in primarily GUI environments. Here are the screenshots of the log listings.
-
@scottalanmiller said in XenServer 6.2 servers down. I have no Xen skill. Most likely networking? Help!:
Then use a command like this to get just the end of the main log file:
tail -n 100 messages
Dec 27 10:23:04 xen1 multipathd: 360024e800070ed06000007f54dba7bfb: sdh - rdac checker reports path is up
Dec 27 10:23:04 xen1 multipathd: Path event for 360024e800070ed06000007f54dba7bfb, request call of mpathcount
Dec 27 10:23:04 xen1 multipathd: 8:112: reinstated
Dec 27 10:23:04 xen1 multipathd: 360024e800070ed06000007f54dba7bfb: queue_if_no_path enabled
Dec 27 10:23:04 xen1 multipathd: 360024e800070ed06000007f54dba7bfb: Recovered to normal mode
Dec 27 10:23:04 xen1 multipathd: 360024e800070ed06000007f54dba7bfb: remaining active paths: 1
Dec 27 10:23:04 xen1 multipathd: 360024e800070ed06000007f54dba7bfb: switch to path group #2
Dec 27 10:23:05 xen1 kernel: [ 6268.202264] sd 5:0:0:3: [sdh] Unhandled sense code
Dec 27 10:23:05 xen1 kernel: [ 6268.202267] sd 5:0:0:3: [sdh] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Dec 27 10:23:05 xen1 kernel: [ 6268.202269] sd 5:0:0:3: [sdh] Sense Key : Hardware Error [current]
Dec 27 10:23:05 xen1 kernel: [ 6268.202272] sd 5:0:0:3: [sdh] <<vendor>> ASC=0x84 ASCQ=0x0ASC=0x84 ASCQ=0x0
Dec 27 10:23:05 xen1 kernel: [ 6268.202275] sd 5:0:0:3: [sdh] CDB: Read(16): 88 00 00 00 00 01 6e 59 1f 80 00 00 00 08 00 00
Dec 27 10:23:05 xen1 kernel: [ 6268.202282] end_request: I/O error, dev sdh, sector 6146301824
Dec 27 10:23:05 xen1 kernel: [ 6268.202298] device-mapper: multipath: Failing path 8:112.
Dec 27 10:23:05 xen1 multipathd: 8:112: mark as failed
Dec 27 10:23:05 xen1 multipathd: 360024e800070ed06000007f54dba7bfb: Entering recovery mode: max_retries=15
Dec 27 10:23:05 xen1 multipathd: 360024e800070ed06000007f54dba7bfb: remaining active paths: 0
Dec 27 10:23:05 xen1 multipathd: Path event for 360024e800070ed06000007f54dba7bfb, request call of mpathcount
Dec 27 10:23:05 xen1 multipathd: 360024e800070ed06000007f54dba7bfb: sde - rdac checker reports path is ghost
Dec 27 10:23:05 xen1 multipathd: Path event for 360024e800070ed06000007f54dba7bfb, request call of mpathcount
Dec 27 10:23:05 xen1 multipathd: 8:64: reinstated
Dec 27 10:23:05 xen1 multipathd: 360024e800070ed06000007f54dba7bfb: queue_if_no_path enabled
Dec 27 10:23:05 xen1 multipathd: 360024e800070ed06000007f54dba7bfb: Recovered to normal mode
Dec 27 10:23:05 xen1 multipathd: 360024e800070ed06000007f54dba7bfb: remaining active paths: 1
Dec 27 10:23:05 xen1 multipathd: 360024e800070ed06000007f54dba7bfb: switch to path group #1
Dec 27 10:23:05 xen1 multipathd: 360024e800070ed06000007f54dba7bfb: sdn - rdac checker reports path is ghost
Dec 27 10:23:05 xen1 multipathd: Path event for 360024e800070ed06000007f54dba7bfb, request call of mpathcount
Dec 27 10:23:05 xen1 multipathd: 8:208: reinstated
Dec 27 10:23:05 xen1 multipathd: 360024e800070ed06000007f54dba7bfb: remaining active paths: 2
Dec 27 10:23:05 xen1 kernel: [ 6268.701997] sd 4:0:0:3: rdac: array MDS-Spindle01, ctlr 0, queueing MODE_SELECT command
Dec 27 10:23:06 xen1 kernel: [ 6268.970907] sd 4:0:0:3: rdac: array MDS-Spindle01, ctlr 0, MODE_SELECT completed
Dec 27 10:23:06 xen1 xapi: [ info|xen1|131517 UNIX /var/xapi/xapi|session.login_with_password D:e266c465d84c|xapi] Session.create trackid=cc8a9f3a4e457a8654c34f1b256320d4 pool=false uname=root is_local_superuser=true auth_user_sid= parent=trackid=9834f5af41c964e225f24279aefe4e49
Dec 27 10:23:06 xen1 kernel: [ 6269.499770] sd 7:0:0:3: [sdn] Unhandled sense code
Dec 27 10:23:06 xen1 kernel: [ 6269.499773] sd 7:0:0:3: [sdn] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Dec 27 10:23:06 xen1 kernel: [ 6269.499775] sd 7:0:0:3: [sdn] Sense Key : Hardware Error [current]
Dec 27 10:23:06 xen1 kernel: [ 6269.499778] sd 7:0:0:3: [sdn] <<vendor>> ASC=0x84 ASCQ=0x0ASC=0x84 ASCQ=0x0
Dec 27 10:23:06 xen1 kernel: [ 6269.499781] sd 7:0:0:3: [sdn] CDB: Read(16): 88 00 00 00 00 01 6e 59 1f 80 00 00 00 08 00 00
Dec 27 10:23:06 xen1 kernel: [ 6269.499788] end_request: I/O error, dev sdn, sector 6146301824
Dec 27 10:23:06 xen1 kernel: [ 6269.499805] device-mapper: multipath: Failing path 8:208.
Dec 27 10:23:06 xen1 multipathd: 8:208: mark as failed
Dec 27 10:23:06 xen1 multipathd: 360024e800070ed06000007f54dba7bfb: remaining active paths: 1
Dec 27 10:23:06 xen1 multipathd: Path event for 360024e800070ed06000007f54dba7bfb, request call of mpathcount
Dec 27 10:23:07 xen1 kernel: [ 6270.032558] sd 4:0:0:3: [sde] Unhandled sense code
Dec 27 10:23:07 xen1 kernel: [ 6270.032562] sd 4:0:0:3: [sde] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Dec 27 10:23:07 xen1 kernel: [ 6270.032566] sd 4:0:0:3: [sde] Sense Key : Hardware Error [current]
Dec 27 10:23:07 xen1 kernel: [ 6270.032571] sd 4:0:0:3: [sde] <<vendor>> ASC=0x84 ASCQ=0x0ASC=0x84 ASCQ=0x0
Dec 27 10:23:07 xen1 kernel: [ 6270.032577] sd 4:0:0:3: [sde] CDB: Read(16): 88 00 00 00 00 01 6e 59 1f 80 00 00 00 08 00 00
Dec 27 10:23:07 xen1 kernel: [ 6270.032589] end_request: I/O error, dev sde, sector 6146301824
Dec 27 10:23:07 xen1 kernel: [ 6270.032610] device-mapper: multipath: Failing path 8:64.
Dec 27 10:23:07 xen1 xapi: [ info|xen1|131545 UNIX /var/xapi/xapi|session.login_with_password D:d69e66c86e06|xapi] Session.create trackid=39eeefd838b77788b3ccc71c51581cd6 pool=false uname=root is_local_superuser=true auth_user_sid= parent=trackid=9834f5af41c964e225f24279aefe4e49
Dec 27 10:23:07 xen1 multipathd: 8:64: mark as failed
Dec 27 10:23:07 xen1 multipathd: 360024e800070ed06000007f54dba7bfb: Entering recovery mode: max_retries=15
Dec 27 10:23:07 xen1 multipathd: 360024e800070ed06000007f54dba7bfb: remaining active paths: 0
Dec 27 10:23:07 xen1 multipathd: Path event for 360024e800070ed06000007f54dba7bfb, request call of mpathcount
Dec 27 10:23:08 xen1 multipathd: 360024e800070ed06000007f54dba7bfb: Entering recovery mode: max_retries=15
Dec 27 10:23:08 xen1 ntpd[6793]: sendto(208.75.88.4) (fd=18): Invalid argument
Dec 27 10:23:08 xen1 xapi: [ info|xen1|131573 UNIX /var/xapi/xapi|session.login_with_password D:b7fd4c2e410c|xapi] Session.create trackid=4706ed452324a154ea9a85d6841d8507 pool=false uname=root is_local_superuser=true auth_user_sid= parent=trackid=9834f5af41c964e225f24279aefe4e49
Dec 27 10:23:09 xen1 multipathd: 360024e800070ed06000007f54dba7bfb: sdk - rdac checker reports path is ghost
Dec 27 10:23:09 xen1 multipathd: Path event for 360024e800070ed06000007f54dba7bfb, request call of mpathcount
Dec 27 10:23:09 xen1 multipathd: 8:160: reinstated
Dec 27 10:23:09 xen1 multipathd: 360024e800070ed06000007f54dba7bfb: queue_if_no_path enabled
Dec 27 10:23:09 xen1 multipathd: 360024e800070ed06000007f54dba7bfb: Recovered to normal mode
Dec 27 10:23:09 xen1 multipathd: 360024e800070ed06000007f54dba7bfb: remaining active paths: 1
Dec 27 10:23:09 xen1 multipathd: 360024e800070ed06000007f54dba7bfb: switch to path group #3
Dec 27 10:23:09 xen1 kernel: [ 6271.802004] sd 6:0:0:3: rdac: array MDS-Spindle01, ctlr 1, queueing MODE_SELECT command
Dec 27 10:23:09 xen1 xapi: [ info|xen1|131597 UNIX /var/xapi/xapi|session.login_with_password D:f77b1779ddc8|xapi] Session.create trackid=85bc3f2b6b5d775d09f80b9eba992380 pool=false uname=root is_local_superuser=true auth_user_sid= parent=trackid=9834f5af41c964e225f24279aefe4e49
Dec 27 10:23:09 xen1 kernel: [ 6272.077698] sd 6:0:0:3: rdac: array MDS-Spindle01, ctlr 1, MODE_SELECT completed
Dec 27 10:23:09 xen1 kernel: [ 6272.597687] sd 6:0:0:3: [sdk] Unhandled sense code
Dec 27 10:23:09 xen1 kernel: [ 6272.597690] sd 6:0:0:3: [sdk] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Dec 27 10:23:09 xen1 kernel: [ 6272.597692] sd 6:0:0:3: [sdk] Sense Key : Hardware Error [current]
Dec 27 10:23:09 xen1 kernel: [ 6272.597695] sd 6:0:0:3: [sdk] <<vendor>> ASC=0x84 ASCQ=0x0ASC=0x84 ASCQ=0x0
Dec 27 10:23:09 xen1 kernel: [ 6272.597699] sd 6:0:0:3: [sdk] CDB: Read(16): 88 00 00 00 00 01 6e 59 1f 80 00 00 00 08 00 00
Dec 27 10:23:09 xen1 kernel: [ 6272.597705] end_request: I/O error, dev sdk, sector 6146301824
Dec 27 10:23:09 xen1 kernel: [ 6272.597722] device-mapper: multipath: Failing path 8:160.
Dec 27 10:23:09 xen1 multipathd: 8:160: mark as failed
Dec 27 10:23:09 xen1 multipathd: 360024e800070ed06000007f54dba7bfb: Entering recovery mode: max_retries=15
Dec 27 10:23:09 xen1 multipathd: 360024e800070ed06000007f54dba7bfb: remaining active paths: 0
Dec 27 10:23:09 xen1 multipathd: Path event for 360024e800070ed06000007f54dba7bfb, request call of mpathcount
Dec 27 10:23:10 xen1 multipathd: 360024e800070ed06000007f54dba7bfb: Entering recovery mode: max_retries=15
Dec 27 10:23:10 xen1 multipathd: 360024e800070ed06000007f54dba7bfb: sdh - rdac checker reports path is up
Dec 27 10:23:10 xen1 multipathd: Path event for 360024e800070ed06000007f54dba7bfb, request call of mpathcount
Dec 27 10:23:10 xen1 multipathd: 8:112: reinstated
Dec 27 10:23:10 xen1 multipathd: 360024e800070ed06000007f54dba7bfb: queue_if_no_path enabled
Dec 27 10:23:10 xen1 multipathd: 360024e800070ed06000007f54dba7bfb: Recovered to normal mode
Dec 27 10:23:10 xen1 multipathd: 360024e800070ed06000007f54dba7bfb: remaining active paths: 1
Dec 27 10:23:10 xen1 xapi: [ info|xen1|131625 UNIX /var/xapi/xapi|session.login_with_password D:9820afdb1736|xapi] Session.create trackid=e1073887bd1d5f6e1a101d7313609e20 pool=false uname=root is_local_superuser=true auth_user_sid= parent=trackid=9834f5af41c964e225f24279aefe4e49
Dec 27 10:23:10 xen1 xcp-rrdd: [error|xen1|0 monitor|main|rrdd_server] Failed to process plugin: xcp-rrdd-xenpm (Rrdd_server.Plugin.No_update)
Dec 27 10:23:10 xen1 multipathd: 360024e800070ed06000007f54dba7bfb: switch to path group #2
Dec 27 10:23:10 xen1 kernel: [ 6273.763175] sd 5:0:0:3: [sdh] Unhandled sense code
Dec 27 10:23:10 xen1 kernel: [ 6273.763179] sd 5:0:0:3: [sdh] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Dec 27 10:23:10 xen1 kernel: [ 6273.763183] sd 5:0:0:3: [sdh] Sense Key : Hardware Error [current]
Dec 27 10:23:10 xen1 kernel: [ 6273.763187] sd 5:0:0:3: [sdh] <<vendor>> ASC=0x84 ASCQ=0x0ASC=0x84 ASCQ=0x0
Dec 27 10:23:10 xen1 kernel: [ 6273.763194] sd 5:0:0:3: [sdh] CDB: Read(16): 88 00 00 00 00 01 6e 59 1f 80 00 00 00 08 00 00
Dec 27 10:23:10 xen1 kernel: [ 6273.763206] end_request: I/O error, dev sdh, sector 6146301824
Dec 27 10:23:10 xen1 kernel: [ 6273.763228] device-mapper: multipath: Failing path 8:112.
Dec 27 10:23:11 xen1 multipathd: 8:112: mark as failed
Dec 27 10:23:11 xen1 multipathd: 360024e800070ed06000007f54dba7bfb: Entering recovery mode: max_retries=15
Dec 27 10:23:11 xen1 multipathd: 360024e800070ed06000007f54dba7bfb: remaining active paths: 0
Dec 27 10:23:11 xen1 multipathd: Path event for 360024e800070ed06000007f54dba7bfb, request call of mpathcount
Dec 27 10:23:11 xen1 multipathd: 360024e800070ed06000007f54dba7bfb: Entering recovery mode: max_retries=15
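
In this excerpt, the kernel reports an I/O error at the same sector, 6146301824, on sdh, sdn, sde and sdk, and multipathd fails each path right after it is reinstated. To check whether that pattern holds across the whole log rather than just the last 100 lines, a small sketch like the one below could count the I/O errors per device and sector. `io_errors_by_device` is a made-up name, and the pattern assumes the kernel message format shown in this log ("I/O error, dev X, sector N").

```shell
# Hypothetical helper: count kernel I/O errors per (device, sector) pair.
# Output columns: count, device, sector.
io_errors_by_device() {
    logfile="${1:-/var/log/messages}"   # log to scan; defaults to the main log
    grep -o 'I/O error, dev [a-z]*, sector [0-9]*' "$logfile" |
        awk '{ sub(",", "", $4); print $4, $6 }' |   # strip the comma after the device name
        sort | uniq -c
}
```

Running `io_errors_by_device /var/log/messages` prints one line per device and sector, so a single sector repeated across every path stands out immediately.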