Windows Failover Cluster

dafyre

Hi All,

Got an oddball problem that kicked my tail for most of last week.

I have a Windows Failover Cluster set up for SQL Server 2012. I configured SQL Server and added both nodes, and everything was happy. However, when I try to (manually) fail over from Node1 to Node2, the system gives me an error when trying to mount the shared storage:

[Screenshot of the error message]

The computer account for the cluster (the cluster management point) has been granted Full Control on all files and folders on the disk. I have not yet given the individual cluster nodes Full Control on those files and folders. The disk's owner is marked as "SYSTEM".

When I move everything back to NODE1 again, the disk comes up with no problem. Anybody have any ideas what is going on here?

Lakshmana

Have you mounted the partition as a raw disk?

dafyre

@Lakshmana I did. I can mount it on one machine, but not another. The disk was formatted for NTFS.

PSX_Defector

iSCSI or FC or what?

Might need to ensure that the host can see the stores, sounds as though it might have a problem seeing the disk. I'm assuming it can see the quorum and such.

dafyre

@PSX_Defector Thanks for the reminder, it's an iSCSI connection.

I've narrowed the problem down to persistent reservations (SCSI-3 Persistent Reservations). Apparently, that option is turned off on my LUN. I don't have access to the storage systems to fix it yet, so I've got the next guy up the totem pole looking at it for me.
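For anyone following along, here is a rough toy model of the reservation handoff a failover depends on. It is only a sketch: the class and function names (ToyLun, fail_over) are made up for illustration, the logic is a heavy simplification of SCSI-3 Persistent Reservations, and it is not how the cluster service or the storage array actually implement it. It just shows that if the LUN rejects reservation commands, the second node has no way to take the disk over.

```python
class PersistentReservationError(Exception):
    """Raised when the toy LUN rejects a reservation command."""


class ToyLun:
    """Hypothetical, very simplified model of a LUN and its reservations."""

    def __init__(self, supports_pr):
        self.supports_pr = supports_pr
        self.registered = {}   # node name -> registration key
        self.holder = None     # node currently holding the reservation

    def register(self, node, key):
        # A node has to register a key before it can reserve anything.
        if not self.supports_pr:
            raise PersistentReservationError("PR commands not supported on this LUN")
        self.registered[node] = key

    def reserve(self, node):
        if node not in self.registered:
            raise PersistentReservationError(f"{node} is not registered")
        if self.holder not in (None, node):
            raise PersistentReservationError(f"reservation held by {self.holder}")
        self.holder = node

    def preempt(self, node, victim):
        # Take the reservation away from another registrant.
        if node not in self.registered:
            raise PersistentReservationError(f"{node} is not registered")
        self.registered.pop(victim, None)
        self.holder = node


def fail_over(lun, old_owner, new_owner, key):
    """Return True if new_owner ends up holding the reservation."""
    try:
        lun.register(new_owner, key)
        lun.preempt(new_owner, victim=old_owner)
    except PersistentReservationError:
        return False
    return lun.holder == new_owner


# LUN with persistent reservations enabled: the handoff succeeds.
good = ToyLun(supports_pr=True)
good.register("NODE1", 0x1)
good.reserve("NODE1")
print(fail_over(good, "NODE1", "NODE2", 0x2))

# LUN with persistent reservations disabled: NODE2 cannot even register.
print(fail_over(ToyLun(supports_pr=False), "NODE1", "NODE2", 0x2))
```

The model deliberately says nothing about why the disk comes online on NODE1; it only illustrates the handoff step.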

dafyre

If I remove the iSCSI disk from the cluster, I can bring it online and offline on either node with no trouble. The problem only happens when I have it as part of the failover cluster.

MattSpeller

All I can picture is some tiny dude waving flags around in your server

dafyre

@MattSpeller said:

All I can picture is some tiny dude waving flags around in your server

/me hides flags

What gives you that impression?

/me waits for you to look away and kicks server

PSX_Defector

@dafyre said:

If I remove the iSCSI disk from the cluster, I can bring it online and offline on either node with no trouble. The problem only happens when I have it as part of the failover cluster.

To both servers?

dafyre

No, lol. It will actually work fine on NODE1... But when I fail it over to NODE2, the disk resource won't come online.

If I remove the disk from the Role, and then remove it from the Available Disks pool, then I can mount it on either server with no problem.

When I add it back to the cluster again, it only works on NODE1...

PSX_Defector

That's what I was looking for. So the pathing isn't a problem, since you can mount on both servers one at a time.

Sounds to me like some kind of presentation issue. Fibre Channel does this very easily, iSCSI not as much. I would have my SAN guy spin me up a LUN, make sure it is presented to both servers, set up a new cluster resource, make sure it can fail over, then migrate your data over to it. Yeah, more of a pain, but at least you will know it will fail over properly.
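A minimal sketch of the "make sure it can fail over" step, assuming you want to script it: the helper below only builds the PowerShell command text for the FailoverClusters cmdlets Test-Cluster and Move-ClusterGroup and does not run anything. The helper names are made up, and the role name in the example is a placeholder, so substitute whatever your role is actually called.

```python
def ps_quote(value):
    """Quote a value as a PowerShell single-quoted string literal."""
    return "'" + value.replace("'", "''") + "'"


def validate_storage_command(nodes):
    """Build a Test-Cluster call limited to the Storage validation tests."""
    node_list = ",".join(ps_quote(n) for n in nodes)
    return f"Test-Cluster -Node {node_list} -Include 'Storage'"


def move_role_command(role, target_node):
    """Build a Move-ClusterGroup call that moves a role to target_node."""
    return f"Move-ClusterGroup -Name {ps_quote(role)} -Node {ps_quote(target_node)}"


# Placeholder names: NODE1/NODE2 come from this thread, the role name is assumed.
for cmd in (
    validate_storage_command(["NODE1", "NODE2"]),
    move_role_command("SQL Server (MSSQLSERVER)", "NODE2"),
    move_role_command("SQL Server (MSSQLSERVER)", "NODE1"),
):
    print(cmd)
```

Paste the printed commands into an elevated PowerShell session on a cluster node. Run the storage validation against the new LUN before any data lives on it, since the storage tests need to take the disks they check offline.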

Same DC, right? You're not doing something stupid like WAN clustering over iSCSI?

dafyre

@PSX_Defector I have narrowed it down to the Persistent Reservations. This is a Nimble Storage system, and I believe that can be enabled or disabled per LUN. AFAIK, this will be the only Windows Failover Cluster accessing the Nimble systems, so they probably don't have it enabled.

And yes, everything is in the same DC. Most likely in the same rack, lol... and no, not doing anything fun with iSCSI over the WAN... although that does give me an idea for new things to try on my own servers, lol.