Avoiding a Split Brain Scenario with DRBD
-
As my project kickoff gets closer, I wanted to confirm a few things with the ML crew, specifically split-brain detection and prevention on a 2-node setup.
Things like "auto power-on after power restore" are clearly a big no-no.
Shutting down (or powering up) only a single host at a time is a requirement if the power has to be pulled.
What else is recommended to avoid a DRBD split brain?
What are some ways to test that the functionality is working as intended? Ideally I don't want to go pulling power or Ethernet on new equipment as it arrives.
-
@DustinB3403 said in Avoiding a Split Brain Scenario with DRBD:
What are some ways to test that the functionality is working as intended? Ideally I don't want to go pulling power or ethernet to new equipment as it arrives.
To get a real feel for it, you may be left with pulling the plugs as the only real test. Sure, you could reboot one host at a time, but that wouldn't be a real test, because in some situations your UPS may die before you can issue the shutdown commands.
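However the failure is induced, the check afterwards is the same: look at each resource's connection state. A minimal sketch, assuming DRBD 8.x style `/proc/drbd` output, where a node that has detected split brain typically sits in `StandAlone` (the peer may show `StandAlone` or `WFConnection`). The function name and the sample text are made up for illustration and are not taken from a real host.

```python
import re

# Connection states worth a closer look after a failover test (assumption:
# DRBD 8.x naming as shown in /proc/drbd).
SUSPECT_STATES = {"StandAlone", "WFConnection"}

# Matches lines such as " 0: cs:Connected ro:Primary/Secondary ..."
_RESOURCE_LINE = re.compile(r"^\s*(\d+):\s+cs:(\S+)", re.MULTILINE)


def suspect_minors(proc_drbd_text):
    """Return {minor_number: connection_state} for resources that are not connected."""
    return {
        int(minor): cstate
        for minor, cstate in _RESOURCE_LINE.findall(proc_drbd_text)
        if cstate in SUSPECT_STATES
    }


# Hand-written illustrative sample, not captured from a real system.
SAMPLE = """\
 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
 1: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown   r-----
"""

flagged = suspect_minors(SAMPLE)
```

On a real host the input would come from reading `/proc/drbd` on each node after the test. An empty result on both nodes suggests the pair reconnected cleanly.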
-
Are you using HA-Lizard? They handle this stuff as part of the script, I thought.
-
They do, and the plan is to use HA-Lizard. I just wanted to follow up and see what else might be worth checking.
-
Most people use a STONITH system with DRBD when building their own HA file server, for example.
-
@scottalanmiller said in Avoiding a Split Brain Scenario with DRBD:
Most people use a STONITH system with DRBD when building their own HA file server, for example.
Shoot The Other Node In The Head... yet another memorable, wrong, yet absolutely valid tech term. I wonder why I don't remember hearing it before now. Apparently the GlusterFS docs don't mention it.
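For anyone else meeting the term for the first time, the idea can be sketched as a toy model: a surviving node only promotes itself after the peer is confirmed powered off. This is not Pacemaker or HA-Lizard code. `Node`, `take_over`, and `power_off` are hypothetical names, and `power_off` stands in for a real fence agent such as an IPMI or PDU call.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    powered: bool = True
    primary: bool = False


def power_off(node):
    """Hypothetical fence action: cut power to the node and report success."""
    node.powered = False
    node.primary = False
    return True


def take_over(survivor, peer, fence):
    """Promote the survivor only after the peer has been fenced."""
    if not fence(peer):
        # Fencing failed, so stay secondary rather than risk two primaries.
        return False
    survivor.primary = True
    return True


a = Node("a")
b = Node("b", primary=True)
promoted = take_over(a, b, power_off)
```

The ordering is the point: the promotion comes after the fence call returns success, so there is never a moment where both nodes are powered and primary.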
-
I don't think Gluster (it's no longer called GlusterFS) uses STONITH, because they don't go for the two-node configuration where it is needed. They use a quorum system.
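The reason quorum doesn't help with only two nodes comes down to simple arithmetic. A minimal sketch, assuming a plain strict-majority rule (real systems may add arbiters or tie-breakers). `has_quorum` is a hypothetical helper, not any product's API.

```python
def has_quorum(votes_seen, cluster_size):
    """True when this partition holds a strict majority of the cluster's votes."""
    return votes_seen > cluster_size // 2


# Two nodes, link cut: each side sees only itself, so neither has a majority.
two_node_isolated = has_quorum(1, 2)

# Three nodes, one isolated: the pair keeps quorum, the loner does not.
three_node_pair = has_quorum(2, 3)
three_node_loner = has_quorum(1, 3)
```

With two nodes a partition leaves both sides without a majority, which is why a two-node design reaches for fencing instead.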