Heartbeat Problem
-
Dear All,
I am working on a two-node cluster using Heartbeat and DRBD. After configuring everything, I found that Heartbeat does not mount /dev/drbd0 on the primary node. The log file shows that both nodes are in the standby state. I tried running /usr/share/heartbeat/hb_takeover on the primary node and /usr/share/heartbeat/hb_standby on the secondary node, but the problem persists. Below is my /etc/ha.d/ha.cf configuration file:
debugfile /var/log/ha-debug
logfile /var/log/ha-log
logfacility local0
keepalive 2
deadtime 30
warntime 10
initdead 120
udpport 694
bcast eth0
auto_failback off
node node1.server.local
node node2.server.local
-
Just checking some basics here. Does ping work between both hosts using the node#.server.local names? What gets reported in /var/log/ha-log after starting the DRBD service?
-
@travisdh1 Yes, ping works between the two nodes. The log file /var/log/ha-log shows no loss of connectivity between them, and both nodes are in the standby state.
-
Any help please
-
@AlyRagab said in Heartbeat Problem:
Any help please
Not without more information. We can't tell you why the nodes are in standby without getting those logs. Preferably restart the drbd service and just show us the section from when you restart it.
-
OK, after restarting the drbd service I got these logs in /var/log/messages:
block drbd0: disk( Attaching -> UpToDate )
block drbd0: conn( StandAlone -> Unconnected )
block drbd0: Starting receiver thread (from drbd0_worker [11459])
block drbd0: receiver (re)started
block drbd0: conn( Unconnected -> WFConnection )
block drbd0: Handshake successful: Agreed network protocol version 97
block drbd0: Peer authenticated using 20 bytes of 'sha1' HMAC
block drbd0: conn( WFConnection -> WFReportParams )
block drbd0: Starting asender thread (from drbd0_receiver [11472])
block drbd0: data-integrity-alg: <not-used>
block drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> Connected ) pdsk( DUnknown -> UpToDate )
Also, # cat /proc/drbd shows:
version: 8.3.15 (api:88/proto:86-97)
GIT-hash: 0ce4d235fc02b5c53c1c52c53433d11a694eab8c build by [email protected], 2013-03-27 16:01:26
0: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r-----
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
-
When I restart the drbd service and then check the status with # cat /proc/drbd, it shows Connected ro:Secondary/Secondary. So I ran the following on this node:
drbdadm disconnect r0
drbdadm connect r0
drbdadm primary r0
I did the same on the other node, but with # drbdadm secondary r0 instead of the last command.
It is still the same: I have no /dev/drbd0.
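-
A side note for anyone reading along: the ro: field in the /proc/drbd output quoted in this thread is the one to watch, since in DRBD's usual single-primary setup the device can only be mounted on the node whose local role is Primary. Below is a minimal Python sketch that pulls the cs:, ro: and ds: fields out of a status line in the 8.3 format shown here. The function names parse_drbd_status and can_mount are made up for illustration and are not part of DRBD or Heartbeat, and other DRBD versions may format the line differently.

```python
import re


def parse_drbd_status(line):
    """Parse one resource line from /proc/drbd (format as seen with DRBD 8.3).

    Hypothetical helper for illustration only, not a DRBD API.
    """
    # Collect the "key:value" pairs for connection state, roles and disk states.
    fields = dict(re.findall(r"\b(cs|ro|ds):(\S+)", line))
    # Roles and disk states are reported as local/peer.
    local_role, peer_role = fields["ro"].split("/")
    local_disk, peer_disk = fields["ds"].split("/")
    return {
        "connection": fields["cs"],
        "local_role": local_role,
        "peer_role": peer_role,
        "local_disk": local_disk,
        "peer_disk": peer_disk,
    }


def can_mount(status):
    """Assumed rule of thumb: mount only if this node is Primary with UpToDate data."""
    return status["local_role"] == "Primary" and status["local_disk"] == "UpToDate"


if __name__ == "__main__":
    # Status line copied from the output posted in this thread.
    line = "0: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r-----"
    status = parse_drbd_status(line)
    print(status["local_role"], can_mount(status))  # prints: Secondary False
```

With ro:Secondary/Secondary, as in the output posted here, the check fails on both nodes, which is consistent with nothing being mounted anywhere.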