Heartbeat Problem

  • Dear all,
    I am working on a two-node cluster using Heartbeat and DRBD. After configuring everything, I found that Heartbeat does not mount /dev/drbd0 on the primary node, and the log file shows that both nodes are in the standby state. I tried running "/usr/share/heartbeat/hb_takeover" on the primary node and "/usr/share/heartbeat/hb_standby" on the secondary node,
    but the issue remains. Below is my /etc/ha.d/ha.cf configuration file:
    debugfile /var/log/ha-debug
    logfile /var/log/ha-log
    logfacility local0
    keepalive 2
    deadtime 30
    warntime 10
    initdead 120
    udpport 694
    bcast eth0
    auto_failback off
    node node1.server.local
    node node2.server.local
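
One thing worth ruling out with a file like this: Heartbeat expects each `node` entry to match the output of `uname -n` on that machine. The sketch below is a minimal Python check under that assumption. `parse_ha_cf` and `local_node_listed` are hypothetical helper names, not part of Heartbeat, and the sample text is a shortened copy of the configuration above.

```python
import os

# Shortened copy of the ha.cf posted above (illustration only).
SAMPLE_HA_CF = """\
keepalive 2
deadtime 30
bcast eth0
auto_failback off
node node1.server.local
node node2.server.local
"""


def parse_ha_cf(text):
    """Map each ha.cf directive to the list of values given for it."""
    directives = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # strip trailing comments
        if not line:
            continue  # skip blank and comment-only lines
        name, *values = line.split()
        directives.setdefault(name, []).extend(values)
    return directives


def local_node_listed(directives, hostname=None):
    """Return True if hostname has a node entry (default: this machine's `uname -n`)."""
    if hostname is None:
        hostname = os.uname().nodename
    return hostname in directives.get("node", [])


if __name__ == "__main__":
    conf = parse_ha_cf(SAMPLE_HA_CF)
    print("nodes:", conf["node"])
    print("this host listed:", local_node_listed(conf))
```

A short name such as `node1` would not match `node1.server.local`, which is the kind of mismatch this check is meant to surface.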

  • Just checking some basics here. Does ping work between both hosts using the node#.server.local names? What gets reported in /var/log/ha-log after starting the DRBD service?

  • @travisdh1 Yes, ping works between the two nodes, and the log file "/var/log/ha-log" shows no loss of connectivity between them. Both nodes are in the standby state.

  • Any help please 🙂

  • @AlyRagab said in Heartbeat Problem:

    Any help please 🙂

    Not without more information. We can't tell you why the nodes are in standby without getting those logs. Preferably restart the drbd service and just show us the section from when you restart it.

  • OK, after restarting the drbd service I got these logs in /var/log/messages:

    block drbd0: disk( Attaching -> UpToDate )
    block drbd0: conn( StandAlone -> Unconnected )
    block drbd0: Starting receiver thread (from drbd0_worker [11459])
    block drbd0: receiver (re)started
    block drbd0: conn( Unconnected -> WFConnection )
    block drbd0: Handshake successful: Agreed network protocol version 97
    block drbd0: Peer authenticated using 20 bytes of 'sha1' HMAC
    block drbd0: conn( WFConnection -> WFReportParams )
    block drbd0: Starting asender thread (from drbd0_receiver [11472])
    block drbd0: data-integrity-alg: <not-used>
    block drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> Connected ) pdsk( DUnknown -> UpToDate )

    Also, # cat /proc/drbd shows:
    version: 8.3.15 (api:88/proto:86-97)
    GIT-hash: 0ce4d235fc02b5c53c1c52c53433d11a694eab8c build by [email protected], 2013-03-27 16:01:26
    0: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
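
The status line in that output can be read field by field: `cs` is the connection state, `ro` the local/peer roles, and `ds` the local/peer disk states. As a rough illustration, the Python sketch below pulls those fields out of the line shown above. `parse_drbd_status` and `has_primary` are hypothetical helpers, and the regular expression is only assumed to fit the DRBD 8.3 layout seen in this thread.

```python
import re

# Status line copied from the /proc/drbd output above.
SAMPLE_LINE = "0: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r-----"

# Assumed layout: "<minor>: cs:<state> ro:<local>/<peer> ds:<local>/<peer> ..."
STATUS_RE = re.compile(
    r"^\s*(?P<minor>\d+):\s+"
    r"cs:(?P<cs>\S+)\s+"
    r"ro:(?P<local_role>[^/\s]+)/(?P<peer_role>\S+)\s+"
    r"ds:(?P<local_disk>[^/\s]+)/(?P<peer_disk>\S+)"
)


def parse_drbd_status(line):
    """Return the status fields as a dict of strings, or None if the line does not match."""
    match = STATUS_RE.match(line)
    return match.groupdict() if match else None


def has_primary(status):
    """Return True if either side of the resource is in the Primary role."""
    return "Primary" in (status["local_role"], status["peer_role"])


if __name__ == "__main__":
    status = parse_drbd_status(SAMPLE_LINE)
    print(status)
    print("some node is Primary:", has_primary(status))
```

For the line in this thread, both roles are `Secondary`. A DRBD device can only be mounted on a node where the resource is Primary, so nothing can mount /dev/drbd0 until one node is promoted.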

  • When I restart the drbd service and then check the status with # cat /proc/drbd,
    it shows ( Connected ro:Secondary/Secondary ). So I ran the following:

    drbdadm disconnect r0
    drbdadm connect r0
    drbdadm primary r0

    I did the same on the other node, but with # drbdadm secondary r0 instead of primary.
    The result is still the same: I have no /dev/drbd0.
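
To tell "the device is not mounted" apart from "the device node does not exist", it can help to look at /proc/mounts after the promotion. The following Python sketch is only an illustration: `mount_point_of` is a hypothetical helper, and the sample text with its `/data` mount point is made up for the example rather than taken from this cluster.

```python
# Made-up /proc/mounts content (device, mount point, fs type, options, dump, pass).
SAMPLE_MOUNTS = """\
/dev/sda1 / ext4 rw,relatime 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
/dev/drbd0 /data ext3 rw,relatime 0 0
"""


def mount_point_of(device, mounts_text):
    """Return the first mount point listed for device, or None if it is not mounted."""
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == device:
            return fields[1]
    return None


if __name__ == "__main__":
    print(mount_point_of("/dev/drbd0", SAMPLE_MOUNTS))
    # On a real node, pass the live table instead of the sample:
    # with open("/proc/mounts") as fh:
    #     print(mount_point_of("/dev/drbd0", fh.read()))
```

If the function returns `None` on the node that reports the Primary role, the device exists but nothing has mounted it yet.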
