CentOS 7 VM on Hyper-V losing DHCP assigned address
-
I'm amazed that just restarting the service fixes the issue. I wonder if the hardware clock isn't being updated and the skew is too great so Chrony just gives up.
If you install ntpdate and enable it, it will update the hw clock every 11 minutes. I don't think it's installed by default with the minimal installs.
You could also just go back to ntpd. It's still available and might not have the issue. It also has a file under /etc/sysconfig/ntpd where you can set ntpd to update the hw clock.
-
Another server this morning.
/var/log/messages:Jan 31 02:22:57 nginxproxy chronyd[649]: Frequency -15.399 +/- 0.022 ppm read from /var/lib/chrony/drift /var/log/messages:Jan 29 21:37:01 nginxproxy chronyd[649]: Selected source 208.75.88.4 /var/log/messages:Jan 29 21:37:01 nginxproxy chronyd[649]: System clock wrong by -103565.091462 seconds, adjustment started /var/log/messages:Jan 29 21:37:01 nginxproxy chronyd[649]: System clock was stepped by -103565.091462 seconds /var/log/messages:Jan 29 21:38:07 nginxproxy chronyd[649]: Selected source 138.236.128.112 /var/log/messages:Jan 29 21:39:13 nginxproxy chronyd[649]: Selected source 208.75.88.4 /var/log/messages:Jan 29 23:34:35 nginxproxy chronyd[649]: Selected source 10.254.0.1 /var/log/messages:Jan 30 06:16:14 nginxproxy chronyd[649]: Can't synchronise: no selectable sources /var/log/messages:Jan 30 08:35:35 nginxproxy chronyd[649]: Source 138.236.128.112 offline /var/log/messages:Jan 30 08:35:35 nginxproxy chronyd[649]: Source 10.254.0.1 offline /var/log/messages:Jan 30 08:35:35 nginxproxy chronyd[649]: Source 204.2.134.162 offline /var/log/messages:Jan 30 08:35:35 nginxproxy chronyd[649]: Source 208.75.88.4 offline /var/log/messages:Jan 30 08:35:37 nginxproxy chronyd[649]: Source 138.236.128.112 online /var/log/messages:Jan 30 08:35:37 nginxproxy chronyd[649]: Source 10.254.0.1 online /var/log/messages:Jan 30 08:35:37 nginxproxy chronyd[649]: Source 204.2.134.162 online /var/log/messages:Jan 30 08:35:37 nginxproxy chronyd[649]: Source 208.75.88.4 online /var/log/messages:Jan 30 08:35:37 nginxproxy chronyd[649]: Selected source 10.254.0.1 /
server 10.254.0.1 iburst server 1.centos.pool.ntp.org iburst server 2.centos.pool.ntp.org iburst server 3.centos.pool.ntp.org iburst # Use public servers from the pool.ntp.org project. # Please consider joining the pool (http://www.pool.ntp.org/join.html).
-
I am slowly but surely changing all the servers to the following
# These servers were defined in the installation: server 0.us.pool.ntp.org iburst server 1.us.pool.ntp.org iburst server 2.us.pool.ntp.org iburst server 3.us.pool.ntp.org iburst # Use public servers from the pool.ntp.org project. # Please consider joining the pool (http://www.pool.ntp.org/join.html).
-
So, the brand new server I setup on Sunday was just offline.
The uptime on this server is only 15 hours because I shut it down yesterday to make a change to memory. and it was set to use the us pool when created.
[root@bnasc ~]# uptime 09:05:30 up 15:13, 3 users, load average: 0.02, 0.02, 0.05 [root@bnasc ~]# cat /etc/chrony.conf # These servers were defined in the installation: server 0.us.pool.ntp.org iburst server 1.us.pool.ntp.org iburst server 2.us.pool.ntp.org iburst server 3.us.pool.ntp.org iburst # Use public servers from the pool.ntp.org project. # Please consider joining the pool (http://www.pool.ntp.org/join.html).
[root@bnasc ~]# grep chrony /var/log/messages* Jan 29 12:02:30 bnasc chronyd[628]: chronyd version 2.1.1 starting (+CMDMON +NTP +REFCLOCK +RTC +PRIVDROP +DEBUG +ASYNCDNS +IPV6 +SECHASH) Jan 29 12:02:30 bnasc chronyd[628]: Generated key 1 Jan 29 11:52:40 bnasc chronyd[628]: Selected source 74.120.8.2 Jan 29 11:52:40 bnasc chronyd[628]: System clock wrong by -599.208892 seconds, adjustment started Jan 29 11:52:40 bnasc chronyd[628]: System clock was stepped by -599.208892 seconds Jan 29 11:54:52 bnasc chronyd[628]: Selected source 4.53.160.75 Jan 29 11:55:45 bnasc yum[10204]: Updated: chrony-2.1.1-4.el7.centos.x86_64 Jan 29 11:55:58 bnasc chronyd[628]: chronyd exiting Jan 29 11:55:59 bnasc chronyd[11345]: chronyd version 2.1.1 starting (+CMDMON +NTP +REFCLOCK +RTC +PRIVDROP +DEBUG +ASYNCDNS +IPV6 +SECHASH) Jan 29 11:55:59 bnasc chronyd[11345]: Frequency -15.874 +/- 5.866 ppm read from /var/lib/chrony/drift Jan 29 11:56:04 bnasc chronyd[11345]: Selected source 216.218.254.202 Jan 29 11:57:11 bnasc chronyd[11345]: Selected source 132.163.4.102 Jan 29 17:23:45 bnasc chronyd[11345]: Source 38.229.71.1 offline Jan 29 17:23:45 bnasc chronyd[11345]: Source 108.59.2.24 offline Jan 29 17:23:45 bnasc chronyd[11345]: Source 216.218.254.202 offline Jan 29 17:23:45 bnasc chronyd[11345]: Source 132.163.4.102 offline Jan 29 17:23:45 bnasc chronyd[11345]: Can't synchronise: no selectable sources Jan 29 17:23:46 bnasc chronyd[11345]: Source 38.229.71.1 online Jan 29 17:23:46 bnasc chronyd[11345]: Source 132.163.4.102 online Jan 29 17:23:46 bnasc chronyd[11345]: Source 108.59.2.24 online Jan 29 17:23:46 bnasc chronyd[11345]: Source 216.218.254.202 online Jan 29 17:23:47 bnasc chronyd[11345]: Selected source 132.163.4.102 Jan 29 23:39:31 bnasc chronyd[622]: chronyd version 2.1.1 starting (+CMDMON +NTP +REFCLOCK +RTC +PRIVDROP +DEBUG +ASYNCDNS +IPV6 +SECHASH) Jan 29 23:39:31 bnasc chronyd[622]: Frequency -15.386 +/- 0.007 ppm read from /var/lib/chrony/drift Jan 29 17:46:38 bnasc chronyd[622]: Selected source 71.210.146.228 Jan 29 17:46:38 bnasc chronyd[622]: System clock wrong by -21182.172665 seconds, adjustment started Jan 29 17:46:38 bnasc chronyd[622]: System clock was stepped by -21182.172665 seconds Jan 30 01:17:52 bnasc chronyd[616]: chronyd version 2.1.1 starting (+CMDMON +NTP +REFCLOCK +RTC +PRIVDROP +DEBUG +ASYNCDNS +IPV6 +SECHASH) Jan 30 01:17:52 bnasc chronyd[616]: Frequency -15.397 +/- 0.041 ppm read from /var/lib/chrony/drift Jan 29 21:37:17 bnasc chronyd[616]: Selected source 45.127.112.2 Jan 29 21:37:17 bnasc chronyd[616]: System clock wrong by -13244.251559 seconds, adjustment started Jan 29 21:37:17 bnasc chronyd[616]: System clock was stepped by -13244.251559 seconds Jan 29 21:49:12 bnasc chronyd[616]: Selected source 4.53.160.75 Jan 30 01:13:39 bnasc chronyd[616]: Selected source 45.127.112.2 Jan 30 06:23:50 bnasc chronyd[616]: Can't synchronise: no selectable sources Jan 30 07:32:52 bnasc chronyd[616]: Selected source 45.127.112.2 Jan 30 14:38:54 bnasc chronyd[616]: chronyd exiting Jan 31 07:37:15 bnasc chronyd[615]: chronyd version 2.1.1 starting (+CMDMON +NTP +REFCLOCK +RTC +PRIVDROP +DEBUG +ASYNCDNS +IPV6 +SECHASH) Jan 31 07:37:15 bnasc chronyd[615]: Frequency -15.340 +/- 0.046 ppm read from /var/lib/chrony/drift Jan 30 14:42:50 bnasc chronyd[615]: Selected source 167.88.117.204 Jan 30 14:42:50 bnasc chronyd[615]: System clock wrong by -60875.125649 seconds, adjustment started Jan 30 14:42:50 bnasc chronyd[615]: System clock was stepped by -60875.125649 seconds Jan 30 14:46:04 bnasc chronyd[615]: Selected source 173.255.246.13 Jan 30 14:50:52 bnasc chronyd[614]: chronyd version 2.1.1 starting (+CMDMON +NTP +REFCLOCK +RTC +PRIVDROP +DEBUG +ASYNCDNS +IPV6 +SECHASH) Jan 30 14:50:52 bnasc chronyd[614]: Frequency -15.237 +/- 0.675 ppm read from /var/lib/chrony/drift Jan 30 14:50:44 bnasc chronyd[614]: Selected source 74.120.81.219 Jan 30 14:50:44 bnasc chronyd[614]: System clock wrong by -16.271980 seconds, adjustment started Jan 30 14:50:44 bnasc chronyd[614]: System clock was stepped by -16.271980 seconds Jan 30 14:51:50 bnasc chronyd[614]: Selected source 204.2.134.163 Jan 30 14:52:55 bnasc chronyd[614]: Selected source 74.120.81.219 Jan 30 17:02:29 bnasc chronyd[614]: chronyd exiting Jan 30 19:05:34 bnasc chronyd[633]: chronyd version 2.1.1 starting (+CMDMON +NTP +REFCLOCK +RTC +PRIVDROP +DEBUG +ASYNCDNS +IPV6 +SECHASH) Jan 30 19:05:34 bnasc chronyd[633]: Frequency -15.245 +/- 0.393 ppm read from /var/lib/chrony/drift Jan 30 17:04:13 bnasc chronyd[633]: Selected source 45.79.111.114 Jan 30 17:04:13 bnasc chronyd[633]: System clock wrong by -7289.265261 seconds, adjustment started Jan 30 17:04:13 bnasc chronyd[633]: System clock was stepped by -7289.265261 seconds Jan 31 07:57:55 bnasc chronyd[633]: Selected source 184.105.182.7 Jan 31 13:50:56 bnasc chronyd[633]: Selected source 45.79.111.114 Jan 31 17:50:41 bnasc chronyd[633]: chronyd exiting Feb 1 18:29:46 bnasc chronyd[632]: chronyd version 2.1.1 starting (+CMDMON +NTP +REFCLOCK +RTC +PRIVDROP +DEBUG +ASYNCDNS +IPV6 +SECHASH) Feb 1 18:29:47 bnasc chronyd[632]: Frequency -15.386 +/- 0.029 ppm read from /var/lib/chrony/drift Jan 31 17:52:24 bnasc chronyd[632]: Selected source 216.218.254.202 Jan 31 17:52:24 bnasc chronyd[632]: System clock wrong by -88651.453210 seconds, adjustment started Jan 31 17:52:24 bnasc chronyd[632]: System clock was stepped by -88651.453210 seconds Jan 31 17:53:29 bnasc chronyd[632]: Selected source 199.102.46.76 Jan 31 18:04:17 bnasc chronyd[632]: Source 173.71.80.235 replaced with 74.82.59.143 Feb 1 02:26:09 bnasc chronyd[632]: Can't synchronise: no selectable sources Feb 1 09:00:20 bnasc chronyd[632]: Source 107.151.174.199 offline Feb 1 09:00:20 bnasc chronyd[632]: Source 216.218.254.202 offline Feb 1 09:00:20 bnasc chronyd[632]: Source 199.102.46.76 offline Feb 1 09:00:20 bnasc chronyd[632]: Source 74.82.59.143 offline Feb 1 09:00:22 bnasc chronyd[632]: Source 107.151.174.199 online Feb 1 09:00:22 bnasc chronyd[632]: Source 216.218.254.202 online Feb 1 09:00:22 bnasc chronyd[632]: Source 199.102.46.76 online Feb 1 09:00:22 bnasc chronyd[632]: Source 74.82.59.143 online Feb 1 09:00:22 bnasc chronyd[632]: Selected source 199.102.46.76 [root@bnasc ~]#
So WTF is chrony shit or what?
-
The DHCP server is set to a lease time of 6 hours.
Here is
grep dhcp /var/log/messages
and you can see the log timestamp going way ahead.Jan 31 15:42:17 bnasc NetworkManager[662]: <info> [1485898937.9740] dhcp4 (eth0): address 10.254.0.36 Jan 31 15:42:17 bnasc NetworkManager[662]: <info> [1485898937.9742] dhcp4 (eth0): plen 24 (255.255.255.0) Jan 31 15:42:17 bnasc NetworkManager[662]: <info> [1485898937.9742] dhcp4 (eth0): gateway 10.254.0.1 Jan 31 15:42:17 bnasc NetworkManager[662]: <info> [1485898937.9742] dhcp4 (eth0): server identifier 10.254.0.21 Jan 31 15:42:17 bnasc NetworkManager[662]: <info> [1485898937.9743] dhcp4 (eth0): lease time 21600 Jan 31 15:42:17 bnasc NetworkManager[662]: <info> [1485898937.9743] dhcp4 (eth0): nameserver '10.254.0.21' Jan 31 15:42:17 bnasc NetworkManager[662]: <info> [1485898937.9743] dhcp4 (eth0): nameserver '10.254.0.27' Jan 31 15:42:17 bnasc NetworkManager[662]: <info> [1485898937.9743] dhcp4 (eth0): domain name 'ad.bundystl.com' Jan 31 15:42:17 bnasc NetworkManager[662]: <info> [1485898937.9743] dhcp4 (eth0): state changed bound -> bound Jan 31 15:42:17 bnasc nm-dispatcher: req:1 'dhcp4-change' [eth0]: new request (3 scripts) Jan 31 15:42:17 bnasc nm-dispatcher: req:1 'dhcp4-change' [eth0]: start running ordered scripts... Feb 1 18:29:49 bnasc NetworkManager[659]: <info> [1485995389.2647] dhcp-init: Using DHCP client 'dhclient' Feb 1 18:29:49 bnasc NetworkManager[659]: <info> [1485995389.9799] dhcp4 (eth0): activation: beginning transaction (timeout in 45 seconds) Feb 1 18:29:49 bnasc NetworkManager[659]: <info> [1485995389.9932] dhcp4 (eth0): dhclient started with pid 761 Feb 1 18:29:50 bnasc NetworkManager[659]: <info> [1485995390.2462] dhcp4 (eth0): address 10.254.0.36 Feb 1 18:29:50 bnasc NetworkManager[659]: <info> [1485995390.2462] dhcp4 (eth0): plen 24 (255.255.255.0) Feb 1 18:29:50 bnasc NetworkManager[659]: <info> [1485995390.2463] dhcp4 (eth0): gateway 10.254.0.1 Feb 1 18:29:50 bnasc NetworkManager[659]: <info> [1485995390.2463] dhcp4 (eth0): server identifier 10.254.0.21 Feb 1 18:29:50 bnasc NetworkManager[659]: <info> [1485995390.2463] dhcp4 (eth0): lease time 21600 Feb 1 18:29:50 bnasc NetworkManager[659]: <info> [1485995390.2463] dhcp4 (eth0): nameserver '10.254.0.21' Feb 1 18:29:50 bnasc NetworkManager[659]: <info> [1485995390.2463] dhcp4 (eth0): nameserver '10.254.0.27' Feb 1 18:29:50 bnasc NetworkManager[659]: <info> [1485995390.2463] dhcp4 (eth0): domain name 'ad.bundystl.com' Feb 1 18:29:50 bnasc NetworkManager[659]: <info> [1485995390.2464] dhcp4 (eth0): state changed unknown -> bound Feb 1 09:00:20 bnasc NetworkManager[659]: <info> [1485961220.8147] dhcp4 (eth0): canceled DHCP transaction, DHCP client pid 761 Feb 1 09:00:20 bnasc NetworkManager[659]: <info> [1485961220.8147] dhcp4 (eth0): state changed bound -> done Feb 1 09:00:22 bnasc NetworkManager[659]: <info> [1485961222.0533] dhcp4 (eth0): activation: beginning transaction (timeout in 45 seconds) Feb 1 09:00:22 bnasc NetworkManager[659]: <info> [1485961222.0577] dhcp4 (eth0): dhclient started with pid 20673 Feb 1 09:00:22 bnasc NetworkManager[659]: <info> [1485961222.1079] dhcp4 (eth0): address 10.254.0.36 Feb 1 09:00:22 bnasc NetworkManager[659]: <info> [1485961222.1080] dhcp4 (eth0): plen 24 (255.255.255.0) Feb 1 09:00:22 bnasc NetworkManager[659]: <info> [1485961222.1080] dhcp4 (eth0): gateway 10.254.0.1 Feb 1 09:00:22 bnasc NetworkManager[659]: <info> [1485961222.1080] dhcp4 (eth0): server identifier 10.254.0.21 Feb 1 09:00:22 bnasc NetworkManager[659]: <info> [1485961222.1080] dhcp4 (eth0): lease time 21600 Feb 1 09:00:22 bnasc NetworkManager[659]: <info> [1485961222.1080] dhcp4 (eth0): nameserver '10.254.0.21' Feb 1 09:00:22 bnasc NetworkManager[659]: <info> [1485961222.1080] dhcp4 (eth0): nameserver '10.254.0.27' Feb 1 09:00:22 bnasc NetworkManager[659]: <info> [1485961222.1081] dhcp4 (eth0): domain name 'ad.bundystl.com' Feb 1 09:00:22 bnasc NetworkManager[659]: <info> [1485961222.1081] dhcp4 (eth0): state changed unknown -> bound
-
Nothing in the DHCP Server event logs except the reservation creation and deletion informational events.
-
I have no clue where to go with this now. Any recommendations would be greatly appreciated.
-
Look what I found..
grep clock /var/log/messages* /var/log/messages-20170213:Feb 28 07:17:53 jaredweb kernel: Switched to clocksource hyperv_clocksource /var/log/messages-20170213:Feb 28 07:17:53 jaredweb kernel: rtc_cmos 00:00: setting system clock to 2017-02-28 13:17:53 UTC (1488287873) /var/log/messages-20170213:Feb 28 07:17:53 jaredweb kernel: Switched to clocksource hyperv_clocksource_tsc_page
How in the fuck do I fix this?
Time sync from the host has never been enabled. I realize it always gets it at boot and I did reboot last night because I updated the Hyper-V host.
The host also has the correct time.
C:\>time /t 04:03 PM C:\>date /t Mon 02/13/2017 C:\Users\bnaadmin>
-
Do I blame integration services for this one? Microsoft has a different line of Integration services that can be installed, but then it breaks the built in package method.
-
This would also be why @scottalanmiller never sees this. I do not believe that he has hyper-v running anywhere with a bunch of CentOS systems on it, that are also using DHCP.
-
@stacksofplates said in CentOS 7 VM on Hyper-V losing DHCP assigned address:
You could also just go back to ntpd. It's still available and might not have the issue. It also has a file under /etc/sysconfig/ntpd where you can set ntpd to update the hw clock.
Looks like
chrony
was actually keeping things good. It was just not catching the weird skew from the hardware good enough. -
I dont have a specific answer for you, but I always seem to have a problem with pool.org ntp servers. So much so that i dont use them anymore.
I use nist servers instead.
time.nist.gov global address for all servers Multiple locations
time-nw.nist.gov 131.107.13.100 Microsoft, Redmond, Washington
There are probably other regional ones nearer to you -
@momurda said in CentOS 7 VM on Hyper-V losing DHCP assigned address:
I dont have a specific answer for you, but I always seem to have a problem with pool.org ntp servers. So much so that i dont use them anymore.
Problem seems to not be with the ntp.org servers. It is the hardware clock seemingly being random as f***.
-
@JaredBusch said in CentOS 7 VM on Hyper-V losing DHCP assigned address:
Look what I found..
grep clock /var/log/messages* /var/log/messages-20170213:Feb 28 07:17:53 jaredweb kernel: Switched to clocksource hyperv_clocksource /var/log/messages-20170213:Feb 28 07:17:53 jaredweb kernel: rtc_cmos 00:00: setting system clock to 2017-02-28 13:17:53 UTC (1488287873) /var/log/messages-20170213:Feb 28 07:17:53 jaredweb kernel: Switched to clocksource hyperv_clocksource_tsc_page
I just noticed the time skew here 1488287873 seconds is the real time, inseconds, since Unix Time began. Is it possibly something wrong with the clock on the hw itself? Battery, something else not easily fixable.
edit: unless the time wasnt actually 00:00 jan 1 1970 when it was changed
edit2: your clock was actually in the future here and set back in time a couple weeks. ignore what i say. -
@momurda said in CentOS 7 VM on Hyper-V losing DHCP assigned address:
@JaredBusch said in CentOS 7 VM on Hyper-V losing DHCP assigned address:
Look what I found..
grep clock /var/log/messages* /var/log/messages-20170213:Feb 28 07:17:53 jaredweb kernel: Switched to clocksource hyperv_clocksource /var/log/messages-20170213:Feb 28 07:17:53 jaredweb kernel: rtc_cmos 00:00: setting system clock to 2017-02-28 13:17:53 UTC (1488287873) /var/log/messages-20170213:Feb 28 07:17:53 jaredweb kernel: Switched to clocksource hyperv_clocksource_tsc_page
I just noticed the time skew here 1488287873 seconds is the real time, inseconds, since Unix Time began.
That is not displaying a skew, that is displaying the time it is setting it to.
-
A little more digging, but no answers yet.
cat /sys/devices/system/clocksource/clocksource0/current_clocksource hyperv_clocksource_tsc_page cat /sys/devices/system/clocksource/clocksource0/available_clocksource hyperv_clocksource_tsc_page hyperv_clocksource acpi_pm
-
@JaredBusch said in CentOS 7 VM on Hyper-V losing DHCP assigned address:
@stacksofplates said in CentOS 7 VM on Hyper-V losing DHCP assigned address:
You could also just go back to ntpd. It's still available and might not have the issue. It also has a file under /etc/sysconfig/ntpd where you can set ntpd to update the hw clock.
Looks like
chrony
was actually keeping things good. It was just not catching the weird skew from the hardware good enough.Ya I have found chronyd to be more reliable than ntpd, but that's anecdotal. I can't say I've seen this happen, but I don't have any Hyper-V servers at all. Even though I don't have DHCP setup (I want to move all of our static stuff to reservations though), I would at least notice because we authenticate to all of the RHEL systems with Kerberos
-
@JaredBusch said in CentOS 7 VM on Hyper-V losing DHCP assigned address:
Look what I found..
grep clock /var/log/messages* /var/log/messages-20170213:Feb 28 07:17:53 jaredweb kernel: Switched to clocksource hyperv_clocksource /var/log/messages-20170213:Feb 28 07:17:53 jaredweb kernel: rtc_cmos 00:00: setting system clock to 2017-02-28 13:17:53 UTC (1488287873) /var/log/messages-20170213:Feb 28 07:17:53 jaredweb kernel: Switched to clocksource hyperv_clocksource_tsc_page
How in the fuck do I fix this?
Time sync from the host has never been enabled. I realize it always gets it at boot and I did reboot last night because I updated the Hyper-V host.
The host also has the correct time.
C:\>time /t 04:03 PM C:\>date /t Mon 02/13/2017 C:\Users\bnaadmin>
Can you remind me again why you will not or can not use the Hypervisor time? Does the time match the host if you actually enable the "Time Synchronization" service?
-
Question: have you tried setting up the DHCP server as the time master, and having the Centos system sync time to the DHCP server (with said DHCP server syncing to NIST or whatever time source you prefer) instead of the NTP Pool? That way theoretically DHCP server and Centos should be at the same time. Just a thought.