Server 2019 randomly DNS stops
-
DNS stops working randomly although the services continue to run. Cannot ping any external hosts.
It first began last November. Sometimes it happens multiple times per day, sometimes only once a week.
One Server 2019 (Standard, 1809) host with two VMs: one runs AD, DNS, DHCP and file shares, the other is an RDP server. A reboot of the host always gets it working again; a reboot of the AD VM doesn't (which I think is odd).
NTP time is out by -00.0016250ms and seems to be stable.
I do see this alert in my RMM when it goes, and the port shows as open again when it comes back:
Monitor: Port Closed
Port closed. PortNo : '53', Protocol : 'TCP', ServiceName : 'DNS(dns.exe)
It happened just now and I ran
nltest.exe /dsregdns
I ran that because I had a load of NETLOGON errors in Event Viewer and they said to run it. It then started working again (this is the first time I've run that, and the first time I got it working again without a reboot).
I also have an inkling that it happens when the w32time service runs, as that sometimes appears to be running when it goes.
Any ideas or experience with something like this shared would be much appreciated.
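For anyone who wants to watch the same symptom without an RMM, the alert above boils down to "can a TCP connection be opened to port 53 on the DNS server". Below is a minimal Python sketch of that check. The function name tcp_port_open and the address in the usage comment are placeholders of mine, not anything from the RMM tool.

```python
import socket


def tcp_port_open(host: str, port: int = 53, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False


# Usage (192.0.2.10 is a placeholder; substitute the DNS server's own IP):
#   print("TCP/53 open:", tcp_port_open("192.0.2.10"))
```

Running it on a schedule and logging the timestamps of each True/False change would give something to line up against the event logs.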
-
I run into this all the time. So far no resolution.
But to get back running again, I run this, and it works every time. I believe I suffer from DNS poisoning very often and I don't know why. If you are suffering the same, give it a try and report back. This is NOT the same as ipconfig /flushdns.
dnscmd /clearcache
You can also do this from the UI:
-
Thank you for the info, much appreciated. It just went again and I tried this with no joy. w32time service was not running this time.
I tried running this again as well, but it didn't help either:
nltest.exe /dsregdns
This is the wording of the ping response. I did a net start/stop on DNS too.
Ping request could not find host bbc.co.uk. Please check the name and try again.
Went for a host reboot in the end.
-
Event logs, dns event logs? What do they say?
-
@Obsolesce Lots of informational logs but no error or critical logs. I've looked through those and most of them say "I can't find x" because there is no DNS.
There are a lot of informational logs for directory sync but I'm not able to spot any specific events that look like a trigger.
-
Before going off on tangents, are you 100% sure that AD DNS is properly set up for your environment?
-- DHCP is only serving your Domain DNS servers with the leases
-- Clients are ONLY attempting to use the Domain DNS
-- DNS server / Domain Controller is set to loopback on the NIC
-- DNS forwarders are set to reliable, known external DNS (nothing internal on the forwarders)
It's strange that a full host reboot fixes things while rebooting the guest / DC doesn't.
Could there be a firewall / edge device in the mix that's blocking based on source or rate-limiting, and the extra time taken by rebooting the whole host is enough to allow it to clear its block list / criteria?
There's a whole lot of funky things that could be happening here.....
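One way to test the forwarder item on that list in isolation is to send a DNS query straight to an external resolver, bypassing the DC entirely. Here is a rough Python sketch that hand-builds a single A-record query over UDP. The helper names (build_query, parse_header, query_server) are made up for illustration, and 8.8.8.8 in the usage comment is only an example target.

```python
import socket
import struct
from typing import Optional, Tuple


def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record with recursion desired."""
    # Header: ID, flags (RD bit set), 1 question, 0 answer/authority/additional.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = name.rstrip(".").split(".")
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def parse_header(response: bytes) -> Tuple[int, int]:
    """Return (rcode, answer_count) from the 12-byte DNS header."""
    _id, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", response[:12])
    return flags & 0x000F, ancount


def query_server(server: str, name: str, timeout: float = 2.0) -> Optional[Tuple[int, int]]:
    """Query one server directly; None means no usable reply arrived."""
    query = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server, 53))
            data, _ = sock.recvfrom(512)
        except OSError:
            return None
    # Reject short replies and replies whose ID does not match the query.
    if len(data) < 12 or data[:2] != query[:2]:
        return None
    return parse_header(data)


# Usage: query_server("8.8.8.8", "bbc.co.uk")
# rcode 0 is NOERROR, 2 is SERVFAIL, 3 is NXDOMAIN.
```

If the forwarder answers from a client but names still fail through the DC, that points back at the DC; if nothing answers at all, the problem is more likely upstream of DNS.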
-
@notverypunny said in Server 2019 randomly DNS stops:
DNS server / Domain Controller
No, I inherited this unfortunately! I'm sure we've all been here.
- DHCP scope is configured with Router, DNS Servers (as the AD DNS ONLY) and DNS domain name of domain.co.uk - I mean it looks correct.
- Clients get DNS from AD DNS server through DHCP as above and to my knowledge no one is capable of changing it on their desktops.
- AD DNS server isn't set to loopback, no; it's set to its own IP (which is what I thought was properly configured)
- Forwarders are set to Google and OpenDNS
I totally agree about the host reboot thing; it is, in my opinion, the most puzzling part. Maybe I should retest that theory in case the times it hasn't worked were a fluke... The internet is a wireless link provided by a small ISP here in the UK and I'm not familiar with their service.
The whole thing is a huge headache. I've tried uninstalling my RMM tool in case that is the issue. I have noticed that when I TeamViewer in, sometimes that seems to either trigger it or I'm super (un)lucky...
-
@choppy_sea said in Server 2019 randomly DNS stops:
@Obsolesce Lots of informational logs but no error or critical logs. I've looked through those and most of them say "I can't find x" because there is no DNS.
There are a lot of informational logs for directory sync but I'm not able to spot any specific events that look like a trigger.
I meant on the DNS server, the DNS operational logs.
-
@choppy_sea said in Server 2019 randomly DNS stops:
Forwarders are set to Google and OpenDNS
Can you show the screen where this is set?
-
Is DNS not resolving only for the RDS server or does it happen for any device on the network? i.e. laptop trying to use DNS directly.
-
Is the DNS Server the Domain Controller?
Is the VM Host a member of that domain?
-
@choppy_sea said in Server 2019 randomly DNS stops:
@notverypunny said in Server 2019 randomly DNS stops:
DNS server / Domain Controller
No, I inherited this unfortunately! I'm sure we've all been here.
- DHCP scope is configured with Router, DNS Servers (as the AD DNS ONLY) and DNS domain name of domain.co.uk - I mean it looks correct.
- Clients get DNS from AD DNS server through DHCP as above and to my knowledge no one is capable of changing it on their desktops.
- AD DNS server isn't set to loopback, no; it's set to its own IP (which is what I thought was properly configured)
- Forwarders are set to Google and OpenDNS
I totally agree about the host reboot thing; it is, in my opinion, the most puzzling part. Maybe I should retest that theory in case the times it hasn't worked were a fluke... The internet is a wireless link provided by a small ISP here in the UK and I'm not familiar with their service.
The whole thing is a huge headache. I've tried uninstalling my RMM tool in case that is the issue. I have noticed that when I TeamViewer in, sometimes that seems to either trigger it or I'm super (un)lucky...
Yeah, sorting out an inherited mess is never fun.
When things stop working, can you still ping out to known good IPs, e.g. 8.8.8.8, 1.1.1.1, etc.? Maybe DNS isn't the problem. You mention that it's a small WISP; maybe their CPE can't handle the connection load and, similar to my rate-limiting theory, it's just a coincidence that the time taken to reboot the host and guests is enough to clear the CPE's session table.....
I'll add my vote to those strongly recommending a deep dive on the DNS server's logs, and I'll throw the Host system's logs in there too for good measure.
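To make the "IP vs. name" test repeatable during an outage, something like the Python sketch below could run both pings and label the result. It assumes English-language Windows ping output, and the function names and category labels are my own invention, so treat it as a starting point rather than a finished tool.

```python
import platform
import subprocess


def classify_ping_output(output: str) -> str:
    """Map ping output text to a rough failure category."""
    text = output.lower()
    if "could not find host" in text:
        return "name-resolution"   # the name never resolved to an IP
    # Check failures before "reply from": Windows prints
    # "Reply from x.x.x.x: Destination host unreachable." for routing failures.
    if "unreachable" in text or "request timed out" in text or "general failure" in text:
        return "network"
    if "reply from" in text or "bytes from" in text:
        return "ok"
    return "unknown"


def run_ping(target: str) -> str:
    """Send one echo request and return whatever ping printed."""
    count_flag = "-n" if platform.system() == "Windows" else "-c"
    result = subprocess.run(["ping", count_flag, "1", target],
                            capture_output=True, text=True)
    return result.stdout + result.stderr


# Usage during an outage:
#   classify_ping_output(run_ping("8.8.8.8"))    # by IP
#   classify_ping_output(run_ping("bbc.co.uk"))  # by name
```

"network" on the IP test means DNS is a symptom rather than the cause; "ok" by IP but "name-resolution" by name points at DNS itself.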
-
@Obsolesce The DNS logs show one interesting entry, linked here: https://imgur.com/a/4I75qnB. The log says that it transferred the master role from itself to itself. If you meant somewhere else, I apologise!
@Dashrender It happens for every device on the network!
@JasGot Yes, the AD server does DNS and DHCP too, and yes, the host is on the domain.
@notverypunny When I ping a known good IP, e.g. 8.8.8.8, I get "...unreachable" rather than the "Ping request could not..."
-
Do you lose dns and/or network abilities on the DNS server too?
-
@Obsolesce Yes, the only way I can connect to it is through the host's Hyper-V, and the shares on the AD server drop.
-
Tell us about your network setup... switches, firewall, APs.
-
If nothing on the network is working, that makes me think a bad switch.
Or possibly a bad NIC taking down your switch.
-
@Dashrender Draytek 2860 (being replaced this week) - 3 * unifi switches (fibre linked) and a few rando APs dotted around. It's quite a simple setup, the APs are a bit sus. I could pull them all out...
-
@choppy_sea said in Server 2019 randomly DNS stops:
@Dashrender Draytek 2860 (being replaced this week) - 3 * unifi switches (fibre linked) and a few rando APs dotted around. It's quite a simple setup, the APs are a bit sus. I could pull them all out...
Why are the APs sus?