Server 2019 randomly DNS stops
-
@Obsolesce Lots of informational logs but no error or critical logs. I've looked through those and most of them say "I can't find x" because there's no DNS.
There are a lot of informational logs for directory sync but I'm not able to spot any specific events that look like a trigger.
-
Before going off on tangents, are you 100% sure that AD DNS is properly set up for your environment?
-- DHCP is only serving your Domain DNS servers with the leases
-- Clients are ONLY attempting to use the Domain DNS
-- DNS server / Domain Controller is set to loopback on the NIC
-- DNS forwarders are set to reliable, known external DNS (nothing internal on the forwarders)
It's strange that a full host reboot fixes things while rebooting the guest / DC doesn't.
Could there be a firewall / edge device in the mix that's blocking based on source or rate-limiting, and the extra time taken by rebooting the whole host is enough for it to clear its block list / criteria?
There's a whole lot of funky things that could be happening here.....
-
@notverypunny said in Server 2019 randomly DNS stops:
DNS server / Domain Controller
No, I inherited this, unfortunately! I'm sure we've all been here.
- DHCP scope is configured with Router, DNS Servers (the AD DNS ONLY) and a DNS domain name of domain.co.uk. I mean, it looks correct.
- Clients get DNS from the AD DNS server through DHCP as above, and to my knowledge no one is capable of changing it on their desktops.
- AD DNS server isn't set to loopback, no; it's set to its own IP (which is what I thought was properly configured)
- Forwarders are set to Google and OpenDNS
I totally agree about the host reboot thing; it is, in my opinion, the most puzzling thing. Maybe I should retest that theory in case the times it hasn't worked were a fluke... The internet is a wireless link provided by a small ISP here in the UK and I'm not familiar with their service.
The whole thing is a huge headache. I've tried uninstalling my RMM tool in case that is the issue. I have noticed that when I TeamViewer in, that sometimes seems to trigger it, or I'm super (un)lucky...
-
@choppy_sea said in Server 2019 randomly DNS stops:
@Obsolesce Lots of informational logs but no error or critical logs. I've looked through those and most of them say "I can't find x" because there's no DNS.
There are a lot of informational logs for directory sync but I'm not able to spot any specific events that look like a trigger.
I meant on the DNS server, the DNS operational logs.
-
@choppy_sea said in Server 2019 randomly DNS stops:
Forwarders are set to Google and OpenDNS
Can you show the screen where this is set?
-
Is DNS failing to resolve only for the RDS server, or does it happen for any device on the network? E.g. a laptop trying to use DNS directly.
-
Is the DNS Server the Domain Controller?
Is the VM Host a member of that domain?
-
@choppy_sea said in Server 2019 randomly DNS stops:
Yeah, sorting out an inherited mess is never fun.
When things stop working, can you still ping out to known good IPs, e.g. 8.8.8.8, 1.1.1.1, etc.? Maybe DNS isn't the problem. You mention that it's a small WISP; maybe their CPE can't handle the connection load and, similar to my rate-limiting theory, it's just a coincidence that the time taken to reboot the host and guests is enough to clear the CPE's session table...
I'll add my vote to those strongly recommending a deep dive on the DNS server's logs, and I'll throw the Host system's logs in there too for good measure.
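A quick way to separate "DNS is down" from "nothing gets out at all" during an outage is to test both at the same moment. The sketch below is only an illustration: it uses a TCP connection to port 53 on 8.8.8.8 / 1.1.1.1 as a stand-in for ping (ICMP from a script needs raw sockets and admin rights), and the function names and the example.com test name are made up for the example.

```python
import socket


def can_reach_ip(ip: str, port: int = 53, timeout: float = 3.0) -> bool:
    """True if a TCP connection to ip:port succeeds. No name lookup is involved."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False


def can_resolve(hostname: str) -> bool:
    """True if the resolver configured on this machine returns an address."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        return False


def classify(ip_ok: bool, dns_ok: bool) -> str:
    """Turn the two test results into a rough diagnosis."""
    if ip_ok and dns_ok:
        return "ok"
    if ip_ok:
        return "DNS-only failure"
    if dns_ok:
        return "name resolved (possibly from cache) but no IP path out"
    return "no IP connectivity: look below DNS"


def run_check() -> str:
    """Run both tests once; call this while the outage is happening."""
    ip_ok = any(can_reach_ip(ip) for ip in ("8.8.8.8", "1.1.1.1"))
    dns_ok = can_resolve("example.com")
    return classify(ip_ok, dns_ok)
```

Running `print(run_check())` from a client while things are broken gives one of the four answers; "no IP connectivity" would point away from the DNS server and towards the network path.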
-
@Obsolesce The DNS logs show one interesting entry (linked): the log says that it's transferred the master role from itself to itself https://imgur.com/a/4I75qnB If you mean somewhere else, I apologise!
@Dashrender It happens for every device on the network!
@JasGot Yes, the AD server does DNS and DHCP too, and yes, the host is on the domain.
@notverypunny When I ping a known good IP, e.g. 8.8.8.8, I get "...unreachable" rather than the "Ping request could not..."
-
Do you lose dns and/or network abilities on the DNS server too?
-
@Obsolesce Yes, the only way I can connect to it is through the host's Hyper-V, and the shares on the AD server drop.
-
Tell us about your network setup... switches, firewall, APs
-
If nothing on the network is working, that makes me think a bad switch.
Or possibly a bad NIC taking down your switch.
-
@Dashrender Draytek 2860 (being replaced this week) - 3 * unifi switches (fibre linked) and a few rando APs dotted around. It's quite a simple setup, the APs are a bit sus. I could pull them all out...
-
@choppy_sea said in Server 2019 randomly DNS stops:
Why are the APs sus?
-
@DustinB3403 They're cheapo TP-Link routers acting as APs (they have an AP mode). I think they are sus because technically they are capable of DNS and DHCP themselves... could be one going rogue.
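One rough way to check whether more than one device is handing out DHCP is to broadcast a single DHCPDISCOVER and list every server that answers with an offer. The sketch below is an assumption-laden illustration, not a tested tool for this network: it needs admin/root rights to bind UDP port 68, may clash with the DHCP client already running on the machine, sends out of whichever interface the OS picks, and all function names are invented for the example.

```python
import os
import socket
import struct
import time

MAGIC_COOKIE = b"\x63\x82\x53\x63"  # marks the start of DHCP options


def build_discover(mac: bytes, xid: int) -> bytes:
    """Build a minimal DHCPDISCOVER (RFC 2131) for a 6-byte MAC address."""
    header = struct.pack(
        "!BBBBIHH4s4s4s4s16s64s128s",
        1, 1, 6, 0,        # op=BOOTREQUEST, htype=Ethernet, hlen=6, hops=0
        xid, 0, 0x8000,    # transaction id, secs, flags (ask for broadcast reply)
        b"", b"", b"", b"",  # ciaddr, yiaddr, siaddr, giaddr (zero-padded)
        mac, b"", b"",     # chaddr, sname, file (zero-padded by struct)
    )
    # Option 53 (message type) = 1 (DISCOVER), then option 255 (end).
    return header + MAGIC_COOKIE + bytes([53, 1, 1, 255])


def parse_offer(packet: bytes, xid: int):
    """Return the server identifier (option 54) if packet is an OFFER for xid."""
    if len(packet) < 240 or packet[236:240] != MAGIC_COOKIE:
        return None
    if packet[0] != 2 or struct.unpack("!I", packet[4:8])[0] != xid:
        return None
    msg_type, server_id = None, None
    i = 240
    while i + 1 < len(packet):
        code = packet[i]
        if code == 255:  # end of options
            break
        if code == 0:    # pad byte has no length field
            i += 1
            continue
        length = packet[i + 1]
        value = packet[i + 2:i + 2 + length]
        if code == 53 and len(value) == 1:
            msg_type = value[0]
        elif code == 54 and len(value) == 4:
            server_id = socket.inet_ntoa(value)
        i += 2 + length
    return server_id if msg_type == 2 else None


def find_dhcp_servers(wait: float = 5.0) -> set[str]:
    """Broadcast one DISCOVER and collect the servers that send an OFFER."""
    xid = struct.unpack("!I", os.urandom(4))[0]
    mac = b"\x02" + os.urandom(5)  # random locally administered MAC
    servers = set()
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.bind(("", 68))  # raises an error without sufficient privileges
        sock.sendto(build_discover(mac, xid), ("255.255.255.255", 67))
        deadline = time.monotonic() + wait
        while (remaining := deadline - time.monotonic()) > 0:
            sock.settimeout(remaining)
            try:
                data, _ = sock.recvfrom(2048)
            except OSError:  # includes the timeout
                break
            server = parse_offer(data, xid)
            if server:
                servers.add(server)
    return servers
```

If `find_dhcp_servers()` returns anything other than the AD server's own address, something else on the LAN is answering DHCP requests.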
-
@choppy_sea said in Server 2019 randomly DNS stops:
Oh! Sorry I misread, I thought you said Unifi AP's but you said Unifi switches.
-
@choppy_sea said in Server 2019 randomly DNS stops:
OK, so if you can't even get out by IP, then strictly speaking DNS isn't the issue. Lower-level TCP/IP or something else in the network is a problem before DNS even comes into play. Even if your DNS is completely offline, you should be able to ping 8.8.8.8 or 1.1.1.1.
I'd set up a standalone machine on the network with a static IP and have it pointed to external DNS. If it stops working when everything else does, then you know that it's something in your LAN > WAN setup. If it keeps working when everything else goes sideways, then you're looking at the possibility of something wrong along the lines of the rogue DHCP that you've alluded to, or other LAN-side gremlins. Don't rule out the possibility of a user having connected something that's doing all kinds of fun DHCP garbage... Users can be... "special"
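On that standalone test box, it can also help to ask specific DNS servers directly instead of going through whatever resolver the OS is configured with. Here is a minimal sketch that hand-builds a DNS A-record query (RFC 1035 wire format) and sends it over UDP to one server. The function names are invented for the example, and 192.168.1.10 in the usage note is a placeholder for the AD DNS server's real address.

```python
import socket
import struct

QUERY_ID = 0x1234  # arbitrary id used to match the reply to the query


def build_query(hostname: str, query_id: int = QUERY_ID) -> bytes:
    """Build a minimal DNS query asking for the A record of hostname."""
    # Header: id, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME is each label prefixed by its length, ending with a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def parse_header(response: bytes) -> tuple[int, int, int]:
    """Return (query_id, rcode, answer_count) from a DNS response header."""
    if len(response) < 12:
        raise ValueError("response shorter than a DNS header")
    query_id, flags, _qd, ancount, _ns, _ar = struct.unpack(
        "!HHHHHH", response[:12]
    )
    return query_id, flags & 0x000F, ancount


def query_server(server_ip: str, hostname: str, timeout: float = 3.0) -> str:
    """Ask one specific DNS server directly and summarise what came back."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(hostname), (server_ip, 53))
            data, _ = sock.recvfrom(512)
        except OSError as exc:  # timeout, unreachable network, etc.
            return f"{server_ip}: no reply ({exc})"
    try:
        query_id, rcode, ancount = parse_header(data)
    except ValueError:
        return f"{server_ip}: malformed reply"
    if query_id != QUERY_ID:
        return f"{server_ip}: reply id does not match the query"
    return f"{server_ip}: rcode={rcode}, answers={ancount}"
```

During an outage, comparing `query_server("8.8.8.8", "example.com")` with `query_server("192.168.1.10", "example.com")` shows whether the external resolver, the AD DNS server, both, or neither still answer from that machine.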
-
@notverypunny said in Server 2019 randomly DNS stops:
Rogue DHCP won't cause a universal issue like this all at once unless all the leases came up for renewal at the same time. On top of that, unless the rogue is on the VM host, he's not indicated he's done anything that would remove it from the network, like rebooting all APs/switches. The only reboot mentioned is the VM host.
I'm wondering if you have a bad NIC causing the switch that connects to your firewall to overload, or if the switch itself is bad and flaky. But again, there's no mention of rebooting the switch to make things work again.