ZeroTier and DNS issues
-
@dafyre Well here I am! (Author/founder of ZeroTier)
Reading the above, it seems the issue is Active Directory DNS. While I know tons about networking, unfortunately I am not an AD expert.
Pertino, it seems, hijacks DNS. This stuff is in the category of things we want to avoid -- ugly, nasty hacks that fix one thing but likely break everything else. This "enterprise" approach is how Windows networking got in such a bad state to begin with -- in digging into Windows one can see how this or that hack was put in place to make this or that work in an "enterprise" environment, and each hack results in a fractal explosion of edge cases that in turn demand more and more ugly hacks, and so on, until the entire thing becomes the ridiculous ball of garbage that it is today.
But in some cases we have simply been forced to do it. In all such cases we've tried to build such hacks as far from the ZeroTier core as possible. Here's one from WindowsEthernetTap:
https://github.com/zerotier/ZeroTierOne/blob/master/osdep/WindowsEthernetTap.cpp#L902
So let me explain my understanding of this Windows AD DNS issue:
Windows AD DNS likes to automatically register DNS entries for all adapters in the system. When ZT adapters are added, these can collide with, override, or pollute the DNS space with undesired entries. Is this the problem?
If not, can someone explain the issue in a bit more detail? What precisely is going on under the hood? Maybe we can figure out and document a fix that's more elegant.
-
@adam.ierymenko Welcome aboard! 8-)
-
@adam.ierymenko said:
So let me explain my understanding of this Windows AD DNS issue:
Windows AD DNS likes to automatically register DNS entries for all adapters in the system. When ZT adapters are added, these can collide with, override, or pollute the DNS space with undesired entries. Is this the problem?
Sounds about right.
-
@colliver How are these DNS records "learned?" What is the mechanism? Does the client computer enumerate its interface addresses to AD, or does AD learn them by listening to the network?
Also in the scenario described above:
(1) Is the desire to use the ZeroTier network as the main "company LAN" and run AD over that?
(2) Or is the desire to use ZeroTier for other purposes and keep the main company LAN from getting polluted by its addresses?
These are obviously different use cases. I imagine #1 might actually be easier than #2, but that's a greenfield approach.
-
@adam.ierymenko gets plenty of bonus points for clearly documenting Windows hacks. 8-)
-
@adam.ierymenko said:
@colliver How are these DNS records "learned?" What is the mechanism? Does the client computer enumerate its interface addresses to AD, or does AD learn them by listening to the network?
Also in the scenario described above:
(1) Is the desire to use the ZeroTier network as the main "company LAN" and run AD over that?
(2) Or is the desire to use ZeroTier for other purposes and keep the main company LAN from getting polluted by its addresses?
These are obviously different use cases. I imagine #1 might actually be easier than #2, but that's a greenfield approach.
I don't use ZeroTier at the moment. I'm pretty sure the AD client sets its own DNS entry when it registers with the domain controller (someone please correct me if I'm wrong).
I think @Dashrender was going for option 2 in this case.
-
Hmm... so perhaps the problem is that the AD client is enumerating its addresses and choosing the wrong one. If so, I think the thing to do would be to look into Windows' logic for choosing the "primary" interface and/or AD's logic for choosing which IP address(es) reported by clients to name as their primary.
You would think Windows would be smart enough to set AD DNS to IPs within the IP block managed by AD, but that probably assumes too much.
-
@adam.ierymenko said:
Hmm... so perhaps the problem is that the AD client is enumerating its addresses and choosing the wrong one. If so, I think the thing to do would be to look into Windows' logic for choosing the "primary" interface and/or AD's logic for choosing which IP address(es) reported by clients to name as their primary.
You would think Windows would be smart enough to set AD DNS to IPs within the IP block managed by AD, but that probably assumes too much.
In the past with Pertino we were able to change the default adapter on the Pertino clients to the physical adapter (moving it up in the list). That seemed to fix the issue for us... not sure if that would solve the issue here or not.
-
Hmm... so the question is: how does Windows determine a priority list for adapters, and which one is the 'default'? Answering that question seems more elegant than hijacking DNS.
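A minimal sketch of the ordering idea, assuming Windows prefers the adapter with the lowest interface metric (the value PowerShell's `Get-NetIPInterface` reports as `InterfaceMetric`). The `Adapter` type, the adapter names, and the metric values are made up for illustration; this only models the sort order and does not read or change anything on a real machine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Adapter:
    name: str    # hypothetical adapter name
    metric: int  # interface metric; lower is assumed to mean higher priority


def by_priority(adapters):
    """Return adapters ordered from most to least preferred (lowest metric first)."""
    return sorted(adapters, key=lambda a: a.metric)


# Made-up starting point: the virtual adapter happens to have the lower metric.
adapters = [
    Adapter("ZeroTier One", 5),
    Adapter("Ethernet", 25),
]

# Raising the virtual adapter's metric pushes it below the physical one.
adjusted = [Adapter(a.name, 50) if a.name == "ZeroTier One" else a for a in adapters]

print([a.name for a in by_priority(adapters)])
print([a.name for a in by_priority(adjusted)])
```

If that assumption holds, "moving the physical adapter up in the list" amounts to giving it a lower metric than the virtual adapter.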
-
Root of the problem is that none of these protocols (DNS, AD, DHCP, etc.) were designed for a world in which a client can belong to more than one network.
-
??? Could this perhaps be helpful?
-
@adam.ierymenko said:
??? Could this perhaps be helpful?
Yep, that's what I ended up doing. It worked out for what we needed.
-
@scottalanmiller I wonder then: why all the DNS magic if the issue can be solved by simply setting connection priority? Or did the under-the-hood DNS magic solve a different issue?
-
Adam, welcome to ML! Thanks for taking the time to join and post!
Here's my setup.
I have a DC (DC2) that is also a file server that has ZT installed on it. I have one client laptop that also has ZT installed and connects to the DC/Fileserver just fine.
The problem, as you surmised, is that the ZT adapter registered itself with DNS when it came online. My DNS server now has two IP addresses for this DC.
Hold on to your seat, I'm going to spill a whole lot more information.
I discovered a problem when I was on DC1 trying to ping DC2. When I typed
ping dc2
I received the IP address for the ZT adapter, and the request timed out (there's no route to get there on my network).
Even the fully qualified name,
ping dc2.domain.com
didn't work; same result.
At first I had completely forgotten about my installation of ZT on DC2, and even though the IP returned by the ping tests above wasn't anything listed in my external DNS (split-brain DNS), I recalled having weird DNS issues in the past when IPv6 was enabled. I disabled IPv6 on DC1, tried the ping test again, and voilà, it worked. I started digging around a bit more, trying to figure out where this oddball 10.x.x.x address was coming from, and then I bumped into my install of ZT. OK, mystery solved.
I jumped into DNS and found the second DNS record for DC2 with the ZT IP.
Now I'm asking myself - what keeps DNS from giving the ZT IP to internal Windows clients when they are querying for DC2?
This led me to the desire to remove the second entry, but I couldn't just delete it from DNS: the next time the adapter refreshes on DC2, it will simply be re-added. So someone suggested removing the checkmark next to "Register this connection's addresses in DNS" under Adapter settings > ZT adapter > IPv4 > Advanced button > DNS tab. Sadly that didn't work. Why, you ask? Because Windows won't allow you to access the DNS tab if you don't manually assign an IP address to the adapter, and ZT is using DHCP. Now I'm guessing there is a registry key I could change - but before I went that far, I started this thread.
Upon further consideration I also realized that I don't want to remove the ZT IP from DNS, because then my ZT client would no longer be able to use DNS to find DC2.
Hopefully this is enough to get started.
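For reference, a hedged sketch of the registry route guessed at above. It assumes the per-interface value is `DisableDynamicUpdate` (a DWORD set to 1) under the Tcpip `Interfaces` key, which is how I recall Microsoft's dynamic update documentation describing it; please verify that before relying on it. The GUID is a placeholder, and `plan_change` and `apply_change` are names invented for this example. By default the script only prints what it would write.

```python
# Assumption: per-interface TCP/IP settings live here, one subkey per adapter GUID.
INTERFACES_KEY = r"SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\Interfaces"
# Assumption: a DWORD of 1 in this value turns off DNS registration for the adapter.
VALUE_NAME = "DisableDynamicUpdate"


def plan_change(adapter_guid):
    """Return (subkey, value_name, data) describing the change, without touching the registry."""
    return (INTERFACES_KEY + "\\" + adapter_guid, VALUE_NAME, 1)


def apply_change(adapter_guid):
    """Write the planned value under HKLM. Windows only; needs an elevated prompt."""
    import winreg  # standard library, but only present on Windows

    subkey, name, data = plan_change(adapter_guid)
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, subkey, 0, winreg.KEY_SET_VALUE) as key:
        winreg.SetValueEx(key, name, 0, winreg.REG_DWORD, data)


# Placeholder GUID; the real one would be the ZT adapter's interface GUID.
guid = "{00000000-0000-0000-0000-000000000000}"
print(plan_change(guid))
```

PowerShell's `Set-DnsClient` also has a `-RegisterThisConnectionsAddress` parameter, which may be a simpler way to get at the same setting. Either way, this stops the adapter from registering at all, so it runs into the concern about ZT clients no longer finding DC2 by name.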
-
@adam.ierymenko said:
??? Could this perhaps be helpful?
This would not have solved my issue, ZT wasn't installed on DC1, so there was nothing to change.
-
"Upon further consideration I also realized that I don't want to remove the ZT IP from DNS, because then my ZT client would no longer be able to use DNS to find DC2."
Oh my. The horror. Let me check my understanding. What you want (ideally) is:
(1) Clients on the regular network get regular network addresses when they resolve things, especially the DCs.
(2) Clients on the ZeroTier network get ZeroTier network addresses when they resolve things, especially the DCs.
... but DNS and AD were not designed for multi-path or multi-network use.
I think I might understand Pertino's hack now. They hacked multipath into DNS by rewriting DNS queries or responses based on which networks you belong to. (???)
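To make that guess concrete, here is a minimal sketch of "answer based on which network the asker is on." This is not Pertino's code, just an illustration of the idea; the function name `answers_for` and all addresses are made up.

```python
import ipaddress


def answers_for(client_ip, records, networks):
    """Return only the A records that sit in the same network as the client.

    Falls back to every record if the client matches none of the networks,
    or if the matching network holds no record for the name.
    """
    client = ipaddress.ip_address(client_ip)
    for net in map(ipaddress.ip_network, networks):
        if client in net:
            matching = [r for r in records if ipaddress.ip_address(r) in net]
            if matching:
                return matching
    return list(records)


# Made-up addresses: one LAN record and one ZeroTier record for the same DC.
dc2_records = ["192.168.1.10", "10.147.17.10"]
networks = ["192.168.1.0/24", "10.147.17.0/24"]

print(answers_for("192.168.1.50", dc2_records, networks))  # LAN client
print(answers_for("10.147.17.77", dc2_records, networks))  # ZeroTier client
```

A LAN client gets only the LAN address and a ZeroTier client gets only the ZeroTier address, which is goals (1) and (2) above. A client that sits on both networks is still ambiguous, so something like this would not fully replace adapter priority.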
If this were a greenfield deployment I'd suggest using ZeroTier as the company LAN and binding AD only to that. This is what we'd call the "fully virtualized network." We have some distributed teams doing that and it works fine, though they had to do a bit of hackery to coax AD into preferring the ZT interface.
Here's an idea, though we have not tried this:
Don't run ZeroTier on the domain controllers. Instead, set up a Linux VM and bridge the DC's network to the ZeroTier network. Then set up the ZT network to assign IPs within the main network's range but in a region that will not be handed out on the main network by its DHCP servers. Now when clients join ZT they get another main network IP address and from the perspective of any clients on the main network they now have two connections to it. It looks as if they have two network cards with two cables plugged into the same switch (which is legal, and each will get a different IP).
Then set the physical interface to higher priority on the clients. When connected to the LAN, clients will go over that. When off-site, clients will go through ZT.
Now you no longer have two address spaces, so DNS will just have one IP.
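The addressing part of this idea can be sanity-checked before touching any config. A small sketch, with a made-up subnet and ranges and a hypothetical helper name `check_plan`: the ZT-assigned range has to sit inside the main network's subnet and must not overlap the pool the LAN's DHCP server hands out.

```python
import ipaddress


def check_plan(lan_cidr, dhcp_first, dhcp_last, zt_first, zt_last):
    """Return a list of problems with the proposed address plan (empty if none)."""
    lan = ipaddress.ip_network(lan_cidr)
    dhcp_lo, dhcp_hi = ipaddress.ip_address(dhcp_first), ipaddress.ip_address(dhcp_last)
    zt_lo, zt_hi = ipaddress.ip_address(zt_first), ipaddress.ip_address(zt_last)

    problems = []
    if zt_lo not in lan or zt_hi not in lan:
        problems.append("ZT range is not inside the LAN subnet")
    # Two closed ranges overlap when each one starts before the other ends.
    if zt_lo <= dhcp_hi and dhcp_lo <= zt_hi:
        problems.append("ZT range overlaps the DHCP pool")
    return problems


# Made-up plan: DHCP hands out .100-.199, ZeroTier assigns .200-.250.
print(check_plan("192.168.1.0/24", "192.168.1.100", "192.168.1.199",
                 "192.168.1.200", "192.168.1.250"))
# Made-up bad plan: the ZeroTier range starts inside the DHCP pool.
print(check_plan("192.168.1.0/24", "192.168.1.100", "192.168.1.199",
                 "192.168.1.150", "192.168.1.250"))
```

The first plan comes back with no problems; the second is flagged for overlapping the DHCP pool.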
-
That solution looks good for a primarily mobile user, where you'll have little concern about having two DNS entries for the client in DNS. Two entries are a problem when you are trying to manage the client devices: you can run into the same problem as my DC with two IP addresses.
But I see the potential for a lot of problems for someone who is in and out, and who finds themselves with two DNS entries most of the time.
-
On further thought, I wonder if it would work if ZT were given the same IP scheme as the main network, but were set to a lower priority on all machines. Then it would be used as an alternate path and only if the main network were not available.
This might be something we'd want to officially support: "shadow network" use case?
-
@Dashrender Yes, I can see issues as well. Unfortunately I can't see a clean solution that doesn't involve either changing the layout of things or some form of client-side hackery. But I want to think about this a bit longer.
-
I'll agree that DNS doesn't handle multi-homed computers well. Or rather, our ability to use DNS effectively when a device has more than one registered IP is poor at best.
AD itself doesn't care about IP space beyond its ability to reference DNS to find a device, which is a mandate, but not what I would call a failing on the part of AD.