Yesterday, I spent most of my day wondering what was wrong with my unbound configuration…. as a TL;DR, if you’re using a made-up TLD for an internal zone (e.g. foo.lan) you may need to disable DNSSEC checks on it within unbound’s config using the ‘domain-insecure’ setting.
For background, we deploy unbound on servers to try and gloss over the fact that the Azure DNS service appeared unreliable when we first moved to the platform: we would see some DNS failures when trying to make outgoing HTTP requests. Perhaps normal people wouldn’t notice this, but we probably make a few million HTTP requests a day …. at which point a small percentage of failures becomes noticeable. Initially we tried systemd-resolved to paper over the problem, but that would sometimes crash in weird and wonderful ways …. so we moved to unbound.
Anyway, all was good … and over time I learnt to add in a manual override for a specific internal service (e.g. MySQL) which might look a bit like this :
server:
    interface: 127.0.0.1
    access-control: 127.0.0.0/8 allow
    access-control: ::1/128 allow
forward-zone:
    name: "."
    forward-addr: 9.9.9.9
    forward-addr: 1.1.1.1
    forward-addr: 1.0.0.1
forward-zone:
    name: "azure.com."
    forward-addr: 168.63.129.16
So at least with my limited understanding of unbound, that’s a “listen on localhost, and forward anything you get to 9.9.9.9 or 1.1.1.1 or 1.0.0.1 …. additionally, if the thing you’re trying to look up is under the azure.com domain, then just go straight to the internal server, 168.63.129.16”.
So that seems to work as expected. With Azure, some addresses can only be resolved internally (especially if they’re through the privatelink service). For example, a MySQL instance with private-link-only connections would have a hostname like mysql-foo-bar.privatelink.mysql.database.azure.com, which you can resolve internally through the platform, but not through a public DNS service, i.e.
$ host mysql-foo-bar.privatelink.mysql.database.azure.com 8.8.8.8
Host mysql-foo-bar.privatelink.mysql.database.azure.com not found: 3(NXDOMAIN)
but this does work :
$ host mysql-foo-bar.privatelink.mysql.database.azure.com 168.63.129.16
mysql-foo-bar.privatelink.mysql.database.azure.com has address 172.1.2.3
Fast forward a few months/years, and I’m setting up a new (hopefully duplicated!) environment, and I decide to have an internal Private DNS Zone that all VMs should register on (a virtual network can only have one zone with auto-registration enabled), e.g. foo.lan.
So, I added the ‘forward-zone’ stuff to unbound, in the same way, but it didn’t work … WTF is going on….
* Lookup directly on the platform DNS server (168.63.129.16) works: ‘host myvm.foo.lan 168.63.129.16’ resolves.
* But if the request went through unbound, I’d get a SERVFAIL immediately: ‘host myvm.foo.lan 127.0.0.1’ fails.
* Other names worked fine.
Eventually, I tried a different domain, one under a real TLD (the equivalent of ‘private.mycorp.com’) …. and that worked.
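So what’s special about having a .lan? Well…. I did eventually discover that “dig @127.0.0.1 server.foo.lan +cd” would work …. where the ‘+cd’ sets the “checking disabled” flag, telling the resolver NOT to do DNSSEC validation of the response. As far as I can tell, the reason is that .lan doesn’t exist in the (signed) public root zone, so a validating resolver can prove the TLD shouldn’t exist and treats the unsigned answer from 168.63.129.16 as bogus, whereas private.mycorp.com sits under a real domain.
That finally led me to realise I needed a ‘domain-insecure: “foo.lan”’ in the server block of unbound’s config, leading to this … which thankfully works.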
server:
    interface: 127.0.0.1
    access-control: 127.0.0.0/8 allow
    access-control: ::1/128 allow
    domain-insecure: "foo.lan"
# bookworm's unbound library doesn't seem to cope with ssl verification? does it need openssl3?
forward-zone:
    name: "."
    # forward-tls-upstream: yes
    # forward-addr: 9.9.9.9@853#dns.quad9.net
    # forward-addr: 1.1.1.1@853#cloudflare-dns.com
    # forward-addr: 1.0.0.1@853#cloudflare-dns.com
    forward-addr: 9.9.9.9
    forward-addr: 1.1.1.1
    forward-addr: 1.0.0.1
forward-zone:
    name: "azure.com"
    forward-addr: 168.63.129.16
forward-zone:
    name: "azure.net"
    forward-addr: 168.63.129.16
# https://learn.microsoft.com/en-us/azure/virtual-network/what-is-ip-address-168-63-129-16?tabs=linux
# we've got a private-dns zone of foo.lan, which all VMs will auto-register with.
forward-zone:
    name: "foo.lan"
    forward-addr: 168.63.129.16
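If you want to check for the same symptom quickly, the with/without ‘+cd’ comparison can be wrapped in a small shell helper. This is only a rough sketch: the function name dnssec_probe is made up for illustration, it assumes dig is installed, and it assumes the resolver under test is listening on 127.0.0.1 unless you pass another address.

```shell
# Hypothetical helper (not part of unbound or dig): compare a normal lookup
# with one that sets the CD (checking disabled) bit. SERVFAIL without +cd but
# NOERROR with +cd suggests the resolver is rejecting the answer during
# DNSSEC validation rather than failing to reach the upstream server.
dnssec_probe() {
    name="$1"
    resolver="${2:-127.0.0.1}"   # assumption: unbound listens on localhost

    # +noall +comments prints only the header lines, which carry the status
    plain=$(dig "@${resolver}" "${name}" +noall +comments \
        | sed -n 's/.*status: \([A-Z]*\).*/\1/p')
    nocheck=$(dig "@${resolver}" "${name}" +cd +noall +comments \
        | sed -n 's/.*status: \([A-Z]*\).*/\1/p')

    echo "validated=${plain} unvalidated=${nocheck}"
    if [ "${plain}" = "SERVFAIL" ] && [ "${nocheck}" = "NOERROR" ]; then
        echo "likely a DNSSEC validation failure for ${name}"
    fi
}
```

Calling something like ‘dnssec_probe myvm.foo.lan’ before and after adding the domain-insecure line should show whether the SERVFAIL was down to validation.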
So, yes … the problem is always DNS …. and hopefully this helps someone else NOT waste half a day head scratching.