Monday, 6 August 2012

Bug 1010724: Why doesn't dnsmasq listen on both IPv4 and IPv6?

Dnsmasq currently only listens on 127.0.0.1; that's done on purpose. If the only nameserver you have is 127.0.0.1, both IPv4 and IPv6 queries will go through it. It doesn't listen on an IPv6 address. We'll likely change the actual address to '127.0.1.1' as soon as this is possible with dnsmasq, there are changes coming up upstream that should support this.

Letting dnsmasq listen on IPv6 is definitely something I wouldn't mind to see working; but it's unfortunately not as simple as adding '--listen-address=::1' to the parameters passed to dnsmasq by NetworkManager. (Actually, it could be, see below)

I understand some may want to disable all IPv4 on their systems, but that's not advisable, at least for the time being and for the loopback interface and dnsmasq specifically. You absolutely can have an IPv6-only system with no IPv4 addresses on any of the physical interfaces, yet retain the use of 127.0.0.1 on the loopback interface for dnsmasq and others -- DNS resolution will still work for both IPv4 and IPv6 without issues, and you will simply not be able to access IPv4 addresses (since it would be an IPv6-only system for the physical interfaces).

The reason just '--listen-address' can't be used is because we've already had reports about dnsmasq listening on 127.0.0.1 being an issue. It's one we want to address. When installed from the 'dnsmasq' package on Ubuntu/Debian; dnsmasq ships an init script that listens on that loopback IPv4 address as well; causing issues for those who genuinely want to run a system-wide instance of dnsmasq that can be interrogated via loopback (thus serving the local machine), or users who haven't changed any of the default configuration for dnsmasq.

In the case of 127.0.0.1, the fix is relatively simple because we can switch to using 127.0.0.2 or 127.0.1.1; but for IPv6, there doesn't seem to be any such thing other than ::1 specifically meant to be used as a loopback address. In IPv4, it's actually a whole subnet that is available to the loopback interface; while in IPv6 you only have one address (::1/128) (see http://tools.ietf.org/html/draft-smith-v6ops-larger-ipv6-loopback-prefix-00).

I'm very open to suggestions; at this point I'm looking for great ideas on how to best fix this and avoid concurrency issues with other applications; but given the rather minimal return of enabling it vs. the impact on other software running on the machine, and because we ran into precisely this kind of issue (multiple applications listening on the same address on port 53) already, I'd be inclined to have a real good alternative before changing things.

Consider the following two strace outputs for 'ping6 www.google.com'. The first one was run with dnsmasq started (manually, for testing purposes, but with the same parameters as NetworkManager uses) to listen on IPv4:

read(3, "# Dynamic resolv.conf(5) file fo"..., 4096) = 183
read(3, "", 4096)                       = 0
close(3)                                = 0
munmap(0x7f45cba80000, 4096)            = 0
socket(PF_INET, SOCK_DGRAM|SOCK_NONBLOCK, IPPROTO_IP) = 3
connect(3, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
poll([{fd=3, events=POLLOUT}], 1, 0)    = 1 ([{fd=3, revents=POLLOUT}])
sendto(3, "\r\347\1\0\0\1\0\0\0\0\0\0\3www\6google\3com\0\0\34\0\1", 32, MSG_NOSIGNAL, NULL, 0) = 32
poll([{fd=3, events=POLLIN}], 1, 5000)  = 1 ([{fd=3, revents=POLLIN}])
ioctl(3, FIONREAD, [90])                = 0
recvfrom(3, "\r\347\201\200\0\1\0\2\0\0\0\0\3www\6google\3com\0\0\34\0\1"..., 1024, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("127.0.0.1")}, [16]) = 90
close(3)                                = 0
socket(PF_INET6, SOCK_DGRAM, IPPROTO_IP) = 3
connect(3, {sa_family=AF_INET6, sin6_port=htons(1025), inet_pton(AF_INET6, "2001:4860:800a::93", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = -1 ENETUNREACH (Network is unreachable)


The network is unreachable only because I didn't have IPv6 access at the time. You can see that the request was sent and the address was properly discovered as "2001:4860:800a::93". The most important part is the first connect() using AF_INET as family, and "127.0.0.1" as the address -- that was libc trying to reach the nameserver defined in /etc/resolv.conf.

Now consider the following strace output, which is for the same request sent while dnsmasq was configured to listen only on ::1; and with ::1 defined as the nameserver in /etc/resolv.conf:

socket(PF_INET6, SOCK_DGRAM|SOCK_NONBLOCK, IPPROTO_IP) = 3
connect(3, {sa_family=AF_INET6, sin6_port=htons(53), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = 0
poll([{fd=3, events=POLLOUT}], 1, 0)    = 1 ([{fd=3, revents=POLLOUT}])
sendto(3, "\220]\1\0\0\1\0\0\0\0\0\0\3www\6google\3com\0\0\34\0\1", 32, MSG_NOSIGNAL, NULL, 0) = 32
poll([{fd=3, events=POLLIN}], 1, 5000)  = 1 ([{fd=3, revents=POLLIN}])
ioctl(3, FIONREAD, [90])                = 0
recvfrom(3, "\220]\201\200\0\1\0\2\0\0\0\0\3www\6google\3com\0\0\34\0\1"..., 1024, 0, {sa_family=AF_INET6, sin6_port=htons(53), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28]) = 90
close(3)                                = 0
socket(PF_INET6, SOCK_DGRAM, IPPROTO_IP) = 3
connect(3, {sa_family=AF_INET6, sin6_port=htons(1025), inet_pton(AF_INET6, "2001:4860:800a::93", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = -1 ENETUNREACH (Network is unreachable)


Very much the same behavior as above. This time, the entry in /etc/resolv.conf was ::1, so that's what was used for the first connect(); and because that's an IPv6 address, AF_INET6 was used as sa_family.

Both IPv4 and IPv6 queries were the first to run, and returned pretty much instantly.

One alternative to allow dnsmasq to listen on both IPv4 and IPv6 could be adding a loopback interface (or a tap interface) and using a limited scope IPv6 address, but there remains gotchas with this particular course of action -- for instance, dnsmasq currently appears to bind to *both* the specified link-local address added to lo as well as the "primary" IPv6 address defined for lo (::1/128).

Furthermore, it seems rather clumsy to me to include both the IPv4 and IPv6 addresses in /etc/resolv.conf when they refer to the same software instance. It's not going to bring much.

If you don't care at all about these details, whether ::1 shows up in /etc/resolv.conf automatically, don't run other instances of dnsmasq and want to experiment with custom configurations; in Quantal you'll be able to add configuration settings to files in /etc/NetworkManager/dnsmasq.d and tweak the settings as required.