DNS 1: Research and Encoding

On a personal level, I’ve been interested in DNS resolution and how it actually works for a while. Whenever it’s been described to me, it sounds simple, and yet the tutorials I consulted consistently ended up at things like “we’ll use a library for parsing DNS packets” or “let’s implement a subset”. Of course, there are some truly excellent guides out there, but I decided that if I really wanted to do it, I should do it properly and hit the books (or, more accurately, the Requests for Comments). Thus, as I develop it, I’m going to go step by step and document what I find.

How complicated could it really be—

oh my

How DNS Resolution Works

RFC 1035

So, good news for me. The number of updates was misleading, thankfully. I was fully committed either way, but a cursory glance through them told me most didn’t matter or were implementation details I didn’t care about yet. So let’s actually get somewhere first, and then we’ll care about those.

Underlyingly, DNS is pretty simple. There are “zones”, which approximately correlate to subtrees in an internet-wide sort of file hierarchy (practically speaking, the little chunks separated by dots). In order to know what is inside a “zone”, you have to send a query to an authoritative source, the nameserver responsible for that zone¹. It’ll give you a response, and based on what you get, you are either referred to the authoritative nameserver for a lower subtree or you have an answer to your query (if there is one). All resolvers start out with knowledge of at least one authoritative nameserver to refer to (the spec seems to assume this is a root nameserver as managed by IANA, but obviously this isn’t usually strictly true).

Instead of directly querying from top to bottom down the tree hierarchy, a client can (and pretty much always does) query a recursive nameserver. These follow down the tree, and then cache results for durations specified in the responses to be able to respond to future queries without doing the whole process again.²

Interestingly, though the spec distinguishes between resolvers and recursive nameservers, resolvers running as a daemon on the host system can act similarly to a recursive nameserver, caching included—this is referred to in one place as a “DNS stub”. For an example, see systemd-resolved. All things considered, I would like to try to build something along these lines I can run locally, but also something I could deploy to my personal server, so I’m opting to build a library and then a resolver application dependent on it.

Regardless, for a normal, nothing-cached-whatsoever request, you’d query one of the root nameservers³, which would then give you an NS (nameserver) response to send you to the TLD nameserver (e.g., .com, .net), and that would send you to an authoritative, and eventually something will hand you an A or a CNAME or whatever. Or nothing at all. What are those? Well.

Huh, There’s a Lot of Response Types

All communication in DNS is done in terms of messages. This is a very simple structure consisting of a header and four optional data fields. The format is helpfully and intuitively defined in terms of octets, which avoids any semantics about the edgiest of edge cases about what the smallest addressable unit is on some systems that may or may not exist. After the header come zero or ~~more~~ one questions⁴, zero or more answers, zero or more authorities, and zero or more additionals. Of these, answers, authorities, and additionals all share a format known as a “resource record” or RR.

The header structure is fairly straightforward:

Section	Size	Purpose
ID	2 octets	Identifier to match up responses to queries
QR	1 bit	Set if message is response, else query
OPCODE	4 bits	Specifies the kind of query
AA	1 bit	Set if response is from authoritative
TC	1 bit	Set if message was truncated (i.e., INCOMPLETE, not compressed)⁵
RD	1 bit	Optional for implementation, if set in query it requests recursive behavior
RA	1 bit	Set in response if nameserver supports recursive resolution
Z	3 bits	Reserved, should be zeroed
RCODE	4 bits	Response codes
QDCOUNT	2 octets	Number of entries in question section
ANCOUNT	2 octets	Number of entries in answer section
NSCOUNT	2 octets	Number of entries in authority section
ARCOUNT	2 octets	Number of entries in additional section
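
To make the table concrete, here is a minimal sketch of packing those fields into the 12 octets that go on the wire. The names (`struct dnsl_header`, `dnsl_header_encode`, `dnsl_put_u16`) are hypothetical and only borrow the `dnsl_` prefix for illustration. The sketch assumes the RFC 1035 layout: fields appear in table order, the most significant bit comes first, and multi-octet values are big-endian.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DNSL_HEADER_SIZE 12 /* ID, flags, and four counts: 6 x 2 octets */

/* Hypothetical unpacked header; one member per row of the table. */
struct dnsl_header {
    uint16_t id;
    uint8_t qr, opcode, aa, tc, rd, ra, z, rcode;
    uint16_t qdcount, ancount, nscount, arcount;
};

/* Store a 16-bit value in network (big-endian) byte order. */
static void dnsl_put_u16(uint8_t *out, uint16_t v)
{
    out[0] = (uint8_t)(v >> 8);
    out[1] = (uint8_t)(v & 0xFF);
}

/* Pack h into DNSL_HEADER_SIZE octets at out; returns octets written. */
size_t dnsl_header_encode(const struct dnsl_header *h, uint8_t *out)
{
    assert(h != NULL && out != NULL);
    dnsl_put_u16(out, h->id);
    /* Octet 2: QR | OPCODE (4 bits) | AA | TC | RD, high bit first. */
    out[2] = (uint8_t)(((h->qr & 1u) << 7) | ((h->opcode & 0xFu) << 3) |
                       ((h->aa & 1u) << 2) | ((h->tc & 1u) << 1) |
                       (h->rd & 1u));
    /* Octet 3: RA | Z (3 bits) | RCODE (4 bits). */
    out[3] = (uint8_t)(((h->ra & 1u) << 7) | ((h->z & 7u) << 4) |
                       (h->rcode & 0xFu));
    dnsl_put_u16(out + 4, h->qdcount);
    dnsl_put_u16(out + 6, h->ancount);
    dnsl_put_u16(out + 8, h->nscount);
    dnsl_put_u16(out + 10, h->arcount);
    return DNSL_HEADER_SIZE;
}
```

For a typical recursive query (ID 0x1234, RD set, one question, everything else zero), this produces the octets 12 34 01 00 00 01 00 00 00 00 00 00. Keeping the flags as separate members and masking them at encode time is just one option; packed bitfields would also work but leave the bit order up to the compiler.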

The question field is just as straightforward, except for two details:

Section	Size	Purpose
QNAME	Variable, up to 255 octets	The domain to be resolved
QTYPE	2 octets	Specifies the qtype of query
QCLASS	2 octets	Specifies the qclass of query

Those details being types and classes, which are shared between questions and response records (though questions have a slightly larger set for both). The class describes the network the request pertains to (e.g., in essentially every modern situation, class 1 = IN = the Internet; though it looks like Hesiod might still matter in some niche situations), and the type describes the type of record being sought (ranging from very recognizable ones like A records, CNAME records, and MX (mail) records to much more niche ones I’ve personally never heard of like PTR “domain name pointer”, WKS “well known service description”, and SOA “start of zone of authority”). I don’t know what most of them mean, but I’ll definitely research them as I move forwards and put it in a future post.

Regardless, the full list in C macro form works out to:

#define DNSL_DEFINE_COMMON_TYPE_VALUES(name) \
    DNSL_##name##_A = 1, \
    DNSL_##name##_NS = 2, \
    DNSL_##name##_OBSOLETE_MD = 3, \
    DNSL_##name##_OBSOLETE_MF = 4, \
    DNSL_##name##_CNAME = 5, \
    DNSL_##name##_SOA = 6, \
    DNSL_##name##_EXPERIMENTAL_MB = 7, \
    DNSL_##name##_EXPERIMENTAL_MG = 8, \
    DNSL_##name##_EXPERIMENTAL_MR = 9, \
    DNSL_##name##_EXPERIMENTAL_NULL = 10, \
    DNSL_##name##_WKS = 11, \
    DNSL_##name##_PTR = 12, \
    DNSL_##name##_HINFO = 13, \
    DNSL_##name##_MINFO = 14, \
    DNSL_##name##_MX = 15, \
    DNSL_##name##_TXT = 16

#define DNSL_DEFINE_COMMON_CLASS_VALUES(name) \
    DNSL_##name##_IN = 1, \
    DNSL_##name##_OBSOLETE_CS = 2, \
    DNSL_##name##_CH = 3, \
    DNSL_##name##_HS = 4
...
enum dnsl_question_type {
    DNSL_DEFINE_COMMON_TYPE_VALUES(QUESTION_TYPE),
    DNSL_QUESTION_TYPE_AXFR = 252,
    DNSL_QUESTION_TYPE_MAILB = 253,
    DNSL_QUESTION_TYPE_OBSOLETE_MAILA = 254,
    DNSL_QUESTION_TYPE_ALL = 255
};

enum dnsl_question_class {
    DNSL_DEFINE_COMMON_CLASS_VALUES(QUESTION_CLASS),
    DNSL_QUESTION_CLASS_ALL = 255
};
...
enum dnsl_rrecord_type {
    DNSL_DEFINE_COMMON_TYPE_VALUES(RRECORD_TYPE)
};

enum dnsl_rrecord_class {
    DNSL_DEFINE_COMMON_CLASS_VALUES(RRECORD_CLASS)
};

With that, all we have left to look into structure-wise is resource records. They’re not too dissimilar from questions, but much more complicated in practice:

Section	Size	Purpose
NAME	Variable, up to 255 octets	The domain name to which the record pertains
TYPE	2 octets	Specifies the type of the response
CLASS	2 octets	Specifies the class of the response
TTL	4 octets	Time interval in seconds before cached response must be expired
RDLENGTH	2 octets	Length of the RDATA field
RDATA	Variable	Describes the resource depending on TYPE and CLASS
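
As a rough illustration of the table, the sketch below reads the ten fixed octets (TYPE, CLASS, TTL, RDLENGTH) that follow NAME and checks that the advertised RDATA actually fits in the message. The names `struct dnsl_rr_fixed` and `dnsl_rr_fixed_decode` are made up for this example, and the caller is assumed to have already stepped past NAME. All values are big-endian on the wire.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DNSL_RR_FIXED_SIZE 10 /* TYPE(2) + CLASS(2) + TTL(4) + RDLENGTH(2) */

/* Hypothetical holder for the fixed-size part of a resource record. */
struct dnsl_rr_fixed {
    uint16_t type;
    uint16_t rclass;
    uint32_t ttl;
    uint16_t rdlength;
};

/* Read a big-endian 16-bit value. */
static uint16_t dnsl_get_u16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Read a big-endian 32-bit value. */
static uint32_t dnsl_get_u32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Decode the fixed fields starting at msg[off] (the octet right after NAME).
 * Returns the offset at which RDATA begins, or 0 if the fields or the
 * RDATA they announce would run past the end of the message.
 */
size_t dnsl_rr_fixed_decode(const uint8_t *msg, size_t msg_len, size_t off,
                            struct dnsl_rr_fixed *rr)
{
    assert(msg != NULL && rr != NULL);
    if (off > msg_len || msg_len - off < DNSL_RR_FIXED_SIZE)
        return 0;
    rr->type = dnsl_get_u16(msg + off);
    rr->rclass = dnsl_get_u16(msg + off + 2);
    rr->ttl = dnsl_get_u32(msg + off + 4);
    rr->rdlength = dnsl_get_u16(msg + off + 8);
    if (msg_len - off - DNSL_RR_FIXED_SIZE < rr->rdlength)
        return 0;
    return off + DNSL_RR_FIXED_SIZE;
}
```

Returning 0 on failure is safe here because a valid RDATA offset is always at least 10. Validating RDLENGTH against the remaining message length up front means whatever interprets RDATA later never has to re-check bounds.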

So… what is the RDATA format?

That depends. And varies quite a bit. So we’ll get to that in a future post when I finish implementing responses. In the meantime, it’s essentially the result of the query. For instance, an A query returns a 4-octet IPv4 address, while CNAME returns a variable-length domain name.
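
The A case is simple enough to sketch right away. Assuming the RDATA of an A record is exactly four octets in network order, a hypothetical helper (`dnsl_a_rdata_format` is not an existing function) can render it as the usual dotted quad:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format the RDATA of an A record as dotted-quad text.
 * Returns the number of characters written (excluding the NUL),
 * or -1 if rdlength is not 4 or the output buffer is too small.
 */
int dnsl_a_rdata_format(const uint8_t *rdata, uint16_t rdlength,
                        char *out, size_t cap)
{
    assert(rdata != NULL && out != NULL);
    if (rdlength != 4)
        return -1;
    int n = snprintf(out, cap, "%u.%u.%u.%u",
                     (unsigned)rdata[0], (unsigned)rdata[1],
                     (unsigned)rdata[2], (unsigned)rdata[3]);
    if (n < 0 || (size_t)n >= cap)
        return -1;
    return n;
}
```

Given the RDATA octets C0 00 02 01, this writes the string 192.0.2.1. Rejecting any RDLENGTH other than 4 is a deliberate choice for the sketch: an A record with a different length is malformed, and it seems safer to say so than to guess.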

Odds and Ends

With that, there are two more minor details to cover in the spec before getting to implementation and such.

Domain Name Format

Domains run through a quick processing step before being inserted into queries. Each “label” in the original domain name (i.e., each portion separated by the dots) is prefixed with a single octet holding its length, and the name as a whole is terminated by a zero-length label, which is just a single 0x00 octet. For instance, www.paterissa.net would become 0x03www0x09paterissa0x03net0x00. Don’t think this can always be treated as a C string, though, since technically 0x00 can legally occur inside a label itself⁶. I’m not entirely certain how that interacts with internationalization/IDN names yet. I’ll come back to that at the tail end of a future post. Finally, and this is important for the next section, labels are strictly restricted to a maximum size of 63 octets. This is because the two high bits of the length octet are reserved.
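
Here is a minimal sketch of that processing step: length-prefixed labels followed by a single terminating zero octet. `dnsl_name_encode` is a hypothetical name, and the sketch takes a dotted, NUL-terminated ASCII string as input, so it cannot represent labels that contain 0x00 or a literal dot. It enforces the 63-octet label limit and the 255-octet name limit from the tables above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DNSL_MAX_LABEL 63  /* two high bits of the length octet are reserved */
#define DNSL_MAX_NAME  255 /* total encoded length, terminator included */

/*
 * Encode a dotted name such as "www.paterissa.net" into wire format.
 * A single trailing dot is accepted. Returns the number of octets
 * written, or 0 on an empty label, an oversized label or name, or
 * insufficient room in out.
 */
size_t dnsl_name_encode(const char *name, uint8_t *out, size_t cap)
{
    assert(name != NULL && out != NULL);
    size_t pos = 0;
    const char *p = name;

    while (*p != '\0') {
        const char *dot = strchr(p, '.');
        size_t len = dot ? (size_t)(dot - p) : strlen(p);

        if (len == 0 || len > DNSL_MAX_LABEL)
            return 0;
        /* Need room for length octet, label, and the final zero octet. */
        if (pos + 1 + len + 1 > cap || pos + 1 + len + 1 > DNSL_MAX_NAME)
            return 0;
        out[pos++] = (uint8_t)len;
        memcpy(out + pos, p, len);
        pos += len;
        p += len;
        if (*p == '.')
            p++; /* skip the separator */
    }
    if (pos + 1 > cap)
        return 0;
    out[pos++] = 0; /* zero-length root label terminates the name */
    return pos;
}
```

Encoding www.paterissa.net this way yields 19 octets: 03 'w' 'w' 'w' 09 'p' ... 'a' 03 'n' 'e' 't' 00. An empty input string encodes to the lone 0x00 of the root. A fully compliant encoder would take labels as length-delimited byte arrays instead of one C string, precisely because of the embedded-zero caveat.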

Message Compression

To reduce message size, DNS uses a compression scheme to eliminate the repetition of domain names across multiple values in a single response. The spec is a bit difficult to parse on this: it says

The compression scheme allows a domain name in a message to be
represented as either:

a sequence of labels ending in a zero octet

a pointer

a sequence of labels ending with a pointer

So… what does that actually mean? It’s pretty simple, just described in an unnecessarily difficult fashion. Let’s say you have example.com, sub1.example.com, sub2.example.com, and sub1.sub2.example.com, in that order. That means:

example.com > example + com + 0x00
sub1.example.com > sub1 + pointer(example.com) or sub1 + example + com + 0x00
sub2.example.com > sub2 + pointer(example.com) or sub2 + example + com + 0x00
sub1.sub2.example.com > sub1 + pointer(sub2.example.com) or sub1 + sub2 + pointer(example.com) or sub1 + sub2 + example + com + 0x00

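So it’s not really that complicated. The pointer format is a two-octet sequence: the two high bits are set to 0b11, and the remaining 14 bits hold an offset measured from the start of the message (that is, from the first octet of the header). Since label lengths top out at 63, a length octet never has both of those bits set, which is how a parser tells a pointer apart from a label.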
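
Reading a possibly compressed name back out can be sketched as a loop that copies labels and jumps whenever it meets a pointer. `dnsl_name_decode` and `DNSL_MAX_HOPS` are hypothetical, and the hop limit of 16 is an arbitrary guard against pointer loops, not something the spec prescribes. The output is dotted text in a C string, so labels containing 0x00 or a dot would not survive the round trip; that simplification is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DNSL_MAX_HOPS 16 /* arbitrary cap on pointer jumps (assumption) */

/*
 * Decode the name starting at msg[off] into dotted text in out.
 * Returns how many octets the name occupies at its original position
 * (so the caller can skip past it), or 0 on malformed input or if
 * out is too small.
 */
size_t dnsl_name_decode(const uint8_t *msg, size_t msg_len, size_t off,
                        char *out, size_t cap)
{
    assert(msg != NULL && out != NULL);
    size_t pos = off, written = 0, consumed = 0;
    unsigned hops = 0;

    for (;;) {
        if (pos >= msg_len)
            return 0;
        uint8_t len = msg[pos];

        if ((len & 0xC0) == 0xC0) { /* pointer: 0b11 + 14-bit offset */
            if (pos + 1 >= msg_len || ++hops > DNSL_MAX_HOPS)
                return 0;
            if (consumed == 0) /* only the first jump ends the in-place name */
                consumed = pos - off + 2;
            pos = ((size_t)(len & 0x3F) << 8) | msg[pos + 1];
            continue;
        }
        if (len & 0xC0) /* 0b01 and 0b10 prefixes are reserved */
            return 0;
        if (len == 0) { /* root label: end of name */
            if (consumed == 0)
                consumed = pos - off + 1;
            break;
        }
        /* Room for an optional '.', the label, and the final NUL. */
        if (pos + 1 + len > msg_len || written + len + 2 > cap)
            return 0;
        if (written > 0)
            out[written++] = '.';
        memcpy(out + written, msg + pos + 1, len);
        written += len;
        pos += 1 + (size_t)len;
    }
    if (written >= cap)
        return 0;
    out[written] = '\0';
    return consumed;
}
```

With example.com stored at offset 12 (right after the header) and a later name encoded as 04 's' 'u' 'b' '1' C0 0C, decoding the later name gives sub1.example.com and reports 7 octets consumed, since the pointer ends the name in place. A stricter decoder might also insist that pointers only point backwards; this sketch relies on the hop limit instead.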

¹ “each zone is the complete database for a particular ‘pruned’ subtree of the domain space”
² “In either case, resolvers are replaced with stub resolvers which act as front ends to resolvers located in a recursive server in one or more name servers known to perform that service”
³ The list is available here
⁴ So. Facts to keep you up at night. There is absolutely nothing forcing this to be 0 or 1. Indeed, there’s an entirely separate RFC noting that, hey, technically this can be as many as 65,535, but it’s fiiiine and it’s USUALLY one, and indeed “several parameters specified for DNS response messages such as AA and RCODE have no defined meaning when the message contains multiple queries as there is no way to signal which question those parameters relate to”, so really “A DNS message with OPCODE = 0 and QDCOUNT > 1 MUST be treated as an incorrectly formatted message”. Cool? Cool.
⁵ It’s not abundantly clear in the spec, but RFC 2181 specifies: “Where TC is set, the partial RRSet that would not completely fit may be left in the response. When a DNS client receives a reply with TC set, it should ignore that response, and query again, using a mechanism, such as a TCP connection, that will permit larger replies.”
⁶ Sigh. Yes, this also means that to be compliant you can’t simply stop at 0x00.

Paterissa