The sane library for decoding DNS packets

Monday, Debtember 06, 2010

Every so often over the past twenty years or so, I've had the need to make DNS queries other than the standard A type; you know, like MX records, or PTR records, but the standard calls available under Unix are lacking. And each time I've investigated other DNS resolving libraries, they've been horrendously complex while at the same time, only resolving a few record types.

So when my attention turned once again towards DNS, I decided that the best course of action was to buckle down and write my own library, which I did over this past Thanksgiving holiday weekend.

I think my approach to the problem is unique. At least, compared to all the other DNS resolving libraries I glanced over, it's unique. First off, I pretty much ignored the actual network aspect. Sure, I have a simple routine to send a query to a DNS server, but the code is dumb and frankly, not terribly efficient, seeing how it opens up a new socket for each request. Also, it only handles UDP-based requests, and pretty much assumes there will be no network errors that drop the request (which, as far as I can tell, hasn't actually happened).
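For a sense of what such a "dumb" routine might look like, here is a minimal sketch using plain POSIX sockets. It is not the library's code: the name udp_exchange and its parameters are made up for illustration, and the receive timeout is an added assumption so that a dropped packet doesn't block forever.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of any library): send one already-encoded
   query to an IPv4 server over UDP and wait for a single reply.  A new
   socket is opened for every call.  Returns the reply length, or -1 on
   error or timeout.  A timeout_ms of 0 means "wait forever". */
static ssize_t udp_exchange(
        const char    *server_ip,
        uint16_t       port,
        const uint8_t *req,
        size_t         reqsize,
        uint8_t       *reply,
        size_t         replymax,
        int            timeout_ms
)
{
  struct sockaddr_in addr;
  struct timeval     tv;
  ssize_t            got;
  int                sock;

  memset(&addr,0,sizeof(addr));
  addr.sin_family = AF_INET;
  addr.sin_port   = htons(port);
  if (inet_pton(AF_INET,server_ip,&addr.sin_addr) != 1)
    return -1;  /* not a valid dotted-quad address */

  sock = socket(AF_INET,SOCK_DGRAM,0);
  if (sock < 0)
    return -1;

  tv.tv_sec  = timeout_ms / 1000;
  tv.tv_usec = (timeout_ms % 1000) * 1000;

  /* connect() on a UDP socket just fixes the peer, so the kernel
     discards datagrams arriving from any other address */
  if (
          (setsockopt(sock,SOL_SOCKET,SO_RCVTIMEO,&tv,sizeof(tv)) < 0)
       || (connect(sock,(struct sockaddr *)&addr,sizeof(addr)) < 0)
       || (send(sock,req,reqsize,0) != (ssize_t)reqsize)
     )
  {
    close(sock);
    return -1;
  }

  got = recv(sock,reply,replymax,0);  /* -1 if the timeout expires */
  close(sock);
  return got;
}
```

There is no retransmission here; a caller that cares about lost packets would call it again after a -1.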

Instead, I concentrated on the actual protocol aspect of the problem, decoding raw packets into something useful, and taking something useful and encoding it into a raw packet. I tackled encoding first, and it's an elegant solution: you fill in a structure with appropriate data and call an encoding routine which does all the dirty work.

dns_question_t domain; /* what we're asking about */
dns_query_t    query;  /* the "form" we fill out */
dns_packet_t   request[DNS_BUFFER_UDP]; /* the encoded query */
size_t         reqsize;
dns_rcode_t    rc;

domain.name  = "conman.org."; /* only fully qualified domain names */
domain.type  = RR_LOC;        /* let's see where this is located */
domain.class = CLASS_IN;

query.id        = random(); /* randomize the ID for security */
query.query     = true;     /* yes, this is a query */
query.rd        = true;     /* we're asking for a recursive lookup */
query.opcode    = OP_QUERY; /* obviously */
query.qdcount   = 1;        /* we're asking one question */
query.questions = &domain;  /* about this domain */

reqsize = sizeof(request);
rc      = dns_encode(request,&reqsize,&query);

if (rc != RCODE_OKAY)
{
  /* Houston, we have a problem! */
}

Once encoded, we can send it off via the network to a DNS server. Now, you may think that's quite a bit of code to make a single query, and yes, it is. And yes, it may seem silly to mark the query as a query, and have to specify the actual operation code as a query, but there are a few other types of operations one can specify and besides, this beats the pants off setting up queries in all the other DNS resolving libraries.

And this approach to filling out a structure, then calling an explicit encoding routine is not something I've seen elsewhere. At best, you might get a library that lets you fill out a structure (but most likely, it's a particular call for a particular record type) but then you call this “all-dancing, all-singing” function that does the encoding, sending, retransmissions on lost packets, decoding and returns a single answer. My way, sure, there's a step for encoding, but it allows you to handle the networking portion as it fits into your application.
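To give an idea of the "dirty work" an encoding routine has to do, here is a minimal hand-rolled sketch that turns a single question into the wire format described in RFC 1035. It is not the library's encoder; encode_query and put16 are hypothetical names, and the sketch only handles one question with recursion requested.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* store a 16-bit value in network (big-endian) byte order */
static void put16(uint8_t *p,uint16_t v)
{
  p[0] = (uint8_t)(v >> 8);
  p[1] = (uint8_t)(v & 0xFF);
}

/* Illustrative only: encode a one-question recursive query.
   fqdn must end in '.' (the bare root "." is rejected in this sketch).
   Returns the number of bytes written, or 0 on a malformed name or a
   buffer that is too small. */
static size_t encode_query(
        uint8_t    *buf,
        size_t      max,
        uint16_t    id,
        const char *fqdn,
        uint16_t    qtype,
        uint16_t    qclass
)
{
  size_t      len   = strlen(fqdn);
  size_t      at    = 12;  /* the question follows the 12-byte header */
  const char *label = fqdn;

  if ((len == 0) || (fqdn[len - 1] != '.') || (len > 254))
    return 0;
  if (max < 12 + (len + 1) + 4)  /* header + encoded name + type/class */
    return 0;

  put16(&buf[0],id);
  buf[2] = 0x01;                 /* QR=0 (query), OPCODE=0, RD=1 */
  buf[3] = 0x00;
  put16(&buf[4],1);              /* QDCOUNT: one question */
  put16(&buf[6],0);              /* ANCOUNT */
  put16(&buf[8],0);              /* NSCOUNT */
  put16(&buf[10],0);             /* ARCOUNT */

  while(*label != '\0')
  {
    const char *dot  = strchr(label,'.');  /* always found; name ends in '.' */
    size_t      llen = (size_t)(dot - label);

    if ((llen == 0) || (llen > 63))        /* empty or oversized label */
      return 0;
    buf[at++] = (uint8_t)llen;             /* length byte, then the label */
    memcpy(&buf[at],label,llen);
    at   += llen;
    label = dot + 1;
  }

  buf[at++] = 0;                 /* zero-length root label ends the name */
  put16(&buf[at],qtype);
  put16(&buf[at + 2],qclass);
  return at + 4;
}
```

Asking for the LOC record (type 29, class 1) of "conman.org." this way yields a 28-byte packet: 12 bytes of header, 12 for the name and 4 for the type and class.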

Anyway, on the backend, once you've received the binary blob back from the DNS server, you simply call one function to decode the whole thing:

dns_packet_t   reply[DNS_BUFFER_UDP];
size_t         replysize;
dns_decoded_t  decoded[DNS_DECODEBUF_4K];
size_t         decodesize;
dns_query_t   *result;

/* reply contains the packet; replysize is the amount of data */

decodesize = sizeof(decoded);
rc         = dns_decode(decoded,&decodesize,reply,replysize);

if (rc != RCODE_OKAY)
{
  /* Houston, we have another problem! */
}

/* Using the above query for the LOC resource record */

result = (dns_query_t *)decoded;

if (result->ancount == 0)
{
  /* no answers */
}

printf(
  "Latitude:  %3d %2d %2d %s\n"
  "Longitude: %3d %2d %2d %s\n"
  "Altitude:  %ld\n",
  result->answers[0].loc.latitude.deg,
  result->answers[0].loc.latitude.min,
  result->answers[0].loc.latitude.sec,
  result->answers[0].loc.latitude.nw ? "N" : "S",
  result->answers[0].loc.longitude.deg,
  result->answers[0].loc.longitude.min,
  result->answers[0].loc.longitude.sec,
  result->answers[0].loc.longitude.nw ? "W" : "E",
  result->answers[0].loc.altitude
);

Yes, it really is that simple, and yes, I support LOC records, along with 29 other DNS record types (out of 59 defined DNS record types). So I'm fully decoding half the record types. Which sounds horrible (“only 50%?”) until you compare it to the other DNS resolving libraries, which typically only handle around half a dozen records, if that. Even more remarkable is the amount of code it takes to do all this:

Lines of Code to decode DNS records
Library	Lines of code	Records decoded
`spcdns`	1321	30
`c-ares`	1452	7
`udns`	872	6
`adns`	1558	13
`djbdns`	1276	5

(I should also mention that the 1,321 lines for spcdns include the encoding routine; line counts for all the other libraries exclude such code; I'm too lazy to separate out the encoding routine in my code since it's all in one file.)

How was I able to get such densities of code? Well, aside from good code reuse (really! Nine records are decoded by one routine; another twelve by just three routines) perhaps it was just the different approach I took to the whole problem.
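The post doesn't say which records share a routine. One plausible group, and this is only a guess, is the records whose data is nothing more than a single domain name (NS, CNAME and PTR among them). A sketch of that kind of reuse, with hypothetical names decode_name and decode_name_rr, might look like this:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { T_NS = 2 , T_CNAME = 5 , T_PTR = 12 };  /* RFC 1035 type codes */

/* Decode the (possibly compressed) name at pkt[off] into dotted text.
   Returns the text length, or 0 on malformed input or a small buffer. */
static size_t decode_name(
        const uint8_t *pkt,
        size_t         size,
        size_t         off,
        char          *out,
        size_t         outmax
)
{
  size_t len  = 0;
  int    hops = 0;

  for(;;)
  {
    uint8_t c;

    if (off >= size) return 0;
    c = pkt[off];
    if (c == 0) break;                   /* root label: end of name */

    if ((c & 0xC0) == 0xC0)              /* compression pointer */
    {
      if ((off + 1 >= size) || (++hops > 32))
        return 0;                        /* truncated, or a pointer loop */
      off = ((size_t)(c & 0x3F) << 8) | pkt[off + 1];
      continue;
    }

    if (c > 63) return 0;                /* reserved label types */
    if (off + 1 + c > size) return 0;    /* label runs off the packet */
    if (len + c + 2 > outmax) return 0;  /* label + '.' + NUL must fit */

    memcpy(&out[len],&pkt[off + 1],c);
    len       += c;
    out[len++] = '.';
    off       += 1 + (size_t)c;
  }

  if (len == 0)                          /* the name is just the root */
  {
    if (outmax < 2) return 0;
    out[len++] = '.';
  }
  out[len] = '\0';
  return len;
}

/* One routine for every record whose RDATA is a lone domain name;
   supporting another such type costs one extra case label. */
static size_t decode_name_rr(
        uint16_t       type,
        const uint8_t *pkt,
        size_t         size,
        size_t         rdata_off,
        char          *out,
        size_t         outmax
)
{
  switch(type)
  {
    case T_NS:
    case T_CNAME:
    case T_PTR:
         return decode_name(pkt,size,rdata_off,out,outmax);
    default:
         return 0;  /* not a name-only record in this sketch */
  }
}
```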

Another feature of the code is its lack of memory allocation. Yup, the decoding routine does not once call malloc(); nope, all the memory it uses is passed to it (in the example above, it's the dns_decoded_t decoded[DNS_DECODEBUF_4K] line). I've found through testing that a 4K buffer is enough to decode any UDP-based result. And by giving the decode routine a block of memory to work with, not only does it avoid a potentially expensive malloc() call, it also avoids fragmenting memory and keeps the memory cache more coherent. (When Mark saw an earlier version he expressed concern about alignment issues, as at the time I was passing in a character buffer. I reworked the code such that the dns_decoded_t type forces the memory alignment. But because an application may be making queries via TCP, which means the replies can be bigger, I didn't want to hardcode a size into the type; it would either be too big, thus wasting memory, or too small, which makes it useless. The way I have it now, with the array, you can adjust the memory to fit the situation.)
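The general technique can be sketched as a tiny bump allocator that carves pieces out of a caller-supplied, suitably aligned array. This is an illustration of the idea and not the library's internals; block_t, arena_t, arena_init and arena_alloc are all hypothetical names, and block_t merely stands in for a type like dns_decoded_t.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an alignment-forcing type: an array of this
   union is aligned strictly enough for any of its members. */
typedef union
{
  long       l;
  double     d;
  void      *p;
  uintmax_t  m;
} block_t;

typedef struct
{
  uint8_t *base;  /* start of the caller's memory */
  size_t   used;  /* bytes handed out so far      */
  size_t   size;  /* total bytes available        */
} arena_t;

/* The caller owns the memory; nothing here ever calls malloc(). */
static void arena_init(arena_t *a,block_t *mem,size_t bytes)
{
  a->base = (uint8_t *)mem;
  a->used = 0;
  a->size = bytes;
}

/* Hand out `bytes` rounded up to a whole number of block_t units, so
   every returned pointer keeps block_t alignment.  Returns NULL once
   the caller's buffer is exhausted. */
static void *arena_alloc(arena_t *a,size_t bytes)
{
  size_t left = a->size - a->used;
  size_t need;
  void  *p;

  if (bytes > left)
    return NULL;
  need = (bytes + sizeof(block_t) - 1) / sizeof(block_t) * sizeof(block_t);
  if (need > left)
    return NULL;

  p        = a->base + a->used;
  a->used += need;
  return p;
}
```

A caller would declare something like block_t mem[512] on the stack (or statically, or inside a larger structure), pass it to arena_init() along with sizeof(mem), and size the array up or down to suit the largest reply it expects.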

And from that, the code itself is thread safe; there are no globals used, unlike some other protocol stacks I've been forced to use, so there should be absolutely no issues with incorporating the library into an application.

Oh, and I almost forgot the Lua bindings I made for the library. After all, a protocol stack should make things easy, right?

Update on Saturday, October 26th, 2013

It was brought to my attention (thank you, Richard E. Brown) that I should link to the source code so that it's easier to find.

So linked.

The Boston Diaries
