Thursday, March 18, 2021
Leave it to the phone companies to make simple ideas complex
These telephone DNS records are typically expected to contain a record with the type NAPTR. NAPTR, generally speaking, is intended to map addresses (more properly URNs) to other types of addresses. For example, E.164 to SIP. So the NAPTR record, typically, would provide an address type of E2U+sip which indicates ENUM (E.164 number mapping) to the SIP protocol.
Fascinatingly, the actual payload of an NAPTR record is… a regular expression. The regular expression specifies a transformation (with capture groups and the whole nine yards) from the original address (the E.164 number) to the new address type. In theory, this allows optimization of NAPTR records at higher levels of the hierarchy if components of the original address are also used in the new address. This is one of many areas of DNS that are trying perhaps too hard to be clever.
Computers Are Bad: can I get your number domain
I'm very surprised I haven't mentioned NAPTR records before, because I have to deal with them at The Corporation. The service we provide to our customers, the Oligarchic Cell Phone Companies, is to translate a phone number, such as “867-5309” to “Jenny,” and that is done via DNS using NAPTR records. I was surprised at first, because I was expecting the Oligarchic Cell Phone Companies to use some esoteric and proprietary protocol to handle such information, but nope—it's via DNS. One complication here is that each phone company keeps its own database of numbers to names, and you have to negotiate with, and more importantly, pay, each one to query their data. Now, because The Corporation has all these contracts in place, all an Oligarchic Cell Phone Company has to do is pay us to get access to the name databases of the other Oligarchic Cell Phone Companies.
A classic middleman situation here.
But getting back to the article. The author uses +18002662278 as an example of an E.164 global number (where E.164 is a standard from ITU), but the example is a bad one, because 800 numbers are not global numbers! You can only dial 800 numbers from North America, so to present +18002662278 as a global number is incorrect. This is an issue for us, because Project: Sippy-Cup has to deal with both global and “local” numbers (where “local” means calls within a single country code). And yes, we get a ton of 800 numbers sent in as global numbers (then again, we get an amazing assortment of garbage numbers from the Oligarchic Cell Phone Companies, including all zeros, all ones, numbers with international dialing prefixes, short codes, service codes, and it wouldn't surprise me if we get ZIP codes from time to time; you'd think the Oligarchic Cell Phone Companies would filter out such garbage, but hey, they're the Oligarchic Cell Phone Companies, they don't have to care).
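To give a feel for the kind of triage this implies, here is a minimal sketch in Python. It is not the code from Project: Sippy-Cup; the function name `classify_number`, the category labels, and the seven-digit cutoff for "short" numbers are all assumptions made up for illustration. The list of toll-free area codes and the 15-digit E.164 limit are the only hard facts in it.

```python
import re

# North American toll-free area codes.
NANP_TOLL_FREE = {"800", "833", "844", "855", "866", "877", "888"}

def classify_number(raw: str) -> str:
    """Roughly sort an incoming number into a bucket (hypothetical labels)."""
    s = re.sub(r"[\s().-]", "", raw)          # drop common punctuation
    plus = s.startswith("+")
    body = s[1:] if plus else s
    if not body.isdigit():
        return "malformed"
    if len(set(body)) == 1:                   # all zeros, all ones, ...
        return "repeated-digit"
    if not plus and body.startswith(("011", "00")):
        return "dialing-prefix"               # international prefix left in
    if len(body) < 7:                         # assumed cutoff, not a standard
        return "short-code"
    if plus and len(body) > 15:               # E.164 allows at most 15 digits
        return "too-long"
    if plus and len(body) == 11 and body[0] == "1" and body[1:4] in NANP_TOLL_FREE:
        return "toll-free-as-global"
    return "global" if plus else "local"

for sample in ["+18002662278", "0000000000", "011442079460000", "911", "867-5309"]:
    print(sample, "->", classify_number(sample))
```

The order of the checks matters: the repeated-digit test has to run before the dialing-prefix test, or a string of all zeros gets mistaken for a number with a leading "00".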
I also agree with the author about the use of regular expressions in the NAPTR record. It complicates the spec, and in the decade I've been working at The Corporation, I've yet to see anyone use the regular expression transformation. In the source code that parses NAPTR records is this comment from the original developer: “Skip the insanely stupid regexp pattern; we are ignoring it.”
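For anyone who hasn't seen one, the regexp field has the form delimiter, pattern, delimiter, replacement, delimiter, flags, and the result is the replacement with any `\1` to `\9` back-references filled in from the match. A minimal sketch of applying one in Python follows. The function name `apply_naptr_regexp` and the example domains are made up for illustration, Python's `re` module stands in for POSIX extended regular expressions (close enough for simple patterns), and it assumes the delimiter never appears inside the pattern or replacement.

```python
import re

def apply_naptr_regexp(field: str, aus: str):
    """Apply a NAPTR regexp field to the input string (e.g. "+18002662278").

    Returns the rewritten string, or None when the pattern does not match.
    """
    delim = field[0]                          # first character is the delimiter
    parts = field[1:].split(delim)
    if len(parts) != 3:
        raise ValueError("expected <delim>pattern<delim>replacement<delim>flags")
    pattern, replacement, flags = parts
    match = re.search(pattern, aus, re.IGNORECASE if "i" in flags else 0)
    if match is None:
        return None
    # Fill in \1..\9 from the capture groups; the rest is copied literally.
    return re.sub(r"\\([1-9])",
                  lambda m: match.group(int(m.group(1))) or "",
                  replacement)

# The common case: ignore the input entirely and return a fixed URI.
print(apply_naptr_regexp("!^.*$!sip:info@example.com!", "+18002662278"))
# The clever case: reuse part of the number in the result.
print(apply_naptr_regexp(r"!^\+1800(.*)$!sip:\1@tollfree.example.com!", "+18002662278"))
```

The first form, which matches anything and returns a constant, is the one that makes the whole mechanism feel like overkill.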
The name lookups are not using the “E2U+sip” addressing, but “E2U+pstndata:cnam” addressing, based off a draft standard from twelve years ago. And reading that specification is a hoot in the light of the harsh reality of … um … reality. It defines the pstndata: scheme as:
    pstndatauri = "pstndata:" datatype ["/" telephone-subscriber ] ";" content
    datatype    = "cnam"  ; Other datatypes can be defined by adding
                          ; alternative values.
    content     = [ mediatype ] [ ":base64" ] "," data
    mediatype   = [ type "/" subtype ] *( ";" parameter )
    data        = *urlchar
    parameter   = attribute "=" value

where "telephone-subscriber" is imported from RFC 3966 [19], "urlchar" is imported from RFC 2396 [20], and "attribute" and "value" are imported from RFC 2045 [21].
LIES! ALL LIES!
Okay, it's not quite that bad, but having dealt with all three RFCs mentioned, I'm not sure the author of the specification carefully read them, or considered deeply how they might interact. The definition of telephone-subscriber contains not only the “phone number” but also parameters separated by semicolons! I have code to parse telephone-subscriber but I can't use it here because in reality, it conflicts with this definition (why yes, the whole notion of parsing this data came up recently at work, why do you ask?). It's always fun when reality trumps theory (no, not really).
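The conflict is easy to show with a toy example. The sketch below is in Python and is not my actual parser: the `SUBSCRIBER` pattern is a heavily simplified stand-in for telephone-subscriber (digits followed by any number of ";name" or ";name=value" parameters), the function names are made up, and both sample URIs are hypothetical. It compares reusing a subscriber parser against simply cutting at the first semicolon.

```python
import re

# Simplified telephone-subscriber: digits plus any ";name[=value]" parameters.
SUBSCRIBER = re.compile(r"\+?\d+(?:;[A-Za-z0-9-]+(?:=[^;,]+)?)*")

def split_reusing_subscriber_parser(rest: str):
    """Parse the subscriber greedily, then require the ";" separator."""
    m = SUBSCRIBER.match(rest)
    if m is None or not rest.startswith(";", m.end()):
        return None                           # separator already swallowed
    return m.group(0), rest[m.end() + 1:]

def split_at_first_semicolon(rest: str):
    """Treat the first ";" as the separator, parameters be damned."""
    subscriber, sep, content = rest.partition(";")
    return (subscriber, content) if sep else None

prefix = "pstndata:cnam/"
for uri in [prefix + "+18002662278;text/plain;charset=us-ascii,Jenny",
            prefix + "+18002662278;ext=12;,Jenny"]:
    rest = uri[len(prefix):]
    print(split_reusing_subscriber_parser(rest), split_at_first_semicolon(rest))
```

In the first URI the subscriber parser happily eats ";text" as a parameter and then chokes on "/plain". In the second, both approaches succeed but disagree about where the phone number ends. Neither outcome is what you want from a grammar.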
So as the author states, “DNS does handle phone numbers!” But it's not quite as simple as IP addresses.