The Boston Diaries

The ongoing saga of a programmer who doesn't live in Boston, nor does he even like Boston, but yet named his weblog/journal “The Boston Diaries.”

Go figure.

Tuesday, May 03, 2022

The legality of double slashes in URIs

Martin Chang replied to my musings on processing malformed Gemini requests, saying that double slashes in URIs are illegal, and pointed out the ABNF grammar from the URI specification to back up his claim:

path          = path-absolute   ; begins with "/" but not "//"
path-absolute = "/" [ segment-nz *( "/" segment ) ]
segment-nz    = 1*pchar
pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"

But he didn't quote the segment rule:

segment       = *pchar

which translated says, “0 or more pchar rules.”

So the ABNF he quoted does indeed rule out //boston/2018/07/04.2. It doesn't rule out /boston//2018/07/04.2, since by the time we hit the double slash, we're in the *( "/" segment ) part of the path-absolute rule, and segment can have 0 characters. But what he quoted only applies to relative links; what I receive is an absolute link. If you follow the ABNF from that perspective:

URI-reference = URI / relative-ref
URI           = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
hier-part     = "//" authority path-abempty
                 / path-absolute
                 / path-rootless
                 / path-empty

path-abempty  = *( "/" segment )

; other rules omitted

not only does this allow gemini://gemini.conman.org//boston/2018/07/04.2 but also gemini://gemini.conman.org//////////boston/2018/07/04.2.
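To make the difference between the two rules concrete, here is a minimal sketch that translates the quoted ABNF into Python regular expressions. The names `is_path_absolute` and `is_path_abempty` are made up for this illustration, and the translation covers only the rules quoted above, so treat it as a sanity check rather than a full URI parser.

```python
import re

# pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
PCHAR = r"(?:[A-Za-z0-9\-._~]|%[0-9A-Fa-f]{2}|[!$&'()*+,;=]|[:@])"
SEGMENT = rf"{PCHAR}*"      # segment    = *pchar   (may be empty)
SEGMENT_NZ = rf"{PCHAR}+"   # segment-nz = 1*pchar  (at least one character)

# path-absolute = "/" [ segment-nz *( "/" segment ) ]
PATH_ABSOLUTE = re.compile(rf"/(?:{SEGMENT_NZ}(?:/{SEGMENT})*)?")

# path-abempty  = *( "/" segment )
PATH_ABEMPTY = re.compile(rf"(?:/{SEGMENT})*")


def is_path_absolute(path: str) -> bool:
    """True if the whole string matches the path-absolute rule."""
    return PATH_ABSOLUTE.fullmatch(path) is not None


def is_path_abempty(path: str) -> bool:
    """True if the whole string matches the path-abempty rule."""
    return PATH_ABEMPTY.fullmatch(path) is not None


if __name__ == "__main__":
    for path in ("//boston/2018/07/04.2",
                 "/boston//2018/07/04.2",
                 "//////////boston/2018/07/04.2"):
        print(path, is_path_absolute(path), is_path_abempty(path))
```

Under this translation, only a leading double slash fails path-absolute, while path-abempty accepts empty segments anywhere, including at the very start.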

I can understand why this was done: to simplify the grammar. The various path- rules generally end with *( "/" segment ), which allows a URI to end with or without a trailing slash. I don't think the intent was to allow long strings of slashes, but that's the end result of a lax grammar. Martin is also correct that multiple slashes are treated as a single slash on POSIX (basically, any Unix system), but that's not the case across all operating systems. One exception I can think of is AmigaOS, where each slash represents a parent directory. The command cd /// on AmigaOS is the same as cd ../../.. on a POSIX system. Crazy, I know. And maybe not even relevant these days, but I thought I should mention it.
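The two interpretations of repeated slashes can be sketched side by side. Both functions below are hypothetical helpers written for this illustration, not code from any server. The POSIX one ignores the special case of exactly two leading slashes, which POSIX leaves implementation-defined, and the AmigaOS one only models the cd /// example, not the full AmigaOS path syntax.

```python
import re


def posix_collapse(path: str) -> str:
    """Collapse each run of slashes into one, as POSIX path resolution does.

    Simplification: exactly two leading slashes are implementation-defined
    in POSIX; this sketch collapses them like any other run.
    """
    return re.sub(r"/{2,}", "/", path)


def amiga_parents_to_posix(path: str) -> str:
    """Translate an AmigaOS path made only of slashes into POSIX '..' steps.

    Assumption: each slash means "go up one directory", which is all the
    cd /// example needs. Anything else is rejected.
    """
    if not path or set(path) != {"/"}:
        raise ValueError("this sketch only handles paths made entirely of slashes")
    return "/".join([".."] * len(path))


if __name__ == "__main__":
    print(posix_collapse("/boston//2018/07/04.2"))
    print(amiga_parents_to_posix("///"))
```

The point of the comparison is that a server that blindly collapses slashes is baking in one operating system's convention, which the URI grammar itself does not require.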

Obligatory Miscellaneous

You have my permission to link freely to any entry here. Go ahead, I won't bite. I promise.

The dates are the permanent links to that day's entries (or entry, if there is only one entry). The titles are the permanent links to that entry only. The format for the links is simple: start with the base link for this site, https://boston.conman.org/, then add the date you are interested in, say 2000/08/01, which makes the final URL:

https://boston.conman.org/2000/08/01

You can also specify the entire month by leaving off the day portion. You can even select an arbitrary portion of time.

You may also note subtle shading of the links and that's intentional: the “closer” the link is (relative to the page) the “brighter” it appears. It's an experiment in using color shading to denote the distance a link is from here. If you don't notice it, don't worry; it's not all that important.

It is assumed that every brand name, slogan, corporate name, symbol, design element, et cetera mentioned in these pages is a protected and/or trademarked entity, the sole property of its owner(s), and acknowledgement of this status is implied.

Copyright © 1999-2024 by Sean Conner. All Rights Reserved.