Friday, May 20, 2022
If you have to embrace the stupid, you might as well do it well
Our customer, The Oligarchic Cell Phone Company, wants us to do a demo of a new feature for a certain class of clients. “Project: Lumbergh” will receive a URL along with the name and reputation of a phone number it gets from elsewhere. “Project: Lumbergh” will then pass this along to “Project: Sippy-Cup.” We already have to deal with URLs from elsewhere. The only change we have to make is allowing URLs to be passed along to the certain class of clients, which formerly did not get URLs. So far, so good.
But then I saw code being added to “Project: Lumbergh” to check the URLs to see if the path portion ended in .bmp.
I enquired about this,
because to me,
that makes no sense—we're just a conduit for data;
the source of the URL should already know what it can and can't send to the client.
I was told that the certain class of clients only support BMP files while other clients that can receive URLs can't support BMP files,
so we have to ensure that BMPs only go to the subset of clients that can support them.
I countered with the fact that we include information about the client to the data source when we query them,
and they should have the logic to handle this on their end—why are we suddenly responsible for this?
I was told that the LOF for the data source would be too large to handle by the demo deadline, that we had to handle it,
that the code that just looks anywhere in the URL for a literal “.bmp” is Good Enough™,
and to stop with the questions.
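To make the difference between the two checks concrete, here is a minimal Python sketch. It is not the actual “Project: Lumbergh” code; the function names are made up for illustration, the URLs are assumed to be in ordinary (already decoded) form, and the case-insensitive match is my own assumption.

```python
from urllib.parse import urlsplit


def has_bmp_anywhere(url: str) -> bool:
    """The 'Good Enough' check: a literal '.bmp' anywhere in the string."""
    return ".bmp" in url


def path_ends_in_bmp(url: str) -> bool:
    """Stricter check: only the path component must end in '.bmp'.

    Lowercasing the path is an assumption, not something the post specifies.
    """
    return urlsplit(url).path.lower().endswith(".bmp")


if __name__ == "__main__":
    samples = [
        "https://example.com/picture.bmp",
        "https://example.com/picture.bmp.png",
        "https://example.com/view?file=picture.bmp",
        "https://example.com/PICTURE.BMP",
    ]
    for url in samples:
        print(url, has_bmp_anywhere(url), path_ends_in_bmp(url))
```

The two checks agree on the first sample and disagree on the other three, which is the sort of thing that tends to surface right after a demo.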
Now the URL we're given is “percent-encoded”—we get something like https%3A%2F%2Fexample.com%2Fpicture.bmp.
Never mind that this is an invalid URL to begin with
(you aren't supposed to encode characters that are defined as delimiters in URLs if they are,
in fact,
delimiting fields),
that's what we get and pass along.
Only now
(a few years after we started passing URLs along like this)
the clients can't properly decode them
(surprise!),
so of course we have to do that.
I asked why we even had to do that and was told that the LOF for the data source would be too large to handle by the demo deadline, that we had to handle it,
and to stop with the questions.
I then complained that the code doing that was doing too much,
as it would decode the so-called “unsafe characters” from RFC-3986
(which aren't defined in the RFC,
but can be derived by a careful reading between the lines),
like the dreaded space character.
There was then much back and forth between me and my manager
(it's not who I thought it was but that's another rant for another time)
about what should and shouldn't be decoded.
I kept saying that if we have to embrace the stupid,
we might as well do it right,
but my manager argued against that, saying we should just decode %3A and %2F
since that's all that's being asked of us today.
I countered with “What about tomorrow,
when we're asked to decode %3F (‘?’) and %40 (‘@’)?”
(both are delimiter characters per RFC-3986).
I was told to stop with the questions.
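For what it's worth, the two positions differ by about one argument. The sketch below is hypothetical Python, not what went into the project: `decode_only` and `GEN_DELIMS` are names I made up, and the delimiter set is the gen-delims list from RFC-3986.

```python
import re

# The "gen-delims" set from RFC 3986, section 2.2.
GEN_DELIMS = ":/?#[]@"

_ESCAPE = re.compile(r"%([0-9A-Fa-f]{2})")


def decode_only(text: str, allowed: str) -> str:
    """Decode a %XX escape only when it stands for a character in `allowed`.

    Every other escape (including %25, the percent sign itself) is left
    exactly as it was received. The substitution is a single pass, so
    decoded output is never rescanned.
    """
    def replace(match: re.Match) -> str:
        char = chr(int(match.group(1), 16))
        return char if char in allowed else match.group(0)

    return _ESCAPE.sub(replace, text)


if __name__ == "__main__":
    url = "https%3A%2F%2Fexample.com%2Fa%3Fb%40c"
    print(decode_only(url, ":/"))        # today's request
    print(decode_only(url, GEN_DELIMS))  # tomorrow's request
```

With `":/"` the `%3F` and `%40` pass through untouched; with `GEN_DELIMS` they become `?` and `@`. Either way, escapes such as `%20` stay encoded, which was the point of my complaint about decoding too much.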
And then all hell breaks loose when we get https%3A%2F%2Fexample.com%2FThings%2520Go%2520Boom%2521.
Sigh.
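To see why that last URL is trouble: it has been encoded twice, so `%2520` is an encoded `%` followed by `20`. Here is a small Python sketch using the standard `urllib.parse.unquote`. It only illustrates what one full decode versus two produces; it is not a claim about what our code or the clients actually did.

```python
from urllib.parse import unquote

received = "https%3A%2F%2Fexample.com%2FThings%2520Go%2520Boom%2521"

# One full pass turns %3A, %2F and %25 into ':', '/' and '%',
# which leaves a URL whose path is still percent-encoded.
once = unquote(received)

# A second pass (say, by a client that decodes what it is handed)
# turns %20 and %21 into a literal space and '!'.
twice = unquote(once)

if __name__ == "__main__":
    print(once)   # https://example.com/Things%20Go%20Boom%21
    print(twice)  # https://example.com/Things Go Boom!
```

Which of those two is "correct" depends entirely on how many times the other end plans to decode it, and nobody wrote that down.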