Monday, July 20, 2015
Tumbling through code, part II
The oldest bit of code in mod_blog
deals with what I call “tumblers.”
They're not true tumblers as defined by Ted Nelson,
but my implementation does allow a range to be specified.
It's a date range rather than an arbitrary set of characters though.
And the reason the code hasn't changed in fifteen years is that it was hard to write,
given the features I wanted to support at the time.
Not only did I want the ability to specify a single entry,
like 2015/07/18.2,
and a range of entries,
like 2015/07/01-09,
but I also wanted the ability to specify multiple,
non-sequential entries,
like 2000/08/11.1,2000/08/12.6,2000/08/13.4,
while also supporting ranges in reverse,
like 2015/07/09-01,
and all of these combined in a single request,
like 2000/08/10.2-15.5,2015/07/09-01,2003/07/04.2.
So while I got the code to support a forward range and a reversed range,
in order to get the code out the door,
I dropped the non-sequential selection of entries.
The code just got too messy.
But over the past fifteen years,
a slew of issues popped up.
First and foremost was the issue of redundant links:
2015/7/4.1
and 2015/07/04.1
both refer to the same page.
This normally wouldn't be that big an issue except for Google,
which eventually would penalize a website with duplicate content under different links
(why do I even care about this?
Do I really care about my Google page rank?
Enough to let it rule my code?
Apparently I do.
Sigh).
Sure, I can tell that 2015/7/4.1
and 2015/07/04.1
are the same thing,
and you probably can too,
but a computer can't.
Those are two distinct pages that just happen to have the same content.
So I had to hack in code to generate redirects to address this issue.
Now, 2015/7/4.1
would generate a redirect to the canonical version,
2015/07/04.1.
But there were still some corner cases I didn't cover,
such as 2015/07/04.01,
which should redirect to 2015/07/04.1,
or 2000/8/10.2-15.5,
which should redirect to 2000/08/10.2-15.5.
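To make the redirect idea concrete, here is a minimal sketch in Python of canonicalizing a single-entry link. This is not the code from mod_blog; the names `canonical_entry` and `redirect_target` are made up for illustration, and the canonical form (four-digit year, two-digit month and day, part number without leading zeros) is inferred from the examples above.

```python
import re

# Assumed shape of a single-entry tumbler: year/month/day.part
ENTRY = re.compile(r"([0-9]+)/([0-9]+)/([0-9]+)\.([0-9]+)")


def canonical_entry(text):
    """Return the assumed canonical spelling of a single-entry tumbler."""
    m = ENTRY.fullmatch(text)
    if m is None:
        raise ValueError(f"not a single-entry tumbler: {text!r}")
    year, month, day, part = (int(g) for g in m.groups())
    # Zero-pad the date fields; the part number carries no leading zeros.
    return f"{year:04d}/{month:02d}/{day:02d}.{part}"


def redirect_target(text):
    """Return where to redirect, or None if the link is already canonical."""
    canon = canonical_entry(text)
    return None if canon == text else canon
```

With this sketch, `redirect_target("2015/7/4.1")` and `redirect_target("2015/07/04.01")` both give `"2015/07/04.1"`, while `redirect_target("2015/07/04.1")` gives `None`, so no redirect is issued for a link that is already canonical.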
I finally had enough and decided over the weekend to fix the issue by rewriting the code. Yes, I know, you never rewrite code! But in this case, it's not the entire program but a portion (granted, it's about the only portion that hasn't been rewritten) and it's not like the code is entirely bug-free (it's not). It mostly worked, but the code as written was just too convoluted to salvage (in my opinion). Besides, I felt that a more straightforward, “parse it a piece at a time” approach would be better than the “be clever and as general as you can be” approach.
It was,
although it took several further revisions to work out all the corner cases,
such as the difference between 2015/04-2015/05
and 2015/04/05-05/06
(the first ending portion specifies a year and a month, whereas the second ending portion is a month and a day),
checking the dates for validity,
and
fixing the ranges when reversed
(it wasn't as straightforward as I thought it would be—can you tell I haven't used this option all that much?).
And not only did I manage to get 2000/8/10.2-15.5
to redirect properly,
but the reverse range 2000/8/15.5-10.2
as well
(check the links after clicking on them),
all the while reminding myself “don't be afraid of special cases.”
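
For the curious, here is a rough Python sketch of what a “parse it a piece at a time” approach to a single range might look like. It is not the mod_blog code. All the names (`parse_point`, `check_date`, `format_point`, `parse_tumbler`) are hypothetical, and the rule for reading the ending portion is an assumption: its fields are lined up against the right-hand end of the starting portion, which is one way to get the two readings of 2015/04-2015/05 and 2015/04/05-05/06 described above.

```python
import datetime
import re

# Assumed grammar for one portion: up to three "/"-separated numbers,
# optionally followed by ".part".
POINT = re.compile(r"([0-9]+(?:/[0-9]+){0,2})(?:\.([0-9]+))?")
WIDTHS = (4, 2, 2)  # zero-padded widths for year, month, day


def parse_point(text):
    """Split one portion into ([date numbers], part-or-None)."""
    m = POINT.fullmatch(text)
    if m is None:
        raise ValueError(f"bad tumbler portion: {text!r}")
    fields = [int(f) for f in m.group(1).split("/")]
    part = int(m.group(2)) if m.group(2) is not None else None
    return fields, part


def check_date(fields, part):
    """Reject impossible dates; a missing month or day counts as 1 here."""
    if part is not None and len(fields) < 3:
        raise ValueError("a part number needs a full year/month/day")
    datetime.date(*(fields + [1, 1])[:3])  # ValueError for an impossible date


def format_point(fields, part, count):
    """Render the last `count` date fields zero-padded, plus any part."""
    widths = WIDTHS[:len(fields)][-count:]
    text = "/".join(f"{n:0{w}d}" for n, w in zip(fields[-count:], widths))
    return text if part is None else f"{text}.{part}"


def parse_tumbler(text):
    """Parse 'start' or 'start-end' one piece at a time."""
    start_text, dash, end_text = text.partition("-")
    start, start_part = parse_point(start_text)
    check_date(start, start_part)
    canonical = format_point(start, start_part, len(start))
    if not dash:  # special case: a single entry, not a range
        return {"start": (start, start_part), "end": (start, start_part),
                "reversed": False, "canonical": canonical}
    given, end_part = parse_point(end_text)
    if len(given) > len(start):
        raise ValueError("ending portion has more fields than the start")
    # Assumption: borrow the leading fields the ending portion left out.
    end = start[:len(start) - len(given)] + given
    check_date(end, end_part)
    # Reversed if the end date is earlier, or same date with an earlier part.
    is_reversed = end < start or (
        end == start
        and None not in (start_part, end_part)
        and end_part < start_part)
    canonical += "-" + format_point(end, end_part, len(given))
    return {"start": (start, start_part), "end": (end, end_part),
            "reversed": is_reversed, "canonical": canonical}
```

Under those assumptions, `parse_tumbler("2015/04-2015/05")` ends at `[2015, 5]` (a year and a month), `parse_tumbler("2015/04/05-05/06")` ends at `[2015, 5, 6]` (a month and a day in the starting year), and `parse_tumbler("2000/8/15.5-10.2")` comes back flagged as reversed with the canonical text `2000/08/15.5-10.2`. Each special case gets its own plain branch instead of being folded into one general routine.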