Monday, November 13, 2017
It shouldn't be this hard to support another syndication feed format
A few days ago I came across a new syndication feed format (like RSS or Atom)—JSON Feed:
We — Manton Reece and Brent Simmons — have noticed that JSON has become the developers’ choice for APIs, and that developers will often go out of their way to avoid XML. JSON is simpler to read and write, and it’s less prone to bugs.
So we developed JSON Feed, a format similar to RSS and Atom but in JSON. It reflects the lessons learned from our years of work reading and publishing feeds.
See the spec. It’s at version 1, which may be the only version ever needed. If future versions are needed, version 1 feeds will still be valid feeds.
It's not like I need another syndication format, and it's still unclear just how popular JSON Feed really is, but hey, I thought, it should be pretty easy to add this. It looks simple enough:
{
  "version": "https://jsonfeed.org/version/1",
  "title": "My Example Feed",
  "home_page_url": "https://example.org/",
  "feed_url": "https://example.org/feed.json",
  "items": [
    {
      "id": "2",
      "content_text": "This is a second item.",
      "url": "https://example.org/second-item"
    },
    {
      "id": "1",
      "content_html": "<p>Hello, world!</p>",
      "url": "https://example.org/initial-post"
    }
  ]
}
I just need to add another entry to the template section of the configuration file, create a few template files, and as they say in England, “the brother of your mother is Robert” (how they know my mother's brother is Robert, I don't know; the English are weird like that).
But the issue is filling in the content_text field. The first issue: JSON is encoded using UTF-8. For me, that's not an issue, as I'm using UTF-8 (and even before I switched to using UTF-8, I was using ASCII, which is valid UTF-8 by design). But in theory, someone could be using mod_blog with some other encoding scheme, which means an invalid JSON Feed unless fed through a character set conversion routine, which I don't support in mod_blog.
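To give a sense of what such a conversion routine involves, here is a minimal sketch for one specific case, ISO-8859-1 to UTF-8. The function name `latin1_to_utf8` and its calling convention are made up for illustration; nothing like it exists in mod_blog, and other source encodings would need lookup tables or a library such as iconv.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of mod_blog: convert a NUL-terminated
 * ISO-8859-1 string to UTF-8. Every Latin-1 byte value equals its Unicode
 * code point, so bytes below 0x80 are copied unchanged and bytes from 0x80
 * up become a two-byte UTF-8 sequence.
 *
 * Returns the number of bytes the complete conversion needs, not counting
 * the terminating NUL. If dst is too small the output is cut off at a
 * character boundary, but it is always NUL-terminated when dstsize > 0. */
size_t latin1_to_utf8(char *dst, size_t dstsize, const char *src)
{
  size_t need = 0;  /* bytes required for the whole conversion */
  size_t out  = 0;  /* bytes actually stored in dst */
  int    full = 0;  /* set once a character no longer fits */

  assert(src != NULL);
  assert(dst != NULL || dstsize == 0);

  for (const unsigned char *p = (const unsigned char *)src; *p != '\0'; p++)
  {
    unsigned char piece[2];
    size_t        n;

    if (*p < 0x80)
    {
      piece[0] = *p;
      n        = 1;
    }
    else
    {
      piece[0] = (unsigned char)(0xC0 | (*p >> 6));    /* 110000xx */
      piece[1] = (unsigned char)(0x80 | (*p & 0x3F));  /* 10xxxxxx */
      n        = 2;
    }

    need += n;
    if (!full && out + n < dstsize)  /* keep one byte for the NUL */
    {
      memcpy(dst + out, piece, n);
      out += n;
    }
    else
      full = 1;
  }

  if (dstsize > 0)
    dst[out] = '\0';
  return need;
}
```

A caller would pass a buffer of up to twice the input length plus one, or call once with a size of 0 to learn how much room is needed.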
But even assuming I did, that still doesn't mean I'm out of the woods.
Suppose this was my content:
<p>"Hello," said the politician, lying.</p>

<p>"Back up!" I said, using my left hand to quickly cover my wallet in my back pocket. "You aren't getting any money from me!"</p>
If you check the syntax of JSON, you'll see that the double quote character " needs to be converted to \". A similar transformation is required for the blank line, being converted to \n. And I have no code written in mod_blog for such conversions.
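The conversion itself is small. The sketch below escapes a string for use inside a JSON string literal, following the rules in the JSON grammar: the double quote, the backslash, and control characters below 0x20 must be escaped. The name `json_escape` is hypothetical, and the code assumes the input is already UTF-8, so bytes of 0x80 and above pass through untouched.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of mod_blog: escape src so it can sit
 * between the double quotes of a JSON string.
 *
 * Returns the number of bytes the complete result needs, not counting
 * the terminating NUL. If dst is too small the output is cut off between
 * escapes, but it is always NUL-terminated when dstsize > 0. */
size_t json_escape(char *dst, size_t dstsize, const char *src)
{
  size_t need = 0;  /* bytes required for the whole result */
  size_t out  = 0;  /* bytes actually stored in dst */
  int    full = 0;  /* set once a piece no longer fits */

  assert(src != NULL);
  assert(dst != NULL || dstsize == 0);

  for (const unsigned char *p = (const unsigned char *)src; *p != '\0'; p++)
  {
    char   piece[8];  /* longest piece is "\u00XX" plus NUL */
    size_t n;

    switch (*p)
    {
      case '"':  strcpy(piece, "\\\""); break;
      case '\\': strcpy(piece, "\\\\"); break;
      case '\b': strcpy(piece, "\\b");  break;
      case '\f': strcpy(piece, "\\f");  break;
      case '\n': strcpy(piece, "\\n");  break;
      case '\r': strcpy(piece, "\\r");  break;
      case '\t': strcpy(piece, "\\t");  break;
      default:
        if (*p < 0x20)  /* remaining control characters */
          snprintf(piece, sizeof piece, "\\u%04x", (unsigned)*p);
        else
        {
          piece[0] = (char)*p;
          piece[1] = '\0';
        }
        break;
    }

    n     = strlen(piece);
    need += n;
    if (!full && out + n < dstsize)  /* keep one byte for the NUL */
    {
      memcpy(dst + out, piece, n);
      out += n;
    }
    else
      full = 1;
  }

  if (dstsize > 0)
    dst[out] = '\0';
  return need;
}
```

Feeding it the two paragraphs above would turn each " into \" and each newline into \n, which is what the feed template needs for the content fields.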
It's not like it would be that much code to write. When I added support for RSS and Atom, I had to write code. But it irks me that I have to special case a lot of string processing.
Yes, yes, I know: mod_blog is written in C, which is a horrible choice for string processing. But even if I picked a language better suited to the task, I would still have to write code to manually transform strings from, say, ISO-8859-1 to UTF-8 and code to convert HTML to a form of non-HTML:
&lt;p&gt;"Hello," said the politician, lying.&lt;/p&gt;

&lt;p&gt;"Back up!" I said, using my left hand to quickly cover my wallet in my back pocket. "You aren't getting any money from me!"&lt;/p&gt;
(Not to get all meta, but to display the first example HTML, I had to encode it into the non-HTML you see above, and to display the non-HTML you see above, I have to encode the non-HTML into non-non-HTML, or in other words, convert the output yet again. So, to show a simple & in this page, I have to encode it as &amp;, and to show that, I have to encode it as &amp;amp;, in ever deepening layers of Inception-like encoding. By the way, that was encoded as &amp;amp;amp;, just for your information.)
I spent way too much time trying to generalize a solution, only to ultimately reject the code. I'll probably just add the code I need to support JSON Feed and call it a day, because solving the issue once and for all is just too much work.
It shouldn't be this hard to deploy a new version
I spent more time fighting git and GitHub than I did writing the code to support the JSON Feed. Yeah, the code was straightforward and I had it done rather quickly. Deploying the code was something else entirely.
So I finished the code and the new templates to generate the feed, my tests were good, and life was fine. I then committed my changes to git and that's where the first problem occurred: not all the changes were committed! The issue came down to an overbroad directive to git to ignore a certain file. It was a general “ignore all files of the given name” instead of “ignore this one file” that I forgot about.
So I tagged the release (version 5.0.0), pushed the changes to GitHub (for public consumption of the source code) and then went to update the copy of mod_blog on the server. It was there that I discovered the critical missing file (one of the templates for the new JSON Feed). Sigh.
I had been a bit too hasty in pushing the code out to GitHub, so now I was stuck with releasing version 5.0.1. Only now something got munged up with my copy of the code on the server, since it compiled the program with a version number of 5.0.0-1-gd096362 instead of a version number of 5.0.1.
A bit of background: I use git to tag versions, and in the Makefile I have the following bit of code:
VERSION := $(shell git describe --tag)
...
override CFLAGS += -DPROG_VERSION='"$(VERSION)"'
so I don't have to update the version number in code by hand (the override exists so I can specify different compiler flags and still have the version information propagated to the program; I also handle the case when git isn't available, but that comes in later in my tale of woe).
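On the C side, the macro just shows up as a string literal. The following is a minimal sketch of how a program might consume it; the functions `prog_version` and `print_version` and the "unknown" fallback are assumptions made for illustration (the fallback described in this post lives in the Makefile, not in the C source).

```c
#include <assert.h>
#include <stdio.h>

/* PROG_VERSION normally arrives from the Makefile via
 * -DPROG_VERSION='"..."'. The fallback here is an assumption of this
 * sketch so the file still compiles when the flag is missing. */
#ifndef PROG_VERSION
#  define PROG_VERSION "unknown"
#endif

/* Hypothetical accessor: the version string baked in at compile time. */
const char *prog_version(void)
{
  return PROG_VERSION;
}

/* Hypothetical helper: print "<progname> <version>" to fp.
 * Returns what fprintf returns (characters written, negative on error). */
int print_version(FILE *fp, const char *progname)
{
  assert(fp != NULL);
  assert(progname != NULL);
  return fprintf(fp, "%s %s\n", progname, prog_version());
}
```

Because the value is fixed when the compiler runs, whatever git describe reported at build time is what the binary will claim to be, right or wrong.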
Running git tag showed a tag of 5.0.1, but git describe --tag was only coming up with 5.0.0-1-gd096362.
Wat? Did I not update things properly? Was it a problem with the version of git on the server? Did I lose the signal? Was it lost in translation? Wat?
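For reference, a string like 5.0.0-1-gd096362 is how git describe reports a commit that is not itself tagged: the nearest reachable tag, the number of commits made since that tag, and the letter g followed by the abbreviated commit hash. The sketch below pulls those three parts apart; `struct describe` and `parse_describe` are hypothetical names, and the parser assumes tags that do not themselves end in something shaped like -N-gHASH.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DIGITS "0123456789"
#define HEX    "0123456789abcdef"

struct describe
{
  char tag[64];   /* nearest tag, e.g. "5.0.0" */
  long commits;   /* commits made since that tag; 0 when on the tag */
  char hash[41];  /* abbreviated commit hash; empty when on the tag */
};

/* True when the len characters starting at s are all members of set. */
static int all_in(const char *s, size_t len, const char *set)
{
  return len > 0 && strspn(s, set) >= len;
}

/* Hypothetical helper: split git describe output into its parts.
 * Returns 0 on success, -1 if the input is empty or a part is too long. */
int parse_describe(struct describe *out, const char *s)
{
  const char *end;
  const char *hash;          /* the '-' that starts "-g<hash>" */
  const char *count = NULL;  /* first digit of the commit count */
  size_t      taglen;
  size_t      hashlen = 0;

  assert(out != NULL);
  assert(s != NULL);

  end  = s + strlen(s);
  hash = strrchr(s, '-');

  if (hash != NULL && hash[1] == 'g'
      && all_in(hash + 2, (size_t)(end - (hash + 2)), HEX))
  {
    const char *q = hash;

    while (q > s && q[-1] != '-')  /* walk back to the previous '-' */
      q--;
    if (q > s + 1 && all_in(q, (size_t)(hash - q), DIGITS))
      count = q;
  }

  if (count != NULL)
  {
    taglen  = (size_t)(count - 1 - s);
    hashlen = (size_t)(end - (hash + 2));
  }
  else
    taglen = (size_t)(end - s);  /* no suffix: the whole string is the tag */

  if (taglen == 0 || taglen >= sizeof out->tag || hashlen >= sizeof out->hash)
    return -1;

  memcpy(out->tag, s, taglen);
  out->tag[taglen] = '\0';
  memcpy(out->hash, (count != NULL) ? hash + 2 : "", hashlen);
  out->hash[hashlen] = '\0';
  out->commits = (count != NULL) ? strtol(count, NULL, 10) : 0;
  return 0;
}
```

Read that way, 5.0.0-1-gd096362 says the checked-out commit d096362 sits one commit past the 5.0.0 tag, and that git describe did not find a 5.0.1 tag reachable from it.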
Some quick changes, try version 5.0.2. That worked, kind of. Now on the server the version was reporting back version 5.0.1.
Then I discovered another issue. I have code in the Makefile to handle the case when the version number isn't available through git. It's just a check to see if git describe --tag returned anything and if not, use a hard coded version specified in the Makefile.
Now, to prevent me from pushing an update to GitHub with an incorrect version number in the Makefile, I have a script that is supposed to run when I push changes to GitHub (specifically, when I push changes to a remote host). Only the last change to that script rendered it non-executable, so it wasn't running. The version on GitHub had the wrong version number specified in the Makefile, and I was still having this weird “one version back” problem on the server.
I was still having problems with version 5.0.3 when I gave up. I wanted a nice, clean, 5.0.0 release, and instead, I was on my way to version 5.0.137 the way things were going.
And all because I didn't check in a critical file because of a typo. If only I hadn't pushed the code to GitHub so quickly. If only there were a way to remove the tagged versions from GitHub, but there didn't seem to be an obvious way to me.
As I eventually found out, there was a way—from the command line on my development machine, I just had to run this blindingly obvious sequence of commands:
GenericUnixPrompt> git tag -d 12345
GenericUnixPrompt> git push origin :refs/tags/12345
Obvious. I'm surprised I didn't realize that sooner.
So I removed the tags for versions 5.0.3, 5.0.2, 5.0.1, and 5.0.0, made sure I had all the files and whatnot, and re-released version 5.0.0.
Good Lord!
So now all is right with the world, and I have a new JSON Feed file.