The Boston Diaries

The ongoing saga of a programmer who doesn't live in Boston, nor does he even like Boston, but yet named his weblog/journal “The Boston Diaries.”

Go figure.

Monday, November 13, 2017

It shouldn't be this hard to support another syndication feed format

A few days ago I came across a new syndication feed format (like RSS or Atom)—JSON Feed:

We — Manton Reece and Brent Simmons — have noticed that JSON has become the developers’ choice for APIs, and that developers will often go out of their way to avoid XML. JSON is simpler to read and write, and it’s less prone to bugs.

So we developed JSON Feed, a format similar to RSS and Atom but in JSON. It reflects the lessons learned from our years of work reading and publishing feeds.

See the spec. It’s at version 1, which may be the only version ever needed. If future versions are needed, version 1 feeds will still be valid feeds.

JSON Feed: Home

It's not like I need another syndication format, and it's still unclear just how popular JSON Feed really is, but hey, I thought, it should be pretty easy to add this. It looks simple enough:

{
    "version": "https://jsonfeed.org/version/1",
    "title": "My Example Feed",
    "home_page_url": "https://example.org/",
    "feed_url": "https://example.org/feed.json",
    "items": [
        {
            "id": "2",
            "content_text": "This is a second item.",
            "url": "https://example.org/second-item"
        },
        {
            "id": "1",
            "content_html": "<p>Hello, world!</p>",
            "url": "https://example.org/initial-post"
        }
    ]
}

I just need to add another entry to the template section of the configuration file, create a few templates files, and as they say in England, “the brother of your mother is Robert” (how they know my mother's brother is Robert, I don't know—the English are weird like that).

But the issue is filling in the content_text field. The first issue—JSON is encoded using UTF-8. For me, that's not an issue, as I'm using UTF-8 (and even before I switched to using UTF-8, I was using ASCII, which is valid UTF-8 by design). But in theory, someone could be using mod_blog with some other encoding scheme, which means an invalid JSON Feed unless fed through a character set conversion routine, which I don't support in mod_blog.

But even assuming I did, that still doesn't mean I'm out of the water.

Suppose this was my content:

<p>"Hello," said the politician, lying.</p>

<p>"Back up!" I said, using my left hand to quickly cover my wallet in my back pocket.
"You aren't getting any money from me!"</p>

If you check the syntax of JSON, you'll see that the double quote character " needs to be converted to \". A similar transformation is required for the blank line, being converted to \n. And I have no code written in mod_blog for such conversions.

It's not like it would be that much code to write. When I added support for RSS and Atom, I had to write code. But it irks me that I have to special case a lot of string processing.

Yes, yes, I know—mod_blog is written in C, which is a horrible choice for string processing. But even if I picked a better language suited to the task, I would still have to write code to manually transform strings from, say, ISO-8859-1 to UTF-8 and code to convert HTML to a form of non-HTML:

&lt;p&gt;&quot;Hello,&quot; said the politician, lying.&lt;/p&gt;

&lt;p&gt;&quot;Back up!&quot; I said, using my left hand to quickly cover my wallet in my back pocket.
&quot;You aren't getting any money from me!&quot;&lt;/p&gt;

(Not to get all meta, but to display the first example HTML, I had to encode it into the non-HTML you see above, and to display the non-HTML you see above, I have to encod the non-HTML into non-non-HTML—or in other words, convert the output yet again. So, to show a simple & in this page, I have to encode it as &amp;, and to show that, I have to encode it as &amp;amp, in ever deepening layers of Inception-like encoding. By the way, that was encoded as &amp;amp;amp;—just for your information.)

I spent way too much time trying to generalize a solution, only to ultimately reject the code. I'll probably just add the code I need to support JSON Feed and call it a day, because solving the issue once and for all is just too much work.


It shouldn't be this hard to deploy a new version

I spent more time fighting git and Github than I did in writing the code to support the JSON Feed. Yeah, the code was straightforward and I had it done rather quickly. Deploying the code was something else entirely.

So I finished the code, and the new templates to generate the feed and my tests were good and life was fine. I then committed my changes to git and that's where the first problem occured—not all the changes were committed! The issue came down to an overbroad directive to git to ignore a certain file—it was a general “ignore all files of the given name” instead of “ignore this one file” that I forgot about. So I tagged the release (version 5.0.0), pushed the changes to Github (for public consumption of the source code) and then went to update the copy of mod_blog on the server. It was there when I discovered the critical missing file (one of the templates for the new JSON Feed).

Sigh.

I had been a bit too hasty in pushing the code out to Github, so now I was stuck with releasing version 5.0.1. Only now something got munged up with my copy of the code on the server since it compiled the program with a version number of 5.0.0-1-gd096362 instead of a version number of 5.0.1.

A bit of background: I use git to tag versions, and in the Makefile I have the following bit of code:

VERSION := $(shell git describe --tag)

...

override CFLAGS  +=-DPROG_VERSION='"$(VERSION)"'

so I don't have to update the version number in code by hand (the override exists so I can specify different compiler flags and still have the version information propagated to the program; I also handle the case when git isn't available, but that comes in later in my tale of woe). Running git tag showed a tag of 5.0.1, but git describe --tag was only coming up with 5.0.0-1-gd096362.

Wat?

Did I not update things properly? Was it a problem with the version of git on the server? Did I lose the signal? Was it lost in translation? Wat?

Some quick changes, try version 5.0.2. That worked—kind of. Now on the server the version was reporting back version 5.0.1.

Then I discovered another issue. I have code in the Makefile to handle the case when the version number isn't available through git—it's just a check to see if git describe --tag returned anything and if not, use a hard coded version specified in the Makefile. Now, to prevent me from pushing an update to Github with an incorrect version number in the Makefile, I have a script that is supposed to run when I push changes to Github (specifically, when I push changes to a remote host). Only the last change to that script rendered it non-executable, so it wasn't running. The version on Github had the wrong version number specified in the Makefile, and I was still having this weird “one version back” problem on the server.

I was still having problems with version 5.0.3 when I gave up. I wanted a nice, clean, 5.0.0 release, and instead, I was on my way to version 5.0.137 the way things were going. And all because I didn't check in a critical file because of a typo. If only I hadn't pushed the code to Github so quickly. If only there were a way to remove the tagged versions from Github, but there didn't seem to be an obvious way to me.

As I eventually found out, there was a way—from the command line on my development machine, I just had to run this blindingly obvious sequence of commands:

GenericUnixPrompt> git tag -d 12345
GenericUnixPrompt> git push origin :refs/tags/12345

Obvious.

I'm surprised I didn't realize that sooner.

So I removed the tags for versions 5.0.3, 5.0.2, 5.0.1, and 5.0.0, made sure I had all the files and whatnot, and re-released version 5.0.0.

Good Lord!

So now all is right with the world, and I have a new JSON Feed file.

Obligatory Picture

[The future's so bright, I gotta wear shades]

Obligatory Contact Info

Obligatory Feeds

Obligatory Links

Obligatory Miscellaneous

You have my permission to link freely to any entry here. Go ahead, I won't bite. I promise.

The dates are the permanent links to that day's entries (or entry, if there is only one entry). The titles are the permanent links to that entry only. The format for the links are simple: Start with the base link for this site: https://boston.conman.org/, then add the date you are interested in, say 2000/08/01, so that would make the final URL:

https://boston.conman.org/2000/08/01

You can also specify the entire month by leaving off the day portion. You can even select an arbitrary portion of time.

You may also note subtle shading of the links and that's intentional: the “closer” the link is (relative to the page) the “brighter” it appears. It's an experiment in using color shading to denote the distance a link is from here. If you don't notice it, don't worry; it's not all that important.

It is assumed that every brand name, slogan, corporate name, symbol, design element, et cetera mentioned in these pages is a protected and/or trademarked entity, the sole property of its owner(s), and acknowledgement of this status is implied.

Copyright © 1999-2024 by Sean Conner. All Rights Reserved.