The Boston Diaries

The ongoing saga of a programmer who doesn't live in Boston, nor does he even like Boston, but yet named his weblog/journal “The Boston Diaries.”

Go figure.

Monday, November 13, 2017

It shouldn't be this hard to support another syndication feed format

A few days ago I came across a new syndication feed format (like RSS or Atom)—JSON Feed:

We — Manton Reece and Brent Simmons — have noticed that JSON has become the developers’ choice for APIs, and that developers will often go out of their way to avoid XML. JSON is simpler to read and write, and it’s less prone to bugs.

So we developed JSON Feed, a format similar to RSS and Atom but in JSON. It reflects the lessons learned from our years of work reading and publishing feeds.

See the spec. It’s at version 1, which may be the only version ever needed. If future versions are needed, version 1 feeds will still be valid feeds.

JSON Feed: Home

It's not like I need another syndication format, and it's still unclear just how popular JSON Feed really is, but hey, I thought, it should be pretty easy to add this. It looks simple enough:

{
    "version": "https://jsonfeed.org/version/1",
    "title": "My Example Feed",
    "home_page_url": "https://example.org/",
    "feed_url": "https://example.org/feed.json",
    "items": [
        {
            "id": "2",
            "content_text": "This is a second item.",
            "url": "https://example.org/second-item"
        },
        {
            "id": "1",
            "content_html": "<p>Hello, world!</p>",
            "url": "https://example.org/initial-post"
        }
    ]
}

I just need to add another entry to the template section of the configuration file, create a few templates files, and as they say in England, “the brother of your mother is Robert” (how they know my mother's brother is Robert, I don't know—the English are weird like that).

But the issue is filling in the content_text field. The first issue—JSON is encoded using UTF-8. For me, that's not an issue, as I'm using UTF-8 (and even before I switched to using UTF-8, I was using ASCII, which is valid UTF-8 by design). But in theory, someone could be using mod_blog with some other encoding scheme, which means an invalid JSON Feed unless fed through a character set conversion routine, which I don't support in mod_blog.

But even assuming I did, that still doesn't mean I'm out of the water.

Suppose this was my content:

<p>"Hello," said the politician, lying.</p>

<p>"Back up!" I said, using my left hand to quickly cover my wallet in my back pocket.
"You aren't getting any money from me!"</p>

If you check the syntax of JSON, you'll see that the double quote character " needs to be converted to \". A similar transformation is required for the blank line, being converted to \n. And I have no code written in mod_blog for such conversions.

It's not like it would be that much code to write. When I added support for RSS and Atom, I had to write code. But it irks me that I have to special case a lot of string processing.

Yes, yes, I know—mod_blog is written in C, which is a horrible choice for string processing. But even if I picked a better language suited to the task, I would still have to write code to manually transform strings from, say, ISO-8859-1 to UTF-8 and code to convert HTML to a form of non-HTML:

&lt;p&gt;&quot;Hello,&quot; said the politician, lying.&lt;/p&gt;

&lt;p&gt;&quot;Back up!&quot; I said, using my left hand to quickly cover my wallet in my back pocket.
&quot;You aren't getting any money from me!&quot;&lt;/p&gt;

(Not to get all meta, but to display the first example HTML, I had to encode it into the non-HTML you see above, and to display the non-HTML you see above, I have to encod the non-HTML into non-non-HTML—or in other words, convert the output yet again. So, to show a simple & in this page, I have to encode it as &amp;, and to show that, I have to encode it as &amp;amp, in ever deepening layers of Inception-like encoding. By the way, that was encoded as &amp;amp;amp;—just for your information.)

I spent way too much time trying to generalize a solution, only to ultimately reject the code. I'll probably just add the code I need to support JSON Feed and call it a day, because solving the issue once and for all is just too much work.

Obligatory Picture

[The future's so bright, I gotta wear shades]

Obligatory Contact Info

Obligatory Feeds

Obligatory Links

Obligatory Miscellaneous

You have my permission to link freely to any entry here. Go ahead, I won't bite. I promise.

The dates are the permanent links to that day's entries (or entry, if there is only one entry). The titles are the permanent links to that entry only. The format for the links are simple: Start with the base link for this site: https://boston.conman.org/, then add the date you are interested in, say 2000/08/01, so that would make the final URL:

https://boston.conman.org/2000/08/01

You can also specify the entire month by leaving off the day portion. You can even select an arbitrary portion of time.

You may also note subtle shading of the links and that's intentional: the “closer” the link is (relative to the page) the “brighter” it appears. It's an experiment in using color shading to denote the distance a link is from here. If you don't notice it, don't worry; it's not all that important.

It is assumed that every brand name, slogan, corporate name, symbol, design element, et cetera mentioned in these pages is a protected and/or trademarked entity, the sole property of its owner(s), and acknowledgement of this status is implied.

Copyright © 1999-2024 by Sean Conner. All Rights Reserved.