Thursday, September 25, 2025
This is my markup language. There are plenty of others, but this is mine
The Lobster's Blog Carnival is up, and the theme is “What have you made for yourself?” While I have plenty of programs I wrote, there's one that I specifically wrote for my own use: MOPML.
I wrote it to make writing blog entries easier for me.
For twenty years,
I was hand-crafting HTML for each entry and I finally got tired of it.
I wanted an easier way to make entries,
so I started down the path of implementing my own markup language.
Existing languages like Markdown or AsciiDoc didn't appeal to me and were a bit too generic in how they did things.
I also wanted to steal ideas from
TeX
and Org Mode,
as well as some ideas I had to support tags like <ABBR>
(which not many sites bother doing).
As I already had twenty years of entries in HTML, one design goal was not to store the entries as MOPML, but to keep them in their final HTML-rendered state. This meant that I could play around with the syntax of MOPML and not have to worry about breaking existing entries. Besides, if I have to edit a post after publication, I can edit the HTML directly; I have been doing that for years anyway. For the implementation, I chose Lua, specifically so I could use LPEG.
The TeX-inspired syntax is for items like M-dashes, where I can type three dashes like --- and get a single M-dash on output: —. The same goes for typographical quotes, where I can type ``This is quoted'' and get “This is quoted”. I even extended that so that when I type “1/2” I get “½”. It's also easy to add new entries to that particular parsing rule.
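A minimal sketch of this kind of substitution rule, written in Python with regular expressions rather than the Lua and LPEG the real code uses. The names `SYMBOLS`, `FRACTIONS`, and `typographic` are made up for illustration, and the 1/4 and 3/4 entries are assumptions added only to show how the table might grow:

```python
import re

# Hypothetical tables; only "---", the quotes, and "1/2" come from the post.
SYMBOLS = {"---": "\u2014", "``": "\u201c", "''": "\u201d"}
FRACTIONS = {"1/2": "\u00bd", "1/4": "\u00bc", "3/4": "\u00be"}

_symbols = "|".join(re.escape(key) for key in SYMBOLS)
_fractions = "|".join(re.escape(key) for key in FRACTIONS)

# Fractions only match when not glued to other digits, so "11/20" is left alone.
PATTERN = re.compile(rf"(?:{_symbols})|(?<!\d)(?:{_fractions})(?!\d)")


def typographic(text):
    """Replace each ASCII shorthand with its typographic character."""
    def replace(match):
        token = match.group(0)
        return SYMBOLS.get(token) or FRACTIONS[token]
    return PATTERN.sub(replace, text)
```

Adding a new shorthand in this sketch means adding one dictionary entry; the pattern is rebuilt from the tables.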
And while I was inspired by Org Mode for things such as tables and block quotes, I did not care for the syntax, so I changed it to suit my needs. A table is easy to generate:

    #+table This is a caption
    *header	foo	bar	baz
    **footer	foo	bar	baz
    Entry 1	3	14	15
    Entry 2	92	62	82
    Entry 3	8	-1	4
    #-table

The #+table
starts a table definition,
and is followed by an optional caption.
A header row is marked by a starting asterisk,
and a footer row is marked by two asterisks.
Each field is separated by a tab character.
The above example will produce the following table:
header | foo | bar | baz |
---|---|---|---|
Entry 1 | 3 | 14 | 15 |
Entry 2 | 92 | 62 | 82 |
Entry 3 | 8 | -1 | 4 |
footer | foo | bar | baz |
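
The rules described above for the table block (optional caption, one asterisk for a header row, two for a footer row, tab-separated fields) can be sketched roughly as follows. This is Python for illustration, not the actual Lua/LPEG code; the function name `render_table` and the exact HTML layout it emits are assumptions:

```python
from html import escape


def render_table(block):
    """Turn a '#+table ... #-table' block into an HTML table (illustrative)."""
    first, *rows, last = block.strip("\n").split("\n")
    if not first.startswith("#+table") or last.strip() != "#-table":
        raise ValueError("not a #+table block")
    caption = first[len("#+table"):].strip()

    head, foot, body = [], [], []
    for line in rows:
        if not line.strip():
            continue                      # ignore blank lines
        if line.startswith("**"):         # two asterisks: footer row
            foot.append(line[2:].split("\t"))
        elif line.startswith("*"):        # one asterisk: header row
            head.append(line[1:].split("\t"))
        else:
            body.append(line.split("\t"))

    def section(tag, section_rows, cell):
        if not section_rows:
            return []
        out = [f"<{tag}>"]
        for row in section_rows:
            cells = "".join(f"<{cell}>{escape(c)}</{cell}>" for c in row)
            out.append(f"<tr>{cells}</tr>")
        out.append(f"</{tag}>")
        return out

    html = ["<table>"]
    if caption:
        html.append(f"<caption>{escape(caption)}</caption>")
    html += section("thead", head, "th")
    html += section("tbody", body, "td")
    html += section("tfoot", foot, "th")
    html.append("</table>")
    return "\n".join(html)
```

The footer is collected wherever it appears in the source and emitted after the body, so it can sit right under the header when typing the table.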
The above sample is yet another Org Mode inspired block:

    #+source MOPML
    #+table This is a caption
    *header	foo	bar	baz
    **footer	foo	bar	baz
    Entry 1	3	14	15
    Entry 2	92	62	82
    Entry 3	8	-1	4
    #-table
    #-source

(For the record,
I did have to go in after rendering this post and fix the above example,
but I never intended to nest #+source
blocks in the first place.)
I also have a defined block for when I quote email:

    #+email
    From: John Doe <{{johndoe@example.net}}>
    To: sean@conman.org
    Subject: Re: Morbi in lorem ut lectus accumsan placerat. Morbi TLA enim id turpis
    Date: Mon, 1 Apr 2019 18:12:41 +0200

    Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec gravida justo et aliquam lobortis.
    #-email

The From:
header has another formatting quirk—the {{ and }} denote text that is to be censored in the output.
This is how I get those XXXXXXX censor bars in my posts.
The above will render the block as:
- From: John Doe <XXXXXXXXXXXXXXXXXXX>
- To: sean@conman.org
- Subject: Re: Morbi in lorem ut lectus accumsan placerat. Morbi TLA enim id turpis
- Date: Mon, 1 Apr 2019 18:12:41 +0200

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec gravida justo et aliquam lobortis.
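Here is a rough sketch of how the email block and the censoring markers could be handled. It is Python, not the real Lua/LPEG code, and it rests on several assumptions: that a blank line separates the headers from the body (as in real email), that each censored character becomes one X, and that the headers end up in a definition list. The names `censor` and `render_email` are hypothetical:

```python
import re
from html import escape

CENSOR = re.compile(r"\{\{(.*?)\}\}")


def censor(text):
    """Replace {{...}} with a run of X characters of the same length."""
    return CENSOR.sub(lambda m: "X" * len(m.group(1)), text)


def render_email(block):
    """Render a '#+email ... #-email' block as a <dl> plus a paragraph."""
    lines = block.strip("\n").split("\n")
    if lines[0].strip() != "#+email" or lines[-1].strip() != "#-email":
        raise ValueError("not a #+email block")
    inner = lines[1:-1]

    # Assumption: the first blank line ends the headers.
    split = next((i for i, l in enumerate(inner) if not l.strip()), len(inner))
    headers, body = inner[:split], inner[split + 1:]

    out = ["<dl>"]
    for header in headers:
        name, _, value = header.partition(":")   # split on the first colon only
        out.append(f"<dt>{escape(name.strip())}</dt>")
        out.append(f"<dd>{escape(censor(value.strip()))}</dd>")
    out.append("</dl>")

    text = "\n".join(body).strip()
    if text:
        out.append(f"<p>{escape(censor(text))}</p>")
    return "\n".join(out)
```

Censoring runs before HTML escaping, so the angle brackets around the address survive as `&lt;` and `&gt;` while the address itself is replaced.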
The Markdown-inspired bits are mostly with inline markup,
such as *emphasis* for emphasis,
and `code` for code.
I never did like the Markdown or Org Mode syntax for links,
so I played around with it for quite a bit until I got a syntax I like.
So when I want to link to this page,
I type {^https://www.conman.org/people/spc/about/ this page}.
It's a minimal syntax that isn't likely to appear in normal text.
And if I ever do want to include a “}” in a link,
I can always escape it like {^/ sample \}
text} to get sample } text.
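The link syntax is small enough to sketch with a single regular expression. This is an illustrative Python version, not the actual LPEG grammar; `LINK` and `render_links` are made-up names, and it assumes the URL itself contains no whitespace:

```python
import re
from html import escape

# {^URL text}, where the text may contain backslash-escaped characters like \}
LINK = re.compile(r"\{\^(\S+)\s+((?:\\.|[^}\\])*)\}")


def render_links(text):
    """Convert every {^URL text} into an HTML anchor."""
    def replace(match):
        url, label = match.group(1), match.group(2)
        label = re.sub(r"\\(.)", r"\1", label)   # drop the escaping backslashes
        return f'<a href="{escape(url, quote=True)}">{escape(label)}</a>'
    return LINK.sub(replace, text)
```

With this sketch, the two examples above come out as `<a href="https://www.conman.org/people/spc/about/">this page</a>` and `<a href="/">sample } text</a>`.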
But I think the best feature is how I handle abbreviations.
HTML contains the <ABBR>
tag to semantically mark up TLAs and what not.
I wish most web authors would do this,
as it would make an article about the MPZ easier to understand,
and most browsers on the market will show a tooltip with the TITLE
attribute if you hover over it.
I was moaning about this way back in 2003, and I finally have a method I'm happy with. All I do is include a block of abbreviations at the top of the post:

    abbr: HTML  HyperText Markup Language
          MOPML My Own Private Markup Language
          LPEG  Lua Parsing Expression Grammar
          URL   Uniform Resource Locator
          TLA   Three Letter Acronym
          MPZ   Medial Palisade Zone

The code will read this block and generate the LPEG code to recognize the acronym,
such as TLA,
and generate the appropriate HTML: <abbr title="Three Letter Acronym">TLA</abbr>,
thus giving us our TLA with semantic markup.
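The idea of building a recognizer from the abbreviation block can be sketched like this. The real code generates LPEG patterns in Lua; this Python version builds a regular expression instead. It assumes one entry per line, with the abbreviation separated from its expansion by whitespace, and it only matches whole words. The names `parse_abbr` and `mark_abbr` are hypothetical:

```python
import re
from html import escape


def parse_abbr(block):
    """Read an 'abbr:' block into a {abbreviation: expansion} table."""
    table = {}
    for line in block.splitlines():
        line = line.strip()
        if line.startswith("abbr:"):
            line = line[len("abbr:"):].strip()
        parts = line.split(None, 1)       # abbreviation, then the rest
        if len(parts) == 2:
            table[parts[0]] = parts[1]
    return table


def mark_abbr(text, table):
    """Wrap every known abbreviation in an <abbr> tag."""
    if not table:
        return text
    # Longest first, so a longer abbreviation wins over one it starts with.
    keys = sorted(table, key=len, reverse=True)
    pattern = re.compile(r"\b(?:" + "|".join(map(re.escape, keys)) + r")\b")

    def replace(match):
        short = match.group(0)
        title = escape(table[short], quote=True)
        return f'<abbr title="{title}">{short}</abbr>'

    return pattern.sub(replace, text)
```

Because the pattern is rebuilt from whatever the block declares, each post only pays for the abbreviations it actually uses.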
I even solved what I called the IRA problem back in 2003, you know, when the IRA steals the IRAs from members of the IRA; or in other words, when you have the same TLA that maps to different meanings. And I can even mention IRA GERSHWIN without fear of it becoming Initial Risk Assessment GERSHWIN. The IRA problem is solved with yet another block definition at the top of the post:

    abbr2: IRAa IRA Irish Republican Army
           IRAr IRA International Reading Association
           IRAm IRA Individual Retirement Account

So I type IRAa and the code will generate <abbr title="Irish Republican Army">IRA</abbr>.
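A sketch of the disambiguation scheme, again in Python rather than the real Lua/LPEG and with made-up names (`parse_abbr2`, `mark_abbr2`). It assumes each entry is three whitespace-separated parts: the text typed, the text shown, and the meaning. Since a bare IRA is never declared, it passes through untouched:

```python
import re
from html import escape


def parse_abbr2(block):
    """Read an 'abbr2:' block into {typed: (shown, meaning)}."""
    table = {}
    for line in block.splitlines():
        line = line.strip()
        if line.startswith("abbr2:"):
            line = line[len("abbr2:"):].strip()
        parts = line.split(None, 2)       # typed, shown, then the meaning
        if len(parts) == 3:
            typed, shown, meaning = parts
            table[typed] = (shown, meaning)
    return table


def mark_abbr2(text, table):
    """Replace each typed key with its displayed abbreviation and title."""
    if not table:
        return text
    keys = sorted(table, key=len, reverse=True)
    pattern = re.compile(r"\b(?:" + "|".join(map(re.escape, keys)) + r")\b")

    def replace(match):
        shown, meaning = table[match.group(0)]
        return f'<abbr title="{escape(meaning, quote=True)}">{escape(shown)}</abbr>'

    return pattern.sub(replace, text)
```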
I suppose I could always include definitions of common TLAs I use in the code itself, but it hasn't been that big of an issue for me to just define the TLAs I use in the post itself.
That's pretty much all I have for a markup language. Yes, it's tailored to what I write and how I want to present it. I don't expect anyone else to use this engine, as it makes sense to me but maybe not to you. And that's the point: this is for me to use. I made this for myself. And I'm lucky enough to be able to do so.