Thursday, March 09, 2006
RACTER, generative text and Lisp. Oh my!
It figures.
I make disparaging remarks about Lisp, and it turns out to possibly be the best language to use for a project I'm working on.
Sigh.
The project (or projects; there are a few things I want to work on that can use this) deals with computer-generated text (or “generative text” as it's called). Textual mashups if you will, like Crow Haiku Generator or my own Quick and Dirty B-Movie Plot Generator. There are plenty more out there if you look.
A lot of these are nothing more than arrays of strings and a slew of code to sling the pieces together. Something like:
temp = Newsflash[rand(Newsflash.length)] + "\n"
     + Location[rand(Location.length)] + " was attacked at dawn by "
     + Monsters[rand(Monsters.length)] + " "
     + From[rand(From.length)] + " "
     + Lowersheol[rand(Lowersheol.length)] + ". Our hero "
     + Hero[rand(Hero.length)] + " "
     + Fought[rand(Fought.length)] + ".";
Which would generate output like:
******* FLASH! FLASH! FLASH! *******
Disneyland was attacked at dawn by giant green worms from the lowest levels of The IRS. Our hero Tim Conway told their mothers they were naughty.
This is probably one reason why so many generative text programs don't generate all that much different text—they're too tedious to program.
I wanted an easier way to create the data and sling the words around. I wanted a “program” I could write that could then be processed, much like INRAC, the compiler used to create RACTER (which supposedly wrote The Policeman's Beard is Half Constructed). But with INRAC, to generate something like this:
Bill Gates bit Winston Churchill, but Winston Churchill just smiled. Bill Gates snarled, “Saint Winston Churchill, I presume”. “That's a Bill Gatesesque remark” replied Winston Churchill.
You write this:
*story code section
a %PEOPLE #
b >HERO*person[&P] >VILLAIN*person[&N] #
c $VILLAIN #RND3 bit robbed hit $HERO , #
d but $HERO just #RND3 smiled laughed shrugged . #
' new:
e $VILLAIN snarled >X=Saint,HERO "> $X , I presume <*. #
f "That's a !Y=VILLAIN;esque remark" replied $HERO .
Um … yeah.
Thanks, but no thanks.
Building upon my own Quick-n-Dirty B-Movie Plot Generator, I wrote two further prototypes, one in Perl and one in C; both can use the same input files (and are not restricted to just generating B-movie plots). Why a version in C? Because the one in Perl consumes an insane amount of memory and takes almost 600 times longer to generate the output. The C version can process the input files once into an intermediate format that is quicker to use, whereas the Perl version has to re-read the input files on each invocation. Even when I include the preprocessing time, the C version is still four times faster than Perl. The format I came up with is much easier to deal with:
# declare an array of text, which, when
# referenced, will return one of the lines
# at random. It ends with a semicolon alone
# on a line.

:MayYou
I pray thou shalt
I hope you will
;

:HaveBadThingsHappen
be plagued with gnats, flies and locusts
be taunted by the king's concubines
;

:OhYou
thou
O thou
O ye
;

:OfLittleFaith
incompetent tax-collector
lazy Babylonian
;

:HearThis
Listen
Hear this
Take heed
;

# to reference an array, use %a-arrayname%
#
# one of the following lines will be used

%a-MayYou% %a-HaveBadThingsHappen%, %a-OhYou% %a-OfLittleFaith%!
%a-HearThis% %a-OhYou% %a-OfLittleFaith%, for you will %a-HaveBadThingsHappen%!
(I pulled these from a webpage that generated Biblical taunts, but I've since lost the link).
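For the curious, here is a minimal sketch in Python of how a file in this format could be read and expanded. It is not the Perl or C code described above. The names `parse`, `generate` and `SAMPLE` are made up for illustration, and treating any line that starts with `#` as a comment (even inside an array) is an assumption.

```python
import random
import re

# A few lines in the format shown above, taken from the sample data.
SAMPLE = """\
:MayYou
I pray thou shalt
I hope you will
;
:HaveBadThingsHappen
be plagued with gnats, flies and locusts
be taunted by the king's concubines
;
# one of the following lines will be used
%a-MayYou% %a-HaveBadThingsHappen%!
"""

# Matches a reference such as %a-MayYou% and captures the array name.
REF = re.compile(r"%a-(\w+)%")


def parse(text):
    """Split source text into (arrays, templates).

    ':Name' opens an array, a line holding only ';' closes it, and any
    other non-blank, non-comment line outside an array is a template.
    """
    arrays, templates = {}, []
    current = None  # name of the array being filled, if any
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # assumption: comments are allowed anywhere
        if current is None:
            if line.startswith(":"):
                current = line[1:]
                arrays[current] = []
            else:
                templates.append(line)
        elif line == ";":
            current = None
        else:
            arrays[current].append(line)
    return arrays, templates


def generate(arrays, templates, rng=random):
    """Pick one template and replace each reference with a random entry."""
    template = rng.choice(templates)
    return REF.sub(lambda m: rng.choice(arrays[m.group(1)]), template)


if __name__ == "__main__":
    print(generate(*parse(SAMPLE)))
```

Parsing once and reusing the resulting dictionary is roughly the idea behind preprocessing the input, although this sketch keeps everything in memory instead of writing an intermediate file.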
So that's where I am now: I can replicate most of the generative text programs I've seen on webpages pretty quickly (about as quickly as I can copy the text out and into this format). Star Trek technobabble, fake Nostradamus quatrains, crow haikus, all easy now.
Except I'm still not at the level of INRAC. I have no way of generating variables, and article generation (“a” or “an”) is still a bit problematic. The syntax I'm currently using, while better than anything else I've seen, is still clumsy (it's a holdover from the initial Perl version). So I've been playing around with the syntax a bit, making it easier, and adding variables and functions (say, to generate the appropriate article for a word, or a random number). So far, I like what I have for the new syntax:
# assume we're using the Bible Taunts
# declare a variable TheProphet, which
# is a randomly chosen Bible name

{TheProphet=[biblenames]}

[MayYou] [HaveBadThingsHappen] [OhYou] [OfLittleFaith]!
[HearThis] [OhYou] [OfLittleFaith], for you will [HaveBadThingsHappen]!
{TheProphet} had a dream: you will [HaveBadThingsHappen]. So says {TheProphet}.
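As a rough sketch of what expanding this syntax might involve, here is a regex-based version in Python. The semantics are assumptions read off the sample: `{Name=[array]}` binds a variable to one random entry and emits nothing, `[array]` picks a fresh random entry on every reference, and `{Name}` repeats the bound value. The function names `expand` and `generate` are hypothetical, and the arrays are passed in as a plain dictionary because the sample does not show how they are declared.

```python
import random
import re

ASSIGN = re.compile(r"\{(\w+)=\[(\w+)\]\}")  # {Var=[array]}
TOKEN = re.compile(r"\[(\w+)\]|\{(\w+)\}")   # [array] or {Var}


def expand(line, arrays, variables, rng=random):
    """Expand one line, binding variables as a side effect."""

    def bind(match):
        name, array = match.groups()
        variables[name] = rng.choice(arrays[array])
        return ""  # an assignment produces no output

    def lookup(match):
        array, name = match.groups()
        if array is not None:
            return rng.choice(arrays[array])  # fresh pick each time
        return variables[name]                # same value each time

    line = ASSIGN.sub(bind, line)
    return TOKEN.sub(lookup, line).strip()


def generate(source, arrays, rng=random):
    """Bind the declared variables, then expand one random template."""
    variables, templates = {}, []
    for raw in source.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if ASSIGN.fullmatch(line):
            expand(line, arrays, variables, rng)  # declaration only
        else:
            templates.append(line)
    return expand(rng.choice(templates), arrays, variables, rng)
```

The point of the variable is that both `{TheProphet}` references in the last template come out as the same name, which plain array references cannot guarantee. Regular expressions cope with this much, but nesting or function calls would quickly outgrow them.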
But things are progressing to where I'm reaching for lex and yacc to handle the parsing and processing of the input files.
I am, in effect, writing a programming language.
Want to know something?
Any sufficiently complicated C or Fortran program contains an ad-hoc, informally-specified bug-ridden slow implementation of half of Common Lisp.
Greenspun's Tenth Rule of Programming
For all my misgivings about Lisp, I'm finding myself starting to write an ad-hoc, informally-specified bug-ridden slow implementation of Lisp.
Heh.