Sunday, November 05, 2017
How come the “easy projects” never are?
My idea for NaNoGenMo 2015 was based off an idea that A. K. Dewdney had in 1985 [Yeah, that worked out so well. —Editor] [Shut up, you! —Sean], so I thought I would return to that well and implement an idea that Douglas Hofstadter had in 1983:
You can amuse yourself by looking up the definition of a common word in the dictionary and replacing the main words in it by their definitions. I once carried this process out for “love” (defined as “A strong affection for or attachment or devotion to a person or persons”), substituting for “strong”, “affection”, “attachment”, “devotion”, and “person”, and coming up with this concoction:
A morally powerful mental state or tendency, having strength of character or will for, or affectionate regard, or loyalty, faithfulness, or deep affection to, a human being or beings, especially as distinguished from a thing or lower animal.
But not being satisfied with that, I carried the whole process one step further. This was my result:
A set of circumstances or attributes characterizing a person or thing at a given time in, with, or by the conscious or unconscious together as a unit full of or having a specific ability or capacity in a manner relating to, dealing with, or capable of making the distinction between right and wrong in conduct, or an inclination to move or act in a particular direction or way, having the state or quality of being strong in moral strength, self-discipline, or fortitude, or the act or process of volition for, or consideration, attention, or concern full of fond or tender feeling for, or the quality, state, or instance of being faithful to, those persons or ideals that one is under obligation to defend or support, or the condition, quality or state of being worthy of trust, or a strongly felt fond or tender feeling to a creature or creatures of or characteristic of a person, or persons, that lives or exists, or is assumed to do so, particularly as separated or marked off by differences from that which is conceived, spoken of, or referred to as existing as an individual entity, or from any living organism inferior in rank, dignity, or authority, typically capable of moving about but not of making its own food by photosynthesis.
Isn't it romantic? …
Metamagical Themas: Questing for the Essence of Mind and Pattern (hey, I may have gotten rid of the Amazon ads, but I still have my affiliate link)
It's a straightforward program:
- Set our corpus to a single word, “love.”
- For each word in our corpus, replace said word with its definition.
- If we haven't reached 50,000 words, repeat step 2.
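For the curious, the three steps above can be sketched in a few lines of Python. This is only an illustration and not the program I actually wrote: `TOY_DICTIONARY`, `expand_once`, `expand`, and the `max_rounds` safety cap are all names made up for the example, and it replaces every word that has an entry rather than only the "main" words Hofstadter picked by hand.

```python
# Hypothetical three-entry dictionary, standing in for a real parsed one.
TOY_DICTIONARY = {
    "love": "a strong affection",
    "strong": "morally powerful",
    "affection": "a mental state",
}


def expand_once(words, dictionary):
    """Step 2: replace each word that has an entry with the words of its definition."""
    expanded = []
    for word in words:
        definition = dictionary.get(word.lower())
        if definition is None:
            expanded.append(word)  # no entry, so keep the word as-is
        else:
            expanded.extend(definition.split())
    return expanded


def expand(seed, dictionary, target=50_000, max_rounds=20):
    """Steps 1 and 3: start from one word and repeat until the target is reached."""
    words = [seed]
    for _ in range(max_rounds):  # assumed safety cap, not part of the original plan
        if len(words) >= target:
            break
        new_words = expand_once(words, dictionary)
        if new_words == words:
            break  # nothing left to replace, so stop instead of looping forever
        words = new_words
    return words


if __name__ == "__main__":
    print(" ".join(expand("love", TOY_DICTIONARY)))
```

With a toy dictionary this small the expansion stalls after two rounds, which is why the sketch stops once a pass changes nothing. A full dictionary would presumably keep growing until it hits the word count.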
It can't be that hard, right? It should only be an hour of work, at the most, right?
Two days later …
Well, that was easy! [See? —Editor] [SHUT UP! –Sean]
So it starts out with a dictionary I downloaded from Project Gutenberg. Oh look, it's in some vague HTMLish markup language (even though the file says it's HTML, it's not HTML) so I should be able to parse what I want out of this. It can't be that much work. The format is straightforward:
otherstuff <hw> word </hw> otherstuff <def> definition </def> otherstuff
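If the file really did look like that everywhere, a naive parser would be about this much work. This is a rough Python sketch under that assumption, not my actual code: the `ENTRY` pattern and the `parse_entries` name are invented for the example, and it only handles one headword followed by one definition.

```python
import re

# Assumes every entry is exactly one <hw>...</hw> followed by one <def>...</def>.
ENTRY = re.compile(r"<hw>(.*?)</hw>.*?<def>(.*?)</def>", re.DOTALL)


def parse_entries(text):
    """Return a {headword: definition} dict, with whitespace normalized."""
    entries = {}
    for headword, definition in ENTRY.findall(text):
        # Collapse runs of whitespace (including newlines) into single spaces.
        entries[headword.strip().lower()] = " ".join(definition.split())
    return entries


if __name__ == "__main__":
    sample = "otherstuff <hw> word </hw> otherstuff <def> definition </def> otherstuff"
    print(parse_entries(sample))
```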
And I'm not two dozen words in when parsing fails. I check, and the text I'm up against is:
<hw><hw> word </hw> ... <hw> otherword </hw> ... <hw> ... <def> definition </def>
You have got to be kidding me! That is not even valid HTMLish markup! So I code, and I code and I code code code …
<mhw> ... <hw> word </hw> ... <hw> otherword </hw> ... </mhw> ... <def> definition </def>
It's not even consistently bad markup! So I code, and I code and I code code code …
<hw> word </hw> ... definition </def> <hw> word </hw> ... <hw> word </hw> ... <def> definition </def> ...
And I'm not even past “AD” in the dictionary!
I do what I should have done when I encountered the first problem and search for a better machine-readable dictionary online. And I find one. The markup is sane! And documented! A few hours later and I can parse every one of the 106,622 definitions in the dictionary!
Now I can implement my idea.
Sheesh.