The Boston Diaries

The ongoing saga of a programmer who doesn't live in Boston, nor does he even like Boston, but yet named his weblog/journal “The Boston Diaries.”

Go figure.

Thursday, June 05, 2025

Avoiding Roko's basilisk, part II

The other day I came across this comment on Lobsters:

On a personal level I have helped various people get value out of AI tools where they initially did not understand how to use it properly. But that setting is more of a 1:1 for a specific situation. For generic how to use agentic tools, there are so many articles already. Peter Steinberger has a multi hour talk online of him using an army of agents to write on his project.

If someone has a specific situation where they failed using an agent, ideally with some open source code, I would be happy to have a look at it. It’s just hard to engage on abstract “does not work for me” posts.

Comment on “AI Changes Everything”

I failed using an agent a few months ago. It was on an open source project of mine. Perhaps mitsuhiko would be happy to have a look at it. So I replied.

And mitsuhiko was happy to look at it.

Or rather, spend a few minutes telling his “coding agent” to look at the code and let it do its thing. So I took a look.

Development was done on a Mac, which doesn't have the vm86() system call, so his agent, “Claude,” started writing an 8086 emulator. Or I should say, an 80386 emulator since that's the most common architecture these days. It also came up with a few tests, and once those tests were working, it stopped.

When I tried the code, attempting to run RACTER.EXE, it just sat there, turning my computer into a space heater. Looking a bit further, I saw there was an option for debug output (but the option goes at the end of the command line, not right after the command name as with every other command on Unix). With that enabled, I saw line after line of

...
Execute: 2010:0020: 8B
Unhandled opcode at 2010:0020: 8B
Execute: 2010:0021: EC
Unhandled opcode at 2010:0021: EC
Execute: 2010:0022: 81
Unhandled opcode at 2010:0022: 81
Execute: 2010:0023: EC
Unhandled opcode at 2010:0023: EC
Execute: 2010:0024: 02
Unhandled opcode at 2010:0024: 02
Execute: 2010:0025: 00
Unhandled opcode at 2010:0025: 00
Execute: 2010:0026: 9A
Unhandled opcode at 2010:0026: 9A
Execute: 2010:0027: C2
Unhandled opcode at 2010:0027: C2
Execute: 2010:0028: 10
Unhandled opcode at 2010:0028: 10
Execute: 2010:0029: 52
Unhandled opcode at 2010:0029: 52
Execute: 2010:002A: 24
Unhandled opcode at 2010:002A: 24
Execute: 2010:002B: 9A
Unhandled opcode at 2010:002B: 9A
Execute: 2010:002C: A2
Unhandled opcode at 2010:002C: A2
Execute: 2010:002D: 19
Unhandled opcode at 2010:002D: 19
Execute: 2010:002E: 52
Unhandled opcode at 2010:002E: 52
...

To say I was underwhelmed is an understatement.
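
Read as 16-bit 8086 machine code, the bytes in that log are three ordinary instructions, but the emulator reports every byte as its own unhandled opcode and steps forward one byte at a time, so operand bytes like EC, 02 and 00 end up being treated as opcodes. The following is a minimal sketch in Python of what decoding those same bytes looks like. It is not code from the project: the names `decode`, `disassemble` and `LOG_BYTES` are made up for illustration, and it handles only the three opcodes at the start of the log, with register operands only.

```python
# Hypothetical illustration, not code from the emulator discussed above.
# It decodes only the three opcodes that begin the log (8B, 81, 9A).

REG16 = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]
GROUP1 = ["ADD", "OR", "ADC", "SBB", "AND", "SUB", "XOR", "CMP"]


def decode(code, pos):
    """Return (text, length_in_bytes) for the instruction at code[pos]."""
    op = code[pos]

    if op == 0x8B:  # MOV r16, r/m16
        modrm = code[pos + 1]
        if modrm >> 6 != 3:
            raise NotImplementedError("memory operands are not handled here")
        reg = (modrm >> 3) & 7
        rm = modrm & 7
        return f"MOV {REG16[reg]},{REG16[rm]}", 2

    if op == 0x81:  # group 1: <operation> r/m16, imm16
        modrm = code[pos + 1]
        if modrm >> 6 != 3:
            raise NotImplementedError("memory operands are not handled here")
        operation = GROUP1[(modrm >> 3) & 7]
        rm = modrm & 7
        imm = code[pos + 2] | (code[pos + 3] << 8)  # little-endian word
        return f"{operation} {REG16[rm]},{imm:04X}h", 4

    if op == 0x9A:  # CALL far, offset word followed by segment word
        offset = code[pos + 1] | (code[pos + 2] << 8)
        segment = code[pos + 3] | (code[pos + 4] << 8)
        return f"CALL {segment:04X}:{offset:04X}", 5

    raise NotImplementedError(f"opcode {op:02X} is not handled here")


def disassemble(code):
    """Walk the byte string, advancing by each instruction's full length."""
    pos = 0
    listing = []
    while pos < len(code):
        text, length = decode(code, pos)
        listing.append((pos, text))
        pos += length  # skip the operand bytes instead of re-reading them
    return listing


# The bytes logged at 2010:0020 through 2010:002A.
LOG_BYTES = bytes.fromhex("8BEC81EC02009AC2105224")

for offset, text in disassemble(LOG_BYTES):
    print(f"2010:{0x20 + offset:04X}  {text}")
```

Those eleven bytes come out as `MOV BP,SP`, `SUB SP,0002h` and `CALL 2452:10C2`, which is a typical function prologue followed by a far call. The point of the sketch is only that each instruction has a length, and the program counter has to advance by that length rather than by one.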

The thread somewhat petered out.

I noticed today that mitsuhiko gave it another attempt. He put the whole thing into Docker so he could run it under a Linux VM, and the code could now run enough of RACTER.EXE to display the banner:

[spc]lucy:/tmp/racter>/tmp/NaNoGenMo-2015/C/msdos RACTER.EXE



          .-----------------------------------------------------,
          |                                                     |
          |            A CONVERSATION WITH RACTER               |
          |                                                     |
          |       COPYRIGHTED BY INRAC CORPORATION, 1984        |
          | PORTIONS COPYRIGHTED BY MICROSOFT CORPORATION, 1982 |
          |                   ...........                       |
          `-----------------------------------------------------'




Hello, I'm Racter.  You are?  
>Sean
Sean

But that's it. It's still chugging along, turning my computer into a space heater. I'm still unimpressed.

This isn't to fault mitsuhiko. I'm sure he finds value in AI agents coding for him, but I think this was way out of his bailiwick, which is why he didn't bother to understand what I was trying to do. “Claude” got to the point of printing the banner from RACTER.EXE and stopped, because I think that's all it was instructed to do, besides attempting to buffer the input.

I'll close this out with the last few comments in the thread:

Sean
What type of programming do you do? Or rather, what type of programming do you have Claude do for you? Because I am still unconvinced it will be any benefit to the programming I do.
mitsuhiko
Right now I’m building a backend for a prototype of the next project I’m working on. That is a rather complex web application using both Python and Rust. Over the last year or so I used it quite a bit to extend minijinja (but that wasn’t agentic yet).
Sean
Ah, stuff that is definitely over-represented in the training sets. Gotcha.
mitsuhiko
Considering that I’m doing a very fringe thing I’m not so sure that this is a very accurate assessment :)
Sean
Python, Rust and web applications are over-represented in the training sets. The 6809, RACTER.EXE and ANS Forth aren’t. What you are doing might be novel, but the tech being used isn’t. The stuff I described isn’t novel (well, maybe having RACTER and Eliza chat, but I was riffing on an article written in the 80s about doing that) but using tech that (in my opinion) is novel (that is, not mainstream). There’s a difference.

I do appreciate the attempt though.

Update on Friday, June 6th, 2025 at 3:06 AM

One last comment from mitsuhiko in the thread: “I had excellent results with completely niche technology too. For as long as you have a way for the machine to validate it’s outputs it can even program in languages that you just invented.”

I think I'll have to keep this in mind for next time.

Obligatory Picture

One is never too old for sparkly Bunster glasses! Never!

Obligatory AI Disclaimer

No AI was used in the making of this site, unless otherwise noted.

You have my permission to link freely to any entry here. Go ahead, I won't bite. I promise.

The dates are the permanent links to that day's entries (or entry, if there is only one entry). The titles are the permanent links to that entry only. The format for the links is simple: start with the base link for this site, https://boston.conman.org/, then add the date you are interested in, say 2000/08/01, so that would make the final URL:

https://boston.conman.org/2000/08/01

You can also specify the entire month by leaving off the day portion. You can even select an arbitrary portion of time.

You may also note subtle shading of the links and that's intentional: the “closer” the link is (relative to the page) the “brighter” it appears. It's an experiment in using color shading to denote the distance a link is from here. If you don't notice it, don't worry; it's not all that important.

It is assumed that every brand name, slogan, corporate name, symbol, design element, et cetera mentioned in these pages is a protected and/or trademarked entity, the sole property of its owner(s), and acknowledgement of this status is implied.

Copyright © 1999-2025 by Sean Conner. All Rights Reserved.