Thursday, March 21, 2013
A good idea in theory marred by the terrible reality of practice
I get the feeling sometimes that not enough is written about failed ideas. I don't mean bad ideas, the ones that shouldn't be done, but the class of ideas that can't be done for one reason or another.
Today I had one such idea, but first, some back story.
Sometime last year, R, who runs the Ft. Lauderdale Office of The Corporation, was listening to me lament about The Protocol Stack From Hell™ and how I had this magical ability to break it by thinking bad thoughts about it (an amazing feat when you consider that the physical computers are several miles away in a data center and that any bad thoughts I had towards it had to travel over a remote command line interface).
R explained that SS7 networks are different from IP networks in that any SS7 endpoint that bounces up and down will effectively be ignored by the rest of the SS7 network (and will typically require manual intervention to re-establish a connection), and asked whether there was any way I could keep my testing program up and running.
I countered that I didn't think so, seeing how I had to test the testing program, and as such I had to stop and start the program as I found bugs in my own code while I was using it to find bugs in the code I was paid to find bugs in. R conceded the point and that was that. I would keep doing what I was doing and if the SS7 stack on the machines needed to be restarted because I borked The Protocol Stack From Hell™ yet again, so be it.
Then today, I read about the reliability of the Tandem computer (link via programming is terrible).
Hi, is this Support? We have a problem with our Tandem: A car bomb exploded outside the bank, and the machine has fallen over … No, no it hasn't crashed, it's still running, just on its side. We were wondering if we can move it without breaking it.
Apocryphal story about a Tandem computer
[One other apocryphal story about the Tandem. About fifteen years ago I worked at a company that had a Tandem computer. It was said that one day a cooling fan for the Tandem computer just showed up at the receptionist's desk with no explanation. When she called Tandem about the apparent mistaken delivery, they said that the Tandem computer had noticed its cooling fan was marginal and had ordered a replacement fan.]
I had an idea.
I can't say exactly what triggered the idea—it just hit me.
The idea was to write a very small and very simple program that established an SS7 endpoint, a “master control program” if you will. It would also listen on a named pipe for commands. One command would start the testing program, passing it the SS7 endpoint to use; another would stop the testing program. The SS7 endpoint that is created is a Unix file descriptor. (A file descriptor is an integer value used to refer to an open file under Unix. More importantly, we have the source code to The Protocol Stack From Hell™, so the fact that the SS7 endpoint is a file descriptor is something I can verify.) Open file descriptors are inherited by child processes. Closing a file descriptor in a child process does not close it in the parent process, so the test program can crash and burn, but because the SS7 endpoint is still open in the “master control program” it's still “up” to the rest of the SS7 network.
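The file descriptor behavior the idea leans on can be sketched in a few lines. This is only an illustration under stated assumptions: an ordinary pipe stands in for the SS7 endpoint (the real endpoint comes from the vendor library, which is not shown here), and the function name is made up for the example.

```python
import os

def endpoint_survives_child_crash() -> bool:
    """Show that a descriptor stays open in the parent after a child dies.

    Assumption: a plain pipe stands in for the SS7 endpoint.
    """
    read_fd, write_fd = os.pipe()

    pid = os.fork()
    if pid == 0:
        # Child (the "testing program"): it inherited write_fd from the parent.
        try:
            os.write(write_fd, b"child")
            os.close(write_fd)   # closes only the child's copy
        finally:
            os._exit(1)          # simulate the test program crashing

    # Parent (the "master control program"): wait for the child to die.
    _, status = os.waitpid(pid, 0)

    # The parent's copy is still open; this write would raise OSError otherwise.
    os.write(write_fd, b"+parent")
    data = os.read(read_fd, 64)

    os.close(read_fd)
    os.close(write_fd)
    return data == b"child+parent" and os.WEXITSTATUS(status) == 1
```

Calling `endpoint_survives_child_crash()` on a Unix system should return `True`: the child's close and abrupt exit leave the parent's descriptor usable.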
It's a nice idea.
It won't work.
That's because the user library we use to establish an SS7 endpoint keeps static data keyed on the file descriptor, and there's no way to establish this static data given an existing file descriptor. (And no, it doesn't use the integer value as an index into an array, which would be quick. Oh no, it does a linear search for said private data, multiple times. I really need a triple facepalm picture for this.)
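A toy model of the problem, to make the failure concrete. Everything here is hypothetical: the class, its method names, and the record layout are invented for illustration and are not the real library's API. The only details taken from the description above are that the private data is keyed by file descriptor, found by linear search, and created only when the library itself opens the endpoint.

```python
class StackLibraryModel:
    """Hypothetical stand-in for one process's copy of the library's static data."""

    def __init__(self):
        # List of (fd, private_state) records; NOT indexed by fd.
        self._records = []

    def open_endpoint(self, fd):
        # The only code path that ever creates a record.
        self._records.append((fd, {"state": "up"}))
        return fd

    def _find(self, fd):
        # Linear search for the private data, as described above.
        for known_fd, state in self._records:
            if known_fd == fd:
                return state
        return None

    def send(self, fd, payload):
        if self._find(fd) is None:
            raise LookupError(f"no private data for descriptor {fd}")
        return len(payload)


# The "master control program" opens the endpoint, so its copy has a record.
master = StackLibraryModel()
fd = master.open_endpoint(7)   # 7 is an arbitrary descriptor number

# A freshly started testing program has its own, empty static data.
tester = StackLibraryModel()
```

In this model `master.send(fd, b"ping")` works, while `tester.send(fd, b"ping")` raises `LookupError`: the descriptor number is valid in both processes, but only one of them has the private data that goes with it, and there is no call to adopt an existing descriptor.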
Sigh.