The Boston Diaries

Thursday, July 01, 2021

I finally get the regression test working, just in time to rewrite the entire thing

I finally got the validation logic working! And it only took me … um … a bit longer than I expected.

I just took it slow and steady. First assumption—the code was correct, the regression test wasn't. Second, start with the one known condition required to pass the test. Check the failures, and add another condition. Repeat as needed, spending sometimes a bit too much time trying to figure out the common cause of “failure” (as reported by the incorrect regression test). It came down to, depending upon how you count, six or eighteen conditions (there are a base five, plus any one of three other conditions) for when Project: Lumbergh sends a message to Project: Cleese. Some might say the logic is too complicated. I might even agree. But then, I'm not the one defining the business logic here.

Now I can continue with adding the new tests, and it's here that I might have to rethink how the regression test works. Maybe a secondary regression test that tests some of the non-business logic stuff, like making sure we handle the database replies out of order (A returns, then B, or B returns, then A). I don't know, it's something I have to think about.

Friday, July 02, 2021

A small heads up to my D&D group

Some background: I've been running the “every-other-week” D&D game for two years now, and my players are suspicious of everything I toss out.

So I have to wonder what it says about me that most of the ideas from Probably Bad RPG Ideas (link via Kirk Israel) actually sound fun. You know, like “the big bad of your campaign—the quest-giving innkeeper from the starting tavern.” Or “rolling a natural 1 on any skill check counts as a natural 20 on an opposite skill check (i.e. a 1 on Persuasion counts as a 20 on intimidation, etc.).”

That can be fun.

And funnily enough, this idea: “a D&D item that gives a really high bonus to a choosen stat, but it also makes you brutally honest about absolutely everything,” is something I already do (as my group knows all too well). And I was a player in a game where “Disney is secretly run by a True Fae because the two elements of Arcadia are stories and contracts so by using complex contracts to get control of all stories it can become the most powerful being in Faerie” (and as a side note—isn't that true already?).

And perhaps I shouldn't mention this one because my players already know about the tarrasque (using the D&D spelling here) beneath the city but (if you are a player in my game—skip this bit) it's too good not to contemplate: “make the tarrasque closer to its mythic origins by saying if hit with a ranged touch attack by holy water, it's now your best friend and fetches you slippers and rolls over so you can pet its belly.”

Monday, July 05, 2021

A reimplementation of a web idea in Gemini

Way back in the day, “ping servers” were popular and I added support for them to mod_blog. A few years later, the one I was using (the most well-known one; I think it was weblogs.com, but it's been over fifteen years) stopped working, and I removed the support from mod_blog because I didn't feel it was worth keeping around a feature I didn't use.

Then about a week or so ago, a similar service (Antenna) started up for Gemini-based blogs, and given that I serve up my blog via Gemini, I figured I could support that feature again. But as I started, I had an idea: instead of hardcoding support for Antenna in mod_blog, why not support the idea of a “hook” (like git)? It would be easier to write a script to update Antenna (since I already support TLS and Gemini URLs in Lua) than to try to do that all in C.

It wasn't that hard. One function to execute a script and some judicious calls to said function and I have a way to notify external services of new updates.
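A hook of this sort really is small. The sketch below is in Python rather than mod_blog's C, and `run_hook`, its arguments, and the 30-second timeout are all invented here for illustration; none of it is mod_blog's actual interface. It assumes a hook is optional, so a missing or non-executable script is skipped rather than treated as an error.

```python
import os
import subprocess


def run_hook(script, *args, timeout=30):
    """Run an optional hook script; return True only if it ran and exited 0."""
    if not script or not os.access(script, os.X_OK):
        return False  # no hook configured, or not executable: skip it
    try:
        result = subprocess.run([script, *args], timeout=timeout, check=False)
    except (OSError, subprocess.TimeoutExpired):
        return False  # the hook could not be started, or it ran too long
    return result.returncode == 0
```

The blog engine would then call something like `run_hook(configured_path, entry_url)` at the points where an external service should hear about a change. Which arguments a real hook receives is a design decision this sketch does not try to guess.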

I wish I had thought of this concept back then. It would have saved me the trouble of adding, and later, removing, code to support various notification sites as they came and went.

Ah well … live and learn.

It's still a sane library for decoding DNS packets, but now it's a sane library for encoding as well

It's software update day here at Chez Boca. In addition to an updated mod_blog, there's also an updated SPCDNS—version 2.0.0!

It only took about ten years (and about four years after it was first requested) but the library now supports full encoding of 30 DNS records. The reason I didn't include such support initially is that the encoding of DNS domain names in the records is not straightforward. It uses a scheme to compress the data such that it's easy to parse, but not so easy to encode without allocating memory, which is something I wanted to avoid. One result is that there is a hardcoded limit to the number of domain names that can be encoded, but just how many domain names is … hard to say due to the compression scheme. And because of this, the encoding routine does use a bit more memory than it previously did.

That, and an API change with the Lua wrapper (returning an integer error number instead of a string error message) is why I decided to make this a 2.0.0 release. Users at the C level shouldn't see any change though.
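To show why encoding is the awkward direction, here is a minimal sketch of DNS name compression as defined in RFC 1035, written in Python rather than the library's C. The function name `encode_name` and the `offsets` dictionary are assumptions made for this example and are not SPCDNS's API. The dictionary grows on demand, which is exactly the allocation the library avoids; swapping it for a fixed-size table gives the kind of hardcoded limit described above.

```python
def encode_name(name, packet, offsets):
    """Append a domain name to `packet` using RFC 1035 name compression.

    packet  -- bytearray holding the DNS message built so far
    offsets -- dict mapping each already-written name suffix to its offset
    """
    labels = [label for label in name.split(".") if label]
    for i in range(len(labels)):
        suffix = ".".join(labels[i:]).lower()
        if suffix in offsets:
            # Seen before: emit a 2-byte pointer (top two bits set) and stop.
            packet.extend((0xC000 | offsets[suffix]).to_bytes(2, "big"))
            return
        if len(packet) < 0x4000:
            # A pointer holds only 14 bits, so later offsets can't be reused.
            offsets[suffix] = len(packet)
        raw = labels[i].encode("ascii")
        if len(raw) > 63:
            raise ValueError("DNS labels are limited to 63 bytes")
        packet.append(len(raw))
        packet.extend(raw)
    packet.append(0)  # the root label ends a name that had no pointer


packet = bytearray(12)  # stand-in for the 12-byte DNS header
offsets = {}
encode_name("example.com", packet, offsets)      # written out in full
encode_name("www.example.com", packet, offsets)  # "www" plus a pointer
```

Each name can add several suffix entries to the table or none at all, depending on what was encoded before it, which is why "how many names fit" has no simple answer once the table has a fixed size.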

Tuesday, July 06, 2021

Thanks, Facebook!

Even though I can automatically notify Antenna, I still manually update FaceMeLinkedInstaMySpaceBookWeInGram by hand. Yes, it's annoying. Yes, I hate it. Yes, most people don't bother to check the website, or subscribe to the ATOM feed (or the RSS feed or even the JSON feed) but they sure as XXXX use LinkedInstaMyFaceMeWeInGramSpaceBook.

No, I'm not bitter.

Anyway, as I was updating the link on Facebook, I just now received a notification of a message from 2016!

Um … yeah.

There's a reason why I still use email.

Wednesday, July 07, 2021

To unit test or not to unit test, that is the question

There are two parts of your code. Code that can be unit tested and code that can't be unit tested.

Code that can't be unit tested is simple. Any code that has to touch IO can't be unit tested. Period. Any code that doesn't touch IO can be unit tested. It's that simple.

…

Keep ALL IO segregated from the rest of your code. Keep IO functions and methods super small. Do not inject IO polluted objects into other parts of your code.

Nothing should ever be mocked. Period. If you don't agree you likely don't under... | Hacker News

In hindsight, this is obvious. The issue I have with unit testing projects like “Project: Lumbergh,” “Project: Sippy-Cup” or “Project: Cleese” is that they're nearly all IO with very little logic (with the caveat that “Project: Lumbergh” implements all the business logic interspersed with IO).

With that said, even though the author also stated not to mock at all, I do. We have a few network services that “Project: Lumbergh” relies upon, and I have written my own versions of those services that basically respond with a canned answer for a particular query. It was easy since the services speak a common protocol (DNS in this case) and I don't have to implement the logic of those services to determine the answer.

I also decided to implement a new regression test and keep the one I have working as a separate test. This way, it'll be easier to implement tests like “data source B replies, data source A times out.” I'm also keeping each test in its own file, so adding new ones should be way easier and hopefully, we won't end up with another 16,000 tests.

Discussions about this entry

Re: To unit test or not to unit test, that is the question [ 2021-07-08 ]

The search engine for text-heavy web sites

The Marginalia Search Engine (link via kontakt) is a fresh approach to search engines. Instead of PageRank, it uses a different method that probably does a better job than Google:

As a consequence, the closer to plain text a website is, the higher it'll score. The more markup it has in relation to its text, the lower it will score. Each script tag is punished. One script tag will still give the page a relatively high score, given all else is premium quality; but once you start having multiple script tags, you'll very quickly find yourself at the bottom of the search results.

Modern web sites have a lot of script tags. The web page of Rolling Stone Magazine has over a hundred script tags in its HTML code. Its quality rating is of the order 10^-51%.

Marginalia Search - Notes on Designing a Search Engine

The more markup, the lower the score. Javascript and the score falls through the floor. Neat.

And from the few tests I ran, it seems to be a pretty decent search engine for what I'd use it for.

Tuesday, July 20, 2021

There are reasons for operators

One particular area of software complexity is the degree to which unclear code can have unnoticed side effects, an effect that in his most recent blog post, Drew DeVault coins "Spooky code at a distance". Of the two example he gives its the first, of operator overloading, that I think is of greater interest and raises the question about what even the point of operators is and whether they make the code unnecessarily ambiguous.

…

All this is to say, do we really need them? Would it not be better to avoid mutation and to be explicit. If you want to add two numbers then call an add function. If you want to concatenate strings then call a concatenation function. If you want to define your own conditional subroutines, I guess use a lazy language, but then name them. I think its [sic] important to be explicit about the intent of code even if the behaviour is the same as it is easier for others to read and understand and will ultimately lead to fewer bugs. That seems to make sense, right?

Are Operators Even Necessary?

There's a reason why languages have operators—it makes the code easier to reason about. For instance, is this bit of code:

/* Why yes, the "=" is an operator as well */
fassign(&xn1,fmul(fmul(fadd(fmul(A,yn),B),xn),fsub(1.0,xn)))
fassign(&yn1,fmul(fmul(fadd(fmul(C,xn),D),yn),fsub(1.9,yn)))

or this?

xn1 = ((A * yn) + B) * xn * (1.0 - xn);
yn1 = ((C * xn) + D) * yn * (1.0 - yn);

(also, spot the bug!)

Drew comes down in the “no-operator overloading” camp because he finds it harder to debug potential issues, performance or otherwise. But that's a trade-off he's willing to make; others might make the trade-off the other way. For instance, I found it much easier to implement the Soundex system in LPeg, which uses (or maybe abuses?) Lua's operator overloading capability. I traded development time (well, minus the time it took me to get used to LPeg) for a bit of runtime and potential difficulty in debugging. There is something to be said for conciseness of expression. I can't find a source for this, but I have heard that the number of lines a programmer can write per day is constant, regardless of language. The above equation is two lines of C (or really, any higher level language) but in assembly?

	movss   xmm0,[yn]
	mulss   xmm0,[A]
	addss   xmm0,[B]
	mulss   xmm0,[xn]
	movss   xmm1,[const1]
	subss   xmm1,[xn]
	mulss   xmm0,xmm1
	movss   [xn1],xmm0

	movss   xmm0,[xn]
	mulss   xmm0,[C]
	addss   xmm0,[D]
	mulss   xmm0,[yn]
	movss   xmm1,[const1]
	subss   xmm1,[yn]
	mulss   xmm0,xmm1
	movss   [yn1],xmm0

Eight times the lines of code, which seems about right for assembly—not hard, just tedious. I would think if Robert wants to use functions over operators, it's easy enough to just do it. I do wonder though, how long until he goes back to using operators?
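For anyone who hasn't seen the LPeg style of operator overloading mentioned above, here is a toy version of the idea in Python. This is not LPeg's API; `Pattern` and `lit` are made up for this sketch. It only borrows LPeg's convention that `*` means "followed by" and `+` means "or else."

```python
class Pattern:
    """A matcher wrapped so that operators can combine patterns."""

    def __init__(self, matcher):
        self.matcher = matcher  # (text, pos) -> new position, or None

    def __mul__(self, other):
        # Sequence: this pattern, then `other` from where this one stopped.
        def seq(text, pos):
            mid = self.matcher(text, pos)
            return None if mid is None else other.matcher(text, mid)
        return Pattern(seq)

    def __add__(self, other):
        # Ordered choice: try this pattern first, fall back to `other`.
        def choice(text, pos):
            end = self.matcher(text, pos)
            return end if end is not None else other.matcher(text, pos)
        return Pattern(choice)

    def match(self, text):
        return self.matcher(text, 0)


def lit(s):
    """Match the literal string `s`."""
    return Pattern(lambda text, pos: pos + len(s) if text.startswith(s, pos) else None)


greeting = (lit("hello") + lit("hi")) * lit(" world")
```

The last line is the concise form. Spelled out with named functions it would read something like `sequence(choice(lit("hello"), lit("hi")), lit(" world"))`, which is the same trade-off as the `fmul(fadd(...))` example, just in a different domain.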

A most persistent spam

I'm wondering what's up with this spam?

From
Aleksandr <info@s9.mirengo.ru>

To
info@conman.org

Subject
Предложение

Date
Tue, 20 Jul 2021 22:39:01 +0000

Здравствуйте. Вас интересует продвижение сайта? Напишите мне в Skype пожалуйста WhatsApp/Viber владельца сайта и передайте мое предложение. Мой Skype для быстрой связи: mayboroda_aleks Преимущества при сотрудничестве со мной:

…

Я предоставляю отчетность, какие и где были куплены ссылки на Ваш сайт, какие есть ошибки на сайте, которые нужно исправить, увидеть реальный рост позиций сайта и многое другое. С уважением, Aleksandr, SEO специалист в области продвижения сайтов. Мой Skype для быстрой связи: mayboroda_aleks

I mean, yes, it's spam. It's in Russian. And I've been receiving this exact email for well over a year now. Usually two or three times a week. I tried rejecting it via the IP address (via my greylisting daemon) but to no avail—the source IP changes too often. I'd reject the domain, but that changes constantly too. The name? Always “Aleksandr.” The subject line? Always “Предложение.” Yes, I could just automatically delete the email upon receiving it, but I'd much rather reject it as soon as possible rather than accepting it to then just delete it.

But all that aside, I can't figure out the angle here. Is Aleksandr really that hard up for work that he'll spam anybody for work? (yes, I ran the email through Google Translate) Even people who don't speak Russian? Or is this a smear campaign against Aleksandr? Trying to drive him (her?) out of business? I can't tell.

There's no virus in the email. And even though there's an HTML version of the message included, there are no malicious links, as there are no links.

So I'm at a complete loss as to the game going on here.

Update on Wednesday, Debtember 22nd, 2021

Wisdom of the Ancients—I did find a solution, but it might not be one you can use.

Friday, July 23, 2021

I'm surprised the Secret Service did not shut them down immediately

I recently read in The Transylvania Times (the paper of record for Brevard) that past issues have been digitized up through early 1975 and are available online. Way cool! So I picked an arbitrary issue to look through, January 9th, 1975, and learned a bit of the history of the bed and breakfast we stay in, and saw some ads for movies I've never heard of, but it was page 14 that left me speechless. It was an ad for printing jobs done by The Transylvania Times:

We Print Anything But Money—

But Our Press Has Even Proved Itself Capable Of Doing That!

The Transylvania Times. (Brevard N.C.) 1931-current, January 09, 1975, SECTION A, Page 14, Image 14 · North Carolina Newspapers

It left me speechless, but only because I was laughing so hard.

Tuesday, July 27, 2021

The meetings will continue until morale improves

There is a strong cultural difference between how we (and by “we” I mean the team I work on) used to handle testing and how we handle testing now. Today's conversation du jour involved the checking of SIP headers and whether we could simply compare the returned SIP message with a “golden copy.” I countered that there are some fields, like the Call-ID: header, that change per call. I was then asked about the custom headers we produce. The one header specifically talked about looks something like:

P-Foo-Custom: e=0; foo=this; bar=that; andthis=nothing

“The subfields,” I said, “don't have a set order. They can appear in any order.”

“Why?”

“Because of an implementation detail of Lua—that particular field is populated from a Lua hash table, and Lua doesn't guarantee any ordering on the hash table.”

“But … but … couldn't you add an option to keep an order?”

“The order shouldn't matter! Any client should be able to handle those subfields in any order.”

“But … but … couldn't you add code to maintain an order?”

“That would be yet more code to lock down what should be an implementation detail! And I do parse those headers when checking.”

“But … but … you could just compare against a ‘golden copy!’”

“I already parse and check the subfields!”

“Oh.”
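The "parse and check" approach from that exchange can be sketched in a few lines. This is Python, not the Lua the real test is written in, and `parse_subfields` is a name invented here. It assumes the subfields are simple `key=value` pairs separated by semicolons, as in the sample header above, and does not handle quoting or the rest of SIP's header grammar.

```python
def parse_subfields(header_line):
    """Split 'Name: k1=v1; k2=v2' into (name, {k1: v1, k2: v2})."""
    name, _, value = header_line.partition(":")
    fields = {}
    for part in value.split(";"):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing or doubled semicolon
        key, _, val = part.partition("=")
        fields[key.strip()] = val.strip()
    return name.strip(), fields


expected = parse_subfields("P-Foo-Custom: e=0; foo=this; bar=that; andthis=nothing")
received = parse_subfields("P-Foo-Custom: bar=that; andthis=nothing; e=0; foo=this")
```

Because dictionaries compare by content, `expected == received` holds here even though the two raw header strings differ, which is the whole argument against a byte-for-byte "golden copy."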

I also get the feeling that the tests are assumed to be 100% functionally correct and any deviation MUST be a problem in the code being tested. The notion that the test COULD BE WRONG just doesn't come up. We went down a deep rabbit hole today where the issue turned out to be a misconfiguration in “Project: Sippy-Cup,” only it took over an hour to resolve. Again, the test was incorrect, not the code (and the original regression test, which had the misconfiguration, passed every test; again, from what I understand reading up on this, you aren't supposed to test the tests, right?).

Then there was the “That test failed!”

“Oh, that's just me playing with the new regression test framework—it's not a valid test.”

“But it failed! We must investigate.”

I have to adjust to the fact that I have a new job.

Thursday, July 29, 2021

I wonder how the unit test cultists would deal with the testing I do

I thought that as long as I'm going to such lengths to get “push-button testing” implemented, I might as well mention some of the techniques I've used just in the off chance that it might help someone out there. The techniques I use are probably only relevant to the stuff I work on and they may not apply elsewhere, but it certainly can't hurt to mention them.

So I don't have unit tests (whatever they are) per se, but I do have what is referred to as a “regression test,” which tests “Project: Sippy-Cup,” “Project: Lumbergh” and “Project: Cleese.” The reason is that taken individually, each of those projects can be considered a “unit,” but to, say, test “Project: Lumbergh” alone would require something to act like “Project: Sippy-Cup” (which feeds requests into “Project: Lumbergh”) and “Project: Cleese” (which is notified by “Project: Lumbergh” in some circumstances), so why not run those as well? “Project: Lumbergh” also talks to two different DNS servers for various information about a phone number, so when running it, I need something to respond back. I also need an endpoint for “Project: Cleese” to talk to, so what's one more process? Oh, “Project: Lumbergh” will also talk to cell phones, or at least expect a cell phone to request data in some circumstances, so I have a “simulated cell phone” running as well.

Each test case is now a separate file, which describes how to set up the data for the test (the two phone numbers, what names, what feature we're testing, etc) as well as what the expected results are (we get a name, or the reputation, or a different phone number, depending upon what's being tested). This way, we can have the regression test run one test, some of the tests, or “all the things!” The regression test will read in all the test cases and generate all the data required to run them. It then will start the seven programs with configurations generated on the fly, and start feeding SIP messages into the maelstrom, recording what goes on and checking the results as each test runs. And when a test fails, the test case information is recorded in an output file for later analysis.

So far, nothing out of the ordinary. That's pretty much how the previous regression test worked, except it generated all 15,852 test cases. But it's how I test some of the weirder border cases that I want to talk about.

First up: ensuring something that's not supposed to happen didn't happen. In some circumstances, “Project: Lumbergh” will notify “Project: Cleese,” and I have to make sure it happens when it's supposed to, and not when it's not supposed to. I've already mentioned part of my solution to this, but the updated version of that is: the regression test has a side channel to the fake endpoint that “Project: Cleese” talks to. Before each test, the regression test will send that component the test data and whether it should expect a request or not. The fake endpoint will simply record this data for later use. If a request is made for that particular test case, it will also be noted for later. Once the regression test has finished running all the tests and waited a few seconds for any last-second requests to clear, it “runs” one more test: it queries the fake endpoint for a count of requests it actually received and compares it to the number the regression test thinks should have happened (and outputs success or failure). Upon termination of the regression test (as everything is being shut down), the fake endpoint will then go through its list of tests it received from the regression test, and record any discrepancies (a query that was supposed to happen didn't, or a query that wasn't supposed to happen, did). This is recorded in another file for later analysis (which is why I send over all the data to the fake endpoint, just to make it easier to see the conditions of the test in one place).
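The bookkeeping side of that fake endpoint might look roughly like the following. It is a sketch in Python with no networking at all; the class name `NotificationLedger` and its methods are invented for illustration and are not taken from the real test harness.

```python
class NotificationLedger:
    """What a fake endpoint was told to expect versus what actually arrived."""

    def __init__(self):
        self.expected = {}  # test id -> True if a request should arrive
        self.received = {}  # test id -> number of requests that did arrive

    def expect(self, test_id, should_notify):
        """Side channel: the regression test announces a test before running it."""
        self.expected[test_id] = should_notify
        self.received.setdefault(test_id, 0)

    def record_request(self, test_id):
        """Called when the component under test actually sends a request."""
        self.received[test_id] = self.received.get(test_id, 0) + 1

    def total_received(self):
        """The count the regression test asks for in its final 'test'."""
        return sum(self.received.values())

    def discrepancies(self):
        """Per-test mismatches, for the analysis file written at shutdown."""
        problems = []
        for test_id, should_notify in self.expected.items():
            got = self.received.get(test_id, 0)
            if should_notify and got == 0:
                problems.append((test_id, "expected a request, none arrived"))
            elif not should_notify and got > 0:
                problems.append((test_id, "unexpected request arrived"))
        return problems
```

The per-test list earns its keep because the overall count alone can be fooled: one missing request and one unexpected request cancel out in the total, yet both show up in `discrepancies()`.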

Second: “Project: Lumbergh” talking to multiple DNS servers. It will generally send out both requests at once, given the rather demanding timing constraints on us, so we have to support reply A coming before reply B, reply B coming before reply A, reply A timing out but getting reply B, and reply B timing out but getting reply A. How to test for those nightmare scenarios automatically? Oh, “Project: Lumbergh” also maintains a continuous “heartbeat” to these services, and if those replies don't get through, the servers will be taken out of rotation by “Project: Lumbergh” and once the last one is gone, “Project: Lumbergh” effectively shuts down. The nightmare just got worse.

Well, again, I have written my own fake endpoints for these services (not terribly hard as the data is fixed, and it's not like I'm going for speed here). And again, I added a side channel for the regression test to communicate to the fake endpoints. After starting up these fake endpoints, the regression test informs the endpoints what entry is considered the “heartbeat” so no delay whatsoever will ever be applied to that query. Then before any test is run, the regression test will inform the endpoints how long to delay the response, all the way from “no delay” to “don't even respond” (except for the “heartbeat,” which will always happen), as it's part of the testing data.
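Stripped of sockets and the DNS protocol, the delay logic comes down to something like this Python sketch. `FakeDnsEndpoint`, `replies_seen`, and the `NO_REPLY` sentinel are names made up for the example. The real endpoints answer actual DNS queries, while this only models which replies the caller would see and in what order.

```python
NO_REPLY = None  # sentinel delay meaning "never answer this query"


class FakeDnsEndpoint:
    def __init__(self, name, heartbeat_query):
        self.name = name
        self.heartbeat_query = heartbeat_query
        self.delay = 0.0

    def set_delay(self, delay):
        """Side channel: seconds to wait before replying, or NO_REPLY."""
        self.delay = delay

    def reply_delay(self, query):
        if query == self.heartbeat_query:
            return 0.0  # the heartbeat is never delayed or dropped
        return self.delay


def replies_seen(endpoints, query, timeout):
    """Names of endpoints whose reply lands before `timeout`, earliest first."""
    arrivals = []
    for endpoint in endpoints:
        delay = endpoint.reply_delay(query)
        if delay is not NO_REPLY and delay < timeout:
            arrivals.append((delay, endpoint.name))
    return [name for _, name in sorted(arrivals)]
```

Setting A's delay above B's gives the "B before A" case, setting either one to `NO_REPLY` gives the matching timeout case, and the heartbeat query still gets through in every configuration.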

Yes, it all works. Yes, it's a pain to write. Yes, it's a bunch of code to test. No, I don't have XXXXXXX unit tests or regression tests for the regression test—I'm only willing to go so far.

I just hope we don't have to implement 100% test code coverage, because I'm not looking forward to forcing system calls to fail.

“Would love to hear about your prior development method. Did adopting the new practices have any upsides?”

[The following is a comment I made on Lobsters when asked about our development methods. I think it's good enough to save, and what better place to save it than this here blog. So here it is.]

First off, our stuff is a collection of components that work together. There are two front-end pieces (one for SS7 traffic, one for SIP traffic) that then talk to the back-end (that implements the business logic). The back-end makes parallel DNS queries [1] to get the required information, mucks with the data according to the business logic, then returns data to the front-ends to ultimately return the information back to the Oligarchic Cell Phone Companies. Since this process happens as a call is being placed, we are on the Oligarchic Cell Phone Companies' network, and we have some pretty short time constraints. And due to this, not only do we have some pretty severe SLAs, but any updates have to be approved 10 business days before deployment by said Oligarchic Cell Phone Companies. As a result, we might get four deployments per year [2].

And the components are written in a combination of C89, C++98 [3], C99, and Lua [4].

So, now that you have some background, our development process. We do trunk based development (all work done on one branch, for the most part). We do NOT have continuous deployment (as noted above). When working, we developers (which never numbered more than three) would do local testing, either with the regression test, or another tool that allows us to target a particular data configuration (based off the regression test, which starts eight programs, five of which are just needed for the components being tested). Why not test just the business logic? Said logic is spread throughout the back-end process, intermixed with all the I/O it does (it needs data from multiple sources, queried at the same time).

Anyway, code is written, committed (main line), tested, fixed, committed (main line), repeat, until we feel it's good. And the “tested” part not only includes us developers, but also QA at the same time. Once it's deemed working (using both regression testing and manual testing), we then officially pass it over to QA, who walks it down the line from the QA servers, staging servers and finally (once we get permission from the Oligarchic Cell Phone Companies) into production, where not only devops is involved, but QA and the developer whose code is being installed (at 2:00 am Eastern, Tuesday, Wednesday or Thursday, never Monday or Friday).

Due to the nature of what we are dealing with, testing at all is damn near impossible (or rather, hideously expensive, because getting actual cell phone traffic through the lab environment involves, well, being a phone company (which we aren't), very expensive and hard to get equipment, and a very expensive and hard to get laboratory setup (that will meet FCC regulations, blah blah yada yada)) so we do the best we can. We can inject messages as if they were coming from cell phones, but it's still not a real cell phone, so there is testing done during deployment into production.

It's been a 10 year process, and it has gotten better until this past December.

Now it's all Agile, scrum, stories, milestones, sprints, and unit testing über alles! As I told my new manager, why bother with a two week sprint when the Oligarchic Cell Phone Companies have a two year sprint? It's not like we ever did continuous deployment. Could more testing be done automatically? I'm sure, but there are aspects that are very difficult to test automatically [5]. Also, more branch development. I wouldn't mind this so much, except we're using SVN (for reasons that are mostly historical at this point) and branching is … um … not as easy as in git [6]. And the new developer sent me diffs to ensure his work passes the tests. When I asked him why he didn't check the new code in, he said he was told by the new manager not to, as it could “break the build.” But we've broken the build before this; all we do is just fix code and check it in [8]. But no, no “breaking the build,” even though we don't do continuous integration, nor continuous deployment, and what deployment process we do have locks the build number from Jenkins of what does get pushed (or considered “gold”).

Is there any upside to the new regime? Well, I have rewritten the regression test (for the third time now) to include such features as “delay this response” and “did we not send a notification to this process.” I should note that this is code for us, not for our customer, which, need I remind people, is the Oligarchic Cell Phone Companies. If anyone is interested, I have spent June and July blogging about this (among other things).

[1] Looking up NAPTR records to convert phone numbers to names, and another set to return the “reputation” of the phone number.
[2] It took us five years to get one SIP header changed slightly by the Oligarchic Cell Phone Companies to add a bit more context to the call. Five years. Continuous deployment? What's that?
[3] The original development happened in 2010, and the only developer at the time was a) very conservative, b) didn't believe in unit tests. The code is not written in a way to make it easy to unit test, at least, as how I understand unit testing.
[4] A prototype I wrote to get my head around parsing SIP messages that got deployed to production without my knowing it by a previous manager who was convinced the company would go out of business if it wasn't. This was six years ago. We're still in business, and I don't think we're going out of business any time soon.
[5] As I mentioned, we have multiple outstanding requests to various data sources, and other components that are notified on a “fire and forget” mechanism (UDP, but it's all on the same segment) that the new regime want to ensure gets notified correctly. Think about that for a second, how do you prove a negative? That is, something that wasn't supposed to happen (like a component not getting notified) didn't happen?
[6] I think we're the only department left using SVN; the rest of the company has switched to git. Why are we still on SVN? 1) Because the Solaris [7] build servers aren't configured to pull from git yet and 2) the only redeeming feature of SVN is the ability to checkout a subdirectory, which given the layout of our repository, and how devops want the build servers configured, is used extensively. I did look into using git submodules, but man, what a mess. It totally doesn't work for us.
[7] Oh, did I neglect to mention we're still using Solaris because of SLAs? Because we are.
[8] Usually, it's Jenkins that breaks the build, not the code we checked in. Sometimes, the Jenkins checkout fails. Devops has to fix the build server [7] and try the call again.
