The Boston Diaries

The ongoing saga of a programmer who doesn't live in Boston, nor does he even like Boston, but yet named his weblog/journal “The Boston Diaries.”

Go figure.

Friday, June 19, 2015

No product survives first contact with production

Many months ago, my manager S asked about a “health check service” for “Project: Sippy-Cup.” Something that operations could query to see if my component was still up and running. I rejected the idea of embedding a web server in the component as being complete overkill (and really, any embedded web server would swamp the amount of code that actually does the useful work in my component, which just processes one SIP message).

So I did the simplest thing that could possibly work: a simple UDP service. It accepts a packet with the string “STATUS” and replies with “OKAY.” It was only a few lines of code, and with netcat I figured it would be a simple matter for operations to do a health check.
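For illustration only, a service of this shape could be sketched in Python roughly as follows. This is not the actual “Project: Sippy-Cup” code: the function names `serve_udp_status` and `check_udp`, the 64-byte buffer, the two-second timeout, and the tolerance for a trailing newline (which netcat tends to add) are all assumptions made for the sketch.

```python
import socket


def serve_udp_status(sock: socket.socket, max_requests: int) -> None:
    """Handle up to max_requests datagrams on an already-bound UDP socket.

    A datagram containing "STATUS" gets "OKAY" back; anything else is ignored.
    """
    for _ in range(max_requests):
        data, peer = sock.recvfrom(64)       # 64 bytes is an arbitrary limit
        if data.strip() == b"STATUS":        # strip() tolerates a trailing newline
            sock.sendto(b"OKAY", peer)


def check_udp(host: str, port: int, timeout: float = 2.0) -> bool:
    """Send one "STATUS" datagram and wait for the reply before giving up."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        s.sendto(b"STATUS", (host, port))
        try:
            reply, _ = s.recvfrom(64)
        except OSError:                      # timeout, or port unreachable
            return False
        return reply.strip() == b"OKAY"
```

The client side sends a single request and then waits for the answer, which is about all a health check against such a service needs to do.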

It seems that UDP is too confusing for operations to deal with, so I changed the underlying protocol to TCP. It's a bit more complicated to support as I now have to listen for and accept connections, but then it should be even easier for operations to handle it with netcat. The protocol still accepts a string of “STATUS” and returns “OKAY”.
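As a rough picture of what the TCP variant involves, here is a minimal Python sketch. Again, it is not the real component: the names `serve_tcp_status` and `check_tcp`, the timeouts, the newline appended to each message, and the assumption that the whole request arrives in a single `recv()` are all choices made for the example.

```python
import socket


def serve_tcp_status(listener: socket.socket, max_connections: int) -> None:
    """Accept up to max_connections clients on a listening TCP socket.

    Each client sends "STATUS" and receives "OKAY"; the connection is then closed.
    """
    for _ in range(max_connections):
        conn, _peer = listener.accept()
        with conn:
            conn.settimeout(2.0)             # don't let a silent client stall us
            try:
                data = conn.recv(64)         # assumes the request fits one read
            except OSError:
                continue
            if data.strip() == b"STATUS":
                conn.sendall(b"OKAY\n")


def check_tcp(host: str, port: int, timeout: float = 2.0) -> bool:
    """Connect, send "STATUS", and wait for the reply before giving up."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as s:
            s.sendall(b"STATUS\n")
            return s.recv(64).strip() == b"OKAY"
    except OSError:                          # refused, reset, or timed out
        return False
```

The extra work compared with UDP is the accept step and the per-connection cleanup; the request and reply themselves are unchanged.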

And it's still apparently too much for operations to deal with. Operations actually asked if they could send a SIP message, and I was like, Wow! If it's easier for you guys to send a SIP message for a health check, more power to you! But my manager nixed that idea and we stuck with the current TCP version, which he feels is the simplest thing that could work.

I'm not sure what operations is actually doing. My manager mentioned that my component was failing the health check, yet when I checked, it was fine (using netcat of course). Yet the logs were filled with errors (“recvfrom: Bad file number” and “poll: Invalid argument”), probably from all the failed attempts by operations to do a health check.

I did ask operations what is sent and how often. What they're sending is right, but they're asking “Are­you­still­up?­Why­haven't­you­answered­me?­Are­yo­up?­Are­you­up?­McFly!­McFly!­Answer­me!” before my component has a chance to even answer. I think they're a bit too aggressive. They don't.

Sigh.

Obligatory Picture

[An abstract representation of where you're coming from]

Obligatory Miscellaneous

Obligatory AI Disclaimer

No AI was used in the making of this site, unless otherwise noted.

You have my permission to link freely to any entry here. Go ahead, I won't bite. I promise.

The dates are the permanent links to that day's entries (or entry, if there is only one entry). The titles are the permanent links to that entry only. The format for the links is simple: Start with the base link for this site: https://boston.conman.org/, then add the date you are interested in, say 2000/08/01, so that would make the final URL:

https://boston.conman.org/2000/08/01

You can also specify the entire month by leaving off the day portion. You can even select an arbitrary portion of time.

You may also note subtle shading of the links and that's intentional: the “closer” the link is (relative to the page) the “brighter” it appears. It's an experiment in using color shading to denote the distance a link is from here. If you don't notice it, don't worry; it's not all that important.

It is assumed that every brand name, slogan, corporate name, symbol, design element, et cetera mentioned in these pages is a protected and/or trademarked entity, the sole property of its owner(s), and acknowledgement of this status is implied.

Copyright © 1999-2024 by Sean Conner. All Rights Reserved.