The Boston Diaries

The ongoing saga of a programmer who doesn't live in Boston, nor does he even like Boston, but yet named his weblog/journal “The Boston Diaries.”

Go figure.

Thursday, April 02, 2020

To block the bad guys, it helps to correctly specify all the addresses

Back when I had some server issues I took the time to have the hosting company modify the main firewall to allow all ssh traffic to my server instead of from a fixed set of IP addresses. There had been some times in the recent past (like when the DSL connection goes down and I can't log into the server) where that would have been a Good Thing™. The change went through, and as long as I have an ssh key (no passwords allowed) I can log in from anywhere.

Now, I run my own syslog daemon and one of its features is the ability to scan logs in real time and do things based on what it sees, like blocking IP addresses on failed ssh attempts. I do this on my home system and have currently blocked over 2,300 IP addresses (over the past 30 days—after said time the blocks are removed to keep the firewall from “filling up” so to speak). I enabled this feature on my server about a week ago and … it didn't work.
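
The 30-day expiry can be sketched as follows. This is only an illustration of the idea, not the code from the syslog daemon; `struct block_entry`, `BLOCK_LIFETIME` and `expire_blocks()` are hypothetical names, and the step that removes the matching firewall rule is left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#define BLOCK_LIFETIME (30L * 24 * 60 * 60)   /* 30 days, in seconds */

struct block_entry        /* hypothetical record: one blocked IPv4 address */
{
  uint32_t addr;          /* address in host byte order */
  time_t   when;          /* time the block was added   */
};

/* Drop entries that are BLOCK_LIFETIME seconds old or older and return
   the new count.  Surviving entries keep their relative order. */
size_t expire_blocks(struct block_entry *list, size_t count, time_t now)
{
  size_t kept = 0;

  for (size_t i = 0; i < count; i++)
  {
    if (difftime(now, list[i].when) < BLOCK_LIFETIME)
      list[kept++] = list[i];
    /* a real daemon would also delete the firewall rule for expired entries */
  }
  return kept;
}
```

Calling `expire_blocks(list, count, time(NULL))` from a periodic timer would keep the list (and, with the missing step filled in, the firewall) from growing without bound.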

I could see entries being added to the firewall, but the attempts from some “blocked” IP addresses kept happening. It took me some time, but I spotted the problem—I was blocking 0.0.0.0 instead of 0.0.0.0/0. The former says “match the exact IP address of 0.0.0.0” (which is not a valid IP address on the Internet) while the latter says “match all IP addresses.”
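
The difference between the two forms comes down to the prefix length: a bare address behaves like a /32. A minimal sketch of prefix matching in C, assuming IPv4 addresses held as 32-bit integers in host byte order (`prefix_mask()` and `cidr_match()` are illustrative names, not part of any firewall API):

```c
#include <stdbool.h>
#include <stdint.h>

/* Build a netmask from a prefix length (0..32).  A prefix of 0 is handled
   separately because shifting a 32-bit value by 32 bits is undefined in C. */
static uint32_t prefix_mask(unsigned prefix)
{
  return prefix == 0 ? 0 : UINT32_MAX << (32 - prefix);
}

/* True if addr falls inside net/prefix. */
bool cidr_match(uint32_t addr, uint32_t net, unsigned prefix)
{
  uint32_t mask = prefix_mask(prefix);
  return (addr & mask) == (net & mask);
}
```

With this, `cidr_match(addr, 0, 32)` is true only when `addr` is 0.0.0.0 itself, while `cidr_match(addr, 0, 0)` is true for every address, since a zero mask discards all the bits being compared.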

Sigh.

Once spotted, it was an easy fix. Then I noticed that the failed log message differed a bit between my home system and the server, so I had to fix the parser a bit to account for the differences. Hopefully, that should be it.

Saturday, April 04, 2020

I don't quite understand this attack

Blocking ssh login attempts is working, but I have noticed another odd thing—the large number of TCP connections in the SYN_RECV state. This is indicative of a SYN flood, but what's weird is that it's not from any one source, but scores of sources. And it's not enough to actually bring down my server.
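
One way to put a number on this under Linux is to count the entries in /proc/net/tcp whose state field is 03, the value the kernel uses for SYN_RECV. The sketch below assumes the usual IPv4 layout of that file (slot, local address, remote address, state); `tcp_line_state()` and `count_syn_recv()` are hypothetical helper names.

```c
#include <stdio.h>

#define TCP_STATE_SYN_RECV 0x03   /* value Linux reports in /proc/net/tcp */

/* Extract the state field from one data line of /proc/net/tcp.
   Returns -1 for lines that do not parse (such as the header line). */
int tcp_line_state(const char *line)
{
  unsigned state;

  if (sscanf(line,
             " %*d: %*[0-9A-Fa-f]:%*[0-9A-Fa-f] %*[0-9A-Fa-f]:%*[0-9A-Fa-f] %x",
             &state) != 1)
    return -1;
  return (int)state;
}

/* Count the connections in SYN_RECV in an already opened table. */
int count_syn_recv(FILE *fp)
{
  char line[512];
  int  count = 0;

  while (fgets(line, sizeof line, fp) != NULL)
    if (tcp_line_state(line) == TCP_STATE_SYN_RECV)
      count++;
  return count;
}
```

Opening the file with `fopen("/proc/net/tcp", "r")` and passing it to `count_syn_recv()` every few seconds gives a rough view of how the number of half-open connections changes over time.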

I spent a few hours playing “whack-a-mole” with the attacks, blocking large address spaces from connecting to my server, only to have the attack die down for about five minutes then kick back up from a score of different blocks. The only thing in common is that all the blocks seem to be from Europe.

And this is what I don't understand about this attack. It's not large enough to bring down my server (although I have SYN cookies enabled and that might be keeping this at bay) and it's from all over European IP space. I don't get who's getting attacked here. It could easily be spoofed packets being sent, but what's the goal here?

It's all very weird.


I'd put this off, but I'm trying to procrastinate my procrastination

I tend towards procrastination. I hate it, yet I do it. Or don't do it … depending on how you want to look at things. I don't think I can verbalize why I do it, but this video on doing that one thing pretty much sums it up, I think. Only my one thing isn't the one thing in the video.

Anyway, to stop this habit, I might have to try The 10 Minute Rule, where you give a task 10 minutes a day. Over time, it'll get done.

Perhaps I'll start tomorrow.

Sunday, April 05, 2020

Is this attack a case of “why not?”

My friend Mark wrote back about the SYN attack to mention that he's also seeing the same attack on his servers. It's not enough to bring anything down, but it's enough to be an annoyance. He's also concerned that it might be a bit of a “dry run” for something larger.

A bit later he sent along a link to the paper “TCP SYN Cookie Vulnerability” which describes a possible motive for the attack:

TCP SYN Cookies were implemented to mitigate against DoS attacks. It ensured that the server did not have to store any information for half-open connections. A SYN cookie contains all information required by the server to know the request is valid. However, the usage of these cookies introduces a vulnerability that allows an attacker to guess the initial sequence number and use that to spoof a connection or plant false logs.

TCP SYN Cookie Vulnerability

The “spoofing of a connection” is amusing, as I don't have any private files worth downloading and spoofing a connection to an email server just nets me what? More spam? I already deal with spam as it is. And the same for the logs—I just don't have anything that requires legally auditable logs. I guess it's similar for most spam—it pretty much costs the same if you attempt 10 servers or 10,000,000 servers, so why not? And like Mark says, I hope this isn't a precursor of something larger.

And chasing down the references in the paper is quite the rabbit hole.

Tuesday, April 07, 2020

Some musings about some spooky actions from Google and Facebook

Periodically, I will go through my Gmail account just to see what has accumulated since the last time I checked. Usually it's just responding to emails saying stuff like “no, this is not the Sean Conner that lives in Indiana” or “I regret to inform you that I will not be attending your wedding” or even “I've changed my mind about buying the Jaguar and no, I'm not sorry about wasting your time.” But today I received an email from Google titled “Your March Search performance for boston.conman.org” and I'm like What?

I check, and yes indeed, it's a search performance report for my blog from Google. I don't use Google Analytics so I was left wondering just how Google associated my Gmail account to my website. I know Google does a lot of tracking even sans Google Analytics, but they must have really stepped up their game in recent times to get around that lack.

But no, it appears that some time ago I must have set up my Google Search Console and I forgot about it. Fortunately, that moves the whole issue from “pants staining scary” to just “very spooky.” Poking around the site, I was amused to find that the three most popular pages of my blog are:

Even more amusing is the search query that leads to the top result—“⅚ cup equals how many cups”. What? … I … the answer is right there! I can't even fathom the thought process that even thought of that question.

Wow.

And speaking of “spooky web-based spying” I just realized that Facebook is adding a fbclid parameter to outgoing links. I noticed this the other day, and yes, it even shows up in my logs. I would have written about that, but it seems Facebook started doing this over a year and a half ago, so I'm very late to the game. But it still leaves one question unanswered—would such an action drag otherwise innocent web sites into GDPR non-compliance? It does appear to be a unique identifier and Facebook is spamming it all across webservers. Or does Facebook somehow know a European website from a non-European website and avoid sending the fbclid to European websites? I'm just wondering …

Wednesday, April 22, 2020

The trouble of finding a small memory leak

The last time I mentioned GLV-1.12556 it was in reference to a bug that prevented large files from being transferred. I neglected to mention that I fixed the bug back in November where I was improperly checking a return code. Code fixed, issue no more.

But a problem I am seeing now is the ever growing memory usage of the server. I've written other servers that don't exhibit this issue so it's not Lua per se. I use valgrind to check and it does appear to be LibreSSL, but the output from valgrind isn't entirely helpful, as you can see from this snippet:

==27306== 96 bytes in 8 blocks are indirectly lost in loss record 8 of 21
==27306==    at 0x4004405: malloc (vg_replace_malloc.c:149)
==27306==    by 0x429E7FD: ???
==27306==    by 0x429E918: ???
==27306==    by 0x429F00A: ???
==27306==    by 0x435BF54: ???
==27306==    by 0x422D548: ???
==27306==    by 0x4236B14: ???
==27306==    by 0x420FD9C: ???
==27306==    by 0x421021B: ???
==27306==    by 0x420D3D0: ???
==27306==    by 0xD0808A: pthread_once (in /lib/tls/libpthread-2.3.4.so)
==27306==    by 0x420671D: ???

Some functions are decoded by their address, but not all. It doesn't help that LibreSSL is loaded dynamically so the addresses change from run to run. I want a stacktrace of each call to malloc() (and related functions) but I'd rather not have to modify the code just to get this information. Fortunately, I run Linux, and on Linux, I can take advantage of LD_PRELOAD and insert my own hacked versions of malloc() (and co.) to record the backtraces without having to relink everything. The simplest thing that could work is just to print a message with the backtrace, and so that's what I did. Given this simple program:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
  void *p = realloc(NULL,50);
  void *q = calloc(1,100);
  void *r = malloc(150);
  void *s = realloc(p,200);
  
  free(q);
  free(r);
  exit(0);
}

I can now get the following output:

! (nil) 0x96dd008 50    ./y [0x8048464] /lib/tls/libc.so.6(__libc_start_main+0xd3) [0xba4e93]   ./y [0x80483b5]
+ 0x96dd3c8 100 ./y [0x8048476] /lib/tls/libc.so.6(__libc_start_main+0xd3) [0xba4e93]   ./y [0x80483b5]
+ 0x96dd430 150 ./y [0x8048489] /lib/tls/libc.so.6(__libc_start_main+0xd3) [0xba4e93]   ./y [0x80483b5]
! 0x96dd008 0x96dd4d0 200       ./y [0x804849f] /lib/tls/libc.so.6(__libc_start_main+0xd3) [0xba4e93]   ./y [0x80483b5]
- 0x96dd3c8
- 0x96dd430
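
For reference, a preloaded wrapper in this spirit can be sketched as below. This is not the code that produced the output above: it is a minimal, single-threaded illustration that only wraps malloc() and free() (calloc() and realloc() are left alone), it assumes glibc for `dlsym(RTLD_NEXT, ...)` and `backtrace()`, and its output only approximates the format shown.

```c
#define _GNU_SOURCE            /* for RTLD_NEXT */
#include <dlfcn.h>
#include <execinfo.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

static void *(*real_malloc)(size_t);
static void  (*real_free)(void *);
static int     in_hook;   /* nonzero while logging, so nested calls are not logged */

/* Look up the next definitions of malloc() and free() (normally libc's). */
static void resolve(void)
{
  real_malloc = (void *(*)(size_t))dlsym(RTLD_NEXT, "malloc");
  real_free   = (void (*)(void *))dlsym(RTLD_NEXT, "free");
  if (real_malloc == NULL || real_free == NULL)
    abort();
}

/* Print the current call stack on one line, tab separated. */
static void log_stack(void)
{
  void  *frames[64];
  int    n     = backtrace(frames, 64);
  char **names = backtrace_symbols(frames, n);   /* allocates; not logged */

  for (int i = 0; names != NULL && i < n; i++)
    dprintf(STDERR_FILENO, "\t%s", names[i]);
  dprintf(STDERR_FILENO, "\n");
  real_free(names);
}

void *malloc(size_t size)
{
  if (real_malloc == NULL)
    resolve();

  void *p = real_malloc(size);
  if (!in_hook)
  {
    in_hook = 1;
    dprintf(STDERR_FILENO, "+ %p %zu", p, size);
    log_stack();
    in_hook = 0;
  }
  return p;
}

void free(void *p)
{
  if (real_free == NULL)
    resolve();

  if (!in_hook && p != NULL)
  {
    in_hook = 1;
    dprintf(STDERR_FILENO, "- %p\n", p);
    in_hook = 0;
  }
  real_free(p);
}
```

Built as a shared object (for example `gcc -shared -fPIC -o trace.so trace.c -ldl`, where `trace.c` and `trace.so` are placeholder names) and loaded with `LD_PRELOAD=./trace.so ./y`, the wrapper writes its records to stderr. The `in_hook` flag matters because `backtrace()` and `backtrace_symbols()` may themselves allocate memory.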

Allocations from malloc() and calloc() are signified by a “+” sign (followed by the address, size and callstack); allocations from realloc() are signified by a “!” sign (followed by the previous and new address, new size and callstack); calls to free() are signified by a “-” sign (which just contains the address—I don't care about the callstack for this call). Some post processing of this output can flag allocations that don't have a corresponding free call:

0x96dd4d0       200
        ./y [0x804849f]
        /lib/tls/libc.so.6(__libc_start_main+0xd3) [0xba4e93]
        ./y [0x80483b5]

Total memory    200
Total records   1
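
The post-processing step can be sketched in C as well. The following is an assumption about how such a pass might work, not the script actually used: it keeps a table of live allocations, adds to it on “+” and “!” lines, removes from it on “-” lines, and whatever is left at the end was never freed. The fixed-size table and the names `struct tracker`, `track_line()` and `track_total()` are made up for the example, and call stacks are ignored.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_LIVE 4096          /* arbitrary limit for this sketch */

struct live    { uintptr_t addr; size_t size; };
struct tracker { struct live list[MAX_LIVE]; size_t count; };

static void track_add(struct tracker *t, uintptr_t addr, size_t size)
{
  if (addr != 0 && t->count < MAX_LIVE)
    t->list[t->count++] = (struct live){ addr, size };
}

static void track_remove(struct tracker *t, uintptr_t addr)
{
  for (size_t i = 0; i < t->count; i++)
    if (t->list[i].addr == addr)
    {
      t->list[i] = t->list[--t->count];   /* order is not preserved */
      return;
    }
}

/* Feed one line of the trace to the tracker; unknown lines are ignored. */
void track_line(struct tracker *t, const char *line)
{
  char   a[32], b[32];
  size_t size;

  if (sscanf(line, "+ %31s %zu", a, &size) == 2)
    track_add(t, (uintptr_t)strtoull(a, NULL, 16), size);
  else if (sscanf(line, "! %31s %31s %zu", a, b, &size) == 3)
  {
    uintptr_t old_addr = (uintptr_t)strtoull(a, NULL, 16);   /* "(nil)" parses as 0 */
    uintptr_t new_addr = (uintptr_t)strtoull(b, NULL, 16);

    if (new_addr != 0)          /* a failed realloc() leaves the old block alone */
    {
      track_remove(t, old_addr);
      track_add(t, new_addr, size);
    }
  }
  else if (sscanf(line, "- %31s", a) == 1)
    track_remove(t, (uintptr_t)strtoull(a, NULL, 16));
}

/* Total bytes still allocated according to the trace. */
size_t track_total(const struct tracker *t)
{
  size_t total = 0;

  for (size_t i = 0; i < t->count; i++)
    total += t->list[i].size;
  return total;
}
```

Feeding it the six trace lines from the sample program leaves a single record, the 200-byte block at 0x96dd4d0, which agrees with the totals shown above. A linear search is fine for a few thousand records; a hash table keyed on the address would be the obvious next step for longer traces.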

It's not perfect, but it gives a bit more information than valgrind does, as we can see from what I think is the same call as the above valgrind example showed:

0x98861f0	12
	/home/spc/JAIL/lib/libcrypto.so.45(lh_insert+0xea) [0x380156]
	/home/spc/JAIL/lib/libcrypto.so.45(OBJ_NAME_add+0x70) [0x3854c0]
	/home/spc/JAIL/lib/libcrypto.so.45(EVP_add_cipher+0x2d) [0x371889]
	/home/spc/JAIL/lib/libcrypto.so.45 [0x366d3f]
	/lib/tls/libpthread.so.0(__pthread_once+0x8b) [0xd0808b]
	/home/spc/JAIL/lib/libcrypto.so.45 [0x2ff307]
	/lib/tls/libpthread.so.0(__pthread_once+0x8b) [0xd0808b]
	/home/spc/JAIL/lib/libssl.so.47(OPENSSL_init_ssl+0x4b) [0x148ebb]
	/home/spc/JAIL/lib/libtls.so.19 [0xfa63ba]
	/lib/tls/libpthread.so.0(__pthread_once+0x8b) [0xd0808b]
	/usr/local/lib/lua/5.3/org/conman/tls.so(luaopen_org_conman_tls+0x18) [0x21871e]
	lua [0x804ef6a]
	lua [0x804f264]
	lua [0x804f2be]
	lua(lua_callk+0x37) [0x804d0eb]
	lua [0x8068deb]
	lua [0x804ef6a]
	lua [0x8058ab5]
	lua [0x804f27d]
	lua [0x804f2be]
	lua(lua_callk+0x37) [0x804d0eb]
	lua [0x8068deb]
	lua [0x804ef6a]
	lua [0x8058ab5]
	lua [0x804f27d]
	lua [0x804f2be]
	lua [0x804d146]
	lua [0x804e8ac]
	lua [0x804f6ec]
	lua(lua_pcallk+0x60) [0x804d1a8]
	lua [0x804b0e4]
	lua [0x804baba]
	lua [0x804ef6a]
	lua [0x804f264]
	lua [0x804f2be]
	lua [0x804d146]
	lua [0x804e8ac]
	lua [0x804f6ec]
	lua(lua_pcallk+0x60) [0x804d1a8]
	lua(main+0x55) [0x804bb91]
	/lib/tls/libc.so.6(__libc_start_main+0xd3) [0xba4e93]
	lua [0x804ae99]

I can see that this particular piece of leaked memory was allocated by tls_init() (by tracking down what the call at address luaopen_org_conman_tls+0x18 corresponds to). But this leads to another issue with tracking down these leaks—I don't care about allocations during initialization of the program. Yes, it's technically a memory leak, but it happens once during program initialization. It's the memory loss that happens as the program runs that is a larger concern to me.

So yes, there's some 40K or so lost at program startup. What's worse is that it's 40K over some 2,188 allocations! I did see a further leak when I made several requests back to back—about 120 bytes over 8 more allocations, and it's those that have me worried—a slow leak. And given that the addresses of the heap and dynamically loaded functions change from run to run, it makes it very difficult to filter out those 2,188 allocations from initialization to find the 8 other allocations that are leaking. It would be easier to track down if I could LD_PRELOAD the modified malloc() et al. into the process after initialization, but alas, that is way more work (let's see—I need to write a program to stop the running process, inject the modified malloc() et al. into mapped but otherwise unused executable memory, then patch the malloc() et al. vectors to point to the new code, and resume the program; then possibly reverse said changes when you no longer want to record the calls—doable but a lot of work) just to track down a bug in code that isn't even mine.

Sigh.

Update on Thursday, April 23rd, 2020

I think I may have found the leak.

Thursday, April 23, 2020

Of course talking about a bug means it's easier to find and fix the bug. Odd how that happens

Of course, after I point the finger to LibreSSL for the memory leak, I find the leak … in my own code.

Sigh.

Not knowing what else to do, I thought I would go through my TLS Lua module to make sure I didn't miss anything. That's when I noticed that I was keeping a reference to a connection so that I can deal with the callbacks from libtls. I was expecting the __gc() method to clean things up, but with a (non-weak) reference, that was never going to happen.

Yes, just because you are using a garbage collected language doesn't mean you can't still have memory leaks.

I verified that, yes indeed, the references were being kept around after the request was finished. It was then straightforward to fix the issue.

That's not to say that libtls still isn't leaking memory—it is, but (it seems) only when you initialize it (which means it's not as bad). But I'll know in a day or two if I fixed the leak. I hope that was it.


Gopher selectors are OPAQUE people! OPAQUE!

Despite warning that gopher selectors are opaque identifiers and even setting up a type of redirect for gopher there are still gopher clients making requests with a leading “/”. This is most annoying with the automatic clients collecting phlog feeds. I expected that after six months people would notice, but nooooooooooooooo!

Sigh.

So I decided to make the selector /phlog.gopher valid, but to serve up a modified feed with a note about the opaque nature of gopher selectors. Yes, it's passive aggressive, but there's not much I can do about people not getting the memo. Maybe this will work …
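
In code, “opaque” means the server compares the selector byte for byte and never normalizes it, so supporting the slash variant is just a second table entry. A small sketch, assuming the real feed lives at the selector `phlog.gopher` (a guess based on the selector above) and using made-up resource names; none of this is taken from the actual gopher server:

```c
#include <stddef.h>
#include <string.h>

struct selector_entry      /* hypothetical table entry */
{
  const char *selector;    /* matched byte for byte, never normalized */
  const char *resource;    /* what the server would send back         */
};

static const struct selector_entry selectors[] =
{
  { "phlog.gopher",  "feed"           },   /* assumed name of the real feed    */
  { "/phlog.gopher", "feed-with-note" },   /* the variant clients keep sending */
};

/* Look up the selector at the start of a request line.  The selector ends
   at a TAB (which introduces a search string) or at the end of the line. */
const char *lookup_selector(const char *request)
{
  size_t len = strcspn(request, "\t\r\n");

  for (size_t i = 0; i < sizeof selectors / sizeof selectors[0]; i++)
    if (strlen(selectors[i].selector) == len
        && memcmp(selectors[i].selector, request, len) == 0)
      return selectors[i].resource;
  return NULL;
}
```

Because nothing is stripped or rewritten, a request for `//phlog.gopher` or `phlog.gopher/` still fails; only the two exact strings in the table are served.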

Thursday, April 30, 2020

That was the leak—now it crashes!

A small recap to the memory leak from last week. Yes, that was the leak and the server in question has been steady in memory usage since. In addition to that, I fixed an issue with a custom module in my gopher server where I neglected to close a file. It wasn't a leak per se, as the files would be closed eventually but why keep them open due to some sloppy programming?

But now a new problem has cropped up—the formerly leaking program is now crashing. Hard! It's getting an invalid pointer, so off to track that issue down …

Sigh.

Obligatory Miscellaneous

You have my permission to link freely to any entry here. Go ahead, I won't bite. I promise.

The dates are the permanent links to that day's entries (or entry, if there is only one entry). The titles are the permanent links to that entry only. The format for the links is simple: Start with the base link for this site: https://boston.conman.org/, then add the date you are interested in, say 2000/08/01, so that would make the final URL:

https://boston.conman.org/2000/08/01

You can also specify the entire month by leaving off the day portion. You can even select an arbitrary portion of time.

You may also note subtle shading of the links and that's intentional: the “closer” the link is (relative to the page) the “brighter” it appears. It's an experiment in using color shading to denote the distance a link is from here. If you don't notice it, don't worry; it's not all that important.

It is assumed that every brand name, slogan, corporate name, symbol, design element, et cetera mentioned in these pages is a protected and/or trademarked entity, the sole property of its owner(s), and acknowledgement of this status is implied.

Copyright © 1999-2024 by Sean Conner. All Rights Reserved.