The Boston Diaries

The ongoing saga of a programmer who doesn't live in Boston, nor does he even like Boston, but yet named his weblog/journal “The Boston Diaries.”

Go figure.

Thursday, August 21, 2025

“Bro, ban me at the IP level if you don't like me!”

More and more I think I'm coming around to Alex Schroeder's Butlerian Jihad. For reasons, I'm looking into web activity and so far, the top webbot this month is one identifying itself as “Thinkbot,” which may be related to this AI company but I can't be sure. Here's how it identifies itself: “Mozilla/5.0 (compatible; Thinkbot/0.5.8; +In_the_test_phase,_if_the_Thinkbot_brings_you_trouble,_please_block_its_IP_address._Thank_you.)”.

Seriously, that's it. No URL to read up on it. It doesn't look at the robots.txt file. Just “bro, ban me at the IP level if you don't like me!”

Yeah, block its IP address. You mean the 74 unique addresses it used this month alone? Checking each IP address for the ASN it's from shows the 74 addresses coming from 41 (41!) network blocks!
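A tally like that can be reproduced with a few lines of script. What follows is only a sketch under stated assumptions: it assumes the common "combined" access log format (client address first, user agent as the last quoted field), the sample log lines are made up, and `unique_addresses` and `group_by_block` are hypothetical helper names. It also groups addresses by a list of already-known network blocks rather than doing a real ASN lookup, which needs outside data.

```python
import ipaddress
import re

# Assumption: "combined" log format; the client address is the first field
# and the user agent is the last double-quoted field on the line.
LOG_LINE = re.compile(r'^(\S+) .* "([^"]*)"$')

def unique_addresses(lines, agent_substring):
    """Return the set of client addresses whose user agent contains agent_substring."""
    seen = set()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and agent_substring in match.group(2):
            seen.add(ipaddress.ip_address(match.group(1)))
    return seen

def group_by_block(addresses, blocks):
    """Map each network block to the addresses that fall inside it."""
    networks = [ipaddress.ip_network(block) for block in blocks]
    grouped = {}
    for address in addresses:
        for network in networks:
            if address in network:
                grouped.setdefault(network, set()).add(address)
                break
    return grouped

# Made-up sample lines, for illustration only.
SAMPLE_LOG = [
    '43.130.1.2 - - [21/Aug/2025:00:00:01 -0400] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; Thinkbot/0.5.8)"',
    '43.130.1.2 - - [21/Aug/2025:00:00:02 -0400] "GET /2000/08/01 HTTP/1.1" 200 900 "-" "Mozilla/5.0 (compatible; Thinkbot/0.5.8)"',
    '49.51.132.7 - - [21/Aug/2025:00:00:03 -0400] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; Thinkbot/0.5.8)"',
    '192.0.2.10 - - [21/Aug/2025:00:00:04 -0400] "GET / HTTP/1.1" 200 512 "-" "SomeOtherAgent/1.0"',
]

addresses = unique_addresses(SAMPLE_LOG, "Thinkbot")
grouped = group_by_block(addresses, ["43.130.0.0/18", "49.51.132.0/23"])
print(len(addresses), "addresses in", len(grouped), "blocks")  # 2 addresses in 2 blocks
```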

A further check showed that all the network blocks are owned by one organization: Tencent. I'm seriously thinking that the CCP encourages this with maybe the hope of externalizing the cost of the Great Firewall to the rest of the world. If China scrapes content, that's fine as far as the CCP goes; if it's blocked, that's fine by the CCP too (I say, as I adjust my tin foil hat).

In any case, I added the following network blocks to my “badbots firewall rule set”:

43.130.0.0/18
43.130.64.0/18
43.130.128.0/19
43.130.160.0/19
43.131.0.0/18
43.132.192.0/18
43.133.64.0/19
43.134.128.0/18
43.135.0.0/18
43.135.64.0/18
43.135.192.0/19
43.153.0.0/18
43.153.192.0/18
43.154.64.0/18
43.154.128.0/18
43.154.192.0/18
43.155.0.0/18
43.155.128.0/18
43.156.192.0/18
43.157.0.0/18
43.157.64.0/18
43.157.128.0/18
43.159.128.0/19
43.163.64.0/18
43.164.192.0/18
43.165.128.0/18
43.166.128.0/18
43.166.224.0/19
49.51.132.0/23
49.51.140.0/23
49.51.166.0/23
101.32.0.0/20
101.32.48.0/20
101.33.64.0/19
119.28.64.0/19
119.28.128.0/20
129.226.160.0/19
150.109.32.0/19
150.109.96.0/19
170.106.32.0/19
170.106.176.0/20

The above list probably doesn't exhaustively enumerate Tencent's network block ownership, but it's a start. The above covers 476,590 unique IP addresses (excluding the base network and broadcast address for each network block). I think it's bad that I had to do this, but with the current landscape of the Internet, it seems inevitable. We can't have nice things, it seems.
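The address count is easy to double-check with Python's standard `ipaddress` module. This is just a sketch of the arithmetic: each block contributes its size minus two (the base network and broadcast addresses). The name `usable_addresses` is made up for illustration.

```python
import ipaddress

# The 41 network blocks listed above.
BLOCKS = """
43.130.0.0/18 43.130.64.0/18 43.130.128.0/19 43.130.160.0/19
43.131.0.0/18 43.132.192.0/18 43.133.64.0/19 43.134.128.0/18
43.135.0.0/18 43.135.64.0/18 43.135.192.0/19 43.153.0.0/18
43.153.192.0/18 43.154.64.0/18 43.154.128.0/18 43.154.192.0/18
43.155.0.0/18 43.155.128.0/18 43.156.192.0/18 43.157.0.0/18
43.157.64.0/18 43.157.128.0/18 43.159.128.0/19 43.163.64.0/18
43.164.192.0/18 43.165.128.0/18 43.166.128.0/18 43.166.224.0/19
49.51.132.0/23 49.51.140.0/23 49.51.166.0/23 101.32.0.0/20
101.32.48.0/20 101.33.64.0/19 119.28.64.0/19 119.28.128.0/20
129.226.160.0/19 150.109.32.0/19 150.109.96.0/19 170.106.32.0/19
170.106.176.0/20
""".split()

def usable_addresses(block):
    """Addresses in the block, minus the base network and broadcast addresses."""
    return ipaddress.ip_network(block).num_addresses - 2

total = sum(usable_addresses(block) for block in BLOCKS)
print(len(BLOCKS), "blocks,", total, "addresses")  # 41 blocks, 476590 addresses
```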


Commenting runtime state changes

As I was banning Thinkbot, I saw the previous entries in the “badbots firewall rule set”. The first one was banning a particularly bad Gemini bot that would make an invalid empty request only to immediately follow up with a valid request, for every request it made! That was the first bot I actually banned, and it was a very recent ban too (June 19th).

But it was the second entry on the list that puzzled me:

Chain badbot (1 references)
    pkts      bytes target     prot opt in     out     source               destination         
       0        0 DROP       tcp  --  *      *       77.25.18.172         0.0.0.0/0           tcp dpt:1965 
     138     8280 DROP       all  --  *      *       185.177.72.0/24      0.0.0.0/0           

(The count of 0 for the first rule is because I had to reboot my server recently for reasons I'm still trying to resolve.) I will have to go through the log archives to see why I banned the 185.177.72.0/24 network, and that reminded me of an idea I had years ago but never did anything about.

Eighteen years ago (sigh) I wrote the greylist daemon (source code, and for the record, I'm still using it). It tracks a tuple of sending host, from address, and to address, and the default is to just greylist (that is, artificially delay) a tuple never seen before. But you can override the default behavior for the host, from address, and to address. So for instance, I can reject hosts:

gld-mcp>iplist reject 206.214.64.0/19
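For anyone who hasn't run into greylisting, the tuple-plus-override logic can be sketched roughly as below. This is not the daemon's actual code: the `Greylist` class, the 300-second delay, and the most-specific-block-wins ordering are all assumptions made for illustration, and the overrides for from and to addresses are left out to keep it short.

```python
import ipaddress
import time

class Greylist:
    """Toy greylist: delay unseen (host, from, to) tuples, with per-host overrides."""

    def __init__(self, delay=300, clock=time.time):
        self.delay = delay        # assumed delay in seconds before a tuple is accepted
        self.clock = clock        # injectable clock, handy for testing
        self.first_seen = {}      # (host, sender, recipient) -> time first seen
        self.ip_overrides = []    # list of (network, action)

    def iplist(self, action, block):
        """Override the default for every host inside the given network block."""
        self.ip_overrides.append((ipaddress.ip_network(block), action))
        # Assumption: the most specific (longest prefix) block wins.
        self.ip_overrides.sort(key=lambda entry: entry[0].prefixlen, reverse=True)

    def check(self, host, sender, recipient):
        """Return "ACCEPT", "REJECT" or "GREYLIST" for one delivery attempt."""
        address = ipaddress.ip_address(host)
        for network, action in self.ip_overrides:
            if address in network:
                return action
        key = (host, sender, recipient)
        now = self.clock()
        first = self.first_seen.setdefault(key, now)
        if now - first < self.delay:
            return "GREYLIST"     # never seen, or seen too recently: try again later
        return "ACCEPT"

fake_now = [0]
greylist = Greylist(delay=300, clock=lambda: fake_now[0])
greylist.iplist("REJECT", "206.214.64.0/19")

print(greylist.check("206.214.65.1", "a@example.com", "b@example.net"))  # REJECT
print(greylist.check("192.0.2.1", "a@example.com", "b@example.net"))     # GREYLIST
fake_now[0] = 301
print(greylist.check("192.0.2.1", "a@example.com", "b@example.net"))     # ACCEPT
```

The last `iplist` call in the sketch mirrors the reject of 206.214.64.0/19 shown above.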

But now, years later, why did I ban that network? I mean, I did set it at some point:

gld-mcp>show iplist
       106 GREYLIST         0.0.0.0         0.0.0.0
         0   ACCEPT       64.12.0.0     255.255.0.0
         0   ACCEPT    64.233.160.0   255.255.224.0
         0   ACCEPT     66.94.224.0   255.255.224.0
         0   ACCEPT      66.102.0.0   255.255.240.0
         0   ACCEPT    66.163.160.0   255.255.224.0
         0   ACCEPT     66.218.64.0   255.255.224.0
         0   ACCEPT  66.220.144.128 255.255.255.128
         0   ACCEPT     66.249.80.0   255.255.240.0
         0   ACCEPT     66.249.64.0   255.255.224.0
         0   ACCEPT    66.252.224.0   255.255.252.0
         0   ACCEPT     69.63.176.0   255.255.240.0
         0   ACCEPT     69.147.64.0   255.255.192.0
         0   ACCEPT      70.34.16.0   255.255.240.0
         0   ACCEPT     72.14.192.0   255.255.192.0
         0   ACCEPT      74.125.0.0     255.255.0.0
         0   ACCEPT       127.0.0.1 255.255.255.255
         0   ACCEPT    140.211.11.3 255.255.255.255
         0   ACCEPT     149.174.0.0     255.255.0.0
         0   REJECT     172.128.0.0     255.128.0.0
         0   ACCEPT     192.168.0.0     255.255.0.0
         0   ACCEPT   204.127.217.0   255.255.255.0
         0   ACCEPT     204.127.0.0     255.255.0.0
         0   ACCEPT    205.152.58.0   255.255.254.0
         0   ACCEPT   205.188.156.0   255.255.254.0
         0   ACCEPT     205.188.0.0     255.255.0.0
         0   REJECT    206.214.64.0   255.255.224.0
         0   ACCEPT    207.115.11.0 255.255.255.192
         0   ACCEPT     207.115.0.0   255.255.192.0
         0   ACCEPT   207.171.188.0   255.255.255.0
         9   ACCEPT    209.85.128.0   255.255.128.0
         0   ACCEPT    209.131.32.0   255.255.224.0
         0   ACCEPT     216.39.48.0   255.255.240.0
         0   ACCEPT    216.239.32.0   255.255.224.0

but there's no indication of when, or why. A few years of use, and I wish I had added a way to comment such entries. For instance, I blocked 172.128.0.0/9 at some point, but the block is now owned by Microsoft in the United Kingdom. I think I can remove that block now (maybe?).
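What a commented entry might carry is easy enough to sketch. The `Entry` record and `make_entry` helper below are hypothetical and not part of the greylist daemon; the date and comment are placeholders, since the real ones are exactly what's missing. As a side effect, the sketch converts the dotted-quad masks from the listing into prefix lengths.

```python
import ipaddress
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class Entry:
    """One iplist entry plus the bookkeeping the listing above lacks."""
    network: ipaddress.IPv4Network
    action: str
    added: date      # when the entry was added
    comment: str     # why it was added

def make_entry(address, mask, action, added, comment):
    """Build an entry from an address and a dotted-quad mask, as shown by 'show iplist'."""
    network = ipaddress.ip_network(f"{address}/{mask}")
    return Entry(network, action, added, comment)

# The date and comment here are placeholders, not recovered history.
entries = [
    make_entry("172.128.0.0", "255.128.0.0", "REJECT",
               date(2025, 8, 21), "placeholder: reason unknown, review later"),
    make_entry("206.214.64.0", "255.255.224.0", "REJECT",
               date(2025, 8, 21), "placeholder: reason unknown, review later"),
]

for entry in entries:
    print(entry.network, entry.action, entry.added.isoformat(), entry.comment)

print(entries[0].network)  # 172.128.0.0/9
print(entries[1].network)  # 206.214.64.0/19
```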

And I think it might be nice if iptables (and related commands; I think the preferred firewall interface for Linux is now nftables? Good lord, the churn in this industry is insane) had a way to add comments, like:

# iptables -A badbots --comment "Thinkbot daring me to ban it 2025-08-21" -s 43.131.0.0/18 -j DROP

I don't know, it's just a random idea I have.
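It's worth noting that iptables does ship a `comment` match extension, used as `-m comment --comment "text"`, and the comment then shows up in the rule listing. The sketch below only builds the command line and doesn't run anything (actually adding the rule needs root). The `drop_rule` helper is a made-up name, and the chain name `badbots` is taken from the example above.

```python
import shlex

def drop_rule(chain, source, comment):
    """Build (but do not run) an iptables command that drops a source block, with a comment."""
    return [
        "iptables", "-A", chain,
        "-s", source,
        "-m", "comment", "--comment", comment,
        "-j", "DROP",
    ]

argv = drop_rule("badbots", "43.131.0.0/18", "Thinkbot daring me to ban it 2025-08-21")

# shlex.join quotes the comment so the line can be pasted into a shell.
print(shlex.join(argv))
```

Passing the list as-is to `subprocess.run` would avoid shell quoting problems entirely, should the rule ever be added from a script.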

Obligatory AI Disclaimer

No AI was used in the making of this site, unless otherwise noted.

You have my permission to link freely to any entry here. Go ahead, I won't bite. I promise.

The dates are the permanent links to that day's entries (or entry, if there is only one entry). The titles are the permanent links to that entry only. The format for the links is simple: start with the base link for this site: https://boston.conman.org/, then add the date you are interested in, say 2000/08/01, so that would make the final URL:

https://boston.conman.org/2000/08/01

You can also specify the entire month by leaving off the day portion. You can even select an arbitrary portion of time.

You may also note subtle shading of the links and that's intentional: the “closer” the link is (relative to the page) the “brighter” it appears. It's an experiment in using color shading to denote the distance a link is from here. If you don't notice it, don't worry; it's not all that important.

It is assumed that every brand name, slogan, corporate name, symbol, design element, et cetera mentioned in these pages is a protected and/or trademarked entity, the sole property of its owner(s), and acknowledgement of this status is implied.

Copyright © 1999-2025 by Sean Conner. All Rights Reserved.