Thursday, August 21, 2025
“Bro, ban me at the IP level if you don't like me!”
More and more I think I'm coming around to Alex Schroeder's Butlerian Jihad. For reasons, I'm looking into web activity and so far, the top webbot this month is one identifying itself as “Thinkbot,” which may be related to this AI company but I can't be sure. Here's how it identifies itself: “Mozilla/5.0 (compatible; Thinkbot/0.5.8; +In_the_test_phase,_if_the_Thinkbot_brings_you_trouble,_please_block_its_IP_address._Thank_you.)”.
Seriously, that's it. No URL to read up on it. It doesn't look at the robots.txt file. Just “bro, ban me at the IP level if you don't like me!”
Yeah, block its IP address. You mean the 74 unique addresses it used this month alone? Checking each IP address for the ASN it's from shows the 74 addresses coming from 41 (41!) network blocks!
A further check showed that all the network blocks are owned by one organization: Tencent. I'm seriously thinking that the CCP encourages this, maybe with the hope of externalizing the cost of the Great Firewall to the rest of the world. If China scrapes content, that's fine as far as the CCP goes; if it's blocked, that's fine by the CCP too (I say, as I adjust my tin foil hat).
In any case, I added the following network blocks to my “badbots firewall rule set”:
43.130.0.0/18 43.130.64.0/18 43.130.128.0/19 43.130.160.0/19 43.131.0.0/18 43.132.192.0/18 43.133.64.0/19 43.134.128.0/18 43.135.0.0/18 43.135.64.0/18 43.135.192.0/19 43.153.0.0/18 43.153.192.0/18 43.154.64.0/18 43.154.128.0/18 43.154.192.0/18 43.155.0.0/18 43.155.128.0/18 43.156.192.0/18 43.157.0.0/18 43.157.64.0/18 43.157.128.0/18 43.159.128.0/19 43.163.64.0/18 43.164.192.0/18 43.165.128.0/18 43.166.128.0/18 43.166.224.0/19 49.51.132.0/23 49.51.140.0/23 49.51.166.0/23 101.32.0.0/20 101.32.48.0/20 101.33.64.0/19 119.28.64.0/19 119.28.128.0/20 129.226.160.0/19 150.109.32.0/19 150.109.96.0/19 170.106.32.0/19 170.106.176.0/20
The above list probably doesn't exhaustively enumerate Tencent's network block ownership, but it's a start. The above covers 476,590 unique IP addresses (excluding the base network and broadcast address for each network block). I think it's bad that I had to do this, but with the current landscape of the Internet, it seems inevitable. We can't have nice things, it seems.
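The address count can be double-checked with Python's standard `ipaddress` module. This is just a quick sketch; the names `BLOCKS` and `usable_hosts` are illustrative, and the list is copied from the rule set above.

```python
import ipaddress

# The 41 network blocks from the rule set above.
BLOCKS = """
43.130.0.0/18 43.130.64.0/18 43.130.128.0/19 43.130.160.0/19 43.131.0.0/18
43.132.192.0/18 43.133.64.0/19 43.134.128.0/18 43.135.0.0/18 43.135.64.0/18
43.135.192.0/19 43.153.0.0/18 43.153.192.0/18 43.154.64.0/18 43.154.128.0/18
43.154.192.0/18 43.155.0.0/18 43.155.128.0/18 43.156.192.0/18 43.157.0.0/18
43.157.64.0/18 43.157.128.0/18 43.159.128.0/19 43.163.64.0/18 43.164.192.0/18
43.165.128.0/18 43.166.128.0/18 43.166.224.0/19 49.51.132.0/23 49.51.140.0/23
49.51.166.0/23 101.32.0.0/20 101.32.48.0/20 101.33.64.0/19 119.28.64.0/19
119.28.128.0/20 129.226.160.0/19 150.109.32.0/19 150.109.96.0/19
170.106.32.0/19 170.106.176.0/20
""".split()


def usable_hosts(cidr: str) -> int:
    """Addresses in the block, minus the base network and broadcast address."""
    return ipaddress.ip_network(cidr).num_addresses - 2


total = sum(usable_hosts(block) for block in BLOCKS)
print(len(BLOCKS), "blocks,", total, "addresses")
```

Each /18 contributes 16,382 addresses, each /19 8,190, each /20 4,094 and each /23 510, which is where the total comes from.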
Commenting runtime state changes
As I was banning Thinkbot, I saw the previous entries in the “badbots firewall rule set”. The first one was banning a particularly bad Gemini bot that would make an invalid empty request only to immediately follow up with a valid request, for every request it made! That was the first bot I actually banned, and it was a very recent ban too: June 19th.
But it was the second entry on the list that puzzled me:
Chain badbot (1 references)
 pkts bytes target  prot opt in  out  source           destination
    0     0 DROP    tcp  --  *   *    77.25.18.172     0.0.0.0/0    tcp dpt:1965
  138  8280 DROP    all  --  *   *    185.177.72.0/24  0.0.0.0/0
(The count of 0 for the first rule is because I had to reboot my server recently, for reasons I'm still trying to resolve.) I will have to go through the log archives to see why I banned the 185.177.72.0/24 network, and that reminded me of an idea I had years ago but never did anything about.
Eighteen years ago (sigh) I wrote the greylist daemon (source code, and for the record, I'm still using it). It tracks a tuple of sending host, from address and to address, and the default is to just greylist (that is, artificially delay) a tuple never seen before. But you can override the default behavior for the host, the from address and the to address. So for instance, I can reject hosts:
gld-mcp>iplist reject 206.214.64.0/19
But now, years later, why did I ban that network? I mean, I did set it at some point:
gld-mcp>show iplist
  106 GREYLIST 0.0.0.0         0.0.0.0
    0 ACCEPT   64.12.0.0       255.255.0.0
    0 ACCEPT   64.233.160.0    255.255.224.0
    0 ACCEPT   66.94.224.0     255.255.224.0
    0 ACCEPT   66.102.0.0      255.255.240.0
    0 ACCEPT   66.163.160.0    255.255.224.0
    0 ACCEPT   66.218.64.0     255.255.224.0
    0 ACCEPT   66.220.144.128  255.255.255.128
    0 ACCEPT   66.249.80.0     255.255.240.0
    0 ACCEPT   66.249.64.0     255.255.224.0
    0 ACCEPT   66.252.224.0    255.255.252.0
    0 ACCEPT   69.63.176.0     255.255.240.0
    0 ACCEPT   69.147.64.0     255.255.192.0
    0 ACCEPT   70.34.16.0      255.255.240.0
    0 ACCEPT   72.14.192.0     255.255.192.0
    0 ACCEPT   74.125.0.0      255.255.0.0
    0 ACCEPT   127.0.0.1       255.255.255.255
    0 ACCEPT   140.211.11.3    255.255.255.255
    0 ACCEPT   149.174.0.0     255.255.0.0
    0 REJECT   172.128.0.0     255.128.0.0
    0 ACCEPT   192.168.0.0     255.255.0.0
    0 ACCEPT   204.127.217.0   255.255.255.0
    0 ACCEPT   204.127.0.0     255.255.0.0
    0 ACCEPT   205.152.58.0    255.255.254.0
    0 ACCEPT   205.188.156.0   255.255.254.0
    0 ACCEPT   205.188.0.0     255.255.0.0
    0 REJECT   206.214.64.0    255.255.224.0
    0 ACCEPT   207.115.11.0    255.255.255.192
    0 ACCEPT   207.115.0.0     255.255.192.0
    0 ACCEPT   207.171.188.0   255.255.255.0
    9 ACCEPT   209.85.128.0    255.255.128.0
    0 ACCEPT   209.131.32.0    255.255.224.0
    0 ACCEPT   216.39.48.0     255.255.240.0
    0 ACCEPT   216.239.32.0    255.255.224.0
but there's no indication of when, or why. A few years of use, and I wish I had added a way to comment such entries. For instance, I blocked 172.128.0.0/9 (the netmask in the listing is 255.128.0.0) at some point, but since then, the block is now owned by Microsoft in the United Kingdom. I think I can remove that block now (maybe?).
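Purely as a sketch of what a commented entry might carry: the record below pairs a rule with a reason and a date. `IpListEntry`, `make_entry`, and the `reason` and `added` fields are all hypothetical names for illustration; nothing like this exists in the greylist daemon. The standard `ipaddress` module also converts the dotted-quad mask from the listing into a prefix length.

```python
import ipaddress
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class IpListEntry:
    """Hypothetical annotated rule: what was done, when, and why."""
    network: ipaddress.IPv4Network
    action: str
    reason: str
    added: date


def make_entry(address: str, netmask: str, action: str,
               reason: str, added: date) -> IpListEntry:
    # ip_network() accepts "address/netmask" as well as "address/prefix".
    network = ipaddress.ip_network(f"{address}/{netmask}")
    return IpListEntry(network, action, reason, added)


# The date and reason here are placeholders, not recovered history.
entry = make_entry("172.128.0.0", "255.128.0.0", "REJECT",
                   "original reason not recorded", date(2025, 8, 21))
print(f"{entry.action} {entry.network}  # {entry.added}: {entry.reason}")
```

Even a free-form reason string and a date would be enough to answer "when, and why" years later.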
And I think it might be nice if iptables (and related commands; I think the preferred firewall interface for Linux is now nftables? Good lord, the churn in this industry is insane) had a way to add comments, like:
# iptables -A badbots --comment "Thinkbot daring me to ban it 2025-08-21" -s 43.131.0.0/18 -j DROP
I don't know, it's just a random idea I have.
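As it turns out, and as far as I can tell from the iptables-extensions man page, iptables already ships a `comment` match that does roughly this, although the spelling is `-m comment --comment "..."` rather than a bare `--comment`. Here is a small sketch that builds such a command line with a dated note. The helper name `drop_rule` is made up for illustration, and the script only prints the command; it never runs iptables.

```python
import shlex
from datetime import date


def drop_rule(chain: str, source: str, note: str, when: date) -> str:
    """Build an 'iptables -A ... -j DROP' command carrying a dated comment."""
    comment = f"{note} {when.isoformat()}"
    argv = [
        "iptables", "-A", chain,
        "-s", source,
        "-m", "comment", "--comment", comment,  # the comment match extension
        "-j", "DROP",
    ]
    # shlex.join quotes each argument so the line is safe to paste into a shell.
    return shlex.join(argv)


print(drop_rule("badbots", "43.131.0.0/18",
                "Thinkbot daring me to ban it", date(2025, 8, 21)))
```

The comment should then show up next to the rule in the `iptables -L` output, which is exactly the "why did I ban this?" reminder missing from the listing earlier.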