Sunday, January 05, 2003
More Ins and Outs of Calculating Weblog Traffic
As I do occasionally, I ran the stats for the Boston Diaries. I use some programs I wrote to go through the log files pretty much by hand, since I think that gives me a better feel for the actual traffic I get than a program like Analog would. Besides, doing it this way I often find interesting things going on, with autonomous agents silently indexing websites for their own nefarious reasons (muahahahahaha!).
I suspect that most people who run their stats don't take the time to really look into the results, so it wouldn't surprise me if the reported stats for most bloggers are inflated quite a bit.
I ran the stats as I have in the past and noticed that I had a higher rate of traffic than normal; I usually get about 100 human hits per day but last month it looked more like 116 per day. Okay, not that big a spike but enough to make me curious as to what's going on. I look at some of the requests that are being counted as human hits and I see [output truncated somewhat]:
213.60.99.73 GET /2002/11/29 HTTP/1.0 200 Mozilla la2@unspecified.mail
213.60.99.73 GET /2002/11/29.1 HTTP/1.0 200 Mozilla la2@unspecified.mail
213.60.99.73 GET /2002/11/23.1 HTTP/1.0 200 Mozilla la2@unspecified.mail
Interesting … seems to be some unspecified robot. A quick query shows it to be from Spain, but other than that, no real information unless I want to track this down further. I'm not that curious, so add that to the list of agents to ignore and rerun the stats.
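A minimal Python sketch of this filter-and-rerun step, not the actual programs used here. It assumes each log entry has already been parsed into a (day, address, agent) tuple; that record shape, the function names, and the sample data are hypothetical, and only the first address and its agent string come from the excerpt above.

```python
def is_ignored(address, agent, ignored_addresses, ignored_agents):
    """True if the request comes from an ignored address or agent."""
    if address in ignored_addresses:
        return True
    # An agent is ignored if any listed marker appears in its agent string.
    return any(marker in agent for marker in ignored_agents)


def human_hits_per_day(records, days, ignored_addresses=(), ignored_agents=()):
    """Average number of non-ignored requests per day over `days` days."""
    kept = [
        (day, address, agent)
        for day, address, agent in records
        if not is_ignored(address, agent, ignored_addresses, ignored_agents)
    ]
    return len(kept) / days


# Made-up sample records; the 192.0.2.x addresses are documentation
# addresses and the dates are invented for illustration.
sample = [
    ("2002-12-01", "213.60.99.73", "Mozilla la2@unspecified.mail"),
    ("2002-12-01", "192.0.2.10", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"),
    ("2002-12-02", "192.0.2.11", "Mozilla/4.7 [en] (WinNT; U)"),
    ("2002-12-02", "192.0.2.10", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"),
]

before = human_hits_per_day(sample, days=2)
after = human_hits_per_day(sample, days=2, ignored_agents=["la2@unspecified.mail"])
```

Each newly discovered robot just becomes another entry in one of the two ignore lists, and the average is recomputed over the same period.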
Still high—about 114 visits per day. Check the requests and find:
12.148.209.196 GET / HTTP/1.1 200 Mozilla/4.7
12.148.209.196 GET /2002/6 HTTP/1.1 200 Mozilla/4.7
12.148.209.196 GET /2001/10 HTTP/1.1 200 Mozilla/4.7
12.148.209.196 GET /2000/6 HTTP/1.1 200 Mozilla/4.7
12.148.209.196 GET /2002/5 HTTP/1.1 200 Mozilla/4.7
Now that is odd. Netscape 4.7 is usually a bit more verbose about what it is than just Mozilla/4.7. Looking up the address I see that it belongs to NameProtect®:
NameProtect, Inc.® is committed to setting the industry standard when it comes to trademark research and registration services. As one of the world's leading trademark research firms, we have helped thousands of entrepreneurs, businesses, attorneys, and other intellectual property professionals with trademark needs.
Oh how nice …
I probably wouldn't be so upset over these guys if they weren't trying to hide behind a browser, or if they respected the Robots Exclusion Protocol, but they don't do either (and I wonder what they'll think of my using their logo here? It won't be the first time I got a cease-and-desist letter for trademark violations; my first, and so far only, one was in September/October of 1998).
This section of your report includes information on generic top-level domain names (.com, .net, .org) and other country-specific domain name registrations that are similar to your name. Use this section to identify potential competitors and assess the potential for your web traffic to be diverted.
NameGuard Free Name Monitoring
Okay, so removing the “anonymous” NameProtect® robot and rerunning, I see I'm now down to a more normal 106 human visits per day, but just to be on the safe side …
4.64.202.64 GET /2000/08/30 HTTP/1.0 200 Mozilla/3.0 (compatible)
4.64.202.64 GET /2000/08/28.2 HTTP/1.0 200 Mozilla/3.0 (compatible)
4.64.202.64 GET /2000/08/31.3 HTTP/1.0 200 Mozilla/3.0 (compatible)
4.64.202.64 GET /2000/08/19.1 HTTP/1.0 200 Mozilla/3.0 (compatible)
4.64.202.64 GET /2000/08/14.7 HTTP/1.0 200 Mozilla/3.0 (compatible)
4.64.202.64 GET /2000/08/15 HTTP/1.0 200 Mozilla/3.0 (compatible)
Large number of requests from this address. 143 to be exact, the majority on December 8th and requesting entries mostly from August of 2000. Hard to tell if this is an actual user or a robot someone is working on. If I filter these requests out, I get 101 human visits per day.
Which is about what I expect.
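Spotting an address like 4.64.202.64 comes down to counting requests per address and then per day for the busiest one. Here is a rough Python sketch of that tally, again assuming hypothetical (day, address, agent) records; the sample below is made up and far smaller than the 143 requests mentioned above.

```python
from collections import Counter


def requests_by_address(records):
    """Count how many requests each address made."""
    return Counter(address for _day, address, _agent in records)


def requests_by_day(records, address):
    """Count the requests one address made on each day."""
    return Counter(day for day, addr, _agent in records if addr == address)


# Made-up sample; 192.0.2.10 is a documentation address.
sample = [
    ("2002-12-08", "4.64.202.64", "Mozilla/3.0 (compatible)"),
    ("2002-12-08", "4.64.202.64", "Mozilla/3.0 (compatible)"),
    ("2002-12-09", "4.64.202.64", "Mozilla/3.0 (compatible)"),
    ("2002-12-08", "192.0.2.10", "Mozilla/4.7 [en] (WinNT; U)"),
]

# The address with the most requests, and the day it was most active.
top_address, top_count = requests_by_address(sample).most_common(1)[0]
busiest_day, day_count = requests_by_day(sample, top_address).most_common(1)[0]
```

A tally like this only flags candidates; deciding whether a busy address is a person or a robot still takes a look at what it actually requested.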