Tuesday, July 16, 2002
The Ins and Outs of Calculating Browser Usage
I spent the past few hours writing a program to parse the browser string from the web server log files. Why didn't I use an existing web analyzer package? I wanted the browser strings to be rewritten to have correct information, and to be in a more consistent style. This meant changing it from, say:
Mozilla/4.0 (compatible; MSIE 6.0; Windows 98; Win 9x 4.90; Q312461)
to
MSIE/6.0 Windows/98
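For the curious, a rewrite like the one above could be sketched in Python roughly as follows. This is not the actual program, just a minimal illustration: the `normalize` function and both regular expressions are made up for this example, and they only cope with MSIE-on-Windows strings like the one shown.

```python
import re

# Hypothetical patterns for illustration only; real browser strings
# come in far more shapes than these two expressions cover.
MSIE_RE = re.compile(r"MSIE (\d+(?:\.\d+)*)")
WINDOWS_RE = re.compile(r"Windows (NT )?([\w.]+)")


def normalize(agent: str) -> str:
    """Return 'Browser/Version OS/Version', with '-/-' for unknown parts."""
    browser = "-/-"
    os_name = "-/-"

    match = MSIE_RE.search(agent)
    if match:
        browser = f"MSIE/{match.group(1)}"

    match = WINDOWS_RE.search(agent)
    if match:
        # "Windows NT 5.1" becomes "WindowsNT/5.1", "Windows 98" becomes "Windows/98"
        family = "WindowsNT" if match.group(1) else "Windows"
        os_name = f"{family}/{match.group(2)}"

    return f"{browser} {os_name}"


raw = "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98; Win 9x 4.90; Q312461)"
print(normalize(raw))
```

Anything the patterns don't recognize falls through as `-/-`, the same placeholder the tables in this post use for unknown browsers and operating systems.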
This also means I can generate decent stats about the popularity of certain browsers on the fly (using the Unix command line, I can pull out the browser string, feed that through the newly written program, then count the unique browsers much more easily). An initial run through last month's log file for my blog:
# Hits | Browser/Version | OS/Version |
---|---|---|
1,228 | Googlebot/2.1 | -/- |
748 | MSIE/6.0 | WindowsNT/5.1 |
712 | MSIE/6.0 | Windows/98 |
641 | MSIE/6.0 | WindowsNT/5.0 |
476 | Mercator/2.0 | -/- |
371 | MSIE/5.5 | Windows/98 |
303 | MSIE/5.0 | Windows/98 |
302 | MSIE/5.5 | WindowsNT/5.0 |
238 | -/- | -/- |
216 | MSIE/5.01 | WindowsNT/5.0 |
137 | ia_archiver/- | -/- |
113 | Syndic8/1.0 | -/- |
101 | NCSA/- | -/- |
101 | MSIE/5.01 | Windows/98 |
100 | MSIE/6.0 | WindowsNT/4.0 |
99 | Mozilla/3.01 | -/- |
89 | Gecko/20020529 | Linux/i686 |
88 | Gecko/20020523 | WindowsNT/5.0 |
81 | MSIE/5.14 | Mac_PowerPC/- |
79 | Mozilla/5.0 | -/- |
68 | SlySearch/1.2 | -/- |
66 | MSIE/5.5 | Windows/95 |
62 | MSIE/5.5 | WindowsNT/4.0 |
62 | Gecko/20020529 | PPC/Mac |
61 | Openfind/- | -/- |
55 | MSIE/5.0 | Mac_PowerPC/- |
49 | Indy-Library/- | -/- |
48 | Gecko/20020510 | Linux/i686 |
42 | Mozilla/3.0 | -/- |
41 | sitecheck.internetseer.com/- | -/- |
40 | Gecko/20020311 | WindowsNT/5.1 |
38 | MSIE/5.01 | Windows/95 |
36 | bumblebee@relevare.com/- | -/- |
33 | Gecko/20020530 | WindowsNT/5.0 |
28 | bumblebee/1.0 | -/- |
28 | Gecko/20020510 | WinNT4.0/- |
27 | Opera/6.02 | Windows/2000 |
27 | MSIE/5.0 | WindowsNT/4.0 |
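The counting step behind a table like this is simple once every browser string has been rewritten into the same form. A minimal Python sketch, assuming one normalized `Browser/Version OS/Version` string per hit (the `tally` name and the handful of sample lines are only for illustration), does the same job as piping the strings through `sort | uniq -c | sort -rn`:

```python
from collections import Counter


def tally(lines):
    """Count identical normalized browser strings, most common first."""
    counts = Counter(line.strip() for line in lines if line.strip())
    return counts.most_common()


# A few normalized strings standing in for a real log file.
sample = [
    "MSIE/6.0 Windows/98",
    "Googlebot/2.1 -/-",
    "MSIE/6.0 Windows/98",
    "MSIE/6.0 WindowsNT/5.1",
    "MSIE/6.0 Windows/98",
    "Googlebot/2.1 -/-",
]

for label, hits in tally(sample):
    # Split the label back into browser and OS columns for display.
    print(f"{hits:>6,} | {label.replace(' ', ' | ')} |")
```

Passing `sys.stdin` instead of `sample` would let the same function sit at the end of a command-line pipeline.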
The table above gives a decent flavor of what's being used to view my site (out of the 7,943 hits last month, about 16% were from the Google spider), but one of the primary reasons I did this was to see just how many people are still using older browsers like Netscape 4x or Internet Explorer 4x (which would show up as Mozilla/4.x and MSIE/4.x respectively). So, stripping out the operating system column and looking only at the major version numbers, we get:
# Hits | Browser/Major Version |
---|---|
2,210 | MSIE/6 |
1,671 | MSIE/5 |
1,228 | Googlebot/2 |
543 | Gecko/- |
476 | Mercator/2 |
238 | -/- |
142 | Opera/6 |
141 | Mozilla/3 |
137 | ia_archiver/- |
134 | Mozilla/4 |
113 | Syndic8/1 |
101 | NCSA/- |
79 | Mozilla/5 |
68 | SlySearch/1 |
61 | Openfind/- |
49 | Indy-Library/- |
45 | MSIE/4 |
41 | sitecheck.internetseer.com/- |
37 | Netscape6/6.2 |
36 | bumblebee@relevare.com/- |
28 | bumblebee/1 |
26 | linkhype.com/1 |
26 | Netscape/7 |
24 | BlogBot/1 |
22 | Win32/- |
22 | Konqueror/3.0 |
20 | Frontier/8.0 |
16 | Internet/- |
16 | Ask-Jeeves/- |
15 | Mozilla/- |
14 | Microsoft/- |
14 | Konqueror/2.2 |
12 | w3m/0.2 |
12 | obidos/bot |
12 | Mozilla/4.7C-CCK-MCD |
11 | myownhomeblogindexingservicecrawler/- |
11 | htdig/3.1 |
10 | Mozilla/3.x |
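Collapsing the first table into this one amounts to dropping the OS column, trimming each version to its major number, and adding up the hits. The Python sketch below shows the idea using a few MSIE rows from the first table; `major_only` and `collapse` are hypothetical names, and the sketch does not reproduce special cases in the table above, such as Gecko build dates being folded into `Gecko/-`.

```python
from collections import Counter


def major_only(label: str) -> str:
    """Trim 'MSIE/5.01' to 'MSIE/5'; labels such as 'NCSA/-' are unchanged."""
    name, _, version = label.partition("/")
    return f"{name}/{version.split('.')[0]}" if version else label


def collapse(rows):
    """Sum hits per browser/major-version, ignoring the OS column."""
    totals = Counter()
    for hits, browser, _os in rows:
        totals[major_only(browser)] += hits
    return totals


# (hits, browser, os) rows copied from the first table in this post.
rows = [
    (748, "MSIE/6.0", "WindowsNT/5.1"),
    (712, "MSIE/6.0", "Windows/98"),
    (641, "MSIE/6.0", "WindowsNT/5.0"),
    (371, "MSIE/5.5", "Windows/98"),
    (303, "MSIE/5.0", "Windows/98"),
]

total_hits = 7943  # monthly total quoted in the post
for label, hits in collapse(rows).most_common():
    print(f"{hits:>6,} | {label} | {hits / total_hits:.1%}")
```

Because only five rows are fed in, the sums come out lower than the MSIE lines in the full table; the percentage column just shows how a share of the 7,943 hits would be worked out.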
The bad news: 48% of the browsers were Internet Explorer 5x or 6x (although surprisingly enough, I did get five hits from a Mozilla-based browser under OS/2). The good news, though, is that 58% of the hits were from browsers capable of viewing CSS without crashing. And speaking of horrible browsers that can't support CSS, about 2.5% were running Netscape 4x or IE 4x (they can see the site, only it doesn't look that great).
I also checked the log file for Spring's site (Hi honey!). 53% of her visitors are using Internet Explorer 5 or higher, or Mozilla (or Netscape 6 and higher). Only about 3% are using Netscape 4x or Internet Explorer 4x, which is pretty much on par with my site (the rest are mostly robots or experimental browsers).