The Boston Diaries

The ongoing saga of a programmer who doesn't live in Boston, nor does he even like Boston, but yet named his weblog/journal “The Boston Diaries.”

Go figure.

Friday, November 16, 2001

“Google owes me how much?”

I came across a micropayment scheme that is making the rounds: Penny per Page. It works just like it sounds: you pay one penny to view one page. Technically, it's possible. HTTP has provisions to expand for pay-for-reference (although no standard is mentioned) and some work has been done.

Obligatory Sidebar Quote

The fact that they don't pay for Web content is a historic anomaly. The benefits to be reaped by paying a very small amount of money for Web content are gigantic. Right now, people are actively denying themselves many of the most amazing things that the Web could provide because of the "totally free" World Wide Web.

How Penny Per Page Might Work page 4

The article even mentions how, under this scheme, Google could easily make $350 million a year (assuming Google can maintain its 100 million page hits per day). But see, there's a slight problem, and it's one I haven't seen mentioned in any of the micropayment schemes I've read up on: search engines.

Ah yes, the Google Problem (as I've come to call it). The whole point of a search engine is to catalog your site so others can find it. If no one can find your site, it doesn't matter if you charge 1¢ or $1—you're not going to make money. And generally, sites don't mind if a search engine crawls through the site and indexes it. Heck, there are companies that make money submitting sites to search engines so they'll be crawled.

Now, how much of that fabled $350 million that Google makes will stay if Google has to pony up the 1¢ for each page it fetches?

Now, statistically speaking, using only my site and extrapolating from there makes poor science but hey, it's a starting point. A quick scan through my logs, which so far only cover November 1st through the very early morning hours of the 16th (it's 3:08 am as I'm writing this), shows 986 visits from Googlebot but only 83 referrals from Google itself.
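For the curious, a scan like this can be sketched in a few lines of Python. This is only a rough sketch, not the script actually used: it assumes the logs are in the common "combined" format, that Googlebot identifies itself in the User-Agent field, and that a referral is any hit whose Referer host is google.com or a subdomain of it. The sample lines, addresses, and paths are made up for illustration.

```python
import re
from urllib.parse import urlsplit

# Assumed layout (combined log format):
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]*\] "[^"]*" \d{3} \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

def count_google(lines):
    """Return (crawler_hits, referrals) for an iterable of log lines."""
    crawler_hits = 0
    referrals = 0
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue  # not in the assumed format; skip it
        if "googlebot" in match["agent"].lower():
            crawler_hits += 1
            continue
        # hostname is None for a "-" (empty) referer
        host = urlsplit(match["referer"]).hostname or ""
        if host == "google.com" or host.endswith(".google.com"):
            referrals += 1
    return crawler_hits, referrals

# Hypothetical sample data: one crawler hit, one referral, one ordinary visit.
SAMPLE = [
    '192.0.2.1 - - [16/Nov/2001:03:00:00 -0500] "GET /a HTTP/1.0" 200 5120 '
    '"-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"',
    '192.0.2.2 - - [16/Nov/2001:03:01:00 -0500] "GET /a HTTP/1.1" 200 5120 '
    '"http://www.google.com/search?q=micropayments" "Mozilla/4.0 (compatible)"',
    '192.0.2.3 - - [16/Nov/2001:03:02:00 -0500] "GET / HTTP/1.1" 200 2048 '
    '"-" "Opera/5.0 (Linux 2.4.9 i686; U)"',
]

crawler_hits, referrals = count_google(SAMPLE)
print(crawler_hits, referrals)
```

Matching on the User-Agent string is crude, but for a back-of-the-envelope count it is probably good enough.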

Interesting! Under this hypothetical plan, Google lost $9.03 on spidering my site. If I check all the sites I host, Google lost $15.46 from all the spidering it did. Meanwhile, I made $10.69 from Google spidering just this site, or $22.54 if I consider all the sites.
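The single-site figures appear to follow from the two log counts. The sketch below reproduces them under two assumptions that are stated here rather than spelled out in the entry: Google pays 1¢ for every page Googlebot fetches and earns 1¢ for each referral (one search results page viewed), while the site earns 1¢ for every crawler fetch and for every referred visitor. The function names are made up for illustration, and amounts are kept in whole cents to avoid floating point rounding.

```python
PRICE_CENTS = 1  # one penny per page

def google_net_cents(crawler_hits, referrals):
    """What Google earns from referrals minus what it pays to crawl."""
    return (referrals - crawler_hits) * PRICE_CENTS

def site_income_cents(crawler_hits, referrals):
    """What the site earns from crawler fetches plus referred visitors."""
    return (crawler_hits + referrals) * PRICE_CENTS

def dollars(cents):
    """Format a whole number of cents as a dollar amount."""
    sign = "-" if cents < 0 else ""
    return f"{sign}${abs(cents) // 100}.{abs(cents) % 100:02d}"

# Counts from the logs: 986 Googlebot visits, 83 referrals.
print(dollars(google_net_cents(986, 83)))   # -$9.03
print(dollars(site_income_cents(986, 83)))  # $10.69
```

The totals for all hosted sites ($15.46 and $22.54) would come from the same two functions, given the combined counts.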

On a whim, I checked three other sites whose log files I have access to, to see if the rather ad-hoc theory I'm working under is valid. For two of the sites, Google paid more to visit than it made in search results, but it definitely came out ahead on the third (of course it's a sex-related site).

So it would be hard to say if Google would be able to keep the $350 million if it too were subject to paying out 1¢ per page it indexed.

The other side of the coin is for the search engines to be exempt from the penny-per-page charge; after all, they're driving visitors to the site. But then it becomes a problem of determining if what is going through the pages is a robot or not. If you base the decision on the User-Agent then what's to stop someone using Opera and changing its User-Agent string to say it's Googlebot? Authentication is one method, but it's hard enough getting robots.txt on all sites and that's a simple text file. Something as complicated as an authentication scheme for robots is going to be tougher to sell.
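To make the User-Agent problem concrete, here is a minimal sketch of a naive exemption rule. The function names and the header dictionaries are hypothetical, not part of any real payment system. The check trusts whatever the client claims about itself, so a browser that simply says it is Googlebot rides for free.

```python
def is_exempt_robot(headers):
    """Naive rule: believe the User-Agent header the client sent."""
    return "googlebot" in headers.get("User-Agent", "").lower()

def charge_cents(headers, price_cents=1):
    """Charge nothing to (self-declared) robots, one penny to everyone else."""
    return 0 if is_exempt_robot(headers) else price_cents

# An honest browser, and the same browser configured to lie about itself.
honest_browser = {"User-Agent": "Opera/5.0 (Linux 2.4.9 i686; U)"}
spoofed_browser = {"User-Agent": "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"}

print(charge_cents(honest_browser))   # 1
print(charge_cents(spoofed_browser))  # 0
```

Nothing in the request proves who sent it, which is why some form of authentication would be needed before any exemption could be trusted.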

Obligatory Miscellaneous

You have my permission to link freely to any entry here. Go ahead, I won't bite. I promise.

The dates are the permanent links to that day's entries (or entry, if there is only one entry). The titles are the permanent links to that entry only. The format for the links is simple: Start with the base link for this site:, then add the date you are interested in, say 2000/08/01, so that would make the final URL:

You can also specify the entire month by leaving off the day portion. You can even select an arbitrary portion of time.

You may also note subtle shading of the links and that's intentional: the “closer” the link is (relative to the page) the “brighter” it appears. It's an experiment in using color shading to denote the distance a link is from here. If you don't notice it, don't worry; it's not all that important.

It is assumed that every brand name, slogan, corporate name, symbol, design element, et cetera mentioned in these pages is a protected and/or trademarked entity, the sole property of its owner(s), and acknowledgement of this status is implied.

Copyright © 1999-2024 by Sean Conner. All Rights Reserved.