Monday, May 15, 2000
Stripping strips from a website
I started reading a new on-line strip, Player Versus Player. It seems promising, but I'd like to read the archive, which reaches back to May of 1998. That's two full years of strips to go through.
It's a simple enough matter to write a program that downloads the entire archive of strips:
    while (1) {
        /* Name the local file YYYYMMDD.gif and build the matching archive URL. */
        sprintf(filename, "%d%02d%02d.gif", year, month, day);
        sprintf(url, "http://www.pvponline.com/archive/%d/pvp%s", year, filename);
        sprintf(cmd, "lynx -source %s >%s", url, filename);
        system(cmd);
        sleep(10); /* be nice on their server */

        /* Advance to the next calendar day. */
        day++;
        if (day > daysinmonth(year, month)) {
            day = 1;
            month++;
            if (month > 12) {
                month = 1;
                year++;
            }
        }

        /* Stop once the date reaches today, whatever the month. */
        if (isthistoday(year, month, day))
            break;
    }
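The loop leans on two helpers, daysinmonth() and isthistoday(), that aren't shown. Here is a minimal sketch of what they might look like; the bodies are my guess at the obvious implementation (Gregorian leap-year rule, local time from the C library), not necessarily what the real program would use.

```c
#include <time.h>

/* Assumed helper: number of days in the given month (1-12),
   using the Gregorian leap-year rule. */
int daysinmonth(int year, int month)
{
    static const int days[] = { 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31 };
    int leap = (year % 4 == 0 && year % 100 != 0) || (year % 400 == 0);

    if (month == 2 && leap)
        return 29;
    return days[month - 1];
}

/* Assumed helper: nonzero if year/month/day is today's date in local time. */
int isthistoday(int year, int month, int day)
{
    time_t now = time(NULL);
    struct tm *tm = localtime(&now);

    return tm != NULL
        && year  == tm->tm_year + 1900   /* tm_year counts from 1900 */
        && month == tm->tm_mon + 1       /* tm_mon is 0-based */
        && day   == tm->tm_mday;
}
```

With these in place, the loop only needs year, month, and day initialized to the date of the first strip, plus buffers for filename, url, and cmd.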
I feel somewhat odd about doing that, though, seeing how they get their revenue through advertising (not that I agree that's the best way to make money, but that's beside the point). There's also this: if they check their logs and see a bunch of requests for just the strips, one every 10 seconds, I could end up banned from their server, and I don't want that in case I do end up liking the strip.