Wednesday, July 07, 2021
The search engine for text-heavy web sites
The Marginalia Search Engine (link via kontakt) is a fresh approach to search engines. Instead of PageRank, it scores pages by how much text they carry relative to their markup, which probably does a better job than Google for text-heavy sites:
As a consequence, the closer to plain text a website is, the higher it'll score. The more markup it has in relation to its text, the lower it will score. Each script tag is punished. One script tag will still give the page a relatively high score, given all else is premium quality; but once you start having multiple script tags, you'll very quickly find yourself at the bottom of the search results.
Modern web sites have a lot of script tags. The web page of Rolling Stone Magazine has over a hundred script tags in its HTML code. Its quality rating is of the order 10^-51%.
Marginalia Search - Notes on Designing a Search Engine
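To get a feel for how such a score behaves, here is a rough sketch in Python. It is not Marginalia's actual formula, which the quote doesn't spell out. The names `toy_quality` and `script_penalty`, the text-to-markup ratio, and the choice of halving the score per script tag are all my own assumptions, picked only to mimic the behaviour described above.

```python
from html.parser import HTMLParser


class _Counter(HTMLParser):
    """Counts visible text characters and <script> tags in a page."""

    def __init__(self):
        super().__init__()
        self.text_chars = 0
        self.script_tags = 0
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.script_tags += 1
            self._in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        # Inline script source is not readable text, so skip it.
        if not self._in_script:
            self.text_chars += len(data.strip())


def toy_quality(html, script_penalty=0.5):
    """Hypothetical score: share of the page that is text,
    multiplied by script_penalty once per <script> tag (assumed values)."""
    if not html:
        return 0.0
    counter = _Counter()
    counter.feed(html)
    counter.close()
    text_ratio = counter.text_chars / len(html)
    return text_ratio * script_penalty ** counter.script_tags


if __name__ == "__main__":
    page = "<p>" + "word " * 50 + "</p>"
    script = '<script src="a.js"></script>'
    for n in (0, 1, 10, 100):
        print(n, toy_quality(page + script * n))
```

Because the penalty in this sketch is multiplicative, one script tag only dents the score while a hundred of them push it toward zero, which is one plausible way to end up with ratings as tiny as the one quoted for Rolling Stone.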
The more markup, the lower the score. Add JavaScript and the score falls through the floor. Neat.
And from the few tests I ran, it seems to be a pretty decent search engine for what I'd use it for.