Search engines and implicit censorship?

Well, since I beat Slashdot to the punch with two of my last three posts, I feel no pangs of redundancy in quoting and linking a Slashdot article now.

"Dan Gillmor is reporting on the White House website's use of its robots.txt file to disable search engines from crawling certain material. Many excluded items in the robots.txt file involve mentions of Iraq, possibly to prevent people from finding changes to past statements and information when archived elsewhere."

Slashdot: White House Website Limits Iraq-related crawling.

The geekword-filter: a robots.txt file is a text file placed on a website by its administrator. It is not something a visitor to the site normally sees. But when a search engine "visits" the site to index its content, it first checks the robots.txt file to determine what it should and shouldn't look at.
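For the curious, here is roughly what that check looks like from the crawler's side. This is only a sketch using Python's standard `urllib.robotparser` module; the robots.txt contents, the `ExampleBot` user agent, the `example.com` host and the `allowed` helper are all made up for illustration and are not taken from the White House file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents. These paths are invented for
# illustration; they are not the entries from the White House file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /errors/404.html
"""


def allowed(path, user_agent="ExampleBot"):
    """Return True if a polite crawler may fetch `path` under ROBOTS_TXT."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, "http://www.example.com" + path)


if __name__ == "__main__":
    for path in ("/index.html", "/private/report.html", "/errors/404.html"):
        print(path, allowed(path))
```

Note that the file is only a request: a well-behaved search engine honours it, but nothing stops a person (or a rude robot) from opening the excluded pages directly.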

I use an equivalent of the robots.txt file to prevent Google from indexing this site's error pages, because nobody in their right mind would be searching for my 404 page, I figure. But using robots.txt as a method of restricting public access to public information is more than a little sinister.

Joseph | 29 Oct 2003

