Though Brewster Kahle works in an unassuming office and projects a laid-back Bay Area aura, he has outsized ambitions. "The idea is to build the great library," the founder and director of the Internet Archive boldly proclaims. The grandiosity of the statement is only slightly undercut by the fact that Kahle isn't wearing shoes.

But Kahle is not one to be doubted. An MIT-educated computer geek, Kahle, 46, made millions from two internet start-ups (one sold to AOL, the other to Amazon.com). In 1996, he founded the nonprofit, library-building Internet Archive. As early as the 1980s, Kahle says, he realized that all the world's information could be digitized and collected. It was just a matter of harnessing the technology and organization to do it.

The Archive started by collecting information that's already digitized (on the web) by taking bimonthly "snapshots" of the entire internet (now over 4 billion websites) and daily shots of major sites. All are made available on the Archive's Wayback Machine, which turns your web browser into a time-travel device, pulling up, say, The Washington Post website from April 16, 1999, as if it were today. While Kahle concedes that many of the archived websites are Chinese teenagers' MySpace pages, he sees creating a historical record as more than an effete academic exercise.
"You need third-party archiving," he says, "because people don't archive themselves very well." With the Archive's help, powerful institutions have been caught "web-scrubbing," revising sensitive information to purge the historical record of inconvenient facts. WhiteHouse.gov, for example, airbrushed its archived press release from President Bush's 2003 "Mission Accomplished" speech: Originally named "President Bush Announces Combat Operations in Iraq Have Ended," it became "President Bush Announces Major Combat Operations in Iraq Have Ended." More recently, the Vatican website was outed for doctoring the transcript of the pope's controversial Regensburg speech in 2006, which had prompted protests from Muslims around the world.

The Archive's staff is also hard at work on a more ambitious project: digitizing the nondigital (i.e., books) and creating a freely available repository of the world's literature. "We're scanning books and making them openly available, in contrast to some of the commercial projects going on," says Kahle. One of the "commercial projects," in this case, is Google's better-known for-profit book-scanning operation, a collaboration with libraries including Harvard's and Oxford's. While some are content to take Google's mission to "do no evil" at face value, Kahle subscribes to the power-corrupts-and-absolute-power-corrupts-absolutely philosophy. "They're a media conglomerate," he says. "They started out as search. 'Oh, we're just search.' It's like, sheeee-yeah," he says, Valley-Girling for emphasis. "They're a media conglomerate and a major one at that."

When Google scans a work, the company owns the copyright on the digital edition, Kahle explains. And this troubles him. "We like the idea of having books available online but not the idea of having perpetual restrictions on the public domain, which is what they're doing. They're locking up the public domain."
Once Google scans a book, "you'd only be able to get it from Google, under Google's terms, until that division is sold to some other publishing conglomerate." To counter this with his own nonprofit digital library, Kahle has enlisted a rival slate of major libraries, including the Library of Congress, and has begun scanning. The Archive has scanned 200,000 books so far, and it scans tens of thousands more each month.

If Kahle succeeds, one day people all over the world will be able to access all the world's information, for free, from home, in their most comfortable pair of socks.