Friday, January 30, 2015

The Internet Archive's Wayback Machine

When it comes to archiving the Internet, the Internet Archive has consistently done the best job.

For a while, Google pursued the same goal. "Google wrote its mission statement in 1999, a year after launch, setting the course for the company’s next decade: 'Google’s mission is to organize the world’s information and make it universally accessible and useful.' For years, Google’s mission included the preservation of the past."

In keeping with this mission, Google took some significant steps:
"In 2004, Google Books signaled the company’s intention to scan every known book, partnering with libraries and developing its own book scanner capable of digitizing 1,000 pages per hour. In 2006, Google News Archive launched, with historical news articles dating back 200 years. In 2008, they expanded it to include their own digitization efforts, scanning newspapers that were never online."

But Google's priorities have since shifted. "In the last five years, starting around 2010, the shifting priorities of Google’s management left these archival projects in limbo, or abandoned entirely. Two months ago, Larry Page said the company’s outgrown its 14-year-old mission statement. Its ambitions have grown, and its priorities have shifted. Google in 2015 is focused on the present and future. Its social and mobile efforts, experiments with robotics and artificial intelligence, self-driving vehicles and fiberoptics."

Thank goodness, then, that we have The Internet Archive. "The Internet Archive is mostly known for archiving the web, a task the San Francisco-based nonprofit has tirelessly done since 1996, two years before Google was founded. The Wayback Machine now indexes over 435 billion webpages going back nearly 20 years, the largest archive of the web."

Other things available on the Internet Archive include:

  • Books. One of the world’s largest open collections of digitized books, over 6 million public domain books, and an open library catalog.
  • Videos. 1.9 million videos, including classic TV, 1,300 vintage home movies, and 4,000 public-domain feature films.
  • The Prelinger Archives. Over 6,000 ephemeral films, including vintage advertising, educational and industrial footage.
  • Audio. 2.3 million audio recordings, including over 74,000 radio broadcasts, 13,000 78rpm records, and 1.7 million Creative Commons-licensed audio recordings.
  • Live music. Over 137,000 concert recordings, nearly 10,000 from the Grateful Dead alone.
  • Audiobooks. Over 10,000 audiobooks from LibriVox and more.
  • TV News. 668,000 news broadcasts with full-text search.
  • Scanning services. Free and open access to scan complete print collections in 33 scanning centers, with 1,500 books scanned daily.
  • Software. The largest collection of historical software in the world.

Over the years, the Internet Archive has been particularly useful for retrieving web materials that authors cited but that are no longer available at their original addresses. With an archived copy, we can still verify the information as it existed when the author cited it. I have also used the Internet Archive to find old legislative history documents.
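
For anyone who wants to automate that kind of citation check, here is a minimal sketch in Python. It assumes the Wayback Machine's public availability endpoint at archive.org/wayback/available, which returns the archived capture closest to a given date. The function names are my own, chosen only for illustration, and example.com stands in for whatever page was cited.

```python
import json
from datetime import date
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen

# Public availability endpoint of the Wayback Machine.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(url: str, cited_on: date) -> str:
    """Build the request URL for the capture closest to the citation date."""
    params = urlencode({"url": url, "timestamp": cited_on.strftime("%Y%m%d")})
    return f"{AVAILABILITY_ENDPOINT}?{params}"


def closest_snapshot(payload: dict) -> Optional[str]:
    """Return the closest capture's URL from a decoded response, or None."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def find_cited_page(url: str, cited_on: date) -> Optional[str]:
    """Look up the capture nearest the citation date (requires network access)."""
    with urlopen(availability_query(url, cited_on), timeout=10) as response:
        return closest_snapshot(json.load(response))
```

Calling find_cited_page("example.com", date(2010, 3, 15)) would return the address of the nearest archived copy, or None if the page was never captured. The capture found may be from some days or months before or after the citation date, so it is worth checking the timestamp in the returned address before relying on it.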
