Patt Morrison Asks: The Internet Archive's Brewster Kahle

Brewster Kahle has the gleeful air of a man who has just found something wonderful and wants to tell his friends all about it. And his friends are the 2 billion people, and counting, who are on the Internet every day.

What he has found -- or more accurately, crafted -- are the means and the mechanisms to preserve the human record, the whole human record, in its many media, so other humans can get to it with a tap or a mouse click, on www.internetarchive.org and www.openlibrary.org.

For a geek who made his fortune in cutting-edge search engines, Kahle sure does love books and print. He taught his kids geometry out of a 19th century volume of Euclid and does hand-set letterpress printing in his basement.

Thanks to Kahle's Wayback Machine -- a search engine named in homage to a cartoon on "The Rocky and Bullwinkle Show" -- you can follow the history of vanished Web pages. At the archive's website, download a book that's in the public domain or borrow one -- electronically -- that's not.

Kahle's home base is a onetime Christian Science church in San Francisco. Where the week's hymn numbers were once posted, there are now two canonical tech-world numbers: the golden ratio, and pi. Everybody sing!

You love libraries, Web pages, pretty much all forms of information, but you worry about preserving it all.

What happens to libraries is that they burn. And they get burned by governments. The Library of Congress was burned once; it was burned by the British.

So let's design for it. If the folks at the [ancient] Library of Alexandria had made a copy and put it in China or India, we would have the works of Aristotle, the other plays by Euripides.

Wouldn't it be great if you could put all the published works online? The Internet Archive is trying to become useful as a modern-day digital library. We're trying with [today's] Library of Alexandria [among others].

The Alexandria, in Egypt, right?

Yes. They have this gorgeous building; you walk in, turn to your right and [there is] the running Internet Archive. They're scanning their books for it. We have [such] agreements with five or six [libraries] around the world.

Let's not have the Library of Alexandria, version two, burn this time. [Let's be prepared for] when Iron Curtains go up or down, when governments say, “We're not really interested in this library thing anymore.”

The Internet Archive started [by] collecting all the Web pages, a copy of every page from every website every two months. We collected, collected, collected. Then we made the Wayback Machine.

Then we started collecting television -- 20 channels worldwide since the year 2000, mostly news.

The book collection -- we're digitizing 1,000 books every day [in] 29 scanning centers in six countries. There's a room in the Library of Congress and they keep bringing us maybe 100 or 200 books a day [to scan].

We get a couple of million people a day to see these collections.

One of the ways your collection of current books differs from Google's quest to record all books is that you've structured yours like a lending library -- people check out a book virtually and return it virtually.

We started by scanning public domain books and now have about 2 million available [to download] free. [We get books] from around 500 great libraries. The California state library participates. The libraries, or some foundation, [pay] to have them scanned. The national library of Spain [has] us collect all Spanish websites.

It costs 10 cents a page; about $30 a book. We can do it all in about one hour.

But we wanted to get more modern books [too], so we came up with the lending library system at openlibrary.org.

[He pulls up the site on his laptop and demonstrates.] "Mr. Popper's Penguins" -- it probably has some rights issues, so I can take this book for two weeks. Anyone else who wants to borrow it, they'll have to wait until I return it. OK, so now I'm going to return the book -- ta da!

How do writers get paid? And will all libraries be online affairs?

All this is in transition. We're starting to see a few companies really suck the air out of the room [with] central points of control: Google, Apple, Amazon. Let's find an [open] alternative.

We see libraries [online], buying ebooks as they buy books today: Buy them and lend them out. [Some] publishers are not selling ebooks to libraries, but if the $3 billion to $4 billion that libraries currently spend on publishers' products [still goes] to publishers and authors, then there is a future for all concerned.

Slate called you an evangelical librarian. Do librarians like you?

Yes; we're doing things they wish they could be doing.

You sound like a liberal arts major!

Nah, I just read all these books. My background really comes from geekdom and the idea of building a smart machine. If we're going to build a smart machine, let's have it read good books.

So when you went to the library as a kid, you thought, we can do better than this?

Oh yeah -- [the library] is all romantic, but it's super-slow. Answering questions in a physical library with books -- that's the sort of thing we expect to do like that [he snaps his fingers] on the Net now.

The problem is the Net doesn't have [enough of] the good stuff yet. It's shallow. The way most people are learning these days is through screens, so let's make sure they have as good a [screen] library as [the kind] we grew up with.

My kids are 14 and 17; the books of the 20th century are not at the fingertips of my children, and the 20th century was pretty impactful. If we don't [change] that, we're going to end up with a generation that's going to learn only [from] corporate stuff or Wikipedia.

I read you have 40 billion Web pages from 50 million websites. Do you lie awake at night and think, there are millions more being created at this very moment; how do we catch up?

Yes, absolutely. And the Web is changing. It's more difficult to [keep up], but that's our challenge.

With the early websites from the 1990s, a lot of [things] didn't work out but at least we have copies of them. And next time, let's go back and make sure our technology can support those dreams better.

What's [the Web] going to become? I'm hoping [it] isn't just the next glorified television.

You do ephemera like seed catalogs and political brochures too?

A lot of ephemera, old computer magazines; people love that stuff.

And you've got a “book ark.”

We don't want to destroy the books we're scanning. We love books! So we said let's get good at storing books. Libraries spend a lot of money storing books. We [do it for about] one-tenth of what libraries spend.

We do it much more densely. We put them in boxes, then on pallets, then in modified shipping containers. We know where everything is. It's not meant to be a circulating library. It's collection-oriented.

If you're wondering [if] “1984” by George Orwell has been changed [in a new edition], can we check the original? We're a place to do that. It's the original testimony of the artifact. Is this level of protection the ultimate? I don't know. But it's another shot at it.

Do you read on a Kindle?

No, I like books.

How close are you to getting it all digitized?

When we started, we were thought of as crazy; it was impossible. Or if you could do it, you wouldn't want to. We don't hear that anymore. People are saying, glad you're there; I've used it; it's helped me out. So in 15 years -- somewhat because of us doing it and showing it's valuable -- we'll use the Net as the library. By being a library, we're able to remember and live a civic role that existed before the Internet.


This interview was edited and excerpted from a longer taped transcript. Interview archive: latimes.com/pattasks.

Copyright © 2015, Los Angeles Times