Creating a digital public library without Google’s money


Say what you want about Google — whether you believe it invariably adheres to its motto “Don’t be evil” or you suspect that its true goal is world domination — the firm’s behavior certainly has a way of shining the spotlight on the most important technological issues in our lives.

These include secrecy, privacy and now, in connection with a long-running legal fight in which a New York federal judge last week dealt Google a huge defeat, copyright law.

Judge Denny Chin threw a wrench into six years of litigation by tossing out a 165-page settlement reached in 2008 between Google and authors and publishers groups.


At issue was Google’s plan to create a global digitized library to “unlock the wisdom” imprisoned in the world’s out-of-print books, as its co-founder Sergey Brin described the project in 2009.

Like other authors and researchers, I’m conflicted about the project. On the plus side, the vision of a widely accessible digital library is a worthy one that is, for the first time in human history, technologically achievable.

On the other hand, Google was plotting to acquire effective control over millions of works whose copyrights belong to others.

The Google books case began as a narrow legal dispute but broadened out, like an umbrella unfurled in the rain, into an effort to provide a shelter for a huge, monopolistic profit-oriented corporate enterprise.

The original lawsuit dealt with Google Book Search. The company announced in 2004 that it had made searchable digital copies, or scans, of millions of books contributed by Stanford, Harvard, the University of Michigan and other institutional libraries.

Type a search term into your Web browser, and Google would display “snippets” of its scanned books containing your term. Since many of those books were still under copyright, the Authors Guild and the Assn. of American Publishers sued Google for copyright infringement.


Google’s defense was that the snippets fell under the “fair use” exemption in copyright law, a very murky provision allowing limited use of works, without permission, for comment and criticism, news reporting, scholarship and research.

Had the settlement been limited to that issue, it might have gained Chin’s approval and performed a public service besides by clarifying the fair use exemption for digital indexing — for example, the judge might have set a standard for how big a snippet and how many words can be displayed without permission.

But the document went much further. The settlement created a safe harbor for the vast digital bookstore Google hoped to create out of a digital hoard that so far comprises about 12 million volumes, or nearly 10% of the world’s published library.

The settlement would have allowed Google to continue scanning and offer access to the results for a fee.

The company was to pay $45 million into a settlement fund for authors whose copyrighted books it had already scanned without permission. But infringement wasn’t an issue for many books. Google or anyone else can copy and display the text of those out of copyright, such as the works of Charles Dickens.

Books under copyright and still in print — and therefore whose rights holders are not a mystery — are subject to deals Google makes with their publishers or authors, typically allowing the display of limited chunks, such as several pages, at a time.


The sticking point was “orphan books” — those copyrighted but out of print, and whose rights holders can’t be found or identified. Google executives have portrayed their effort as one that would give these forgotten or overlooked tomes a new lease on life.

The settlement required the company to fund an independent registry which would, among other things, oversee interests in yet-unclaimed works and hold payments from Google for their exploitation.

Any author, including the parents of orphaned works when and if they surfaced, could opt out of Google’s digital scanning on request. But that reverses the burden of existing copyright law, which forbids use of a work unless its owners give affirmative permission.

Critics observed that the deal would have given the company a huge advantage in the digital marketplace by validating its strategy of scanning books first and worrying about copyrights later.

Google digitized material the ownership of which was unclear “in calculated disregard of authors’ rights,” observed copyright lawyer Robert Kunstadt in testimony cited by Chin. “Its business plan was, ‘So, sue me.’” Rivals who went through the tough process of tracking down owners before scanning their books thus were left in the dust.

Judge Chin concluded that rewriting copyright law is a task that belongs in the halls of Congress, not a courtroom, hinting that he couldn’t have approved the settlement even if he wanted to.


But his decision places the spotlight on several questions about the digital present and future.

One is: How to advance the goal of a digital public library without Google’s deep pockets?

The rejection of the Google settlement has raised the profile of a leading alternative being promoted by Robert Darnton, a Harvard history professor and director of the university library, and a long-term critic of the Google settlement.

Darnton’s idea is for charitable foundations to fund a digital analogue to the Library of Congress, freely available to all citizens and accessible to anyone within reach of the Internet. The Alfred P. Sloan Foundation has agreed to play a leading role.

“Now that the settlement seems to have unraveled, this looks like a serious alternative,” Darnton told me.

Darnton’s proposal would eliminate the problems of entrusting a major archival project to an entity whose main purpose is commercial, not scholarly. The settlement would have required Google to provide the participating university libraries with a free digital copy of its scanned out-of-print books. But it also would have allowed Google to restrict its use by faculty members to reading, printing, or downloading no more than five pages for free — and only once per person each academic term. For greater access, the institution would have had to buy a subscription.

Even a public digital library might need legal help dealing with orphan books. Chin’s advice to refer the issue to Congress ignores the question of whether Congress is up to the job. As recently as 2008 a bill to fill the orphan-books gap sank without a trace in the House.


Chin’s ruling may well provoke Google to pressure Congress to solve the problem so it can proceed with its own project.

But there it will face counter-lobbying by publishers, film studios and record labels. “Those content industries don’t like any proposal seen as weakening copyright,” says Peter Jaszi, an expert in copyright law at American University.

The Google books case now looks like a salvage operation for the dream of a digital library.

“There were many things in the settlement that were innovative and useful, and I’d be sorry to see lost,” remarks Lewis Hyde, the author of “Common as Air,” a recent book about copyright in the digital era.

Judge Chin’s decision forces us — or allows us — to ponder the dream of a digital library without ceding our future to Google.

Michael Hiltzik’s column appears Sundays and Wednesdays. Follow @latimeshiltzik on Twitter.