New York Times sues OpenAI, Microsoft over use of its stories to train chatbots

In a suit filed Wednesday in the Southern District of New York in Manhattan, the New York Times said OpenAI and Microsoft are advancing their technology through the unlawful use of the newspaper’s work.
(Mark Lennihan / Associated Press)

The New York Times is striking back against the threat that artificial intelligence poses to the news industry, filing a federal lawsuit Wednesday against OpenAI and Microsoft seeking to end the practice of using its stories to train chatbots.

The Times says that the companies are threatening its livelihood by in effect stealing billions of dollars’ worth of work by its journalists, in some cases spitting out Times material verbatim to people who seek answers from generative artificial intelligence such as OpenAI’s ChatGPT. The newspaper’s lawsuit was filed in federal court in Manhattan.

OpenAI and Microsoft did not respond to requests for comment.

The media industry is one of many that could be upended by the rapid development of AI. Media organizations have already been pummeled by a migration of readers to online platforms, and although many publications have successfully carved out a digital space, AI could become a significant threat.

“These bots compete with the content they are trained on,” said Ian B. Crosby, partner and lead counsel at Susman Godfrey, which is representing the New York Times.

AI companies scrape information available online, including articles published by news organizations, to train generative AI chatbots. The large language models also are trained on a huge trove of other human-written materials, such as instructional manuals and digital books. That helps them to build a strong command of language and grammar and to answer questions correctly.

But the technology is still under development and still gets many things wrong. In its lawsuit, for example, the Times said OpenAI’s GPT-4 falsely attributed product recommendations to Wirecutter, the paper’s product reviews site, endangering its reputation.

OpenAI and other AI companies, including rival Anthropic, have rapidly attracted billions in investments since public and business interest in the technology exploded, particularly this year.

Microsoft has a partnership with OpenAI that allows it to capitalize on the company’s AI technology. The Redmond, Wash., tech giant is also OpenAI’s biggest backer and has invested at least $13 billion in the company since the two began their partnership in 2019, according to the lawsuit. As part of the agreement, Microsoft’s supercomputers help power OpenAI’s AI research and the tech giant integrates the startup’s technology into its products.

The paper’s complaint comes as the number of lawsuits filed against OpenAI for copyright infringement is growing. The company has been sued by a number of writers — including comedian Sarah Silverman — who say their books were ingested to train OpenAI’s AI models without their permission. In June, more than 4,000 writers signed a letter to the chief executives of OpenAI, Google, Microsoft, Meta and other AI developers accusing them of exploitative practices in building chatbots that “mimic and regurgitate” their language, style and ideas.

The lawsuit filed Wednesday claims that generative AI tools developed by OpenAI and Microsoft are closely summarizing content from the newspaper, mimicking its style and even reciting it verbatim. The complaint cited examples of OpenAI’s GPT-4 spitting out large portions of news articles from the New York Times, including a Pulitzer Prize-winning investigation into New York City’s taxi industry that was published in 2019 and took 18 months to complete. It also cited outputs from Bing Chat that it said included verbatim excerpts from Times articles.

The Times did not list specific damages that it is seeking, but said the legal action “seeks to hold them responsible for the billions of dollars in statutory and actual damages that they owe for the unlawful copying and use of The Times’s uniquely valuable works.”

Web traffic is an important component of the paper’s advertising revenue and helps drive subscriptions to its online site. The outputs from AI chatbots divert that traffic away from the paper and other copyright holders, the lawsuit says, making it less likely that users will visit the original source for the information.

Less traffic to the Times’ Wirecutter articles, for example, means fewer people clicking on affiliate links, which in turn means less revenue for the paper’s product review site.

The New York Times said it has never given anyone permission to use its content for generative AI purposes. The lawsuit also follows what appears to be a breakdown in talks between the newspaper and the two companies that began in April, and could be a way to kick-start talks on ending a business dispute.

The News/Media Alliance, a trade group representing more than 2,200 news organizations, applauded Wednesday’s action by the newspaper.

“Quality journalism and GenAI can complement each other if approached collaboratively,” said Danielle Coffey, alliance president and CEO. “But using journalism without permission or payment is unlawful, and certainly not fair use.”

In July, OpenAI and the Associated Press announced a deal for the artificial intelligence company to license AP’s archive of news stories. This month, OpenAI also signed a similar partnership with Axel Springer, a media company in Berlin that owns Politico and Business Insider. Under the deal, users of OpenAI’s ChatGPT will receive summaries of “selected global news content” from Axel Springer’s media brands. The companies said the answers to queries will include attribution and links to the original articles.

The newspaper has compared its action to a copyright lawsuit more than two decades ago against Napster, when record companies sued the file-sharing service for unlawful use of their material. The record companies won and Napster was soon gone, but the service had a major effect on the industry. Industry-endorsed streaming now dominates the music business.

AP writer Matt O’Brien contributed to this story.
