New York Times Sues OpenAI and Microsoft for Using Millions of Articles to Train Chatbots

The New York Times has sued OpenAI and Microsoft over copyright infringement, seeking to end the companies’ practice of using its stories to train chatbots.

The newspaper filed a lawsuit in the United States federal court in Manhattan on Wednesday, alleging that the companies’ powerful artificial intelligence (AI) models were trained on millions of its articles without permission. It said the copyright infringements against the paper alone could be worth billions of dollars.

The Times said OpenAI and Microsoft are advancing their technology through the “unlawful use of The Times’s work to create artificial intelligence products that compete with it”, a practice it said “threatens The Times’s ability to provide that service”.

Through their AI chatbots, the companies “seek to free-ride on The Times’s massive investment in its journalism by using it to build substitutive products without permission or payment”, the lawsuit said.

The Times, one of the most respected news organisations in the United States, is seeking damages as well as an order that the companies stop using its content – and destroy data already harvested.

While no sum is specifically requested, the Times alleged that the infringement could have cost “billions of dollars in statutory and actual damages”.

Confrontational approach

With the suit, The New York Times chose a more confrontational approach to the sudden rise of AI chatbots, in contrast to other media groups, such as Germany’s Axel Springer or The Associated Press, which have struck content deals with OpenAI.

Microsoft, the world’s second-biggest company by market capitalisation, is a major investor in OpenAI and swiftly built AI capabilities into its own products after the release of ChatGPT last year.

The AI models that power ChatGPT and Microsoft’s Copilot (formerly Bing Chat) were trained for years on content available on the internet, on the assumption that it could fairly be used without compensation.

But the lawsuit argued that the unlawful use of the Times’s work to build artificial intelligence products threatened its ability to provide quality journalism.

“These tools were built with and continue to use independent journalism and content that is only available because we and our peers reported, edited and fact-checked it at high cost and with considerable expertise,” a spokesperson for the Times said.

The Times said it reached out to Microsoft and OpenAI in April to raise concerns about the use of its intellectual property and reach a resolution on the issue.

During the talks, the newspaper said it sought to “ensure it received fair value” for the use of its content, “facilitate the continuation of a healthy news ecosystem and help develop GenAI technology in a responsible way that benefits society and supports a well-informed public”.

“These negotiations have not led to a resolution,” the lawsuit said.

The lawsuit said that content generated by ChatGPT and Copilot closely mimicked the style of The New York Times and that the paper’s content was given privileged status in perfecting the chatbot technology.

It also said that content that proved to be false had been incorrectly attributed to The New York Times.