COURT-RULING

NYT v OpenAI

Updated 2 May 2025

Tags: court-ruling, copyright, openai, usa

The case that could define the economics of the entire AI industry. The New York Times is suing OpenAI and Microsoft for training GPT models on its copyrighted journalism without permission.

If the NYT wins, every AI company that trained on internet data faces potential liability. If OpenAI wins, the ruling would strongly support treating AI training as fair use. Either way, the answer reshapes the industry.

The Facts

In December 2023, The New York Times filed suit against OpenAI and Microsoft in the US District Court for the Southern District of New York. The core allegations:

OpenAI used millions of NYT articles to train GPT models without permission or payment
ChatGPT can reproduce NYT content nearly verbatim when prompted
This constitutes copyright infringement at massive scale
It also undermines the NYT’s business model (why subscribe if the AI has the content?)

The NYT showed examples where ChatGPT reproduced paragraphs of NYT reporting word-for-word, which it presents as evidence that the models had memorised, not just “learned from,” the training data.
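
One way to make the difference between memorising and learning concrete is to measure how long a word-for-word run a model output shares with a source text. The sketch below is purely illustrative: the function names `longest_shared_run` and `verbatim_fraction` are hypothetical, the sample strings are invented placeholders rather than NYT or ChatGPT text, and nothing here reflects the method used in the actual complaint.

```python
from difflib import SequenceMatcher


def longest_shared_run(source: str, output: str) -> list[str]:
    """Return the longest run of consecutive words found in both texts."""
    a = source.lower().split()
    b = output.lower().split()
    # autojunk=False keeps frequent words from being ignored in long texts
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    match = matcher.find_longest_match(0, len(a), 0, len(b))
    return a[match.a : match.a + match.size]


def verbatim_fraction(source: str, output: str) -> float:
    """Share of the output's words covered by the longest shared run."""
    words = output.split()
    if not words:
        return 0.0
    return len(longest_shared_run(source, output)) / len(words)


# Invented placeholder texts (assumptions for illustration only)
article = "the quick brown fox jumps over the lazy dog near the river bank"
copied = "a model said the quick brown fox jumps over the lazy dog today"
paraphrase = "a fast fox leaps above a sleepy hound"

print(longest_shared_run(article, copied))
print(verbatim_fraction(article, copied))
print(verbatim_fraction(article, paraphrase))
```

A long shared run points to copied wording, while a paraphrase shares only isolated words. A measure like this says nothing about whether the copying is legally infringing; that is the question the court has to answer.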

The Question

Is training an AI model on copyrighted content “fair use” under US copyright law?

Fair use is the US legal doctrine that permits limited use of copyrighted material without permission for purposes like criticism, education, and research. The four-factor test:

Purpose — Is the new use “transformative”? (Does it create something new, or just copy?)
Nature of the original — Is the original creative or factual?
Amount used — How much of the original was taken?
Market effect — Does the new use harm the market for the original?

OpenAI argues AI training is transformative. The NYT argues it’s wholesale copying that undermines its business.

Where It Stands

The case is ongoing. Key developments:

OpenAI has moved to dismiss parts of the case
The court has allowed the core copyright claims to proceed
Discovery (exchange of evidence) is underway
Settlement discussions have been reported but no agreement reached
Trial date not yet set

Why It Matters

For AI companies: If training on copyrighted data is not fair use, the economic foundation of current AI models collapses. Every company would need to license training data — at costs that could be prohibitive.

For publishers: If it is fair use, publishers lose control of their content to AI companies that capture the economic value without compensation.

For the industry: The answer determines whether the current open-internet training paradigm continues, or whether AI development shifts to licensed, synthetic, and public-domain data.

For you: The outcome affects what AI can know, how much it costs, and who profits from information.

The Bigger Picture

This isn’t the only copyright case. Getty v Stability AI tests similar questions for images, and authors have filed multiple suits of their own. But NYT v OpenAI is the flagship: it has the most resources on both sides, the clearest factual record, and the highest profile.

The EU AI Act takes a different approach: it requires providers of general-purpose AI models to publish a summary of the content they trained on, giving rights holders information to enforce their rights. The US approach is litigation-first.

See AI Models for context on how training data shapes model capabilities.

Go Deeper

Court Rulings — All tracked cases
Getty v Stability AI — The image generation copyright case
OpenAI — The company at the centre
Training & Fine-Tuning — How AI training works (and why it needs so much data)
EU AI Act — Europe’s regulatory approach to training data transparency
Legal & Compliance — The full legal landscape

Sources

NYT Original Complaint (PDF) — Primary source
Reuters — Case Tracking — Ongoing coverage
EFF — AI and Copyright — Digital rights perspective