AI training’s “fair use” wins over books, but court leaves piracy bill on tech’s table
By willowt // 2025-06-27
 
  • Federal judge rules Anthropic’s use of lawfully purchased books for AI training qualifies as “fair use.”
  • Court condemns Anthropic’s downloading of more than 7 million pirated books, with a trial on damages set for December.
  • Court likens AI learning to human education but penalizes systematic infringement.
  • Authors Guild condemns ruling as harmful to creators, citing unprecedented commercial copying.
  • Legal battle signals evolving tensions between AI innovation and intellectual property rights.
In a landmark copyright decision, U.S. District Judge William Alsup ruled on June 25, 2025, that AI company Anthropic’s use of millions of lawfully acquired physical books to train its Claude chatbot constitutes “fair use” under U.S. law. However, the judge also condemned Anthropic’s downloading of more than seven million pirated books from unlicensed digital libraries, clearing the way for a trial on the company’s liability for “criminal-level” copyright violations. The case sets an important marker for how AI developers can ethically and legally build their systems amid an explosion of litigation against tech giants like Meta and OpenAI.

The “transformative” use defense

Judge Alsup’s ruling, the first of its kind applying copyright law to AI training datasets, centered on two key arguments. First, he held that Anthropic’s digitization of legally purchased print books via “destructive scanning,” in which pages are stripped from the books and scanned, qualified as transformative under Section 107 of the Copyright Act. “The mere conversion of a print book to a digital file to save space and enable searchability was transformative,” he wrote, comparing the process to placing a book in a central library. Second, he equated AI training to human learning: “Everyone reads texts, then writes new texts,” he noted, warning that forcing firms to pay royalties “each time they read it” would stifle broader creativity. This reasoning secured Anthropic’s right to train models on purchased books, despite authors’ allegations that their works were used to build AI that might compete with them commercially.

Piracy and legal setbacks for Anthropic

However, the ruling did not shield the company from scrutiny of its prior conduct. Early in its operations, relying on pirated repositories like LibGen and Pirate Library Mirror, Anthropic allegedly downloaded 196,640 books in 2021, a figure that grew to seven million by 2022. The judge accused Anthropic of blithely prioritizing profit over legality, noting that CEO Dario Amodei and co-founder Ben Mann intentionally bypassed licensing deals to avoid a “business slog.” “The defendants had no entitlement to use pirated copies for their central library,” Alsup wrote, denying immunity even though Anthropic switched to legally acquiring books in 2024. The judge ordered a December trial to assess statutory damages for willful infringement, including whether damages should reflect Anthropic’s reported $1 billion annual revenue.

Escalating battles over AI and intellectual property

The Authors Guild, filing as an intervenor, criticized the ruling as “flawed” for equating AI’s mass digitization with human learning. “When humans learn from books, they don’t store digital copies forever for commercial gain,” said the group, arguing the decision risks eroding authors’ economic rights. Meanwhile, Anthropic defended its pivot to ethical sourcing, earning praise for hiring Tom Turvey, a Google Books veteran, to purchase physical books en masse. Yet critics dismissed this as a PR gesture, noting that digital copies of pirated books remain in Anthropic’s library.

The case underscores a fraught tension between innovation and property rights. While tech firms claim AI requires broad data access to thrive, creators and publishers argue copyright law should not allow “strip-mining” of human ingenuity. As legal challenges to Meta, OpenAI and others loom, this trial will shape whether the industry adopts a “license-to-train” standard or carves out new loopholes.

A precarious precedent ahead

The outcome hangs on both law and principle. While Alsup’s decision shields AI training with lawfully purchased works, the piracy trial may force Anthropic to confront the costs of its earlier shortcuts. For authors, it is a mixed result: the ruling permits AI training on lawfully purchased books with one hand, and with the other penalizes tech for unlawfully hoarding pirated digital libraries. As Judge Alsup noted, the case mirrors classic copyright struggles over new technologies, be it photocopiers, VCRs, or search engines. Yet this time, the stakes involve artificial intelligence’s capacity to both mimic and eclipse human creativity. The ruling may well influence whether regulators worldwide impose stricter licensing or promote a “transformative” free-for-all. For now, its complexity ensures this is not the last word but a critical step in balancing progress with accountability.

Sources for this article include:

ReclaimTheNet.org
PublishersWeekly.com
APnews.com