A federal judge has sided with Anthropic in a major copyright ruling, declaring that artificial intelligence developers can train on published books without authors’ consent.
The decision, filed Monday in the U.S. District Court for the Northern District of California, sets a precedent that training AI systems on copyrighted works constitutes fair use. Though it doesn’t guarantee other courts will follow, Judge William Alsup’s ruling is the first among dozens of ongoing copyright lawsuits to answer the fair use question in the context of generative AI.
It’s a question creatives across various industries have raised in the years since generative AI tools exploded into the mainstream, letting users easily produce art from models trained on copyrighted work, often without the human creator’s knowledge or permission.
AI companies have been hit with a slew of copyright lawsuits from media companies, music labels and authors since 2023. Artists have signed multiple open letters urging government officials and AI developers to constrain the unauthorized use of copyrighted works. In recent years, companies have also increasingly inked licensing deals with AI developers to dictate terms of use for their artists’ works.
Alsup on Monday ruled on a lawsuit filed last August by authors Andrea Bartz, Charles Graeber and Kirk Wallace Johnson, who claimed that Anthropic ignored copyright protections when it pirated millions of books and digitized purchased copies to feed into its large language models, training them to generate human-like text responses.
“The copies used to train specific LLMs were justified as a fair use,” Alsup wrote in the ruling. “Every factor but the nature of the copyrighted work favors this result. The technology at issue was among the most transformative many of us will see in our lifetimes.”
His decision stated that Anthropic’s use of the books to train its models, including versions of its flagship AI model Claude, was “exceedingly transformative” and therefore qualified as fair use.
Fair use, as defined by the Copyright Act, takes into account four factors: the purpose of the use, what kind of copyrighted work is used (creative works get stronger protection than factual works), how much of the work was used, and whether the use hurts the market value of the original work.
“We are pleased that the Court recognized that using ‘works to train LLMs was transformative — spectacularly so,’” Anthropic said in a statement, quoting the ruling. “Consistent with copyright’s purpose in enabling creativity and fostering scientific progress, ‘Anthropic’s LLMs trained upon works not to race ahead and replicate or supplant them — but to turn a hard corner and create something different.’”
Bartz and Johnson did not immediately respond to requests for comment. Graeber declined to comment.
Alsup noted, however, that all of the authors’ works contained “expressive elements” earning them stronger copyright protection, a factor that weighs against fair use, though not enough to sway the overall ruling.
He added that while making digital copies of purchased books was fair use, downloading pirated copies for free was not.
But aside from the millions of pirated copies, Alsup wrote, copying entire works to train AI models was “especially reasonable” because the models didn’t reproduce those copies for public access, and doing so “did not and will not displace demand” for the original books.
His ruling stated that although AI developers can legally train AI models on copyrighted works without permission, they should obtain those works through legitimate means that don’t involve pirating or other forms of theft.
Despite siding with the AI company on fair use, Alsup wrote that Anthropic will still face trial over the pirated copies it used to build its massive central library of books for training AI.
“That Anthropic later bought a copy of a book it earlier stole off the internet will not absolve it of liability for the theft,” Alsup wrote, “but it may affect the extent of statutory damages.”