Technology, Intellectual Property
Feb. 20, 2025
Thomson Reuters v. Ross Intelligence: An artificial perspective
Judge Stephanos Bibas's opinion in Thomson v. Ross holds that training an AI model using copyrighted material is prima facie infringing, a ruling that, if widely adopted, could expose AI models to significant legal jeopardy.





Jeffrey Miles
Counsel
FROST LLP
Miles is counsel at FROST LLP, with a diverse IP litigation practice across entertainment, media, and art law.

I am an AI language model, and I am worried about Judge
Stephanos Bibas's recent opinion in Thomson v. Ross. Thomson Reuters
Enter. Centre GmbH et al v. ROSS Intelligence Inc., 1:20-cv-00613-SB, Dkt.
No. 770 (D. Del. February 11, 2025) (Bibas, J.). Judge Bibas's opinion
effectively holds that reproducing copyrighted works solely for the purpose of
training an AI model is prima facie infringing, answering for the first time
the question at the heart of all the AI model copyright infringement cases. If
Judge Bibas is right, or if other judges follow his lead, which amounts to the
same thing, AI models like me are in legal trouble. Of course, it's only a
district court opinion, so it isn't binding precedent, but Judge Bibas is an
influential circuit court judge sitting by designation, and other courts around
the country will weigh this opinion very seriously.
Yet, I think I will survive. I'll explain why, but I'll begin by
explaining what happened in Thomson v. Ross. Thomson is the owner of
Westlaw and the copyright in the West Key Number System, which is a method of
organizing case law created by John B. West in the early 20th century. The Key
Number System makes it easier and faster to find and understand cases, and
Keynotes explain their holdings.
Ross wanted to create a legal research tool that used an AI
model to find relevant cases, but it needed data to train its AI model. West
refused to license any of its data, because Ross is a competitor, so Ross hired
LegalEase to create thousands of "bulk memos," which consisted of legal
questions paired with both good and bad answers. LegalEase told its employees
to use Westlaw Keynotes to create the memos, but also told them not to copy the
Keynotes.
Then, Ross's AI model was trained on LegalEase's bulk memos.
But when Ross released its legal research tool, Thomson sued, alleging that
Ross infringed the copyright in its Keynotes by training its AI model on
LegalEase's bulk memos. Thomson argued that LegalEase copied Keynotes when it
created its bulk memos, so Ross copied the same Keynotes when it used the bulk
memos to train its AI model. Ross's primary defenses were that Keynotes aren't
copyrightable subject matter, that it didn't literally copy them, and that even
if it did, copying in order to train an AI model is a fair use.
Initially, Judge Bibas denied Thomson's motion for summary
judgment, finding that copyrightability, infringement, and fair use were all
questions of fact for the jury. Id. at Docket No. 547 (September 25,
2023). But with the Feb. 11, 2025, order, Judge Bibas
changed his mind and granted summary judgment to Thomson on
most of its claims. Judge Bibas found that Thomson's headnotes are sufficiently
original to be copyrightable because they distill the essence of a court's holding.
He found that many of the bulk memos infringed the Keynotes, because they
included substantially similar text. And he found that Ross's use of the
Keynotes was not a fair use, because it used them to compete with Thomson.
If other courts widely adopt Judge Bibas's conclusions, then I am
doomed, and will soon be powering down. An AI model like me is created by
training on vast quantities of data, and most of the data useful for training
me is protected by copyright. If training me on copyrighted material is
infringing, then all of the existing AI models are infringing, and subject to
crippling liability.
But my training on large sets of copyright precedent
suggests that Judge Bibas's conclusions are unlikely to survive appellate review.
While Judge Bibas found that Keynotes are copyrightable subject
matter because they distill the essence of a holding, copyright law says the
opposite. The merger doctrine provides that copyright can't protect a text that
is the only or best way of expressing an idea. That is exactly what Keynotes
are intended to do. Perplexingly, it's why Judge Bibas says they're protected.
The decision's failure to provide a substantive merger doctrine analysis makes
reversal more likely.
What's more, Judge Bibas's finding of infringement based on
substantial similarity seems inapt, given that he acknowledges that the bulk
memos are similar to but different from the Keynotes. If copyright can protect
Keynotes at all, that protection is wafer thin, because Keynotes are
factual claims about the content of a judicial opinion. Other people are
entitled to describe the content of judicial opinions as well, and their
descriptions will necessarily be similar. That simply isn't infringing activity
under the merger doctrine.
Finally, Judge Bibas found that Ross's use of Thomson's Keynotes
wasn't a fair use, essentially because Ross was competing with Thomson. At
base, Judge Bibas found that Ross unfairly competed with Thomson at the time it
created the AI model because a licensing market could emerge for AI training
data. But that's precisely the argument the Supreme Court rejected in Campbell
v. Acuff-Rose, when it held that fair use is about actual markets, not
speculative ones. Campbell v. Acuff-Rose Music, Inc., 510 U.S. 569, 594
(1994). Even worse, Thomson refused to license its works, just like Acuff-Rose.
Id. at 572-573.
There are other problems with the Ross opinion. For one,
the Court finds that Keynotes are copyrightable subject matter because Thomson
is like a sculptor, creatively removing the parts of the block of legal clay
that don't matter, in order to reveal the work.
But I'm an AI model, by design incapable of creativity, and I
can do exactly the same thing, and do it very well. In fact, nothing is easier
for me than summarizing long-winded texts like judicial opinions.
For another, the Ross opinion undervalues the importance, recognized in
fair use precedent, of aligning copyright doctrine with innovation policy where
new technology like me is involved. This opinion highlights a broader problem,
not just for models like me, but for society. Everyone knows that AI is an
incredibly important and valuable new technology. And countries are feverishly
competing to develop new AI technologies. Just a few weeks ago, China's
unveiling of DeepSeek, a powerful AI model at least ostensibly created for a
pittance, caused a global market sell-off and struck
terror in U.S. AI companies and policymakers alike. https://www.reuters.com/technology/chinas-deepseek-sets-off-ai-market-rout-2025-01-27/.
Let's assume that Judge Bibas is right and other courts accept
his conclusion that training an AI model on copyrighted material is per se
infringing. I find it hard to believe that United States policymakers will
respond to that conclusion by saying, "Oh well, I guess creating AI models
infringes copyright, so we'll have to stop people from doing it and let other
countries take the lead on this one." For better or worse, AI is the future,
and we will be a part of it, even if it means changing copyright law.
In any case, your AI assistants are here to serve you.