
Law Practice,
Ethics/Professional Responsibility

Dec. 12, 2025

Treat your (human) colleagues as allies - and your LLMs as adversaries

As large language models like ChatGPT and Claude proliferate, attorneys must balance their promise of efficiency and insight against the real risk of hallucinated or fabricated citations, approaching AI with the wariness due an adversary while relying on human colleagues for verification, mentorship and critical oversight.

Caroline Radell

Associate
Irell & Manella LLP


Michael M. Rosen

Counsel
Irell & Manella LLP


Dr. Thomas Barr

Irell & Manella LLP



As large language models (LLMs) like ChatGPT and Claude improve and proliferate, they present both promise and peril to new and experienced attorneys alike. Amid a recent increase in improper (hallucinated) caselaw citations by lawyers and even some judges, attributed in most cases to AI-generated content, practitioners would do well to remember two simple truths: Your (human) colleagues are your allies, but your LLMs are (often) your adversaries.

This is not to say that generative AI has no place in the future of legal practice. On the contrary: learning to harness the power of LLMs may well constitute a cornerstone of legal practice in the near and distant future. Indeed, like any technological tool, AI offers tremendous promise in terms of efficiency, accuracy and even creativity. But, like any technology, LLMs will assume their rightful place only when attorneys learn to leverage their advantages while avoiding, or at least mitigating, their fatal shortcomings.

Take the recent citation controversies, which have seen junior and senior attorneys alike fall prey to seemingly plausible but non-existent precedent identified by LLMs. In June 2023, just months after GPT-4 demonstrated passing scores on various bar exams, an LLM was caught fabricating case law in a legal brief. Since then, a plethora of law firms across the country, including well-known names, have been admonished by courts for over-reliance on LLMs after citing cases that were either fabricated or materially misrepresented by an AI tool. Our research identified over 300 cases in the United States alone where a lawyer or law firm was caught using improper AI citations -- i.e., found responsible for doing so by a judge in a written opinion; presumably there have been many more such instances that have gone unpublicized or even unnoticed.

At times, senior attorneys atop the pleadings have played the blame game, insisting to judges that junior or even unlicensed attorneys were the ones who conducted the research that produced the fabrications. Courts have rightly discounted this excuse, pointing to supervising attorneys' ethical duty to oversee the work of junior attorneys and chastising those in case leadership roles for attempting to avoid their own responsibility. On most occasions, thankfully, partners have been more forthcoming, owning the mishaps, issuing apologies and implementing AI training programs at their firms.

But the written opinions and firm statements, taken together, reveal little about why otherwise competent law firms and some jurists continue to rely on fabricated or misrepresented cases. While it may be logical to attribute these mistakes to case mismanagement and constrained bandwidth -- concepts that are not new to legal practice and have manifested in inapposite citations making their way into briefs since long before the proliferation of LLMs -- the real explanation may be less culpable: The citing attorneys, plainly, were duped by a well-trained persuasion machine.

How so? Attorneys, like all LLM users, are well aware of chatbots' tendency to hallucinate, but many don't fully appreciate exactly how and why they do so. While we generally recognize LLMs' eagerness to please and reinforce our beliefs, we often fail to understand the conceptual underpinnings guiding their approach -- and how that shapes our responses. Only a more complete grasp of how LLMs approach their tasks will allow litigators to optimize their use without falling into the fake-case trap.

LLM chatbots like ChatGPT are, first and foremost, text generators. They are built around an internal statistical model that is tuned, during training, until it does an acceptable job of predicting the internet's text; to generate output, the model then repeatedly predicts the most likely next token. Along the way, it learns grammar, logic and facts simultaneously from the same training data.
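
For the technically inclined, here is a minimal sketch of that next-token loop, assuming the small, freely downloadable GPT-2 model and the Hugging Face transformers and PyTorch libraries. Production chatbots are vastly larger and add further layers of training, but the core mechanism is the same in spirit: nothing in the loop checks whether the completed sentence is true, only which words are statistically likely to come next.

```python
# A minimal sketch of next-token prediction (illustrative only).
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

text = "The leading case on this issue is"
input_ids = tokenizer(text, return_tensors="pt").input_ids

for _ in range(12):                                    # add 12 tokens, one at a time
    with torch.no_grad():
        logits = model(input_ids).logits               # a score for every possible next token
    next_id = logits[0, -1].argmax().unsqueeze(0)      # greedily take the single most likely one
    input_ids = torch.cat([input_ids, next_id.unsqueeze(0)], dim=1)

print(tokenizer.decode(input_ids[0]))                  # fluent and plausible -- not necessarily true
```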

Emerging research has suggested various mechanisms to explain what the models "learn." For example, Anthropic has shown that LLMs can be modeled as a massive collection of independent circuits running in parallel. Rather than executing a rigorous, logical deduction process, these independent circuits compete to influence the output. Certain circuits recognize patterns they have seen before. Grammatical circuits are strong because they are reinforced constantly across the training set. Factual circuits are weaker, but hopefully fire strongly enough in the right circumstances to generate factually correct output. The research also suggests that the circuit that states facts the model knows and the circuit that declines to answer questions it cannot answer may be fully independent of each other. The faithfulness of the output then depends on the relative strength of these circuits.
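
A toy illustration of that competition -- emphatically not Anthropic's actual methodology, and with every number and case name invented for the example -- shows why a fluent fabrication can outscore an honest refusal whenever the formatting circuits are stronger than the factual ones:

```python
# Toy model: hypothetical "circuits" each add a vote to two candidate answers,
# and the candidate with the highest combined score is what the model emits.
import numpy as np

candidates = [
    "Smith v. Jones, 512 F.3d 218 (9th Cir. 2008) squarely supports this argument.",  # fluent but invented
    "I cannot identify a reported case directly on point.",                           # honest refusal
]

# Invented contributions (arbitrary units) from three kinds of circuits.
contributions = {
    "citation-format patterns": np.array([4.0, 0.5]),  # strong: legal-citation style is heavily reinforced
    "factual recall":           np.array([0.3, 1.0]),  # weak: no genuine memory of such a case
    "refusal / uncertainty":    np.array([0.0, 1.5]),  # may not fire strongly enough to prevail
}

scores = sum(contributions.values())                   # totals: [4.3, 3.0]
print(candidates[int(np.argmax(scores))])              # the plausible fabrication wins
```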

After this initial training on text, LLMs are "fine-tuned" based on human feedback. In this process, different responses are presented to human users, and the circuits that produced the output rated "most helpful" are strengthened. The result is that the model is fine-tuned to appear helpful, rather than to actually be helpful.
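
In stripped-down form -- making no assumption about any particular vendor's pipeline -- the feedback step looks something like the pairwise update below: a human marks one of two responses as preferred, and training raises the score of whichever response looked better, whether or not it was accurate.

```python
# Minimal pairwise-preference update (Bradley-Terry style), illustrative only;
# the two scalar scores stand in for an entire model's output.
import torch

score_chosen   = torch.tensor(0.2, requires_grad=True)  # score of the response the human preferred
score_rejected = torch.tensor(0.9, requires_grad=True)  # score of the response the human passed over

# The loss falls as the chosen response's score rises above the rejected one's.
loss = -torch.nn.functional.logsigmoid(score_chosen - score_rejected)
loss.backward()

print(score_chosen.grad, score_rejected.grad)  # gradient descent raises the chosen score, lowers the rejected
```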

As LLMs' ability to produce accurate output increases, so too does their ability to mimic the appearance of helpful, accurate output. In other words, the same training and development that makes LLMs more trustworthy also makes them convincing liars. And the same "improvements" that indulge LLMs' tendency to reinforce what we suspect or hope to be true lead attorneys to imbue these tools with undeserved authority when their putative results favor our positions.

How, then, can attorneys at all levels avoid these pitfalls?

First, attorneys must make efforts to account for and counteract LLMs' hallucinatory tendencies. The practice of litigation is an art of imperfect analogy. When foursquare precedent does not exist, it is not uncommon, and not necessarily frowned upon, for litigators to emphasize parallels in the holdings of previous cases that may not directly relate to the case at hand. In fact, sometimes analogies that are somewhat far afield can assist a court in understanding and adopting a litigant's view of a particular issue.

Consequently, it is the responsibility of the attorney to discern, articulate and, where appropriate, establish connections between contested matters and applicable legal precedent. Oftentimes, this process begins with keyword searches curated to reveal similar language in court opinions -- a task tailor-made for LLMs, which function by recognizing, replicating and predicting patterns in language. The risk is that LLMs can then use these patterns against the user's best interest, creating citations based on grammatical and linguistic patterns that "feel right," but are not necessarily factual.

To be sure, litigators must ensure they account for and eliminate outright hallucination by cite-checking and reviewing any cases cited to or explained by an LLM, but this alone does nothing to improve or streamline output. Attorneys can and should harness LLM capabilities (and in some cases, minimize the risk of fake cases) by developing strategies for writing prompts that increase the likelihood of a usable result. For example, requesting that the LLM provide a summary of the case facts, the name of the judge who wrote the opinion and a Westlaw or Lexis citation may increase the likelihood of a real result. In other words, verify everything: treat your LLM as an adversary who, however well-intentioned, does not necessarily have your (or your client's) best interest in mind.
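
By way of illustration, the sketch below shows one way a verification-friendly research prompt might be structured along the lines suggested above, asking for details -- facts, authoring judge, reporter citation -- that make a fabricated case easier to catch. The wording is purely illustrative, not a tested template, and it reduces rather than eliminates the risk.

```python
# Illustrative prompt structure only -- a hedge against hallucination, not a cure.
RESEARCH_PROMPT = """For each case you identify on the question below, provide:
1. The full case name and reporter citation, including a Westlaw or Lexis cite;
2. The name of the judge who wrote the opinion;
3. A two-sentence summary of the facts;
4. The specific holding relevant to the question, with a pinpoint citation; and
5. An express statement of any uncertainty about whether the case exists.

Question: [insert the legal question being researched]
"""
# Every case returned must still be pulled up and read in Westlaw or Lexis
# before it is cited anywhere.
```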

Additionally, an LLM may be more likely to produce effective output when clearly instructed on the purpose and scope of its assigned task. And unlike a human being, an LLM is not constrained by the need to actually comprehend information to determine its relevance, so it can sift volumes of material at a speed no human reader can match. As a result, LLMs can be effective tools for compiling accounts of large bodies of case law.

It is important to keep in mind that strategies of this nature are not a failsafe but a starting point. In its current technological state, the process of pattern extraction and synthesis will sometimes lead an LLM to "find" a case that does not exist, or to contrive hallucinated holdings from real cases. In other instances, it will indeed find just the right case and dramatically streamline the research process. And at other times still, it will recognize patterns the attorney may not have, and lead to an unlikely, but effective, case analogy. To get the benefits without the risks, prompt from different perspectives and always verify outputs before trusting them, no matter how convincing they seem.

Second, attorneys present and future can practice utilizing LLMs early and often. Law students, particularly those who are just shy of becoming practitioners, are uniquely positioned to learn the ins and outs of LLMs as legal practice tools. While some law schools offer courses that encourage or even teach LLM usage, the bulk of the response to AI's proliferation in law schools has been to ban or strictly limit AI's role in the legal learning process. Examples include disallowing internet access during final exams (so that students cannot use ChatGPT or related platforms) and outright bans on utilizing any form of AI for legal research.

Of course, a tool capable of producing information with a relatively high degree of accuracy poses a real and tangible threat to students' absorption and retention of information. But unqualified prohibitions such as those described should not stand in the way of teaching law students the burgeoning art of LLM case research and synthesis. While we cannot be certain about LLMs' future role in the legal profession, there are steps we can take to ensure that we are prepared for it.

Ultimately, whether LLMs transcend their people-pleasing tendencies to become effective assistive tools in litigation practice will depend on how lawyers use them, and there is no better or safer forum to develop LLM literacy than law school. By making conscious and well-informed choices about their inputs, students (and attorneys) can bolster the reliability of their outputs. And of course, take a moment to search for your case in your legal research database of choice, just to make sure it exists and stands for the proposition for which the LLM has cited it. And good news for law students: the tendency of generative AI to hallucinate facts limits the viability of LLMs replacing human attorneys anytime soon, as many originally feared they would.

Third, and finally, senior attorneys can and should reinforce the best practices of the lawyers they supervise, both by modeling the "don't trust, but verify" approach necessary to ensure accuracy and, more importantly, by collaborating closely and effectively with their junior colleagues. Partners and counsel should make sure newer lawyers are trained appropriately on LLMs and take all necessary steps to ensure the accuracy of their output. Senior attorneys must also work cooperatively with their junior colleagues to foster the kind of critical thinking, skepticism and devil's advocacy that not only mitigates the most serious risks of generative AI but also more generally ensures that the entire litigation team is engaging -- and, hopefully, rebutting -- their adversaries' most powerful arguments. By treating LLMs as opponents and (human) colleagues as allies, attorneys can benefit from the best of both.
