
The Technology Blind Spot
On July 17, 2025, Judge David Leibowitz of the Southern District of Florida opened his sanctions order with a quotation from the late Justice Antonin Scalia on the importance of candor in judicial proceedings. The quotation was an AI hallucination. Judge Leibowitz had generated it with a ChatGPT prompt the week before. He knew it was fabricated. He used it as an epigraph anyway, to make a point about the technology he was about to sanction an attorney for trusting.
Attorney James Martin Paul had cited hallucinated cases across eight different matters. He continued filing AI-generated fabrications even after opposing counsel flagged them, even after a court order put him on notice, even after a show cause hearing. Judge Leibowitz dismissed all four federal cases and ordered Paul to pay $85,567.75 in attorneys’ fees. It was the largest AI hallucination sanction in American legal history.
Six days later and 700 miles north, Judge Anna M. Manasco of the Northern District of Alabama issued a different kind of opinion. In Johnson v. Dunn (N.D. Ala. July 23, 2025), the attorneys who filed AI-generated hallucinated citations were not rogue actors. They were partners at Butler Snow LLP, a large, well-regarded firm that had done everything the consultants recommend. Butler Snow had circulated an email warning about generative AI dangers. It had prohibited AI use without practice group leader approval. It had adopted verification policies when it deployed Westlaw’s CoCounsel platform. Matthew B. Reeves, the partner who used ChatGPT to generate the fabricated citation, held the title of assistant practice group leader. He was the gatekeeper the policy relied on.
Judge Manasco disqualified the attorneys from the case, directed the clerk to notify bar regulators in every jurisdiction where they held a license, and suggested that existing rules may not adequately address the harms caused by this category of conduct.
In the prior installment of this series, “Every Failed AI Project Breaks the Same Rule,” I analyzed the governance failure through the lens of systems design. That analysis explained why Butler Snow’s policy failed structurally. This analysis addresses the cognitive mechanism the policy could not reach: anthropomorphism, the attribution of human judgment to a machine that has none.
What Is Happening and Why It Matters Now
Anthropomorphism is a cognitive default produced by neural architecture that evolved to detect agency quickly. A tool that speaks in complete sentences, acknowledges uncertainty, apologizes for errors, and adapts its tone to context activates the same trust pathways that govern human relationships. The attribution is automatic. It operates below deliberate judgment.
Large language models are, by design, optimized to produce outputs that are fluent, contextually coherent, and conversationally natural. They are not optimized for accuracy. Those two objectives are not the same thing, and conflating them is the operational error at the center of this liability exposure.
I watched a version of this pattern play out at EMC in 2011. When the RSA breach compromised 40 million SecurID tokens, the first instinct across the company was to trust the systems that reported everything was fine, because the systems spoke in the language of normalcy. The dashboards looked normal. The logs looked normal. It took weeks for the organization to accept that the indicators of compromise had been hiding in plain sight behind screens designed to project confidence. AI tools present a faster, more personal version of the same trap: the conversational surface projects competence, and the user’s brain fills in the rest.
If a first-year associate produced a memorandum citing three cases, two of which did not exist, you would question the supervising attorney who signed the filing without reading it. When the memorandum comes from an AI tool, the verification instinct that would catch the junior associate’s fabricated citation does not engage. The screen shows fluent, authoritative text. The brain registers a colleague who has done the work. Matthew Reeves was not a careless attorney. He was an experienced partner reading prose that looked exactly like the research he had reviewed thousands of times before.
A 2025 preprint study found that anthropomorphic attributions to AI increased 34 percent in a single year, with humans increasingly viewing AI systems as warm and competent. A study by Angela Duckworth and Lyle Ungar published in the Harvard Business Review in January 2026, surveying nearly 2,500 Generation Z adults, found this cohort uses AI tools even in situations where they have been explicitly told not to. Generation Z is entering law firms as first-year associates, paralegals, and legal operations staff, carrying a baseline level of AI trust categorically higher than any prior cohort.
But anthropomorphism does not stop at generational lines. It operates across every experience level, because the tool activates trust regardless of the user’s age, training, or skepticism. Reeves had decades of experience. Butler Snow had written policies. Johnson v. Dunn proved that neither one interrupts the moment when the screen looks right.
Why AI Sounds Like It Knows What It Is Talking About
An LLM does not retrieve information from a database. It generates text by predicting the statistically most probable next sequence of words given the input and its training data. When an LLM cites a case, it is not looking up that case in Westlaw. It is generating a string of characters that resembles a case citation based on patterns learned from legal documents. The citation may be syntactically perfect, jurisdictionally plausible, and entirely fabricated.
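To make that distinction concrete, here is a minimal sketch (Python, purely illustrative) of why a citation that “reads right” proves nothing about whether the case exists. The fabricated citation is one of those at issue in Mata v. Avianca; the citator lookup is a hypothetical stand-in for a Shepard’s or KeyCite check, not a real API.

```python
import re

# Illustrative only: an LLM emits character strings shaped like citations.
# A surface check ("does this read like a Federal Reporter cite?") passes
# real and fabricated citations alike; only a citator lookup distinguishes
# them. The lookup function below is a hypothetical stand-in, not a real API.

FEDERAL_REPORTER = re.compile(
    r".+ v\. .+, \d+ F\.(?:2d|3d|4th) \d+ \(\d+(?:st|d|th) Cir\. \d{4}\)"
)

def looks_like_a_citation(text: str) -> bool:
    """Surface plausibility: format only. This is what 'reads right' means."""
    return bool(FEDERAL_REPORTER.search(text))

def exists_in_citator(text: str) -> bool:
    """Stand-in for the Shepard's/KeyCite step that actually verifies existence."""
    raise NotImplementedError("verification requires the lookup, not the format")

# One of the fabricated citations at issue in Mata v. Avianca.
fabricated = "Varghese v. China Southern Airlines Co., Ltd., 925 F.3d 1339 (11th Cir. 2019)"

print(looks_like_a_citation(fabricated))  # True -- syntactically perfect, nonexistent
```

The format check is the machine equivalent of the glance an experienced reader gives fluent prose. Only the lookup, the step the tool never performs for you, answers whether the case exists.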
Stanford University’s RegLab and Institute for Human-Centered AI tested the purpose-built legal AI research tools sold by LexisNexis and Thomson Reuters. Both providers had publicly claimed their tools avoided hallucinations through proprietary retrieval architecture. The study found that even these tools hallucinate between 17 and 33 percent of the time on legal queries. General-purpose LLMs perform substantially worse, producing incorrect responses between 69 and 88 percent of the time.
Put those numbers in a room. A senior associate who stayed up all night researching sits at one end of the conference table. An AI assistant that generated its output in four seconds sits at the other. Both present their conclusions with identical surface confidence. Neither hesitates. Neither qualifies without prompting. The attorney reading the AI’s output has no auditory, visual, or behavioral cue to signal that the confident prose on the screen was generated by a system that fabricates between one in six and one in three of its legal conclusions.
Stanford’s team identified a second failure mode more dangerous than outright fabrication: the AI cites a real case that does not actually support the proposition for which it is cited. Johnson v. Dunn documented both types across five hallucinated citations. One cited a real case with a fabricated holding. Another cited a real reporter volume containing an entirely different case. A hallucinated citation can be caught with a Westlaw search. A misgrounded citation requires reading the opinion. That is the verification step anthropomorphism suppresses.
What the American Bar Association Has Already Said
The professional responsibility framework governing this conduct is not ambiguous.
Competence. Model Rule 1.1 requires competent representation, including the legal knowledge, skill, thoroughness, and preparation reasonably necessary for the representation. Comment 8 was amended to include a duty to keep abreast of the benefits and risks associated with relevant technology. Forty-two jurisdictions have adopted this requirement. ABA Formal Opinion 512, issued July 29, 2024, concluded that attorneys who use AI tools must understand their limitations, verify AI-generated content for accuracy, and supervise AI use with the same rigor applied to work from junior attorneys.
Confidentiality. Model Rule 1.6 requires reasonable efforts to prevent unauthorized disclosure of client information. Formal Opinion 477R established that cloud-based services implicate this obligation. As I documented in the Heppner privilege analysis, Judge Rakoff’s February 2026 ruling in the Southern District of New York confirmed that consumer AI platforms expressly disclaim the confidentiality protections privilege requires. Firms that have issued AI acceptable use policies without verification mechanisms are not managing this risk. They are documenting it.
Candor. Model Rule 3.3 prohibits attorneys from making false statements of fact or law to a tribunal. Filing AI-generated citations without verification is a Rule 3.3 violation when the attorney knew or should have known the tool generates fabrications at documented rates. Judge Manasco described the Johnson v. Dunn conduct as reflecting “extreme dereliction of professional responsibility.” The court in Noland v. Land of the Free described it as a violation of “a basic duty counsel owed to his client and the court.”
Supervision. Model Rule 5.1 requires partners and supervising attorneys to ensure the firm has measures giving reasonable assurance that all attorneys conform to the Rules. Judge Manasco declined to sanction Butler Snow itself, recognizing that its institutional policies predated the misconduct. But the three individual attorneys who signed the filings received public reprimand, disqualification, and bar referrals regardless of the firm’s compliance posture. Visible steps and effective steps are not the same thing.
Addressing the Counterargument
The standard pushback runs as follows: attorneys are trained professionals with a duty of competence and an ethical obligation to verify their work product. They already have every incentive to catch errors. Anthropomorphism is a consumer phenomenon, not a professional one. The market will self-correct because malpractice exposure creates accountability.
Each of those propositions is defensible in isolation. Collectively, they describe the risk management framework that produced Johnson v. Dunn.
Butler Snow’s safeguards did not interrupt the moment when an experienced partner accepted AI output because it presented with the confidence and fluency of trusted expertise. Professional training does not override an automatic reflex that operates below the level of deliberate judgment. As I documented in the AI competence analysis, “Word Can’t Even Spell-Check After 40 Years,” the same cognitive abilities that make attorneys exceptional at legal analysis make them overconfident in domains where their expertise does not transfer. A partner who would never advise a client on patent prosecution without understanding the underlying technology will paste case facts into ChatGPT and submit the output to a federal court without understanding how the model generates text. Dunning-Kruger operates precisely in that gap.
Remove the AI element entirely. If a law firm hired an outside research service staffed by recent law school graduates who fabricated one in three of their citations, and the firm submitted the research to courts without independent verification, no one would call the resulting sanctions surprising. They would call them inevitable. The technology changes the speed of the failure. It does not change the obligation.
Mata v. Avianca in 2023 established the legal standard. Johnson v. Dunn, two years later, demonstrated that the standard is still being violated by sophisticated practitioners at firms that believed they had addressed the problem. Damien Charlotin’s tracking database has identified 979 judicial decisions worldwide addressing AI hallucinations, with 90 percent issued in 2025 alone. The self-correction argument requires a timeline that the documented evidence does not support.
The malpractice argument misreads the incentive structure. Liability is a lagging indicator. The error occurs, the filing is made, the proceeding concludes, the claim is filed, and the sanction follows, often years after the original conduct. A profession that relies on malpractice liability as its primary AI governance mechanism has chosen to accept the harm as the price of admission.
The Pattern Beneath the Cases
Every documented sanctions case shares a structural signature: trust transferred from a category where it belongs (a verified human colleague) to a category where it does not (a probabilistic text generator). The transfer happens because the interface is designed to invite it.
James Martin Paul in the Southern District of Florida demonstrates the pattern in its extreme form. He continued citing hallucinated authorities after opposing counsel flagged them, after the court issued a show cause order, and in his own response to that show cause order. Judge Leibowitz found the conduct constituted bad faith. Paul’s case is the easy case on the culpability scale: willful indifference after explicit notice.
Matthew Reeves at Butler Snow demonstrates the pattern in its more dangerous form. He was not indifferent. He was experienced, policy-compliant, and supervising. He read the ChatGPT output the same way he had read thousands of associate memoranda, and the prose did not trigger the verification reflex that a junior associate’s work would have. Judge Manasco called it “more than mere recklessness and tantamount to bad faith.” When the fabricated citations came to light, Reeves’s first instinct was to try to skip the show cause hearing. His second was to explain how little personal review he had given the filing.
In Noland v. Land of the Free, L.P. (Cal. App. 2d Dist., Sept. 12, 2025), the California Court of Appeal found that nearly all quotations in the attorney’s opening brief were fabricated. The court sanctioned the attorney $10,000 and referred the matter to the State Bar.
What made the opinion notable was the question it raised about the other side. The court declined to award attorneys’ fees because opposing counsel had failed to detect or report the fabricated citations. The reasonable attorney standard is being expanded in real time: the obligation to verify now extends to your opponent’s work product.
Digital sociologist Julie Albright has described AI tools as offering frictionless engagement: available, non-judgmental, and consistently affirming. In a professional context, that framing describes the identical dynamic operating at Butler Snow and in Paul’s Florida litigation: the tool presents output without hesitation, without qualification, and without any signal that verification should engage. The design suppresses the instinct that would catch the error.
Where This Shows Up in Practice
The liability exposure is not confined to litigation. Every practice area presents a version of the same failure.
Contract attorneys who accept AI-suggested remediation language without validating it against the governing jurisdiction’s current law absorb the Stanford error rates directly: between 17 and 33 percent incorrect on the purpose-built tools, higher on general-purpose models. Whether a proposed revision reflects current law in Delaware, New York, or Texas is a question the tool does not answer reliably. It answers it fluently. That is the distinction anthropomorphism erases.
Due diligence presents a quieter version of the same risk. AI summaries of corporate records, regulatory filings, and litigation history can omit material information without signaling the omission. There is no error message when a summary leaves out a material fact. The attorney reading it has no mechanism to detect what was excluded unless they review the source documents.
In client communications, the Heppner ruling adds a confidentiality layer. An attorney who drafts correspondence through a consumer AI tool before sending it to the client has already routed privileged content through a platform that expressly disclaims confidentiality. The privilege exposure exists independent of whether the draft contains errors.
Intake and conflict screening create a different variant. AI tools that misidentify parties or related entities due to name variations or incomplete data produce a conflict exposure the firm may not discover until the matter is opened and adverse interests have already been discussed.
What Firms Should Do This Quarter
Johnson v. Dunn established that written policy alone does not constitute governance. ByoPlanet, the sanctions order against James Martin Paul, established that the financial consequences of noncompliance now approach six figures. The goal is to interrupt the trust transfer before it creates liability. That requires building verification into the workflow, not relying on individual attorney judgment at the moment of use.
✓ Establish verification requirements by task category. Legal research, contract review, due diligence, and client communications each carry different verification obligations. A generic AI use policy does not satisfy Rule 1.1. Write the task-specific requirements this month.
✓ Train on how LLMs actually work. The distinction between text generation and information retrieval determines whether an attorney applies appropriate skepticism. This is the competence baseline Formal Opinion 512 requires. Schedule the first session within 30 days.
✓ Require source-level citation verification before any filing or transmittal. No AI-generated citation enters a document without a Shepard’s or KeyCite check and a reading of the cited opinion. Butler Snow had this policy. Reeves did not follow it. The verification must be structural, not aspirational.
✓ Audit AI platform data handling against Rule 1.6. Know what client data leaves the firm when an attorney queries an external tool. The Heppner ruling confirmed that consumer AI terms disclaim confidentiality. Complete the audit before the next renewal cycle.
✓ Build process-level verification. Checklists. Approval gates. Secondary review. Start with one rule for one task, prove it works, then extend. Gall’s Law prescribes exactly that progression; Butler Snow built the comprehensive framework first. The firm that starts simple and evolves will be the one still standing. A minimal sketch of one such gate follows this list.
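For illustration only, here is a minimal sketch of what “structural, not aspirational” can mean in practice: a pre-filing gate that refuses to clear any AI-sourced citation lacking a verification record. The Citation and Filing classes, the field names, and the gate itself are assumptions invented for this sketch, not a product recommendation or any firm’s actual workflow.

```python
from dataclasses import dataclass, field

# Illustrative sketch of a structural pre-filing gate. Everything here is an
# assumption for the example: one rule, for one task, enforced by process
# rather than by individual recall at the moment of use.

@dataclass
class Citation:
    cite: str
    ai_generated: bool
    citator_checked: bool = False        # Shepard's/KeyCite run and clean
    opinion_read_by: str | None = None   # attorney who read the cited opinion

@dataclass
class Filing:
    caption: str
    citations: list[Citation] = field(default_factory=list)

def pre_filing_gate(filing: Filing) -> list[str]:
    """Return blocking problems; an empty list means the filing may proceed."""
    problems = []
    for c in filing.citations:
        if c.ai_generated and not (c.citator_checked and c.opinion_read_by):
            problems.append(f"{c.cite}: AI-sourced citation lacks source-level verification")
    return problems

# Usage: the gate blocks the filing because the process demands the record,
# not because anyone remembered to be skeptical.
brief = Filing(
    caption="Example v. Example",
    citations=[Citation(cite="123 F.4th 456 (1st Cir. 2025)", ai_generated=True)],
)
for problem in pre_filing_gate(brief):
    print("BLOCKED:", problem)
```

The point is not the code. It is that the check runs because the workflow requires it, not because the attorney happened to distrust a screen that looked right.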
The Close
In Rudyard Kipling’s “The Cat That Walked by Himself,” the cat alone among all the animals refuses to be domesticated. The cat will come to the fire, drink the milk, catch the mice. But it will not belong to the house. It retains its own nature regardless of how warmly it is received.
An LLM is Kipling’s cat. It will come to the office. It will draft the brief, review the contract, summarize the deposition transcript, and identify the circuit split. It will do all of this in prose that reads like a knowledgeable colleague who has done the work. But it does not know what it is writing. It does not know the law. It does not know the client. And it has no mechanism to signal the difference between a response that is accurate and a response that merely sounds like one.
Matthew Reeves was not reckless. He worked inside a compliance structure that had anticipated his failure mode and documented a prohibition against it. What Butler Snow could not reach was the moment when a confident, fluent, authoritative screen of text suppressed the verification instinct of an experienced partner who had no reason to distrust what he was reading.
Written policy does not reach that moment. Workflow does.
This blog provides general information for educational purposes only and does not constitute legal advice. Consult qualified counsel for advice on specific situations.
About the Author
JD Morris is Co-Founder and COO of LexAxiom. With over 20 years of enterprise technology experience and credentials including an MLS from Texas A&M, MEng from George Washington University, and dual MBAs from Columbia Business School and Berkeley Haas, JD focuses on the intersection of legal technology, cybersecurity, and professional responsibility.
Connect: LinkedIn (www.linkedin.com/in/jdavidmorris) | X (@JDMorris_LTech) | Bluesky (@JDMorris-ltech.bsky.social)
References
ABA Model Rules of Professional Conduct, Rules 1.1 (Comment 8), 1.6(c), 3.3, 5.1
ABA Standing Committee on Ethics and Professional Responsibility, Formal Opinion 512 (July 29, 2024): Generative Artificial Intelligence Tools
ABA Formal Opinion 477R (May 2017): Securing Communication of Protected Client Information
Johnson v. Dunn, No. 2:21-cv-1701-AMM (N.D. Ala. July 23, 2025) (Judge Anna M. Manasco; Butler Snow LLP; attorney disqualification, public reprimand, bar referrals for AI-generated hallucinated citations; firm’s internal AI policy did not prevent violation by assistant practice group leader Matthew B. Reeves)
ByoPlanet Int’l, LLC v. Johansson, 792 F.Supp.3d 1341 (S.D. Fla. July 17, 2025) (Judge David Leibowitz; $85,567.75 in sanctions against attorney James Martin Paul; hallucinated citations across eight matters; largest AI sanctions award to date)
Noland v. Land of the Free, L.P., No. B331918 (Cal. App. 2d Dist. Sept. 12, 2025) ($10,000 sanction; State Bar referral; fee award denied to opposing counsel for failure to detect fabricated citations)
Mata v. Avianca, Inc., No. 22-cv-1461 (S.D.N.Y. June 22, 2023) ($5,000 sanction for AI-fabricated citations)
United States v. Heppner, No. 23-cr-00584 (S.D.N.Y. Feb. 10, 2026) (31 AI-generated documents failed privilege analysis under third-party disclosure doctrine)
Magesh, V., Surani, F., Dahl, M., et al. “Hallucination-Free? Assessing the Reliability of Leading AI Legal Research Tools.” Journal of Empirical Legal Studies (2025). Stanford RegLab / HAI.
Duckworth, A. & Ungar, L. “How Gen Z Uses Gen AI—and Why It Worries Them.” Harvard Business Review (January 28, 2026). Survey of 2,497 U.S. adults ages 18–28.
Schimmelpfennig, R., Díaz, M., et al. “Humanlike AI Design Increases Anthropomorphism but Yields Divergent Outcomes on Engagement and Trust Globally.” arXiv:2512.17898 (December 2025). Preprint. (34% increase in anthropomorphic attributions.)
Riemer, K. & Peter, S. “The Benefits and Dangers of Anthropomorphic Conversational Agents.” PNAS (May 2025).
Albright, Julie M. Left to Their Own Devices: How Digital Natives Are Reshaping the American Dream. Prometheus Books, 2019. (Frictionless engagement framework.)
Charlotin, D. AI Hallucinations in Judicial Decisions Database (current as of 2026): 979 documented judicial decisions; 90% issued in 2025.
LawNext Technology Competence Adoption Tracker (42+ jurisdictions as of 2025)
Prior Blog: “Every Failed AI Project Breaks the Same Rule” (Morris Legal Technology Blog). Gall’s Law analysis of Johnson v. Dunn governance failure.
Prior Blog: “Word Can’t Even Spell-Check After 40 Years. You Trust AI With Your Law License?” (Morris Legal Technology Blog). AI competence and Dunning-Kruger analysis.
Prior Blog: “When AI Meets Attorney-Client Privilege: The Heppner Warning” Parts 1–2 (Morris Legal Technology Blog). Consumer AI privilege waiver.
Prior Blog: “Your AI Tool Doesn’t Keep Secrets” (Morris Legal Technology Blog). Consumer AI confidentiality disclaimers.
Prior Blog: “The Conversation That Saves Privilege” (Morris Legal Technology Blog). Client technology briefing framework.
Prior Blog: “AI Won’t Take Your Job. The Attorney Who Uses It Better Will.” Parts 1–2 (Morris Legal Technology Blog). Ten failure modes of legal AI.
Prior Blog: “The Privilege Paradox” (Morris Legal Technology Blog). Structural surveillance and privilege waiver under Section 702.
Prior Blog: “Why Hackers Target Law Firms” (Morris Legal Technology Blog). RSA breach and supply chain trust analysis.