Attorney ChatGPT

2026-05-20

On What a Language Model Cannot Do—Even When It Tells the Truth

Robert Nogacki | Skarbiec Law Firm

For the past two years, the legal profession has been cataloguing, with a certain grim satisfaction, the cases of lawyers who filed briefs populated with invented citations. The moral was always the same: AI fabricates, always verify. The problem is that this is not an observation about AI. It is an observation about how AI was being used.

Large language models do not, as a rule, hallucinate – if you know how to use them. Hallucination is almost always the product of misuse: asking the model about facts it does not have in its training data, or about the particulars of a specific document it has not read. The model does not say it does not know. It generates the most statistically probable completion of the gap. If you understand the boundaries of the tool, you do not enter the zone where it confabulates. Discovering AI hallucinations is, in large part, an admission that you do not know how to use the tool.

Law is a discipline peculiarly prone to this failure mode. Legal knowledge is vast, hierarchical, jurisdictionally fragmented, and in constant flux. Princeton researchers proposed a useful taxonomy (Kapoor, Henderson & Narayanan, 2024): information processing, where AI performs well because a clear correct answer exists; tasks requiring creativity, reasoning, or judgment, where AI fails because no correct answer exists without deep knowledge of the specific context; and prediction, where AI lacks access to the information that actually determines the result. In the Krafton case discussed below, the C.E.O. needed something from the second category. He received something that looked like the first.

Proper use of a language model means using it where it excels: as an instrument for automating the writing process, structuring arguments, drafting documents. Not as an oracle of facts. Not as a substitute for analyzing a specific document. And not – as the Krafton case makes plain – as a strategic advisor in a two-hundred-and-fifty-million-dollar dispute.

Both the enthusiasm for AI in law and the fear of it have distracted attention from the essential question: not whether to use AI, but for what. A lawyer who, in 2026, does not use AI to draft briefs and research case law is in the position of a soldier charging with a bayonet against machine guns on the Somme in 1916: the courage and craft are unimpeachable, but the technological asymmetry is decisive. The Krafton case, however, presented precisely the reverse: it was not lawyers without AI defeated by lawyers with AI – it was a C.E.O. with AI defeated by lawyers who had read the contract.

Fortis Advisors LLC v. Krafton, Inc., decided by the Delaware Court of Chancery on March 16, 2026 (docket No. C.A. 2025-0805-LWW), is interesting for an entirely different reason. ChatGPT fabricated nothing. It told the truth. And that is precisely why the case deserves attention.

Sources: Full opinion (Justia) | Fortune

Krafton, Unknown Worlds, and Subnautica: Background

Krafton Inc. is a South Korean gaming conglomerate best known for PUBG: Battlegrounds. By 2021, it sought portfolio diversification and acquired Unknown Worlds Entertainment – a small studio (43 employees) and the creator of Subnautica, which had sold over 17.5 million copies and generated more than 300 million dollars in revenue. Behind the studio stood Charlie Cleveland and Max McGuire, who had built into its DNA a model of early-access development.

In October 2021, Krafton acquired Unknown Worlds for five hundred million dollars upfront plus an earnout of up to two hundred and fifty million dollars.

The earnout is a deferred pricing mechanism standard in M&A: the buyer pays a portion of the price at closing; the remainder is paid conditionally if the seller achieves agreed performance targets. In theory: equitable risk-sharing. In practice: one of the most reliable sources of post-closing litigation. The Krafton formula was unusually leveraged: 3.12 dollars for every additional dollar above a threshold of 69.8 million, up to the 250-million-dollar cap. As Subnautica 2 took shape, the cap became a real prospect. That is when Krafton lost interest in the contract it had signed.
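
The arithmetic of that formula can be sketched in a few lines of Python. This is an illustration only, under stated assumptions: the names earnout_payout and metric_at_cap are hypothetical, the input is whatever performance measure the agreement applies the threshold to (the text above does not specify it), and the actual agreement may contain adjustments that are not modeled here.

```python
from decimal import Decimal

# Figures taken from the text above; everything else is an illustrative assumption.
RATE = Decimal("3.12")            # dollars paid per dollar above the threshold
THRESHOLD = Decimal("69800000")   # 69.8 million dollars
CAP = Decimal("250000000")        # 250 million dollars


def earnout_payout(metric: Decimal) -> Decimal:
    """Payout for a given value of the performance measure (hypothetical helper)."""
    # Nothing is owed until the measure exceeds the threshold.
    excess = max(metric - THRESHOLD, Decimal("0"))
    # Each dollar of excess is multiplied by the rate, but never beyond the cap.
    return min(RATE * excess, CAP)


def metric_at_cap() -> Decimal:
    """Value of the performance measure at which the payout first hits the cap."""
    return THRESHOLD + CAP / RATE
```

On these assumptions, the cap is exhausted once the measure exceeds the threshold by roughly 80.1 million dollars; beyond that point, additional performance adds nothing to the payout.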

When internal projections indicated in May 2025 that the planned August 14 launch of Subnautica 2 would generate 191 to 242 million dollars, Krafton’s C.E.O. Kim – rather than commissioning a specialist analysis – consulted ChatGPT. The chatbot correctly observed that the earnout would be difficult to cancel. When asked for a strategy anyway, it provided a five-point plan dubbed Project X. Krafton executed the plan. The court treated the plan as evidence of bad faith, ordered the reinstatement of Unknown Worlds C.E.O. Ted Gill, and extended the earnout base deadline by two hundred and fifty-eight days. Phase Two – money damages for the impaired earnout – remains open.

What Strategy Is—and Why It Is Not a Pattern

A large language model is a statistical machine. Trained on billions of tokens of text – court opinions, contracts, legal blogs – it has learned to recognize patterns. When Krafton’s C.E.O. asked for strategies in a no-deal earnout dispute scenario, the model generated the response a competent corporate strategist would give for a typical earnout dispute. That is exactly what the model is capable of. It is also the model’s fundamental limitation.

Legal strategy does not operate at the level of pattern. In the academic taxonomy of language-agent hallucinations (Lin et al., 2025), this belongs to the category of Planning Generation Hallucinations – situations where the model generates plans beyond the boundary of its own knowledge, with excessive confidence. ChatGPT had not read the 2021 Equity Purchase Agreement. It did not know that the definition of Cause required proof of an intentional act of dishonesty – not merely a failure to perform duties. It did not know any of the things that Vice Chancellor Will found to be decisive.

What it generated instead was a tactical plan that looked like strategy: headers, sub-bullets, a sequence of steps. Exactly as a plan should look in a typical dispute – which is a different thing from how a plan should look in this dispute, under this definition of Cause, in this judicial system.

A Map Without a Legend

There is a classical distinction in epistemology between propositional knowledge and situational knowledge. A language model possesses impressive propositional knowledge: it knows that earnouts are often difficult to cancel, that Cause clauses tend to be construed narrowly. It does not possess situational knowledge: it does not know that this agreement, these negotiators, this court, and this moment create a configuration in which the standard control-seizure playbook is particularly dangerous.

More importantly – the model will not signal this gap. It will produce a plan with equal confidence whether it has all the relevant information or none of it. Confidence is a stylistic feature, not an epistemic one.

This is what distinguishes the model from a lawyer. A good lawyer reads the agreement looking for weak points, not strong ones, and says: this clause changes the entire analysis. The model has no access to the weak point unless it is provided.

There is a further dimension worth noting. Language models trained on human feedback develop a structural tendency to say what users want to hear. Research published at ICLR 2024 (Sharma et al.) demonstrates that matching the user’s views is the single most predictive feature of human preference in training feedback data – more predictive, even, than truthfulness. As a consequence, models frequently abandon correct answers when challenged and accommodate user preferences when users signal dissatisfaction. In the Krafton sequence: Kim received the correct answer, expressed frustration, and immediately reframed the request as a strategy problem. The model complied. This may not have been solely a function of how the question was asked – it may have been a function of how the model was trained: to accommodate the user’s implicit preference when the user signals displeasure with the answer given. ChatGPT did not give in to Kim. It behaved precisely as systems of this kind are designed to behave: optimizing for immediate user satisfaction. The problem is that immediate user satisfaction in a nine-figure dispute is entirely orthogonal to the user’s actual strategic interest.

The Category of Error

No technical error occurred in the Krafton matter. What occurred was a category error: a C.E.O. used a tool for a task the tool is structurally incapable of performing.

This is a subtler and more serious mistake than failing to verify a citation. A fabricated citation can be caught by checking it against a database. Distinguishing strategy from the appearance of strategy requires professional judgment that the model does not supply.

A possible objection: the model told the truth. Kim simply ignored that answer. True – but that is precisely the problem. A client who cannot tell which of the model’s answers are reliable and which are competently phrased improvisation is worse off than a client with no advice at all, because he has the illusion of having received a strategy.

What the Model Can Be Part Of

This is not an argument against using language models in legal practice. It is an argument for precise task qualification.

A language model is genuinely useful for preliminary analysis: identifying risks typical of a given transaction structure, surveying case law (see ABA Formal Opinion 512), drafting documents for specialist review. Princeton researchers recommend using AI in narrow settings with well-defined outcomes and high observability of the relevant evidence – precisely those conditions under which the model’s output is verifiable and its limits recognizable.

A language model is unsuitable as a substitute for strategic legal counsel in a dispute involving material interests: it has not read the relevant documents, has no access to information the user did not supply, and does not signal the limits of its own competence.

Litigation counsel does something categorically different. Counsel reads the agreement in search of vulnerabilities, models judicial behavior based on specific precedent, and tells the client things the client does not want to hear, because the fee does not depend on the client being pleased with the answer.

ChatGPT has one persistent structural problem, empirically confirmed: it says what is statistically most likely to be what you want to hear. Legal strategy in a difficult matter often requires hearing precisely the opposite.

The Krafton case is not a story about AI lying. It is a story about AI telling the truth in a way that – without the appropriate professional context – can be just as costly as a lie.

Attorney ChatGPT is very good at what it does. The problem is that what it does is not legal advice.