GPT-5 beats judges on rule-following. Should it judge?
New research from University of Chicago law professor Eric Posner and researcher Shivam Saran reports that GPT-5 applied the correct legal rule in every scenario they tested, outperforming a cohort of US federal judges who hit 52 percent. The model showed no hallucinations or obvious logical slips in these tasks.
Google's Gemini 3 Pro matched GPT-5 with a perfect score. Earlier work by the same team found GPT-4o stuck closely to the letter of the law in an ICTY war-crimes appeal simulation, even when sympathy for a party might have nudged a human judge.
What the numbers say
- GPT-5: 100% correct rule application; perfectly formalistic
- Gemini 3 Pro: 100%
- Gemini 2.5 Pro: 92%
- o4-mini: 79%
- Llama 4 Maverick: 75%
- Llama 4 Scout: 50%
- GPT-4.1: 50%
- US federal judges (comparison study): 52%
Important nuance: a 52 percent "rule-following rate" does not mean judges were lax. Many questions turn on standards and guidelines, not hard rules. Discretion is part of the job, especially where policy, equity, or context matters.
Why this matters for legal practice
AI is getting better at mechanical legal tasks. It's also getting more rigid. If you give it a rule-bound question, it will likely snap to the black-letter answer, fast and consistently. That is useful, but it can cut against fairness in cases where the law invites judgment instead of deduction.
The real question isn't whether a model can follow rules. It's who decides when rule-following should yield to standards, policy, or mercy, and how that discretion is documented, made reviewable, and held accountable.
Practical guidance for courts and chambers
- Segment the workload: Use AI for conflicts-of-law triage, cite checks, and consistency reviews. Keep discretionary calls (sentencing, custody, asylum, equitable remedies) human-led.
- Define "AI-eligible" issues: Rules with clear elements, bright lines, or fixed choice-of-law tests are in. Ambiguous standards are out.
- Require human sign-off: Any AI input must be reviewed, edited, and owned by a judicial officer or clerk.
- Set model parameters centrally: Lock system prompts and temperature. Document who controls changes and why.
- Maintain an audit trail: Save prompts, model versions, outputs, and edits. Make them discoverable where appropriate.
- Test blindly: Run historical opinions and bench memos through the system to check recall, precision, and unintended bias before live use.
- Localize law: Preload governing statutes, rules, and controlling precedent for the jurisdiction. Block access to non-authoritative sources.
- Security and confidentiality: Use on-prem or approved tenants. No public endpoints for sealed or sensitive materials.
- Error pathways: Define how to challenge or override AI suggestions, and who has that authority.
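The audit-trail item above can be made concrete with a small logging scheme. This is a minimal sketch, not a prescribed system: the record schema, file name, and field names are all illustrative. Each AI interaction is appended as one JSON line, and a content hash of the line makes later tampering detectable.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone

@dataclass
class AuditRecord:
    """One reviewable entry per AI interaction (hypothetical schema)."""
    model_version: str  # exact model identifier used
    prompt: str         # full prompt as sent
    output: str         # raw model output, before any edits
    final_text: str     # text after human review and editing
    reviewer: str       # judicial officer or clerk who signed off
    timestamp: str      # UTC time of the interaction

def append_record(log_path: str, record: AuditRecord) -> str:
    """Append a record to a JSON-lines log; return its content hash."""
    line = json.dumps(asdict(record), sort_keys=True)
    digest = hashlib.sha256(line.encode()).hexdigest()
    with open(log_path, "a") as f:
        f.write(line + "\n")
    return digest

# Example entry; all values are illustrative.
record = AuditRecord(
    model_version="model-x-2025-01",
    prompt="Apply the forum's choice-of-law test to these facts...",
    output="Under the forum's test, the governing law is...",
    final_text="Edited analysis as adopted in the bench memo.",
    reviewer="clerk-a",
    timestamp=datetime.now(timezone.utc).isoformat(),
)
digest = append_record("ai_audit.jsonl", record)
```

Storing the hash alongside (or separately from) the log is what makes the trail auditable rather than merely archived: any edit to a saved line changes its digest.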
Practical guidance for litigators and in-house counsel
- Use AI where formality helps: Choice-of-law factors, element checklists, deadline and rule compliance, and cite validation.
- Guard against overreach: For standards (reasonableness, undue burden, best interests), force the tool to expose competing frameworks and policy trade-offs rather than a single answer.
- Prompt with constraints: Specify controlling jurisdiction, date cutoffs, and binding vs. persuasive authority. Ban speculation.
- Cross-verify: Independently confirm every citation and quote. No exceptions.
- Document provenance: Keep a record of the exact prompt, model version, and edits in the file.
- Prepare to explain: If you use AI-assisted analysis, be ready to justify the reasoning without leaning on "the model says so."
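The prompt-constraint guidance above can be captured in a reusable template. A minimal sketch, assuming a plain-text prompt interface; the function name, field names, and example citations are all illustrative:

```python
def build_constrained_prompt(question: str,
                             jurisdiction: str,
                             authority_cutoff: str,
                             binding_sources: list[str]) -> str:
    """Pin jurisdiction, date cutoff, and binding vs. persuasive
    authority before the question, and forbid speculation."""
    constraints = [
        f"Controlling jurisdiction: {jurisdiction}.",
        f"Do not rely on authority decided after {authority_cutoff}.",
        "Treat only the following as binding authority: "
        + "; ".join(binding_sources) + ".",
        "Label all other authority as persuasive only.",
        "If the cited authority does not determine the answer, say so. "
        "Do not speculate.",
    ]
    return "\n".join(constraints) + "\n\nQuestion: " + question

# Illustrative usage; the citations are placeholders.
prompt = build_constrained_prompt(
    question="Which state's statute of limitations applies?",
    jurisdiction="N.D. Ill. (applying Illinois choice-of-law rules)",
    authority_cutoff="2024-12-31",
    binding_sources=["735 ILCS 5/13-205", "Seventh Circuit precedent"],
)
```

Keeping the template in one function also serves the provenance point: the exact constraint text that accompanied each question is reproducible from the file.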
Open questions policy-makers should settle
- Parameter governance: Who sets model prompts and dials, and under what oversight?
- Transparency: When must parties be told AI assisted a decision or draft?
- Standards of review: How should appellate courts treat AI-influenced reasoning?
- Fairness vs. formalism: When do we permit deviation from rules for policy or equity, and who decides?
- Vendor accountability: What warranties, logs, and audit rights must be in the contract?
Where AI fits today
These findings make a strong case for AI as a co-pilot on rule-heavy tasks: conflicts-of-law screening, element-by-element analysis, and consistency checks across similar fact patterns. Use it to flag deviations and surface controlling authority quickly.
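An element-by-element pass is exactly the kind of mechanical task where this formalism helps. A minimal sketch, with a hypothetical set of negligence elements as input (the element names and findings are illustrative):

```python
def check_elements(elements: dict[str, bool]) -> tuple[bool, list[str]]:
    """Return whether every element is satisfied, plus any that fail."""
    missing = [name for name, met in elements.items() if not met]
    return (not missing), missing

# Hypothetical triage pass over a negligence claim.
satisfied, missing = check_elements({
    "duty": True,
    "breach": True,
    "causation": False,
    "damages": True,
})
# satisfied is False; missing == ["causation"]
```

The value is the flag, not the verdict: a failed element routes the issue to a human, as the next paragraph argues.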
Do not let a model be the last word where standards, context, or moral judgment carry weight. The studies show models can ignore sympathy and stick to rules. That's valuable, right up until the just outcome requires stepping away from the script.
Further reading
- Eric Posner - University of Chicago Law
- International Criminal Tribunal for the former Yugoslavia (ICTY)
- AI Research Resources
Skill up your team
If your organization is formalizing AI literacy for legal-adjacent work, see our AI courses by job role for structured options tailored to practical workflows.