Automatic, Not Autopilot: GPT-5's Promise, Misfires, and Legal AI Governance
GPT-5 improves speed and reasoning but reduces manual control and still stumbles. Legal teams should enforce governance: pin models, test outputs, keep human review.

The AI Law Professor: When the new AI model disappoints
GPT-5's rollout is a reminder: as AI gets more autonomous and changes how we work with it, legal teams need principled governance, not just experiments. The promise is big; the practice is where risk lives.
Key takeaways
- GPT-5 will make a difference. It's a unified system that routes your prompt to different reasoning modes. Performance improves, but you give up some manual control you've grown to trust.
- Errors persist. Benchmarks looked strong, yet public misfires (like error-filled maps) show limits that matter in real work.
- Disillusionment is real. Gartner's Hype Cycle places GenAI on the slide into the trough of disillusionment, which makes governance and expectation-setting essential for legal teams.
Automatic transmission, less stick-shift
If you learned on a manual transmission, you remember the feeling of control. GPT-5 feels like switching to automatic. It's faster and, in many contexts, smarter, but it decides when to "shift."
OpenAI describes GPT-5 as one system with a router that picks between fast and deeper "thinking" modes. Microsoft echoed that framing across Copilot. For early-adopter lawyers, that's both helpful and irritating. It smooths routine work, but it moves a slice of craft from your hands to the platform.
Expectations meet a colder reality
On launch day, we heard that GPT-5 is like having a team of PhD-level experts in your pocket. Then the internet lit up with flubs: invented state names on a U.S. map, scrambled presidential timelines in graphics. If you promise doctorates, basic geography errors feel like malpractice.
Part of this is on us. We amplify great demos, project personality onto models, and then confuse style with reliability. Yes, GPT-5 shows fewer hallucinations and stronger scores in math, coding, and multimodal tasks. But no benchmark guarantees competence on every odd, mixed-media request you throw at it.
Control, relationship, capability
Many lawyers didn't just use prior models; they built relationships with them. When GPT-5 launched, OpenAI retired several choices in the model picker and auto-mapped old threads to new equivalents. That broke habits and felt personal for paying users who relied on each model's tone and tempo. After pushback, some options returned, proof that the complaint was about more than UI.
Capability isn't uniform. GPT-5's gains are real, and the brittleness at the edges is real too. "Best average" isn't "best for your task." Think comparative advantage, not universal claims.
The market is cooling, and governance matters more
Gartner's recent analysis places GenAI on the slide into disillusionment. Many 2024 initiatives under-delivered, and fewer than a third of AI leaders say their CEOs are happy with returns. GPT-5 landed right in the middle of that mood. In this climate, one over-promised demo or clumsy deprecation can overshadow a long list of real improvements.
For practicing lawyers, the lesson is simple. Expect steady progress, not magic. Expect shorter change windows. Expect platform shifts that force you to adapt. Your governance has to flex without compromising duties to clients and courts.
What legal teams can do now
- Select specific versions of AI models. If you need consistent behavior, insist on model pinning, announced change windows, and rollback. You may need the OpenAI API instead of consumer ChatGPT; a minimal pinning sketch follows this list. If you use ChatGPT Business or Enterprise, learn the legacy model access policy and set internal migration timelines. These details can mean the difference between sound analysis and slop.
- Test like you bill. Keep a simple checklist of the tasks you give the tool: e-discovery summaries, brief outlines, citation checks, transcript cleanups, RFP drafts. Define what "good" looks like, then score accuracy, completeness, and consistency, not how persuasive the tone feels (see the scorecard sketch after this list).
- Separate tone from truth. A model that feels right isn't necessarily more reliable. A blunter model may surface uncertainty more honestly. Treat tone as a configuration issue, not a proxy for validity.
- Keep human control visible. Anchor to four principles: transparency, autonomy, reliability, visibility. The router may help with autonomy and sometimes reliability; you must enforce transparency and visibility. Log prompts and outputs, add review points, keep humans in the loop, and make the model's blind spots explicit to the supervising lawyer (a minimal logging sketch appears after this list).
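
To make the pinning point concrete, here is a minimal sketch using the OpenAI Python SDK. The dated snapshot name is illustrative only; substitute whatever snapshot appears in your vendor's current model list.

```python
# Minimal sketch: pin a dated model snapshot instead of a floating alias.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Illustrative snapshot name; check your vendor's current model list.
PINNED_MODEL = "gpt-4o-2024-08-06"

response = client.chat.completions.create(
    model=PINNED_MODEL,  # a dated snapshot behaves consistently until deprecated
    temperature=0,       # reduce run-to-run variation for repeatable testing
    messages=[
        {"role": "system", "content": "You are a careful legal research assistant."},
        {"role": "user", "content": "Summarize the attached deposition excerpt."},
    ],
)
print(response.choices[0].message.content)
```

Pinning a dated snapshot, plus a temperature of zero, gives you stable behavior to test against; when the vendor announces a deprecation, you migrate on your schedule rather than the platform's.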
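For the testing checklist, a sketch of a task-level scorecard in the same spirit. The task names, 0-5 scale, and passing floor are assumptions; define them for your own practice group.

```python
# Sketch: a task-level scorecard for recurring AI tasks.
# Task names and criteria are placeholders; define "good" for your own practice.
from dataclasses import dataclass

@dataclass
class TaskResult:
    task: str          # e.g. "citation check", "e-discovery summary"
    accuracy: int      # 0-5: factual and legal correctness
    completeness: int  # 0-5: nothing material omitted
    consistency: int   # 0-5: stable across reruns of the same prompt
    notes: str = ""

def passes(result: TaskResult, floor: int = 4) -> bool:
    """A result passes only if every dimension meets the floor; tone is not scored."""
    return min(result.accuracy, result.completeness, result.consistency) >= floor

results = [
    TaskResult("citation check", accuracy=5, completeness=4, consistency=4),
    TaskResult("brief outline", accuracy=3, completeness=5, consistency=4,
               notes="missed a controlling case"),
]
for r in results:
    print(f"{r.task}: {'PASS' if passes(r) else 'REVIEW'} {r.notes}")
```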
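And for visible logging, a minimal append-only audit-log sketch. The file name and record fields are assumptions, not a prescribed schema; adapt them to your firm's retention policy.

```python
# Sketch: append-only JSONL audit log for prompts, outputs, and sign-off.
# File name and fields are assumptions; adapt to your retention policy.
import json
from datetime import datetime, timezone
from pathlib import Path

LOG_PATH = Path("ai_audit_log.jsonl")

def log_interaction(model: str, prompt: str, output: str,
                    reviewed_by: str | None = None) -> None:
    """Record one model interaction; reviewed_by stays None until a lawyer signs off."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model": model,
        "prompt": prompt,
        "output": output,
        "reviewed_by": reviewed_by,
    }
    with LOG_PATH.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")

log_interaction("gpt-5", "Outline arguments for the motion to dismiss.",
                "(model output here)")
```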
Right-sizing expectations
Was GPT-5 overhyped? Probably. Are we complicit? Also yes. We want one model to be perfect writer, paralegal, researcher, designer, and cartographer. We treat the best average performer as a sure thing everywhere, then feel betrayed when it stumbles.
Adopt a stance of modesty plus rigor. Take GPT-5's real gains: better reasoning modes and fewer hallucinations on many prompts. Keep manual control where it matters. Don't let a router, however clever, become a hidden change agent inside your practice. If GPT-5 is the automatic transmission, keep your hand near the gearshift. Know when to let it shift, and when to downshift yourself.
Next column, we'll examine how to build an AI governance policy that actually works.