AI Is Moving Faster Than Legal Governance. That's a Discovery Problem Waiting to Happen.
Legal teams are adopting large language models across review, search, and drafting. The speed is impressive. The governance isn't keeping up.
As FTI managing director Tom Barce warns, "overreliance on AI is going to cause a discovery problem. It is only a matter of time before someone blames a large language model for a privilege waiver or inadvertent production of highly confidential information."
Why this matters right now
LLMs make judgment calls on ambiguous text and metadata. That can shift relevance, privilege, and confidentiality decisions without a clear audit trail.
If you can't explain how a tool tagged, redacted, translated, or excluded a document, expect pressure in meet-and-confers, 30(b)(6) depositions, and sanctions motions.
Where discovery goes off the rails
- Prompt leakage: Sensitive facts or names pasted into prompts end up in vendor logs or model telemetry.
- Hallucinated tagging: AI assigns privilege or responsiveness based on weak signals, and no one catches it at scale.
- Auto-translation risk: Meaning shifts; privilege intent gets muddied.
- Near-duplicate errors: AI clusters misgroup variants; unique, responsive docs are missed.
- Redaction misses: Overlay or vector-only redactions leave the underlying text and pixel data recoverable, and OCR quirks let search-and-redact passes skip hits.
- Non-reproducibility: Different model runs produce different results; your certification under Rule 26(g) looks shaky.
- Cache contamination: Legal hold applies, but the model still retrieves or suggests content from quarantined data.
What courts will care about
- Reasonableness and certification: Your signature still certifies the process under FRCP 26(g).
- Preservation: If AI alters, overwrites, or prunes ESI, you're exposed under FRCP 37(e).
- Auditability: Can you show inputs, outputs, settings, versions, and human checks? If not, expect skepticism.
A practical AI governance checklist for legal
- Policy and risk tiers: Define approved use cases. High-risk tasks (privilege, redaction) require human review and sign-off.
- Data minimization: Ban sensitive details in prompts. Maintain a do-not-prompt list and prebuilt prompt templates.
- Privacy and security: Enforce zero data retention with vendors. Require "no training on your data," SOC 2 Type II, and ISO 27001.
- Reproducibility: Pin model versions. Log prompts, system instructions, seeds, temperature, and datasets. Keep raw outputs (a logging sketch follows this checklist).
- Human-in-the-loop: Two-tier sampling and QC. Set confidence thresholds and escalation paths for low-confidence classifications.
- Privilege controls: Maintain live do-not-produce and do-not-contact lists. Run privilege models as aides, never final arbiters.
- Redaction workflow: Use burn-in/pixel redactions. Re-OCR and verify. Block release until QA passes (see the QA sweep sketched after this checklist).
- Vendor diligence: Review model cards, eval reports, failure modes, and breach history. Contract for logging access and onsite audits.
- Legal holds and caches: Disable retrieval from held sources. Purge model caches on hold issuance and release.
- Incident response: Tabletop an inadvertent production. Pre-negotiate FRE 502(d) orders and clawback procedures.
- Competence and ethics: Train lawyers and staff on AI limits, confidentiality, and oversight, consistent with the duty of competence under ABA Model Rule 1.1 and the supervisory duties under Rules 5.1 and 5.3.
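To make the reproducibility item concrete, here is a minimal sketch of a per-call audit record, assuming a plain JSONL log. Every field name is illustrative rather than a standard, and raw prompts or outputs containing sensitive content should live in secure storage with only hashes kept in the log.

```python
import hashlib
import json
from datetime import datetime, timezone

def log_ai_decision(log_path, *, doc_id, task, model, model_version,
                    prompt_template, temperature, seed, raw_output,
                    classification, confidence, reviewer=None):
    """Append one audit record per AI call so results can be reproduced and defended.

    Field names are illustrative; store hashes here and keep full prompts/outputs
    in access-controlled storage if they contain sensitive content.
    """
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "doc_id": doc_id,
        "task": task,                      # e.g. "responsiveness", "privilege", "translation"
        "model": model,
        "model_version": model_version,    # pin and record the exact version used
        "prompt_sha256": hashlib.sha256(prompt_template.encode()).hexdigest(),
        "temperature": temperature,
        "seed": seed,
        "output_sha256": hashlib.sha256(raw_output.encode()).hexdigest(),
        "classification": classification,
        "confidence": confidence,
        "human_reviewer": reviewer,        # filled in when QC sign-off happens
    }
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
```

A log like this is what lets you answer "show me the inputs, settings, and human checks" without reconstructing the run from memory.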
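For the redaction workflow item, a rough pre-release QA sweep that re-extracts text from the burned-in PDFs and flags any do-not-produce term that is still findable. It assumes the PyMuPDF library (`fitz`) for extraction; the term list and file paths are placeholders to wire into your own production set and do-not-produce list.

```python
import sys
import fitz  # PyMuPDF: pip install pymupdf

def redaction_qa(pdf_paths, blocked_terms):
    """Return (path, page_number, term) hits that should block release."""
    hits = []
    for path in pdf_paths:
        with fitz.open(path) as doc:
            for page in doc:
                text = page.get_text().lower()  # if text survives extraction, the redaction failed
                for term in blocked_terms:
                    if term.lower() in text:
                        hits.append((path, page.number + 1, term))
    return hits

if __name__ == "__main__":
    # Placeholder inputs: PDF paths from the command line, illustrative blocked terms.
    findings = redaction_qa(sys.argv[1:], blocked_terms=["acme privileged", "attorney-client"])
    for path, page_no, term in findings:
        print(f"BLOCK RELEASE: '{term}' still extractable in {path}, page {page_no}")
```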
ESI protocol updates worth adopting
- AI usage disclosure: Identify tools, versions, and the tasks they were used for (e.g., first-pass review, translation, clustering).
- Logging and access: Commit to preserving prompts, outputs, and decision logs, subject to privilege and work product.
- Redaction standards: Specify burn-in methods, QA rates, and handling of layered/PDF-OCR issues.
- Re-review triggers: Define recall thresholds and sampling rates that auto-trigger human re-review.
- 502(d) protection: Codify clawback with no subject-matter waiver for inadvertent production.
Start small, measure, expand
Pilot narrow, high-volume tasks: triage for responsiveness, translation with human spot checks, drafting privilege log templates from attorney notes.
Track precision/recall, error classes, and time-to-resolution. Use those metrics to tune prompts, thresholds, and staffing before scaling.
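As a minimal sketch of that metrics loop, the snippet below computes precision and recall from a human-verified QC sample and flags when recall falls below a re-review trigger. The sample format and the 0.80 floor are assumptions to adjust per matter, not fixed standards.

```python
def review_metrics(qc_sample, recall_floor=0.80):
    """qc_sample: iterable of (ai_label, human_label) booleans for 'responsive'.

    Returns precision, recall, and whether results fall below the re-review trigger.
    """
    tp = sum(1 for ai, human in qc_sample if ai and human)
    fp = sum(1 for ai, human in qc_sample if ai and not human)
    fn = sum(1 for ai, human in qc_sample if not ai and human)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "trigger_rereview": recall < recall_floor,  # feeds the ESI protocol re-review trigger
    }

# Example: 3 of 4 AI-positive calls were right, and 1 responsive doc was missed.
print(review_metrics([(True, True), (True, True), (True, False), (False, True), (True, True)]))
```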
Bottom line
LLMs can accelerate review and reduce cost. They also introduce silent failure modes that land you in motion practice.
Treat the model like a zealous junior associate who works fast and makes confident mistakes. You're still responsible for the signature and the outcome, and Barce's warning about privilege and inadvertent production should be your cue to tighten governance now.
Upskill your team (optional resources)
If you're building skills and guardrails for AI in legal workflows, see role-based learning paths here: Courses by Job and a curated feed of new programs here: Latest AI Courses.