Surrey-built AI cuts Supreme Court transcription errors and links judgments to video

Surrey researchers built an AI that transcribes UK Supreme Court hearings and links judgment paragraphs to exact video moments. Training on 139 hours of hearings and legal documents cut transcription errors by up to 9% compared with leading commercial tools.

Categorized in: AI News, Science and Research
Published on: Sep 21, 2025

Researchers at the University of Surrey have built an AI tool that transcribes UK Supreme Court hearings and links sections of written judgments to the exact moments those arguments appear in video. The study reports up to a 9% reduction in transcription errors compared with leading commercial tools after training on 139 hours of hearings and legal documents.

"Our courts deal with some of the most important questions in society. Yet the way we record and access those hearings is stuck in the past," said Prof Constantin Orăsan, co-author of the study. "By tailoring AI to the unique language of British courtrooms, we've built a tool that makes justice more transparent and accessible."

What the team built

  • A domain-adapted speech recognition system trained on 139 hours of Supreme Court audio and legal text.
  • Measured accuracy improvements with up to a 9% reduction in transcription errors versus leading commercial systems.
  • A semantic linking tool that matches paragraphs in written judgments to precise timestamps in hearing videos (see the sketch after this list).
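
To make the linking step concrete, here is a minimal sketch of paragraph-to-timestamp matching built on off-the-shelf sentence embeddings. The model name, paragraphs, and segments are illustrative assumptions; the article does not publish the study's actual pipeline.

```python
# A minimal sketch of semantic linking: match judgment paragraphs to
# timestamped transcript segments by embedding similarity.
# Assumptions: a generic encoder ("all-MiniLM-L6-v2") and invented text;
# the study's own model and data are not public here.
from sentence_transformers import SentenceTransformer, util

judgment_paragraphs = [
    "The appellant argued that the statutory duty was engaged at the point of referral.",
    "We reject the submission that the tribunal lacked jurisdiction.",
]
transcript_segments = [
    {"start_s": 412.0, "text": "My Lady, the duty arises at referral, not at assessment."},
    {"start_s": 1870.5, "text": "On jurisdiction, the tribunal was plainly seized of the matter."},
]

model = SentenceTransformer("all-MiniLM-L6-v2")
para_emb = model.encode(judgment_paragraphs, convert_to_tensor=True)
seg_emb = model.encode([s["text"] for s in transcript_segments], convert_to_tensor=True)

# Cosine similarity between every paragraph and every segment.
scores = util.cos_sim(para_emb, seg_emb)

for i, paragraph in enumerate(judgment_paragraphs):
    best = int(scores[i].argmax())
    start = transcript_segments[best]["start_s"]
    print(f"Paragraph {i + 1} -> video at {start:.1f}s")
```

A production system would likely use a legal-domain encoder and finer-grained alignment, but nearest-neighbour search over embeddings is the core pattern behind paragraph-to-timestamp links.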

Why this matters for science and research

  • Reproducibility: Timestamped links between claims in judgments and the source video create an auditable trail.
  • Search and retrieval: Domain adaptation improves recall for rare legal terms, case citations, and named entities (a citation-matching sketch follows this list).
  • Workflow efficiency: Faster review for clerks, researchers, and archivists through accurate transcripts and direct video jumps.
  • Public access: Clearer records strengthen transparency for media, practitioners, and citizens.
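
As a flavour of what improved recall on citations enables downstream, the sketch below extracts UK Supreme Court neutral citations from a transcript with a simple pattern. The regex and sample text are illustrative assumptions, not part of the study.

```python
# A minimal sketch of citation spotting in transcripts, assuming UK neutral
# citations of the form "[2023] UKSC 14". Pattern and text are illustrative.
import re

NEUTRAL_CITATION = re.compile(r"\[(\d{4})\]\s+UKSC\s+(\d+)")

transcript = (
    "As this Court held in [2023] UKSC 14, the test is objective. "
    "Counsel also cited [2019] UKSC 41 on proportionality."
)

for match in NEUTRAL_CITATION.finditer(transcript):
    year, number = match.groups()
    print(f"citation [{year}] UKSC {number} at character {match.start()}")
```

Matching like this only works if the recognizer transcribes the citation correctly in the first place, which is exactly where domain adaptation earns its keep.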

Method signals

  • Domain adaptation: Training on hearings and legal documents helps the model parse courtroom cadence, accents, and specialized vocabulary.
  • Error analysis: Reporting head-to-head gains versus commercial baselines points to practical performance, not just benchmarks (a WER sketch follows this list).
  • Semantic linking: Paragraph-to-timestamp matching supports evidence tracking, context checks, and rapid fact verification.
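
Head-to-head comparisons like the one described above are typically scored with word error rate (WER). The sketch below shows the shape of that calculation using the jiwer library; the sentences, numbers, and library choice are assumptions for illustration, not the study's evaluation code.

```python
# A minimal sketch of a WER-based error analysis. Toy sentences: the
# baseline makes two word errors, the adapted model makes one.
from jiwer import wer

reference = "the court grants permission to appeal on ground two only"
baseline_hyp = "the court grants commission to appeal on round two only"
adapted_hyp = "the court grants permission to appeal on round two only"

wer_baseline = wer(reference, baseline_hyp)  # 2 substitutions / 10 words
wer_adapted = wer(reference, adapted_hyp)    # 1 substitution / 10 words

# Relative error reduction: the shape of the "up to 9%" claim.
reduction = (wer_baseline - wer_adapted) / wer_baseline
print(f"baseline WER={wer_baseline:.2f}, adapted WER={wer_adapted:.2f}, "
      f"relative reduction={reduction:.0%}")
```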

Adoption and next steps

The project is drawing interest from the UK Supreme Court and The National Archives. For institutions, the near-term value is clear: faster public release of accurate transcripts, richer archives, and better tools for legal research.

Considerations before deployment

  • Data governance: Ensure clear policies on sensitive audio, redactions, and retention.
  • Bias and coverage: Evaluate performance across accents, speaking rates, and courtroom audio conditions.
  • Human-in-the-loop: Pair automated drafts with expert review for final records.
  • Provenance: Keep versioned transcripts and linkages so citations remain stable over time (see the sketch after this list).
  • Accessibility: Provide exports, captions, and APIs for researchers and archives.
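
One lightweight way to satisfy the provenance point is content-addressed, versioned transcript records, sketched below. The field names and hashing scheme are illustrative assumptions; the article does not describe the project's own record format.

```python
# A minimal sketch of versioned transcript provenance: hash each draft so
# a citation can pin an exact version. All field names are illustrative.
import hashlib
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

@dataclass
class TranscriptVersion:
    case_id: str      # e.g. a neutral citation or docket number (illustrative)
    version: int
    text: str
    model: str        # which ASR model produced this draft
    created_at: str

    def digest(self) -> str:
        """Stable content hash, usable as a citation-proof identifier."""
        payload = json.dumps(asdict(self), sort_keys=True).encode()
        return hashlib.sha256(payload).hexdigest()

draft = TranscriptVersion(
    case_id="UKSC-2025-0042",
    version=1,
    text="My Lady, the duty arises at referral, not at assessment.",
    model="domain-adapted-asr-v1",
    created_at=datetime.now(timezone.utc).isoformat(),
)
print(draft.digest()[:16])  # cite this identifier, not a mutable file
```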

Bottom line

Specialized training for legal language, combined with semantic links back to video, moves court reporting from static records to verifiable, searchable evidence trails. That's a practical step toward more transparent justice and more efficient legal research.