AI Alone Not Ready for Emergency Department Triage, Study Finds

A study finds that ChatGPT 3.5 trails nurses and doctors in Manchester Triage System (MTS) triage and often over-triages. Use AI to flag critical cases under clinician oversight, not as a stand-alone tool.

Published on: Oct 02, 2025

Triaging patients: no job for AI (alone)

AI can support emergency triage, but it should not run the show. A study led by postdoctoral researcher Dr Renata Jukneviciene at Vilnius University found that ChatGPT 3.5 underperformed compared to nurses and doctors across most triage metrics using the Manchester Triage System (MTS). The signal is clear: use AI as an assistant, keep clinicians in control.

Why this study

Emergency departments face overcrowding and rising nursing workloads. The team set out to test whether a general-purpose AI could help clinicians triage faster and more consistently without sacrificing safety.

Methods at a glance

Six emergency physicians and 51 nurses from Vilnius University Hospital Santaros Klinikos triaged real clinical cases drawn from 110 PubMed reports. Each case was assigned to one of five MTS urgency levels. The same cases were assessed by ChatGPT 3.5.

Completion rates were high: 44 nurses (86.3%) and all 6 physicians (100%).

Headline results

On the overall measures, clinicians outperformed AI. Overall accuracy: AI 50.4% vs nurses 65.5% and doctors 70.6%. Sensitivity for urgent cases: AI 58.3% vs nurses 73.8% and doctors 83.0%.

Doctors led in every category. Notably, in the most urgent level (Level 1), AI showed higher accuracy and specificity than nurses: accuracy 27.3% vs 9.3%; specificity 27.8% vs 8.3%.

How each group assigned urgency

AI tended to over-triage, pushing more cases into higher urgency.

Doctors: Level 1 9% | Level 2 21% | Level 3 29% | Level 4 23% | Level 5 18%
Nurses: Level 1 9% | Level 2 15% | Level 3 35% | Level 4 35% | Level 5 6%
AI: Level 1 29% | Level 2 24% | Level 3 43% | Level 4 3% | Level 5 1%

This cautious bias can help flag critical cases but risks downstream inefficiency if unchecked.
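The size of that shift can be read directly off the reported distributions. The sketch below sums the share of cases each group placed in Levels 1 to 3 and computes a mean assigned level. The percentages are the rounded figures quoted above, and both summary measures are illustrative choices rather than metrics the study reports.

```python
# Reported share of cases (%) assigned to MTS Levels 1-5, as quoted above.
distribution = {
    "Doctors": [9, 21, 29, 23, 18],
    "Nurses": [9, 15, 35, 35, 6],
    "AI": [29, 24, 43, 3, 1],
}


def high_urgency_share(shares, top_levels=3):
    """Percentage of cases placed in the top `top_levels` urgency levels."""
    return sum(shares[:top_levels])


def mean_level(shares):
    """Mean assigned level, weighting each level (1-5) by its share."""
    weighted = sum(level * pct for level, pct in enumerate(shares, start=1))
    return weighted / sum(shares)


for group, shares in distribution.items():
    print(f"{group}: Levels 1-3 = {high_urgency_share(shares)}%, "
          f"mean level = {mean_level(shares):.2f}")
```

On these figures, AI places 96% of cases in Levels 1 to 3, against 59% for both doctors and nurses, and its mean assigned level is about 2.23 compared with 3.20 for doctors and 3.14 for nurses.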

Surgical vs therapeutic cases

Reliability in surgical cases: doctors 68.4%, nurses 63.0%, AI 39.5%. In therapeutic cases: doctors 65.9%, nurses 44.5%, AI 51.9%. AI outperformed nurses in therapeutic cases, but not physicians.

What this means for ED leaders

AI is not a stand-alone triage tool. It can help highlight red (most urgent) cases and support less experienced staff, yet it requires strong guardrails and clinician oversight to avoid over-triage and resource misallocation.

Practical steps before piloting AI triage

Define scope: start with alerting for highest-acuity presentations; keep final assignment with clinicians.
Embed human-in-the-loop review and clear escalation policies.
Track impact metrics: over-/under-triage rates, time-to-triage, patients who left without being seen (LWBS), time-to-provider, and downstream resource use.
Calibrate thresholds regularly; run parallel testing before go-live.
Train staff to critically interpret AI outputs and document overrides with reasons.
Audit for bias and performance drift; update models and workflows as needed.
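
As one way to picture the first three steps, the sketch below treats the AI output as a suggestion, keeps the clinician's level as final, and tallies over- and under-triage against it. All names here (`TriageRecord`, `needs_alert`, `audit`) are hypothetical, the records are made up, and alerting on a suggested Level 1 is an assumed policy rather than something the study prescribes.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class TriageRecord:
    case_id: str
    ai_level: int              # AI suggestion, MTS level 1 (most urgent) to 5
    clinician_level: int       # final assignment, always made by a clinician
    override_reason: str = ""  # documented when the clinician disagrees


def needs_alert(record: TriageRecord) -> bool:
    """Assumed policy: alert staff only when the AI suggests Level 1."""
    return record.ai_level == 1


def audit(records):
    """Rate at which AI suggestions over-, under-, or match the final level."""
    tally = Counter()
    for r in records:
        if r.ai_level < r.clinician_level:
            tally["over_triage"] += 1   # AI more urgent than the clinician
        elif r.ai_level > r.clinician_level:
            tally["under_triage"] += 1  # AI less urgent than the clinician
        else:
            tally["agree"] += 1
    keys = ("over_triage", "under_triage", "agree")
    return {k: tally[k] / len(records) for k in keys}


# Made-up records for illustration
records = [
    TriageRecord("a", 1, 1),
    TriageRecord("b", 1, 3, "stable vitals on reassessment"),
    TriageRecord("c", 3, 4, "minor complaint"),
    TriageRecord("d", 4, 3, "pain score higher than described"),
]
print([r.case_id for r in records if needs_alert(r)])
print(audit(records))
```

Using the clinician's final level as the comparison point keeps the audit aligned with the human-in-the-loop design: the AI is measured against the decision that was actually acted on, and the stored override reasons give reviewers context for each disagreement.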

Limitations and next steps

The study was single-center, used a small sample, and ran the AI offline, without real-time vitals, patient interaction, or follow-up. ChatGPT 3.5 was not trained for medical use. Strengths include real clinical cases, a mixed clinician cohort, an accessible study design, and a clear finding that AI over-triages, a key safety consideration for implementation.

Planned follow-ups will test newer, medically fine-tuned models, larger cohorts, ECG interpretation, and integration into nurse training and mass-casualty triage.

Independent expert view

According to Dr Barbra Backus, chair of the EUSEM abstract selection committee and an emergency physician in Amsterdam, AI is useful for tasks like image interpretation but cannot replace trained staff in ED triage. It may speed decision-making if used with caution and clinical oversight. Ongoing evaluation is essential as systems improve.

Bottom line

Use AI to assist, not replace, triage decisions. Expect a bias toward higher urgency categories, and design your workflow, training, and quality controls to manage it.

Learn more about the Manchester Triage System at the Manchester Triage Group and explore the evidence base on PubMed.
