AI Alone Not Ready for Emergency Department Triage, Study Finds
Study finds ChatGPT-3.5 trails nurses and doctors in MTS triage, often over-triaging. Use AI to flag critical cases with clinician oversight, not as a stand-alone tool.

Triaging patients: no job for AI (alone)
AI can support emergency triage, but it should not run the show. A study led by postdoctoral researcher Dr Renata Jukneviciene at Vilnius University found that ChatGPT 3.5 underperformed nurses and doctors on most triage metrics under the Manchester Triage System (MTS). The signal is clear: use AI as an assistant and keep clinicians in control.
Why this study
Emergency departments face overcrowding and rising nursing workloads. The team set out to test whether a general-purpose AI could help clinicians triage faster and more consistently without sacrificing safety.
Methods at a glance
Six emergency physicians and 51 nurses from Vilnius University Hospital Santaros Klinikos triaged real clinical cases drawn from 110 PubMed reports. Each case was assigned to one of five MTS urgency levels. The same cases were assessed by ChatGPT 3.5.
Completion rates were high: 44 of the 51 nurses (86.3%) and all 6 physicians (100%) completed the triage exercise.
Headline results
On the overall measures, clinicians outperformed AI. Overall accuracy: AI 50.4% vs nurses 65.5% and doctors 70.6%. Sensitivity for urgent cases: AI 58.3% vs nurses 73.8% and doctors 83.0%.
Doctors led in every category. Notably, in the most urgent level (Level 1), AI showed higher accuracy and specificity than nurses: accuracy 27.3% vs 9.3%; specificity 27.8% vs 8.3%.
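For readers who want to see how per-level figures like these are typically derived, here is a minimal Python sketch. It uses the standard one-vs-rest definitions of accuracy, sensitivity, and specificity; the article does not state the study's exact definitions, so they may differ. The function names and the toy data are hypothetical and are not taken from the study.

```python
def one_vs_rest_metrics(reference, assigned, level):
    """Score one MTS level against all other levels.

    Returns (accuracy, sensitivity, specificity). A rate is None when
    its denominator is zero.
    """
    pairs = list(zip(reference, assigned))
    tp = sum(1 for ref, got in pairs if ref == level and got == level)
    fn = sum(1 for ref, got in pairs if ref == level and got != level)
    fp = sum(1 for ref, got in pairs if ref != level and got == level)
    tn = len(pairs) - tp - fn - fp
    accuracy = (tp + tn) / len(pairs) if pairs else None
    sensitivity = tp / (tp + fn) if (tp + fn) else None
    specificity = tn / (tn + fp) if (tn + fp) else None
    return accuracy, sensitivity, specificity


def overall_accuracy(reference, assigned):
    """Share of cases where the assigned level equals the reference level."""
    pairs = list(zip(reference, assigned))
    return sum(ref == got for ref, got in pairs) / len(pairs)


# Hypothetical toy data (1 = most urgent); NOT the study's cases.
reference = [1, 1, 2, 3, 3, 4, 5, 2]
assigned = [1, 2, 2, 3, 2, 3, 4, 1]

acc, sens, spec = one_vs_rest_metrics(reference, assigned, level=1)
overall = overall_accuracy(reference, assigned)
print(f"Level 1: accuracy={acc:.3f} sensitivity={sens:.3f} specificity={spec:.3f}")
print(f"Overall accuracy={overall:.3f}")
```

With only a handful of Level 1 cases, a single misassignment moves these rates a lot, which is one reason small samples call for cautious interpretation.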
How each group assigned urgency
AI tended to over-triage, pushing more cases into higher urgency.
- Doctors: Level 1 9% | Level 2 21% | Level 3 29% | Level 4 23% | Level 5 18%
- Nurses: Level 1 9% | Level 2 15% | Level 3 35% | Level 4 35% | Level 5 6%
- AI: Level 1 29% | Level 2 24% | Level 3 43% | Level 4 3% | Level 5 1%
This cautious bias can help flag critical cases but risks downstream inefficiency if unchecked.
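The skew is easier to see when the levels are grouped. The short Python sketch below sums the reported percentages for the two most urgent levels and the two least urgent levels. The grouping into "levels 1-2" and "levels 4-5" is an illustrative assumption, not a split used by the study.

```python
# Percent of cases assigned to each MTS level (index 0 = Level 1, most urgent),
# copied from the figures reported above.
distribution = {
    "doctors": [9, 21, 29, 23, 18],
    "nurses": [9, 15, 35, 35, 6],
    "ai": [29, 24, 43, 3, 1],
}


def share(levels_pct, levels):
    """Sum the percentages for the given 1-based MTS levels."""
    return sum(levels_pct[level - 1] for level in levels)


# Assumed grouping for illustration: levels 1-2 as "high", 4-5 as "low".
summary = {
    group: {"high_1_2": share(pct, (1, 2)), "low_4_5": share(pct, (4, 5))}
    for group, pct in distribution.items()
}

for group, s in summary.items():
    print(f"{group}: levels 1-2 = {s['high_1_2']}%, levels 4-5 = {s['low_4_5']}%")
# doctors: levels 1-2 = 30%, levels 4-5 = 41%
# nurses: levels 1-2 = 24%, levels 4-5 = 41%
# ai: levels 1-2 = 53%, levels 4-5 = 4%
```

On these figures, AI placed 53% of cases in the two most urgent levels, against 30% for doctors and 24% for nurses.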
Surgical vs therapeutic cases
Reliability in surgical cases: doctors 68.4%, nurses 63.0%, AI 39.5%. In therapeutic cases: doctors 65.9%, nurses 44.5%, AI 51.9%. AI outperformed nurses in therapeutic cases, but not physicians.
What this means for ED leaders
AI is not a stand-alone triage tool. It can help highlight red (immediate, Level 1) cases and support less experienced staff, yet it requires strong guardrails and clinician oversight to avoid over-triage and resource misallocation.
Practical steps before piloting AI triage
- Define scope: start with alerting for highest-acuity presentations; keep final assignment with clinicians.
- Embed human-in-the-loop review and clear escalation policies.
- Track impact metrics: over-/under-triage rates, time-to-triage, patients who left without being seen (LWBS), time-to-provider, and downstream resource use.
- Calibrate thresholds regularly; run parallel testing before go-live.
- Train staff to critically interpret AI outputs and document overrides with reasons.
- Audit for bias and performance drift; update models and workflows as needed.
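As a rough illustration of the tracking and override steps above, the Python sketch below compares an AI-suggested level with the clinician's final level during a parallel run. It treats the clinician's assignment as the reference, which is an assumption suited to a pilot and not a ground truth. The `TriageRecord` class, its field names, and the sample records are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TriageRecord:
    """One case from a hypothetical parallel run (1 = most urgent)."""
    ai_level: int
    clinician_level: int  # final assignment stays with the clinician
    override_reason: Optional[str] = None


def summarize(records):
    """Return over-/under-triage and agreement rates, plus undocumented overrides."""
    total = len(records)
    # A lower level number means higher urgency, so ai < clinician is over-triage.
    over = sum(r.ai_level < r.clinician_level for r in records)
    under = sum(r.ai_level > r.clinician_level for r in records)
    undocumented = sum(
        r.ai_level != r.clinician_level and not r.override_reason for r in records
    )
    return {
        "over_triage_rate": over / total,
        "under_triage_rate": under / total,
        "agreement_rate": (total - over - under) / total,
        "undocumented_overrides": undocumented,
    }


# Hypothetical sample records; not study data.
records = [
    TriageRecord(1, 2, "stable vitals on arrival"),
    TriageRecord(3, 3),
    TriageRecord(2, 4, "minor injury, ambulatory"),
    TriageRecord(4, 3, "relevant cardiac history"),
    TriageRecord(3, 3),
    TriageRecord(2, 3),  # override with no documented reason
]

report = summarize(records)
print(report)
```

Reviewing the undocumented-override count alongside the rates gives a simple check that staff are recording why they disagreed with the tool.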
Limitations and next steps
The study was single-center with a small sample, and the AI was used offline, without real-time vitals, patient interaction, or follow-up. ChatGPT 3.5 was not trained for medical use. Strengths include real clinical cases, a mixed clinician cohort, an accessible study design, and a clear finding that AI over-triages, a key safety consideration for implementation.
Planned follow-ups will test newer, medically fine-tuned models, larger cohorts, ECG interpretation, and integration into nurse training and mass-casualty triage.
Independent expert view
According to Dr Barbra Backus, chair of the EUSEM abstract selection committee and an emergency physician in Amsterdam, AI is useful for tasks like image interpretation but cannot replace trained staff in ED triage. It may speed decision-making if used with caution and clinical oversight. Ongoing evaluation is essential as systems improve.
Bottom line
Use AI to assist, not replace, triage decisions. Expect a bias toward higher urgency categories, and design your workflow, training, and quality controls to manage it.
Learn more about the Manchester Triage System at the Manchester Triage Group and explore the evidence base on PubMed.