AI struggles to detect self-harm in real psychiatric wards, Korea study finds
Researchers at Korea University College of Medicine tested whether video-based AI can reliably detect self-harm behavior in psychiatric wards. The answer: not yet. A comparison of AI performance on simulated versus real clinical footage revealed significant gaps between laboratory conditions and actual hospital use.
The team trained six deep learning models on 1,120 simulated self-harm videos recorded in a studio designed to match psychiatric ward conditions. They then tested these models against 118 real clinical videos from Korea University Guro Hospital's closed psychiatric ward. All patient data were de-identified and verified against medical records.
AI models performed well in the studio environment, but accuracy dropped substantially when applied to real footage. Even the latest transformer-based models failed to generalize across real-world variables: diverse behavioral patterns, occlusions, and irregular movements all degraded detection rates. Subtle, repetitive behaviors, such as scratching and skin picking, proved especially difficult for the systems to identify.
The findings appear in Scientific Reports, a peer-reviewed journal.
Why this matters for clinical deployment
Continuous human monitoring in psychiatric wards faces practical constraints: staffing shortages and observation blind spots create safety gaps. AI could theoretically fill those gaps. This study quantifies what stands in the way.
The researchers identified specific technical barriers that need addressing before clinical implementation becomes viable. The study also establishes a dataset of simulated self-harm behaviors that future research can build on.
For professionals working on healthcare AI applications, the study demonstrates why validation against real-world data is non-negotiable. Laboratory performance metrics can mask critical failures once systems encounter actual clinical complexity.
The work reflects broader challenges in AI research: the gap between controlled environments and production settings, and the difficulty of detecting subtle, variable human behaviors with current computer vision approaches.