
A General Framework for Governing Marketed AI/ML Medical Devices
Abstract
This project provides the first systematic assessment of the US Food and Drug Administration’s (FDA) postmarket surveillance of artificial intelligence and machine learning (AI/ML) based medical devices. We analyze the Manufacturer and User Facility Device Experience (MAUDE) database, the FDA’s primary tool for tracking safety issues in legally marketed AI/ML devices.
Focusing on approximately 950 AI/ML devices authorized between 2010 and 2023, we find that the current system is insufficient for assessing the safety and effectiveness of these devices. Our contributions include: (1) characterizing adverse event reports for AI/ML devices, (2) identifying shortcomings in the FDA’s adverse event reporting system, and (3) suggesting improvements to enhance postmarket surveillance of AI/ML devices.
Introduction
There is ongoing debate about whether the FDA’s current regulations adequately address medical devices that incorporate AI/ML functions. The FDA’s Digital Health Advisory Committee recently dedicated its first meeting to discussing “total product lifecycle considerations for Generative AI-enabled devices,” highlighting the growing importance of this topic.
While many concerns center on bias and diversity in AI training data, this study focuses on the FDA’s adverse event reporting system, specifically the MAUDE database. This database tracks postmarket safety issues for legally marketed medical devices and is critical for understanding how regulatory practices handle the rapid growth of AI/ML devices.
As of August 2024, the FDA had authorized 950 AI/ML devices. Despite a comprehensive system for reporting adverse events, it is unclear how well MAUDE captures unique problems related to AI/ML technologies. Our study examines adverse event data from 2010 to 2023, aiming to understand the limitations of current surveillance and propose pathways for improvement.
Methods
We compiled a dataset by combining the FDA’s 510(k) clearance files with NyquistAI’s device database, identifying 823 unique FDA-cleared AI/ML devices linked to 943 adverse event reports between 2010 and 2023.
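As a rough illustration of this linkage step, here is a minimal sketch in Python; the file names and column layouts are hypothetical stand-ins, not the study’s actual extracts.

```python
# Minimal sketch of the device-report linkage; file names and column
# layouts are hypothetical stand-ins for the study's actual extracts.
import pandas as pd

devices = pd.read_csv("aiml_devices.csv")    # columns: k_number, device_name, decision_date
reports = pd.read_csv("maude_reports.csv")   # columns: report_id, k_number, event_type

# Restrict to devices cleared within the study window.
devices["decision_date"] = pd.to_datetime(devices["decision_date"])
in_window = devices[devices["decision_date"].between("2010-01-01", "2023-12-31")]

# Join adverse event reports to devices on the 510(k) number.
linked = reports.merge(in_window, on="k_number", how="inner")
print(f"{in_window['k_number'].nunique()} devices, {len(linked)} linked reports")
```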
Medical device regulation in the US began with the 1976 Medical Device Amendments. The FDA’s Medical Device Reporting (MDR) system requires manufacturers, user facilities, and importers to report serious adverse events: deaths, serious injuries, and malfunctions that would be likely to cause or contribute to death or serious injury if they recurred.
This system was originally created for hardware devices and later adapted for software embedded in medical devices. However, AI/ML devices pose new challenges:
- Performance may degrade over time as data or patient populations change (concept drift), or because the deployment population differs from the training data (covariate shift).
- Outcomes can vary significantly for similar patients, affecting stability and reliability.
- Devices might underperform for rare conditions or show bias against specific groups.
These issues do not fit neatly into traditional categories like “malfunction,” raising questions about the suitability of the current reporting framework for AI/ML devices.
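To make one of these concerns concrete, the sketch below runs a simple distribution check for covariate shift on a single model input; the feature, both samples, and the threshold are illustrative assumptions, not study data.

```python
# Minimal sketch of a covariate shift check on one model input; the
# feature and both samples are synthetic stand-ins.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
train_age = rng.normal(55, 12, size=5000)     # hypothetical training-population ages
deployed_age = rng.normal(62, 12, size=5000)  # same feature after deployment

# Two-sample Kolmogorov-Smirnov test: a large statistic (small p-value)
# flags a training-vs-deployment distribution difference.
stat, p_value = ks_2samp(train_age, deployed_age)
if p_value < 0.01:
    print(f"Possible covariate shift: KS statistic={stat:.3f}, p={p_value:.2g}")
```

In practice a check like this would run per feature on real deployment logs, with the threshold tuned to the monitoring cadence.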
Results
From the MAUDE database, 823 AI/ML devices were linked to 943 adverse event reports. Notably, over 98% of adverse events were concentrated in fewer than five devices, a sharper concentration than seen in non-AI/ML devices.
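A minimal sketch of how this concentration can be measured, again with hypothetical file and column names:

```python
# Minimal sketch of the concentration measurement; the file and column
# names are hypothetical, reusing the linked table from the Methods sketch.
import pandas as pd

linked = pd.read_csv("linked_reports.csv")   # one row per adverse event report

counts = linked["k_number"].value_counts()   # reports per device, largest first
top5_share = counts.head(5).sum() / counts.sum()
print(f"Top 5 devices account for {top5_share:.1%} of all reports")
```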
Two products dominate the reports:
- bioMérieux’s Mass Spectrometry Microbial Identification System, with many reports of microorganism misidentifications, likely due to limitations in its knowledge base.
- DarioHealth’s Blood Glucose Monitoring System, with reports of inaccurate blood glucose readings, some of which may be false positives.
Although these devices are overrepresented, the data do not allow conclusions about their overall safety or quality due to missing information on usage frequency and event severity.
Several limitations of the MAUDE system were identified:
- Missing data: Key details were frequently absent; the event location field was missing entirely, and reporter occupation and event date were often blank, undermining the utility of individual reports.
- Inadequate event classification: Most reports are labeled as malfunctions, with very few classified as injury or death, even when the narrative descriptions suggest more serious outcomes.
- Underreporting: Physicians might avoid reporting issues with AI/ML tools they distrust or no longer use, reducing visibility into problems.
These gaps complicate efforts to assign responsibility or assess the true impact of AI/ML device failures.
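A simple completeness audit makes gaps like these easy to quantify; the sketch below assumes a hypothetical MAUDE extract and field names, not the actual MAUDE schema.

```python
# Minimal sketch of a field-completeness audit; the extract and field
# names are hypothetical, not the actual MAUDE schema.
import pandas as pd

reports = pd.read_csv("maude_reports.csv")
fields = ["event_location", "reporter_occupation", "date_of_event", "event_type"]

# Fraction of reports missing each field, worst first.
missing = reports[fields].isna().mean().sort_values(ascending=False)
print(missing.to_string(float_format=lambda v: f"{v:.0%}"))
```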
Discussion
Improving postmarket surveillance of AI/ML devices demands either reforming the existing MAUDE system or adopting a new regulatory approach.
Key areas for improvement include better tracking of:
- Concept drift: Changes in data or patient populations that affect device performance over time.
- Covariate shift: Differences between training data and real-world use environments.
- Algorithmic stability: Consistency of device outputs under similar conditions.
Regulators should require manufacturers to report updates to training data and changes in deployment conditions. Expanding reporting beyond traditional adverse events to include these AI/ML-specific factors would enhance safety monitoring.
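As one illustration, a structured supplement to the standard MDR report could carry these AI/ML-specific fields. The schema below is a hypothetical sketch, not an existing FDA format; every field name is an assumption.

```python
# Hypothetical schema for an AI/ML-specific supplement to a standard MDR
# report; every field name here is an assumption, not an FDA format.
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class AIMLSupplementalReport:
    report_id: str                     # identifier of the underlying MDR report
    model_version: str                 # version of the deployed model
    training_data_cutoff: date         # last date covered by the training data
    training_data_updated: bool        # training data changed since clearance?
    deployment_notes: str              # known deviations from the training population
    drift_metric: Optional[float] = None      # e.g., KS statistic from routine checks
    stability_metric: Optional[float] = None  # output variance on repeated similar inputs

# Example record a manufacturer might file alongside a routine model update.
supplement = AIMLSupplementalReport(
    report_id="MDR-0000000",
    model_version="2.3.1",
    training_data_cutoff=date(2022, 6, 30),
    training_data_updated=True,
    deployment_notes="Median patient age higher than in the training cohort",
    drift_metric=0.18,
)
print(supplement)
```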
Conclusion
This study reveals significant shortcomings in the FDA’s MAUDE database for AI/ML medical devices, including missing data, misclassified events, and no reporting at all on key AI/ML-specific risks. The current system falls short of providing reliable information to assess device safety and effectiveness.
We recommend:
- Revising the MAUDE reporting framework to capture AI/ML-specific issues such as concept drift and algorithmic stability.
- Considering alternative postmarket surveillance models that provide greater transparency and proactive safety management for AI/ML devices.
Healthcare professionals and regulators must work together to adapt surveillance systems to the unique challenges presented by AI/ML in medicine.