Model Behavior: FDA and EMA's Guide to Good AI in Drug Development
On January 23, 2026, the FDA and EMA released a joint statement with 10 guiding principles for using AI across drug development. This isn't a new rulebook. It's a clear signal: if AI supports any regulated work, the evidence it produces must be reliable and auditable, and patient safety must be protected.
These principles apply to sponsors, CROs/CMOs, software teams, data vendors, and anyone building, validating, or operating AI that feeds into non-clinical research, clinical trials, manufacturing, or post-market safety. The term "drug" includes drugs, biologics, and EU medicinal products.
What was released
The agencies outlined how AI should be designed, tested, governed, and communicated when it generates or analyzes evidence used in regulatory decisions. AI is defined broadly: systems that support research, trials, production, and safety monitoring.
Bottom line: AI should support, not weaken, core requirements for quality, efficacy, and safety. The focus is reliability of evidence and protections for patients.
Why this matters for your FDA/EMA interactions
- Be ready to explain data lineage, preprocessing steps, and why the model is fit for its stated purpose.
- Expect questions on human oversight, cybersecurity, access control, monitoring for drift, and change control.
- Early alignment with these principles reduces rework, speeds review, and improves inspection readiness.
The 10 principles in plain language (grouped by theme)
- Human-centric by design (#1): Build for patient and user safety from day one. Anticipate impacts and bake in safeguards.
- Risk-based approach (#2): Define context-of-use. Scale testing, controls, and oversight to the level of risk. High-impact decisions need deeper validation.
- Adherence to standards (#3) + Multidisciplinary expertise (#5): Follow legal, ethical, technical, scientific, cybersecurity, and regulatory standards (e.g., GxP). Involve domain experts, data scientists, software engineers, security, clinical/manufacturing quality, and safety teams.
- Clear context, data governance, model practice (#4, #6, #7): Document sources, transformations, and analytical choices with traceability and privacy controls. Use data that fits the purpose. Balance interpretability, explainability, and predictive performance with safety in mind.
- Risk-based performance, lifecycle management, clear information (#8, #9, #10): Validate the full system (people + process + tech). Govern the lifecycle with a QMS. Monitor, re-evaluate, and communicate purpose, performance, limits, data used, update cadence, and how to interpret outputs, all in plain language.
What IT and Development teams should do now
- Define context-of-use: Who uses it, where in the workflow, what decisions it influences, and acceptable limits.
- Set a risk taxonomy: Categorize AI uses by impact; tie controls and validation depth to risk.
- Tight data governance: Data lineage, consent and privacy controls, versioning, and access logging. Keep it traceable and verifiable.
- MLOps with guardrails: Reproducible training pipelines, model registries, environment parity, and immutable artifacts.
- Validation plan: Prespecified metrics, thresholds, and acceptance criteria that match the context-of-use. Use representative, independent datasets (see the acceptance-criteria sketch after this list).
- Human-in-the-loop design: Clear escalation paths, override options, and operator training.
- Security-first: Threat modeling, secure software lifecycle, dependency hygiene, secrets management, and audit trails.
- Monitoring and alerts: Drift detection, performance decay tracking, data quality checks, and rollback procedures (a drift-check sketch follows this list).
- Change control: Documented updates, impact assessments, re-validation triggers, and versioned release notes.
- Transparent comms: Plain-language model cards and user guidance that state purpose, limits, data used, and how to interpret outputs.
- Vendor oversight: Contracts with audit rights, evidence requirements, SLAs for incidents, and security/privacy expectations.
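To make the "validation plan" item concrete, here is a minimal sketch of prespecified acceptance criteria evaluated against an independent validation set. The metric names, thresholds, and directions are hypothetical placeholders; real criteria come from your validation protocol and context-of-use.

```python
# Minimal sketch: prespecified acceptance criteria checked against metrics
# observed on an independent validation dataset. All names and thresholds
# below are illustrative placeholders, not regulatory requirements.
from dataclasses import dataclass

@dataclass
class Criterion:
    metric: str
    threshold: float
    direction: str  # "min": observed must be >= threshold; "max": must be <=

# Defined before validation begins and kept under change control.
ACCEPTANCE_CRITERIA = [
    Criterion("sensitivity", 0.90, "min"),
    Criterion("specificity", 0.85, "min"),
    Criterion("calibration_error", 0.05, "max"),
]

def evaluate(observed: dict) -> list:
    """Compare observed metrics to prespecified criteria; return pass/fail per metric."""
    results = []
    for c in ACCEPTANCE_CRITERIA:
        value = observed[c.metric]
        passed = value >= c.threshold if c.direction == "min" else value <= c.threshold
        results.append((c.metric, passed))
    return results

if __name__ == "__main__":
    # Observed values would come from the independent validation run.
    observed = {"sensitivity": 0.93, "specificity": 0.88, "calibration_error": 0.04}
    for metric, passed in evaluate(observed):
        print(f"{metric}: {'PASS' if passed else 'FAIL'}")
```

Keeping the criteria in versioned code or configuration makes it easy to show reviewers that thresholds were fixed before, not after, the validation run.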
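And for the "monitoring and alerts" item, a minimal sketch of one common drift signal: the Population Stability Index (PSI) on a numeric input feature. The 0.2 alert threshold is a widely used rule of thumb, not a regulatory requirement; a real monitoring plan defines thresholds per feature and per context-of-use.

```python
# Minimal sketch: Population Stability Index (PSI) comparing the production
# distribution of one input feature to the reference (validation-time)
# distribution. Threshold and data below are illustrative only.
import numpy as np

def psi(reference: np.ndarray, current: np.ndarray, bins: int = 10) -> float:
    """PSI between reference and current distributions of a numeric feature."""
    edges = np.histogram_bin_edges(reference, bins=bins)
    edges[0], edges[-1] = -np.inf, np.inf  # capture out-of-range production values
    ref_frac = np.histogram(reference, bins=edges)[0] / len(reference)
    cur_frac = np.histogram(current, bins=edges)[0] / len(current)
    # Avoid log(0) for empty bins.
    ref_frac = np.clip(ref_frac, 1e-6, None)
    cur_frac = np.clip(cur_frac, 1e-6, None)
    return float(np.sum((cur_frac - ref_frac) * np.log(cur_frac / ref_frac)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    reference = rng.normal(0.0, 1.0, 5_000)  # distribution seen at validation
    current = rng.normal(0.4, 1.2, 5_000)    # shifted distribution in production
    score = psi(reference, current)
    alert = "  -> investigate drift" if score > 0.2 else ""
    print(f"PSI = {score:.3f}{alert}")
```

A check like this would typically run on a schedule, log its results to the audit trail, and feed the re-validation triggers defined in change control.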
How this fits with existing FDA/EMA expectations
- Data integrity: Traceability and auditability echo ALCOA+ principles.
- Validation: "Fit for intended use" mirrors current clinical and manufacturing review practices.
- Quality systems: Lifecycle governance matches GCP/GMP expectations, including deviations and periodic review.
- Security & privacy: Aligns with existing requirements to protect systems and sensitive data.
You don't need new processes; extend your QMS, CSV/CSA, and risk management to cover AI systems and their data pipelines.
Artifacts regulators may ask for
- Context-of-use statement and risk classification.
- Data inventory, lineage, and governance controls.
- Model card, training/eval datasets description, and rationale for metrics (a minimal model card sketch follows this list).
- Validation protocol/report with predefined acceptance criteria.
- Human factors/usability evidence for AI-assisted workflows.
- Security risk assessment, access logs, and incident response plan.
- Monitoring plan, drift thresholds, and re-validation triggers.
- Change control records and version history.
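As a rough illustration of the model card artifact, here is a minimal sketch of a plain-language model card captured as structured data so it can be versioned alongside the model. Every field name, value, and document identifier below is a hypothetical placeholder, not an agency template.

```python
# Minimal sketch: a model card as versionable structured data. All values,
# including document references like "DI-042", are hypothetical examples.
import json

model_card = {
    "name": "adverse-event-triage-assist",  # hypothetical model name
    "version": "1.3.0",
    "context_of_use": "Prioritizes incoming case reports for human review; "
                      "does not make autonomous safety decisions.",
    "intended_users": ["pharmacovigilance case processors"],
    "training_data": "De-identified case reports, 2018-2023; see data inventory DI-042.",
    "performance": {"sensitivity": 0.93, "specificity": 0.88,
                    "validation_report": "VAL-2025-017"},
    "limitations": ["Not validated for pediatric-only case series",
                    "Performance degrades for very short reports"],
    "update_cadence": "Re-evaluated quarterly; re-validated on material change.",
    "output_guidance": "Scores are triage priorities, not causality assessments.",
}

print(json.dumps(model_card, indent=2))
```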
Common pitfalls to avoid
- Vague context-of-use; unclear decision boundaries or users.
- Validating the model but not the end-to-end system and workflow.
- Skipping human oversight or operator training.
- No plan for drift, dataset shift, or degraded performance.
- Weak vendor controls or missing audit rights.
- Unexplained outputs with no error bounds or confidence guidance.
Useful resources
Skill up your team
If you're building or validating AI for regulated use, targeted upskilling shortens the path from prototype to compliant production.
- AI courses by job for IT, data, and engineering roles
- Latest AI courses to keep pace with current practices
The takeaway: treat AI like any other regulated system, with a clear purpose, sound data, fit-for-use validation, lifecycle controls, and honest communication. Do that, and your AI can stand up to scrutiny across the drug lifecycle.