Delphi-2M AI predicts more than 1,000 diseases years ahead

Delphi-2M predicts risk for 1,000+ diseases years ahead from health records. Trained on UK Biobank and validated in Denmark, it shows promise but awaits prospective trials.

Published on: Sep 18, 2025

AI model forecasts risk for 1,000+ diseases years in advance

An international team has introduced Delphi-2M, a transformer-based AI that forecasts the likelihood of more than 1,000 diseases years ahead using longitudinal health records. The work appears in Nature and focuses on modeling the natural history of disease rather than on clinical decision-making, at least for now.

The model treats sequences of diagnoses like language, learning patterns in the order and combination of conditions across time. That framing lets it generalize across many diseases, rather than optimizing for a single endpoint.
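To make the "diagnoses as language" framing concrete, here is a minimal sketch of how a patient history could be turned into a token sequence and next-event training pairs. It is an illustration only and does not reproduce Delphi-2M's actual tokenization: the `Event` format, the ICD-10 style codes, the example ages, and all function names are assumptions.

```python
from typing import Dict, List, Tuple

# Hypothetical event format: (age in years at diagnosis, ICD-10 style code).
Event = Tuple[float, str]


def build_vocab(histories: List[List[Event]]) -> Dict[str, int]:
    """Assign an integer id to each distinct code; id 0 is left free for padding."""
    codes = sorted({code for history in histories for _, code in history})
    return {code: i + 1 for i, code in enumerate(codes)}


def encode(history: List[Event], vocab: Dict[str, int]) -> Tuple[List[int], List[float]]:
    """Order events by age and return parallel lists of token ids and ages."""
    ordered = sorted(history, key=lambda event: event[0])
    return [vocab[code] for _, code in ordered], [age for age, _ in ordered]


def next_event_pairs(tokens: List[int]) -> List[Tuple[List[int], int]]:
    """Build (prefix, next token) pairs, the same setup used in language modeling."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


# Made-up histories for illustration; not real patient data.
histories = [
    [(61.5, "I21"), (48.0, "E11"), (55.2, "I10")],
    [(70.1, "I10"), (72.3, "N18")],
]
vocab = build_vocab(histories)
tokens, ages = encode(histories[0], vocab)
pairs = next_event_pairs(tokens)
```

Each pair asks the model to predict the next diagnosis from everything that came before it, which is what lets a single model cover many conditions at once.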

What's new

Delphi-2M was trained on the UK Biobank, which includes data from roughly 500,000 participants. The team then evaluated performance on nearly two million records from Denmark's national health database to test generalization across systems and populations.

Early results suggest the model can refine risk stratification beyond age-based baselines, for example by identifying subgroups with markedly higher or lower heart attack risk. The authors emphasize that these are retrospective findings and require prospective validation.

How it works

Delphi-2M uses a transformer architecture, the same class of neural networks behind large language models, to learn temporal dependencies in medical event sequences. By ingesting prior diagnoses and their order, the model estimates future disease rates across a broad set of conditions and time horizons.
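A predicted disease rate can be translated into a risk over a chosen time horizon. The sketch below uses the standard constant-rate relationship, risk = 1 - exp(-rate × years). The constant-rate assumption, the function name, and the example rates are illustrative and are not taken from the paper.

```python
import math


def risk_over_horizon(rate_per_year: float, years: float) -> float:
    """Probability of at least one event within `years`, assuming a constant rate."""
    if rate_per_year < 0 or years < 0:
        raise ValueError("rate and horizon must be non-negative")
    return 1.0 - math.exp(-rate_per_year * years)


# Hypothetical predicted rates in events per person-year; not values from the paper.
predicted_rates = {"I21": 0.004, "E11": 0.010}
ten_year_risk = {
    code: risk_over_horizon(rate, 10.0) for code, rate in predicted_rates.items()
}
```

For small rates the result is close to rate × years, but the exponential form keeps the risk below 1 for long horizons or high rates.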

This multi-disease framing contrasts with tools like QRISK3, which estimates cardiovascular risk only and so targets a single outcome. The payoff is a unified model for long-range risk estimation across many endpoints.

Validation and limits

External testing on Danish health data supports portability, but the authors caution against clinical deployment. Both UK and Danish datasets have biases in age, ethnicity, and health system practices that can affect calibration and fairness.

Interpretability remains a focus. Commentators describe the work as a step toward scalable, explainable, and ethically responsible predictive modeling, while acknowledging the need for transparent reasoning and clinician trust.

Implications for research and health systems

  • Preventive care: Earlier monitoring and targeted interventions become feasible if prospective trials confirm risk lift and calibration.
  • Resource allocation: Multi-disease forecasts could inform population health planning, screening schedules, and staffing where demand is most likely.
  • Study design: Cohort enrichment strategies can use predicted risk to power trials with fewer participants or shorter follow-up.
  • EHR integration: Real-world use will require pipelines for continual retraining, drift detection, and governance across institutions.
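Drift detection, mentioned in the last item above, can start with something as simple as comparing the distribution of predicted risks between a reference period and a recent one. The sketch below uses the population stability index (PSI), a generic monitoring statistic that the Delphi-2M authors do not prescribe. The bin count, the `eps` floor, and the example scores are assumptions.

```python
import math
from typing import List, Sequence


def bin_fractions(scores: Sequence[float], n_bins: int, eps: float = 1e-6) -> List[float]:
    """Fraction of scores in each equal-width bin on [0, 1], floored at eps."""
    if not scores:
        raise ValueError("scores must not be empty")
    counts = [0] * n_bins
    for score in scores:
        # Clamp the index so that 0.0 and 1.0 fall in the first and last bins.
        idx = max(0, min(int(score * n_bins), n_bins - 1))
        counts[idx] += 1
    return [max(count / len(scores), eps) for count in counts]


def population_stability_index(
    reference: Sequence[float], current: Sequence[float], n_bins: int = 10
) -> float:
    """PSI = sum over bins of (current - reference) * ln(current / reference)."""
    ref = bin_fractions(reference, n_bins)
    cur = bin_fractions(current, n_bins)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


# Made-up predicted risks for two periods.
reference_scores = [0.05, 0.15, 0.25, 0.35]
shifted_scores = [0.55, 0.65, 0.75, 0.85]
psi_same = population_stability_index(reference_scores, reference_scores)
psi_shifted = population_stability_index(reference_scores, shifted_scores)
```

A PSI of zero means the two score distributions match bin for bin; larger values indicate a bigger shift. Alert thresholds would need to be set per site and per outcome.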

Practical notes for scientists

  • Bias and calibration checks are non-negotiable across demographics, sites, and time windows.
  • Prospective, intervention-aware validation is the next step; retrospective signals don't guarantee actionability.
  • Interpretability should focus on temporal patterns and comorbidity trajectories clinicians can act on.
  • Data governance: consent, auditability, and versioning must be in place for any preclinical deployment.
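One simple form of the calibration check in the first item is an observed-to-expected (O/E) ratio computed separately for each subgroup. The sketch below is a generic check, not the evaluation used in the paper; the record layout, group labels, and numbers are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical record layout: (subgroup label, predicted risk, observed outcome 0 or 1).
Record = Tuple[str, float, int]


def observed_expected_by_group(records: Iterable[Record]) -> Dict[str, float]:
    """Return observed events divided by the sum of predicted risks, per subgroup."""
    observed: Dict[str, float] = defaultdict(float)
    expected: Dict[str, float] = defaultdict(float)
    for group, risk, outcome in records:
        observed[group] += outcome
        expected[group] += risk
    # Skip groups with zero expected events to avoid dividing by zero.
    return {g: observed[g] / expected[g] for g in expected if expected[g] > 0}


# Made-up records for illustration.
records = [
    ("A", 0.1, 0), ("A", 0.3, 1), ("A", 0.1, 0), ("A", 0.5, 1),
    ("B", 0.2, 0), ("B", 0.2, 0), ("B", 0.2, 1), ("B", 0.2, 0), ("B", 0.2, 0),
]
oe_ratios = observed_expected_by_group(records)
```

A ratio near 1 suggests predicted and observed event counts agree in that subgroup. A ratio above 1 means the model under-predicts there, and below 1 means it over-predicts. The same function can be reused with site or calendar period as the group label.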

For the technical details and evaluation, see the Nature publication: Learning the natural history of human disease with generative transformers.