CytoDiffusion analyzes blood cells with expert precision and knows when it's unsure

CytoDiffusion spots unusual blood cells with expert-level sensitivity and says when it's unsure. Trained on more than 500,000 blood smear images, it supports triage and detected rare leukemia-linked cells.

Published on: Nov 26, 2025
CytoDiffusion: Generative AI that spots abnormal blood cells with expert-level precision

A new AI system called CytoDiffusion analyzes blood cell morphology with sensitivity that matches, and sometimes exceeds, that of human experts. Built on generative AI similar to image models like DALL-E, it learns the full range of normal and abnormal blood cell appearances, then flags what looks unusual and reports an estimate of its own uncertainty.

Developed by researchers from the University of Cambridge, University College London, and Queen Mary University of London, the system was trained on more than half a million blood smear images from Addenbrooke's Hospital. Results were published in Nature Machine Intelligence and show meaningful gains in detecting rare cells linked to disease, including leukemia.

Why this matters

Manual analysis of blood films is slow, demanding, and subject to disagreement, especially on hard cases. A typical smear contains thousands of cells, far beyond what any clinician can review in detail. CytoDiffusion automates the routine, highlights the suspicious, and makes its confidence visible so humans can prioritize what needs attention.

How CytoDiffusion works

Most diagnostic AI models classify patterns into fixed categories. CytoDiffusion instead models the full distribution of how cells can look across hospitals, microscopes, and staining methods, so it better recognizes edge cases and rare phenotypes. That broader view helps it generalize to new data and reduces brittle behavior when conditions change.

Trained on an exceptionally large and diverse dataset, the model learned to identify common cell types and the unusual ones that often drive critical clinical decisions. It doesn't just give a label; it also estimates uncertainty, which is essential for safe deployment.
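The article does not spell out how a label and an uncertainty estimate are derived, so the following is only a minimal sketch of one common pattern, not the authors' method. It assumes a generative model can produce a per-class score for an image (higher meaning the class explains the image better); the score values, class names, and function names are all hypothetical.

```python
import math

def class_posteriors(scores):
    """Turn per-class scores into probabilities with a numerically stable softmax.

    `scores` maps class name -> score (assumed: higher = better fit).
    """
    top = max(scores.values())
    exps = {name: math.exp(value - top) for name, value in scores.items()}
    total = sum(exps.values())
    return {name: e / total for name, e in exps.items()}

def normalized_entropy(probs):
    """Entropy scaled to [0, 1]: 0 means fully certain, 1 means maximally unsure."""
    if len(probs) < 2:
        return 0.0
    entropy = -sum(p * math.log(p) for p in probs.values() if p > 0)
    return entropy / math.log(len(probs))

# Hypothetical scores for one cell image.
scores = {"neutrophil": 4.1, "lymphocyte": 3.9, "blast": 0.5}
probs = class_posteriors(scores)
label = max(probs, key=probs.get)
print(label, round(normalized_entropy(probs), 3))
```

Reporting the entropy alongside the label is what lets a downstream workflow treat a near-tie between classes differently from a clear-cut call.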

Performance and uncertainty

In tests, the system detected abnormal cells linked to leukemia with higher sensitivity than current approaches and matched or beat state-of-the-art models even with fewer training examples. Crucially, it knew when it wasn't sure, avoiding confident errors that can mislead downstream decisions.

Researchers evaluated performance across tough real-world scenarios: unseen images, different machines, inconsistent labels, and domain shifts between sites. The result is a more dependable first pass that supports, rather than replaces, expert review.

Synthetic images that fool experts

Because it's generative, CytoDiffusion can create realistic blood cell images. In a small "Turing test" involving ten experienced hematologists, experts could not reliably tell real from synthetic. That opens doors for data augmentation, education, and quality control, provided synthetic data is clearly managed and labeled.
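One way to make "could not reliably tell real from synthetic" concrete is to ask whether the experts' accuracy differs from coin-flipping. The sketch below runs an exact two-sided binomial test using only the standard library. The counts in the usage lines are hypothetical placeholders, not figures from the study, and the article does not say which statistical test the authors used.

```python
from math import exp, lgamma, log

def binomial_two_sided_p(correct, trials, chance=0.5):
    """Exact two-sided binomial p-value.

    Sums the probability of every outcome that is no more likely than the
    observed one under the null hypothesis of guessing at rate `chance`
    (which must be strictly between 0 and 1).
    """
    def pmf(k):
        # Work in log space so large trial counts do not overflow.
        log_coeff = lgamma(trials + 1) - lgamma(k + 1) - lgamma(trials - k + 1)
        return exp(log_coeff + k * log(chance) + (trials - k) * log(1 - chance))

    probabilities = [pmf(k) for k in range(trials + 1)]
    observed = probabilities[correct]
    # Small relative tolerance so floating-point ties count as equal.
    tail = sum(p for p in probabilities if p <= observed * (1 + 1e-7))
    return min(1.0, tail)

# Hypothetical: experts labelled 288 images and got 152 right.
p_value = binomial_two_sided_p(correct=152, trials=288)
print(f"accuracy={152 / 288:.3f}, p={p_value:.3f}")
```

A large p-value here would be consistent with guessing, though it does not prove the images are indistinguishable; a small study may simply lack the power to detect a modest skill level.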

Open data to accelerate progress

The team is releasing what they describe as the largest public dataset of peripheral blood smear images, with more than half a million images. That step can speed up independent validation, benchmarking, and safer deployment across diverse settings.

What this means for clinicians and labs

  • Use AI for triage: Let the system process the bulk of routine smears and surface the unusual for expert review.
  • Trust, but verify: Lean on the uncertainty estimates. Low confidence should trigger human oversight, not automation.
  • Plan for variability: Cross-site performance matters. Validate on your scanners, stains, and workflows.
  • Think beyond accuracy: Audit for fairness across patient groups and monitor drift over time.
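The triage and "trust, but verify" points above can be sketched as a simple routing rule: anything the model is unsure about, or that it labels as a cell type of concern, goes to a human. This is a minimal illustration under stated assumptions, not a description of how CytoDiffusion is deployed. The `CellPrediction` type, the 0.9 threshold, and the label names are hypothetical and would need local validation.

```python
from dataclasses import dataclass

@dataclass
class CellPrediction:
    cell_id: str
    label: str
    confidence: float  # model's estimated probability for its label, 0..1

def triage(predictions, min_confidence=0.9, abnormal_labels=frozenset({"blast"})):
    """Split predictions into an auto-accepted list and an expert-review list.

    A cell goes to review if the model is below the confidence threshold
    OR if it was assigned a label that should always be checked by a human.
    """
    auto_accepted, needs_review = [], []
    for prediction in predictions:
        low_confidence = prediction.confidence < min_confidence
        flagged_label = prediction.label in abnormal_labels
        if low_confidence or flagged_label:
            needs_review.append(prediction)
        else:
            auto_accepted.append(prediction)
    return auto_accepted, needs_review

# Hypothetical predictions for three cells.
predictions = [
    CellPrediction("c1", "neutrophil", 0.99),
    CellPrediction("c2", "lymphocyte", 0.62),
    CellPrediction("c3", "blast", 0.97),
]
auto_accepted, needs_review = triage(predictions)
print([p.cell_id for p in auto_accepted])  # ['c1']
print([p.cell_id for p in needs_review])   # ['c2', 'c3']
```

Routing confidently labelled abnormal cells to review as well keeps the rule conservative: automation handles only the cases that are both routine and confidently classified.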

An amplifier, not a replacement

Researchers stress that CytoDiffusion is built to support clinicians, not sideline them. The practical value is faster throughput, consistent screening, and clearer escalation paths when the model is unsure.

As one co-senior author noted, the real promise of healthcare AI is greater diagnostic and prognostic power, together with a machine's frank assessment of what it doesn't know. That "metacognitive" signal is often the difference between a safe decision and a risky one.

What comes next

Further work will focus on speed and broad testing across diverse populations and sites. That includes prospective studies, regulatory pathways, and human factors research to ensure the system fits into everyday clinical practice without adding friction.

Who supported the work

The project involved the BloodCounts! consortium and received support from the Trinity Challenge, Wellcome, the British Heart Foundation, Cambridge University Hospitals NHS Trust, Barts Health NHS Trust, the NIHR Cambridge Biomedical Research Centre, the NIHR UCLH Biomedical Research Centre, and NHS Blood and Transplant.


