Generative AI spots blood cell abnormalities doctors miss

CytoDiffusion, a generative AI, spots rare blood cells tied to disease and can match or beat experts. It triages slides, flags uncertainty, and comes with a large open dataset.

Published on: Nov 22, 2025
Haematology still hinges on morphology. Subtle shifts in size, granularity, and nuclear structure separate routine findings from high-risk ones. The challenge: a single smear holds thousands of cells, far more than any clinician can scan in full.

Researchers from Cambridge, UCL, and Queen Mary have introduced CytoDiffusion, a generative AI system that analyzes blood cell shape and form with accuracy that matches or beats human experts. It flags rare or unusual cells linked to disease, including leukaemia, and quantifies its own uncertainty, so it can signal when its output should not be trusted.

What's different about CytoDiffusion

Most AI models learn to separate categories. CytoDiffusion models the full distribution of normal and abnormal appearances. That gives it stronger performance when slides come from different hospitals, microscopes, or staining protocols.

The team trained on more than half a million images from Addenbrooke's Hospital, including common morphologies, rare cell types, and confounders that trip up automated systems. In tests, it detected abnormal cells tied to leukaemia with higher sensitivity than existing tools and kept performing well even with fewer training examples.

Built for real clinical constraints

A typical day leaves haematology teams with stacks of films to review. Humans can't scrutinize every cell. CytoDiffusion automates the sweep, triages routine cases, and flags anything unusual for a human to adjudicate.

The system's standout feature is calibrated uncertainty. In trials, it avoided being confidently wrong, an error people are prone to under pressure and fatigue. That makes it easier to set safe thresholds for auto-release versus manual review.
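One common way to check whether a model's confidence is calibrated is the expected calibration error (ECE), which compares stated confidence with observed accuracy in bins. The sketch below is a generic illustration, not the metric or code used in the study; the function name, the binning scheme, and the sample values are assumptions made for the example.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Weighted average gap between mean confidence and accuracy per bin.

    confidences: model confidence for each prediction, in [0, 1].
    correct: whether each prediction matched the reference label.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if not confidences:
        raise ValueError("at least one prediction is required")
    if any(c < 0.0 or c > 1.0 for c in confidences):
        raise ValueError("confidences must lie in [0, 1]")

    # Group predictions into equal-width confidence bins.
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[index].append((conf, ok))

    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / total) * abs(mean_conf - accuracy)
    return ece


# Hypothetical data: 90% confidence and 9 of 10 correct is well calibrated.
well_calibrated = expected_calibration_error([0.9] * 10, [True] * 9 + [False])
# Hypothetical data: 95% confidence but always wrong is "confidently wrong".
confidently_wrong = expected_calibration_error([0.95] * 4, [False] * 4)
```

A value near zero means confidence tracks accuracy; a large value means the stated confidence cannot be used directly as a release threshold.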

Explainability you can point to

CytoDiffusion generates counterfactual heat maps: visual overlays showing what would need to change for the model to relabel, say, an eosinophil as a neutrophil. Matrices of these heat maps reveal where classes are easily confused and where the latent space has large gaps.

For clinical use, that means two things: clearer error analysis when performance slips, and a practical teaching aid for trainees learning nuanced morphological differences.

Synthetic images and open data

The model can produce synthetic blood cell images that fooled ten experienced haematologists in a Turing-style test: they performed at chance when asked to tell real images from AI-generated ones. That opens the door to data augmentation and training without exposing patient data.

The team is releasing what they describe as the largest publicly available peripheral blood smear dataset, with more than 500,000 images, to accelerate external validation and model development across the community.

Not a replacement: an assistive workflow

CytoDiffusion is built to support clinicians, not supplant them. It's well-suited for rapid triage, pre-screening, and surfacing atypical cells for expert review. As one co-senior author put it, the value of healthcare AI isn't in imitating experts at lower cost; it lies in extending diagnostic and prognostic reach while knowing its own limits.

How to pilot this in your service

  • Start with a retrospective validation on your own slides, stratified by instrument, stain, and diagnosis.
  • Use the model as a second reader: auto-release low-uncertainty normals; route high-uncertainty or abnormal flags to experts.
  • Set uncertainty thresholds in collaboration with haematology leads; monitor false positives/negatives weekly at launch.
  • Track performance across patient demographics and disease prevalence to catch bias early.
  • Define clear escalation rules for suspected blasts, atypical lymphocytes, and rare cells.
  • Integrate with LIS/EPR for audit trails; log model confidence and counterfactual maps with each case.
  • Measure time-to-result and reviewer workload before and after rollout; adjust triage thresholds accordingly.
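The checklist above can be sketched in code. The example below is a minimal illustration, not part of CytoDiffusion: the class labels, the 0-to-1 uncertainty scale, the 0.05 threshold, and every function and type name are hypothetical placeholders that a lab would replace with values agreed with its haematology leads.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Iterable, Sequence, Tuple


class Route(Enum):
    AUTO_RELEASE = "auto_release"
    EXPERT_REVIEW = "expert_review"
    URGENT_ESCALATION = "urgent_escalation"


# Hypothetical label sets; real ones depend on the lab's own taxonomy.
NORMAL_LABELS = frozenset(
    {"neutrophil", "lymphocyte", "monocyte", "eosinophil", "basophil"}
)
ESCALATION_LABELS = frozenset({"blast", "atypical_lymphocyte"})


@dataclass(frozen=True)
class CellCall:
    label: str
    uncertainty: float  # assumed scale: 0 = certain, 1 = maximally uncertain


def route_slide(calls: Sequence[CellCall], max_uncertainty: float = 0.05) -> Route:
    """Second-reader triage: only low-uncertainty, all-normal slides auto-release."""
    if not calls:
        return Route.EXPERT_REVIEW  # no model output: a human must look
    if any(call.label in ESCALATION_LABELS for call in calls):
        return Route.URGENT_ESCALATION
    for call in calls:
        if call.label not in NORMAL_LABELS or call.uncertainty > max_uncertainty:
            return Route.EXPERT_REVIEW
    return Route.AUTO_RELEASE


def sensitivity_by_stratum(
    records: Iterable[Tuple[str, bool, bool]]
) -> Dict[str, float]:
    """Per-stratum sensitivity from (stratum, truly_abnormal, flagged) records.

    A stratum can be an instrument, stain, or diagnosis group. Strata with
    no truly abnormal cases are omitted because sensitivity is undefined.
    """
    true_pos: Dict[str, int] = defaultdict(int)
    false_neg: Dict[str, int] = defaultdict(int)
    for stratum, truly_abnormal, flagged in records:
        if truly_abnormal:
            if flagged:
                true_pos[stratum] += 1
            else:
                false_neg[stratum] += 1
    strata = set(true_pos) | set(false_neg)
    return {s: true_pos[s] / (true_pos[s] + false_neg[s]) for s in strata}
```

Routing is deliberately conservative here: any escalation-class call outranks everything else, and an empty result goes to a human rather than being released. Running `sensitivity_by_stratum` on retrospective slides before launch, and again on the weekly false-negative logs, gives a simple way to see whether one scanner or stain is underperforming.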

Limitations and next steps

Speed needs to improve for high-throughput labs. External validation across diverse populations is essential for fairness and reliability. Regulatory clearance and robust IT governance will determine how quickly this moves from study to standard of care.

Learn more

Read the study context in Nature Machine Intelligence, and explore the role of nationwide services like NHS Blood and Transplant in scaling safe diagnostics.


