AI in cancer research: big bets, bold claims, and why we're still toddlers in biology

AI is making strides in cancer, from risk prediction to protein design, but bold claims still outpace proof. Progress hinges on narrow, tested models and real clinical validation.

Published on: Mar 03, 2026

AI, Cancer, and the Gap Between Hype and Hard Proof

Humanity has chased a cancer cure for thousands of years. One of the first recorded cases dates to ancient Egypt, where a papyrus text attributed to Imhotep, from around 2600 BC, describes a tumor.

Today, tech leaders are confident that AI will crack what medicine hasn't. Google's president has forecast major gains. Anthropic CEO Dario Amodei has described a "compressed 21st century," arguing that AI could squeeze decades of biomedical progress into a few years.

Not everyone buys it. Eli Lilly CEO David Ricks was blunt: "If you just ask them to solve biology or chemistry questions, they're not particularly good at it. They're trained on the human language, not on the language of chemistry, physics, and biology."

What's Actually Working

There are real wins. In 2023, researchers at MIT and Harvard-affiliated Massachusetts General Hospital reported that their Sybil model predicted individual lung cancer risk up to six years out from a single low-dose CT scan. That's a useful clinical triage tool, not a cure, and peer-reviewed results back it.

On the discovery side, Google DeepMind's AlphaProteo has helped design protein binders for specific targets, including cancer-related molecules. And AlphaFold changed structure prediction and is already in use at Eli Lilly. Its methods and coverage are public.

But these are pieces of a massive puzzle. As Ricks put it, "We can get a machine to predict things pretty well, like predicting the structure of a protein. But that is one maybe 1,000th of the kind of problems we face in drug discovery."

Capital Is Betting Big

AI funding is surging on the belief it can spark breakthroughs. The "Stargate Project," a private joint venture announced at the White House in January 2025, is planned as a build-out of up to $500 billion in AI infrastructure through 2029. Larry Ellison even suggested such efforts could help produce a cancer vaccine in 48 hours. These are ambitious claims, and they need repeatable evidence from bench to bedside.

Why Generic LLMs Stall on Biology

LLMs learn patterns in text. Biology runs on constraints: thermodynamics, kinetics, stochasticity, cellular context, and messy, biased measurements. That mismatch shows up fast when you move from papers to petri dishes.

Drug discovery needs models that reason over sequence, structure, dynamics, and function, and that express calibrated uncertainty. It needs causal signals from perturbations, not just correlations scraped from literature.
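To make "calibrated uncertainty" concrete, here is a minimal sketch of one common check, the expected calibration error (ECE). It bins predicted probabilities and compares each bin's average confidence with the observed hit rate. The function name, the ten-bin default, and the toy inputs are illustrative assumptions, not taken from any model mentioned in this article.

```python
from typing import List, Sequence, Tuple


def expected_calibration_error(
    probs: Sequence[float], labels: Sequence[int], n_bins: int = 10
) -> float:
    """Weighted mean gap between predicted confidence and observed rate, per bin."""
    if len(probs) != len(labels) or len(probs) == 0:
        raise ValueError("probs and labels must be non-empty and equally long")
    if any(p < 0.0 or p > 1.0 for p in probs):
        raise ValueError("probabilities must lie in [0, 1]")

    # Assign each prediction to an equal-width bin; p == 1.0 goes in the last bin.
    bins: List[List[Tuple[float, int]]] = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))

    total = len(probs)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        mean_conf = sum(p for p, _ in bucket) / len(bucket)
        observed = sum(y for _, y in bucket) / len(bucket)
        # Weight each bin's gap by the share of predictions it holds.
        ece += (len(bucket) / total) * abs(mean_conf - observed)
    return ece


if __name__ == "__main__":
    # Hypothetical scores: the model says 0.9 and is right 9 times out of 10.
    print(expected_calibration_error([0.9] * 10, [1] * 9 + [0]))
```

A value near zero means stated probabilities match observed frequencies on that dataset. It says nothing about how the model behaves on a different cohort, which is why external validation still matters.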

What Researchers Can Do Now

  • Build specialized datasets with strong metadata: assay protocols, batch effects, negatives, confidence intervals, and standardized ontologies. Garbage in still equals garbage out.
  • Favor narrow predictors over general chat models: structure-to-function, binder affinity, ADMET, epitope likelihood, and patient-level risk with explicit uncertainty (e.g., conformal prediction, ensembling).
  • Close the loop with active learning: design-make-test cycles, Bayesian optimization, and automated labs to shrink iteration time and reduce false leads.
  • Use causal evidence where possible: CRISPR screens, perturb-seq, knockout/overexpression, and hybrid models that blend mechanistic priors with ML.
  • Stress-test for shift: external cohorts, prospective studies, blinded readouts, and temporal validation to catch degradation before deployment.
  • Prioritize interpretability tied to mechanism: counterfactuals, binding-mode rationales, and assay-level explanations over generic feature attributions.
  • Ship with discipline: data versioning, unit tests on features, assay simulators, model cards, and governance for genomic privacy and consent.
  • Treat AlphaFold/AlphaProteo as components, not endpoints. Use them to prune the search space, then confirm function in the wet lab before building on a prediction.
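
The "explicit uncertainty" item in the list above names conformal prediction. As a rough sketch of the split conformal variant for a numeric predictor (say, a binding-affinity regressor), the code below turns held-out calibration errors into an interval around a new prediction. The `predict` callable, the function names, and the toy numbers are hypothetical. The coverage guarantee is marginal and assumes the calibration data and new samples are exchangeable, which a shift in assay or cohort can break.

```python
import math
from typing import Callable, Sequence, Tuple


def conformal_quantile(residuals: Sequence[float], alpha: float) -> float:
    """Finite-sample-corrected (1 - alpha) quantile of absolute residuals."""
    n = len(residuals)
    if n == 0 or not 0.0 < alpha < 1.0:
        raise ValueError("need at least one residual and 0 < alpha < 1")
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points to support this coverage level.
        return math.inf
    return sorted(residuals)[rank - 1]


def conformal_interval(
    predict: Callable[[float], float],
    x_cal: Sequence[float],
    y_cal: Sequence[float],
    x_new: float,
    alpha: float = 0.1,
) -> Tuple[float, float]:
    """Interval around predict(x_new) sized by errors on a held-out calibration set."""
    # The calibration set must not have been used to fit `predict`.
    residuals = [abs(y - predict(x)) for x, y in zip(x_cal, y_cal)]
    q = conformal_quantile(residuals, alpha)
    center = predict(x_new)
    return center - q, center + q


if __name__ == "__main__":
    # Hypothetical fitted model and calibration measurements.
    model = lambda x: 2.0 * x
    print(conformal_interval(model, [1.0, 2.0, 3.0], [3.0, 4.0, 9.0], 5.0, alpha=0.5))
```

With only a handful of calibration points the interval becomes infinite at strict coverage levels. That is the method reporting, accurately, that there is not enough evidence to bound the error.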

The Case for Specialized Models

Ricks argues the path forward is focused models trained on advanced, domain-specific data. "The future here is actually to build more and more models of those narrow prediction problems because biology, unlike human language, doesn't follow all the same rules in the same way." AlphaFold and AlphaProteo are proof that tight scope beats vague generality.

Bottom Line

We've advanced, but we're early. As Ricks said, "We're sort of like a toddler in the language of biology."

AI will matter most where it is constrained, measured, and plugged into real experiments. Judge progress by prospective validation and clinical impact, not by slogans or press events.

