Making science run at the speed of thought: the reality of AI in drug discovery - Part 1
"Make science run at the speed of thought" sounds like a slogan. It's not. It's a hard goal forged in failed models, missing metadata and experiments that don't compare the way we hope they do.
That tension sits at the heart of Eric Ma's day-to-day: building practical machine learning with bench scientists in immunology, chromatography and protein engineering. The lesson so far is simple: AI doesn't speed science by default. Without clean data and statistical discipline, it slows you down.
The economics of machine learning: a catch-22
Supervised models, which act as oracles standing in for the assay, need lots of high-quality examples. But the assays that matter most tend to be expensive. If you can't afford enough data, your model underperforms. If you can afford lots of data because the assay is cheap, brute-force screening might beat any model on speed and certainty.
That leaves a narrow window where ML actually pays off: expensive assays with usable historical data, or cases where uncertainty must be modelled carefully on small datasets. Even then, the value depends on how trustworthy your data and metadata are.
- If the assay is cheap and high-fidelity: screen widely first; consider a model only if you truly need to reduce a second round.
- If the assay is expensive and slow: use ML when you have traceable historical data or a plan for principled uncertainty quantification.
- If you cannot trace the data-generating process: fix that before training anything.
The hidden crisis in historical data
Big pharma often assumes decades of assays are an ML goldmine. Often, they're not. Many databases store only summarised values: no raw measurements, no controls from the same run, no record of the curve-fitting method or who ran the plate.
Over ten years, machines change, operators change, plate formats change, software changes. Yet we treat values like IC50 as if they're directly comparable across time. Sometimes they are. Many times they aren't.
- Assay drift: different technicians and instruments shift baselines in subtle ways.
- Software roulette: a JMP license lapses; a quick Python 2.7 script takes over; the script lives in a home directory and never gets archived.
- Lost context: no control curves, no plate maps, no calibration records, no versioned code for how IC50s were computed.
Train on that, and you're fitting to an unstable foundation. The model may look fine in cross-validation and fail in the next quarter when the lab changes a liquid handler tip.
Statistical systems require discipline
This is not solved by one smart statistician writing a clean R script. The problem is systemic. You're building a statistical system: every parameter, version, operator and instrument can affect the numbers you trust.
If you don't log the knobs you turn, you can't reconstruct what happened. You can't quantify uncertainty honestly. And you can't debug when results drift.
- Track raw data, not just summaries: per-well signals, plate maps, replicates and control wells from the same run.
- Version everything that changes math: curve-fitting code, library versions, model hyperparameters and priors.
- Bind data to context: operator IDs, instrument IDs, plate lot, reagent lot, assay protocol version, environment conditions (a minimal sketch follows this list).
- Keep audit trails in your LIMS/ELN and compute stack: who changed what, when and why.
- Define decision rules up front: thresholds, QC checks, acceptance criteria and how uncertainty affects go/no-go.
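To make "bind data to context" concrete, here is a minimal Python sketch of a plate record that carries its raw signals together with the context needed to reinterpret them later, and that fails loudly when anything is missing. The field names and values are illustrative, not a standard schema.

```python
from dataclasses import dataclass, asdict


@dataclass
class PlateRecord:
    """One plate's raw reads, bound to the context needed to reinterpret them."""
    plate_id: str
    assay_protocol_version: str
    operator_id: str
    instrument_id: str
    reagent_lot: str
    curve_fit_code_version: str
    raw_signals: dict       # well -> raw readout, e.g. {"A1": 0.113, ...}
    control_wells: dict     # well -> role, e.g. {"H11": "positive", "H12": "negative"}

    def validate(self) -> None:
        """Raise immediately if any contextual field or control is missing."""
        for name, value in asdict(self).items():
            if value in ("", None, {}):
                raise ValueError(f"Missing required field: {name}")
        if "positive" not in self.control_wells.values():
            raise ValueError("No positive control recorded for this plate")


record = PlateRecord(
    plate_id="P-2024-0417",
    assay_protocol_version="v3.2",
    operator_id="op-042",
    instrument_id="reader-07",
    reagent_lot="lot-88A",
    curve_fit_code_version="curvefit==1.4.0",
    raw_signals={"A1": 0.113, "A2": 0.421},
    control_wells={"H11": "positive", "H12": "negative"},
)
record.validate()  # an incomplete record never reaches a training set silently
```

The exact fields will differ by assay; the design choice is that validation happens at capture time, so incomplete records fail loudly instead of quietly becoming training data.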
The last five years have seen real progress here. But cheaper data generation does not remove the systems problem. Whether running pooled screens with next-gen sequencing or arrayed assays in 96-well plates, you still need disciplined computation in the loop, especially for functional assays where selection shortcuts don't apply.
What to implement this quarter
- Pick one high-value assay and do a metadata gap audit. List required fields (raw reads, controls, plate maps, protocol version, operator, instrument, code version). Make the gaps visible.
- Stand up a versioned curve-fitting pipeline. Lock the environment, log hyperparameters and store fit diagnostics alongside outputs (a sketch follows this list).
- Backfill controls and raw data for the last 6-12 months. If it doesn't exist, document the loss and wall it off from training.
- Add drift monitoring. Control charts on positive/negative controls; alarms when baselines shift; playbooks for investigation (see the control-chart sketch below).
- Introduce principled uncertainty. Use Bayesian models for small datasets; report credible intervals along with point estimates (see the PyMC sketch below).
- Make data contracts between bench and compute explicit. If a field is missing, analysis fails loudly-not silently.
- Write the runbook. How data moves from plate to model to decision. Owners, SLAs, and what "good" looks like.
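To show what a versioned curve-fitting step can look like, here is a minimal sketch using scipy's `curve_fit` on a standard four-parameter logistic: the IC50 is stored alongside fit diagnostics and the library versions that produced it. The dose-response values, starting guesses and pipeline version tag are made up for illustration.

```python
import json
import platform

import numpy as np
import scipy
from scipy.optimize import curve_fit


def four_param_logistic(conc, bottom, top, ic50, hill):
    """Standard 4PL dose-response curve."""
    return bottom + (top - bottom) / (1.0 + (conc / ic50) ** hill)


# Illustrative data: concentrations in nM and normalised responses.
conc = np.array([1, 3, 10, 30, 100, 300, 1000, 3000], dtype=float)
resp = np.array([0.98, 0.95, 0.88, 0.70, 0.45, 0.22, 0.10, 0.05])

popt, pcov = curve_fit(
    four_param_logistic, conc, resp,
    p0=[0.0, 1.0, 100.0, 1.0],  # starting guesses: bottom, top, IC50, hill
    maxfev=10000,
)
perr = np.sqrt(np.diag(pcov))   # rough standard errors on each parameter

# Store the fit *and* the context needed to reproduce it, not just an IC50.
fit_record = {
    "ic50_nM": float(popt[2]),
    "params": dict(zip(["bottom", "top", "ic50", "hill"], popt.tolist())),
    "param_stderr": perr.tolist(),
    "residual_rms": float(np.sqrt(np.mean((resp - four_param_logistic(conc, *popt)) ** 2))),
    "numpy_version": np.__version__,
    "scipy_version": scipy.__version__,
    "python_version": platform.python_version(),
    "pipeline_version": "curvefit-pipeline 0.1.0",  # hypothetical tag
}
print(json.dumps(fit_record, indent=2))
```

Storing parameter uncertainties and residuals next to the IC50 is what lets you judge, months later, whether two values are actually comparable.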
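For drift monitoring, a basic Shewhart-style control chart on the positive control is often enough to start with. A minimal sketch, assuming one positive-control value per run; the numbers are illustrative.

```python
import numpy as np

# Historical positive-control signal, one value per run (illustrative).
baseline = np.array([0.92, 0.95, 0.91, 0.94, 0.93, 0.96, 0.92, 0.95,
                     0.94, 0.93, 0.95, 0.91, 0.94, 0.92, 0.96, 0.93])
center = baseline.mean()
sigma = baseline.std(ddof=1)
ucl, lcl = center + 3 * sigma, center - 3 * sigma  # Shewhart 3-sigma limits


def check_run(control_value: float, run_id: str) -> bool:
    """Return True if the run's positive control sits inside the control limits."""
    in_control = lcl <= control_value <= ucl
    if not in_control:
        # In a real system this opens an investigation ticket, not a print line.
        print(f"ALERT {run_id}: control {control_value:.3f} outside [{lcl:.3f}, {ucl:.3f}]")
    return in_control


check_run(0.94, "run-2024-091")  # in control
check_run(0.78, "run-2024-092")  # baseline shift, e.g. after an instrument change
```

In practice you would persist the limits, recompute them only under change control, and pair every alarm with an investigation playbook.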
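For principled uncertainty on small datasets, a simple Bayesian model can report a credible interval instead of a bare point estimate. A minimal sketch using PyMC and ArviZ; the replicate values and priors are illustrative and should be adapted to your assay.

```python
import numpy as np
import pymc as pm
import arviz as az

# Five replicate pIC50 measurements for one compound (illustrative values).
pic50_obs = np.array([6.8, 7.1, 6.9, 7.3, 7.0])

with pm.Model():
    mu = pm.Normal("mu", mu=7.0, sigma=2.0)    # true pIC50, weakly informative prior
    sigma = pm.HalfNormal("sigma", sigma=1.0)  # replicate-to-replicate noise
    pm.Normal("obs", mu=mu, sigma=sigma, observed=pic50_obs)
    idata = pm.sample(2000, tune=1000, chains=4, random_seed=42)

# Report the credible interval alongside the point estimate.
print(az.summary(idata, var_names=["mu", "sigma"], hdi_prob=0.94))
```

Once the metadata above is in place, the same pattern can extend to hierarchical models that share information across plates, operators and instruments.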
How to decide if ML makes sense for your assay
- Cost per datapoint high, data sparse, but historical data is traceable: ML can focus screens and cut cycles; uncertainty quantification is essential.
- Cost per datapoint low, data abundant, high-fidelity readout: run the screen; use analytics for ranking and QC rather than prediction.
- Mixed fidelity (cheap pre-screen, expensive confirmatory): use models to triage and allocate confirmatory runs with explicit risk budgets.
The cultural shift that makes this stick
Put quantitative leaders in early. Bake traceability into the lab workflow, not as an afterthought. Measure scientists on decision quality and reproducibility, not just throughput. Treat model training like an assay: it has protocols, controls and change control.
Adopt standards that make data reusable and comparable across time and teams. The FAIR principles are a good baseline for scientific data stewardship and interoperability.
What this means for AI teams
Your value isn't in building fancy architectures. It's in reducing decision time without increasing error. That starts with data you can trust, uncertainty you can explain and systems you can rerun months later and get the same answer.
Do that, and "speed of thought" stops being a slogan. It becomes a property of your lab.
Meet the expert
Eric Ma - Senior Principal Data Scientist, Moderna
Eric leads the Data Science and Artificial Intelligence (Research) team at Moderna. Previously at the Novartis Institutes for Biomedical Research, he focused on Bayesian statistical methods to support new medicines for patients. He completed his PhD in Biological Engineering at MIT and was an Insight Health Data Fellow.
He builds open-source tools, including pyjanitor for data cleaning and nxviz for visualising NetworkX graphs, and is a core developer for NetworkX and PyMC. He contributes to the broader data science community through coding, blogging, teaching and writing.
Coming next
Part 2 will cover how automation, large language models and disciplined systems design can move your organisation closer to running science at the speed of thought.
Skill up
If you're formalising ML and automation practices in the lab and want structured learning paths, explore focused AI courses by role.
Complete AI Training - Courses by job