A Noble Pursuit: A Long-Time AI-in-Biotech Skeptic Takes Another Look
AI has pulled serious funding into biotech. That alone deserves a fresh, sober look. The question isn't "Will AI change drug discovery?" It's "Where does it actually help right now, and what still depends on the bench?"
Two kinds of AI plays in drug discovery
- AI platform builders: advancing molecular dynamics, structure prediction, and algorithms to score protein-protein or protein-small-molecule interactions.
- AI-driven drug makers: using those tools to design antibodies or small molecules, synthesize them, and push toward preclinical validation.
Open-source software already covers a surprising amount of ground. That makes pure "algorithm-only" differentiation hard unless the data, scale, or a unique experimental loop sets a team apart.
Where the signal is real
Protein structure prediction has crossed a meaningful threshold. Predicted structures can approach angstrom-level accuracy on many targets, which is actionable for a wide range of design tasks. Tools inspired by this progress have improved binder-design and epitope-targeting workflows.
If you're building or buying here, sanity-check claims against public benchmarks and your in-house controls. Also, ask what proprietary data or assays feed the improvement loop.
For context on structure prediction progress, see AlphaFold.
The bottleneck is still the bench
Real progress starts the moment predictions meet experiments. Assays often need to be built or adapted first, which takes time. AI can help propose multiplex assay panels and prioritize molecules, but the validation is still biology.
Cell-based assays vary. Animal data vary even more. You can feed results back into models, but "garbage in, garbage out" still applies, harshly.
Data quality is the leverage
- Stronger signal: well-resolved structures, tight biophysical readouts, standardized binding data.
- Weaker signal (for now): cellular phenotypes with batch effects, context drift, and noisy labels.
- What helps: rigorous controls, replicate depth, orthogonal assays, careful metadata, and versioned pipelines.
If you're serious about AI, be more serious about sample tracking, plate maps, edge effects, and QC gates. Most "AI gains" are actually data engineering gains.
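To make "QC gates" concrete, here is a minimal sketch of two common plate-level checks: a Z'-factor for assay window quality and an edge-effect ratio for a 96-well plate. Function names, thresholds, and plate layout are illustrative assumptions, not a standard.

```python
import statistics

def z_prime(pos_controls, neg_controls):
    """Z'-factor: assay window quality from positive/negative control wells.
    Values above ~0.5 are commonly treated as a robust screening assay."""
    mu_p, mu_n = statistics.mean(pos_controls), statistics.mean(neg_controls)
    sd_p, sd_n = statistics.stdev(pos_controls), statistics.stdev(neg_controls)
    return 1 - 3 * (sd_p + sd_n) / abs(mu_p - mu_n)

def edge_effect_ratio(plate, n_rows=8, n_cols=12):
    """Ratio of mean edge-well signal to mean interior signal for a
    row-major 96-well plate; ratios far from 1.0 suggest edge effects
    (evaporation, uneven incubation) worth flagging before modeling."""
    edge, interior = [], []
    for r in range(n_rows):
        for c in range(n_cols):
            v = plate[r * n_cols + c]
            is_edge = r in (0, n_rows - 1) or c in (0, n_cols - 1)
            (edge if is_edge else interior).append(v)
    return statistics.mean(edge) / statistics.mean(interior)
```

Checks like these are cheap to automate per plate; failing plates never reach the training set, which is exactly the "data engineering gain" described above.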
Compute isn't free (or invisible)
Search feels cheap. Training and large-scale inference are not. Models burn through GPUs and power; costs pile up fast at screening scale.
Track cost per viable hit or per successful lead-optimization cycle. The ROI question isn't about model accuracy; it's about decision speed per dollar spent.
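That accounting can be as simple as a blended compute-plus-wet-lab number per hit; the helper and all figures below are hypothetical, for illustration only.

```python
def cost_per_hit(gpu_hours, gpu_rate_usd, wetlab_usd, viable_hits):
    """Blended compute + wet-lab cost per viable hit for one campaign."""
    if viable_hits == 0:
        return float("inf")  # no hits: per-hit cost is unbounded
    return (gpu_hours * gpu_rate_usd + wetlab_usd) / viable_hits

# Hypothetical campaign: 1,000 GPU-hours at $2/h plus $48k of assays, 10 hits.
campaign = cost_per_hit(1000, 2.0, 48_000, 10)  # 5000.0 USD per viable hit
```

Comparing this number across model variants (or against an open-source baseline) is a more honest ROI signal than benchmark accuracy alone.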
Literature search: helpful, but shallow by default
Keeping up by hand is impossible. AI summarization is useful for first-pass filtering, but it treats all sources alike and often misses context and quality signals.
- Weight by journal standards, lab reputation, methods rigor, and sample size, not just the abstract's confidence.
- Always read the methods and figures for the short list. Let AI surface candidates; let your judgment decide.
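One way to operationalize that weighting is a crude triage score. The factors and weights below are illustrative assumptions, and each input is a 0-to-1 judgment assigned by a human or a rubric, not something the summarizer outputs.

```python
def evidence_score(methods_rigor, sample_size_adequacy,
                   lab_track_record, venue_standards):
    """Triage score for AI-surfaced papers; each factor scored 0-1.
    Weights deliberately favor methods rigor over venue prestige;
    they are illustrative, not calibrated."""
    weights = {"methods": 0.40, "n": 0.25, "lab": 0.20, "venue": 0.15}
    return (weights["methods"] * methods_rigor
            + weights["n"] * sample_size_adequacy
            + weights["lab"] * lab_track_record
            + weights["venue"] * venue_standards)
```

The point of such a score is only to order the reading pile; the short list still gets a human read of methods and figures.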
A practical playbook for R&D teams
- Define the decision: What would make you advance, pivot, or kill a program? Set metrics upfront (affinity, selectivity, ADME flags, off-target risk).
- Baseline with open tools: Before paying vendors, try public baselines to quantify lift on your targets and assays.
- Validate early and orthogonally: Pair biophysical assays with cell assays. Add a quick tox/aggregation screen to avoid chasing artifacts.
- Multiplex smartly: Use pooled or parallel assays to cull weak designs fast, then invest in depth on the survivors.
- Harden your data layer: Schema, metadata, versioning, batch notes, and automated QC. Your models are only as good as this spine.
- Track compute economics: Cost per screened design, per active, per lead series. Tie model complexity to ROI, not hype.
- Close the loop: Feed assay results back into the model with clear provenance and retraining cadences.
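The "define the decision" step above can be pinned down as a pre-registered gate, sketched below. The class, field names, and every threshold are placeholders to adapt per program, not recommendations.

```python
from dataclasses import dataclass

@dataclass
class GateCriteria:
    """Advance/kill thresholds set before screening starts (placeholders)."""
    max_kd_nm: float = 50.0            # binding affinity ceiling (KD, nM)
    min_selectivity_fold: float = 100.0
    max_offtarget_hits: int = 2
    max_adme_flags: int = 0

def decide(kd_nm, selectivity_fold, offtarget_hits, adme_flags,
           gate=GateCriteria()):
    """Return 'advance' only if every pre-registered criterion passes."""
    ok = (kd_nm <= gate.max_kd_nm
          and selectivity_fold >= gate.min_selectivity_fold
          and offtarget_hits <= gate.max_offtarget_hits
          and adme_flags <= gate.max_adme_flags)
    return "advance" if ok else "kill-or-pivot"
```

Writing the gate down before the screen runs is what keeps the decision fast; the model's job is only to fill in these numbers sooner.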
What to believe right now
- Good bet: structure-informed design, binder ranking, and narrowing the search space.
- Promising but uneven: cellular phenotype prediction and complex MoA inference.
- Use with caution: AI literature summaries without human curation.
- Non-negotiable: careful experiments beat confident predictions, every time.
Bottom line
AI can accelerate the path to a better experiment and a faster "no." That's real value. But drugs are made by data you measure, not by scores you predict.
If you want a structured way to skill up your team for protein design, assay development, and wet-lab data workflows, see the AI Learning Path for Biochemists.