ADMET Predictions Get AI Boost, Federated Data Network Unites Pharma
Apheris has launched the ADMET Network, a federated initiative that lets pharma companies train shared absorption, distribution, metabolism, excretion, and toxicity (ADMET) models without sharing raw data. Five founding members (Lundbeck, Orion Pharma, Recursion, Servier, and one undisclosed partner) have each committed about 80% of their relevant data. The result is a global ADMET foundation model that all partners can access while governance and IP stay intact.
Drug discovery still carries a ~90% failure rate, with 40-45% of those failures tied to poor ADMET properties. AI can move these calls earlier in the pipeline, but the data is sparse, fragmented, and locked inside proprietary silos. The network tackles that head-on by coordinating training across datasets that never leave company boundaries.
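To make the coordination idea concrete, here is a minimal sketch of federated averaging, one common pattern for training across data that stays in place. It is not Apheris's implementation, which is not described here. The function names (`local_update`, `federated_average`, `train_federated`), the toy linear model, and the learning rate are all assumptions for illustration. Each site trains on its own rows, and only the resulting weights are pooled. A production system would add secure aggregation and governance controls, which this sketch leaves out.

```python
from typing import List, Sequence, Tuple

Row = Tuple[Sequence[float], float]  # (feature vector, measured endpoint)


def local_update(weights: Sequence[float], rows: Sequence[Row],
                 lr: float = 0.05, epochs: int = 5) -> List[float]:
    """Run gradient descent on one site's private rows; only weights are returned."""
    w = list(weights)
    for _ in range(epochs):
        grads = [0.0] * len(w)
        for features, target in rows:
            error = sum(wi * xi for wi, xi in zip(w, features)) - target
            for i, xi in enumerate(features):
                grads[i] += 2 * error * xi / len(rows)
        w = [wi - lr * g for wi, g in zip(w, grads)]
    return w


def federated_average(site_weights: Sequence[Sequence[float]],
                      site_sizes: Sequence[int]) -> List[float]:
    """Average site weights, weighting each site by its number of rows."""
    total = sum(site_sizes)
    dim = len(site_weights[0])
    return [
        sum(w[i] * n for w, n in zip(site_weights, site_sizes)) / total
        for i in range(dim)
    ]


def train_federated(sites: Sequence[Sequence[Row]], dim: int,
                    rounds: int = 20) -> List[float]:
    """Coordinator loop: broadcast weights, collect local updates, average them."""
    global_w = [0.0] * dim
    for _ in range(rounds):
        updates = [local_update(global_w, rows) for rows in sites]
        global_w = federated_average(updates, [len(rows) for rows in sites])
    return global_w


# Two hypothetical sites whose private data follow the same relation y = 2x.
site_a = [([1.0], 2.0), ([2.0], 4.0)]
site_b = [([1.0], 2.0), ([2.0], 4.0), ([3.0], 6.0)]
shared_weights = train_federated([site_a, site_b], dim=1)
```

In this toy run the coordinator only ever sees weight vectors, never the rows in `site_a` or `site_b`, which is the property the network relies on at much larger scale.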
What this means for engineering teams
- Federated base, local control: Members fine-tune and run inference inside their own secure environments. You integrate models into existing workflows and keep full control over how predictions drive decisions.
- General model + program relevance: "That combination, broad industrial training plus local specialization, is what allows models to remain both general and program-relevant," said Robin Roehm, Apheris co-founder and CEO. The structure also avoids anchoring to any single pharma pipeline.
- Compounding coverage: Each new member widens chemical space coverage for everyone. The initial focus is small molecules, with plans to include PROTACs, peptides, and macrocycles.
Why ADMET is a leverage point
Najat Khan, PhD, president and CEO of Recursion, points out that trials don't fail only because of the molecule. Dosing, PK during off-target effects, and picking the right patients also matter, and two of those connect straight back to ADMET. "ADMET is what changes molecules in chemistry to potential medicines for patients," she said.
From metrics to decisions
Julian Schönauer, PhD, at Apheris, stresses that benchmarking can't stop at statistical gains. The focus is practical lift: fewer dead-end syntheses, earlier no-go calls, and higher-quality leads that make it to the clinic. That's the bar for model generalization into new chemical space.
Implementation notes for IT and development
- Federated learning patterns: Expect secure aggregation and strict governance so raw data stays put. If you're building adjacent systems, budget for isolation, key management, and auditable policy enforcement; see federated learning for common approaches.
- Infra and MLOps: Stand up on-prem or VPC GPU capacity, containerized inference, signed artifacts, and private registries. Use shadow deployments and staged rollouts before models influence live design cycles.
- Evaluation: Test on out-of-distribution chemotypes and prospective sets, not just random splits. Track program KPIs: synthesis hit rate, tox flags avoided, and time-to-decision.
- Security: Enforce network segmentation, code provenance (SBOMs), and least-privilege access. Treat model weights, prompts, and prediction logs as sensitive artifacts.
- Data contracts: Lock down schema, units, and ontology mapping early. ADMET datasets are messy; invest in feature stores with lineage, QC checks, and reproducible preprocessing.
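The evaluation note above can be sketched as a group-wise holdout, where whole chemotypes are kept out of training so the test set probes unfamiliar chemistry rather than near-duplicates. This is a minimal illustration, not a method described by Apheris. It assumes each record already carries a precomputed chemotype label under a hypothetical `scaffold` key (in practice this might come from a cheminformatics toolkit), and the function name `group_holdout_split` is invented for this example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Record = Dict[str, object]


def group_holdout_split(records: List[Record], group_key: str = "scaffold",
                        test_fraction: float = 0.2) -> Tuple[List[Record], List[Record]]:
    """Split so that no group (chemotype) appears in both train and test."""
    groups: Dict[object, List[Record]] = defaultdict(list)
    for rec in records:
        groups[rec[group_key]].append(rec)

    # Largest groups are offered to train first, so the smaller, rarer
    # chemotypes tend to end up in the held-out set.
    ordered = sorted(groups.items(), key=lambda kv: (-len(kv[1]), str(kv[0])))

    max_train = len(records) - int(round(test_fraction * len(records)))
    train: List[Record] = []
    test: List[Record] = []
    for _, members in ordered:
        if len(train) + len(members) <= max_train:
            train.extend(members)
        else:
            test.extend(members)
    return train, test


# Hypothetical records: three chemotypes of different sizes.
records = (
    [{"id": f"A{i}", "scaffold": "A"} for i in range(5)]
    + [{"id": f"B{i}", "scaffold": "B"} for i in range(3)]
    + [{"id": f"C{i}", "scaffold": "C"} for i in range(2)]
)
train_set, test_set = group_holdout_split(records, test_fraction=0.2)
```

Comparing a model's error on this kind of split against its error on a random split gives a rough sense of how much performance depends on having seen similar chemistry before.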
How it differs from Lilly's TuneLab
Eli Lilly's TuneLab provides access to Lilly's proprietary drug discovery models in exchange for data that improves them. Those models are built primarily on Lilly's internal data, and the program often involves earlier-stage collaborators. Apheris's ADMET Network is structured for large pharma companies contributing substantial proprietary datasets under tight IP and governance.
Related momentum: AISB Consortium and early clinical signals
Apheris previously partnered with the AI Structural Biology Consortium to fine-tune OpenFold3 for protein-ligand co-folding using proprietary data from AbbVie and Johnson & Johnson, later expanding to Astex, Bristol Myers Squibb, and Takeda. For background on the model family, see OpenFold.
On the clinical side, Recursion closed 2025 with favorable Phase Ib and II results for REC-4881 in familial adenomatous polyposis. It's the first candidate to come through the company's full AI stack, Recursion OS, which fuses large-scale phenomics with transcriptomics, proteomics, metabolomics, and other multimodal data.
Action items for your team
- Inventory your ADMET-adjacent datasets and access controls; flag what can participate under strict governance.
- Prepare infrastructure for local fine-tuning and inference (GPU clusters, private registries, model observability, rollbacks).
- Define acceptance gates that tie model outputs to program decisions and risk thresholds, with audit trails.
- Pilot on a bounded chemical series to stress-test out-of-distribution performance and drift monitoring.
- Upskill engineers on secure federated ML and integration patterns. The AI Learning Path for Software Engineers is a practical place to start.
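As one way to picture the acceptance gates in the list above, the sketch below ties a model prediction to a program decision and writes an audit record each time. Everything here is hypothetical: the `Gate` fields, the threshold values, the decision labels, and the `evaluate` function are assumptions chosen for illustration, and real thresholds would come from each program's own risk tolerance.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import List


@dataclass(frozen=True)
class Gate:
    endpoint: str            # e.g. a toxicity endpoint the gate covers
    max_risk: float          # predicted risk above this means "no-go"
    max_uncertainty: float   # uncertainty above this means the model is not trusted


def evaluate(gate: Gate, compound_id: str, risk: float, uncertainty: float,
             model_version: str, audit_log: List[str]) -> str:
    """Return a decision for one compound and append a JSON audit record."""
    if uncertainty > gate.max_uncertainty:
        decision = "needs-assay"   # too uncertain: fall back to an experiment
    elif risk > gate.max_risk:
        decision = "no-go"
    else:
        decision = "advance"

    audit_log.append(json.dumps({
        "compound_id": compound_id,
        "gate": asdict(gate),
        "risk": risk,
        "uncertainty": uncertainty,
        "model_version": model_version,
        "decision": decision,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }, sort_keys=True))
    return decision


# Hypothetical gate and predictions.
log: List[str] = []
gate = Gate(endpoint="hERG", max_risk=0.7, max_uncertainty=0.2)
d1 = evaluate(gate, "cmpd-001", risk=0.3, uncertainty=0.1, model_version="v1", audit_log=log)
d2 = evaluate(gate, "cmpd-002", risk=0.9, uncertainty=0.1, model_version="v1", audit_log=log)
d3 = evaluate(gate, "cmpd-003", risk=0.9, uncertainty=0.5, model_version="v1", audit_log=log)
```

Checking uncertainty before risk is a deliberate choice in this sketch: a confident-looking number from a model that is out of its depth should trigger an assay rather than a go or no-go call.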
The takeaway
The data gap is the bottleneck for AI in drug discovery. A federated network that grows with each contributor extends model reach into new chemical space while letting teams keep tight control over how predictions drive real decisions.