Ai2 debuts AutoDiscovery: an automated system that asks the next scientific question
Updated: 11:00 EST, February 12, 2026
The Allen Institute for AI (Ai2) has launched AutoDiscovery, an experimental system built to help researchers surface meaningful questions from overwhelming volumes of literature and data. The core idea: start with data, let the system generate hypotheses, test them, and iterate, without waiting for a human to craft the perfect prompt.
What it is
AutoDiscovery (formerly AutoDS) is now available inside AstaLabs, part of Ai2's Asta ecosystem, which supports analysis, summarization, and search across 108 million academic abstracts and 12 million full-text papers. The system generates hypotheses in natural language, proposes experiment plans, writes and executes Python code, interprets results, and uses those findings to generate new hypotheses.
It can run short analyses or work overnight. At the end, it outputs a reproducible list of potential research directions, each with code and traceable evidence paths.
Why it matters
Reading is not the tightest bottleneck in research; asking the right question is. AutoDiscovery helps compress the "what should we test next?" loop by scanning structured datasets drawn from anywhere between a handful and hundreds of papers and walking through branching statistical analyses automatically.
"AutoDiscovery's ability to reveal discoveries that may be hiding in plain sight is especially valuable in cancer research," said Dr. Kelly Paulson, medical oncologist and head of the center for immuno-oncology at the Swedish Cancer Institute.
How it works under the hood
- Hypothesis-first exploration: The system proposes testable statements and plans analyses before coding.
- Bayesian surprise: It maintains a prior belief based on world knowledge, updates that belief after seeing evidence, and quantifies how much expectations shifted. Both confirmations and disconfirmations matter; large shifts signal interesting leads. A classic example of disconfirmation with high impact is the shift from "miasma" to germ theory after Dr. John Snow's cholera mapping in 19th-century London.
- Search strategy with Monte Carlo Tree Search: It balances exploring novel ideas and following promising leads, guiding compute toward lines of inquiry that are more likely to pay off.
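The Bayesian surprise idea in the list above can be illustrated with a small sketch. The article does not give Ai2's exact formula, so this assumes one common formalization: belief in a hypothesis is a single probability, it is updated with Bayes' rule, and surprise is the KL divergence from prior to posterior. The function names and the example likelihoods are hypothetical.

```python
import math

def posterior(prior, likelihood_if_true, likelihood_if_false):
    """Bayes' rule for a binary hypothesis (true vs. false)."""
    numerator = prior * likelihood_if_true
    return numerator / (numerator + (1 - prior) * likelihood_if_false)

def bernoulli_kl(q, p):
    """KL(q || p) in bits between Bernoulli(q) and Bernoulli(p)."""
    eps = 1e-12  # clip to avoid log(0) at the extremes
    q = min(max(q, eps), 1 - eps)
    p = min(max(p, eps), 1 - eps)
    return q * math.log2(q / p) + (1 - q) * math.log2((1 - q) / (1 - p))

def surprise(prior, likelihood_if_true, likelihood_if_false):
    """Assumed surprise measure: how far the evidence moved the belief."""
    post = posterior(prior, likelihood_if_true, likelihood_if_false)
    return bernoulli_kl(post, prior)

# Illustrative numbers only: a strongly held belief (0.9) meets evidence
# that is far more likely if the hypothesis is false.
disconfirming = surprise(0.9, likelihood_if_true=0.1, likelihood_if_false=0.9)
# The same belief meets evidence that is equally likely either way.
uninformative = surprise(0.9, likelihood_if_true=0.5, likelihood_if_false=0.5)
```

Under this assumption, evidence that leaves the belief unchanged scores zero, while a confirmation or a disconfirmation that moves the belief scores higher the further it moves.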
In Ai2's words, it uses Bayesian surprise and Monte Carlo Tree Search to collaborate with researchers on the question: "What should be investigated next?"
"The ability to generate multiple hypotheses that can then be thoroughly evaluated by the user is extremely powerful," added Dr. Fabio Favoretto, marine ecologist at the Scripps Institution of Oceanography.
For background reading, see Bayesian inference and Monte Carlo Tree Search.
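The explore-versus-exploit balance in Monte Carlo Tree Search is usually implemented with the UCT selection rule. The sketch below shows that standard rule only; it is not Ai2's implementation. The `Node` class, the use of a surprise-like score as the reward, and the hypothesis labels are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass, field

@dataclass
class Node:
    hypothesis: str              # hypothetical label for a line of inquiry
    visits: int = 0              # how often this branch has been explored
    total_reward: float = 0.0    # assumed: accumulated surprise-like reward
    children: list = field(default_factory=list)

def uct_score(child, parent_visits, c=math.sqrt(2)):
    """Standard UCT: mean reward plus an exploration bonus."""
    if child.visits == 0:
        return math.inf  # unvisited branches are always tried first
    exploit = child.total_reward / child.visits
    explore = c * math.sqrt(math.log(parent_visits) / child.visits)
    return exploit + explore

def select_child(parent):
    """Pick the child with the highest UCT score."""
    return max(parent.children, key=lambda ch: uct_score(ch, parent.visits))

root = Node("root", visits=10)
well_explored = Node("dose-response link", visits=8, total_reward=4.0)
under_explored = Node("age interaction", visits=2, total_reward=0.8)
root.children = [well_explored, under_explored]
chosen = select_child(root)
```

In this example the less-visited branch is chosen even though its mean reward is lower, because the exploration bonus shrinks as a branch accumulates visits.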
What you can do with it today
- Scope a dataset: Assemble a structured set of papers with consistent outcomes and covariates (e.g., effect sizes, cohorts, biomarkers). The cleaner the schema, the better the automated analysis.
- Set constraints: Define time window, subfields, target outcomes, and acceptable statistical tests.
- Run fast vs. overnight: Kick off a quick pass for triage, then schedule deeper runs to explore branches with higher surprise scores.
- Audit provenance: Review each hypothesis with links to source papers, code used, and statistics produced. Re-run to reproduce.
- Escalate to validation: Prioritize hypotheses with large surprise and clear clinical or experimental impact. Design follow-up experiments or preregister confirmatory studies.
Practical guardrails
- Check assumptions: Inspect model choices, effect size estimates, multiple-testing corrections, and confounders.
- Replicate: Re-run the generated Python code; confirm results on held-out sets or external datasets.
- Document decisions: Keep a lab log of hypothesis branches you prune or promote and why.
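When a run tests many hypotheses, the multiple-testing check mentioned above matters because some small p-values will appear by chance. As a minimal sketch, assuming the generated analyses report one p-value per hypothesis, the Benjamini-Hochberg procedure controls the false discovery rate. The article does not say which correction AutoDiscovery applies; the function name and the p-values here are illustrative.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected
    under the Benjamini-Hochberg step-up procedure at FDR level alpha."""
    m = len(p_values)
    # Indices of the p-values, smallest first.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k with p_(k) <= (k / m) * alpha.
    max_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * alpha:
            max_rank = rank
    # Reject every hypothesis ranked at or below that k.
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[i] = True
    return rejected

# Illustrative p-values, not real results.
flags = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
```

Here only the first hypothesis survives correction, even though three of the four raw p-values fall below 0.05.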
Availability
AutoDiscovery is available now as an experimental feature in Asta, Ai2's open-science scholarly agentic AI framework. The system is meant to change how scientists relate to their data, moving from static repositories to a more collaborative workflow that continually proposes the next testable step.