AI proposing and testing hypotheses within five years: what researchers should prepare for
Two researchers, Robert West (EPFL) and Ágnes Horvát (Northwestern), are blunt about where science is headed. AI is changing how work is written, reviewed, and found. The next shift is bigger: AI proposing the questions we study, and running parts of the process end to end.
That future creates clear upside and real risk. Below is what they're seeing now, what might arrive within five years, and how to prepare.
Social media still helps, just less than before
Horvát's group found a measurable citation boost for scientists active on social platforms over a seven-year window, but the benefit is trending down. Attention is compressed into short posts; click-driven formats reward hype over nuance.
Her team also detected unmistakable LLM fingerprints in 2024 biomedical abstracts: roughly 13% showed signs of AI "massaging," flagged by a set of ~500 telltale words. West reported a similar pattern on the other side of the pipeline: at least 16% of ICLR 2024 peer reviews used LLM help.
That creates a loop where AI writes papers, AI reviews them, and readers ask AI to summarize the results. West's take: if human writing quality is often poor, AI might level the field rather than widen gaps, but it also standardizes voice.
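A minimal sketch of what word-list screening of the kind Horvát's team describes could look like; the word list, threshold, and function names below are illustrative assumptions, not the study's actual lexicon or method.

```python
# Illustrative sketch: flag abstracts whose rate of "telltale" words exceeds
# a threshold. The word set and threshold are assumptions for demonstration.
import re
from typing import Iterable

TELLTALE_WORDS = {"delve", "intricate", "pivotal", "underscore", "notably", "showcasing"}

def telltale_rate(text: str) -> float:
    """Fraction of tokens that belong to the telltale-word set."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    return sum(token in TELLTALE_WORDS for token in tokens) / len(tokens)

def flag_abstracts(abstracts: Iterable[str], threshold: float = 0.01) -> list[bool]:
    """True where an abstract's telltale rate exceeds the illustrative threshold."""
    return [telltale_rate(a) > threshold for a in abstracts]
```

Screening like this gives a lower bound at best: it catches stylistic fingerprints, not AI use that has been edited away.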
The risk: homogenized ideas and false certainty
Horvát worries about two shifts. First, homogenization: AI tends to produce similar phrasing and framing, which can narrow the space of ideas that feel "fundable" or "publishable." Second, overconfidence: models default to confident statements even when evidence is thin.
Presentation choices steer what gets cited, funded, and built on. Offloading that to AI changes which ideas get oxygen, often without anyone noticing.
Misinformation moves faster with machines
Most online science content is remixed from sources with unclear provenance. AI makes it cheaper and faster to produce endless variants, including plausible junk.
West notes that modern models are highly persuasive when instructed to argue a position. That's free propaganda at scale. Detection tools underestimate true prevalence, so dashboards likely show only a slice of the activity.
The next leap: AI-generated hypotheses
Horvát expects AI will propose research ideas within five years. West agrees, and raises the harder question: will humans be able to follow the science AI is doing, and will the questions align with human priorities?
AI can read everything. That's an advantage for coverage, not necessarily for values. The hard part in science is choosing the next question. If AI does this "better," does it also care about what matters for people?
What labs, journals, and funders can do now
- Require clear disclosure of AI use across the pipeline. Separate writing assistance, code generation, data analysis, and review support. Align with guidance from bodies like COPE (AI tools and authorship).
- Guard against false certainty. Calibrate abstracts and titles to match evidence strength. Enforce hedging where appropriate and penalize overclaiming.
- Add audit trails. Store prompts, versions, and outputs for AI-assisted steps (a minimal logging sketch follows this list). Keep human-in-the-loop signoff for core claims, statistics, and ethics.
- Publish with reproducibility guarantees. Share code, data, and prompts. Pre-register when feasible. Make replication plans part of grant proposals.
- Use social channels with intent. Favor threads that preserve context, link preprints, and provide effect sizes and limitations. Skip clickbait; it backfires with expert audiences.
- Prepare review processes for AI-generated hypotheses. Create panels to assess novelty, risk, and societal value. Demand mechanistic plausibility, not just pattern-matching.
- Strengthen misinformation response. Track bot amplification around key topics and coordinate corrections with reputable sources (see WHO infodemic management).
- Invest in AI literacy across roles. Train PIs, students, and reviewers on model limits, prompt auditing, and calibration. Set lab policies for acceptable use.
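To make the audit-trail item concrete, here is a minimal sketch of how a lab might log AI-assisted steps; the record fields, file format, and hashing scheme are assumed choices, not an established standard.

```python
# Illustrative append-only audit log for AI-assisted steps.
# Field names, JSONL storage, and the hash are assumptions for the sketch.
import json, hashlib, datetime
from dataclasses import dataclass, asdict

@dataclass
class AIAssistRecord:
    step: str           # e.g. "abstract_drafting", "code_generation"
    model: str           # model name and version used
    prompt: str          # full prompt sent to the model
    output: str          # model output as accepted or edited
    human_signoff: str   # person approving the step

def log_record(record: AIAssistRecord, path: str = "ai_audit_log.jsonl") -> str:
    """Append the record with a timestamp and content hash; return the hash."""
    entry = asdict(record)
    entry["timestamp"] = datetime.datetime.now(datetime.timezone.utc).isoformat()
    entry["sha256"] = hashlib.sha256(
        (entry["prompt"] + entry["output"]).encode("utf-8")
    ).hexdigest()
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")
    return entry["sha256"]
```

A log like this lets an editor or reviewer later compare prompts and outputs against what was submitted, which is the point of the provenance requirement.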
Metrics worth watching
- Share of abstracts, reviews, and code commits with declared AI assistance.
- Claim calibration: language strength vs. effect size and uncertainty.
- Time-to-replication and replication success rates by field.
- Proportion of engagement from suspected bot accounts on science posts.
- Diversity of cited ideas and methods over time (homogenization proxy; a sketch of one such proxy follows this list).
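One simple homogenization proxy is the entropy of the distribution of method keywords across a year's papers; the sketch below assumes keyword extraction has already happened upstream and is only meant to illustrate the metric.

```python
# Illustrative homogenization proxy: Shannon entropy (in bits) of method
# keywords across papers. Lower entropy suggests a narrower set of ideas.
import math
from collections import Counter

def keyword_entropy(keywords_per_paper: list[list[str]]) -> float:
    """Entropy of the pooled keyword distribution for one time window."""
    counts = Counter(k for paper in keywords_per_paper for k in paper)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

Tracking this value year over year, per field, gives a rough signal of whether the space of methods is narrowing.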
If AI starts asking the questions
Set boundaries now. Decide which domains are appropriate for AI-led hypotheses, what evidence qualifies as "enough" to greenlight wet-lab or human studies, and how to factor public interest into topic selection.
West and Horvát aren't arguing for a slowdown; they're asking for conscious control. Keep humans accountable for direction, values, and consequences, even if machines draft the path.
Practical upskilling
If your lab is formalizing AI policies or building prompt and audit workflows, structured training helps. See researcher-focused options here: AI courses by job.
Bottom line: AI is already embedded in how we write, review, and spread science. The near-term challenge is calibration and provenance; the medium-term challenge is choosing which AI-suggested questions deserve our time, funding, and trust.