Anthropic launches Claude Science beta for reproducible scientific research pipelines

Anthropic's Claude Science beta brings multi-agent AI to reproducible scientific pipelines with 60+ skills. A reviewer agent catches errors and self-corrects before delivery.

Published on: Jul 05, 2026

Anthropic released Claude Science in beta on July 4, 2026 - a multi-agent AI workbench that runs reproducible genomics, proteomics, and cheminformatics pipelines using the company's existing Claude models. The app is available now on Pro, Max, Team, and Enterprise plans for macOS and Linux, targeting researchers who need to execute multi-step workflows across databases, notebooks, and cluster terminals while maintaining full provenance records.

The workbench integrates more than 60 curated skills and connectors pre-configured for genomics, single-cell analysis, proteomics, structural biology, and cheminformatics. Scientists describe their goal in plain language to a single coordinating agent, which then spins up specialized agents to handle the work. Every output carries an auditable history showing exactly how it was made.

How the multi-agent architecture works

A generalist coordinating agent receives the researcher's plain-language request and delegates tasks to domain-specialized agents that know established workflows for their fields. Users can also create their own specialist agents for custom pipelines. A separate reviewer agent inspects outputs step by step as the pipeline executes. It flags incorrect citations, numbers it cannot trace, and figures that do not match their underlying code, then initiates self-correction before delivering final results.
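The delegate-then-review loop described above can be sketched in a few lines of Python. This is only an illustration of the pattern, not Claude Science's actual interface: `StepResult`, `review`, `coordinate`, and `toy_genomics_agent` are hypothetical names, and the single check shown (a number with no traceable source) stands in for the fuller set of checks the article mentions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class StepResult:
    """Output of one pipeline step plus the evidence a reviewer can check."""
    step: str
    value: float
    source: Optional[str]  # where the number came from; None means untraceable


# Hypothetical specialist signature: (task description, attempt number) -> result.
Specialist = Callable[[str, int], StepResult]


def review(result: StepResult) -> List[str]:
    """Return a list of problems; an empty list means the step passes review."""
    problems = []
    if result.source is None:
        problems.append(f"{result.step}: number has no traceable source")
    return problems


def coordinate(
    plan: List[Tuple[str, str]],
    specialists: Dict[str, Specialist],
    max_retries: int = 1,
) -> List[StepResult]:
    """Delegate each (domain, task) pair, review it, and retry on findings."""
    accepted = []
    for domain, task in plan:
        agent = specialists[domain]
        for attempt in range(max_retries + 1):
            result = agent(task, attempt)
            problems = review(result)
            if not problems:
                accepted.append(result)
                break
        else:
            # Every attempt was flagged, so nothing is delivered for this step.
            raise RuntimeError(f"unresolved review findings: {problems}")
    return accepted


def toy_genomics_agent(task: str, attempt: int) -> StepResult:
    # The first attempt omits the source to exercise the reviewer; the retry fixes it.
    source = None if attempt == 0 else "variants.vcf"
    return StepResult(step=task, value=42.0, source=source)


results = coordinate(
    [("genomics", "count variants")],
    {"genomics": toy_genomics_agent},
)
```

In this sketch the reviewer runs after every step, so a flagged result is corrected before it reaches the accepted list rather than after the whole pipeline has finished.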

Compute scales from a single GPU on a laptop to hundreds on an HPC cluster. The agent drafts a resource plan and asks for approval before reaching for new infrastructure. It then writes and submits the job over SSH or through Modal, using the researcher's own accounts and infrastructure. Large or sensitive datasets never leave existing systems - only the context needed for each step is sent to Claude.
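The approval step can be pictured as a simple gate in front of job submission. The sketch below is an assumption about how such a gate might look, not the product's implementation: `ResourcePlan`, `needs_approval`, and `submit` are hypothetical, and the `approve` callback stands in for the researcher's yes-or-no answer.

```python
from dataclasses import dataclass
from typing import Callable, Set


@dataclass(frozen=True)
class ResourcePlan:
    """A drafted request for compute. Field names are illustrative."""
    gpus: int
    target: str  # for example "laptop" or a cluster name


def needs_approval(plan: ResourcePlan, approved_targets: Set[str]) -> bool:
    """New infrastructure means any target the researcher has not approved yet."""
    return plan.target not in approved_targets


def submit(
    plan: ResourcePlan,
    approved_targets: Set[str],
    approve: Callable[[ResourcePlan], bool],
) -> str:
    """Submit the job only if the target is already approved or the user agrees."""
    if needs_approval(plan, approved_targets):
        if not approve(plan):
            return "rejected"
        approved_targets.add(plan.target)
    return f"submitted {plan.gpus} GPU job to {plan.target}"


approved = {"laptop"}
local_status = submit(ResourcePlan(1, "laptop"), approved, lambda p: False)
denied_status = submit(ResourcePlan(200, "hpc"), approved, lambda p: False)
granted_status = submit(ResourcePlan(200, "hpc"), approved, lambda p: True)
```

Here the laptop job goes through without a prompt, while the cluster job is submitted only after the callback returns `True`.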

Reproducibility baked into every output

Every figure Claude Science generates ships with its exact code, software environment, a plain-language description, and the full message history. A researcher can return to a result months later and understand precisely how it was produced. Editing is conversational: ask the agent to change an axis to log scale, and it edits its own code. Sessions can be forked to compare two approaches without discarding the original work.
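As a rough picture of what such a record could hold, the Python sketch below bundles the four items the article lists and shows a fork that leaves the original untouched. The class name, field names, and version strings are illustrative assumptions; the article does not describe the actual storage format.

```python
import copy
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FigureRecord:
    """Hypothetical provenance bundle for one figure."""
    code: str                     # exact code that drew the figure
    environment: Dict[str, str]   # package name -> version
    description: str              # plain-language summary
    messages: List[str] = field(default_factory=list)  # conversation history

    def code_digest(self) -> str:
        """Fingerprint of the code, useful for spotting figure/code drift."""
        return hashlib.sha256(self.code.encode("utf-8")).hexdigest()

    def fork(self) -> "FigureRecord":
        """Return an independent copy so a variant never alters the original."""
        return copy.deepcopy(self)


original = FigureRecord(
    code="ax.plot(x, y)",
    environment={"python": "3.12", "matplotlib": "3.9"},  # example values only
    description="Expression level versus dose",
    messages=["Plot expression against dose."],
)

# A conversational edit applied to a fork: the request and the code change
# are both recorded, and the original record stays as it was.
variant = original.fork()
variant.messages.append("Change the y axis to log scale.")
variant.code += "\nax.set_yscale('log')"
```

Because the fork is a deep copy, the two records can be compared side by side, and their code digests differ as soon as the variant's code changes.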

The release signals growing interest in AI tools for science and research that prioritize audit trails over black-box outputs. For labs building on prior work, the provenance record attached to every artifact cuts down the months of reconstruction that typically follow when a lab member leaves or a project changes hands.

Domain coverage and real-world use

Claude Science queries and synthesizes across scientific databases including UniProt, PDB, Ensembl, ClinVar, ChEMBL, GEO, journals, and preprint servers. It also integrates skills from NVIDIA's BioNeMo Agent Toolkit, connecting natively to Evo 2 for genomics, Boltz-2 for biomolecular interaction prediction, and OpenFold3 for protein structure prediction.

Beta users have tested the workbench on practical research problems. Manifold Bio used it to nominate targets for tissue-targeting medicines, assessing surface expression, trafficking, and safety for each candidate. "The app did this end to end, unlike a general coding assistant," the company said. Jérôme Lecoq at the Allen Institute built a computational review template with roughly 20 custom skills. Sub-agents read thousands of papers into an evidence database, then wrote each section using actor-critic agent pairs. Reviews that once took his team up to two years now arrive in a fraction of that time - Lecoq already has about 10 reviews, many exceeding 100 pages. Stephen Francis at UCSF ran germline workups for glioma molecular epidemiology in roughly one-tenth the prior time, with his group independently validating the results.

Why this matters for science and research professionals

Claude Science addresses a structural problem in computational research: the gap between running an analysis and proving how it was done. Because the reviewer agent checks citations, numbers, and code-figure alignment as the pipeline runs, errors get caught during execution, not during peer review. The pipeline's ability to handle long-form literature reviews addresses a persistent bottleneck in academic research workflows - one that Lecoq's team cut from as long as two years to a fraction of that time.

Researchers extend the workbench through Model Context Protocol connectors that link to lab tools and electronic lab notebooks. Saved pipelines become reusable skills that persist across sessions. Validated methods propagate through a lab rather than living in a single researcher's scripts folder. The beta runs locally on macOS or Linux, with remote execution available over SSH for HPC login nodes.

