PaperBanana: Google's five-agent illustrator turns methods text into diagrams and writes plot code

Google's PaperBanana turns methodology text into publication-grade diagrams and runnable plot code. Launched Jan 30, 2026, it won 72.7% of blind head-to-head preference tests and cuts per-figure time from hours to minutes.

Published on: Feb 07, 2026

Google's PaperBanana: Multi-agent AI for publication-ready scientific diagrams

PaperBanana is a multi-agent system built to take your methodology text and return publication-grade figures, without the 4-8 hour design slog per diagram. It launched alongside an arXiv paper on January 30, 2026 and is available as a paid web service.

Under the hood, it combines Gemini-3-Pro as the vision-language backbone with Nano-Banana-Pro and GPT-Image-1.5 for image generation. The team synthesized style rules from 292 NeurIPS 2025 papers, so outputs feel consistent with top-tier CS publications.

How it works

  • Retriever Agent: Finds two relevant reference figures from a curated pool to anchor layout and style.
  • Planner Agent: Converts methodology text into a detailed scene spec (components, spatial relationships, hierarchy).
  • Stylist Agent: Applies color, typography, iconography, and layout principles learned from 292 papers.
  • Visualizer Agent: Generates the diagram (via Nano-Banana-Pro) or emits Python for statistical plots.
  • Critic Agent: Scores faithfulness, conciseness, readability, and aesthetics; triggers up to three refinement passes (the loop is sketched below).
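
To make the control flow concrete, here is a minimal, self-contained sketch of that retrieve-plan-style-render-critique loop. Every function below is a trivial stub standing in for an LLM-backed agent; the names, signatures, and stopping rule are illustrative assumptions, not PaperBanana's actual API.

```python
from dataclasses import dataclass

# Sketch of the five-stage pipeline described above. The stubs stand in
# for LLM-backed agents; all names and signatures are assumptions.

MAX_REFINEMENTS = 3  # the Critic triggers up to three refinement passes

@dataclass
class Critique:
    acceptable: bool
    notes: str

def retrieve(text: str, pool: list[str], k: int = 2) -> list[str]:
    """Retriever: pick k reference figures (stub: first k in the pool)."""
    return pool[:k]

def plan(text: str, references: list[str]) -> dict:
    """Planner: methodology text -> scene spec (components, layout)."""
    return {"components": text.split(), "references": references}

def stylize(spec: dict) -> dict:
    """Stylist: apply color and typography rules to the spec."""
    return {**spec, "palette": "neurips-like", "font": "sans-serif"}

def visualize(spec: dict, feedback: str = "") -> str:
    """Visualizer: render a diagram (stub: a string stand-in)."""
    return f"diagram({len(spec['components'])} components, feedback={feedback!r})"

def criticize(figure: str, text: str) -> Critique:
    """Critic: score faithfulness, conciseness, readability, aesthetics."""
    return Critique(acceptable=True, notes="")

def generate_figure(methodology_text: str, reference_pool: list[str]) -> str:
    references = retrieve(methodology_text, reference_pool)
    spec = stylize(plan(methodology_text, references))
    figure = visualize(spec)
    for _ in range(MAX_REFINEMENTS):
        critique = criticize(figure, methodology_text)
        if critique.acceptable:
            break
        figure = visualize(spec, feedback=critique.notes)
    return figure
```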

For plots, PaperBanana writes executable code (matplotlib/seaborn) instead of raw images. That preserves labels, scales, legends, and data fidelity much better than pure image generation.
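
As an illustration of what code-first output looks like, here is a hypothetical script of the kind the Visualizer could emit for a simple grouped bar chart; the data, labels, and styling are invented for this example.

```python
# Hypothetical Visualizer-style output: the plot is defined as code, so
# labels, ticks, and legends stay exact. All numbers here are invented.
import matplotlib.pyplot as plt

methods = ["Baseline", "Ours"]
metric_a = [64.5, 77.4]  # illustrative scores
metric_b = [70.1, 74.9]

x = range(len(methods))
width = 0.35

fig, ax = plt.subplots(figsize=(5, 3))
ax.bar([i - width / 2 for i in x], metric_a, width, label="Readability")
ax.bar([i + width / 2 for i in x], metric_b, width, label="Aesthetics")
ax.set_xticks(list(x))
ax.set_xticklabels(methods)
ax.set_ylabel("Score")
ax.set_title("Illustrative metric comparison")
ax.legend()
fig.tight_layout()
fig.savefig("comparison.png", dpi=300)
```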

Why this matters for researchers

  • Time back: A diagram that drops from ~6 hours to ~30 minutes saves about 5.5 hours per figure; at 20-35 figures a year, that's roughly 100-200 hours back for a typical CS researcher.
  • Consistency: Built-in style rules deliver a clean, journal-friendly baseline across figures and papers.
  • Focus: You provide the scientific content; the system handles layout, typography, and visual hierarchy.

Results at a glance

The team built PaperBananaBench from 292 NeurIPS 2025 methodology cases. Against a baseline model (no agents), PaperBanana improved:

  • Faithfulness: +2.8 points
  • Conciseness: +37.2 points
  • Readability: +12.9 points
  • Aesthetics: +6.6 points
  • Overall: +17.0 points

In blind human preference tests, PaperBanana won 72.7% of head-to-heads, tied 20.7%, and lost 6.6%. Social chatter echoed this, with examples where outputs matched or beat designer-level figures, plus plenty of comments about time saved.

Code-first statistical visualization

On 240 ChartMimic cases spanning line, bar, scatter, and multi-panel plots, the system produced runnable Python that matched the described visuals. Treating plots as code avoids common image-gen errors: garbled labels, shifted points, or broken legends.

This mirrors real workflows. You get a clear script you can tweak (colors, annotations, fonts) without redrawing anything.
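
For instance, changing a series color or adding an annotation is an edit of a line or two in a script like this one (hypothetical; the data and labels are invented):

```python
# Hypothetical generated plot script, lightly edited by hand: a new
# series color and an added annotation, with nothing redrawn.
import matplotlib.pyplot as plt

epochs = [1, 2, 3, 4]
fig, ax = plt.subplots(figsize=(4, 2.5))
ax.plot(epochs, [0.61, 0.68, 0.74, 0.77], label="ours",
        color="#DD8452", linewidth=2)                 # edited: color, weight
ax.plot(epochs, [0.58, 0.61, 0.63, 0.64], label="baseline")
ax.annotate("gap widens", xy=(3, 0.74), xytext=(1.8, 0.75),
            arrowprops={"arrowstyle": "->"}, fontsize=8)  # edited: annotation
ax.set_xlabel("Epoch")
ax.set_ylabel("Accuracy")
ax.legend()
fig.tight_layout()
fig.savefig("figure.pdf")  # plot scripts can export vector output
```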

Pricing

  • Basic: $14.90/month, 10 credits
  • Pro: $29.90/month, 30 credits
  • Premium: $59.90/month, 100 credits
  • Enterprise: $119.90/month, 250 credits

One credit per illustration. English and Japanese prompts are supported at launch.
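
At one credit per illustration, a quick bit of arithmetic gives the per-figure cost of each tier:

```python
# Per-figure cost at one credit per illustration, from the listed tiers.
tiers = {"Basic": (14.90, 10), "Pro": (29.90, 30),
         "Premium": (59.90, 100), "Enterprise": (119.90, 250)}

for name, (price, credits) in tiers.items():
    print(f"{name}: ${price / credits:.2f} per figure")
# Basic: $1.49, Pro: $1.00, Premium: $0.60, Enterprise: $0.48
```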

Limitations to note

  • Raster output: Figures are 4K images, not vector (SVG/PDF). Scaling and precise edits are harder.
  • Editability: Local changes (e.g., a single label) often require regeneration rather than surgical edits.
  • Style fit: A NeurIPS-style bias may not match biology, physics, or social sciences out of the box.
  • Faithfulness gaps: Fine-grained connection errors (arrows, missing links) can slip past current critics.

Where it fits in the bigger picture

PaperBanana follows a growing pattern: break complex work into specialized agents and orchestrate them. Google's 2025 framework outlined this ladder of agent sophistication; PaperBanana sits in the collaborative multi-agent tier. Survey data from 2025 also pointed to strong ROI for teams adopting agent workflows.

Evaluation methods and dataset choices

PaperBananaBench samples real methodology text from 292 NeurIPS 2025 papers across four categories: Agent & Reasoning, Vision & Perception, Generative & Learning, and Science & Applications. That keeps cases realistic but skews toward CS/ML writing styles.

Faithfulness is relatively objective; conciseness and aesthetics are more subjective and discipline-dependent. The blind human study helps ground the results in practical preference, not just model-based scoring.

How it compares

  • Manual tools: Illustrator, OmniGraffle, and Visio produce perfect vectors but demand expert time and design judgment.
  • Domain tools: BioRender speeds up biology figures with icon libraries, yet composition is still on you.
  • General AI image gen: Can render "diagram-like" images, but lacks reference retrieval, style governance, critique loops, and code-first plotting.

What's next

  • Vector output: SVG/PDF for scale and precise editing.
  • Discipline-aware styles: Adapting to biology, physics, chemistry, and social sciences.
  • Interactive refinement: Conversational tweaks like "thicken this arrow" or "highlight the attention module."
  • Stronger QC: Better detection of subtle connectivity mistakes before final export.
  • Tool integrations: Plugins for Overleaf and Word to generate figures in-document.

Timeline

  • Jan 30, 2026: arXiv submission and commercial launch; project website goes live
  • Nov 2025: Google Cloud's agentic AI framework published
  • Sep 2025: Survey reports 88% ROI among early adopters of AI agents
  • Jul 2025: Market projection: agentic AI could reach ~$1T by 2035-2040
  • Jan 2025: Looker Studio adds modern charts, signaling broader visualization automation

Practical next steps

  • Start with a single methodology section you've already written. Use clear component names and relationships.
  • Generate one diagram and one plot. Check faithfulness first; then iterate for layout and typography.
  • If you need strict vector output, plan to regenerate later, or reserve manual tools for final polishing.
  • Create a lab-wide style note: preferred fonts, palettes, and icon rules to keep outputs consistent across authors (a plotting-side sketch follows this list).
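
On the plotting side, such a style note can be codified so every generated script inherits it. A minimal sketch, assuming matplotlib downstream; the fonts and palette below are placeholders, not PaperBanana settings:

```python
# Sketch of a shared lab style applied before running generated plot
# scripts. All values are placeholders; substitute your lab's choices.
import matplotlib.pyplot as plt

LAB_STYLE = {
    "font.family": "sans-serif",
    "font.sans-serif": ["Helvetica", "Arial"],
    "font.size": 9,
    "axes.spines.top": False,
    "axes.spines.right": False,
    "axes.prop_cycle": plt.cycler(color=["#0173B2", "#DE8F05", "#029E73"]),
    "figure.dpi": 300,
    "savefig.dpi": 300,
}

plt.rcParams.update(LAB_STYLE)
```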


Who built it

Research led by Dawei Zhu (Peking University) with Rui Meng, Yale Song, Xiyu Wei, Sujian Li, Tomas Pfister, and Jinsung Yoon from Google Cloud AI Research. Architecture: five agents coordinating on retrieval, planning, styling, visualization, and critique, powered by Gemini-3-Pro with Nano-Banana-Pro and GPT-Image-1.5.

