PaperBanana: Google's five-agent illustrator turns methods text into diagrams and writes plot code

Google's PaperBanana turns methodology text into publication-grade diagrams and runnable plot code. Launched Jan 30, 2026, it won 72.7% of blind head-to-head preference tests and cuts per-figure time from hours to minutes.

Published on: Feb 07, 2026

Google's PaperBanana: Multi-agent AI for publication-ready scientific diagrams

PaperBanana is a multi-agent system built to take your methodology text and return publication-grade figures, without the 4-8 hour design slog per diagram. It launched alongside an arXiv paper on January 30, 2026 and is available as a paid web service.

Under the hood, it combines Gemini-3-Pro as the vision-language backbone with Nano-Banana-Pro and GPT-Image-1.5 for image generation. The team synthesized style rules from 292 NeurIPS 2025 papers, so outputs feel consistent with top-tier CS publications.

How it works

  • Retriever Agent: Finds two relevant reference figures from a curated pool to anchor layout and style.
  • Planner Agent: Converts methodology text into a detailed scene spec (components, spatial relationships, hierarchy).
  • Stylist Agent: Applies color, typography, iconography, and layout principles learned from 292 papers.
  • Visualizer Agent: Generates the diagram (via Nano-Banana-Pro) or emits Python for statistical plots.
  • Critic Agent: Scores faithfulness, conciseness, readability, and aesthetics; triggers up to three refinement passes (the loop is sketched below).
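
To make the control flow concrete, here is a minimal, self-contained sketch of that retrieve-plan-style-render-critique loop. Every function below is a trivial stub standing in for an LLM-backed agent; the names, signatures, and stopping rule are illustrative assumptions, not PaperBanana's actual API.

```python
from dataclasses import dataclass

# Sketch of the five-stage pipeline described above. The stubs stand in
# for LLM-backed agents; all names and signatures are assumptions.

MAX_REFINEMENTS = 3  # the Critic triggers up to three refinement passes

@dataclass
class Critique:
    acceptable: bool
    notes: str

def retrieve(text: str, pool: list[str], k: int = 2) -> list[str]:
    """Retriever: pick k reference figures (stub: first k in the pool)."""
    return pool[:k]

def plan(text: str, references: list[str]) -> dict:
    """Planner: methodology text -> scene spec (components, layout)."""
    return {"components": text.split(), "references": references}

def stylize(spec: dict) -> dict:
    """Stylist: apply color and typography rules to the spec."""
    return {**spec, "palette": "neurips-like", "font": "sans-serif"}

def visualize(spec: dict, feedback: str = "") -> str:
    """Visualizer: render a diagram (stub: a string stand-in)."""
    return f"diagram({len(spec['components'])} components, feedback={feedback!r})"

def criticize(figure: str, text: str) -> Critique:
    """Critic: score faithfulness, conciseness, readability, aesthetics."""
    return Critique(acceptable=True, notes="")

def generate_figure(methodology_text: str, reference_pool: list[str]) -> str:
    references = retrieve(methodology_text, reference_pool)
    spec = stylize(plan(methodology_text, references))
    figure = visualize(spec)
    for _ in range(MAX_REFINEMENTS):
        critique = criticize(figure, methodology_text)
        if critique.acceptable:
            break
        figure = visualize(spec, feedback=critique.notes)
    return figure
```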

For plots, PaperBanana writes executable code (matplotlib/seaborn) instead of raw images. That preserves labels, scales, legends, and data fidelity much better than pure image generation.
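
As an illustration of what code-first output looks like, here is a hypothetical script of the kind the Visualizer could emit for a simple grouped bar chart; the data, labels, and styling are invented for this example.

```python
# Hypothetical Visualizer-style output: the plot is defined as code, so
# labels, ticks, and legends stay exact. All numbers here are invented.
import matplotlib.pyplot as plt

methods = ["Baseline", "Ours"]
metric_a = [64.5, 77.4]  # illustrative scores
metric_b = [70.1, 74.9]

x = range(len(methods))
width = 0.35

fig, ax = plt.subplots(figsize=(5, 3))
ax.bar([i - width / 2 for i in x], metric_a, width, label="Readability")
ax.bar([i + width / 2 for i in x], metric_b, width, label="Aesthetics")
ax.set_xticks(list(x))
ax.set_xticklabels(methods)
ax.set_ylabel("Score")
ax.set_title("Illustrative metric comparison")
ax.legend()
fig.tight_layout()
fig.savefig("comparison.png", dpi=300)
```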

Why this matters for researchers

  • Time back: A diagram that drops from ~6 hours to ~30 minutes saves about 5.5 hours per figure; at 20-35 figures a year, that's roughly 100-200 hours back for a typical CS researcher.
  • Consistency: Built-in style rules deliver a clean, journal-friendly baseline across figures and papers.
  • Focus: You provide the scientific content; the system handles layout, typography, and visual hierarchy.

Results at a glance

The team built PaperBananaBench from 292 NeurIPS 2025 methodology cases. Against a baseline model (no agents), PaperBanana improved:

  • Faithfulness: +2.8 points
  • Conciseness: +37.2 points
  • Readability: +12.9 points
  • Aesthetics: +6.6 points
  • Overall: +17.0 points

In blind human preference tests, PaperBanana won 72.7% of head-to-heads, tied 20.7%, and lost 6.6%. Social chatter echoed this, with examples where outputs matched or beat designer-level figures, plus plenty of comments about time saved.

Code-first statistical visualization

On 240 ChartMimic cases spanning line, bar, scatter, and multi-panel plots, the system produced runnable Python that matched the described visuals. Treating plots as code avoids common image-gen errors: garbled labels, shifted points, or broken legends.

This mirrors real workflows. You get a clear script you can tweak (colors, annotations, fonts) without redrawing anything.
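
For instance, changing a series color or adding an annotation is an edit of a line or two in a script like this one (hypothetical; the data and labels are invented):

```python
# Hypothetical generated plot script, lightly edited by hand: a new
# series color and an added annotation, with nothing redrawn.
import matplotlib.pyplot as plt

epochs = [1, 2, 3, 4]
fig, ax = plt.subplots(figsize=(4, 2.5))
ax.plot(epochs, [0.61, 0.68, 0.74, 0.77], label="ours",
        color="#DD8452", linewidth=2)                 # edited: color, weight
ax.plot(epochs, [0.58, 0.61, 0.63, 0.64], label="baseline")
ax.annotate("gap widens", xy=(3, 0.74), xytext=(1.8, 0.75),
            arrowprops={"arrowstyle": "->"}, fontsize=8)  # edited: annotation
ax.set_xlabel("Epoch")
ax.set_ylabel("Accuracy")
ax.legend()
fig.tight_layout()
fig.savefig("figure.pdf")  # plot scripts can export vector output
```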

Pricing

  • Basic: $14.90/month, 10 credits
  • Pro: $29.90/month, 30 credits
  • Premium: $59.90/month, 100 credits
  • Enterprise: $119.90/month, 250 credits

One credit per illustration. English and Japanese prompts are supported at launch.
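
At one credit per illustration, a quick bit of arithmetic gives the per-figure cost of each tier:

```python
# Per-figure cost at one credit per illustration, from the listed tiers.
tiers = {"Basic": (14.90, 10), "Pro": (29.90, 30),
         "Premium": (59.90, 100), "Enterprise": (119.90, 250)}

for name, (price, credits) in tiers.items():
    print(f"{name}: ${price / credits:.2f} per figure")
# Basic: $1.49, Pro: $1.00, Premium: $0.60, Enterprise: $0.48
```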

Limitations to note

  • Raster output: Figures are 4K images, not vector (SVG/PDF). Scaling and precise edits are harder.
  • Editability: Local changes (e.g., a single label) often require regeneration rather than surgical edits.
  • Style fit: A NeurIPS-style bias may not match biology, physics, or social sciences out of the box.
  • Faithfulness gaps: Fine-grained connection errors (arrows, missing links) can slip past current critics.

Where it fits in the bigger picture

PaperBanana follows a growing pattern: break complex work into specialized agents and orchestrate them. Google's 2025 framework outlined this ladder of agent sophistication; PaperBanana sits in the collaborative multi-agent tier. Survey data from 2025 also pointed to strong ROI for teams adopting agent workflows.

Evaluation methods and dataset choices

PaperBananaBench samples real methodology text from 292 NeurIPS 2025 papers across four categories: Agent & Reasoning, Vision & Perception, Generative & Learning, and Science & Applications. That keeps cases realistic but skews toward CS/ML writing styles.

Faithfulness is relatively objective; conciseness and aesthetics are more subjective and discipline-dependent. The blind human study helps ground the results in practical preference, not just model-based scoring.

How it compares

  • Manual tools: Illustrator, OmniGraffle, and Visio produce perfect vectors but demand expert time and design judgment.
  • Domain tools: BioRender speeds up biology figures with icon libraries, yet composition is still on you.
  • General AI image gen: Can render "diagram-like" images, but lacks reference retrieval, style governance, critique loops, and code-first plotting.

What's next

  • Vector output: SVG/PDF for scale and precise editing.
  • Discipline-aware styles: Adapting to biology, physics, chemistry, and social sciences.
  • Interactive refinement: Conversational tweaks like "thicken this arrow" or "highlight the attention module."
  • Stronger QC: Better detection of subtle connectivity mistakes before final export.
  • Tool integrations: Plugins for Overleaf and Word to generate figures in-document.

Timeline

  • Jan 30, 2026: arXiv submission and commercial launch; project website goes live
  • Nov 2025: Google Cloud's agentic AI framework published
  • Sep 2025: Survey reports 88% ROI among early adopters of AI agents
  • Jul 2025: Market projection: agentic AI could reach ~$1T by 2035-2040
  • Jan 2025: Looker Studio adds modern charts, signaling broader visualization automation

Practical next steps

  • Start with a single methodology section you've already written. Use clear component names and relationships.
  • Generate one diagram and one plot. Check faithfulness first; then iterate for layout and typography.
  • If you need strict vector output, plan to regenerate later, or reserve manual tools for final polishing.
  • Create a lab-wide style note: preferred fonts, palettes, and icon rules to keep outputs consistent across authors (a plotting-side sketch follows this list).
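
On the plotting side, such a style note can be codified so every generated script inherits it. A minimal sketch, assuming matplotlib downstream; the fonts and palette below are placeholders, not PaperBanana settings:

```python
# Sketch of a shared lab style applied before running generated plot
# scripts. All values are placeholders; substitute your lab's choices.
import matplotlib.pyplot as plt

LAB_STYLE = {
    "font.family": "sans-serif",
    "font.sans-serif": ["Helvetica", "Arial"],
    "font.size": 9,
    "axes.spines.top": False,
    "axes.spines.right": False,
    "axes.prop_cycle": plt.cycler(color=["#0173B2", "#DE8F05", "#029E73"]),
    "figure.dpi": 300,
    "savefig.dpi": 300,
}

plt.rcParams.update(LAB_STYLE)
```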


Who built it

Research led by Dawei Zhu (Peking University) with Rui Meng, Yale Song, Xiyu Wei, Sujian Li, Tomas Pfister, and Jinsung Yoon from Google Cloud AI Research. Architecture: five agents coordinating on retrieval, planning, styling, visualization, and critique, powered by Gemini-3-Pro with Nano-Banana-Pro and GPT-Image-1.5.

