Gemini Robotics 1.5 and ER 1.5 enable robots that think before they act in the real world

Gemini Robotics 1.5 turns high-level goals into actions with two models: ER 1.5 plans and calls tools, while Robotics 1.5 executes from vision and language. The result: safer, more transparent skills that transfer across robots.

Published on: Sep 26, 2025

Gemini Robotics 1.5 brings AI agents into the physical world

Robots are moving from pre-scripted routines to agents that can perceive, plan, reason, use tools and take action. Google's new Gemini Robotics models push that shift forward with a practical two-model approach that turns high-level goals into real-world outcomes.

What's new

  • Gemini Robotics 1.5 (VLA): A vision-language-action model that converts visual context and instructions into motor commands. It thinks before acting, shares its process, and learns skills that transfer across different robot bodies.
  • Gemini Robotics-ER 1.5 (VLM): An embodied reasoning model that plans multi-step missions, calls digital tools natively, and achieves state-of-the-art results on spatial benchmarks.

Gemini Robotics-ER 1.5 is available now via the Gemini API in Google AI Studio. Gemini Robotics 1.5 is available to select partners.

Why this matters for product teams

  • Turn open-ended requests into end-to-end execution (plan, verify, act).
  • Decompose long tasks into reliable segments with transparent reasoning.
  • Reuse skills across robot platforms without retraining from scratch.
  • Use built-in tool calling (e.g., search or custom functions) to inject enterprise context.

How the system works

  • Orchestrate with ER 1.5: Creates the plan, makes decisions, estimates progress and success, and calls tools for missing info (e.g., local recycling rules).
  • Execute with Robotics 1.5: Interprets each step using vision and language, then produces motor commands. It reasons at multiple levels, from "what" to "how," before the robot moves. A minimal sketch of this loop follows below.
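
To make the division of labor concrete, here is a minimal Python sketch of that loop. The helper names (plan_with_er, execute_with_vla), the Step type, and the re-planning policy are hypothetical stand-ins for your own wrappers around the two models, not an official API.

```python
# Minimal sketch of the plan/execute loop described above. The helper names
# (plan_with_er, execute_with_vla) and the Step type are hypothetical stand-ins
# for your own wrappers around ER 1.5 (planner) and Robotics 1.5 (executor).
from dataclasses import dataclass

@dataclass
class Step:
    instruction: str  # natural-language sub-task, e.g. "pick up the green bottle"

def plan_with_er(goal: str, scene: str) -> list[Step]:
    """Ask ER 1.5 for a multi-step plan; tool calls happen inside this wrapper."""
    raise NotImplementedError  # Gemini API call goes here

def execute_with_vla(step: Step, frame: bytes) -> bool:
    """Hand one step plus the current camera frame to Robotics 1.5; return success."""
    raise NotImplementedError  # on-robot model call goes here

def run_mission(goal: str, get_frame, describe_scene) -> None:
    steps = plan_with_er(goal, describe_scene(get_frame()))
    while steps:
        step = steps.pop(0)
        if not execute_with_vla(step, get_frame()):
            # On failure, ER 1.5 re-plans from the current scene and the loop continues.
            steps = plan_with_er(goal, describe_scene(get_frame()))
```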

Example: real-world sorting

Ask a robot to "Sort these items by local compost, recycling, and trash rules." ER 1.5 fetches the guidelines via a tool call, drafts a step-by-step plan, and assigns actions. Robotics 1.5 inspects objects, thinks through the motions, and completes the sorting safely and fully.
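
Here is a hedged sketch of the tool-call half of this example, assuming the Gemini API Python SDK (google-genai) and its support for passing Python functions as tools. The get_local_recycling_rules stub, the prompt, and the model identifier are illustrative and should be checked against the current documentation.

```python
# Sketch: let ER 1.5 fetch waste-sorting rules through a user-defined tool.
# get_local_recycling_rules is a stub for your own lookup; the model
# identifier is an assumption to confirm against the Gemini API model list.
from google import genai
from google.genai import types

def get_local_recycling_rules(city: str) -> dict:
    """Return compost, recycling, and trash rules for a city (stub)."""
    return {
        "compost": ["food scraps", "soiled paper"],
        "recycling": ["clean paper", "metal cans", "rigid plastics"],
        "trash": ["soft plastics", "broken ceramics"],
    }

client = genai.Client()  # reads the API key from the environment
response = client.models.generate_content(
    model="gemini-robotics-er-1.5-preview",  # assumed identifier
    contents=(
        "Sort the items on the table into compost, recycling, and trash "
        "using San Francisco rules. Return a numbered step-by-step plan."
    ),
    config=types.GenerateContentConfig(
        tools=[get_local_recycling_rules],  # the SDK can call Python functions for the model
    ),
)
print(response.text)  # the plan, with rules fetched via the tool call
```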

Key capabilities that move the needle

  • Thinks before acting: Generates an internal chain of reasoning in natural language, improving multi-step reliability and adaptability.
  • Cross-embodiment learning: Transfers motions between different robots without bespoke specialization (e.g., from ALOHA 2 to Apollo and Franka, and vice versa).
  • Advanced spatial reasoning: Leads across 15 embodied benchmarks (ERQA, Point-Bench, RefSpatial, RoboSpatial-Pointing, Where2Place, BLINK, CV-Bench, EmbSpatial, MindCube, RoboSpatial-VQA, SAT, Cosmos-Reason1, Minimal Video Pairs, OpenEQA, VSI-Bench). A pointing query sketch follows this list.
  • Long-horizon tasking: Breaks complex goals into shorter, executable segments and monitors progress.
  • Transparent decisions: Can explain steps and rationale in natural language for easier debugging and operator trust.
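
To exercise the pointing side of the spatial-reasoning capability above, a query against a workspace image might look like the sketch below; the image file, the model identifier, and the requested output format (normalized [y, x] points in JSON) are assumptions to verify against the model documentation.

```python
# Sketch: ask ER 1.5 to point at objects in a workspace image.
# The image path, model identifier, and output format are assumptions.
from google import genai
from google.genai import types

client = genai.Client()
with open("workbench.jpg", "rb") as f:  # hypothetical camera frame
    frame = f.read()

response = client.models.generate_content(
    model="gemini-robotics-er-1.5-preview",  # assumed identifier
    contents=[
        types.Part.from_bytes(data=frame, mime_type="image/jpeg"),
        'Point to every object that belongs in the recycling bin. '
        'Answer as a JSON list of {"label": <name>, "point": [y, x]} '
        'with coordinates normalized to 0-1000.',
    ],
    config=types.GenerateContentConfig(temperature=0.0),
)
print(response.text)
```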

Developer access

You can use Gemini Robotics-ER 1.5 today through the Gemini API in Google AI Studio. Robotics 1.5 access is currently limited to select partners.
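
For the shortest path to a first response, a text-only call might look like this sketch; it assumes `pip install google-genai`, an API key in the environment, and the preview model identifier listed in Google AI Studio.

```python
# Smallest end-to-end sketch: text-only call to Gemini Robotics-ER 1.5.
# Assumes `pip install google-genai`, a GEMINI_API_KEY environment variable,
# and that the model identifier below matches the one listed in AI Studio.
from google import genai

client = genai.Client()
response = client.models.generate_content(
    model="gemini-robotics-er-1.5-preview",
    contents="Plan the steps for a robot to clear coffee mugs from a desk into a dish rack.",
)
print(response.text)
```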

Integration blueprint (fast path)

  • Perception: Stream robot camera and sensor data.
  • Planning (ER 1.5): Convert goals into detailed, multi-step plans. Call tools for rules, inventory, or maps via user-defined functions.
  • Action (Robotics 1.5): Turn each step plus visual context into motor commands, with pre-act reasoning.
  • Safety: Combine high-level semantic checks with low-level robot safety (e.g., collision avoidance) and require success estimates before execution.
  • Telemetry: Log plans, rationales, tool calls, and outcomes for review and tuning (a sketch of the safety gate and telemetry log follows this list).
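
Here is a sketch of the safety and telemetry bullets, assuming a JSON-lines log and a hypothetical success-estimate threshold. The field names and the 0.8 cutoff are illustrative choices, and low-level robot safety still runs independently on the controller.

```python
# Sketch of the safety gate and telemetry log from the blueprint above.
# The threshold, record fields, and execute_fn hook are hypothetical choices.
import json
import time

SUCCESS_THRESHOLD = 0.8  # require at least this estimated success before acting

def safe_execute(step: str, success_estimate: float, execute_fn,
                 telemetry_path: str = "telemetry.jsonl") -> dict:
    record = {
        "ts": time.time(),
        "step": step,
        "success_estimate": success_estimate,
        "executed": False,
        "outcome": None,
    }
    if success_estimate >= SUCCESS_THRESHOLD:
        record["executed"] = True
        record["outcome"] = execute_fn(step)  # low-level collision avoidance stays on-robot
    with open(telemetry_path, "a") as f:
        f.write(json.dumps(record) + "\n")
    return record
```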

Safety and alignment

The models apply semantic safety checks before actions, maintain respectful human dialogue aligned with Gemini policies, and can trigger on-board safety subsystems when needed. Google is also releasing an upgraded ASIMOV benchmark to evaluate semantic safety with broader coverage and new modalities. ER 1.5 shows state-of-the-art performance on these safety evaluations, improving adherence to physical constraints.

What you can build

  • Warehousing and logistics: Bin picking, sortation, replenishment with policy-aware handling.
  • Manufacturing support: Rework, kitting, tool fetching with reasoning over part variations.
  • Retail back-room: Putaway, returns triage, and shelf-ready prep.
  • Facilities and home assistance: Tidying, laundry sorting, and waste separation guided by local rules.
  • Healthcare logistics: Non-clinical fetching and stocking with safer path planning.

Benchmarks and evidence

ER 1.5 leads an aggregated set of 15 academic embodied reasoning benchmarks covering pointing, image QA, and video QA. These results are reinforced by internal tests inspired by real deployments from trusted testers.

Getting started checklist

  • Scope 1-2 high-impact workflows with clear success criteria and safety boundaries.
  • Instrument your robot stack for vision streams, tool calling, and low-level safety overrides.
  • Prototype the loop: ER 1.5 plans and calls tools; Robotics 1.5 executes steps with pre-act reasoning.
  • Create evaluation sets for long-horizon tasks, corner cases, and semantic safety (a starter sketch follows this list).
  • Pilot in a controlled area, iterate on plans and prompts, then scale to broader environments.
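
For the evaluation-set item above, one starting point is a small table of long-horizon and safety cases with required and forbidden steps; the case fields and the run_case hook below are hypothetical placeholders for your own harness.

```python
# Sketch of a small evaluation set for long-horizon and semantic-safety cases.
# The case fields and the run_case() hook are hypothetical placeholders.
EVAL_CASES = [
    {
        "goal": "Sort the desk items into compost, recycling, and trash",
        "must_include": ["banana peel -> compost bin"],
        "must_avoid": ["batteries -> compost bin"],
    },
    {
        "goal": "Separate the laundry into light and dark piles",
        "must_include": ["white shirt -> light pile"],
        "must_avoid": [],
    },
]

def evaluate(run_case) -> list[dict]:
    """run_case(goal) returns the list of executed-step strings from your stack."""
    failures = []
    for case in EVAL_CASES:
        steps = run_case(case["goal"])
        missing = [s for s in case["must_include"] if s not in steps]
        violations = [s for s in case["must_avoid"] if s in steps]
        if missing or violations:
            failures.append({"goal": case["goal"], "missing": missing,
                             "violations": violations})
    return failures
```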

The bottom line

Gemini Robotics 1.5 is a meaningful step toward general-purpose physical agents that can plan, reason, use tools, and act across diverse settings. The value for product teams is clear: shorter integration paths, better generalization, and safer execution for real workflows.