From Gemma 3 270M to FunctionGemma: Google's compact function-calling specialist for edge workloads

FunctionGemma, a 270M-parameter, text-only model, maps natural language to precise function calls on phones and laptops. With fine-tuning, accuracy in the Mobile Actions demo rose from 58% to 85%.

Published on: Dec 27, 2025

Google has introduced FunctionGemma, a focused take on the Gemma 3 270M model. It's built to do one job well: turn natural language into precise function calls, then optionally summarize tool responses.

This isn't a general chat model. It's a small, text-only transformer that works as an edge agent, mapping user intent to executable API actions with predictable structure.

What is FunctionGemma?

FunctionGemma keeps the Gemma 3 architecture at 270M parameters and ships under the Gemma license. The difference is the objective and chat format: everything centers on tool use and structured outputs, not open conversation.

It's presented as a standard causal language model with a shared 32K-token budget (input and output combined). You feed it text, including tool definitions, and it produces text, most often a structured function call.

It's meant to be fine-tuned for your tools and your domain. Out of the box, it knows the pattern; with your data, it becomes reliable.
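
For orientation, here's a minimal loading-and-generation sketch with Hugging Face transformers, treating FunctionGemma as the standard causal LM it is. The model ID is a placeholder, not a confirmed checkpoint name; check the official model card.

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/functiongemma-270m"  # placeholder ID, not confirmed

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Plain text in, plain text out. In practice you'd render the chat
# template described in the next section rather than a raw prompt.
inputs = tokenizer("Turn on the flashlight.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=False))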

Architecture and training data

  • Gemma 3 transformer, 270M parameters
  • 256K vocabulary optimized for JSON and multilingual text (helps keep function schemas and tool outputs compact)
  • Trained on 6T tokens with a knowledge cutoff in August 2024
  • Data focuses on tool/API definitions and tool-use interactions: prompts, function calls, responses, and follow-ups

The training signal teaches both syntax (how to format a call) and intent (when to call a tool vs. ask for more info). That's the core of dependable function calling.

Conversation format and control tokens

FunctionGemma uses a strict template. Turns are wrapped as:

  • <start_of_turn>role ... <end_of_turn> where role is typically developer, user, or model

Within turns, it relies on dedicated markers:

  • <start_function_declaration> ... <end_function_declaration> for tool definitions
  • <start_function_call> ... <end_function_call> for the model's tool calls
  • <start_function_response> ... <end_function_response> for serialized tool outputs

These boundaries help the model cleanly separate natural language from schemas and execution results. The Hugging Face apply_chat_template API and official Gemma templates can generate this structure automatically.
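
As a hedged illustration, the sketch below renders that structure with apply_chat_template. The model ID is a placeholder and set_flashlight is a made-up tool; it assumes the published tokenizer ships a chat template that accepts a tools argument, as official Gemma templates do.

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/functiongemma-270m")  # placeholder ID

def set_flashlight(on: bool):
    """Toggle the device flashlight.

    Args:
        on: True to turn the flashlight on, False to turn it off.
    """

messages = [{"role": "user", "content": "Turn on the flashlight."}]

# transformers converts typed, docstring-annotated callables into JSON
# schemas and renders them inside the function-declaration markers.
prompt = tokenizer.apply_chat_template(
    messages,
    tools=[set_flashlight],
    add_generation_prompt=True,
    tokenize=False,
)
print(prompt)  # inspect the <start_of_turn> blocks and tool declarations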

Fine-tuning and Mobile Actions performance

Baseline FunctionGemma is trained for generic tool use, but small models hit production quality only after task-specific fine-tuning. That's the consistent pattern.

In the Mobile Actions demo (Android-style tools: create contact, set event, toggle flashlight, maps, etc.), the base model scores 58% accuracy on a held-out set. After fine-tuning with the public cookbook, accuracy jumps to 85%.

The takeaway: invest in domain data. Prompt tweaks help, but targeted examples move the needle.
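
As a hedged sketch of what that fine-tuning loop can look like with TRL's SFTTrainer: the dataset file, record format, and hyperparameters below are placeholders, and the public cookbook remains the authoritative recipe.

from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Hypothetical JSONL of conversational records ({"messages": [...]}),
# each pairing a user utterance with the correct function call.
dataset = load_dataset("json", data_files="mobile_actions_train.jsonl", split="train")

trainer = SFTTrainer(
    model="google/functiongemma-270m",  # placeholder ID, not confirmed
    train_dataset=dataset,
    args=SFTConfig(
        output_dir="functiongemma-mobile-actions",
        per_device_train_batch_size=8,
        num_train_epochs=3,        # illustrative hyperparameters only
        learning_rate=2e-5,
    ),
)
trainer.train()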

Edge agents and reference demos

FunctionGemma targets phones, laptops, and small accelerators like NVIDIA Jetson Nano. With 270M parameters and quantization, it runs locally with low memory and latency.

  • Mobile Actions: fully offline device control with on-device deployment
  • Tiny Garden: voice commands mapped to domain functions like plant_seed and water_plots
  • Physics Playground: runs in the browser with Transformers.js; natural language becomes simulation actions

These examples show that a compact function-caller can handle multi-step logic on device, with no server calls, once the tools and data are set up.

Why this matters for developers, IT, and product teams

  • Predictable interface: a strict chat template and clear tokens reduce glue code and edge cases
  • Small footprint: practical for on-device assistants, internal tools, or offline workflows
  • Privacy and latency: keep data local while still automating real actions
  • Cross-platform: runs on consumer hardware and browsers (via JS stacks)

Quick start ideas

  • Define tools with explicit, JSON-friendly schemas (types, required fields, enums); see the validation sketch after this list
  • Use the official chat template so the model sees clean role and tool boundaries
  • Start with the base model to validate your schemas and error handling
  • Fine-tune on real utterances mapped to your tools; include both correct calls and "ask for clarification" examples
  • Evaluate with held-out tasks and strict parsing; track both call accuracy and argument correctness
  • Quantize for edge; benchmark memory, latency, and throughput on target devices
  • Add guardrails: schema validation, safe defaults, and user confirmation for sensitive actions
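
To make the first and last bullets concrete, here is an illustrative sketch: a JSON-friendly schema for a hypothetical create_event tool, plus strict validation of parsed call arguments before execution. The tool name and fields are invented for the example.

import jsonschema  # pip install jsonschema

CREATE_EVENT_SCHEMA = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "start_time": {"type": "string", "description": "ISO 8601 timestamp"},
        "reminder": {"type": "string", "enum": ["none", "10min", "1hour"]},
    },
    "required": ["title", "start_time"],
    "additionalProperties": False,
}

def validate_call(arguments: dict) -> dict:
    """Reject malformed calls before execution; apply safe defaults after."""
    jsonschema.validate(arguments, CREATE_EVENT_SCHEMA)
    return {"reminder": "none", **arguments}

# Example: arguments parsed from a <start_function_call> block
print(validate_call({"title": "Standup", "start_time": "2025-01-06T09:00:00"}))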

Key takeaways

  • FunctionGemma is a 270M, text-only variant of Gemma 3 focused on function calling, not free-form chat
  • 256K vocab, 32K shared context window, trained on 6T tokens; open model under Gemma terms
  • Strict template with <start_of_turn>...<end_of_turn> and function control tokens is essential for production reliability
  • Mobile Actions: 58% base → 85% after fine-tuning; small models need domain data
  • Runs on phones, laptops, and Jetson-class devices; demos include Mobile Actions, Tiny Garden, and Physics Playground
  • Integrated with common ecosystems (e.g., Hugging Face, Vertex AI, LM Studio) and browser runtimes via Transformers.js
