Agentic AI in Python: Build a Coding Agent with Google Gemini API (Video Course)

Build a practical Python coding agent with Gemini that plans, acts, and iterates. Ship a CLI tool that reads/writes files, runs tests, fixes bugs, adds features, and explains changes. Safe by design, with path limits, timeouts, and a clear tool schema.

Duration: 3 hours
Rating: 5/5 Stars

Related Certification: Certification in Developing Agentic Python Coding Agents with Google Gemini API

Also includes Access to All:

700+ AI Courses
6500+ AI Tools
700+ Certifications
Personalized AI Learning Plan

Video Course

What You Will Learn

  • Design an agentic loop that plans, acts, observes, and iterates
  • Build four LLM-friendly tools: list files, read files, write files, run Python
  • Integrate and call the Gemini API with function declarations and system prompts
  • Apply guardrails: working-directory confinement, timeouts, and clear error outputs
  • Ship a CLI Python agent that reproduces bugs, patches code, and verifies tests

Study Guide

Guide to Agentic AI - Build a Python Coding Agent with Gemini

Most people ask AI for answers. Builders ask AI to work. This course shows you how to architect a working Python agent that reads files, writes code, and runs programs to solve real tasks without you handholding every step. You'll use Gemini as the reasoning engine, give it carefully defined tools, and wrap the whole system in an "agentic loop" that plans, acts, observes, and iterates until the goal is met.

By the end, you'll know the mental models, code patterns, and guardrails behind assistants like Copilot and Cursor. You'll ship a command-line agent that can explore a codebase, diagnose bugs, apply patches, and verify fixes by executing tests, all inside a safe working directory. You'll move from passive prompting to active system design.

What We Will Build and Why It Matters

You will build a Python-based coding agent powered by the Gemini API. It won't just answer questions; it will take action using a small set of tools: list files, read files, write files, and run Python scripts. With that toolset, the agent can work like a junior developer that explores context, proposes changes, tests them, and learns from errors in a feedback loop. That loop is where the leverage is: the ability to make progress without a human shepherding every turn.

Example:
- You ask: "Fix the order-of-operations bug in my calculator project." The agent lists files, runs tests to reproduce the issue, reads the relevant module, proposes a patch, writes it, and re-runs tests to confirm. It then explains what it changed and why.
- You ask: "Add a percentage operator and update the docs." The agent scopes the work by reading files, implements the operator, writes new tests, runs them, and updates README content.

Foundations: Agents vs Chatbots

A chatbot responds once. An agent acts. The difference is autonomy. The agent breaks a goal into steps, calls tools, watches what happens, and adapts. It uses the model's reasoning to pick actions and your host application to execute them. The cycle repeats until completion or a boundary is hit (iterations, time, or explicit success conditions).

The Agentic Loop:
1) Receive Task → 2) Plan & Reason → 3) Act (tool call) → 4) Observe (tool result) → 5) Re-evaluate → repeat or finish.

Example:
- Debugging loop: Plan to run tests → Act: run file → Observe: failing trace → Plan: read module → Act: read file → Observe: bug location → Plan: write patch → Act: write file → Observe: success/failure after re-running tests.
- Refactoring loop: Plan to list files → Act: get directory info → Observe: modules and size → Plan: read target files → Act: get contents → Observe: patterns → Plan: apply refactor → Act: write changes → Observe: run suite and lint → finalize.

Tool Calling (Function Calling):
The LLM doesn't execute code. It produces a structured request that says which function to call and with what arguments. Your app runs the function and returns the result as text. The model consumes that text and decides what to do next.

Example:
- The model asks for: { name: "get_file_content", args: { "file_path": "calculator/core.py" } }. Your app reads the file and returns the content as a string.
- The model asks for: { name: "run_python_file", args: { "file_path": "tests/run_tests.py", "args": [] } }. Your app executes the tests, captures stdout/stderr, and returns the output string.

Key Terminology You'll Use All The Time

Agentic AI:
An LLM-powered system that plans, acts with tools, observes outcomes, and iterates to reach goals.

Large Language Model (LLM):
Gemini is the reasoning engine. It translates goals into structured actions and parses tool results to inform next steps.

Agentic Loop:
The engine of progress. Assess → Plan → Act → Observe → Repeat. Without iteration, there is no agency.

Tool Calling (Function Calling):
The structured interface that lets an LLM request tool execution. You dispatch the call, return the result, and feed it back to the model.

System Prompt:
A high-priority instruction that defines the agent's role, rules, and constraints across the entire session. It carries more weight than user prompts.

Tokens:
The unit of text processing and billing. Manage long files and chat history thoughtfully. Chunk or truncate as needed.

Function Declaration (Schema):
A formal description of each tool you expose: name, description, parameters. The model reads these to call tools correctly.

Examples to anchor the terms:
- When the agent "Observes" a stack trace through stderr, that is loop feedback that guides the next "Plan" step.
- When you define a tool schema for write_file, the model can ask to create a file with a path and content without hallucinating the interface.

System Architecture: The Three Pillars

1) Core LLM:
Gemini Flash serves as the reasoning engine. It understands instructions, proposes plans, and decides which tool to call next. You'll configure the model with tools and a system prompt to bind its behavior to your environment.

2) Tools (Python functions):
LLM-friendly functions that accept simple inputs and return clear, descriptive text. Tools we'll build:
- get_files_info(directory_path)
- get_file_content(file_path)
- write_file(file_path, content)
- run_python_file(file_path, args)

3) The Orchestrator (Your App):
Manages the conversation, holds message history, dispatches tool calls, enforces guardrails, and controls the loop.

Target Application (Playground):
A Python calculator project with tests and intentional bugs. The agent will explore, fix, and verify by calling your tools.

Example workflows:
- Diagnose a divide-by-zero handling bug by running tests, reading arithmetic functions, and patching input validation.
- Add exponentiation and update the CLI help text, then re-run tests to ensure no regressions.

Prerequisites and Project Setup with uv

You need:
- Python 3.10+
- uv (fast Python project and package manager)
- A Unix-like shell (Bash, ZSH, or WSL)

Initialize the project:
1) Create a directory and run: uv init
2) Create a virtual environment: uv venv
3) Activate it: source .venv/bin/activate
4) Add dependencies: uv add google-generativeai python-dotenv

Example:
uv init
uv venv
source .venv/bin/activate
uv add google-generativeai python-dotenv

Connecting to Gemini: Your First API Call

Create a .env file with your key:
GEMINI_API_KEY="YOUR_API_KEY_HERE"

Example:
File: main.py
import os
import sys
import google.generativeai as genai
from dotenv import load_dotenv

def main():
    load_dotenv()
    api_key = os.getenv("GEMINI_API_KEY")
    if not api_key:
        print("Error: GEMINI_API_KEY not found.")
        return

    genai.configure(api_key=api_key)

    if len(sys.argv) < 2:
        print('Usage: python main.py "your prompt"')
        return

    user_prompt = sys.argv[1]
    model = genai.GenerativeModel("gemini-1.5-flash")
    response = model.generate_content(user_prompt)

    print("--- Model Response ---")
    print(response.text)

    print("\n--- Token Usage ---")
    if hasattr(response, "usage_metadata") and response.usage_metadata:
        print(f"Prompt Tokens: {response.usage_metadata.prompt_token_count}")
        print(f"Response Tokens: {response.usage_metadata.candidates_token_count}")

if __name__ == "__main__":
    main()

Run it with: uv run main.py "Why is the sky blue?"

Tip:
Print usage metadata to keep an eye on token costs. Truncate long outputs early in development to avoid surprises.

Designing LLM-Friendly Tools

Tools are your agent's hands. Make them simple, safe, and descriptive.

Principles:
- Input/Output Simplicity: Accept clear arguments. Return human-readable text. Avoid binary data.
- Deterministic Formatting: Include labels like --- STDOUT --- and --- STDERR --- so the LLM can parse meaningfully.
- Guardrails First: Constrain paths to a working directory. Add timeouts. Validate file extensions. Handle errors explicitly.
- Token Awareness: For large files, truncate or chunk. Consider returning a header describing truncation.

Good vs. bad outputs:
- Good: "Error: 'src/app.py' is not a file." The agent learns and tries a directory listing next.
- Bad: "It didn't work." The agent has no signal for recovery.
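One way to honor the deterministic-formatting principle is a tiny shared helper. This sketch (the format_process_output name is illustrative, not from the course code) keeps every execution tool's output consistently labeled:

def format_process_output(stdout: str, stderr: str, returncode: int) -> str:
    # Stable labels give the model reliable anchors to parse.
    sections = []
    if stdout:
        sections.append(f"--- STDOUT ---\n{stdout}")
    if stderr:
        sections.append(f"--- STDERR ---\n{stderr}")
    if returncode != 0:
        sections.append(f"Process exited with code {returncode}.")
    return "\n".join(sections) if sections else "No output produced."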

Implementing the Tools

Tool 1: get_files_info
Lists files and directories with size and type. Enforces working directory confinement.

import os

def get_files_info(working_directory: str, directory: str = ".") -> str:
    abs_working_dir = os.path.abspath(working_directory)
    req_path = os.path.join(abs_working_dir, directory)
    abs_req_path = os.path.abspath(req_path)

    # Allow the working directory itself or anything inside it. Appending
    # os.sep prevents a bare prefix check from accepting siblings such as
    # "calculator-secrets" when the sandbox is "calculator".
    if abs_req_path != abs_working_dir and not abs_req_path.startswith(abs_working_dir + os.sep):
        return "Error: Access denied. Path is outside the working directory."

    try:
        if not os.path.isdir(abs_req_path):
            return f"Error: '{directory}' is not a directory."
        contents = os.listdir(abs_req_path)
        output = []
        for item in contents:
            item_path = os.path.join(abs_req_path, item)
            is_dir = os.path.isdir(item_path)
            size = os.path.getsize(item_path)
            output.append(f"- {item}: size={size} bytes, is_directory={is_dir}")
        return "\n".join(output) if output else "Directory is empty."
    except Exception as e:
        return f"Error listing files: {e}"

Example:
- Agent asks: list project root → returns modules, tests, README, and sizes.
- Agent asks: list src/ or calculator/ → returns structure to guide next read_file calls.
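A quick smoke test, assuming this function lives in a module next to a calculator/ sandbox directory:

if __name__ == "__main__":
    # Inside the sandbox: prints the project root listing.
    print(get_files_info("calculator", "."))
    # Outside the sandbox: prints the access-denied error.
    print(get_files_info("calculator", "../"))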

Tool 2: get_file_content
Reads text content with a length cap to control tokens.

import os

MAX_CHARS = 10000

def get_file_content(working_directory: str, file_path: str) -> str:
    abs_working_dir = os.path.abspath(working_directory)
    req_path = os.path.join(abs_working_dir, file_path)
    abs_req_path = os.path.abspath(req_path)

    # os.sep suffix blocks sibling directories that merely share a prefix.
    if not abs_req_path.startswith(abs_working_dir + os.sep):
        return "Error: Access denied. Path is outside the working directory."
    try:
        if not os.path.isfile(abs_req_path):
            return f"Error: '{file_path}' is not a file."
        with open(abs_req_path, "r", encoding="utf-8") as f:
            content = f.read(MAX_CHARS)
            if len(content) >= MAX_CHARS:
                content += f"\n... (file truncated at {MAX_CHARS} characters)"
            return content
    except Exception as e:
        return f"Error reading file: {e}"

Example:
- Agent reads tests/test_calculator.py to understand failing assertions.
- Agent reads calculator/core.py to locate order-of-operations logic.

Tool 3: write_file
Creates or overwrites files, creating directories as needed.

import os

def write_file(working_directory: str, file_path: str, content: str) -> str:
    abs_working_dir = os.path.abspath(working_directory)
    req_path = os.path.join(abs_working_dir, file_path)
    abs_req_path = os.path.abspath(req_path)

    # os.sep suffix blocks sibling directories that merely share a prefix.
    if not abs_req_path.startswith(abs_working_dir + os.sep):
        return "Error: Access denied. Path is outside the working directory."
    try:
        parent_dir = os.path.dirname(abs_req_path)
        os.makedirs(parent_dir, exist_ok=True)
        with open(abs_req_path, "w", encoding="utf-8") as f:
            f.write(content)
        return f"Successfully wrote {len(content)} characters to '{file_path}'."
    except Exception as e:
        return f"Error writing to file: {e}"

Example:
- Agent writes a patched core.py implementing correct operator precedence.
- Agent creates docs/percentage.md and updates README content.

Tool 4: run_python_file
Executes Python scripts with timeout and captures stdout/stderr.

import os
import subprocess
import sys

def run_python_file(working_directory: str, file_path: str, args: list | None = None) -> str:
    abs_working_dir = os.path.abspath(working_directory)
    req_path = os.path.join(abs_working_dir, file_path)
    abs_req_path = os.path.abspath(req_path)

    # Guardrails: stay inside the sandbox and only execute .py files.
    if not abs_req_path.startswith(abs_working_dir + os.sep):
        return "Error: Access denied. Path is outside the working directory."
    if not file_path.endswith(".py"):
        return "Error: Not a Python file."

    # sys.executable runs the same interpreter (and venv) as the agent itself.
    command = [sys.executable, abs_req_path] + list(args or [])
    try:
        process = subprocess.run(
            command,
            cwd=abs_working_dir,
            capture_output=True,
            text=True,
            timeout=30,
        )
        output = ""
        if process.stdout:
            output += f"--- STDOUT ---\n{process.stdout}\n"
        if process.stderr:
            output += f"--- STDERR ---\n{process.stderr}\n"
        if process.returncode != 0:
            output += f"Process exited with code {process.returncode}."
        return output if output else "No output produced."
    except subprocess.TimeoutExpired:
        return "Error: Execution timed out after 30 seconds."
    except Exception as e:
        return f"Error executing Python file: {e}"

Example:
- Agent runs tests/run_tests.py to reproduce a failure and later to confirm a fix.
- Agent runs calculator/cli.py --help to ensure the CLI shows the new operator.
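A quick guardrail check, again assuming the calculator/ sandbox used throughout:

if __name__ == "__main__":
    # Normal run: labeled stdout/stderr from the test script.
    print(run_python_file("calculator", "tests/run_tests.py"))
    # Guardrail checks: wrong extension and path traversal both fail loudly.
    print(run_python_file("calculator", "README.md"))
    print(run_python_file("calculator", "../main.py"))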

Teaching Gemini About Your Tools: Function Declarations

You must declare each tool to the model. A declaration includes name, description, and parameters (with types and descriptions). This tells the model what it can do, so it can ask for the right function with the right arguments.

Example: Declarations
from google.generativeai import protos

get_files_info_declaration = protos.FunctionDeclaration(
    name="get_files_info",
    description="List files and directories within the working directory.",
    parameters=protos.Schema(
        type=protos.Type.OBJECT,
        properties={
            "directory": protos.Schema(
                type=protos.Type.STRING,
                description="Directory path relative to working directory.",
            ),
        },
    ),
)

get_file_content_declaration = protos.FunctionDeclaration(
    name="get_file_content",
    description="Get file contents as a string, scoped to working directory.",
    parameters=protos.Schema(
        type=protos.Type.OBJECT,
        properties={
            "file_path": protos.Schema(
                type=protos.Type.STRING,
                description="Relative file path to read.",
            ),
        },
        required=["file_path"],
    ),
)

write_file_declaration = protos.FunctionDeclaration(
    name="write_file",
    description="Create or overwrite a file with given content within working directory.",
    parameters=protos.Schema(
        type=protos.Type.OBJECT,
        properties={
            "file_path": protos.Schema(
                type=protos.Type.STRING,
                description="Relative file path to write.",
            ),
            "content": protos.Schema(
                type=protos.Type.STRING,
                description="Full file content to write.",
            ),
        },
        required=["file_path", "content"],
    ),
)

run_python_file_declaration = protos.FunctionDeclaration(
    name="run_python_file",
    description="Run a Python file and capture stdout/stderr, with timeout.",
    parameters=protos.Schema(
        type=protos.Type.OBJECT,
        properties={
            "file_path": protos.Schema(
                type=protos.Type.STRING,
                description="Relative path to the Python file.",
            ),
            "args": protos.Schema(
                type=protos.Type.ARRAY,
                items=protos.Schema(type=protos.Type.STRING),
                description="List of CLI args.",
            ),
        },
        required=["file_path"],
    ),
)

tools = protos.Tool(function_declarations=[
    get_files_info_declaration,
    get_file_content_declaration,
    write_file_declaration,
    run_python_file_declaration,
])

Tips:
- Describe parameters in plain language. The model performs better with concrete descriptions.
- Avoid optional parameters unless necessary. Required lists make the model's calls more reliable.

System Prompt: The Rules of the Game

The system prompt defines the agent's role, scope, and safety rules. It is your primary steering wheel. Keep it crisp and directive. Remind the model to plan, to verify after changes, and to respect path constraints.

Example: System Prompt
system_prompt = """
You are a helpful AI coding agent. Create a plan and use the available tools to fulfill user requests.

Available Operations:
1) get_files_info: List files and directories.
2) get_file_content: Read the content of a file.
3) write_file: Create or update a file.
4) run_python_file: Run a Python file, optionally with arguments, to check behavior or run tests.

Rules:
- All paths are relative to the working directory.
- Explore the project structure before making changes.
- Reproduce issues before fixing them.
- After any change, verify by running code or tests.
- Summarize your actions and results when the task is complete.
"""

Examples of variations:
- For documentation tasks, emphasize reading files and proposing edits before writing.
- For refactoring, emphasize tests and incremental changes with verification between each step.

Message History and Roles

Your orchestrator maintains the conversation history and roles:
- user: The human's request or follow-up.
- model: The LLM's reply, which can be text or a tool call request.
- tool: The text result produced by running a tool. You append this to the history and feed it back to the model.

Example: Iteration flow
1) user: "Fix the failing calculator tests."
2) model: function_call get_files_info(directory=".")
3) tool: returns directory listing
4) model: function_call run_python_file(file_path="tests/run_tests.py")
5) tool: returns failing test output and traceback
6) model: function_call get_file_content(file_path="calculator/core.py")
7) tool: returns file content
8) model: function_call write_file(file_path="calculator/core.py", content="...patched code...")
9) tool: returns success message
10) model: function_call run_python_file(file_path="tests/run_tests.py")
11) tool: returns all tests passed
12) model: final text summary to user

Implementing the Agentic Loop in main.py

Now combine model, tools, system prompt, and a loop that dispatches tool calls, collects results, and feeds them back.

Example: Core Loop (simplified for clarity)
import os
import sys
import google.generativeai as genai
from google.generativeai import protos
from dotenv import load_dotenv

# import tool functions and declarations
# from functions.get_files_info import get_files_info, get_files_info_declaration
# from functions.get_file_content import get_file_content, get_file_content_declaration
# from functions.write_file import write_file, write_file_declaration
# from functions.run_python_file import run_python_file, run_python_file_declaration

WORKING_DIR = "calculator"  # your sandbox directory

def dispatch_tool_call(name, args):
    if name == "get_files_info":
        return get_files_info(WORKING_DIR, **args)
    if name == "get_file_content":
        return get_file_content(WORKING_DIR, **args)
    if name == "write_file":
        return write_file(WORKING_DIR, **args)
    if name == "run_python_file":
        return run_python_file(WORKING_DIR, **args)
    return f"Error: Unknown tool '{name}'"

def main():
    load_dotenv()
    api_key = os.getenv("GEMINI_API_KEY")
    if not api_key:
        print("Error: GEMINI_API_KEY not found.")
        return
    genai.configure(api_key=api_key)

    user_prompt = " ".join(sys.argv[1:]) if len(sys.argv) > 1 else "Fix the calculator tests."

    all_declarations = [
        get_files_info_declaration,
        get_file_content_declaration,
        write_file_declaration,
        run_python_file_declaration,
    ]

    model = genai.GenerativeModel(
        model_name="gemini-1.5-flash",
        tools=[protos.Tool(function_declarations=all_declarations)],
        system_instruction=system_prompt,
    )

    chat = model.start_chat()
    response = chat.send_message(user_prompt)

    MAX_ITERATIONS = 12
    for _ in range(MAX_ITERATIONS):
        # Find the first function call, if any, in the model's reply.
        candidate = response.candidates[0] if response.candidates else None
        parts = candidate.content.parts if candidate and candidate.content else []

        function_call = None
        for p in parts:
            if hasattr(p, "function_call") and p.function_call:
                function_call = p.function_call
                break

        if not function_call:
            # No further tool calls; print the final text answer.
            print("--- Final Response ---")
            print(response.text)
            break

        name = function_call.name
        args = dict(function_call.args or {})  # proto map -> plain dict
        print(f"→ Tool requested: {name}({args})")

        tool_result = dispatch_tool_call(name, args)

        # Feed the tool's text output back as a function response part.
        response = chat.send_message(
            protos.Part(
                function_response=protos.FunctionResponse(
                    name=name,
                    response={"result": tool_result},
                )
            )
        )
    else:
        # for/else: runs only when the loop exhausts MAX_ITERATIONS without a break.
        print("Reached max iterations without completion.")

if __name__ == "__main__":
    main()

Tips:
- Limit iterations to prevent runaway loops.
- Log each tool call and result for debugging and explainability.
- If your model returns multiple function calls in one response, process them in order, feeding each result back before the next.
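For that multi-call case, one possible adaptation of the loop body (a sketch assuming the same chat, dispatch_tool_call, and protos names as main.py; it answers every call from one response in a single follow-up message):

# Collect every function call in the current response, in order.
function_calls = [
    p.function_call
    for p in response.candidates[0].content.parts
    if hasattr(p, "function_call") and p.function_call
]

# Run each tool and package all results as function-response parts.
response_parts = []
for fc in function_calls:
    result = dispatch_tool_call(fc.name, dict(fc.args or {}))
    response_parts.append(
        protos.Part(
            function_response=protos.FunctionResponse(
                name=fc.name, response={"result": result}
            )
        )
    )

if response_parts:
    response = chat.send_message(response_parts)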

End-to-End Scenario: Fixing a Calculator Bug

Scenario 1: Order of Operations
Goal: Multiplication/division should happen before addition/subtraction.

Flow:
1) The agent lists files to map the project.
2) The agent runs tests to reproduce the failing case (e.g., 2 + 3 * 4 expected 14, got 20).
3) The agent reads calculator/core.py to find the evaluation logic.
4) The agent proposes a patch implementing a parser or precedence-aware evaluation.
5) The agent writes the patch and re-runs tests.
6) The agent summarizes what changed, why it works, and shows all tests passed.

Scenario 2: Add Percentage Operator
Goal: Add % operator, update docs, and ensure help text includes it.

Flow:
1) The agent lists files and reads CLI and core modules.
2) The agent implements % in core logic (with a clearly defined behavior like unary percent or a modulo operator if required).
3) The agent adds new tests for %, runs the test suite, and fixes any issues.
4) The agent updates README and CLI help text.
5) The agent re-runs tests and prints the final summary.

Security and Guardrails

Letting an AI write and run code is powerful and risky. Even in a toy project, you need guardrails.

Implemented Guardrails:
- Working Directory Confinement: Every path is resolved and checked to ensure it stays within the designated directory. Path traversal attempts like ../../ are blocked.
- Execution Timeout: run_python_file times out after 30 seconds. Infinite loops are cut off.
- Controlled Interfaces: Tools accept limited, explicit parameters and return plain text. No shell escapes, no raw file descriptors.

Example risk scenarios:
- Path Traversal Attempt: The agent tries to read "../secrets.txt". Your path check returns "Access denied." The model learns to use relative paths correctly.
- Runaway Execution: The agent runs a script that never returns. Timeout fires, error message guides the model to consider a different test or add arguments to exit properly.
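The confinement check that all four tools repeat can be factored into a single helper. A sketch (the resolve_in_sandbox name is illustrative, not from the course code):

import os

def resolve_in_sandbox(working_directory: str, relative_path: str) -> str | None:
    # Resolve both paths before comparing; string tricks like scanning for
    # ".." are easy to bypass, while resolved absolute paths are not.
    abs_working_dir = os.path.abspath(working_directory)
    abs_target = os.path.abspath(os.path.join(abs_working_dir, relative_path))
    if abs_target == abs_working_dir or abs_target.startswith(abs_working_dir + os.sep):
        return abs_target
    return None  # caller returns an "Access denied" message to the model

assert resolve_in_sandbox("calculator", "../secrets.txt") is None
assert resolve_in_sandbox("calculator", "core.py") is not None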

Beyond this project (for production):
- Full sandboxing (containers, seccomp, Firecracker).
- Permission prompts (HITL) for write or execute actions beyond safe scope.
- Network isolation and allowlists.
- Resource quotas (CPU, memory, disk, time).
- Audit logs and tamper-proof tracing for compliance.

Interfacing Details That Make or Break Reliability

System Prompt Quality:
Clear rules beat wishful thinking. Tell the model to explore first, reproduce issues, and verify fixes.

Function Schemas:
Be specific with parameter names and descriptions. The model will call tools more accurately.

Message History:
Include errors as-is. Good error text helps the model recover. Don't hide stderr. Label outputs consistently.

Examples:
- Incorporate tool outputs like "Process exited with code 1" into the chat so the model learns to inspect stack traces.
- Summarize long files before writing patches: have the model ask for relevant sections or propose chunked reads if truncation appears.

Token Management and Cost Control

Agents can consume tokens fast when reading files and generating long outputs.

Tactics:
- Cap file reads (MAX_CHARS). Add a truncation note so the model can ask for specific sections next.
- Encourage the model in your system prompt to list files and choose target modules before reading whole directories.
- Archive or prune old history when tasks are done, or summarize long tool outputs and replace raw logs with short summaries.

Examples:
- Instead of reading an entire repository, the model lists files and reads only test files and the suspected module.
- When subprocess output is huge, the model asks to re-run tests with -k to target a single failing test.
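To make those calls with data instead of guesswork, the google-generativeai SDK exposes count_tokens. A minimal sketch, assuming genai.configure has already run and calculator/core.py exists:

import google.generativeai as genai

model = genai.GenerativeModel("gemini-1.5-flash")

# Count tokens before sending, to decide whether to truncate or chunk.
with open("calculator/core.py", encoding="utf-8") as f:
    file_text = f.read()
print("Tokens in file:", model.count_tokens(file_text).total_tokens)

# After a call, usage_metadata reports what the request actually consumed.
response = model.generate_content("Summarize this file:\n" + file_text[:10000])
print("Prompt tokens:", response.usage_metadata.prompt_token_count)
print("Response tokens:", response.usage_metadata.candidates_token_count)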

Error Handling and Recovery Patterns

Agents learn by bouncing off constraints and failures. Your job is to make those bounces useful.

Patterns:
- Fail Loudly: Return explicit error messages from tools ("Not a Python file", "Path is outside working directory").
- Guide Next Actions: In your system prompt, hint that the model should read relevant files or re-run a narrower subset of tests after errors.
- Limit and Reset: Use MAX_ITERATIONS. On repeated failure, stop, summarize attempts, and ask the user for guidance.

Examples:
- The agent tries to run a test file that doesn't exist. Your tool returns a clear error, and the model responds by listing the tests directory to find the correct path.
- The agent writes a patch that breaks imports. The next test run shows ImportError, and the agent re-opens the updated file to resolve the import path.

Implications and Applications

Education:
Building a toy agent clarifies how assistants actually work: planning, tool calls, and iterative reasoning. It's a hands-on way to teach advanced AI concepts and system design.

Software Development:
Scale these patterns for domain-specific developer assistants: automated debugging, refactoring, doc generation, test orchestration, release scripts. You define the tools; the model orchestrates the work.

Policy and Security:
Autonomous actions force clear boundaries. Sandbox execution, permissions, and audits aren't optional. Institutions need standards and oversight for agents that touch code, data, and networks.

Recommendations for Implementation

For Developers:
- Start simple: implement a minimum viable agent with clear tools and a tight loop.
- Make tools modular, well-documented, and forgiving. Return descriptive errors; the model uses them to learn.
- Treat security as a feature, not a patch. Constrain paths, enforce timeouts, and consider containerized execution even during experiments.

For Educational Institutions:
- Assign agent-building projects to teach AI integration, APIs, and loop-based reasoning.
- Establish strict protocols: sandboxed environments, no external network calls by default, and HITL approvals for sensitive operations.

Advanced Tips That Increase Success Rates

Prompt Strategy:
- Ask the agent to "Make a plan first, then act." You'll see fewer random tool calls.
- Encourage verify-as-you-go: "After each change, run tests to confirm." This reduces cascading errors.

Tool Output Design:
- Label sections clearly (--- STDOUT ---, --- STDERR ---). The model anchors on these markers.
- Include context in write_file acknowledgments (characters written, target path) to help the model track changes.

Memory and Summarization:
- Summarize long logs. Replace raw history with a short narrative for the next step. Guide the model to request missing details instead of re-dumping logs.

Extending the Agent: New Tools and Capabilities

Once the base loop works, extend carefully. Each new tool expands the agent's reach and your responsibility to secure it.

Example: Git Integration Tool
Purpose: Stage, commit, and show diffs so the agent can manage patches cleanly.
Signature: git_action(action: str, args: list[str]) → str. Actions: status, diff, add, commit, log.
Schema: name "git_action", parameters: action (string), args (array of strings).

Example: HTTP Fetch Tool (Read-Only)
Purpose: Fetch documentation or package info from an allowlist of domains.
Signature: http_get(url: str) → str (truncated HTML/text).
Schema: name "http_get", parameters: url (string). Add allowlist and max bytes.

Challenges to anticipate:
- Security: Network calls introduce new attack surfaces. Use allowlists and response sanitization.
- Complexity: More tools mean more possible actions. Keep schemas crisp and your system prompt opinionated about when to use them.
- Determinism: Prefer idempotent reads and controlled writes. Avoid tools that mutate state unpredictably without checks.

Production Hardening Checklist

- Isolation: Run agent actions in a container with restricted permissions and no default network access.
- Permissions: Require explicit user approval for sensitive operations (e.g., writing to specific files, installing packages).
- Resource Limits: CPU/memory caps, strict timeouts, and output size limits.
- Logging and Audits: Record all tool calls, arguments, results, and model prompts for traceability.
- Rollback Plans: Version files, commit changes, and allow easy reversion if the agent drifts.

Practice: Run These Scenarios

Scenario A: Reproduce and Fix
Ask: "The test for 3 + 4 * 2 fails. Diagnose and fix." Watch the loop: list → run tests → read module → write patch → re-run tests → summarize.

Scenario B: Add a Feature
Ask: "Add power operator ** and update the CLI help. Include tests." Expect planning, minimal reads, targeted writes, and verification.

Scenario C: Document the Codebase
Ask: "Generate a concise README summary of this project with usage examples." The agent should read key files and produce a succinct doc.

Common Pitfalls and How to Avoid Them

- Vague Tool Outputs: If your tool returns "error" without context, the agent can't recover. Always include specific messages.
- Unbounded Loops: Without MAX_ITERATIONS, an agent can get stuck. Cap iterations and ask the user for help when stuck.
- Reading the Entire Repo: Costs explode. Encourage the agent to list directories first, then read selectively.
- Silent Security Holes: If you don't check paths or timeouts, you invite problems. Make guardrails default, not optional.

Key Insights Worth Remembering

"The thing that makes it an agent is that it can self-prompt itself over and over."
Give it tools and a loop, and the model becomes an operator, not just a writer.

"We are building an app that can help us build other apps."
That's leverage. Small, precise tools plus a smart orchestrator compound your output.

"If you thought allowing an LLM to write files was a bad idea, you ain't seen nothing yet."
Running code is where the power, and the risk, lives. Guard it.

Complete Coverage: Reflecting the Core Components

We covered:
- Agentic loop mechanics with multiple concrete examples.
- Tool calling pattern and schema design for four core tools.
- System architecture: Gemini model, tools, and the orchestrator application.
- Practical setup with uv, API key management, and first Gemini call.
- Implementations of get_files_info, get_file_content, write_file, run_python_file with guardrails and detailed outputs.
- Function declarations, system prompt design, and message roles (user, model, tool).
- Security and guardrails: working directory confinement, timeouts, and production-grade recommendations.
- Implications: education, software development, and policy/security contexts.
- Recommendations for developers and institutions with actionable steps.
- Advanced tips: prompting, output labeling, memory control, and cost management.
- End-to-end debugging and feature-addition scenarios with verification.

Short Exercises and Questions

Multiple Choice
1) What is the primary role of the agentic loop?
- Answer: To allow an agent to perform a sequence of actions, observe results, and adjust its plan.

2) What does an LLM provide when it performs a tool call?
- Answer: A structured request (e.g., JSON-like) specifying the function name and arguments.

3) Why capture both stdout and stderr when running Python files?
- Answer: To give the agent complete feedback, including error messages for diagnosis.

Short Answer
1) Name the three main components of the system and their roles.
2) Describe how a single iteration works when the agent decides to read a file.
3) Why is the system prompt more effective at controlling behavior than user prompts?

Discussion
1) What are the biggest risks in write-and-execute agents? What extra safeguards would you add?
2) How would you design a "smart patch" tool to insert, delete, or replace specific lines? What edge cases appear?
3) Propose a read-only web tool or git tool with a function signature and schema. How would you constrain it safely?

Practical Best Practices Recap

- Start each task by exploring structure (get_files_info), not by reading everything.
- Reproduce issues before fixing. Verify after each change.
- Keep tools boring and safe. The model provides creativity; your tools provide dependability.
- Prefer small, reversible changes over sweeping modifications.
- Log everything. Tool requests, arguments, outputs, and summaries form an audit trail and a learning dataset.

Conclusion: From Prompting to Building

Agents turn language models into doers. The magic is not a single prompt; it's a loop, a set of tools, and the discipline to constrain power within safe boundaries. When you design the system, you control the leverage. You built a coding agent that can read, write, and run code, all while learning from the feedback it creates. That's not just better answers; that's compounding action.

Take these patterns and apply them to your own workflows: refactoring at scale, automating docs, orchestrating tests, or generating starter apps. Keep refining your system prompt, your tool schemas, and your guardrails. The more precisely you define the environment, the more reliably the agent works inside it. Now ship something your future self will thank you for.

Next Step:
Point your agent at a real, but sandboxed, project. Ask it to find a bug, propose a fix, and pass tests. Observe the loop. Tighten the tools. Iterate until it feels like a teammate.

Frequently Asked Questions

This FAQ exists to answer the most common, and the most overlooked, questions about building an agentic Python coding agent with Gemini. It moves from core ideas to practical implementation, security, optimization, and real business use. Use it as a quick reference while you architect, build, and ship your agent.

Fundamental Concepts

What is an AI agent, and how does it differ from a standard chatbot?

Summary:
An AI agent is built to take actions, not just answer questions. A chatbot typically responds once and stops. An agent loops through observe → think → act, using tools to work toward a goal. For a coding agent, that means it can read files, write patches, run tests, and report back.

Key difference:
Chatbot = single-turn text reply. Agent = multi-step decisions with tool use and feedback. In practice, a user might say "Fix order-of-operations in my calculator." The agent can list project files, open the parser, propose a code change, write it, run tests, and confirm the fix.

Business value:
Agents reduce manual handoffs. Instead of telling a developer what to do, they can try a solution themselves, then produce evidence: what changed, what passed, and what still needs attention.

What is an "agentic loop"?

Summary:
The agentic loop is the engine that lets the agent work autonomously until a goal is met. Steps: Observe (state and prior results), Think (choose next action or tool), Act (execute a function), and Feedback (use results to guide the next step). This repeats until the task is done or a stop condition is reached.

Why it matters:
Complex tasks require multiple steps. The loop lets the model self-prompt, verify progress, and adapt.

Example:
Investigate a failing unit test → read target file → propose fix → write file → run tests → summarize results. This structure keeps the agent focused and makes it auditable.

What role does "tool calling" play in AI agents?

Summary:
Tool calling bridges language output and real actions. The model does not execute code; it requests a function by name with arguments (a structured call). Your Python program runs the function, captures the output (or errors), and feeds it back.

Why it's essential:
Without tools, the agent can't read files, run scripts, or change anything. With tools, it can interact with the filesystem, shell, databases, and APIs.

Practical example:
A "get_file_content" tool returns the exact text of a file so the model can reason about it, propose edits, and then call "write_file" to apply changes.

Project Setup and Prerequisites

What are the prerequisites for building this AI coding agent?

Summary:
You'll need: Python 3.10+, uv for environment/dependency management, a Unix-like shell (Bash/Zsh or WSL on Windows), and a Gemini API key.

Why these choices:
- uv speeds up installs and keeps your project reproducible.
- A Unix-like shell improves compatibility for scripts and subprocesses.
- The API key authenticates your calls to Gemini.

Business tip:
Keep secrets in a .env file and exclude it from version control. That reduces risk and simplifies onboarding for teammates.

What is UV, and how is it used to set up a Python project?

Summary:
uv is a fast Python package/dependency manager that pairs with modern pyproject.toml workflows. You'll init a project (creates pyproject.toml), create a venv, activate it, and add dependencies (e.g., google-generativeai, python-dotenv).

Why use it:
It's quick, consistent, and works well in CI.

Typical flow:
Initialize → create venv → activate → uv add dependencies. This keeps your environment isolated so agent runs don't affect other projects.

What is Google Gemini?

Summary:
Gemini is a family of large language models from Google used here as the "brain" of your agent. It handles reasoning, planning, and language generation, including decisions about which tool to call next.

Model choice tip:
Use lighter models (e.g., Flash) for faster, lower-cost loops when exploring files or running tests; use higher-quality models (e.g., Pro) for complex reasoning or code synthesis.

Business angle:
Match model selection to the task's complexity and the cost/latency budget of your workflow.

What are "tokens" in the context of LLM APIs?

Summary:
Tokens are chunks of text the model processes. Billing and context limits depend on token counts for both prompts and responses. Roughly, one token is a few characters.

Why you should care:
- Cost control: Long prompts and big outputs increase spend.
- Reliability: Exceeding context limits causes failures.

Practical tip:
Truncate large files, summarize long logs, and stream outputs only when needed. Track token usage per request for visibility and budgeting.

How do you obtain and configure a Gemini API key?

Summary:
Generate a key in Google AI Studio, then store it in a .env file (e.g., GEMINI_API_KEY="..."). Load it in Python with python-dotenv and pass it to the Gemini client configuration.

Security essentials:
- Add .env to .gitignore.
- Use environment variables in CI/CD.
- Rotate keys if you suspect exposure.

Common pitfalls:
Missing .env, wrong variable name, or shell not sourcing the environment. Log a clear error if the key isn't found to speed up troubleshooting.

Interacting with the Gemini API

How do you make a basic API call to Gemini in Python?

Summary:
Install google-generativeai and python-dotenv. Load the API key from .env, configure the client, select a model (e.g., gemini-1.5-flash), send a prompt, and read response.text. Optionally log token usage from usage_metadata.

Checklist:
- Load .env before config.
- Configure API key once per run.
- Handle empty or error responses gracefully.

Example use case:
Kick off a simple Q&A to verify connectivity before you wire up tools and the agentic loop.

What is a "system prompt" and why is it critical for an agent?

Summary:
The system prompt sets the agent's role, rules, and strategy. It has higher priority than user prompts. You'll define allowed tools, constraints (e.g., relative paths only), and the general method to approach tasks (e.g., list files, reproduce bug, write fix, retest).

Benefits:
- Fewer unsafe actions
- More consistent behavior
- Easier debugging and audits

Tip:
Be explicit about tool usage order, constraints, and verification steps to reduce loops and mistakes.

Building the Agent's Tools

What are the four essential tools for the coding agent?

Summary:
The baseline toolkit: get_files_info (list directories/files), get_file_content (read text content), write_file (create/overwrite files), and run_python_file (execute a Python script and capture stdout/stderr).

Why these first:
They enable discovery, diagnosis, patching, and verification: the full loop for bug fixing.

Extension ideas:
Add a test runner, linter, or git tool later, but start with the smallest set that supports end-to-end outcomes.

How do you implement a secure function for the agent to list files?

Summary:
Sandbox it. Resolve the absolute path of a designated working directory, resolve the target path, and ensure the target is within the sandbox (prefix check). If it's outside, return an error.

Good behavior:
- Validate input paths
- Return clear error messages
- Avoid sensitive metadata

Business note:
Sandboxing reduces risk while allowing useful work. It's a non-negotiable guardrail for any agent that touches the filesystem.

Why is it important to truncate file contents when implementing a get_file_content tool?

Summary:
Truncation controls cost and prevents context overflows. Large files can balloon token usage and cause API failures. Limit content length (e.g., first 10k chars) and append a note that the file was truncated.

Practical workflow:
Let the agent request additional slices when needed (e.g., "show lines 200-400").

Outcome:
Faster, cheaper calls, and fewer interruptions due to context limits.

Certification

About the Certification

Get certified in Agentic AI in Python with the Google Gemini API. Prove you can build a safe CLI coding agent that plans, edits files, runs tests, fixes bugs, adds features, and explains changes using a clear tool schema, path limits, and timeouts.

Official Certification

Upon successful completion of the "Certification in Developing Agentic Python Coding Agents with Google Gemini API", you will receive a verifiable digital certificate. This certificate demonstrates your expertise in the subject matter covered in this course.

Benefits of Certification

  • Enhance your professional credibility and stand out in the job market.
  • Validate your skills and knowledge in cutting-edge AI technologies.
  • Unlock new career opportunities in the rapidly growing AI field.
  • Share your achievement on your resume, LinkedIn, and other professional platforms.

How to complete your certification successfully?

To earn your certification, you'll need to complete all video lessons, study the guide carefully, and review the FAQ. After that, you'll be prepared to meet the certification requirements.

Join 20,000+ Professionals Using AI to Transform Their Careers

Join professionals who didn't just adapt; they thrived. You can too, with AI training designed for your job.