Pydantic AI for Production: Build Type-Safe LLM Agents (Video Course)

Build AI agents you can trust. Learn Pydantic AI's clean patterns to return typed, validated data; add tools and real dependencies; stream results; set timeouts/retries; plug in search, code, and embeddings. Ship production-ready services without guesswork.

Duration: 1 hour
Rating: 5/5 Stars

Related Certification: Certification in Building Production-Ready Type-Safe LLM Agents with Pydantic


Also includes Access to All:

700+ AI Courses
700+ Certifications
Personalized AI Learning Plan
6500+ AI Tools (no Ads)
Daily AI News by job industry (no Ads)

Video Course

What You Will Learn

  • Build production-grade agents that return Pydantic-validated outputs
  • Attach and call tools (decorators, tool lists, tool sets) including code execution and web search
  • Inject live dependencies (DBs, API clients) via dataclasses and RunContext
  • Manage conversation, streaming, timeouts, retries, and observability for reliability
  • Use embeddings for RAG and connect remote tools via MCP integration

Study Guide

Pydantic AI Crash Course: Agentic Framework For Production

You're probably here for one reason: you've seen what LLMs can do, but you also know what they can break. Free-form text is great for demos and brainstorms. It's terrible for databases, APIs, and anything mission-critical. That's where Pydantic AI comes in: a minimal, explicit, production-grade framework for building AI agents that return validated, predictable data structures. Think of it like FastAPI for agents: clean interfaces, strict models, and the flexibility to scale without chaos.

In this course, you'll learn how to build robust agents from the ground up: initialize them, validate their outputs with Pydantic models, manage conversation context intentionally, extend capability with tools, inject live dependencies (like DB connections), stream responses, set timeouts/retries, leverage built-ins like web search and code execution, generate embeddings for RAG, and even connect to remote tool servers via MCP. By the time you're done, you'll be able to ship production-grade AI services with confidence, not hope.

Here's the mindset shift: stop wrestling LLMs into structure after the fact. Define the structure first, and let the model fit into it. As someone smart put it, "Why use the derivative when you can go straight to the source." Pydantic AI is the source when it comes to validated, typed, reliable outputs.

What This Course Covers (And Why It Matters)

- The core philosophy of Pydantic AI: type safety, validation, and structured outputs by design.
- Clean setup: what to install, how to store keys, and how to start your first agent.
- Structured output using Pydantic models that auto-validate LLM responses.
- Conversation and context: when to use system prompts vs. instructions, and how to keep state intentionally.
- Tools: extending agents with Python functions (decorators and tool lists).
- Dependency injection: pass live context (DBs, user sessions) into tools with a clean, testable pattern.
- Advanced features: streaming, timeouts, retries, tool sets, web search, code execution, embeddings, and MCP integration.
- Production patterns: testing, error handling, observability, performance, and security considerations.
- Practical applications and templates you can repurpose today.

Philosophy: Minimal, Explicit, and Strongly Typed

Pydantic AI distinguishes itself from generalist frameworks by putting data validation at the heart of the workflow. You tell the model exactly what a valid response looks like and get back a Pydantic model you can trust. The framework is model-agnostic and plays well with OpenAI, Anthropic, and others, but its trademark is predictable, validated outcomes. You won't ship a string when your API expects a "UserProfile" object. You'll ship a "UserProfile" object.

This is the same clean, explicit style developers love in FastAPI: less magic, more clarity, and a surface area small enough to memorize.

Setup: Install, Keys, and Your First Agent

Before anything else, get your environment dialed in. You need to install pydantic-ai and a simple key manager.

Example 1: Install dependencies
pip install pydantic-ai python-dotenv

Example 2: Store your API keys in .env
OPENAI_API_KEY="sk-your-secret-key"

You'll typically load keys at startup using python-dotenv. Keep your repo clean: don't commit .env files, and use environment variables in production.

Your First Agent: Model, System Prompt, and a Clean Response

An Agent is the core orchestrator. You define a model and a system prompt that governs behavior. Calls are async for responsiveness.

Example 1: Minimal agent
from dotenv import load_dotenv
from pydantic_ai import Agent
import asyncio
load_dotenv()
agent = Agent(model="openai:gpt-4o", system_prompt="Answer in one concise sentence.")
async def main():
  res = await agent.run("What is a unit test?")
  print(res.output)
asyncio.run(main())

Example 2: Model-agnostic agents
# OpenAI model
openai_agent = Agent(model="openai:gpt-4o", system_prompt="Be direct.")
# Anthropic model (hypothetical spec, adjust to your provider)
anthropic_agent = Agent(model="anthropic:claude-3-opus", system_prompt="Be direct.")
# Both can be used interchangeably while keeping the same code paths.

Tip: Keep system prompts short and stable. Use instructions for transient behavior (more on that later).

Structured Output with Pydantic Models: Production's Secret Weapon

This is the main event. With Pydantic AI, the model's response is validated against a schema you define. If it doesn't fit, the framework handles corrections or raises errors you can catch and retry. You're no longer parsing brittle text; you're handling typed objects.

Example 1: Simple structured output
from pydantic import BaseModel
from pydantic_ai import Agent
class Person(BaseModel):
  name: str
  age: int
  job: str
structured_agent = Agent(
  model="openai:gpt-4o",
  system_prompt="Extract a person profile from the message.",
  output_type=Person
)
# "Jake is a 29-year-old product manager."
# res.output -> Person(name='Jake', age=29, job='product manager')

Example 2: Nested models for APIs
class Address(BaseModel):
  street: str
  city: str
  country: str
class Customer(BaseModel):
  id: str
  email: str
  address: Address
cust_agent = Agent(
  model="openai:gpt-4o",
  system_prompt="Return normalized customer records only.",
  output_type=Customer
)
# Input text: "Customer 88, email: ann@example.com, lives at 5 Pike, Seattle, USA."
# Output is a fully validated Customer object.

Best practices:
- Start with strict fields (no Optional) unless there's real ambiguity.
- Use Enums for constrained values (e.g., country codes).
- Validate everything at the boundary; once validated, treat the object as safe for your pipeline.

Conversation and Context: Stateful by Choice

Agents are stateless by default. That's a good thing: you control exactly what persists between turns. To carry forward context, pass message_history. You also get fine-grained control with response.all_messages() and response.new_messages().

Example 1: Multi-turn with history
res1 = await agent.run("Summarize the Python language in one sentence.")
res2 = await agent.run(
  "Give me two practical uses based on your last answer.",
  message_history=res1.all_messages()
)
print(res2.output)

Example 2: Selective history using new_messages
resA = await agent.run("Explain test-driven development in brief.")
# new_messages() returns only the latest user+assistant exchange
resB = await agent.run(
  "List three benefits.",
  message_history=resA.new_messages()
)
)
# Useful when you want short context windows.

System Prompt vs. Instructions:
- system_prompt: persistent identity; included in all_messages and inherited by chained agents if you pass history further.
- instructions: one-off directives for a single run; not carried forward. Great for temporary behavior, like "use bullet points this time."
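Because the history you pass is just a list you control, trimming it is ordinary list work. A stdlib-only sketch of the idea (the dict shape below is an assumption for illustration, not Pydantic AI's actual message classes):

```python
def trim_history(messages, keep_last=4, preserve_system=True):
    """Keep the system message (if any) plus the last `keep_last` messages."""
    system = [m for m in messages if m["role"] == "system"] if preserve_system else []
    rest = [m for m in messages if m["role"] != "system"]
    return system + rest[-keep_last:]

history = [
    {"role": "system", "content": "Be concise."},
    {"role": "user", "content": "q1"}, {"role": "assistant", "content": "a1"},
    {"role": "user", "content": "q2"}, {"role": "assistant", "content": "a2"},
    {"role": "user", "content": "q3"}, {"role": "assistant", "content": "a3"},
]
short = trim_history(history, keep_last=2)
# Keeps the persistent system message plus only the latest exchange.
```

The same policy works whether the messages come from all_messages() or your own store; the point is that context size is your decision, not the framework's.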

Streaming Responses: Real-Time Feedback

When you want an interactive feel or need to display partial results, use run_stream. It yields tokens or text deltas while the model is thinking.

Example 1: Text delta streaming
async with agent.run_stream("Explain gradient descent simply.") as result:
  async for chunk in result.stream_text(delta=True):
    print(chunk, end="", flush=True)

Example 2: Streaming with structured output context
# You can stream text previews while still validating the final structured output in another pass.
# Pattern: stream for UX, finalize with a second call that enforces output_type for downstream systems.

Tools: Give Your Agent Real-World Powers

Tools are Python functions your agent can call to fetch data, perform calculations, or interact with external systems. Define them with clear type hints and a helpful docstring. Register via a decorator or pass them as a list; both patterns work.

Example 1: Simple decorator (@agent.tool_plain)
agent = Agent(model="openai:gpt-4o")
@agent.tool_plain
def get_favorite_color(name: str) -> str:
  """Return the favorite color of a user by name."""
  return {"florian": "Teal", "mike": "Orange"}.get(name.lower(), "Unknown")
# The agent can now call get_favorite_color when needed.

Example 2: Register a tools list on init or per-run
def usd_to_eur(amount: float) -> float:
  """Convert USD to EUR with a dummy rate."""
  return round(amount * 0.9, 2)
pricing_agent = Agent(model="openai:gpt-4o", tools=[usd_to_eur])
# Or, provide tools dynamically:
res = await agent.run("Convert 120 USD to EUR.", tools=[usd_to_eur])

Tips:
- Keep tool names and docstrings crisp; the LLM uses them for tool selection.
- Use clear type hints; they guide the parameter mapping.
- Return simple, serializable values (dicts, lists, numbers, strings) unless you're deliberately returning Pydantic models.
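To see why hints and docstrings matter, here is a stdlib-only sketch of how a tool-calling layer can derive a parameter spec from a function's signature (an illustration of the idea, not Pydantic AI's internals):

```python
import inspect

def usd_to_eur(amount: float, rate: float = 0.9) -> float:
    """Convert USD to EUR at a given rate."""
    return round(amount * rate, 2)

def tool_spec(fn):
    """Build a minimal name/description/params spec from hints and the docstring."""
    sig = inspect.signature(fn)
    params = {
        name: p.annotation.__name__
        for name, p in sig.parameters.items()
        if p.annotation is not inspect.Parameter.empty
    }
    return {"name": fn.__name__, "description": fn.__doc__, "params": params}

spec = tool_spec(usd_to_eur)
# The LLM selects tools from exactly this kind of metadata, which is why
# vague names or missing hints degrade tool selection.
```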

Dependency Injection for Tools: Clean Access to Context

Real applications need context: database connections, user sessions, feature flags, and API clients. Pydantic AI's dependency injection system lets you define this context as a dataclass and pass it into tools via a RunContext. It keeps your tools clean and testable.

Example 1: Database-backed tool
import sqlite3
from dataclasses import dataclass
from pydantic_ai import Agent, RunContext
@dataclass
class Deps:
  username: str
  db: sqlite3.Connection
db_agent = Agent(model="openai:gpt-4o", deps_type=Deps)
@db_agent.tool
def list_customers(ctx: RunContext[Deps]) -> list[tuple]:
  """Return all customers from the database."""
  cur = ctx.deps.db.cursor()
  cur.execute("SELECT id, name FROM customers")
  return cur.fetchall()
db_conn = sqlite3.connect("data/app.db", check_same_thread=False)
deps = Deps(username="alice", db=db_conn)
res = await db_agent.run("List all customers.", deps=deps)

Example 2: API client and user session
from dataclasses import dataclass
class CRMClient:
  def get_pipeline(self, user_id: str) -> dict: ...
@dataclass
class Deps2:
  user_id: str
  crm: CRMClient
crm_agent = Agent(model="openai:gpt-4o", deps_type=Deps2)
@crm_agent.tool
def get_user_pipeline(ctx: RunContext[Deps2]) -> dict:
  """Fetch the active sales pipeline for the current user."""
  return ctx.deps.crm.get_pipeline(user_id=ctx.deps.user_id)
res = await crm_agent.run("What is my current pipeline?", deps=Deps2(user_id="u_123", crm=CRMClient()))

Best practices:
- Keep the dataclass minimal: only inject what tools actually need.
- Use one deps_type per agent to avoid confusion.
- Close connections in finally blocks or context managers after runs.
- For tests, replace real deps with fakes or in-memory DBs.
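The last point deserves a concrete sketch: because the tool body only touches its deps object, you can test it against an in-memory SQLite database with no agent, network, or API key involved (plain Python mirroring the Deps pattern above):

```python
import sqlite3
from dataclasses import dataclass

@dataclass
class Deps:
    username: str
    db: sqlite3.Connection

def list_customers(deps: Deps) -> list[tuple]:
    """The tool body, written so it can be exercised without an agent run."""
    cur = deps.db.cursor()
    cur.execute("SELECT id, name FROM customers")
    return cur.fetchall()

# In tests, an in-memory DB stands in for the production connection.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
db.execute("INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bo')")
rows = list_customers(Deps(username="test", db=db))
```

This is the payoff of dependency injection: the same function runs unchanged against production and test contexts.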

Tool Sets: Organize and Reuse

Group related functions into a FunctionToolset for better organization. Pass the set into agents on init or per-run for reuse across teams and services.

Example 1: Date/time toolset
from pydantic_ai.toolsets import FunctionToolset
def get_current_date() -> str: ...
def get_current_time() -> str: ...
def get_current_weekday() -> str: ...
dt_tools = FunctionToolset(tools=[get_current_date, get_current_time, get_current_weekday])
agent = Agent(model="openai:gpt-4o", toolsets=[dt_tools])

Example 2: Financial calculators toolset
def compound_interest(principal: float, rate: float, periods: int) -> float: ...
def mortgage_payment(principal: float, rate: float, terms: int) -> float: ...
finance_tools = FunctionToolset(tools=[compound_interest, mortgage_payment])
res = await agent.run("If I invest 10,000 at 5% for 8 periods, what's the outcome?", toolsets=[finance_tools])

Tip: Tool sets simplify permissioning and versioning in larger systems.

Timeouts and Retries: Make Unreliable Things Reliable

In production, tools fail. Networks glitch. Pydantic AI gives you control with request timeouts and bounded retries to keep your flows resilient.

Example 1: Request timeout
res = await agent.run(
  "Call the slow API and summarize.",
  model_settings={"timeout": 3.0} # seconds for the model request
)
# If the call exceeds the timeout, catch the error and handle or report it.

Example 2: Retries for flaky dependencies
retry_agent = Agent(model="openai:gpt-4o", retries=2) # up to 3 tries total per tool/validation error
res = await retry_agent.run("Fetch stock price and convert to EUR.")

Best practices:
- Set conservative timeouts for UI-facing features; larger for batch jobs.
- Combine retries with proper logging/alerts so you see patterns, not just patches.
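To make the timeout-plus-retry semantics concrete, here is a stdlib asyncio sketch of the pattern: a per-attempt timeout and a bounded number of retries (an illustration of the behavior, not Pydantic AI's implementation):

```python
import asyncio

async def call_with_resilience(fn, *, timeout=3.0, retries=2):
    """Run `fn` with a per-attempt timeout; retry up to `retries` extra times."""
    last_exc = None
    for attempt in range(retries + 1):
        try:
            return await asyncio.wait_for(fn(), timeout=timeout)
        except (asyncio.TimeoutError, RuntimeError) as exc:
            last_exc = exc  # in production: log the attempt, then fall through
    raise last_exc

calls = {"n": 0}

async def flaky():
    """Stand-in for a flaky external call that succeeds on the second try."""
    calls["n"] += 1
    if calls["n"] < 2:
        raise RuntimeError("transient failure")
    return "ok"

result = asyncio.run(call_with_resilience(flaky, timeout=1.0, retries=2))
```

Whatever mechanism you use, the invariant is the same: total tries are bounded, each attempt has a deadline, and failures are logged rather than swallowed.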

Built-In Tools: Search and Code Execution

Some capabilities are provided out of the box.

Example 1: Web search
from pydantic_ai import WebSearchTool
search_agent = Agent(model="openai:gpt-4o", builtin_tools=[WebSearchTool()])
# res = await search_agent.run("What is the latest on async Python features?")

Example 2: Code execution
from pydantic_ai import CodeExecutionTool
code_agent = Agent(model="openai:gpt-4o", builtin_tools=[CodeExecutionTool()])
# res = await code_agent.run("Write Python to compute the 50th Fibonacci number and return the result.")

Safety tips:
- Treat code execution as privileged: log, sandbox, and restrict inputs.
- For web search, record sources to maintain transparency and traceability.

Embeddings: Foundation for RAG

Use the Embedder for converting text into vectors. This powers semantic search, similarity, and retrieval-augmented generation.

Example 1: Single-query embedding
from pydantic_ai import Embedder
embedder = Embedder(model="openai:text-embedding-3-small")
vec = await embedder.embed_query("How to unit test async code in Python?")
print(len(vec)) # embedding dimensionality

Example 2: Batch document embeddings
docs = ["Guide to fixtures", "Mocking I/O", "Async context managers"]
vectors = [await embedder.embed_query(d) for d in docs]
# Store vectors in your index and retrieve by cosine similarity.

Tips:
- Normalize vectors for cosine distance (if your DB doesn't).
- Chunk documents intelligently; use titles and metadata.
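Normalization and similarity are simple enough to verify by hand. A stdlib sketch of cosine similarity over toy vectors (the vectors and document names are made up for illustration):

```python
import math

def normalize(vec):
    """Scale a vector to unit length so the dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def cosine(a, b):
    """Cosine similarity between two vectors (normalized internally)."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))

query = [0.1, 0.8, 0.3]
docs = {"fixtures": [0.1, 0.8, 0.3], "mocking": [0.9, 0.1, 0.0]}
best = max(docs, key=lambda name: cosine(query, docs[name]))
# Retrieval is just "rank documents by cosine against the query vector".
```

In practice you store pre-normalized vectors in a vector DB and let it do this ranking at scale; the math is exactly the above.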

MCP Integration: Use Remote Tools Over the Network

Pydantic AI can connect to tools exposed by an MCP (Model Context Protocol) server. Think of MCP like a tool marketplace inside your architecture: expose tools from other services and let agents consume them without needing local code.

Example 1: Conceptual MCP client integration
# Pseudocode illustrating concept
# from pydantic_ai.mcp import MCPToolSet
# mcp_tools = MCPToolSet.connect("https://tools.internal.yourcompany.com")
# agent = Agent(model="openai:gpt-4o", tool_sets=[mcp_tools])
# res = await agent.run("Get the current inventory and summarize backorders.")

Example 2: Hybrid local + MCP tools
# Combine a local FunctionToolSet for quick utilities and an MCPToolSet for enterprise operations.
# agent = Agent(model="openai:gpt-4o", tool_sets=[local_tools, mcp_tools])
# This lets you compose capabilities across boundaries cleanly.

Note: Exact APIs may vary by MCP client implementation. The point is the architecture: tools can be hosted remotely and discovered dynamically.

Instructions vs. System Prompts: Fine-Grained Control

You need both. Use system_prompt for persistent behavior and guardrails; use instructions for one-off requests that shouldn't pollute future turns.

Example 1: Persistent identity
agent = Agent(model="openai:gpt-4o", system_prompt="You are a structured data extractor. Return only validated models.")
# This sticks across history when all_messages is passed around.

Example 2: One-time instructions
res = await agent.run(
  "Extract product details from this description.",
  instructions="Return items sorted by price ascending, no explanations."
)
# Next run won't inherit the sorting directive.

Flexible Tool Management: Decorators, Lists, and Per-Run Injection

You decide how tools attach to agents: permanently with decorators, globally at init, or dynamically per-call. That flexibility helps in testing, permission boundaries, and reuse across services.

Example 1: Permanent decorator
@agent.tool_plain
def get_weather(city: str) -> str: ...

Example 2: Dynamic tools per request
def calc_tax(amount: float, region: str) -> float: ...
res = await agent.run("Compute tax for 459 in EU.", tools=[calc_tax])
# No need to modify the agent's default toolset.

Production Patterns: Testing, Validation, and Contracts

If your output feeds another system, validation is non-negotiable. Test agents like you test APIs: with fixtures, mocks, and predictable outcomes.

Example 1: Contract tests for structured output
- Define BaseModels for every response you expose externally.
- Run the agent on fixed prompts and assert returned models satisfy invariants (non-empty IDs, valid enums, etc.).

Example 2: Tool contract tests
- For each tool, verify the docstring, type hints, and error handling.
- Mock external services; assert retries fire on failures; assert timeouts abort correctly.

Tips:
- Treat output_type as your API contract.
- Log validation errors with enough detail to reproduce but without leaking secrets.
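A contract test can be as small as a pure function that checks invariants on the returned object. A minimal sketch (the field names and currency enum are illustrative assumptions, not a fixed schema):

```python
VALID_CURRENCIES = {"USD", "EUR", "GBP"}

def check_invoice_contract(invoice: dict) -> list[str]:
    """Return a list of contract violations; an empty list means the object is safe."""
    errors = []
    if not invoice.get("id"):
        errors.append("id must be non-empty")
    if invoice.get("currency") not in VALID_CURRENCIES:
        errors.append("currency must be a known code")
    if invoice.get("total", -1) < 0:
        errors.append("total must be >= 0")
    return errors

good = {"id": "inv_1", "currency": "EUR", "total": 120.0}
bad = {"id": "", "currency": "XXX", "total": -5}
```

Run your agent on fixed prompts in CI and assert the check returns no violations; the same invariants double as alert conditions in production.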

Error Handling and Observability

You can't fix what you can't see. Build observability into your agents so production incidents are traceable and fast to resolve.

Example 1: Structured error logging
- Log agent.run inputs (redact sensitive data), selected tools, tool args, durations, and validation failures.
- Include message_history length to debug context drift.

Example 2: Metrics for reliability
- Track timeout and retry usage per tool and per route.
- Record streaming durations and token counts for capacity planning.

Tip: Add correlation IDs to stitch together multi-step agent workflows across services.

Performance Tuning

Performance isn't just about speed; it's about predictable latency under load.

Example 1: Control context window
- Prefer response.new_messages() for short follow-ups.
- Summarize or compress history when it grows; send only what's necessary.

Example 2: Batch operations
- For embeddings, batch requests to respect provider limits and reduce overhead.
- For tool-heavy flows, use timeouts and parallelize independent calls (outside the agent if needed).

Tips:
- Cache external API results when acceptable.
- Use lighter models for classification/structure and heavier ones for reasoning where needed.
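Caching external results can be as light as functools.lru_cache when inputs are hashable and some staleness is acceptable. A sketch (the rate table is a stand-in for a real API call):

```python
from functools import lru_cache

calls = {"n": 0}

@lru_cache(maxsize=128)
def fetch_rate(pair: str) -> float:
    """Stand-in for an external API call; repeated lookups hit the cache."""
    calls["n"] += 1  # counts actual "API calls", not cache hits
    return {"USDEUR": 0.9}.get(pair, 1.0)

a = fetch_rate("USDEUR")
b = fetch_rate("USDEUR")  # served from the cache; no second call
```

For production, prefer a cache with a TTL (or an external cache like Redis) so rates and other volatile data expire; lru_cache never invalidates on its own.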

Security and Safety Considerations

Agents interact with your data and tools; treat them like any other privileged service.

Example 1: Principle of least privilege
- Limit which tools are available per agent and per request.
- Inject only necessary dependencies; never expose raw credentials inside tool docstrings or logs.

Example 2: Input sanitization
- For CodeExecutionTool, sandbox and rate-limit; scrub inputs for risky patterns.
- For web search, store source URLs and filter domains where required.

Tip: Keep secrets out of prompts and tool arguments; use handles or IDs instead.

Common Use Cases and Patterns

Here are practical, production-ready applications where Pydantic AI shines.

Example 1: Data extraction from emails or documents
- Define models: Contact, Invoice, PurchaseOrder.
- Pass raw text and enforce output_type=Invoice.
- Store the validated object directly into your DB; no brittle regexes.

Example 2: AI-backed API endpoints
- A FastAPI route calls your agent with output_type=QuoteResponse.
- You get a validated model every time, serialize to JSON, and return.

Example 3: Analyst assistant with tools
- Tools: query_warehouse(sql), usd_to_eur, plot_data.
- Dependency injection: db connection + user permissions.
- The agent computes answers and returns structured KPI objects for dashboards.

Example 4: Retrieval-Augmented Generation (RAG)
- Generate embeddings for docs, store in a vector DB.
- Retrieve top-k passages, pass to agent with instructions for grounded answers.
- Enforce a structured "AnswerWithCitations" model as output.

Advanced Patterns: Chaining Agents Cleanly

You don't need a complex "chain" framework to build multi-step logic. Chain agents by passing validated outputs and selective histories.

Example 1: Extract → Enrich → Save
- Agent A: Extract a Product model from raw text (output_type=Product).
- Agent B: Enrich the product with external data via tools (inventory, price).
- Service: Save to DB; if validation fails, log and retry with corrective instructions.

Example 2: Multi-identity workflows
- Agent "Librarian": system_prompt "you retrieve facts only."
- Agent "Writer": system_prompt "you craft final narrative."
- Pass only required fields and new_messages between them for clarity and cost control.

Comparisons: When to Choose Pydantic AI

If you need guarantees. If you want clean contracts. If your next step after the LLM is another system that can't parse poetry. Other frameworks bring batteries-included approaches or specialized retrieval workflows; Pydantic AI doubles down on structured, validated outputs and explicit code. That's the difference.

- You'd reach for Pydantic AI when responses feed APIs, CRMs, ERPs, or billing systems.
- You'd reach for it when dependency injection and strict models are non-negotiable.
- You'd reach for it when you want a framework that feels like modern Python, not a maze of abstractions.

Hands-On: Two Complete Mini-Projects

Let's synthesize everything with concrete build patterns you can adapt quickly.

Example 1: Invoice extractor with validation and retries
- Define Invoice(BaseModel): vendor, items: list[Item], total, currency.
- Agent: system_prompt "Extract invoices and validate numbers. If unsure, ask clarifying questions."
- Set output_type=Invoice. On validation failure, retry with a short corrective instruction: "Ensure items sum to total."
- Store the Invoice object directly in your DB.

Example 2: Finance assistant with DI and tool sets
- Dependencies: dataclass with db connection and user_id.
- Tools: list_transactions(user_id), sum_by_category, forecast_cash_flow.
- ToolSet: finance_tools.
- Agent: builtin_tools=[CodeExecutionTool()], toolsets=[finance_tools], deps_type=Deps.
- Run: "What were my top three categories last month and a projection for next month?" Return a structured Report model.

Study Checkpoints and Reflection

- Can you initialize an Agent, set a system prompt, and run it asynchronously?
- Can you define a Pydantic BaseModel and enforce it as output_type?
- Do you know when to use response.all_messages() vs. response.new_messages()?
- Can you register a simple tool with @agent.tool_plain and a context-aware tool with @agent.tool?
- Can you set request timeouts and agent retries for reliability?
- Have you used WebSearchTool, CodeExecutionTool, and Embedder in practice?
- Do you understand how to design a dataclass for dependency injection (DBs, APIs, sessions)?
- Can you explain system_prompt vs. instructions and why that distinction is useful?
- Do you know how to integrate or conceptually plan for MCP-hosted tools?

Troubleshooting and Gotchas

- If structured output fails, inspect validation errors. Add tighter field definitions (Enums, stricter types) or tune your system prompt and examples.
- If tools aren't called, refine tool docstrings and names: make them obviously suited for the task. Ambiguous names lead to poor tool selection.
- If conversations drift, prefer instructions for per-turn guidance and keep system_prompt tight.
- If timeouts hit too often, profile the tool and consider caching or batching external requests.
- For embeddings, inconsistent chunking leads to inconsistent retrieval. Standardize chunk sizes and overlap policies across your pipeline.
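A standardized chunking policy is easy to pin down in one function. A stdlib sketch with fixed size and overlap (the numbers are illustrative; tune them for your corpus and tokenizer):

```python
def chunk_text(text: str, size: int = 20, overlap: int = 5) -> list[str]:
    """Split text into fixed-size character chunks with a fixed overlap."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

text = "0123456789" * 5  # 50 characters of sample "document" text
chunks = chunk_text(text, size=20, overlap=5)
# Each chunk shares its last 5 characters with the start of the next one,
# so passages spanning a boundary are still retrievable.
```

Using one such function across ingestion and re-indexing is what keeps retrieval consistent; ad-hoc per-pipeline chunking is where drift creeps in.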

Extra Examples for Mastery

To reinforce the major concepts, here are additional quick-hit patterns.

Structured output: classification
class Intent(BaseModel):
  label: str
  confidence: float
intent_agent = Agent(model="openai:gpt-4o", output_type=Intent)
# "Is this a refund request?" → Intent(label="refund_request", confidence=0.92)

Conversation control: user profiles
res1 = await agent.run("I prefer short answers.")
res2 = await agent.run("What's a decorator in Python?", message_history=res1.all_messages())
# The preference persists because it's reflected in history/system behavior.

Tools: math helper
@agent.tool_plain
def add(a: int, b: int) -> int:
  """Return the sum of two integers."""
  return a + b
# "Add 17 and 43." → tool call → 60

DI: feature flags
@dataclass
class DepsFlags:
  flags: dict[str, bool]
ff_agent = Agent(model="openai:gpt-4o", deps_type=DepsFlags)
@ff_agent.tool
def is_enabled(ctx: RunContext[DepsFlags], flag: str) -> bool:
  """Check if a feature flag is enabled."""
  return ctx.deps.flags.get(flag, False)

Retries: flaky external service
flaky_agent = Agent(model="openai:gpt-4o", retries=3)
res = await flaky_agent.run("Query currency rates and summarize.")

Embeddings: similarity scoring
q_vec = await embedder.embed_query("reset password help")
# Compare against FAQ vectors; return the nearest article with a threshold.

MCP: analytics tools
# mcp_tools could expose run_report(name: str, params: dict)
# Agent asks MCP for "run_report('revenue_by_region', {...})" and returns a structured Report model.

Key Insights & Takeaways

- Validation is paramount: Pydantic AI turns slippery text into dependable data models.
- Clean and explicit code: the framework encourages clarity and maintainability, similar to FastAPI's ethos.
- Dependency injection is a game-changer: pass real context to tools without global state.
- Fine-grained context control: system_prompt persists, instructions don't; use this to your advantage.
- Flexible tool management: decorators, init lists, and per-run options help you build modular systems.
- Built-ins and integrations: streaming, timeouts, retries, web search, code execution, embeddings, and MCP give you everything you need for robust agents.
- "Why use the derivative when you can go straight to the source." When other frameworks bolt on Pydantic, Pydantic AI centers it. That's the point.

Action Plan: From Zero to Production

- Start small: build one agent with a strict output_type for a real task (e.g., Contact extraction).
- Add a simple tool: enrich results with a currency converter or lookup API.
- Add dependency injection: connect a real database, but keep the dataclass minimal.
- Wire in streaming for UX and timeouts/retries for reliability.
- Introduce embeddings and a vector store for RAG if you're surfacing knowledge.
- Add observability: logging, metrics, and basic dashboards.
- Package behind an API (FastAPI is a natural fit), and ship it.
- Iterate with tests and stricter models as you learn from real data.

Conclusion

Pydantic AI gives you a disciplined way to build agents that play well with the rest of your software stack. Instead of forcing unstructured text into systems that demand structure, you define the structure first and let the model meet that contract. You get typed, validated outputs; a clean tool pattern; dependency injection for real-world context; and the reliability features you need to trust this in production: streaming, timeouts, retries, embeddings, and remote tool integrations.

If you're serious about building AI that does real work (interfacing with APIs, writing to databases, orchestrating tools), this framework will feel like home. Start with one agent, one model, one output type. Add tools. Inject context. Observe. Improve. You'll quickly move from demos to dependable systems that your team and your customers can rely on.

Now, go build something real.

Frequently Asked Questions

This FAQ gives direct, practical answers about using Pydantic AI to build production-grade agentic systems. It moves from fundamentals to advanced deployment, with examples you can adapt to client projects, internal tools, or data products. Each answer highlights key points and provides a short example so you can move from reading to building in minutes.

What is Pydantic AI?

Pydantic AI is a Python framework for building Large Language Model (LLM) applications with strict, validated, and predictable outputs. It leans on Pydantic's models to enforce schemas, reduce parsing errors, and keep your interfaces stable across services. You create an Agent, configure prompts, add tools, and optionally enforce a Pydantic BaseModel for structured replies. This keeps your app reliable even under messy user input or noisy model outputs. Key point:
Pydantic AI centers on type safety and schema-enforced outputs, which is essential for production where responses feed downstream systems.
Example:
- Enforce an "Invoice" BaseModel for an AP workflow, then call external payment and ERP tools safely knowing the structure is valid.

How does Pydantic AI compare to LangChain and LlamaIndex?

All three help you build agentic apps, but their focus differs. LangChain offers many integrations and chains out of the box, which is great for rapid prototyping across varied components. LlamaIndex focuses on RAG pipelines (ingest, index, query) with strong retrieval utilities. Pydantic AI emphasizes minimalism, type safety, and structured outputs by default. If your priority is dependable schemas, fewer moving parts, and clean control over behavior, Pydantic AI fits well. Key point:
Pick the framework that maps to your primary risk: Pydantic AI for predictable structure, LangChain for broad integrations, LlamaIndex for RAG depth.
Example:
- Financial reporting assistant: choose Pydantic AI to guarantee schema-validated results consumed by BI tools.

What are the main steps to set up a Pydantic AI project?

Setup involves isolating dependencies, installing packages, and managing API keys. 1) Create a virtual environment. 2) Install packages: pip install pydantic-ai python-dotenv jupyterlab. 3) Add your provider key to a .env file (e.g., OPENAI_API_KEY="..."). 4) Load environment variables with python-dotenv before running your agent. From there, you can initialize an Agent, set a system prompt, and run test calls. Key point:
Keep secrets in .env files and never commit them; use environment variables in CI/CD.
Example:
- Commands: pip install pydantic-ai python-dotenv jupyterlab
- .env: OPENAI_API_KEY="sk-xxxxx"
- Code: from dotenv import load_dotenv; load_dotenv()

How do I create and run a basic Pydantic AI agent?

Instantiate Agent with a model and a system prompt, then call run within an async function. If you're new to async, treat it like a non-blocking call that lets I/O operations happen in parallel. Access the final message via response.output. For simple scripts, you can also use run_sync. Key point:
Use async/await for server apps and tools; use run_sync for quick scripts or CLI utilities.
Example:
- agent = Agent(model="openai:gpt-4o", system_prompt="Concise answers.")
- async def main(): resp = await agent.run("Define recursion."); print(resp.output)

What is the difference between run() and run_sync()?

run() is asynchronous and must be awaited inside an async function. It plays nicely with web servers and other async I/O. run_sync() is synchronous and blocks until the result is ready, which is handy for scripts and environments without an event loop. Key point:
Pick run() for scalable services; pick run_sync() for REPLs, cron jobs, or simple CLI flows.
Example:
- Async: response = await agent.run("...")
- Sync: response = agent.run_sync("...")

How can I get structured output from an LLM?

Define a Pydantic BaseModel for your target schema and pass it as output_type to the Agent. The framework validates the model's response and raises errors if fields are missing or types don't match. This is ideal when your output feeds into databases, APIs, or analytics. Key point:
Validated outputs mean fewer brittle regex/parsing hacks and safer downstream automation.
Example:
- class Person(BaseModel): name: str; age: int; job: str
- agent = Agent(..., output_type=Person)
- resp = await agent.run("Mike is a 20-year-old plumber.")
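The validation the framework applies is ordinary Pydantic validation, so the schema can be exercised on its own. A sketch: the Person model below mirrors the example above, the agent wiring is shown in comments (it assumes pydantic-ai and an API key), and the sample payloads are invented for illustration.

```python
from pydantic import BaseModel, ValidationError

# Target schema: the agent's reply must parse into exactly these fields.
class Person(BaseModel):
    name: str
    age: int
    job: str

# Wiring it into an agent (sketch):
#   from pydantic_ai import Agent
#   agent = Agent(model="openai:gpt-4o", output_type=Person)
#   resp = await agent.run("Mike is a 20-year-old plumber.")
#   resp.output is then a validated Person instance.

# The same validation, applied directly:
ok = Person.model_validate({"name": "Mike", "age": 20, "job": "plumber"})
print(ok.age)  # 20

try:
    Person.model_validate({"name": "Mike", "age": "twenty-ish", "job": "plumber"})
except ValidationError as e:
    print("rejected:", e.error_count(), "error(s)")
```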

How can I stream responses from an agent?

Use run_stream() to read the model's tokens as they're generated. Wrap it in an async context, iterate over stream_text(delta=True), and forward chunks to your UI. This improves perceived latency and user engagement for longer outputs. Key point:
Streaming is a UX optimization: render partials immediately and finalize when the stream closes.
Example:
- async with agent.run_stream("Summarize this report") as run:
  async for chunk in run.stream_text(delta=True): print(chunk, end="")

How does an agent remember the conversation history?

Agents are stateless unless you pass prior messages. Capture response1 and pass response1.all_messages() as message_history when calling agent.run again. For shorter carryover, use response.new_messages() (just the last turn). This gives you control over how much context you include and helps manage token costs. Key point:
State is explicit; you decide what to persist for the next turn.
Example:
- resp1 = await agent.run("What is Python?")
- resp2 = await agent.run("Give a short origin fact.", message_history=resp1.all_messages())
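A sketch of explicit memory passing as an async function (assumes pydantic-ai and a configured API key; the prompts are illustrative):

```python
from pydantic_ai import Agent

agent = Agent(model="openai:gpt-4o", system_prompt="Be brief.")

async def chat() -> None:
    resp1 = await agent.run("What is Python?")
    # all_messages() carries every prior message forward;
    # new_messages() would carry only the most recent turn.
    resp2 = await agent.run(
        "Give a short origin fact.",
        message_history=resp1.all_messages(),
    )
    print(resp2.output)
```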

What is the difference between system_prompt and instructions?

system_prompt defines durable behavior and is included when you pass all_messages between runs or agents. instructions are ephemeral,only applied to the current run and not carried forward. Use instructions for one-off constraints (e.g., "answer in 50 words") without altering future behavior. Key point:
Use system_prompt for identity; use instructions for single-turn nuance.
Example:
- Agent has "be a legal assistant" system_prompt.
- One call adds instructions: "Return only bullet points." Next turn returns to the baseline behavior.

How do I give an agent access to custom tools?

Create regular Python functions with type hints and clear docstrings; register them with @agent.tool_plain or via tools=[...]. The agent can decide when to call them based on the model's function-calling abilities and the tool's signature. Use @agent.tool (not tool_plain) when you need dependency injection via RunContext. Key point:
Type hints and docstrings are your contract: all arguments and behavior should be clear.
Example:
- @agent.tool_plain
def get_favorite_color(name: str) -> str: """Returns a color."""
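Because a tool is just a typed Python function, its logic can be written and tested without any agent at all. In the sketch below the lookup table and fallback value are invented for illustration; the commented decorator shows how the function would be registered on an existing pydantic-ai Agent.

```python
# Hypothetical lookup data for illustration only.
FAVORITE_COLORS = {"ada": "blue", "grace": "green"}

# Registration on an existing agent would look like:
#   @agent.tool_plain
def get_favorite_color(name: str) -> str:
    """Return the favorite color recorded for the given person.

    The type hints and this docstring are exactly what the model reads
    to decide when and how to call the tool.
    """
    return FAVORITE_COLORS.get(name.lower(), "unknown")

print(get_favorite_color("Ada"))  # blue
```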

What is Dependency Injection and why is it useful?

Dependency Injection (DI) lets you pass external context (user, db, configs) into tools at run time without hardcoding globals. You define deps_type on the Agent, receive a RunContext in tools, and access ctx.deps. This pattern enables testability, multi-tenant isolation, and clean separation between logic and environment. Key point:
DI gives you safe, explicit access to resources per request or user.
Example:
- Agent(..., deps_type=MyDeps) where @dataclass class MyDeps: username: str; db: Connection
- @agent.tool def read_orders(ctx: RunContext[MyDeps]) -> list: use ctx.deps.db

How do I use simple Dependency Injection?

Set deps_type on the Agent to a simple type (e.g., int) and annotate the tool with RunContext[int]. Then pass deps=... at run time. This pattern applies equally to strings, dicts, or small data classes when you don't need many fields. Key point:
DI scales from one scalar to complex graphs; start small and refactor into data classes as needs grow.
Example:
- roulette_agent = Agent(..., deps_type=int, output_type=bool)
- @roulette_agent.tool async def guess(ctx: RunContext[int], n: int) -> str: return "Won" if n==ctx.deps else "Lost"
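The decision logic of such a tool is ordinary Python and can be checked on its own; only the RunContext wrapper is framework-specific. A sketch, with the agent wiring in comments (it assumes pydantic-ai; the helper name resolve_guess is invented):

```python
def resolve_guess(winning_number: int, guessed: int) -> str:
    """Core rule: compare the injected winning number to the user's guess."""
    return "Won" if guessed == winning_number else "Lost"

# Inside an agent tool, the dependency arrives via ctx.deps:
#   @roulette_agent.tool
#   async def guess(ctx: RunContext[int], n: int) -> str:
#       return resolve_guess(ctx.deps, n)
#
# And at run time the scalar is supplied per call:
#   await roulette_agent.run("I guess 7", deps=18)

print(resolve_guess(18, 18))  # Won
print(resolve_guess(18, 7))   # Lost
```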

How can I inject complex dependencies like a database connection?

Create a dataclass that holds everything your tools need (db connections, api clients, user ids). Set deps_type to that dataclass and access ctx.deps in your tools. This keeps your tools pure and swaps environments easily in tests or staging. Key point:
Bundle related dependencies into one dataclass for clarity and evolvability.
Example:
- @dataclass class Deps: username: str; db: sqlite3.Connection
- agent = Agent(..., deps_type=Deps)
- @agent.tool def list_employees(ctx: RunContext[Deps]) -> list: query ctx.deps.db
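Bundling dependencies in a dataclass lets the tool body stay a plain function over that bundle, which makes it trivial to exercise against an in-memory database. A sketch (the table name and columns are invented; the commented registration assumes pydantic-ai):

```python
import sqlite3
from dataclasses import dataclass

@dataclass
class Deps:
    username: str
    db: sqlite3.Connection

def list_employees(deps: Deps) -> list[str]:
    """Read employee names using the injected connection."""
    rows = deps.db.execute("SELECT name FROM employees ORDER BY name").fetchall()
    return [name for (name,) in rows]

# Agent wiring (sketch):
#   agent = Agent(..., deps_type=Deps)
#   @agent.tool
#   def list_employees_tool(ctx: RunContext[Deps]) -> list[str]:
#       return list_employees(ctx.deps)

# Demo against an in-memory database:
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT)")
conn.executemany("INSERT INTO employees VALUES (?)", [("Mia",), ("Ken",)])
deps = Deps(username="admin", db=conn)
print(list_employees(deps))  # ['Ken', 'Mia']
```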

How can a dependency be used in the system prompt?

Use @agent.system_prompt to return a prompt string derived from ctx.deps. This lets you personalize the agent's persona or constraints per user/session without leaking prompts between agents. Combine this with DI to tailor voice, permissions, or locale. Key point:
System prompt composition can be dynamic while staying isolated per run.
Example:
- @agent.system_prompt def identity(ctx): return f"The user is {ctx.deps.username}."

What are Tool Sets?

Tool Sets group related functions into a reusable package (e.g., reporting, date/time, CRM). Build a FunctionToolset from a list of callables and pass it via toolsets=[...]. This is helpful for modularizing features across multiple agents. Key point:
Think of Tool Sets as capability bundles you can enable per scenario.
Example:
- datetime_tools = FunctionToolset([get_current_date, get_current_time])
- agent = Agent(..., toolsets=[datetime_tools])
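The callables inside a tool set are plain functions, so they can be verified directly; only the bundling is framework-specific. A sketch (the commented wrapper lines assume pydantic-ai's toolsets module):

```python
from datetime import datetime, timezone

def get_current_date() -> str:
    """Today's date in ISO format (UTC)."""
    return datetime.now(timezone.utc).date().isoformat()

def get_current_time() -> str:
    """Current time of day as HH:MM:SS (UTC)."""
    return datetime.now(timezone.utc).strftime("%H:%M:%S")

# Bundling and attaching (sketch):
#   from pydantic_ai.toolsets import FunctionToolset
#   datetime_tools = FunctionToolset([get_current_date, get_current_time])
#   agent = Agent(..., toolsets=[datetime_tools])

print(get_current_date(), get_current_time())
```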

How can I manage timeouts and retries for tools?

Configure tool_timeout (seconds) and retries at Agent initialization. If a tool exceeds the timeout, the framework retries up to the configured limit, then returns an error the agent can handle gracefully. Use this with idempotent tools or add internal backoff for external APIs. Key point:
Protect your user experience from slow or flaky dependencies by setting sane defaults.
Example:
- agent = Agent(..., tool_timeout=5, retries=2)
- Slow tools are cut off and retried predictably.

Does Pydantic AI have built-in tools?

Yes. Common ones include CodeExecutionTool for sandboxed Python execution and WebSearchTool for live web queries. Add them via builtin_tools=[...]. Use these to handle programmatic tasks (math, parsing, quick scripts) or gather fresh context. Always review security implications for code execution and apply guardrails. Key point:
Start with built-ins to move fast; replace with custom tools as requirements sharpen.
Example:
- agent = Agent(..., builtin_tools=[CodeExecutionTool()])
- resp = await agent.run("Compute the factorial of 24.")

How can I use embedding models with Pydantic AI?

Use the Embedder class to turn text into vectors for semantic search and RAG. You can embed queries and documents and store vectors in a vector DB. At query time, retrieve similar docs and feed them as context to the agent. Key point:
Embeddings let you connect LLMs to your proprietary data with semantic recall.
Example:
- embedder = Embedder(model="openai:text-embedding-3-small")
- vec = await embedder.embed_query("I love Python")

How can an agent consume tools from an MCP server?

An MCP server exposes tools remotely; your agent connects via an MCP client (e.g., MCPServerStreamableHTTP). Once attached as a tool set, the agent discovers and calls remote tools as if they were local. This enables distributed architectures where capabilities live behind network boundaries. Key point:
Decouple tool hosting from app code; scale and govern tools centrally.
Example:
- server = MCPServerStreamableHTTP(url="http://localhost:8000/mcp")
- agent = Agent(..., toolsets=[server])

Which models and providers are supported, and how do I configure them?

Pydantic AI works with provider-qualified model names like "openai:gpt-4o", with adapters for other providers where available. Set API keys via environment variables (.env + load_dotenv). Choose model sizes that balance quality, latency, and cost. If your org requires a gateway (e.g., Azure OpenAI), configure credentials accordingly. Key point:
Keep provider settings externalized in env vars, config files, or secrets managers.
Example:
- .env: OPENAI_API_KEY=...
- agent = Agent(model="openai:gpt-4o")

How do I run Pydantic AI in synchronous environments or notebooks?

In plain scripts, use run_sync(). In notebooks, you can use await directly inside async cells or leverage utilities that manage the event loop for you. For web servers (FastAPI, Starlette), stick to async run(). Avoid mixing blocking calls inside async handlers. Key point:
Pick one concurrency model per app; don't nest loops or block the event loop.
Example:
- Notebook cell: response = await agent.run("...")
- Script: response = agent.run_sync("...")

How do I handle token limits and manage conversation history?

Cap history length by passing only response.new_messages() or by summarizing older turns into a compact note. You can also strip irrelevant tool calls and keep only essential facts. For RAG, prefer retrieved context over long raw history. Key point:
Use intentional memory: pass the minimum needed to answer the next turn.
Example:
- Keep: last user prompt + last assistant answer + a short summary of prior context.
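One way to apply that policy is a small trimming helper that keeps a running summary plus the most recent messages. The message shape below is a simplified dict, not the framework's message type, and the helper name is invented for illustration:

```python
def trim_history(messages: list[dict], summary: str, keep_last: int = 2) -> list[dict]:
    """Keep a compact summary of older turns plus the most recent messages."""
    head = [{"role": "system", "content": f"Summary of earlier turns: {summary}"}]
    return head + messages[-keep_last:]

history = [
    {"role": "user", "content": "What is Python?"},
    {"role": "assistant", "content": "A programming language."},
    {"role": "user", "content": "Who created it?"},
    {"role": "assistant", "content": "Guido van Rossum."},
]
trimmed = trim_history(history, summary="User asked about Python basics.")
print(len(trimmed))  # 3
```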

How can I cache LLM or tool results to reduce cost and latency?

Wrap agent.run calls with your own cache layer keyed by prompt + system prompt + toolset signature. Cache deterministic tools directly (e.g., product catalog queries). For non-deterministic outputs, attach versioned prompts and TTL-based cache policies. Key point:
Cache where the function is deterministic or you can accept staleness.
Example:
- In-memory LRU for dev; Redis/Memcached in production keyed by hash(prompt_config).
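A minimal cache layer along those lines, keyed by a hash of the prompt configuration. The run function here is a stand-in you would replace with something like `lambda p: agent.run_sync(p).output`; the fake_run helper exists only to demonstrate the hit/miss behavior:

```python
import hashlib
import json

_cache: dict[str, str] = {}

def cache_key(prompt: str, system_prompt: str, model: str) -> str:
    """Stable key over everything that influences the answer."""
    blob = json.dumps({"p": prompt, "s": system_prompt, "m": model}, sort_keys=True)
    return hashlib.sha256(blob.encode()).hexdigest()

def cached_run(prompt: str, system_prompt: str, model: str, run_fn) -> str:
    key = cache_key(prompt, system_prompt, model)
    if key not in _cache:
        _cache[key] = run_fn(prompt)  # only invoked on a cache miss
    return _cache[key]

calls = []
def fake_run(p: str) -> str:
    calls.append(p)
    return f"answer to {p}"

a = cached_run("hi", "be brief", "openai:gpt-4o", fake_run)
b = cached_run("hi", "be brief", "openai:gpt-4o", fake_run)
print(a == b, len(calls))  # True 1
```

In production, swap the dict for Redis/Memcached and attach a TTL so non-deterministic answers age out.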

How do I log, trace, and debug agent runs and tool calls?

Log the following: system_prompt hash, instructions, model name, input, output, tool calls (name, args, duration, result), errors, and retries. Add correlation IDs per request to trace through services. Redact PII before persisting logs. Key point:
Good logs make intermittent production issues obvious and reproducible.
Example:
- Log JSON lines: {run_id, prompt_id, tool:"get_price", ms:143, ok:true}
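A sketch of a JSON-lines record builder for tool calls; the field names mirror the example above and are a suggestion, not a framework API. It assumes args are redacted before being passed in:

```python
import json
import time
import uuid

def tool_log_record(run_id: str, tool: str, args: dict, fn) -> str:
    """Call a tool, time it, and emit one JSON log line."""
    start = time.perf_counter()
    try:
        result = fn(**args)
        ok = True
    except Exception:
        result, ok = None, False
    ms = int((time.perf_counter() - start) * 1000)
    return json.dumps({"run_id": run_id, "tool": tool, "args": args, "ms": ms, "ok": ok})

# Demo with a trivial stand-in tool:
line = tool_log_record(str(uuid.uuid4()), "get_price", {"sku": "A1"}, lambda sku: 9.99)
record = json.loads(line)
print(record["tool"], record["ok"])  # get_price True
```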

Certification

About the Certification

Get certified in building type-safe LLM agents with Pydantic AI. Prove you can return validated typed data, wire tools and dependencies, stream results, set timeouts/retries, integrate search, code, embeddings, and ship reliable production services.

Official Certification

Upon successful completion of the "Certification in Building Production-Ready Type-Safe LLM Agents with Pydantic", you will receive a verifiable digital certificate. This certificate demonstrates your expertise in the subject matter covered in this course.

Benefits of Certification

  • Enhance your professional credibility and stand out in the job market.
  • Validate your skills and knowledge in cutting-edge AI technologies.
  • Unlock new career opportunities in the rapidly growing AI field.
  • Share your achievement on your resume, LinkedIn, and other professional platforms.

How to complete your certification successfully?

To earn your certification, you’ll need to complete all video lessons, study the guide carefully, and review the FAQ. After that, you’ll be prepared to pass the certification requirements.

Join 20,000+ Professionals Using AI to Transform Their Careers

Join professionals who didn't just adapt; they thrived. You can too, with AI training designed for your job.