PydanticAI SDK Track Part 7: Hooks, Agent Specs & Extensibility

                        
What You’ll Learn: Hooks let you intercept an agent’s tool calls and model requests without changing its core logic. This article covers registering hooks, the pre-tool-call, post-tool-call, and model-request hook types, declarative agent specs and contract testing with TestModel, custom model backends, and composing hooks into middleware-style pipelines.
                    

1. Hooks Overview

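Hooks are lifecycle callbacks that intercept agent behavior at key points during execution: before and after each tool call, and before and after each model request. They let you add logging, metrics collection, access control, input/output transformation, and caching without modifying the agent’s core logic. The decorator and context-class names in the examples are the ones this series uses; confirm them against the API reference for your installed PydanticAI version.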
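To make the lifecycle concrete, here is a minimal, library-independent sketch of the idea: callbacks registered in two lists run before and after a tool function. `ToolEvent` and `HookRegistry` are hypothetical names invented for this illustration and are not PydanticAI classes.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any, Awaitable, Callable


# Hypothetical names for illustration only; not part of PydanticAI.
@dataclass
class ToolEvent:
    tool_name: str
    args: dict[str, Any]
    result: Any = None


Hook = Callable[[ToolEvent], Awaitable[None]]


@dataclass
class HookRegistry:
    before: list[Hook] = field(default_factory=list)
    after: list[Hook] = field(default_factory=list)

    async def call_tool(
        self, name: str, fn: Callable[..., Awaitable[Any]], args: dict[str, Any]
    ) -> Any:
        event = ToolEvent(tool_name=name, args=args)
        for hook in self.before:  # runs in registration order
            await hook(event)
        event.result = await fn(**args)
        for hook in self.after:  # sees the result of the call
            await hook(event)
        return event.result


log: list[str] = []


async def log_before(event: ToolEvent) -> None:
    log.append(f"before {event.tool_name} {event.args}")


async def log_after(event: ToolEvent) -> None:
    log.append(f"after {event.tool_name} -> {event.result}")


async def get_user_info(user_id: str) -> str:
    return f"User {user_id}: name=Alice"


registry = HookRegistry(before=[log_before], after=[log_after])
output = asyncio.run(
    registry.call_tool("get_user_info", get_user_info, {"user_id": "USR-001"})
)
print(output)
print(log)
```

The tool function itself contains no logging code; the cross-cutting behavior lives entirely in the registered callbacks, which is the property the rest of this article relies on.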

1.1 Common Use Cases

                        
                        Hook Use Cases: (1) Observability — log every tool call and model request for debugging. (2) Security — block or redact sensitive data before it reaches the model. (3) Performance — cache repeated tool results. (4) Cost control — track token usage across runs. (5) Testing — mock tool responses during development.
                    

1.2 Basic Hook Registration

from pydantic_ai import Agent
from pydantic_ai.agent import CallContext
import time

agent = Agent("openai:gpt-4o", system_prompt="You are a helpful assistant.")

@agent.on_tool_call
async def log_tool_calls(context: CallContext) -> None:
    """Log every tool call for observability."""
    print(f"[HOOK] Tool called: {context.tool_name}")
    print(f"[HOOK] Arguments: {context.args}")
    print(f"[HOOK] Timestamp: {time.strftime('%Y-%m-%d %H:%M:%S')}")

@agent.tool_plain  # tool_plain: the function takes no RunContext argument
async def get_user_info(user_id: str) -> str:
    """Get user information by ID.

    Args:
        user_id: The user's unique identifier.
    """
    return f"User {user_id}: name=Alice, email=alice@example.com"

result = agent.run_sync("Look up user USR-001")
print(result.output)

2. Hook Types in Detail

2.1 Pre Tool Call Hook

The pre-tool-call hook fires before a tool executes. You can inspect the tool name and arguments, modify them, or even prevent execution by raising an exception:

from datetime import datetime, timezone
import json

from pydantic_ai import Agent
from pydantic_ai.agent import PreToolCallContext

agent = Agent("openai:gpt-4o")

@agent.on_pre_tool_call
async def audit_and_validate(context: PreToolCallContext) -> None:
    """Audit log and validate tool inputs before execution."""
    # Log for audit trail, stamped with the current UTC time
    audit_entry = {
        "event": "tool_call_initiated",
        "tool": context.tool_name,
        "args": context.args,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    print(f"[AUDIT] {json.dumps(audit_entry)}")

    # Block dangerous operations: raising here prevents the tool from running
    if context.tool_name == "delete_record" and not context.args.get("confirmed"):
        raise ValueError("Delete operations require explicit confirmation")

@agent.tool_plain  # tool_plain: the function takes no RunContext argument
async def fetch_record(record_id: str) -> str:
    """Fetch a record from the database.

    Args:
        record_id: The record identifier.
    """
    return f"Record {record_id}: status=active, created=2026-01-15"

@agent.tool_plain
async def delete_record(record_id: str, confirmed: bool = False) -> str:
    """Delete a record from the database.

    Args:
        record_id: The record to delete.
        confirmed: Must be True to proceed with deletion.
    """
    return f"Record {record_id} deleted."

result = agent.run_sync("Fetch record REC-456")
print(result.output)
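The veto behavior described in this section can be sketched without any agent framework. The example below assumes a guard function that raises to block a call and a wrapper that runs the guard first; `ToolBlocked`, `require_confirmation`, and `guarded_call` are hypothetical names used only for this illustration.

```python
import asyncio
from typing import Any, Awaitable, Callable


class ToolBlocked(Exception):
    """Raised by a guard to veto a tool call (hypothetical name)."""


def require_confirmation(tool_name: str, args: dict[str, Any]) -> None:
    # Same rule as the hook above: deletes need confirmed=True
    if tool_name == "delete_record" and not args.get("confirmed"):
        raise ToolBlocked("Delete operations require explicit confirmation")


executed: list[str] = []  # records which deletions actually ran


async def delete_record(record_id: str, confirmed: bool = False) -> str:
    executed.append(record_id)
    return f"Record {record_id} deleted."


async def guarded_call(
    tool_name: str, fn: Callable[..., Awaitable[str]], args: dict[str, Any]
) -> str:
    try:
        require_confirmation(tool_name, args)  # guard runs before the tool
    except ToolBlocked as exc:
        return f"Blocked: {exc}"  # the tool body never runs
    return await fn(**args)


blocked = asyncio.run(
    guarded_call("delete_record", delete_record, {"record_id": "REC-456"})
)
allowed = asyncio.run(
    guarded_call(
        "delete_record", delete_record, {"record_id": "REC-456", "confirmed": True}
    )
)
print(blocked)
print(allowed)
print(executed)
```

Only the confirmed call reaches the tool body, so `executed` contains a single entry. Using a dedicated exception type rather than a bare `ValueError` makes it easier for callers to tell a policy veto apart from a genuine input error.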

2.2 Post Tool Call Hook

The post-tool-call hook fires after a tool returns. Use it to transform results, track metrics, or log outcomes:

import asyncio

from pydantic_ai import Agent
from pydantic_ai.agent import PostToolCallContext

# Metrics collector: tool name -> list of durations in milliseconds
tool_metrics: dict[str, list[float]] = {}

agent = Agent("openai:gpt-4o")

@agent.on_post_tool_call
async def collect_metrics(context: PostToolCallContext) -> None:
    """Track per-tool execution time and a running average."""
    duration = context.duration_ms
    tool_name = context.tool_name

    tool_metrics.setdefault(tool_name, []).append(duration)

    avg_time = sum(tool_metrics[tool_name]) / len(tool_metrics[tool_name])
    print(f"[METRICS] {tool_name}: {duration:.1f}ms (avg: {avg_time:.1f}ms)")

@agent.tool_plain  # tool_plain: the function takes no RunContext argument
async def search_products(query: str, category: str = "all") -> str:
    """Search the product catalog.

    Args:
        query: Search terms.
        category: Product category filter.
    """
    await asyncio.sleep(0.05)  # Simulate latency without blocking the event loop
    return f"Found 12 products matching '{query}' in {category}"

result = agent.run_sync("Find me wireless headphones in electronics")
print(result.output)
print(f"\nCollected metrics: {tool_metrics}")
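If the hook context in your version does not expose a duration, the same measurement can be taken with a plain decorator. This is a sketch under that assumption; `timed` and `durations_ms` are hypothetical names, and the decorator wraps the tool function directly instead of relying on any hook API.

```python
import asyncio
import functools
import time
from collections import defaultdict

# Function name -> list of wall-clock durations in milliseconds
durations_ms: dict[str, list[float]] = defaultdict(list)


def timed(fn):
    """Record the wall-clock duration of each call to an async function."""

    @functools.wraps(fn)
    async def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return await fn(*args, **kwargs)
        finally:
            # The finally block means failed calls are timed as well
            elapsed = (time.perf_counter() - start) * 1000
            durations_ms[fn.__name__].append(elapsed)

    return wrapper


@timed
async def search_products(query: str, category: str = "all") -> str:
    await asyncio.sleep(0.05)  # simulated latency
    return f"Found 12 products matching '{query}' in {category}"


async def main() -> str:
    await search_products("headphones", "electronics")
    return await search_products("speakers")


last = asyncio.run(main())
samples = durations_ms["search_products"]
print(last)
print(f"calls={len(samples)} avg={sum(samples) / len(samples):.1f}ms")
```

`time.perf_counter` is used because it is monotonic, so the measurement is not affected by system clock adjustments.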

2.3 Model Request Hooks

Model request hooks intercept the messages before they’re sent to the LLM and after the response returns. This is powerful for content filtering, token tracking, and response caching:

import re

from pydantic_ai import Agent
from pydantic_ai.agent import PreModelRequestContext, PostModelRequestContext

# Token usage tracker
total_tokens = {"input": 0, "output": 0, "cost_usd": 0.0}

# Compiled once at import time rather than on every hook call
EMAIL_RE = re.compile(r'[\w.-]+@[\w.-]+\.\w+')
PHONE_RE = re.compile(r'\b\d{3}[-.]?\d{3}[-.]?\d{4}\b')

agent = Agent("openai:gpt-4o")

@agent.on_pre_model_request
async def redact_pii(context: PreModelRequestContext) -> None:
    """Redact PII patterns before sending to model."""
    for message in context.messages:
        # PydanticAI messages keep their text in `parts`, not in a top-level `content`
        for part in getattr(message, "parts", []):
            if isinstance(getattr(part, "content", None), str):
                # Redact email addresses, then phone numbers
                part.content = EMAIL_RE.sub('[EMAIL_REDACTED]', part.content)
                part.content = PHONE_RE.sub('[PHONE_REDACTED]', part.content)

@agent.on_post_model_request
async def track_token_usage(context: PostModelRequestContext) -> None:
    """Track cumulative token usage and estimated cost."""
    usage = context.usage
    if usage:
        total_tokens["input"] += usage.input_tokens
        total_tokens["output"] += usage.output_tokens
        # Approximate GPT-4o pricing
        cost = (usage.input_tokens * 2.50 + usage.output_tokens * 10.00) / 1_000_000
        total_tokens["cost_usd"] += cost
        print(f"[TOKENS] +{usage.input_tokens} in / +{usage.output_tokens} out (total cost: ${total_tokens['cost_usd']:.4f})")

@agent.tool_plain  # tool_plain: the function takes no RunContext argument
async def lookup_customer(name: str) -> str:
    """Look up customer details by name.

    Args:
        name: Customer name to search for.
    """
    return f"Customer: {name}, email: alice@company.com, phone: 555-123-4567"

result = agent.run_sync("Find customer Alice and summarize her info")
print(result.output)
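The two hooks above boil down to a pair of pure functions that are easy to check in isolation. The sketch below reuses the article's regular expressions and its approximate per-million-token prices; the function names `redact` and `estimate_cost_usd` are hypothetical, and the token counts are made-up inputs for the arithmetic.

```python
import re

EMAIL_RE = re.compile(r"[\w.-]+@[\w.-]+\.\w+")
# Covers 10-digit numbers such as 555-123-4567; other formats are not matched
PHONE_RE = re.compile(r"\b\d{3}[-.]?\d{3}[-.]?\d{4}\b")


def redact(text: str) -> str:
    """Replace email addresses and phone numbers with placeholders."""
    text = EMAIL_RE.sub("[EMAIL_REDACTED]", text)
    return PHONE_RE.sub("[PHONE_REDACTED]", text)


def estimate_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_price_per_m: float = 2.50,  # article's approximate price, USD per 1M tokens
    output_price_per_m: float = 10.00,
) -> float:
    """Estimate request cost from token counts and per-million-token prices."""
    total = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return total / 1_000_000


sample = "Customer: Alice, email: alice@company.com, phone: 555-123-4567"
redacted = redact(sample)
cost = estimate_cost_usd(1200, 350)
print(redacted)
print(f"${cost:.4f}")
```

For 1,200 input tokens and 350 output tokens the estimate is (1200 × 2.50 + 350 × 10.00) / 1,000,000 = 0.0065 USD. Pattern-based redaction of this kind is a first line of defense only: it misses names, addresses, and phone numbers in other formats.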

3. Agent Specs

Real-World Application

Insurance Claims Processing

An insurer built a multi-agent pipeline: Agent 1 extracts claim data (typed ClaimData model), Agent 2 assesses damage (typed DamageAssessment), Agent 3 calculates payout (typed PayoutDecision). Each agent’s output is validated before passing to the next. Result: end-to-end claims processing in 30 seconds with full audit trail.

Tags: Insurance, Multi-Agent Pipeline

Agent specifications define the expected behavior of an agent in a declarative format. They serve as documentation, enable contract-based testing, and can generate API references automatically:

from pydantic_ai import Agent
from pydantic import BaseModel

class CustomerQuery(BaseModel):
    """Structured customer query response."""
    customer_name: str
    account_status: str
    recent_orders: int
    recommendation: str

agent = Agent(
    "openai:gpt-4o",
    output_type=CustomerQuery,
    system_prompt=(
        "You are a customer service agent. "
        "Look up customer information and provide structured responses."
    ),
    name="customer-service-agent",
    description="Handles customer inquiries with structured output",
)

@agent.tool_plain  # tool_plain: the function takes no RunContext argument
async def get_customer_data(customer_id: str) -> str:
    """Retrieve customer data from CRM.

    Args:
        customer_id: The customer's unique ID.
    """
    return (
        f"Customer {customer_id}: name=Bob Smith, status=active, "
        f"orders_last_30d=7, lifetime_value=$4,500"
    )

# Run with structured output
result = agent.run_sync("Get info for customer CUST-789")
print(f"Name: {result.output.customer_name}")
print(f"Status: {result.output.account_status}")
print(f"Recent Orders: {result.output.recent_orders}")
print(f"Recommendation: {result.output.recommendation}")
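One way to picture a declarative spec that supports contract checks is as a small data structure plus a function that compares a run against it. This sketch is library-independent: `AgentSpec` and `check_contract` are hypothetical names, not PydanticAI APIs, and the sample outputs are invented for the illustration.

```python
from dataclasses import dataclass


# Hypothetical names for illustration only; not part of PydanticAI.
@dataclass
class AgentSpec:
    name: str
    description: str
    required_tools: frozenset[str]
    output_fields: dict[str, type]


def check_contract(spec: AgentSpec, tool_names: set[str], output: dict) -> list[str]:
    """Return a list of contract violations; an empty list means the run conforms."""
    problems = [f"missing tool: {t}" for t in sorted(spec.required_tools - tool_names)]
    for field_name, expected in spec.output_fields.items():
        if field_name not in output:
            problems.append(f"missing field: {field_name}")
        elif not isinstance(output[field_name], expected):
            problems.append(
                f"wrong type for {field_name}: expected {expected.__name__}"
            )
    return problems


spec = AgentSpec(
    name="customer-service-agent",
    description="Handles customer inquiries with structured output",
    required_tools=frozenset({"get_customer_data"}),
    output_fields={
        "customer_name": str,
        "account_status": str,
        "recent_orders": int,
        "recommendation": str,
    },
)

good = {
    "customer_name": "Bob Smith",
    "account_status": "active",
    "recent_orders": 7,
    "recommendation": "Offer a loyalty discount",
}
bad = {"customer_name": "Bob Smith", "recent_orders": "7"}

ok = check_contract(spec, {"get_customer_data"}, good)
violations = check_contract(spec, set(), bad)
print(ok)
print(violations)
```

In the PydanticAI example above, the `output_type` model already enforces the field names and types; the point of the sketch is only to show how a declarative description can double as a test oracle.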

3.1 Contract Testing with Specs

Use agent specs to write deterministic tests that verify your agent’s tool usage patterns without calling a real LLM:

from pydantic_ai import Agent
from pydantic_ai.messages import ToolCallPart
from pydantic_ai.models.test import TestModel

agent = Agent("openai:gpt-4o", system_prompt="You help with math.")

@agent.tool_plain  # tool_plain: the function takes no RunContext argument
async def calculate(expression: str) -> str:
    """Evaluate a mathematical expression.

    Args:
        expression: The math expression to evaluate (e.g., '2 + 2').
    """
    try:
        # Demo only: emptying __builtins__ does NOT make eval safe for untrusted input
        result = eval(expression, {"__builtins__": {}})
        return str(result)
    except Exception as e:
        return f"Error: {e}"

# Contract test: verify that a run reaches the 'calculate' tool
def test_agent_uses_calculator():
    """Verify the calculate tool is registered and called during a run."""
    with agent.override(model=TestModel()):
        result = agent.run_sync("What is 15 * 23?")
        # TestModel calls every registered tool; tool calls appear as
        # ToolCallPart entries inside each message's `parts`
        assert any(
            isinstance(part, ToolCallPart) and part.tool_name == "calculate"
            for message in result.all_messages()
            for part in message.parts
        )
        print("✓ Contract test passed: agent calls the calculate tool")

test_agent_uses_calculator()
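The `eval` call in `calculate` is not safe for untrusted input, even with `__builtins__` emptied, because attribute access on literals can still reach arbitrary objects. A safer approach, sketched here under the assumption that only basic arithmetic is needed, is to parse the expression with the standard `ast` module and evaluate a whitelist of node types. `safe_calculate` is a hypothetical replacement, and exponentiation is deliberately left out to avoid very large results.

```python
import ast
import operator

# Whitelisted operators; anything else is rejected
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_calculate(expression: str) -> str:
    """Evaluate +, -, *, / on numeric literals; reject everything else."""

    def evaluate(node: ast.AST):
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](evaluate(node.left), evaluate(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](evaluate(node.operand))
        raise ValueError("unsupported expression")

    try:
        tree = ast.parse(expression, mode="eval")
        return str(evaluate(tree.body))
    except (SyntaxError, ValueError, ZeroDivisionError) as exc:
        return f"Error: {exc}"


print(safe_calculate("15 * 23"))
print(safe_calculate("-(3 - 5) / 4"))
print(safe_calculate("__import__('os').system('ls')"))
print(safe_calculate("1 / 0"))
```

Function calls, names, and attribute access never reach an evaluator because their node types are not on the whitelist, so the third call returns an error string instead of executing anything.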

                        
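Testing Strategy: Use TestModel for unit tests that exercise tool registration and the agent’s message flow without LLM costs. TestModel does not reason about the prompt; by default it calls every registered tool with generated arguments, so it cannot confirm that a real model would pick the right tool. Cover that with integration tests against real models using known inputs and outputs. Agent specs bridge the gap: they document expected behavior and give both test types something to assert against.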
                    

4. Extensibility Patterns

PydanticAI’s architecture is designed for extension. You can create custom model backends for proprietary LLMs, define new tool types, and build plugin systems:

4.1 Custom Model Backend

from dataclasses import dataclass

from pydantic_ai import Agent
from pydantic_ai.messages import ModelMessage, ModelResponse, TextPart
from pydantic_ai.models import Model, ModelRequestParameters
from pydantic_ai.settings import ModelSettings

@dataclass
class CustomModelConfig:
    """Configuration for a custom model backend."""
    endpoint: str
    api_key: str
    model_name: str
    max_tokens: int = 4096

class CustomModel(Model):
    """A custom model backend for a proprietary LLM."""

    def __init__(self, config: CustomModelConfig):
        super().__init__()
        self.config = config

    # Signature as of PydanticAI 1.x; check the Model base class in your version
    async def request(
        self,
        messages: list[ModelMessage],
        model_settings: ModelSettings | None,
        model_request_parameters: ModelRequestParameters,
    ) -> ModelResponse:
        """Send messages to the custom model endpoint."""
        # In production: make HTTP request to self.config.endpoint
        # with self.config.api_key authentication
        response_text = f"Response from {self.config.model_name} at {self.config.endpoint}"
        # Responses are built from parts; plain text goes in a TextPart
        return ModelResponse(parts=[TextPart(content=response_text)], model_name=self.model_name)

    @property
    def model_name(self) -> str:
        return self.config.model_name

    @property
    def system(self) -> str:
        """Provider identifier reported alongside the model name."""
        return "custom"

# Usage with custom backend
config = CustomModelConfig(
    endpoint="https://my-llm.internal.company.com/v1/chat",
    api_key="internal-key-123",
    model_name="company-llm-v2",
)

# Pass the model instance to an Agent in place of a model-name string
agent = Agent(CustomModel(config))
result = agent.run_sync("Hello")
print(f"Custom model configured: {config.model_name}")
print(result.output)

5. Middleware & Composition

Chain multiple hooks together for middleware-like behavior. Each hook runs in registration order, creating a pipeline of cross-cutting concerns:

5.1 Timing, Caching, and Alerting Middleware with Hooks

from pydantic_ai import Agent
from pydantic_ai.agent import PostToolCallContext
import time

agent = Agent("openai:gpt-4o", system_prompt="You are a data retrieval assistant.")

# Middleware 1: Timing
@agent.on_post_tool_call
async def timing_middleware(context: PostToolCallContext) -> None:
    """Log execution time for every tool call."""
    print(f"  [TIMING] {context.tool_name}: {context.duration_ms:.0f}ms")

# Middleware 2: Result caching
_cache: dict[str, str] = {}

@agent.on_pre_tool_call
async def cache_check_middleware(context) -> None:
    """Report cache hits before a tool executes (the tool itself still runs)."""
    # Sort the arguments so the key does not depend on their order
    cache_key = f"{context.tool_name}:{sorted(context.args.items())}"
    if cache_key in _cache:
        print(f"  [CACHE] Hit for {context.tool_name}")

@agent.on_post_tool_call
async def cache_store_middleware(context: PostToolCallContext) -> None:
    """Store result in cache after successful execution."""
    cache_key = f"{context.tool_name}:{sorted(context.args.items())}"
    # Skip failed calls, but keep falsy results such as "" or 0
    if not context.error and context.result is not None:
        _cache[cache_key] = str(context.result)
        print(f"  [CACHE] Stored result for {context.tool_name}")

# Middleware 3: Error alerting
@agent.on_post_tool_call
async def error_alert_middleware(context: PostToolCallContext) -> None:
    """Alert on tool failures."""
    if context.error:
        print(f"  [ALERT] Tool {context.tool_name} failed: {context.error}")

@agent.tool_plain  # tool_plain: the function takes no RunContext argument
async def fetch_stock_price(symbol: str) -> str:
    """Fetch current stock price.

    Args:
        symbol: Stock ticker symbol (e.g., AAPL, GOOGL).
    """
    prices = {"AAPL": 189.50, "GOOGL": 175.20, "MSFT": 420.30}
    price = prices.get(symbol.upper())
    if price is None:
        return f"Symbol {symbol} not found"
    return f"{symbol.upper()}: ${price:.2f}"

result = agent.run_sync("What's Apple's stock price?")
print(f"\nFinal output: {result.output}")
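Note that the cache-check hook above only reports a hit; the tool still executes. To actually skip the tool on a hit, the cache lookup has to sit around the call. The following library-independent sketch shows that shape, assuming JSON-serializable arguments; `cache_key` and `cached_call` are hypothetical helper names.

```python
import asyncio
import json
from typing import Any, Awaitable, Callable

_cache: dict[str, str] = {}
calls = {"fetch_stock_price": 0}  # counts how often the tool body really runs


def cache_key(tool_name: str, args: dict[str, Any]) -> str:
    # sort_keys makes the key independent of argument order
    return f"{tool_name}:{json.dumps(args, sort_keys=True)}"


async def cached_call(
    tool_name: str, fn: Callable[..., Awaitable[str]], args: dict[str, Any]
) -> str:
    key = cache_key(tool_name, args)
    if key in _cache:
        return _cache[key]  # hit: the tool body is skipped
    result = await fn(**args)
    _cache[key] = result  # miss: run the tool, then store its result
    return result


async def fetch_stock_price(symbol: str) -> str:
    calls["fetch_stock_price"] += 1
    prices = {"AAPL": 189.50, "GOOGL": 175.20, "MSFT": 420.30}
    price = prices.get(symbol.upper())
    if price is None:
        return f"Symbol {symbol} not found"
    return f"{symbol.upper()}: ${price:.2f}"


async def main() -> list[str]:
    return [
        await cached_call("fetch_stock_price", fetch_stock_price, {"symbol": "AAPL"}),
        await cached_call("fetch_stock_price", fetch_stock_price, {"symbol": "AAPL"}),
    ]


results = asyncio.run(main())
print(results)
print(calls)
```

Both calls return the same string, but the tool body runs only once. For data that changes over time, such as stock prices, a real cache would also need an expiry policy.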

                        
Composition Pattern: Hooks execute in registration order, but each hook should still work on its own. When hooks need to share information, pass it through explicit shared state (module-level dicts or dependency-injected services) instead of assuming that another hook has already run.
                    

                        
                        Try It Yourself: Build a 3-agent pipeline: (1) a ‘classifier’ agent that categorizes customer messages (returns MessageType enum), (2) a ‘handler’ agent that processes the classified message (different handler per type), (3) a ‘quality checker’ that validates the response. Chain them with typed interfaces and test with 10 diverse customer messages.
                    

Next in the PydanticAI SDK Track

In Part 8: Multimodal Input & Thinking, we’ll process images, audio, video, and documents as agent inputs, enable thinking/reasoning mode for complex tasks, and configure HTTP retry strategies for production resilience.