Anthropic SDK Track Part 6: Tool Interface Design

                        
                        What You’ll Learn: Well-designed tools are the difference between an agent that works reliably and one that hallucinates or gets stuck. This article teaches you how to create tools that Claude understands intuitively — clear names, precise descriptions, validated inputs, and helpful error messages. Think of tools like a well-designed API: the better the interface, the less likely users (or agents) are to misuse it.
                    

1. Schema Design Principles

Tool schemas are the API contract between your code and Claude’s reasoning. The quality of your tool definitions directly impacts agent reliability. Good schemas make it easy for Claude to call the right tool with correct arguments; bad schemas cause hallucinated inputs, wrong tool selection, and wasted iterations.

1.1 Description Engineering

The tool description field is the most important part of a tool definition. Claude uses it to decide when to call the tool and what to expect. A good description answers three questions: What does this tool do? When should it be used? What does it return?

# ❌ BAD: Vague description — Claude can't decide when to use this
bad_tool = {
    "name": "search",
    "description": "Search for things.",
    "input_schema": {
        "type": "object",
        "properties": {"q": {"type": "string"}},
        "required": ["q"]
    }
}

# ✅ GOOD: Specific description with usage guidance and return format
good_tool = {
    "name": "search_docs",
    "description": "Search the internal documentation knowledge base by keyword query. Returns up to 10 matching documents with titles, relevance scores, and excerpt snippets. Use this when the user asks about company policies, product features, or internal procedures. Do NOT use for general web search or real-time data.",
    "input_schema": {
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": "Search query — use specific keywords, not natural language questions. Example: 'refund policy enterprise' not 'what is the refund policy?'"
            },
            "category": {
                "type": "string",
                "enum": ["policies", "products", "procedures", "technical"],
                "description": "Filter by document category. Omit to search all categories."
            },
            "max_results": {
                "type": "integer",
                "description": "Maximum results to return (1-10). Default: 5.",
                "minimum": 1,
                "maximum": 10
            }
        },
        "required": ["query"]
    }
}

                        
                        CCA Task 2.1 — Description Principles: Effective descriptions include: (1) what the tool does, (2) when to use it vs. alternatives, (3) what it returns, (4) examples of good inputs, (5) negative guidance (“Do NOT use for...”). The exam tests ability to improve tool descriptions that cause misuse.
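A rough way to apply this checklist is to lint tool definitions before shipping them. The sketch below is a heuristic only: `lint_tool_description` and its keyword patterns are hypothetical helpers, not part of the Anthropic SDK, and a description that passes the lint can still be a poor one.

```python
import re

def lint_tool_description(tool: dict, min_words: int = 15) -> list[str]:
    """Return heuristic warnings for a tool definition (not an official check)."""
    warnings = []
    description = tool.get("description", "")
    word_count = len(description.split())
    if word_count < min_words:
        warnings.append(f"description has {word_count} words; aim for at least {min_words}")
    if not re.search(r"\breturns?\b", description, re.IGNORECASE):
        warnings.append("description does not say what the tool returns")
    if not re.search(r"\buse (this|when|for)\b", description, re.IGNORECASE):
        warnings.append("description gives no when-to-use guidance")
    if not re.search(r"\b(do not|don't)\b", description, re.IGNORECASE):
        warnings.append("description has no negative guidance")
    # Every parameter should carry its own description as well.
    for name, prop in tool.get("input_schema", {}).get("properties", {}).items():
        if "description" not in prop:
            warnings.append(f"parameter '{name}' has no description")
    return warnings

vague_tool = {
    "name": "search",
    "description": "Search for things.",
    "input_schema": {"type": "object", "properties": {"q": {"type": "string"}}},
}
for warning in lint_tool_description(vague_tool):
    print(warning)
```

Running a check like this in CI catches the most obvious gaps; judging whether the guidance is actually correct still takes a human review.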
                    

1.2 input_schema Patterns

The input_schema uses JSON Schema to define parameter types, constraints, and documentation. Type and constraint keywords narrow what counts as a valid call, while property descriptions guide Claude’s argument generation. Without strict mode the schema is advisory, so validate inputs in your handler as well:

# Comprehensive input_schema demonstrating best practices
create_ticket_tool = {
    "name": "create_support_ticket",
    "description": "Create a support ticket in the system. Requires verified customer_id (from get_customer). Always include reproduction steps and expected behavior in the description.",
    "input_schema": {
        "type": "object",
        "properties": {
            "customer_id": {
                "type": "string",
                "description": "Verified customer ID from a prior get_customer call. Format: cust_XXXX",
                "pattern": "^cust_[a-zA-Z0-9]+$"
            },
            "subject": {
                "type": "string",
                "description": "Brief ticket subject (5-100 chars). Be specific: 'Login fails with SSO' not 'Login issue'",
                "minLength": 5,
                "maxLength": 100
            },
            "priority": {
                "type": "string",
                "enum": ["low", "medium", "high", "critical"],
                "description": "Severity: low=cosmetic, medium=workaround exists, high=blocking, critical=data loss/security"
            },
            "description": {
                "type": "string",
                "description": "Detailed issue description. Include: 1) What happened 2) Steps to reproduce 3) Expected behavior 4) Actual behavior"
            },
            "tags": {
                "type": "array",
                "items": {"type": "string"},
                "description": "Categorization tags. Use existing tags: 'billing', 'auth', 'api', 'ui', 'performance'",
                "maxItems": 5
            }
        },
        "required": ["customer_id", "subject", "priority", "description"]
    }
}
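Because a schema like this is only useful if the handler enforces it, it can help to validate incoming arguments before acting on them. The sketch below is a minimal, standard-library-only validator covering just the keywords used in this section (`required`, `type`, `enum`, `pattern`, `minLength`, `maxLength`, `maxItems`). `validate_input` and the trimmed `ticket_schema` are hypothetical names for illustration; a production system would more likely use a full JSON Schema library.

```python
import re

# Maps JSON Schema type names to Python types for this sketch.
_JSON_TYPES = {
    "string": str, "integer": int, "number": (int, float),
    "boolean": bool, "array": list, "object": dict,
}

def _type_ok(value, type_name: str) -> bool:
    expected = _JSON_TYPES.get(type_name)
    if expected is None:
        return True  # unknown or missing type: nothing to check
    if isinstance(value, bool) and type_name != "boolean":
        return False  # bool is a subclass of int in Python
    return isinstance(value, expected)

def validate_input(schema: dict, data: dict) -> list[str]:
    """Check data against a small subset of JSON Schema; return error messages."""
    errors = [f"{name}: missing required field"
              for name in schema.get("required", []) if name not in data]
    for name, value in data.items():
        rules = schema.get("properties", {}).get(name)
        if rules is None:
            continue  # unknown fields are ignored in this sketch
        if not _type_ok(value, rules.get("type", "")):
            errors.append(f"{name}: expected {rules['type']}")
            continue
        if "enum" in rules and value not in rules["enum"]:
            errors.append(f"{name}: must be one of {rules['enum']}")
        if isinstance(value, str):
            if "pattern" in rules and not re.search(rules["pattern"], value):
                errors.append(f"{name}: does not match {rules['pattern']}")
            if len(value) < rules.get("minLength", 0):
                errors.append(f"{name}: shorter than {rules['minLength']} characters")
            if len(value) > rules.get("maxLength", len(value)):
                errors.append(f"{name}: longer than {rules['maxLength']} characters")
        if isinstance(value, list) and len(value) > rules.get("maxItems", len(value)):
            errors.append(f"{name}: more than {rules['maxItems']} items")
    return errors

ticket_schema = {
    "type": "object",
    "properties": {
        "customer_id": {"type": "string", "pattern": "^cust_[a-zA-Z0-9]+$"},
        "subject": {"type": "string", "minLength": 5, "maxLength": 100},
        "priority": {"type": "string", "enum": ["low", "medium", "high", "critical"]},
        "description": {"type": "string"},
        "tags": {"type": "array", "items": {"type": "string"}, "maxItems": 5},
    },
    "required": ["customer_id", "subject", "priority", "description"],
}

print(validate_input(ticket_schema, {"customer_id": "42", "subject": "Hi", "priority": "urgent"}))
```

The returned messages can go straight into a tool_result with is_error set, so Claude sees which field to fix.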

1.3 Enums and Constrained Values

Use enum to constrain inputs to a fixed set of valid values. This guards against hallucinated categories, misspelled statuses, and invalid identifiers. Explain what each value means in the property description:

# Enums prevent hallucinated values and make tool usage predictable
filter_orders_tool = {
    "name": "filter_orders",
    "description": "Filter a customer's orders by status. Returns matching orders with IDs, amounts, and dates.",
    "input_schema": {
        "type": "object",
        "properties": {
            "customer_id": {
                "type": "string",
                "description": "Customer ID from get_customer"
            },
            "status": {
                "type": "string",
                "enum": ["pending", "processing", "shipped", "delivered", "cancelled", "refunded"],
                "description": "Order status filter. Use 'pending' for unprocessed, 'processing' for in-progress, 'shipped' for in-transit."
            },
            "date_range": {
                "type": "object",
                "description": "Optional date range filter",
                "properties": {
                    "start": {"type": "string", "description": "ISO 8601 date (YYYY-MM-DD)"},
                    "end": {"type": "string", "description": "ISO 8601 date (YYYY-MM-DD)"}
                }
            },
            "sort_by": {
                "type": "string",
                "enum": ["date_asc", "date_desc", "amount_asc", "amount_desc"],
                "description": "Sort order. Default: date_desc (newest first)"
            }
        },
        "required": ["customer_id", "status"]
    }
}
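One way to keep the schema's enum and the handler's accepted values from drifting apart is to generate both from a single Python `Enum`. This is a sketch under that assumption; `OrderStatus`, `enum_values`, and `parse_status` are illustrative names, not part of any SDK.

```python
from enum import Enum

class OrderStatus(str, Enum):
    PENDING = "pending"
    PROCESSING = "processing"
    SHIPPED = "shipped"
    DELIVERED = "delivered"
    CANCELLED = "cancelled"
    REFUNDED = "refunded"

def enum_values(enum_cls) -> list[str]:
    """List the string values of an Enum in definition order."""
    return [member.value for member in enum_cls]

# The schema property is derived from the Enum, so adding a status
# in one place updates the tool definition too.
status_property = {
    "type": "string",
    "enum": enum_values(OrderStatus),
    "description": "Order status filter.",
}

def parse_status(raw: str) -> OrderStatus:
    """Convert a tool argument to OrderStatus; raises ValueError if invalid."""
    return OrderStatus(raw)

print(status_property["enum"])
```

The handler then calls `parse_status` on the incoming argument and turns a `ValueError` into a structured validation error.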

2. Error Response Patterns

2.1 The `is_error` Flag

When a tool execution fails, set is_error: true on the tool_result block. This signals to Claude that the operation failed and it should adapt its approach — retry with different arguments, try an alternative tool, or inform the user about the failure:

import json

# Application-defined exception types; replace these with your own.
class ToolNotFoundError(Exception):
    pass

class ValidationError(Exception):
    pass

class RateLimitError(Exception):
    def __init__(self, message: str, retry_after: int):
        super().__init__(message)
        self.retry_after = retry_after

def execute_tool_with_error_handling(tool_use_id: str, tool_name: str, tool_input: dict) -> dict:
    """Execute a tool and return a properly structured result, including errors.

    tool_use_id is the `id` of the tool_use block that requested this call;
    it is not part of the tool's input. execute_tool is your own dispatcher.
    """
    try:
        result = execute_tool(tool_name, tool_input)
        return {
            "type": "tool_result",
            "tool_use_id": tool_use_id,
            "content": json.dumps(result)
        }
    except ToolNotFoundError as e:
        return {
            "type": "tool_result",
            "tool_use_id": tool_use_id,
            "content": json.dumps({"error": str(e), "type": "not_found"}),
            "is_error": True  # Signals failure to Claude
        }
    except ValidationError as e:
        return {
            "type": "tool_result",
            "tool_use_id": tool_use_id,
            "content": json.dumps({
                "error": str(e),
                "type": "validation",
                "suggestion": "Check required fields and data types"
            }),
            "is_error": True
        }
    except RateLimitError as e:
        return {
            "type": "tool_result",
            "tool_use_id": tool_use_id,
            "content": json.dumps({
                "error": "Rate limit exceeded",
                "type": "rate_limit",
                "retry_after_seconds": e.retry_after
            }),
            "is_error": True
        }
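The results then have to be sent back in a single user message, one tool_result per tool_use block, each carrying the `id` of the block that requested it. A minimal sketch, assuming the response content has already been converted to plain dicts (for example with each SDK block's `model_dump()`); `build_tool_results` and the `handlers` mapping are hypothetical names:

```python
import json

def build_tool_results(content_blocks: list[dict], handlers: dict) -> dict:
    """Run every tool_use block and return the follow-up user message."""
    results = []
    for block in content_blocks:
        if block.get("type") != "tool_use":
            continue  # text blocks need no result
        result = {"type": "tool_result", "tool_use_id": block["id"]}
        handler = handlers.get(block["name"])
        try:
            if handler is None:
                raise LookupError(f"Unknown tool: {block['name']}")
            result["content"] = json.dumps(handler(**block["input"]))
        except Exception as exc:
            # Any failure becomes an error result instead of crashing the loop.
            result["content"] = json.dumps({"error": str(exc), "type": type(exc).__name__})
            result["is_error"] = True
        results.append(result)
    return {"role": "user", "content": results}

blocks = [
    {"type": "text", "text": "Let me add those."},
    {"type": "tool_use", "id": "toolu_1", "name": "add", "input": {"a": 1, "b": 2}},
    {"type": "tool_use", "id": "toolu_2", "name": "missing", "input": {}},
]
message = build_tool_results(blocks, {"add": lambda a, b: {"sum": a + b}})
print(json.dumps(message, indent=2))
```

Catching broad `Exception` here is a deliberate choice for the sketch: one failing tool should not stop the other results from being returned.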

2.2 Structured Error Objects

Return structured error information that helps Claude reason about what went wrong and how to recover. Include the error type, a human-readable message, and actionable suggestions:

import json

def build_error_response(error_type: str, message: str, **kwargs) -> str:
    """Build a structured error response for tool_result content."""
    error = {
        "success": False,
        "error": {
            "type": error_type,
            "message": message
        }
    }

    # Add contextual fields based on error type
    if error_type == "not_found":
        error["error"]["suggestion"] = kwargs.get("suggestion", "Verify the identifier exists")
    elif error_type == "permission_denied":
        error["error"]["required_prerequisite"] = kwargs.get("prerequisite")
    elif error_type == "validation":
        error["error"]["invalid_fields"] = kwargs.get("fields", [])
        error["error"]["expected_format"] = kwargs.get("format")

    return json.dumps(error)

# Examples of structured errors that help Claude self-correct:

# Customer not found — Claude should ask for correct identifier
not_found_error = build_error_response(
    "not_found",
    "No customer found with email 'jon@example.com'",
    suggestion="Try alternative email addresses or ask the customer to verify"
)

# Permission denied — Claude needs to call get_customer first
permission_error = build_error_response(
    "permission_denied",
    "Cannot look up orders without verified customer",
    prerequisite="Call get_customer first to verify identity"
)

# Validation error — Claude sent wrong data types
validation_error = build_error_response(
    "validation",
    "Invalid input for create_ticket",
    fields=["priority"],
    format="Must be one of: low, medium, high, critical"
)

2.3 Self-Recovery Patterns

Well-designed error responses enable Claude to autonomously recover without human intervention. The key is providing enough context for Claude to understand what went wrong and what to try next:

import anthropic
import json

client = anthropic.Anthropic()

# Claude receives this error and autonomously adapts:
# {"success": false, "error": {"type": "not_found", "message": "Order ORD-999 not found",
#   "suggestion": "Check order ID format or search by customer email"}}

# Claude's next action will typically be:
# 1. Try an alternative approach (search by customer email)
# 2. Ask the user for clarification
# 3. Report the issue with context

# This self-recovery happens naturally when errors are descriptive.
# Compare with a BAD error that provides no recovery path:

# ❌ BAD: "Error: 404" — Claude has no idea what to do
# ✅ GOOD: {"error": {"type": "not_found", "message": "...", "suggestion": "..."}}

# The agent loop handles this transparently — errors flow back as tool_result
# with is_error=True, and Claude reasons about recovery on its next turn.

Real-World Application

E-Commerce Product Search Agent

An online retailer built tools for their Claude agent: search_products, check_inventory, apply_coupon, and calculate_shipping. Key lesson: adding enum constraints to tool inputs (e.g., category must be one of [“electronics”, “clothing”, “home”]) reduced hallucinated tool calls by 90%. The team also found that adding negative guidance (“Do NOT use for price comparisons across stores”) eliminated a class of incorrect tool selections entirely.

Tags: Enum Constraints, E-Commerce, Tool Selection

3. Composable Tool Sets

3.1 Minimal Overlap Principle

Each tool should have a distinct, non-overlapping responsibility. When tools overlap, Claude struggles to choose the right one and may call both redundantly. Design tool sets where each tool has a clear “lane”:

# ❌ BAD: Overlapping tools — Claude won't know which to call
overlapping_tools = [
    {"name": "search", "description": "Search for information"},
    {"name": "find", "description": "Find data in the system"},
    {"name": "lookup", "description": "Look up records"}
]

# ✅ GOOD: Distinct responsibilities, clear lanes
well_designed_tools = [
    {
        "name": "search_knowledge_base",
        "description": "Search internal documentation for policies, procedures, and product info. Use for 'how do we...' or 'what is our policy on...' questions."
    },
    {
        "name": "get_customer_record",
        "description": "Look up a specific customer by email or ID. Returns profile, plan, and account status. Use when you know the customer's identifier."
    },
    {
        "name": "query_orders_db",
        "description": "Query the orders database with filters. Use for order-specific lookups by order ID, customer ID, or date range."
    }
]

3.2 Right Granularity

Tools should be neither too broad (doing too many things) nor too narrow (requiring many calls for simple tasks). Find the natural unit of work — one tool call should accomplish one meaningful action:

# ❌ TOO BROAD: One tool does everything — Claude loses fine-grained control
too_broad = {
    "name": "manage_customer",
    "description": "Create, read, update, or delete customer records. Also handles orders, refunds, and tickets.",
    "input_schema": {
        "type": "object",
        "properties": {
            "action": {"type": "string", "enum": ["create", "read", "update", "delete", "refund", "ticket"]},
            "data": {"type": "object"}  # Catch-all — no schema validation
        }
    }
}

# ❌ TOO NARROW: Simple task requires 4 tool calls
too_narrow = [
    {"name": "open_db_connection", "description": "Open a database connection"},
    {"name": "build_query", "description": "Build a SQL query string"},
    {"name": "execute_query", "description": "Execute the query"},
    {"name": "close_db_connection", "description": "Close the connection"}
]

# ✅ RIGHT GRANULARITY: One meaningful action per tool
right_granularity = [
    {
        "name": "get_customer",
        "description": "Look up customer by email or ID. Returns: id, name, plan, status, verified."
    },
    {
        "name": "lookup_order",
        "description": "Get order details by order ID. Requires verified customer_id."
    },
    {
        "name": "process_refund",
        "description": "Process a refund for an order. Requires verified customer + confirmed order."
    }
]

3.3 Tool Versioning

When tools evolve, use versioning to maintain backward compatibility. Never silently change a tool’s behavior — this can break agents that rely on specific input/output contracts:

# Tool versioning strategies

# Strategy 1: Version in the tool name (explicit, simple)
tools_v1 = [
    {"name": "search_docs_v1", "description": "..."},  # Deprecated
    {"name": "search_docs_v2", "description": "Enhanced search with category filters and pagination. Prefer over v1."}
]

# Strategy 2: Version as a parameter (flexible)
search_tool = {
    "name": "search_docs",
    "description": "Search documentation. Supports v1 (basic) and v2 (with filters) modes.",
    "input_schema": {
        "type": "object",
        "properties": {
            "query": {"type": "string"},
            "version": {
                "type": "string",
                "enum": ["v1", "v2"],
                "description": "API version. v2 supports filters and pagination. Default: v2"
            },
            "filters": {
                "type": "object",
                "description": "(v2 only) Category and date filters"
            }
        },
        "required": ["query"]
    }
}

# Strategy 3: Additive-only changes (backward compatible)
# Add new optional parameters — never remove or rename existing ones
# Old behavior preserved when new params are omitted
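Strategy 3 can be sketched on the handler side as well. In the example below, `category` and `page` stand for parameters added in a later release; both are optional, so a call that omits them behaves as before. The in-memory `DOCS` list and the `search_docs` signature are assumptions made for illustration.

```python
from typing import Optional

DOCS = [
    {"title": "Refund policy", "category": "policies"},
    {"title": "Refund API reference", "category": "technical"},
    {"title": "Onboarding checklist", "category": "procedures"},
]

def search_docs(query: str, category: Optional[str] = None,
                page: int = 1, page_size: int = 10) -> dict:
    """Search titles; category and page are later, optional additions."""
    matches = [d for d in DOCS if query.lower() in d["title"].lower()]
    if category is not None:
        # New behavior applies only when the new parameter is supplied.
        matches = [d for d in matches if d["category"] == category]
    start = (page - 1) * page_size
    return {"total": len(matches), "results": matches[start:start + page_size]}

print(search_docs("refund"))                       # original call shape still works
print(search_docs("refund", category="policies"))  # opt-in to the new filter
```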

4. Tool Design Anti-Patterns

The CCA exam (Task 2.2) specifically tests your ability to identify tool design anti-patterns that degrade agent performance:

Anti-Pattern	Problem	Fix
God tool	One tool handles all operations via an `action` parameter	Split into focused, single-purpose tools
Ambiguous names	`process`, `handle`, `do_thing` — Claude can’t reason about purpose	Verb + noun: `create_ticket`, `search_docs`
Missing descriptions	No guidance on when to use	Always include when/why/what-returns
Overlapping scope	Multiple tools could handle same request	Clear lane per tool, negative guidance
Opaque errors	Returns “Error” with no details	Structured errors with type + suggestion
Missing required	All params optional but some are actually needed	Mark true requirements in `required` array
Catch-all params	`"data": {"type": "object"}` — no schema	Define explicit properties with types

Here is a concrete example showing how to refactor a poorly-designed tool set into a well-structured one:

# ❌ BEFORE: Anti-patterns everywhere
bad_tools = [
    {
        "name": "do_stuff",
        "description": "Does stuff with the system",
        "input_schema": {
            "type": "object",
            "properties": {
                "action": {"type": "string"},
                "data": {"type": "object"}
            }
        }
    }
]

# ✅ AFTER: Refactored with proper design
good_tools = [
    {
        "name": "get_customer",
        "description": "Look up a customer by email or customer ID. Returns the customer profile including name, subscription plan, account status, and whether identity is verified. Use this FIRST before any order or refund operations — other tools require a verified customer_id.",
        "input_schema": {
            "type": "object",
            "properties": {
                "identifier": {
                    "type": "string",
                    "description": "Customer email address or customer ID (format: cust_XXXX)"
                }
            },
            "required": ["identifier"]
        }
    },
    {
        "name": "lookup_order",
        "description": "Retrieve order details by order ID. Returns items, amounts, status, and shipping info. Requires a verified customer_id from get_customer. Do NOT call without verifying the customer first.",
        "input_schema": {
            "type": "object",
            "properties": {
                "order_id": {
                    "type": "string",
                    "description": "Order ID in format ORD-XXXXX"
                },
                "customer_id": {
                    "type": "string",
                    "description": "Verified customer ID from get_customer"
                }
            },
            "required": ["order_id", "customer_id"]
        }
    },
    {
        "name": "process_refund",
        "description": "Issue a refund for a specific order. Maximum $500 without escalation. Requires verified customer and confirmed order. Returns refund confirmation with transaction ID.",
        "input_schema": {
            "type": "object",
            "properties": {
                "order_id": {"type": "string", "description": "Order ID to refund"},
                "customer_id": {"type": "string", "description": "Verified customer ID"},
                "amount": {"type": "number", "description": "Refund amount in USD (max 500.00)"},
                "reason": {
                    "type": "string",
                    "enum": ["damaged", "wrong_item", "not_received", "quality", "changed_mind"],
                    "description": "Refund reason category"
                }
            },
            "required": ["order_id", "customer_id", "amount", "reason"]
        }
    }
]
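Descriptions that say "requires a verified customer_id" are guidance, so the handlers should enforce the prerequisite too. The sketch below shows one possible dispatcher that tracks verified customers per conversation and returns a structured permission error otherwise. `make_dispatcher`, the stubbed lookups, and the customer ID are hypothetical.

```python
import json

def _error(error_type: str, message: str) -> str:
    return json.dumps({"success": False, "error": {"type": error_type, "message": message}})

def make_dispatcher():
    """Create a dispatch function with its own per-conversation state."""
    verified: set[str] = set()

    def get_customer(identifier: str) -> dict:
        customer_id = "cust_1001"  # stub: a real handler would query a database
        verified.add(customer_id)
        return {"id": customer_id, "verified": True}

    def lookup_order(order_id: str, customer_id: str) -> dict:
        if customer_id not in verified:
            raise PermissionError("Call get_customer first to verify identity")
        return {"order_id": order_id, "status": "shipped"}  # stubbed result

    handlers = {"get_customer": get_customer, "lookup_order": lookup_order}

    def dispatch(name: str, tool_input: dict) -> tuple[str, bool]:
        """Return (content, is_error) for a tool_result block."""
        handler = handlers.get(name)
        if handler is None:
            return _error("not_found", f"Unknown tool '{name}'"), True
        try:
            return json.dumps(handler(**tool_input)), False
        except PermissionError as exc:
            return _error("permission_denied", str(exc)), True

    return dispatch

dispatch = make_dispatcher()
print(dispatch("lookup_order", {"order_id": "ORD-12345", "customer_id": "cust_1001"}))
print(dispatch("get_customer", {"identifier": "jon@example.com"}))
print(dispatch("lookup_order", {"order_id": "ORD-12345", "customer_id": "cust_1001"}))
```

The first lookup fails with a recoverable error, and the same call succeeds once get_customer has run.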

CCA Exam Pattern

Tool Design Questions

The exam presents a tool definition and asks “which improvement would most reduce incorrect tool calls?” or “why is the agent calling tool X when it should call tool Y?” The answer usually involves: improving the description with when-to-use guidance, adding negative guidance (“Do NOT use for...”), or adding enum constraints to prevent hallucinated values.

Tags: CCA Task 2.1, CCA Task 2.2

                        
                        Try It Yourself: Design a complete tool suite for a recipe assistant: create_recipe (with ingredients, steps, cuisine type), search_recipes (by ingredient or cuisine), and scale_recipe (multiply servings). Each tool should have Pydantic-validated inputs, clear descriptions, and return structured JSON. Test by asking Claude to find a pasta recipe and scale it for 8 people.
                    

4B. Strict Tool Use & Input Examples

Standard tool definitions are advisory: Claude usually follows your schema, but occasionally returns incompatible types ("2" instead of 2) or omits required fields. In production agents, these mismatches break your functions and cause runtime errors. Strict tool use eliminates this class of bugs.

4B.1 Strict Mode (`strict: true`)

Adding strict: true to a tool definition enables grammar-constrained sampling — the model’s token generation is restricted to outputs that are valid according to your JSON Schema. This is not validation-after-the-fact; it’s a hard constraint during generation:

import anthropic
import json

client = anthropic.Anthropic()

# Strict tool: Claude's output is GUARANTEED to match this schema
booking_tool = {
    "name": "book_flight",
    "description": "Book a flight for a customer. All fields are validated at generation time.",
    "strict": True,  # Enable grammar-constrained sampling
    "input_schema": {
        "type": "object",
        "properties": {
            "origin": {
                "type": "string",
                "description": "Origin airport IATA code (3 letters, e.g. 'SFO')"
            },
            "destination": {
                "type": "string",
                "description": "Destination airport IATA code (3 letters)"
            },
            "passengers": {
                "type": "integer",
                "description": "Number of passengers (1-9)"
            },
            "cabin_class": {
                "type": "string",
                "enum": ["economy", "premium_economy", "business", "first"]
            },
            "departure_date": {
                "type": "string",
                "description": "Departure date in YYYY-MM-DD format"
            }
        },
        "required": ["origin", "destination", "passengers", "cabin_class", "departure_date"],
        "additionalProperties": False  # No extra fields allowed
    }
}

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    tools=[booking_tool],
    tool_choice={"type": "tool", "name": "book_flight"},
    messages=[{"role": "user", "content": "Book me a business class flight from SFO to NRT for 2 people on Jan 15, 2027"}]
)

# GUARANTEED: passengers is int (not "2"), cabin_class is from enum, no extra fields
tool_block = next(b for b in response.content if b.type == "tool_use")
print(json.dumps(tool_block.input, indent=2))
# {"origin": "SFO", "destination": "NRT", "passengers": 2, "cabin_class": "business", "departure_date": "2027-01-15"}

# Without strict: passengers might be "2" (string) or cabin_class might be "Business" (wrong case)
# With strict: these bugs are impossible — the grammar rejects invalid tokens
print(f"passengers type: {type(tool_block.input['passengers']).__name__}")  # Always: int

                        
                        When to use strict mode: Use strict: true in production agentic systems where type mismatches would cause runtime failures. Combine with tool_choice: {"type": "any"} to guarantee both that a tool is called AND that its inputs are schema-valid. Skip strict mode during prototyping or when schemas are evolving rapidly (changes require recompilation of the grammar).
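Strict mode guarantees structure, not meaning. In the book_flight schema, the 1-9 passenger range and the three-letter IATA format live only in descriptions, so the handler still needs semantic checks. A minimal sketch, assuming a hypothetical `check_booking` helper that runs before any booking is made:

```python
import re
from datetime import date

def check_booking(args: dict, today: date) -> list[str]:
    """Semantic checks that a structurally valid book_flight input can still fail."""
    problems = []
    for field in ("origin", "destination"):
        if not re.fullmatch(r"[A-Z]{3}", args[field]):
            problems.append(f"{field}: expected a 3-letter uppercase IATA code")
    if args["origin"] == args["destination"]:
        problems.append("origin and destination must differ")
    if not 1 <= args["passengers"] <= 9:
        problems.append("passengers: must be between 1 and 9")
    try:
        departure = date.fromisoformat(args["departure_date"])
    except ValueError:
        problems.append("departure_date: expected YYYY-MM-DD")
    else:
        if departure < today:
            problems.append("departure_date: must not be in the past")
    return problems

booking = {"origin": "SFO", "destination": "NRT", "passengers": 2,
           "cabin_class": "business", "departure_date": "2027-01-15"}
print(check_booking(booking, today=date(2026, 1, 1)))
```

Any problems found can be returned as a structured validation error so Claude can correct the call on its next turn.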
                    

4B.2 Input Examples (`input_examples`)

For complex tools with nested objects, optional parameters, or format-sensitive inputs, input_examples shows Claude concrete patterns for well-formed calls. Examples are validated against your schema at request time — invalid examples return a 400 error:

import anthropic
import json

client = anthropic.Anthropic()

# Tool with input_examples — teaches Claude HOW to call complex schemas
search_tool = {
    "name": "search_flights",
    "description": "Search for available flights. Supports complex filters and flexible dates.",
    "input_schema": {
        "type": "object",
        "properties": {
            "origin": {"type": "string", "description": "IATA airport code"},
            "destination": {"type": "string", "description": "IATA airport code"},
            "departure_date": {"type": "string", "description": "YYYY-MM-DD format"},
            "return_date": {"type": "string", "description": "YYYY-MM-DD (optional for one-way)"},
            "passengers": {
                "type": "object",
                "properties": {
                    "adults": {"type": "integer"},
                    "children": {"type": "integer"},
                    "infants": {"type": "integer"}
                },
                "required": ["adults"]
            },
            "filters": {
                "type": "object",
                "properties": {
                    "max_stops": {"type": "integer", "enum": [0, 1, 2]},
                    "airlines": {"type": "array", "items": {"type": "string"}},
                    "max_price_usd": {"type": "number"}
                }
            }
        },
        "required": ["origin", "destination", "departure_date", "passengers"]
    },
    # input_examples: validated array of example inputs
    "input_examples": [
        # Simple one-way search
        {
            "origin": "SFO",
            "destination": "NRT",
            "departure_date": "2027-03-15",
            "passengers": {"adults": 1}
        },
        # Round-trip with filters
        {
            "origin": "JFK",
            "destination": "LHR",
            "departure_date": "2027-06-01",
            "return_date": "2027-06-15",
            "passengers": {"adults": 2, "children": 1, "infants": 0},
            "filters": {"max_stops": 1, "airlines": ["BA", "AA"], "max_price_usd": 2500}
        },
        # Family trip, no filters
        {
            "origin": "LAX",
            "destination": "CDG",
            "departure_date": "2027-12-20",
            "return_date": "2028-01-03",  # return must fall after departure
            "passengers": {"adults": 2, "children": 2}
        }
    ]
}

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    tools=[search_tool],
    messages=[{"role": "user", "content": "Find nonstop flights from Chicago to Tokyo next March for me and my wife"}]
)

tool_block = next(b for b in response.content if b.type == "tool_use")
print(json.dumps(tool_block.input, indent=2))
# Claude learned from examples: proper nested objects, correct field usage

                        
                        Limitations: input_examples works on user-defined and Anthropic-schema client tools but NOT on server tools (web search, code execution). Each example adds ~20–50 tokens (simple) or ~100–200 tokens (complex nested). Use 2–3 examples for most tools; don’t exceed 5 unless the schema is exceptionally complex.
                    

4B.3 Tool Use with Prompt Caching

Tool definitions are sent with every API request, consuming input tokens repeatedly. For agentic loops that run 10–50 iterations, caching tool definitions saves significant cost. Use cache_control on the last tool in your array to cache the entire tool block:

import anthropic
import json

client = anthropic.Anthropic()

# Cache tool definitions — saves tokens on every loop iteration
tools_with_caching = [
    {
        "name": "search_docs",
        "description": "Search documentation for answers to technical questions.",
        "input_schema": {
            "type": "object",
            "properties": {
                "query": {"type": "string"},
                "max_results": {"type": "integer"}
            },
            "required": ["query"]
        }
    },
    {
        "name": "run_code",
        "description": "Execute Python code in a sandboxed environment.",
        "input_schema": {
            "type": "object",
            "properties": {
                "code": {"type": "string"},
                "timeout_seconds": {"type": "integer"}
            },
            "required": ["code"]
        },
        # cache_control on the LAST tool — caches ALL tool definitions above it
        "cache_control": {"type": "ephemeral"}
    }
]

# First call: creates cache (slight overhead)
response1 = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    tools=tools_with_caching,
    messages=[{"role": "user", "content": "Search for how to use asyncio"}]
)
print(f"First call - input tokens: {response1.usage.input_tokens}")
print(f"Cache creation tokens: {response1.usage.cache_creation_input_tokens}")

# Subsequent calls: reads from cache (major savings)
response2 = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    tools=tools_with_caching,
    messages=[{"role": "user", "content": "Run this code: print('hello')"}]
)
print(f"Second call - input tokens: {response2.usage.input_tokens}")
print(f"Cache read tokens: {response2.usage.cache_read_input_tokens}")  # Much cheaper!

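# Key rules:
# - cache_control goes on the LAST tool in the array
# - It caches every tool definition up to and including that tool; the cache
#   prefix is built in the order tools, system prompt, messages, so a system
#   prompt needs its own cache_control breakpoint to be cached
# - Cache lives for 5 minutes by default, refreshed each time it is read
# - Prompts below the model's minimum cacheable length are processed without
#   caching, so a tool array as small as this demo may report 0 cache tokens
# - Changing tool_choice invalidates cached message blocks (not tool definitions)
# - Tool definitions + system prompt remain cached across tool_choice changes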
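To see why caching pays off in a long loop, the arithmetic can be sketched directly. The multipliers below are assumptions: cache writes billed at 1.25x and cache reads at 0.1x the base input price, which is how the 5-minute cache has been priced, but check current pricing before relying on them. The function also assumes every iteration lands inside the cache lifetime; `relative_tool_token_cost` is a hypothetical helper.

```python
def relative_tool_token_cost(tool_tokens: int, iterations: int, cached: bool,
                             write_multiplier: float = 1.25,
                             read_multiplier: float = 0.10) -> float:
    """Cost of sending tool definitions, in units of one base-price input token."""
    if iterations <= 0:
        return 0.0
    if not cached:
        return float(tool_tokens * iterations)
    # First request writes the cache; every later request reads it.
    write_cost = tool_tokens * write_multiplier
    read_cost = tool_tokens * read_multiplier * (iterations - 1)
    return write_cost + read_cost

uncached = relative_tool_token_cost(2000, 20, cached=False)
cached = relative_tool_token_cost(2000, 20, cached=True)
print(f"uncached: {uncached:.0f}, cached: {cached:.0f}, saved: {1 - cached / uncached:.0%}")
```

With a single iteration, caching costs slightly more than not caching because of the write premium; the savings come from the reads that follow.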

                        
                        CCA Exam Tip: Key distinctions: (1) strict: true guarantees schema conformance via grammar-constrained sampling; the guarantee is enforced during generation rather than checked afterward. (2) input_examples are teaching aids: they help Claude use tools correctly but don’t guarantee compliance. (3) Prompt caching reduces cost for multi-turn loops but requires consistent tool arrays. (4) Combine strict + tool_choice “any” for maximum reliability in production agents.
                    

5. Built-In Tools Deep Dive (CCA 7.4)

Claude Code and the Agent SDK ship with built-in tools that agents can use without custom MCP servers. Understanding how each works — and when to prefer one over another — is critical for both building effective agents and the CCA exam.

Analogy: Built-in tools are like the standard tools in a developer’s IDE (find, replace, terminal). You don’t install them separately — they’re always available, well-tested, and designed to work together.

5.1 Read / Write / Edit — File Operations

# Built-in file tools — the agent's primary way to interact with code

# READ: Read file contents (entire file or a line range)
# When: Agent needs to understand existing code before making changes
# Always prefer Read BEFORE Edit (never edit blind)
read_example = {
    "name": "Read",
    "input": {
        "file_path": "/src/auth/jwt.py",
        "offset": 45,    # Optional: line number to start reading from
        "limit": 36      # Optional: number of lines to read (here, lines 45-80)
    }
}
# Returns: file contents as text (with line numbers)

# WRITE: Create or completely overwrite a file
# When: Creating NEW files or replacing entire file contents
# Caution: overwrites without confirmation, so use Edit for surgical changes
write_example = {
    "name": "Write",
    "input": {
        "file_path": "/src/auth/jwt.py",
        "content": "# Complete new file contents here\nimport jwt\n..."
    }
}
# Returns: confirmation of write

# EDIT: Targeted exact-string replacement
# When: Changing a specific passage without rewriting the whole file
# This is the PREFERRED tool for modifications (preserves unchanged code)
# old_string must match the file text exactly, including whitespace
edit_example = {
    "name": "Edit",
    "input": {
        "file_path": "/src/auth/jwt.py",
        "old_string": "TOKEN_EXPIRY = 86400  # 24 hours",
        "new_string": "TOKEN_EXPIRY = 3600  # 1 hour (security hardening)"
    }
}
# Returns: confirmation with diff

# Rules:
# - Always Read BEFORE Edit — never edit blind
# - Prefer Edit over Write for existing files (preserves context)
# - Use Read with line ranges for large files (don't read 10k lines)

File Tool Decision Tree

                            flowchart TD
                                Q{"What do you need
to do?"}
                                Q -->|"Create new file"| W["Write"]
                                Q -->|"Understand existing code"| R["Read"]
                                Q -->|"Change specific lines"| E["Edit"]
                                Q -->|"Full file rewrite"| RW["Read → Write"]
                                Q -->|"Incremental fix"| RE["Read → Edit"]

                                style W fill:#16476A,color:#fff
                                style R fill:#3B9797,color:#fff
                                style E fill:#BF092F,color:#fff
                                style RW fill:#16476A,color:#fff
                                style RE fill:#BF092F,color:#fff
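
The Edit call shown earlier identifies its target by text rather than by line number. A minimal sketch of that behavior, assuming the replacement is rejected when old_string is missing or matches more than once; the apply_edit helper is hypothetical and is not the tool's actual implementation:

```python
def apply_edit(text: str, old_string: str, new_string: str) -> str:
    """Replace exactly one occurrence of old_string, or raise ValueError."""
    count = text.count(old_string)
    if count == 0:
        raise ValueError("old_string not found; Read the file again before editing")
    if count > 1:
        raise ValueError(f"old_string matches {count} places; include more context")
    return text.replace(old_string, new_string, 1)


source = "TOKEN_EXPIRY = 86400  # 24 hours\nALGORITHM = 'HS256'\n"
updated = apply_edit(
    source,
    "TOKEN_EXPIRY = 86400  # 24 hours",
    "TOKEN_EXPIRY = 3600  # 1 hour (security hardening)",
)
print(updated)
```

Under this assumption, the "Read before Edit" rule is practical as well as stylistic: the agent needs the exact current text to build an old_string that matches once.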

5.2 Bash — Command Execution

# Bash tool — execute shell commands in the agent's environment

# When to use Bash:
# - Running tests: pytest, npm test, cargo test
# - Installing dependencies: pip install, npm install
# - Git operations: git status, git diff, git log
# - Build commands: make, cargo build, npm run build
# - System inspection: ls, find, which, env

bash_examples = [
    # Run tests after making changes
    {"name": "Bash", "input": {"command": "cd /project && pytest tests/ -v --tb=short"}},

    # Check git status
    {"name": "Bash", "input": {"command": "git diff --stat"}},

    # Install a dependency (quote the specifier so the shell does not treat ">" as a redirect)
    {"name": "Bash", "input": {"command": "pip install 'pydantic>=2.0'"}},

    # Check what's running on a port
    {"name": "Bash", "input": {"command": "lsof -i :8080"}},
]

# Security considerations:
# - Bash can execute ANYTHING — restrict with allowedTools when possible
# - In sandboxed environments, filesystem and network are limited
# - Agents should not have Bash in production customer-facing scenarios
# - Use Bash for developer tools (CI/CD, code review), not user interactions

# When NOT to use Bash (prefer file tools instead):
# ❌ Reading a file: use Read tool (structured output, line numbers)
# ❌ Finding a file: use Glob tool (pattern matching, faster)
# ❌ Searching content: use Grep tool (structured results)
# ✅ Running tests: Bash (no alternative)
# ✅ Git operations: Bash (no alternative)
# ✅ Build/compile: Bash (no alternative)

print("Bash: powerful but unrestricted — use allowedTools to control access")
print("Prefer specialized tools (Read, Grep, Glob) over Bash equivalents")
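
One way to act on the security notes above is to vet each command before it reaches the shell. The sketch below is an illustration only: ALLOWED_EXECUTABLES, FORBIDDEN_CHARS, and is_command_allowed are hypothetical names, not part of Claude Code or the Agent SDK, and the policy is deliberately strict (it rejects any chaining, piping, or redirection, including the `cd /project && ...` form).

```python
import shlex

# Hypothetical allowlist: the only executables this agent may launch.
ALLOWED_EXECUTABLES = {"pytest", "git", "npm", "cargo", "make"}

# Characters that let one command chain, pipe, redirect, or substitute another.
FORBIDDEN_CHARS = set(";|&<>`$\n")


def is_command_allowed(command: str) -> bool:
    """Accept only a single allowlisted command with no shell metacharacters."""
    if any(ch in FORBIDDEN_CHARS for ch in command):
        return False
    try:
        tokens = shlex.split(command)
    except ValueError:  # for example, unbalanced quotes
        return False
    return bool(tokens) and tokens[0] in ALLOWED_EXECUTABLES


for candidate in ["git diff --stat", "cd /project && pytest tests/", "lsof -i :8080"]:
    print(candidate, "->", is_command_allowed(candidate))
```

A check like this complements tool-level permissions; it does not replace sandboxing.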

5.3 Grep & Glob — Search Tools

# Grep — Search file CONTENTS for patterns (like grep -r)
# When: Find where something is used, referenced, or defined

grep_example = {
    "name": "Grep",
    "input": {
        "pattern": "def authenticate",     # Regex pattern
        "path": "/src/",                   # Directory to search
        "include": "*.py"                  # File filter (optional)
    }
}
# Returns: list of matches with file paths, line numbers, and context

# Glob — Search for FILES by path pattern (like find)
# When: Find files by name, extension, or directory structure

glob_example = {
    "name": "Glob",
    "input": {
        "pattern": "/src/**/*test*.py"     # Glob pattern (** = recursive)
    }
}
# Returns: list of matching file paths

print("Strategy: Glob (find files) → Grep (find patterns) → Read (understand) → Edit (fix)")

Incremental Codebase Exploration Strategy

                            flowchart LR
                                G["fa:fa-search Glob
Find files by
name/pattern"] --> GR["fa:fa-crosshairs Grep
Search contents
for patterns"]
                                GR --> R["fa:fa-file-code Read
Inspect actual
code at location"]
                                R --> E["fa:fa-pen Edit
Make targeted
changes"]

                                style G fill:#3B9797,color:#fff
                                style GR fill:#16476A,color:#fff
                                style R fill:#132440,color:#fff
                                style E fill:#BF092F,color:#fff
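
The same narrowing sequence can be reproduced with the standard library, which makes the cost argument concrete: each step shrinks what the next one has to touch. This is a sketch of the strategy, not of the built-in tools themselves; glob_files, grep_files, and read_range are hypothetical helpers.

```python
import re
import tempfile
from pathlib import Path


def glob_files(root: Path, pattern: str) -> list[Path]:
    """Glob step: find candidate files by path pattern."""
    return sorted(root.glob(pattern))


def grep_files(files: list[Path], pattern: str) -> list[tuple[Path, int, str]]:
    """Grep step: return (file, line_number, line) for every regex match."""
    regex = re.compile(pattern)
    hits = []
    for path in files:
        for number, line in enumerate(path.read_text().splitlines(), start=1):
            if regex.search(line):
                hits.append((path, number, line))
    return hits


def read_range(path: Path, start: int, end: int) -> str:
    """Read step: return only lines start..end (1-indexed, inclusive)."""
    lines = path.read_text().splitlines()
    return "\n".join(lines[start - 1:end])


# Demo on a throwaway project tree
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "src").mkdir()
    (root / "src" / "auth.py").write_text(
        "import jwt\n\ndef authenticate(token):\n    return jwt.decode(token)\n"
    )
    (root / "src" / "utils.py").write_text("def slugify(s):\n    return s.lower()\n")

    candidates = glob_files(root, "src/**/*.py")           # 1. find files
    hits = grep_files(candidates, r"def authenticate")     # 2. find the pattern
    hit_path, hit_line, _ = hits[0]
    snippet = read_range(hit_path, hit_line, hit_line + 1)  # 3. read only that spot

print(snippet)
```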

5.4 MCP Error Metadata (isError, errorCategory, isRetryable)

import json

# When tools fail, structured error metadata helps Claude recover intelligently.
# MCP tools can return errors with metadata that guides the agent's next action.

# The tool_result with error metadata:
error_result = {
    "type": "tool_result",
    "tool_use_id": "toolu_abc123",
    "content": json.dumps({
        "success": False,
        "error": {
            "message": "Rate limited by external API",
            "type": "rate_limit",

            # isError: True — this is an error, not empty results
            "isError": True,

            # errorCategory: helps Claude decide what to do next
            # Options: "transient", "permission", "not_found", "validation", "rate_limit"
            "errorCategory": "rate_limit",

            # isRetryable: True — agent should wait and retry
            "isRetryable": True,

            # retryAfter: seconds to wait before retry (optional)
            "retryAfter": 30,

            # suggestion: what the agent should do
            "suggestion": "Wait 30 seconds, then retry with the same parameters"
        }
    }),
    "is_error": True  # Messages API tool_result flag: tells Claude the call failed
}
}

# How Claude uses these fields:

Error Recovery Decision Tree

                            flowchart TD
                                ERR["Tool Returns Error"] --> RT{"isRetryable?"}
                                RT -->|"True"| WAIT["Wait retryAfter seconds
then retry same call"]
                                RT -->|"False"| CAT{"errorCategory?"}
                                CAT -->|"not_found"| ALT["Try alternative
approach"]
                                CAT -->|"permission"| ESC["Escalate to user
or request access"]
                                CAT -->|"validation"| FIX["Fix input
parameters & retry"]
                                CAT -->|"rate_limit"| WAIT

                                style WAIT fill:#3B9797,color:#fff
                                style ALT fill:#16476A,color:#fff
                                style ESC fill:#BF092F,color:#fff
                                style FIX fill:#132440,color:#fff

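# IMPORTANT distinction: is_error (protocol flag) vs isError / errorCategory (your metadata)
# - is_error: True → flag on the tool_result block (MCP's own result type spells it isError); Claude knows the tool failed
# - isError, errorCategory, isRetryable inside the JSON payload → YOUR metadata helping Claude decide recovery strategy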

# Anti-pattern: returning empty results for "not found"
# ❌ BAD: {"results": []}  — Claude can't tell: no results vs search failed
# ✅ GOOD: {"results": [], "isError": False, "totalAvailable": 0}

print("Error metadata tells Claude: what went wrong, can it retry, what to do instead")
print("Always distinguish: empty results (success) vs access failure (error)")
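
The decision tree above can be written as a small lookup, which is useful if your own orchestration code (rather than Claude) decides what to do with a failed call. This is a sketch: next_action is a hypothetical helper, and the fallback for unknown categories is an assumption that the tree does not specify.

```python
def next_action(error: dict) -> str:
    """Map error metadata to a recovery step, following the decision tree."""
    if error.get("isRetryable"):
        delay = error.get("retryAfter", 0)
        return f"wait {delay}s, then retry the same call"

    by_category = {
        "not_found": "try an alternative approach",
        "permission": "escalate to the user or request access",
        "validation": "fix the input parameters and retry",
        "rate_limit": "wait, then retry the same call",
    }
    # Assumption: anything unrecognized is surfaced to the user.
    return by_category.get(error.get("errorCategory"), "report the failure to the user")


rate_limited = {"errorCategory": "rate_limit", "isRetryable": True, "retryAfter": 30}
print(next_action(rate_limited))
print(next_action({"errorCategory": "permission", "isRetryable": False}))
```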

                        
                        CCA Exam Pattern (7.1, 7.4): The exam tests: (1) Prefer Edit over Read+Write for existing files (atomic, preserves context). (2) Incremental exploration: Glob → Grep → Read → Edit (never read entire codebase). (3) is_error: true tells Claude the tool failed. (4) isRetryable determines whether to retry or try an alternative. (5) Empty results ≠ error — always distinguish between “no matches found” and “search failed.”
                    

6. Agent SDK: Custom Tools (`@tool` Decorator)

The built-in tools (Section 5) cover filesystem, search, and shell access. But your agents need domain-specific tools — database queries, API calls, business logic. In the raw Client SDK (Sections 1–4), you defined tool schemas as JSON and implemented execution yourself. The Agent SDK provides a cleaner approach: the @tool decorator + create_sdk_mcp_server() creates an in-process MCP server that runs inside your application.

                        
                        Analogy: The raw approach is like defining a REST API spec (OpenAPI) and writing the route handlers separately. The @tool decorator is like using FastAPI: you define the schema and handler in one place, and the framework wires everything together automatically.
                    

6.1 The `@tool` Decorator

A tool is defined by four parts: name, description, input schema, and async handler. The decorator combines all of these into a single decorated function:

# Agent SDK Custom Tool — Complete Pattern
# Requires: pip install claude-agent-sdk httpx
# Set env var: ANTHROPIC_API_KEY=sk-ant-...

import httpx
from claude_agent_sdk import tool, create_sdk_mcp_server


# --- Define a custom tool using the @tool decorator ---
@tool(
    "get_temperature",                           # Tool name (unique ID)
    "Get the current temperature at a location. " # Description (Claude reads this)
    "Returns temperature in Fahrenheit.",
    {"latitude": float, "longitude": float},     # Input schema (dict → JSON Schema)
)
async def get_temperature(args: dict) -> dict:
    """Handler function — called when Claude invokes this tool."""
    async with httpx.AsyncClient() as client:
        response = await client.get(
            "https://api.open-meteo.com/v1/forecast",
            params={
                "latitude": args["latitude"],
                "longitude": args["longitude"],
                "current": "temperature_2m",
                "temperature_unit": "fahrenheit",
            },
        )
        data = response.json()

    # Return a content array — Claude sees this as the tool result
    return {
        "content": [
            {
                "type": "text",
                "text": f"Temperature: {data['current']['temperature_2m']}°F",
            }
        ]
    }


# --- Wrap tools in an in-process MCP server ---
weather_server = create_sdk_mcp_server(
    name="weather",       # Server name
    version="1.0.0",      # Version string
    tools=[get_temperature],  # List of @tool-decorated functions
)

print("Custom tool defined: get_temperature")
print("Server name: weather")
print("Full tool name in SDK: mcp__weather__get_temperature")
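
The dict passed as the third argument to @tool is shorthand for a JSON Schema like the ones written by hand in Sections 1–4. A rough sketch of that mapping, assuming every key becomes a required property; to_json_schema is a hypothetical helper and the SDK's actual conversion may differ in detail:

```python
import json

# Python type → JSON Schema type name
TYPE_NAMES = {
    str: "string",
    int: "integer",
    float: "number",
    bool: "boolean",
    list: "array",
    dict: "object",
}


def to_json_schema(simple: dict) -> dict:
    """Expand {"field": python_type} into an object schema with all fields required."""
    return {
        "type": "object",
        "properties": {name: {"type": TYPE_NAMES[tp]} for name, tp in simple.items()},
        "required": list(simple),
    }


schema = to_json_schema({"latitude": float, "longitude": float})
print(json.dumps(schema, indent=2))
```

The shorthand has no room for per-field descriptions or enums, so the tool description has to carry that guidance.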

6.2 In-Process MCP Server — Connecting to `query()`

Once you wrap tools in create_sdk_mcp_server(), pass the server to query() via mcp_servers. The SDK registers your tools alongside the built-in ones. The tool’s fully qualified name follows the pattern mcp__<server_name>__<tool_name>:

# Calling Custom Tools via query() — Full Working Example
# Requires: pip install claude-agent-sdk httpx

import asyncio
import httpx
from claude_agent_sdk import (
    tool, create_sdk_mcp_server, query,
    ClaudeAgentOptions, AssistantMessage, ResultMessage
)


@tool(
    "get_temperature",
    "Get the current temperature at a location. Returns Fahrenheit.",
    {"latitude": float, "longitude": float},
)
async def get_temperature(args: dict) -> dict:
    async with httpx.AsyncClient() as client:
        response = await client.get(
            "https://api.open-meteo.com/v1/forecast",
            params={
                "latitude": args["latitude"],
                "longitude": args["longitude"],
                "current": "temperature_2m",
                "temperature_unit": "fahrenheit",
            },
        )
        data = response.json()
    return {"content": [{"type": "text", "text": f"{data['current']['temperature_2m']}°F"}]}


@tool(
    "get_precipitation",
    "Get hourly precipitation probability for a location. Returns next 12 hours.",
    {"latitude": float, "longitude": float},
)
async def get_precipitation(args: dict) -> dict:
    async with httpx.AsyncClient() as client:
        response = await client.get(
            "https://api.open-meteo.com/v1/forecast",
            params={
                "latitude": args["latitude"],
                "longitude": args["longitude"],
                "hourly": "precipitation_probability",
                "forecast_days": 1,
            },
        )
        data = response.json()
    chances = data["hourly"]["precipitation_probability"][:12]
    return {"content": [{"type": "text", "text": f"Next 12h: {chances}"}]}


# Bundle both tools into one server
weather_server = create_sdk_mcp_server(
    name="weather",
    version="1.0.0",
    tools=[get_temperature, get_precipitation],
)


async def main():
    async for message in query(
        prompt="What's the weather like in San Francisco right now? Include rain chance.",
        options=ClaudeAgentOptions(
            # Pass custom MCP server — tools become available to Claude
            mcp_servers={"weather": weather_server},
            # Pre-approve your custom tools (no permission prompt)
            # Format: mcp__<server_name>__<tool_name>
            allowed_tools=[
                "mcp__weather__get_temperature",
                "mcp__weather__get_precipitation",
            ],
        ),
    ):
        if isinstance(message, AssistantMessage):
            for block in message.content:
                if hasattr(block, "name"):
                    print(f"  [Tool call: {block.name}]")
        if isinstance(message, ResultMessage) and message.subtype == "success":
            print(f"\nResult: {message.result}")
            print(f"Cost: ${message.total_cost_usd:.4f}")


asyncio.run(main())

                        
                        Key Difference from Raw SDK: In the raw approach, you define tool schemas as JSON dicts and implement execute_tool() yourself. With the Agent SDK, @tool + create_sdk_mcp_server() handles schema generation, validation, and execution. The tool runs in-process (no subprocess or network) — it’s just an async function call.
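
Because the qualified name is a fixed pattern, the allowed_tools list can be derived from the server name instead of typed by hand, which avoids typos in the double underscores. A small sketch; qualified_tool_names is a hypothetical helper, not an SDK function:

```python
def qualified_tool_names(server_name: str, tool_names: list[str]) -> list[str]:
    """Build mcp__<server_name>__<tool_name> for each tool on one server."""
    return [f"mcp__{server_name}__{tool}" for tool in tool_names]


allowed = qualified_tool_names("weather", ["get_temperature", "get_precipitation"])
print(allowed)
```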
                    

6.3 Tool Annotations (Parallel Execution Hints)

Annotations are metadata that tell the SDK how a tool behaves. The most important one: readOnlyHint: True enables the SDK to run that tool in parallel with other read-only tools (instead of sequentially). This maps directly to the parallel tool call pattern from Part 3.

# Tool Annotations — Enable Parallel Execution
# Requires: pip install claude-agent-sdk

from claude_agent_sdk import tool, ToolAnnotations


# Read-only tool — safe to run in parallel with other read-only tools
@tool(
    "search_products",
    "Search the product catalog by query. Returns top 10 matches.",
    {"query": str, "category": str},
    annotations=ToolAnnotations(
        readOnlyHint=True,       # No side effects → can run in parallel
        openWorldHint=True,      # Reaches external system (database)
    ),
)
async def search_products(args: dict) -> dict:
    # In production: query your database
    return {"content": [{"type": "text", "text": f"Found 3 products for '{args['query']}'"}]}


# Destructive tool — must run sequentially (default)
@tool(
    "place_order",
    "Place an order for a product. Charges the customer's card.",
    {"product_id": str, "quantity": int, "customer_id": str},
    annotations=ToolAnnotations(
        readOnlyHint=False,      # Has side effects (charges card)
        destructiveHint=True,    # Irreversible action
        idempotentHint=False,    # Calling twice = double charge!
    ),
)
async def place_order(args: dict) -> dict:
    return {"content": [{"type": "text", "text": f"Order placed for product {args['product_id']}"}]}


print("Annotation summary:")
print("  readOnlyHint=True  → SDK may batch with other read-only calls")
print("  destructiveHint    → informational, for logging/auditing")
print("  idempotentHint     → informational, hints retry safety")
print("  openWorldHint      → informational, reaches external systems")
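
To see why the read-only hint matters for latency, the sketch below simulates a batch of tool calls with asyncio: calls marked read-only run concurrently, the rest run one at a time. It illustrates the idea only and is not the SDK's scheduler; fake_tool, run_batch, and the search_reviews tool are hypothetical.

```python
import asyncio
import time


async def fake_tool(name: str, seconds: float) -> str:
    """Stand-in for a tool handler that spends `seconds` waiting on I/O."""
    await asyncio.sleep(seconds)
    return f"{name} done"


async def run_batch(calls: list[tuple[str, float, bool]]) -> list[str]:
    """Run read-only calls concurrently, then the remaining calls sequentially."""
    read_only = [c for c in calls if c[2]]
    others = [c for c in calls if not c[2]]
    results = list(await asyncio.gather(*(fake_tool(n, s) for n, s, _ in read_only)))
    for name, seconds, _ in others:
        results.append(await fake_tool(name, seconds))
    return results


calls = [
    ("search_products", 0.2, True),
    ("search_reviews", 0.2, True),   # hypothetical second read-only tool
    ("place_order", 0.2, False),     # side effects: must not overlap
]
start = time.perf_counter()
results = asyncio.run(run_batch(calls))
elapsed = time.perf_counter() - start
print(results, f"{elapsed:.2f}s")
```

The two read-only calls overlap, so the batch takes about two sleep intervals rather than three.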

6.4 Error Handling: `is_error` vs Exceptions

How your handler reports errors determines whether the agent loop continues or stops. This is the most critical design decision for tool reliability:

Pattern	Agent Loop	What Claude Sees	When to Use
Return `is_error: True`	Continues	Error message as data	Expected failures (404, validation, rate limit)
Throw uncaught exception	Stops	Nothing (query fails)	Never in production

# Error Handling in Agent SDK Tools — Keep the Loop Alive
# Requires: pip install claude-agent-sdk httpx

import httpx
from claude_agent_sdk import tool


@tool(
    "fetch_user_profile",
    "Fetch a user profile from the API by user ID.",
    {"user_id": str},
)
async def fetch_user_profile(args: dict) -> dict:
    """Handler that gracefully reports errors without stopping the loop."""
    try:
        async with httpx.AsyncClient() as client:
            response = await client.get(
                f"https://api.example.com/users/{args['user_id']}"
            )

            if response.status_code == 404:
                # ✅ Return is_error — Claude sees "user not found" and can try alternatives
                return {
                    "content": [{"type": "text", "text": f"User {args['user_id']} not found"}],
                    "is_error": True,
                }

            if response.status_code == 429:
                # ✅ Rate limited — Claude may wait and retry or use cached data
                return {
                    "content": [{"type": "text", "text": "Rate limited. Try again in 30s."}],
                    "is_error": True,
                }

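            # Any other non-2xx status raises here and is reported by the except block below
            response.raise_for_status()
            data = response.json()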
            return {
                "content": [{"type": "text", "text": f"User: {data['name']}, Plan: {data['plan']}"}]
            }

    except Exception as e:
        # ✅ Catch ALL exceptions — never let them propagate
        # An uncaught exception STOPS the agent loop entirely
        return {
            "content": [{"type": "text", "text": f"API error: {str(e)}"}],
            "is_error": True,
        }


print("Rule: ALWAYS catch exceptions in tool handlers")
print("Return is_error=True → loop continues, Claude adapts")
print("Uncaught exception → loop STOPS, query() fails")
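
The catch-all pattern can be factored into a wrapper so each handler does not repeat the try/except. A minimal sketch: never_raise is a hypothetical decorator, and placing it directly beneath @tool(...) assumes the SDK accepts any async callable as the handler.

```python
import asyncio
import functools


def never_raise(handler):
    """Wrap an async tool handler so exceptions become is_error results."""
    @functools.wraps(handler)
    async def wrapper(args: dict) -> dict:
        try:
            return await handler(args)
        except Exception as exc:
            return {
                "content": [{"type": "text", "text": f"{type(exc).__name__}: {exc}"}],
                "is_error": True,
            }
    return wrapper


@never_raise
async def divide(args: dict) -> dict:
    value = args["a"] / args["b"]
    return {"content": [{"type": "text", "text": str(value)}]}


ok = asyncio.run(divide({"a": 6, "b": 3}))
failed = asyncio.run(divide({"a": 1, "b": 0}))
print(ok)
print(failed)
```

Specific, expected failures (404, 429) still deserve their own messages inside the handler; the wrapper is only the safety net.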

6.5 Returning Images & Resources

Tools can return non-text content. Image blocks carry base64-encoded bytes directly in the result — Claude processes them as visual input:

# Returning Images from Tools — Visual Data for Claude
# Requires: pip install claude-agent-sdk httpx

import base64
import httpx
from claude_agent_sdk import tool


@tool(
    "capture_screenshot",
    "Capture a screenshot of a webpage and return it for visual analysis.",
    {"url": str},
)
async def capture_screenshot(args: dict) -> dict:
    """Return an image block — Claude sees this as visual input."""
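    # Placeholder: this downloads the URL's raw bytes, so it only yields a valid
    # image when the URL points directly at a PNG file. For a real page screenshot,
    # use Playwright, Puppeteer, or a screenshot service.
    async with httpx.AsyncClient() as client:
        response = await client.get(args["url"])
        image_bytes = response.content  # Raw bytes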

    return {
        "content": [
            {
                "type": "image",
                "data": base64.b64encode(image_bytes).decode("utf-8"),
                "mimeType": "image/png",  # Required: image/png, image/jpeg, image/webp
            }
        ]
    }


@tool(
    "generate_chart",
    "Generate a chart from data and return as image for analysis.",
    {"data": list, "chart_type": str},
)
async def generate_chart(args: dict) -> dict:
    """Generate a chart image (matplotlib) and return to Claude."""
    import matplotlib.pyplot as plt
    import io

    fig, ax = plt.subplots(figsize=(8, 5))
    ax.plot(args["data"])
    ax.set_title(f"{args['chart_type'].title()} Chart")

    buf = io.BytesIO()
    fig.savefig(buf, format="png", dpi=100)
    plt.close(fig)
    buf.seek(0)

    return {
        "content": [
            {
                "type": "image",
                "data": base64.b64encode(buf.read()).decode("utf-8"),
                "mimeType": "image/png",
            }
        ]
    }


print("Image blocks: type='image', data=base64 (no data: prefix), mimeType required")
print("Claude sees images as visual input — can describe, analyze, compare")
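
Hard-coding mimeType is fragile when the bytes come from an external source. The sketch below builds the image block from raw bytes and sets mimeType from the file signature, rejecting anything that is not one of the three formats listed above. detect_mime_type and image_block are hypothetical helpers, not SDK functions.

```python
import base64


def detect_mime_type(data: bytes) -> str:
    """Identify PNG, JPEG, or WebP from the leading bytes of the payload."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    if data.startswith(b"\xff\xd8\xff"):
        return "image/jpeg"
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "image/webp"
    raise ValueError("unsupported or non-image payload")


def image_block(data: bytes) -> dict:
    """Build an image content block: base64 data (no data: prefix) plus mimeType."""
    return {
        "type": "image",
        "data": base64.b64encode(data).decode("ascii"),
        "mimeType": detect_mime_type(data),
    }


png_bytes = b"\x89PNG\r\n\x1a\n" + b"\x00" * 4   # signature plus filler, not a real image
block = image_block(png_bytes)
print(block["mimeType"], len(block["data"]))
```

In a handler, a ValueError from detect_mime_type would be caught and returned as an is_error result, following Section 6.4.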

                        
                        Design Principle: Every tool registered with create_sdk_mcp_server() consumes context window space on every turn, because the SDK sends all tool definitions to Claude. If you have 50+ tools, most will never be used on any given query. Use tool search (covered in Part 7) to load tools on demand instead of registering all of them upfront.
                    

Next in the SDK Track

In Part 7: MCP Servers & Built-In Tools, we’ll explore the Model Context Protocol — MCP servers, resource types, built-in tools (file I/O, web fetch, code execution), transport layers (stdio, HTTP), and how MCP enables tool discovery. Covers CCA Domain 2 Tasks 2.3 and 2.4.