1. The Agentic Loop
An agentic loop is the core architectural pattern for building autonomous agents with Claude. Instead of a single request-response, the agent iterates: send a request to Claude, inspect the stop_reason, execute any requested tools, append the results back to the conversation, and repeat until Claude signals completion. This loop is what transforms Claude from a text generator into an autonomous problem-solver.
1.1 Loop Lifecycle
The lifecycle has exactly three steps that repeat until termination:
flowchart TD
A["Send request to Claude"] --> B{"Inspect stop_reason"}
B -->|"tool_use"| C["Execute requested tool(s)"]
C --> D["Append tool results to messages"]
D --> A
B -->|"end_turn"| E["Return final response to user"]
B -->|"max_tokens"| F["Handle truncation"]
The loop continues while stop_reason is "tool_use" and terminates when it is "end_turn". The model decides which tool to call next based on context, not a pre-configured sequence.
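The branching in the flowchart can be sketched as a small pure function, which makes the control flow easy to test without calling the API. The `Action` enum and `next_action` name are illustrative assumptions, not part of the Anthropic SDK; only the three stop_reason values come from the flowchart above.

```python
from enum import Enum


class Action(Enum):
    """Hypothetical labels for what the loop should do next."""
    RUN_TOOLS = "run_tools"
    RETURN_ANSWER = "return_answer"
    HANDLE_TRUNCATION = "handle_truncation"
    UNEXPECTED = "unexpected"


def next_action(stop_reason: str) -> Action:
    """Map a stop_reason value to the loop's next step."""
    if stop_reason == "tool_use":
        return Action.RUN_TOOLS          # execute tools, append results, loop again
    if stop_reason == "end_turn":
        return Action.RETURN_ANSWER      # Claude is finished
    if stop_reason == "max_tokens":
        return Action.HANDLE_TRUNCATION  # output was cut off
    return Action.UNEXPECTED             # anything else deserves an explicit decision


for reason in ("tool_use", "end_turn", "max_tokens", "stop_sequence"):
    print(f"{reason:>13} -> {next_action(reason).value}")
```

Keeping an explicit fallback branch means a stop_reason the loop was not written for surfaces as a visible case instead of silently repeating the same request.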
1.2 Minimal Implementation
Here is the canonical minimal agentic loop. The key insight is that stop_reason drives the loop: as long as Claude requests tools (stop_reason == "tool_use"), we keep iterating, and we return as soon as it reports "end_turn":
import anthropic
import json
client = anthropic.Anthropic()
def execute_tool(name: str, input_data: dict) -> dict:
    """Stub tool executor; replace with real implementations."""
    print(f" [Executing tool: {name} with input: {input_data}]")
    return {"status": "success", "result": f"Mock result for {name}"}


def run_agent(user_message: str, tools: list, system: str = "") -> str:
    """Minimal agentic loop: the foundation of all Claude agents."""
    messages = [{"role": "user", "content": user_message}]
    while True:
        # Step 1: Send request to Claude
        response = client.messages.create(
            model="claude-sonnet-4-6",
            max_tokens=4096,
            system=system,
            tools=tools,
            messages=messages
        )
        # Step 2: Check stop_reason
        if response.stop_reason == "end_turn":
            # Claude is done: extract and return text
            text_blocks = [b.text for b in response.content if b.type == "text"]
            return "\n".join(text_blocks)
        if response.stop_reason == "tool_use":
            # Step 3: Execute tool(s) and append results
            # First, append Claude's response (with tool_use blocks) as assistant message
            messages.append({"role": "assistant", "content": response.content})
            # Then append tool results as user message
            tool_results = []
            for block in response.content:
                if block.type == "tool_use":
                    result = execute_tool(block.name, block.input)
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": json.dumps(result)
                    })
            messages.append({"role": "user", "content": tool_results})
            # Loop continues: Claude will process results and decide next action
        elif response.stop_reason == "max_tokens":
            # Handle truncation gracefully
            return "[Response truncated - increase max_tokens]"
        else:
            # Any other stop_reason would otherwise resend the same messages forever
            raise RuntimeError(f"Unhandled stop_reason: {response.stop_reason}")

# Demo: run with a simple tool
demo_tools = [{
"name": "get_time",
"description": "Get the current time.",
"input_schema": {"type": "object", "properties": {}, "required": []}
}]
result = run_agent("What time is it?", demo_tools)
print(f"Agent result: {result}")
1.3 Appending Tool Results
A critical detail: tool results must be appended in the correct message structure. Claude’s response (containing ToolUseBlocks) becomes an assistant message, followed by a user message containing tool_result blocks. Each result must reference the original tool_use_id.
Here is the correct message structure after one tool call iteration:
import anthropic
# After Claude returns a tool_use response, the message array looks like:
messages = [
# Original user request
{"role": "user", "content": "What's the weather in London and Paris?"},
# Claude's response — contains text + tool_use blocks (passed as-is)
{"role": "assistant", "content": [
{"type": "text", "text": "I'll check the weather for both cities."},
{"type": "tool_use", "id": "toolu_01A", "name": "get_weather", "input": {"city": "London"}},
{"type": "tool_use", "id": "toolu_01B", "name": "get_weather", "input": {"city": "Paris"}}
]},
# Your tool results — one entry per tool_use_id
{"role": "user", "content": [
{"type": "tool_result", "tool_use_id": "toolu_01A", "content": "London: 15°C, rainy"},
{"type": "tool_result", "tool_use_id": "toolu_01B", "content": "Paris: 22°C, sunny"}
]}
]
# Next iteration: Claude sees both results and can synthesize a final answer
client = anthropic.Anthropic()
response = client.messages.create(
model="claude-sonnet-4-6",
max_tokens=1024,
messages=messages
)
# stop_reason will likely be "end_turn" now — Claude has all the info it needs
print(response.content[0].text)
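The pairing rule above (one tool_result per tool_use_id, all in a single user message) can be checked before the request is sent. The following is a minimal sketch that works on plain dicts; `build_tool_result_message` and the `results` mapping are hypothetical names introduced for illustration.

```python
def build_tool_result_message(assistant_content: list, results: dict) -> dict:
    """Build the user message that answers every tool_use block.

    assistant_content: content blocks of the assistant turn, as plain dicts.
    results: hypothetical mapping of tool_use_id -> result string.
    """
    tool_use_ids = [b["id"] for b in assistant_content if b["type"] == "tool_use"]
    missing = [i for i in tool_use_ids if i not in results]
    if missing:
        # Fail locally instead of sending an incomplete conversation
        raise ValueError(f"No result for tool_use_id(s): {missing}")
    return {
        "role": "user",
        "content": [
            {"type": "tool_result", "tool_use_id": i, "content": results[i]}
            for i in tool_use_ids
        ],
    }


assistant_content = [
    {"type": "text", "text": "I'll check the weather for both cities."},
    {"type": "tool_use", "id": "toolu_01A", "name": "get_weather", "input": {"city": "London"}},
    {"type": "tool_use", "id": "toolu_01B", "name": "get_weather", "input": {"city": "Paris"}},
]
message = build_tool_result_message(
    assistant_content,
    {"toolu_01A": "London: 15°C, rainy", "toolu_01B": "Paris: 22°C, sunny"},
)
print([block["tool_use_id"] for block in message["content"]])
```

Text blocks are skipped on purpose: only tool_use blocks need a matching result.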
2. Stop Reason Control Flow
2.1 Model-Driven Decision Making
A key principle of agentic architecture is that Claude decides which tool to call next based on context, not a pre-configured sequence. You provide the tools and their descriptions; Claude reasons about what information it needs and calls tools accordingly. This is fundamentally different from a decision tree or workflow engine.
The following example demonstrates model-driven reasoning — Claude autonomously decides the order of operations based on what it discovers:
import anthropic
import json
client = anthropic.Anthropic()
def execute_tool(name: str, input_data: dict) -> dict:
    """Stub tool executor; replace with real implementations."""
    print(f" [Executing tool: {name} with input: {input_data}]")
    return {"status": "success", "result": f"Mock result for {name}"}


def run_agent(user_message: str, tools: list, system: str = "") -> str:
    """Agentic loop: continues until Claude sets stop_reason to end_turn."""
    messages = [{"role": "user", "content": user_message}]
    while True:
        response = client.messages.create(
            model="claude-sonnet-4-6", max_tokens=4096,
            system=system, tools=tools, messages=messages
        )
        if response.stop_reason == "end_turn":
            return "\n".join(b.text for b in response.content if b.type == "text")
        if response.stop_reason == "tool_use":
            messages.append({"role": "assistant", "content": response.content})
            tool_results = []
            for block in response.content:
                if block.type == "tool_use":
                    result = execute_tool(block.name, block.input)
                    tool_results.append({"type": "tool_result", "tool_use_id": block.id, "content": json.dumps(result)})
            messages.append({"role": "user", "content": tool_results})
            continue
        # Avoid looping forever on max_tokens or any other stop_reason
        raise RuntimeError(f"Unhandled stop_reason: {response.stop_reason}")

# Tools with clear descriptions — Claude decides which to call and when
tools = [
{
"name": "search_docs",
"description": "Search internal documentation by query. Returns relevant doc snippets with titles and URLs. Use when you need to find information about company policies, procedures, or product details.",
"input_schema": {
"type": "object",
"properties": {
"query": {"type": "string", "description": "Search query"}
},
"required": ["query"]
}
},
{
"name": "get_customer",
"description": "Look up a customer by email or ID. Returns customer profile including name, plan, account status, and order history summary.",
"input_schema": {
"type": "object",
"properties": {
"identifier": {"type": "string", "description": "Customer email or ID"}
},
"required": ["identifier"]
}
},
{
"name": "create_ticket",
"description": "Create a support ticket. Use only after gathering sufficient context about the customer's issue.",
"input_schema": {
"type": "object",
"properties": {
"customer_id": {"type": "string"},
"subject": {"type": "string"},
"priority": {"type": "string", "enum": ["low", "medium", "high"]},
"description": {"type": "string"}
},
"required": ["customer_id", "subject", "priority", "description"]
}
}
]
# Claude will autonomously decide: look up customer → search docs → create ticket
# The ORDER is determined by Claude's reasoning, not hard-coded
result = run_agent(
user_message="Customer jane@example.com can't access the premium features she paid for. Create a ticket.",
tools=tools,
system="You are a support agent. Always verify the customer exists before taking action."
)
print(result)
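Because Claude chooses which tool to call, the executor has to look tools up by name at runtime. One way to replace the stub `execute_tool` is a small registry, sketched below. The registry, the `register_tool` decorator, and the in-memory customer data are all illustrative assumptions rather than SDK features.

```python
from typing import Callable

# Hypothetical registry: tool name -> Python callable
TOOL_REGISTRY: dict[str, Callable[..., dict]] = {}


def register_tool(name: str):
    """Decorator that records a function under the tool name Claude will use."""
    def decorator(fn):
        TOOL_REGISTRY[name] = fn
        return fn
    return decorator


@register_tool("get_customer")
def get_customer(identifier: str) -> dict:
    # Stand-in data; a real implementation would query a customer store
    customers = {"jane@example.com": {"customer_id": "C-1", "plan": "premium"}}
    customer = customers.get(identifier)
    if customer is None:
        return {"status": "error", "error": f"no customer matches {identifier!r}"}
    return {"status": "success", "customer": customer}


def execute_tool(name: str, input_data: dict) -> dict:
    """Dispatch by name; report problems as data so the model can react."""
    fn = TOOL_REGISTRY.get(name)
    if fn is None:
        return {"status": "error", "error": f"unknown tool: {name}"}
    try:
        return fn(**input_data)
    except TypeError as exc:
        # Typically wrong or missing arguments (also catches TypeErrors raised inside fn)
        return {"status": "error", "error": str(exc)}


print(execute_tool("get_customer", {"identifier": "jane@example.com"}))
print(execute_tool("delete_everything", {}))
```

Returning errors as tool results, rather than raising, keeps the loop running and lets Claude decide whether to retry, try another tool, or explain the failure.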
2.2 Parallel Tool Calls
Claude can emit multiple ToolUseBlocks in a single response when tasks are independent. Your loop should handle all of them before the next iteration, enabling parallel execution for better latency:
import anthropic
import json
import asyncio
client = anthropic.Anthropic()
async def execute_tool_async(name: str, input_data: dict) -> str:
    """Execute a tool asynchronously."""
    # In production, this would call actual services
    if name == "get_weather":
        return json.dumps({"temp": "20°C", "city": input_data["city"]})
    return json.dumps({"error": "unknown tool"})


async def run_agent_parallel(user_message: str, tools: list) -> str:
    """Agentic loop with parallel tool execution."""
    messages = [{"role": "user", "content": user_message}]
    while True:
        # Note: this synchronous client call blocks the event loop while it waits;
        # only the tool executions below run concurrently.
        response = client.messages.create(
            model="claude-sonnet-4-6",
            max_tokens=4096,
            tools=tools,
            messages=messages
        )
        if response.stop_reason == "end_turn":
            return "\n".join(b.text for b in response.content if b.type == "text")
        if response.stop_reason == "tool_use":
            messages.append({"role": "assistant", "content": response.content})
            # Execute ALL tool calls in parallel
            tool_blocks = [b for b in response.content if b.type == "tool_use"]
            results = await asyncio.gather(*[
                execute_tool_async(b.name, b.input) for b in tool_blocks
            ])
            # Collect all results
            tool_results = [
                {"type": "tool_result", "tool_use_id": b.id, "content": r}
                for b, r in zip(tool_blocks, results)
            ]
            messages.append({"role": "user", "content": tool_results})
            continue
        # Avoid looping forever on max_tokens or any other stop_reason
        raise RuntimeError(f"Unhandled stop_reason: {response.stop_reason}")

# Demo: run the parallel agent (Jupyter/IPython compatible)
demo_tools = [{
"name": "get_weather",
"description": "Get weather for a city.",
"input_schema": {
"type": "object",
"properties": {"city": {"type": "string"}},
"required": ["city"]
}
}]
async def main():
    result = await run_agent_parallel("What's the weather in London?", demo_tools)
    print(f"Agent result: {result}")

# Top-level await works in Jupyter/IPython; in a plain script use asyncio.run(main())
await main()
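When several tools run concurrently, one failure should not discard the other results. A possible refinement, sketched here under the assumption that failed calls are reported back with the tool_result `is_error` flag, uses `asyncio.gather(..., return_exceptions=True)`. The `flaky_tool` function and the `SimpleNamespace` stand-ins for tool_use blocks are hypothetical.

```python
import asyncio
import json
from types import SimpleNamespace


async def flaky_tool(name: str, input_data: dict) -> str:
    """Hypothetical tool that fails for one particular input."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    if input_data.get("city") == "Atlantis":
        raise LookupError("city not found")
    return json.dumps({"city": input_data["city"], "temp": "20°C"})


async def gather_tool_results(tool_blocks: list) -> list:
    """Run all tool calls concurrently; turn exceptions into error results."""
    outcomes = await asyncio.gather(
        *(flaky_tool(b.name, b.input) for b in tool_blocks),
        return_exceptions=True,  # exceptions are returned instead of raised
    )
    results = []
    for block, outcome in zip(tool_blocks, outcomes):
        if isinstance(outcome, Exception):
            results.append({
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": f"Tool failed: {outcome}",
                "is_error": True,
            })
        else:
            results.append({
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": outcome,
            })
    return results


# Stand-ins for the tool_use blocks of one response
blocks = [
    SimpleNamespace(id="toolu_01A", name="get_weather", input={"city": "London"}),
    SimpleNamespace(id="toolu_01B", name="get_weather", input={"city": "Atlantis"}),
]
results = asyncio.run(gather_tool_results(blocks))
for r in results:
    print(r["tool_use_id"], "error" if r.get("is_error") else "ok")
```

Every tool_use_id still receives exactly one result, so the message structure stays valid even when a tool fails.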
Automated Code Review Agent
An engineering team built an agent that reviews pull requests: it reads the diff, checks for common issues (missing tests, security vulnerabilities, style violations), and posts structured feedback. The agentic loop allows it to read multiple files and cross-reference them. The agent reduced human review time by 35% and caught 28% more security issues than manual reviews alone.
2B. Tool Runner (SDK Managed Loop)
The manual loop from Sections 1–2 gives full control but requires boilerplate: while True, stop_reason checks, tool dispatch, message array management. The Tool Runner (available in the Python, TypeScript, C#, Go, Java, PHP, and Ruby SDKs as a beta feature) automates this entire cycle with type-safe tool definitions and automatic execution.
The Tool Runner is part of the client SDK (the anthropic package) and automates the manual loop. The Agent SDK (claude-agent-sdk, Section 8) is a higher-level abstraction with built-in tools. Use Tool Runner when you have custom tool implementations and want loop automation without switching to the Agent SDK.
2B.1 Basic Usage — @beta_tool Decorator
Define tools with the @beta_tool decorator — it inspects function arguments and docstrings to derive the JSON Schema automatically. No manual input_schema required:
import json
from anthropic import Anthropic, beta_tool
client = Anthropic()
@beta_tool
def get_weather(location: str, unit: str = "fahrenheit") -> str:
    """Get the current weather in a given location.

    Args:
        location: The city and state, e.g. San Francisco, CA
        unit: Temperature unit, either 'celsius' or 'fahrenheit'
    """
    # Your actual implementation here
    return json.dumps({"temperature": "20°C", "condition": "Sunny"})


@beta_tool
def calculate_sum(a: int, b: int) -> str:
    """Add two numbers together.

    Args:
        a: First number
        b: Second number
    """
    return str(a + b)

# Create the tool runner — it handles the ENTIRE agentic loop
runner = client.beta.messages.tool_runner(
model="claude-sonnet-4-6",
max_tokens=1024,
tools=[get_weather, calculate_sum],
messages=[
{
"role": "user",
"content": "What's the weather like in Paris? Also, what's 15 + 27?",
}
],
)
# Iterate over messages — runner calls tools automatically
for message in runner:
    print(message)
# First iteration: Claude calls get_weather + calculate_sum
# Runner executes both, sends results back
# Second iteration: Claude responds with final text answer
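To build intuition for how a decorator can derive a schema from a function signature, here is a deliberately simplified sketch using only the standard library. It is not the SDK's implementation: `derive_input_schema` and the `JSON_TYPES` mapping are hypothetical, and the sketch ignores docstring parsing and complex types.

```python
import inspect
from typing import get_type_hints

# Assumed mapping from Python annotations to JSON Schema type names
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def derive_input_schema(fn) -> dict:
    """Build a JSON-Schema-like dict from a function's parameters."""
    hints = get_type_hints(fn)
    properties = {}
    required = []
    for name, param in inspect.signature(fn).parameters.items():
        # Unknown or missing annotations fall back to "string" in this sketch
        properties[name] = {"type": JSON_TYPES.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default value means the argument is required
    return {"type": "object", "properties": properties, "required": required}


def get_weather(location: str, unit: str = "fahrenheit") -> str:
    """Get the current weather in a given location."""
    return "sunny"


def calculate_sum(a: int, b: int) -> str:
    """Add two numbers together."""
    return str(a + b)


print(derive_input_schema(get_weather))
print(derive_input_schema(calculate_sum))
```

The takeaway is that accurate type hints and defaults matter: they are the raw material from which a schema of this kind would be derived.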
2B.2 Getting the Final Result — until_done()
If you don’t need intermediate messages, skip the iteration and get the final response directly:
import json
from anthropic import Anthropic, beta_tool
client = Anthropic()
@beta_tool
def search_database(query: str, limit: int = 10) -> str:
    """Search the product database.

    Args:
        query: Search query string
        limit: Maximum number of results to return
    """
    # Simulated database search
    return json.dumps({"results": [{"name": "Widget Pro", "price": 29.99}], "total": 1})

# until_done() runs the entire loop and returns the final message
runner = client.beta.messages.tool_runner(
model="claude-sonnet-4-6",
max_tokens=1024,
tools=[search_database],
messages=[{"role": "user", "content": "Find products related to widgets"}],
)
final_message = runner.until_done()
# Extract the text response
for block in final_message.content:
    if block.type == "text":
        print(block.text)
2B.3 Tool Runner with Streaming
Enable streaming to process each turn’s response incrementally. Each iteration yields a stream object for real-time token delivery:
import json
from anthropic import Anthropic, beta_tool
client = Anthropic()
@beta_tool
def get_stock_price(ticker: str) -> str:
    """Get the current stock price for a ticker symbol.

    Args:
        ticker: Stock ticker symbol (e.g., AAPL, GOOGL)
    """
    prices = {"AAPL": 198.50, "GOOGL": 175.30, "MSFT": 420.10}
    price = prices.get(ticker.upper(), 0)
    return json.dumps({"ticker": ticker.upper(), "price": price, "currency": "USD"})

# Streaming tool runner — get token-by-token output
runner = client.beta.messages.tool_runner(
model="claude-sonnet-4-6",
max_tokens=1024,
tools=[get_stock_price],
messages=[{"role": "user", "content": "What are the current prices of AAPL and GOOGL?"}],
stream=True, # Enable streaming
)
# Each iteration yields a BetaMessageStream
for message_stream in runner:
    for event in message_stream:
        # Real-time events: content_block_start, content_block_delta, etc.
        if hasattr(event, 'delta') and hasattr(event.delta, 'text'):
            print(event.delta.text, end="", flush=True)
    # Get accumulated message for this turn
    print("\n---")
    print("Turn complete:", message_stream.get_final_message().stop_reason)
# Get the final result after all turns
print("\nFinal:", runner.until_done().content[0].text[:100])
3. Anti-Patterns to Avoid
The CCA exam specifically tests your ability to identify agentic loop anti-patterns. These are common mistakes that lead to unreliable agent behavior.
3.1 Parsing Natural Language for Termination
Never parse Claude’s text output to determine if the loop should end. The stop_reason field is the only reliable signal:
import anthropic
client = anthropic.Anthropic()
tools = [{"name": "search", "description": "Search docs.", "input_schema": {"type": "object", "properties": {"q": {"type": "string"}}, "required": ["q"]}}]
# ❌ ANTI-PATTERN: Parsing text for completion signals
def bad_agent_loop(messages, tools):
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=4096,
        tools=tools,
        messages=messages
    )
    text = response.content[0].text if response.content[0].type == "text" else ""
    # WRONG: Checking text content for "done", "complete", "final answer"
    if "DONE" in text or "final answer" in text.lower():
        return text  # Unreliable: Claude might say "I'm not done yet"
    # WRONG: Checking if response contains text as a completion indicator
    if any(b.type == "text" for b in response.content):
        return text  # Wrong: tool_use responses can contain text too


# ✅ CORRECT: Use stop_reason exclusively
def good_agent_loop(messages, tools):
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=4096,
        tools=tools,
        messages=messages
    )
    # The ONLY reliable termination signal
    if response.stop_reason == "end_turn":
        return response.content
    elif response.stop_reason == "tool_use":
        # Continue loop...
        pass

# Demonstrate the correct approach
messages = [{"role": "user", "content": "Hello, how are you?"}]
result = good_agent_loop(messages, tools)
print(f"stop_reason-based result: {result[0].text[:100]}...")
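The difference between the two checks can be demonstrated offline with stand-in response objects. This sketch assumes only that a response exposes `stop_reason` and a list of typed content blocks; the `SimpleNamespace` objects and both helper functions are hypothetical.

```python
from types import SimpleNamespace


def text_says_done(response) -> bool:
    """Anti-pattern: guess completion from the wording of the text."""
    text = " ".join(b.text for b in response.content if b.type == "text")
    return "done" in text.lower()


def is_done(response) -> bool:
    """Correct: rely on stop_reason only."""
    return response.stop_reason == "end_turn"


# A tool_use turn whose text happens to contain the word "done"
still_working = SimpleNamespace(
    stop_reason="tool_use",
    content=[
        SimpleNamespace(type="text", text="I'm not done yet; let me search."),
        SimpleNamespace(type="tool_use", id="toolu_01", name="search", input={"q": "refunds"}),
    ],
)
# A final turn that never says "done"
finished = SimpleNamespace(
    stop_reason="end_turn",
    content=[SimpleNamespace(type="text", text="Refunds take 5 business days.")],
)

for label, response in (("still_working", still_working), ("finished", finished)):
    print(label, "| text check:", text_says_done(response), "| stop_reason check:", is_done(response))
```

The text check is wrong in both directions here: it stops the loop while a tool call is pending, and it would keep looping after the final answer.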
3.2 Arbitrary Iteration Caps as Primary Stopping Mechanism
Using a hard iteration limit as the primary way to stop the loop is an anti-pattern. Iteration caps are fine as a safety net, but stop_reason should drive normal termination:
import anthropic
import json
client = anthropic.Anthropic()
tools = [{"name": "search", "description": "Search docs.", "input_schema": {"type": "object", "properties": {"q": {"type": "string"}}, "required": ["q"]}}]
def process_tools(content):
    """Stub: execute tools and return results."""
    return [
        {"type": "tool_result", "tool_use_id": b.id, "content": json.dumps({"result": "ok"})}
        for b in content if b.type == "tool_use"
    ]


# ❌ ANTI-PATTERN: Iteration cap as primary mechanism
def bad_capped_loop(messages, tools):
    for i in range(5):  # Arbitrary cap drives termination
        response = client.messages.create(
            model="claude-sonnet-4-6", max_tokens=4096,
            tools=tools, messages=messages
        )
        # Always runs exactly 5 times regardless of stop_reason
        if i == 4:
            return response.content


# ✅ CORRECT: stop_reason drives termination, cap is a safety net
def good_capped_loop(messages, tools, max_iterations=25):
    for i in range(max_iterations):
        response = client.messages.create(
            model="claude-sonnet-4-6", max_tokens=4096,
            tools=tools, messages=messages
        )
        # Primary: stop_reason
        if response.stop_reason == "end_turn":
            return response.content
        if response.stop_reason == "tool_use":
            # Append results and continue...
            messages.append({"role": "assistant", "content": response.content})
            tool_results = process_tools(response.content)
            messages.append({"role": "user", "content": tool_results})
    # Safety net only: should rarely trigger
    raise RuntimeError(f"Agent exceeded {max_iterations} iterations without completing")

# Demo: correct approach terminates on stop_reason
messages = [{"role": "user", "content": "What is 2+2?"}]
result = good_capped_loop(messages, tools)
print(f"Result: {result[0].text[:100]}...")
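Both exit paths can be exercised without network access by injecting the function that produces responses. This is a testing sketch, not an SDK feature: `run_loop`, `scripted`, and the `SimpleNamespace` responses are hypothetical stand-ins for `client.messages.create` and its return value.

```python
from types import SimpleNamespace


def run_loop(create, messages, max_iterations=25):
    """Same control flow as above, with the API call passed in as `create`."""
    for _ in range(max_iterations):
        response = create(messages=messages)
        if response.stop_reason == "end_turn":
            return response.content  # normal termination
        if response.stop_reason == "tool_use":
            messages.append({"role": "assistant", "content": response.content})
            messages.append({"role": "user", "content": [
                {"type": "tool_result", "tool_use_id": b.id, "content": "ok"}
                for b in response.content if b.type == "tool_use"
            ]})
            continue
        raise RuntimeError(f"Unhandled stop_reason: {response.stop_reason}")
    # Safety net: only reached if the model never stops asking for tools
    raise RuntimeError(f"Agent exceeded {max_iterations} iterations without completing")


def tool_use_response(n):
    """Fake response that requests one tool call."""
    block = SimpleNamespace(type="tool_use", id=f"toolu_{n}", name="search", input={"q": "x"})
    return SimpleNamespace(stop_reason="tool_use", content=[block])


def scripted(responses):
    """Return a fake `create` that replays the given responses in order."""
    iterator = iter(responses)
    return lambda **kwargs: next(iterator)


final = SimpleNamespace(stop_reason="end_turn", content=[SimpleNamespace(type="text", text="4")])
history = [{"role": "user", "content": "What is 2+2?"}]
content = run_loop(scripted([tool_use_response(1), final]), history)
print("answer:", content[0].text, "| messages in history:", len(history))
```

A fake that always returns a tool_use response triggers the safety net, which is a convenient way to confirm that the cap works without ever relying on it for normal termination.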
3.3 Checking for Text Content as Completion
A response can contain both text and tool_use blocks simultaneously. The presence of text does not mean the agent is done — only stop_reason == "end_turn" signals completion:
import anthropic
client = anthropic.Anthropic()
# ❌ ANTI-PATTERN: Assuming text means "done"
def bad_text_check(response):
    for block in response.content:
        if block.type == "text":
            return block.text  # WRONG: might also have tool_use blocks!


# ✅ CORRECT: A response can mix text and tool_use
def correct_handling(response):
    if response.stop_reason == "end_turn":
        # Now it's safe to extract text as the final answer
        return "\n".join(b.text for b in response.content if b.type == "text")
    elif response.stop_reason == "tool_use":
        # Even if there's text, we still need to execute tools
        # Example: "I'll check the weather for you." + ToolUseBlock
        return None  # Signal to continue loop

# Demo: show correct handling
response = client.messages.create(
model="claude-sonnet-4-6", max_tokens=256,
messages=[{"role": "user", "content": "Hello!"}]
)
print(f"stop_reason: {response.stop_reason}")
print(f"correct_handling result: {correct_handling(response)[:80]}...")
Exam Question Pattern: Loop Termination
The CCA exam presents scenarios where an agent behaves incorrectly (e.g., stopping too early, running forever) and asks you to identify the root cause. The correct answer almost always involves checking stop_reason rather than parsing text, counting iterations, or checking message content.
4. Task Decomposition Strategies
Complex tasks require breaking work into manageable sub-tasks. The CCA exam (Task 1.6) tests your ability to choose between two decomposition patterns: prompt chaining (fixed sequential) and dynamic adaptive decomposition (model-driven).
4.1 Prompt Chaining (Fixed Sequential Pipelines)
Prompt chaining breaks a complex task into a fixed sequence of focused steps. Each step has a clear input and output, and the output of one step feeds into the next. This is ideal for predictable, multi-aspect tasks like code reviews or document processing.
import anthropic
from dataclasses import dataclass
client = anthropic.Anthropic()
@dataclass
class DiffFile:
    name: str
    content: str


def parse_diff_to_files(diff: str) -> list:
    """Parse a unified diff into individual files."""
    # Simplified parser for demonstration
    return [DiffFile(name="app.py", content=diff)]


def review_pull_request(diff: str) -> dict:
    """Prompt chaining: multi-pass code review (CCA Task 1.6)."""
    # Pass 1: Per-file local analysis
    file_reviews = []
    files = parse_diff_to_files(diff)
    for file in files:
        response = client.messages.create(
            model="claude-sonnet-4-6",
            max_tokens=2048,
            system="You are a code reviewer. Analyze this single file for bugs, security issues, and logic errors. Ignore style issues.",
            messages=[{"role": "user", "content": f"Review this file:\n\n{file.content}"}]
        )
        file_reviews.append({
            "file": file.name,
            "findings": response.content[0].text
        })
    # Pass 2: Cross-file integration analysis
    combined = "\n\n".join(f"## {r['file']}\n{r['findings']}" for r in file_reviews)
    integration_response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=2048,
        system="You are a senior architect. Given per-file review findings, identify cross-file issues: data flow bugs, inconsistent error handling, missing integration points.",
        messages=[{"role": "user", "content": f"Per-file findings:\n\n{combined}"}]
    )
    return {
        "file_reviews": file_reviews,
        "integration_review": integration_response.content[0].text
    }

# Demo: review a sample diff
sample_diff = "def get_user(id):\n return db.query(f'SELECT * FROM users WHERE id={id}')"
result = review_pull_request(sample_diff)
print(f"Files reviewed: {len(result['file_reviews'])}")
print(f"Integration review: {result['integration_review'][:120]}...")
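Stripped of the API calls, prompt chaining is function composition over a fixed list of steps. The sketch below shows that skeleton with plain functions standing in for the model calls; `run_chain`, `summarize`, and `critique` are hypothetical names.

```python
from typing import Callable


def run_chain(steps: list[Callable[[str], str]], initial: str) -> list[str]:
    """Run steps in a fixed order, feeding each output into the next step."""
    outputs = []
    current = initial
    for step in steps:
        current = step(current)
        outputs.append(current)  # keep intermediate results for inspection or retry
    return outputs


# Stand-ins for focused model calls
def summarize(text: str) -> str:
    return f"summary({text})"


def critique(text: str) -> str:
    return f"critique({text})"


outputs = run_chain([summarize, critique], "diff")
print(outputs)
```

Because the sequence is fixed, each step can be tested, retried, or swapped on its own, which is the main operational advantage over a fully model-driven loop.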
4.2 Dynamic Adaptive Decomposition
For open-ended tasks where the structure isn’t known in advance, let the agent dynamically generate subtasks based on what it discovers. The agent creates an investigation plan, executes it, and adapts as new information emerges:
import anthropic
import json
client = anthropic.Anthropic()
def execute_tool(name: str, input_data: dict) -> dict:
    """Stub tool executor; replace with real implementations."""
    print(f" [Executing tool: {name} with input: {input_data}]")
    return {"status": "success", "result": f"Mock result for {name}"}


def run_agent(user_message: str, tools: list, system: str = "") -> str:
    """Agentic loop: continues until Claude sets stop_reason to end_turn."""
    messages = [{"role": "user", "content": user_message}]
    while True:
        response = client.messages.create(
            model="claude-sonnet-4-6", max_tokens=4096,
            system=system, tools=tools, messages=messages
        )
        if response.stop_reason == "end_turn":
            return "\n".join(b.text for b in response.content if b.type == "text")
        if response.stop_reason == "tool_use":
            messages.append({"role": "assistant", "content": response.content})
            tool_results = []
            for block in response.content:
                if block.type == "tool_use":
                    result = execute_tool(block.name, block.input)
                    tool_results.append({"type": "tool_result", "tool_use_id": block.id, "content": json.dumps(result)})
            messages.append({"role": "user", "content": tool_results})
            continue
        # Avoid looping forever on max_tokens or any other stop_reason
        raise RuntimeError(f"Unhandled stop_reason: {response.stop_reason}")


def investigate_codebase(objective: str, tools: list) -> str:
    """Dynamic adaptive decomposition: agent creates its own plan (CCA Task 1.6)."""
    system = """You are a senior engineer investigating a codebase.
Your approach:
1. First, understand the high-level structure (search for entry points, config files)
2. Based on what you find, form specific hypotheses about how the system works
3. Investigate each hypothesis using targeted searches and file reads
4. Adapt your plan based on discoveries: if you find unexpected patterns, investigate them
5. When you have sufficient understanding, synthesize your findings
Think step-by-step. Use tools to gather evidence. Don't guess; verify."""
    # The agent drives its own decomposition through the agentic loop
    return run_agent(user_message=objective, tools=tools, system=system)

# Codebase investigation tools
codebase_tools = [
{"name": "search_code", "description": "Search codebase by query.", "input_schema": {"type": "object", "properties": {"query": {"type": "string"}}, "required": ["query"]}},
{"name": "read_file", "description": "Read a file's contents.", "input_schema": {"type": "object", "properties": {"path": {"type": "string"}}, "required": ["path"]}}
]
# Example: open-ended investigation
# The agent will autonomously decide:
# 1. Search for entry points → discovers it's a FastAPI app
# 2. Read main.py → finds router imports
# 3. Search for database connections → finds SQLAlchemy models
# 4. Trace the refund flow → discovers a missing validation
# This order emerges from Claude's reasoning, not pre-coded steps
result = investigate_codebase(
objective="Map the refund processing flow and identify potential issues.",
tools=codebase_tools
)
print(result)
4.3 Choosing the Right Pattern
The CCA exam tests your judgment in selecting the appropriate decomposition strategy. Use this decision framework:
| Characteristic | Prompt Chaining | Dynamic Adaptive |
|---|---|---|
| Task structure | Known in advance | Discovered during execution |
| Number of steps | Fixed | Variable |
| Example | Code review, document extraction | Debugging, codebase exploration |
| Predictability | High — same steps every time | Low — adapts to findings |
| Error handling | Retry individual steps | Agent self-corrects |
| CCA scenario | S5: CI/CD review (per-file + integration) | S4: Codebase exploration |
5. Case Study: Customer Support Agent (CCA Scenario 1)
The CCA’s Scenario 1 presents a customer support agent with access to get_customer, lookup_order, process_refund, and escalate_to_human. Here is a reference implementation using the agentic loop pattern (tool execution is stubbed; replace execute_tool with real integrations before production use):
import anthropic
import json
client = anthropic.Anthropic()
def execute_tool(name: str, input_data: dict) -> dict:
    """Stub tool executor; replace with real implementations."""
    print(f" [Executing tool: {name} with input: {input_data}]")
    return {"status": "success", "result": f"Mock result for {name}"}


def run_agent(user_message: str, tools: list, system: str = "") -> str:
    """Agentic loop: continues until Claude sets stop_reason to end_turn."""
    messages = [{"role": "user", "content": user_message}]
    while True:
        response = client.messages.create(
            model="claude-sonnet-4-6", max_tokens=4096,
            system=system, tools=tools, messages=messages
        )
        if response.stop_reason == "end_turn":
            return "\n".join(b.text for b in response.content if b.type == "text")
        if response.stop_reason == "tool_use":
            messages.append({"role": "assistant", "content": response.content})
            tool_results = []
            for block in response.content:
                if block.type == "tool_use":
                    result = execute_tool(block.name, block.input)
                    tool_results.append({"type": "tool_result", "tool_use_id": block.id, "content": json.dumps(result)})
            messages.append({"role": "user", "content": tool_results})
            continue
        # Avoid looping forever on max_tokens or any other stop_reason
        raise RuntimeError(f"Unhandled stop_reason: {response.stop_reason}")

SUPPORT_TOOLS = [
{
"name": "get_customer",
"description": "Look up customer by email or phone. Returns: customer_id, name, plan, account_status, verified (bool). MUST be called before any order or refund operations.",
"input_schema": {
"type": "object",
"properties": {
"identifier": {"type": "string", "description": "Customer email or phone number"}
},
"required": ["identifier"]
}
},
{
"name": "lookup_order",
"description": "Look up order details by order ID. Returns: order_id, items, total, status, shipped_date, delivery_date. Requires a verified customer_id from get_customer.",
"input_schema": {
"type": "object",
"properties": {
"order_id": {"type": "string"},
"customer_id": {"type": "string", "description": "Verified customer ID from get_customer"}
},
"required": ["order_id", "customer_id"]
}
},
{
"name": "process_refund",
"description": "Process a refund for an order. Only call after verifying the customer and order details. Refunds over $500 require human approval.",
"input_schema": {
"type": "object",
"properties": {
"order_id": {"type": "string"},
"customer_id": {"type": "string"},
"amount": {"type": "number"},
"reason": {"type": "string"}
},
"required": ["order_id", "customer_id", "amount", "reason"]
}
},
{
"name": "escalate_to_human",
"description": "Escalate to a human agent. Use when: customer explicitly requests a human, policy is unclear, or you cannot resolve the issue after investigation.",
"input_schema": {
"type": "object",
"properties": {
"customer_id": {"type": "string"},
"reason": {"type": "string"},
"summary": {"type": "string", "description": "Brief summary for the human agent"}
},
"required": ["customer_id", "reason", "summary"]
}
}
]
SUPPORT_SYSTEM = """You are a customer support agent for an e-commerce platform.
Your goals:
- Resolve customer issues with 80%+ first-contact resolution
- Always verify the customer (get_customer) before any order or refund operations
- Escalate when: customer requests a human, policy is ambiguous, or refund exceeds $500
Workflow:
1. Greet the customer and identify their issue
2. Verify their identity with get_customer
3. Investigate using available tools
4. Resolve or escalate appropriately
Be empathetic but efficient. Never make assumptions about order details — always look them up."""
def handle_support_request(customer_message: str) -> str:
    """Run the support agent loop."""
    return run_agent(
        user_message=customer_message,
        tools=SUPPORT_TOOLS,
        system=SUPPORT_SYSTEM
    )

# The agent autonomously: verifies customer → looks up order → processes refund
result = handle_support_request(
"Hi, I'm jane@example.com. Order #ORD-789 arrived damaged. I'd like a refund."
)
print(result)
Expected behavior: the agent should (1) call get_customer first because the system prompt says to verify identity, (2) call lookup_order to confirm order details, and then (3) take one of three paths: call process_refund if it has sufficient information, call escalate_to_human if the amount exceeds $500, or ask a clarifying question if the tool results lack details (e.g., which items were damaged, the refund amount). The order emerges from reasoning, not hard-coding. Importantly, the agent may add conversational turns before taking irreversible actions like issuing refunds.
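The $500 rule above lives only in the prompt and the tool description, so the model could still attempt a larger refund. One option, sketched here as an assumption rather than as part of the scenario, is to enforce the limit in the tool executor as well. `REFUND_APPROVAL_LIMIT`, `guard_tool_call`, and `execute_tool_guarded` are hypothetical names.

```python
from typing import Callable, Optional

REFUND_APPROVAL_LIMIT = 500.00  # assumed to mirror the policy in the tool description


def guard_tool_call(name: str, input_data: dict) -> Optional[dict]:
    """Return a blocking tool result, or None if the call may proceed."""
    if name == "process_refund" and input_data.get("amount", 0) > REFUND_APPROVAL_LIMIT:
        return {
            "status": "blocked",
            "reason": f"Refunds over ${REFUND_APPROVAL_LIMIT:.2f} require human approval",
            "next_step": "call escalate_to_human",
        }
    return None


def execute_tool_guarded(name: str, input_data: dict, execute: Callable[[str, dict], dict]) -> dict:
    """Apply the guard first; only run the real executor when nothing blocks the call."""
    blocked = guard_tool_call(name, input_data)
    if blocked is not None:
        return blocked
    return execute(name, input_data)


def fake_execute(name: str, input_data: dict) -> dict:
    # Stand-in for the real tool executor
    return {"status": "success", "tool": name}


small = {"order_id": "ORD-789", "customer_id": "C-1", "amount": 120.0, "reason": "damaged"}
large = {"order_id": "ORD-790", "customer_id": "C-1", "amount": 900.0, "reason": "damaged"}
print(execute_tool_guarded("process_refund", small, fake_execute))
print(execute_tool_guarded("process_refund", large, fake_execute))
```

The blocked result is returned to Claude as an ordinary tool result, so the agent can read the reason and choose escalation on its next turn.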
Exercise: build an agent with two tools, web_search and calculate_market_cap. The agent should search for the company, find its stock price and shares outstanding, calculate the market cap, and return a structured summary. Test with at least 3 different companies.
6. Ticket Routing Pattern (CCA 1.2)
Ticket routing is one of the most common production patterns: a user message comes in, and your system must classify it and send it to the right handler, such as billing, technical, account, or sales, with a fallback to human triage. This is a single-turn classification task (not an agentic loop), which makes it fast and cheap.
Analogy: Think of ticket routing like a hospital triage desk. The patient describes their symptoms (user message), the triage nurse (Claude) classifies urgency and department (structured output), and the patient is sent to the right specialist (downstream handler).
import anthropic
import json
client = anthropic.Anthropic()
# Ticket Routing — Single-turn classification + routing
# Key features: tool_choice forced, confidence threshold, fallback handling
routing_tool = {
    "name": "route_ticket",
    "description": "Classify and route a support ticket to the appropriate team.",
    "input_schema": {
        "type": "object",
        "properties": {
            "intent": {
                "type": "string",
                "enum": ["billing", "technical", "account", "sales", "other"],
                "description": "Primary intent category"
            },
            "sub_intent": {
                "type": "string",
                "description": "Specific sub-category (e.g., 'refund_request', 'password_reset')"
            },
            "priority": {
                "type": "string",
                "enum": ["low", "medium", "high", "urgent"]
            },
            "confidence": {
                "type": "number",
                "minimum": 0,
                "maximum": 1,
                "description": "Classification confidence. Below 0.7 = route to human triage."
            },
            "reasoning": {
                "type": "string",
                "description": "Brief explanation of classification decision"
            }
        },
        "required": ["intent", "sub_intent", "priority", "confidence", "reasoning"]
    }
}
def route_ticket(message: str, confidence_threshold: float = 0.7) -> dict:
    """Classify and route a ticket. Falls back to human triage if uncertain."""
    response = client.messages.create(
        model="claude-haiku-4-5",  # Fast + cheap for classification
        max_tokens=200,
        temperature=0,  # Reduces run-to-run variation (not a strict determinism guarantee)
        tools=[routing_tool],
        tool_choice={"type": "tool", "name": "route_ticket"},  # FORCED
        system=(
            "Classify this support ticket. Be conservative with confidence: "
            "if the message is ambiguous or could fit multiple categories, "
            "set confidence below 0.7 to trigger human triage."
        ),
        messages=[{"role": "user", "content": message}]
    )
    # With tool_choice forced, the response carries a route_ticket tool_use block
    result = next(b for b in response.content if b.type == "tool_use")
    classification = dict(result.input)  # Copy so the response object is not mutated
    # Confidence-based routing
    if classification["confidence"] < confidence_threshold:
        classification["routed_to"] = "human_triage"
        classification["reason"] = "Below confidence threshold"
    else:
        classification["routed_to"] = classification["intent"] + "_team"
    return classification
# Test routing
tickets = [
    "I was charged twice for my subscription last month",
    "My API calls are returning 500 errors since this morning",
    "I'm considering upgrading but also have a billing question",  # Ambiguous!
]
for ticket in tickets:
    result = route_ticket(ticket)
    print(f"'{ticket[:50]}...' \u2192 {result['routed_to']} (conf: {result['confidence']})")
Combine a forced tool_choice, temperature 0, and a confidence threshold for reliable routing. Messages below the confidence threshold go to human triage, which prevents miscategorization on ambiguous inputs. Use Claude Haiku for routing (fast, cheap, sufficient accuracy).
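The routing decision itself can be separated from the API call, which makes it easy to unit-test without network access. The following is a minimal sketch under that assumption; `decide_route` and `VALID_INTENTS` are illustrative names introduced here, not part of the Anthropic SDK, and the input is assumed to have the shape defined by the route_ticket schema.

```python
VALID_INTENTS = {"billing", "technical", "account", "sales", "other"}


def decide_route(classification: dict, confidence_threshold: float = 0.7) -> dict:
    """Return a copy of the classification with routing fields added."""
    routed = dict(classification)  # avoid mutating the caller's dict
    intent = routed.get("intent")
    confidence = routed.get("confidence")

    # Unknown intent or missing/non-numeric confidence: send to a human.
    if intent not in VALID_INTENTS or not isinstance(confidence, (int, float)):
        routed["routed_to"] = "human_triage"
        routed["reason"] = "Malformed classification"
    elif confidence < confidence_threshold:
        routed["routed_to"] = "human_triage"
        routed["reason"] = "Below confidence threshold"
    else:
        routed["routed_to"] = f"{intent}_team"
    return routed


example = decide_route({"intent": "billing", "confidence": 0.92})
print(example["routed_to"])
```

Keeping this step pure also means the threshold can be tuned against logged classifications without re-calling the model.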
7. Choosing the Right Pattern (CCA 1.1)
Not every task needs an agentic loop. The CCA exam tests your ability to choose the simplest pattern that solves the problem. Here’s the decision framework:
| Pattern | When to Use | Example | Cost |
|---|---|---|---|
| Single-turn (direct API) | Simple classification, extraction, Q&A | Ticket routing, sentiment analysis | Lowest |
| Prompt chain (fixed steps) | Multi-step but predictable sequence | Summarize → Translate → Format | Low |
| Single agent (tool loop) | Dynamic decisions, tool usage needed | Customer support, code generation | Medium |
| Multi-agent (coordinator + specialists) | Complex tasks requiring diverse expertise | Research pipelines, code review systems | High |
import re
# Decision framework for choosing the right pattern (keyword heuristic, for illustration only):
def mentions(text: str, phrases: list) -> bool:
    """True if any phrase appears as whole words, so 'if' does not match inside 'classify'."""
    return any(re.search(rf"\b{re.escape(p)}\b", text) for p in phrases)
def choose_pattern(task_description: str) -> str:
    """Guide: which pattern fits this task?"""
    text = task_description.lower()
    # Does the task require tools? (database, API, file access)
    needs_tools = mentions(text, ["look up", "lookup", "lookups", "search", "check", "query", "create", "update"])
    # Does the task require multiple steps with branching?
    needs_branching = mentions(text, ["if", "depending on", "based on", "decide"])
    # Does the task require multiple areas of expertise?
    needs_specialists = mentions(text, ["research", "analyze code", "review", "multiple"])
    # Is the task a fixed, predictable sequence of steps?
    has_fixed_steps = mentions(text, ["then", "followed by"])
    if needs_specialists:
        return "multi_agent"  # Coordinator + specialist subagents
    if needs_tools or needs_branching:
        return "single_agent"  # One agent with tool loop
    if has_fixed_steps:
        return "prompt_chain"  # Fixed sequence of API calls
    return "single_turn"  # Direct API call with tool_choice forced
# Examples:
tasks = [
    "Classify this ticket into billing/technical/sales",
    "Summarize this document, then translate to Spanish",
    "Help this customer resolve their billing issue (may need lookups)",
    "Research competitors, analyze their code, and write a comparison report"
]
for task in tasks:
    pattern = choose_pattern(task)
    print(f" {pattern:15} \u2190 {task[:60]}")
8. Agent SDK: The Production Loop
Everything in Sections 1–7 teaches you how agentic loops work under the hood. In production, you don’t build this machinery yourself. The Claude Agent SDK (claude_agent_sdk for Python, @anthropic-ai/claude-agent-sdk for TypeScript) provides the same agentic loop that powers Claude Code — with built-in tool execution, context management, automatic retries, and parallel tool calls — all behind a single query() function.
anthropic SDK is like writing HTTP requests with socket — educational and gives full control. The Agent SDK is like using requests — it handles connection pooling, retries, redirects, and encoding for you. Both talk to the same server; the Agent SDK just removes the boilerplate.
8.1 The query() Entry Point
The Agent SDK’s query() function replaces the entire while True loop you built in Section 1. You give it a prompt and options; it runs the full agentic loop internally — calling tools, processing results, and repeating until done — and streams messages back to you as they happen:
# Agent SDK Agentic Loop — Production Equivalent of Section 1
# Requires: pip install claude-agent-sdk
# Set env var: ANTHROPIC_API_KEY=sk-ant-...
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions, AssistantMessage, ResultMessage
async def main():
    """The entire agentic loop from Section 1 in 15 lines."""
    async for message in query(
        prompt="What files are in this directory? Summarize the project structure.",
        options=ClaudeAgentOptions(
            # These tools are BUILT-IN: no implementation needed
            allowed_tools=["Bash", "Glob", "Read"],
            # acceptEdits auto-approves file edits for this session
            permission_mode="acceptEdits",
        ),
    ):
        # Print Claude's reasoning as it works
        if isinstance(message, AssistantMessage):
            for block in message.content:
                if hasattr(block, "text"):
                    print(block.text)
                elif hasattr(block, "name"):
                    print(f" [Tool: {block.name}]")
        # Final result with cost and session info
        if isinstance(message, ResultMessage):
            if message.subtype == "success":
                print(f"\nDone: {message.result}")
            # total_cost_usd can be None, so guard before formatting
            if message.total_cost_usd is not None:
                print(f"Cost: ${message.total_cost_usd:.4f}")
            print(f"Turns: {message.num_turns}")
asyncio.run(main())
Compare this to the manual loop from Section 1.2: no while True, no execute_tool() function, no message array management, no stop_reason checking. The SDK handles all of that internally — running the exact same loop pattern we built manually, but with production-grade error handling, parallel tool execution, and automatic context management.
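To make one of those internals concrete, here is a minimal sketch of what running several tool calls from a single response concurrently looks like when done by hand in the raw loop. It makes no claim about how the SDK implements this. `execute_tool_async` is a hypothetical stand-in for real tool I/O, and the tool calls are plain dicts standing in for tool_use blocks.

```python
import asyncio
import json


async def execute_tool_async(name: str, input_data: dict) -> dict:
    """Hypothetical async tool executor; replace with real I/O."""
    await asyncio.sleep(0.01)  # stands in for a network or disk call
    return {"tool": name, "echo": input_data}


async def run_tools_concurrently(tool_calls: list) -> list:
    """Run all requested tools at once; return tool_result blocks in request order."""
    outputs = await asyncio.gather(
        *(execute_tool_async(c["name"], c["input"]) for c in tool_calls),
        return_exceptions=True,  # one failing tool should not cancel the others
    )
    results = []
    for call, output in zip(tool_calls, outputs):
        failed = isinstance(output, Exception)
        results.append({
            "type": "tool_result",
            "tool_use_id": call["id"],
            "content": str(output) if failed else json.dumps(output),
            "is_error": failed,
        })
    return results


calls = [
    {"id": "toolu_01", "name": "get_weather", "input": {"city": "London"}},
    {"id": "toolu_02", "name": "get_weather", "input": {"city": "Paris"}},
]
results = asyncio.run(run_tools_concurrently(calls))
print([r["tool_use_id"] for r in results])
```

`asyncio.gather` preserves the order of its arguments, so each result lines up with the tool_use_id that requested it even when the calls finish out of order.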
8.2 Message Types & Lifecycle
As the SDK loop runs, it yields a stream of typed messages. These correspond directly to the loop stages from Section 1.1:
sequenceDiagram
participant App as Your Application
participant SDK as Agent SDK
participant API as Claude API
SDK->>App: SystemMessage (subtype: "init")
Note over App: Session metadata, tools loaded
loop Each Turn
SDK->>API: Send prompt + tools + history
API->>SDK: Response with tool calls
SDK->>App: AssistantMessage (text + tool calls)
SDK->>SDK: Execute tools (built-in)
SDK->>App: UserMessage (tool results)
end
SDK->>App: AssistantMessage (final, no tool calls)
SDK->>App: ResultMessage (success/error + cost)
The five message types map to specific loop stages:
| Message Type | When Emitted | What It Contains | Raw SDK Equivalent |
|---|---|---|---|
| SystemMessage | Session start | Session ID, metadata, tools loaded | N/A (you managed this yourself) |
| AssistantMessage | Each Claude response | Text blocks + tool call blocks | response.content from client.messages.create() |
| UserMessage | After tool execution | Tool results fed back to Claude | Your tool_results append in the loop |
| StreamEvent | Real-time (if enabled) | Text deltas, tool input chunks | SSE events from streaming API |
| ResultMessage | Loop ends | Final text, cost, turns, session ID | Your return statement when stop_reason == "end_turn" |
# Handling All Message Types — Complete Pattern
# Requires: pip install claude-agent-sdk
import asyncio
from claude_agent_sdk import (
query, ClaudeAgentOptions,
SystemMessage, AssistantMessage, UserMessage, ResultMessage
)
async def main():
    """Demonstrate the message types emitted by the SDK loop."""
    async for message in query(
        prompt="Find all Python files in this directory and count the total lines of code.",
        options=ClaudeAgentOptions(
            allowed_tools=["Bash", "Glob", "Grep", "Read"],
            permission_mode="acceptEdits",
        ),
    ):
        # 1. SystemMessage: session lifecycle (init, compaction)
        if isinstance(message, SystemMessage):
            if message.subtype == "init":
                print(f"[Session started: {message.data.get('session_id', 'N/A')}]")
            elif message.subtype == "compact_boundary":
                print("[Context compacted: older messages summarized]")
        # 2. AssistantMessage: Claude's response each turn
        elif isinstance(message, AssistantMessage):
            for block in message.content:
                if hasattr(block, "text") and block.text:
                    print(f"Claude: {block.text[:120]}...")
                elif hasattr(block, "name"):
                    print(f" → Calling tool: {block.name}")
        # 3. UserMessage: tool results (automatic, usually skip)
        elif isinstance(message, UserMessage):
            pass  # SDK handles tool results internally
        # 4. ResultMessage: loop complete
        elif isinstance(message, ResultMessage):
            print(f"\n{'='*50}")
            print(f"Status: {message.subtype}")
            if message.subtype == "success":
                print(f"Result: {message.result[:200]}")
            if message.total_cost_usd is not None:
                print(f"Cost: ${message.total_cost_usd:.4f}")
            print(f"Turns: {message.num_turns}")
            print(f"Session: {message.session_id}")
asyncio.run(main())
8.3 Result Handling & Error Subtypes
In the raw approach (Section 3.2), you had to manually handle max_tokens truncation and add safety caps. The Agent SDK encodes all termination states in the ResultMessage.subtype field:
| Subtype | Meaning | Has .result? | Action |
|---|---|---|---|
| success | Task completed normally | Yes | Use the result |
| error_max_turns | Hit max_turns limit | No | Resume session with higher limit |
| error_max_budget_usd | Hit cost ceiling | No | Resume or report to user |
| error_during_execution | API failure or cancelled | No | Retry or investigate |
# Production Result Handling — All Error Subtypes
# Requires: pip install claude-agent-sdk
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions, ResultMessage
async def run_with_safety(prompt: str, max_turns: int = 30, max_budget: float = 0.50):
    """Run an agent with budget and turn limits, handle all outcomes."""
    session_id = None
    async for message in query(
        prompt=prompt,
        options=ClaudeAgentOptions(
            allowed_tools=["Read", "Glob", "Grep", "Bash"],
            permission_mode="acceptEdits",
            max_turns=max_turns,  # Safety cap (like Section 3.2)
            max_budget_usd=max_budget,  # Cost ceiling
        ),
    ):
        if isinstance(message, ResultMessage):
            session_id = message.session_id
            if message.subtype == "success":
                print(f"Done: {message.result[:200]}")
            elif message.subtype == "error_max_turns":
                # Agent ran out of turns: resume with higher limit
                print(f"Hit turn limit ({max_turns}). Session: {session_id}")
                print("Resume this session with a higher max_turns to continue.")
            elif message.subtype == "error_max_budget_usd":
                print(f"Hit budget limit (${max_budget}). Session: {session_id}")
            elif message.subtype == "error_during_execution":
                print(f"Execution error. Session: {session_id}")
            # Cost is available on ALL result subtypes
            if message.total_cost_usd is not None:
                print(f"Total cost: ${message.total_cost_usd:.4f}")
            print(f"Turns used: {message.num_turns}")
    return session_id
# Run with safety limits
asyncio.run(run_with_safety(
    prompt="Analyze the test coverage in this project and suggest improvements.",
    max_turns=15,
    max_budget=0.25
))
max_tokens truncation. The Agent SDK provides max_turns and max_budget_usd as first-class options with proper error subtypes. Always check message.subtype before reading message.result — only the "success" subtype has it.
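The subtype table above can also be expressed as a small lookup, which keeps the "check subtype before reading result" rule in one place. This is a sketch only: `FakeResult` is a hypothetical stand-in that models just the fields used here, and the action names are illustrative labels, not SDK values.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeResult:
    """Stand-in for ResultMessage with only the fields used in this sketch."""
    subtype: str
    result: Optional[str] = None
    session_id: Optional[str] = None


# Follow-up action per subtype, taken from the table above.
ACTIONS = {
    "success": "use_result",
    "error_max_turns": "resume_with_higher_limit",
    "error_max_budget_usd": "resume_or_report",
    "error_during_execution": "retry_or_investigate",
}


def summarize_outcome(msg: FakeResult) -> dict:
    """Decide the follow-up action; only read .result on success."""
    action = ACTIONS.get(msg.subtype, "investigate")  # unknown subtype: be cautious
    text = msg.result if msg.subtype == "success" else None
    return {"action": action, "text": text, "session_id": msg.session_id}


outcome = summarize_outcome(FakeResult("error_max_turns", session_id="abc"))
print(outcome["action"])
```

Falling back to a cautious default for unrecognized subtypes means the caller keeps working if new termination states are added later.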
8.4 Effort Levels
The effort option controls how deeply Claude reasons on each turn. Lower effort = fewer tokens = less cost. This maps directly to the “Choosing the Right Pattern” framework from Section 7 — simple tasks don’t need deep reasoning:
| Effort | Reasoning Depth | Use Case | Cost |
|---|---|---|---|
| "low" | Minimal | File lookups, listing directories, simple classification | Lowest |
| "medium" | Balanced | Routine edits, standard tasks, summarization | Low |
| "high" | Thorough | Code review, refactoring, debugging | Medium |
| "xhigh" | Deep analysis | Complex agentic tasks (recommended on Opus 4.7) | High |
| "max" | Maximum depth | Multi-step problems requiring deep reasoning | Highest |
# Effort Levels — Match reasoning depth to task complexity
# Requires: pip install claude-agent-sdk
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions, ResultMessage
async def run_with_effort(prompt: str, effort: str):
    """Demonstrate effort level impact on cost and behavior."""
    async for message in query(
        prompt=prompt,
        options=ClaudeAgentOptions(
            allowed_tools=["Glob", "Read"],
            permission_mode="acceptEdits",
            effort=effort,  # Controls reasoning depth per turn
        ),
    ):
        if isinstance(message, ResultMessage) and message.subtype == "success":
            cost = message.total_cost_usd or 0
            print(f" effort={effort:8} | cost=${cost:.4f} | turns={message.num_turns}")
            print(f" result: {message.result[:80]}...")
async def main():
    # Same task at two effort levels: compare the reported cost and turn count
    prompt = "List the Python files in this directory."
    print("Low effort (simple file listing):")
    await run_with_effort(prompt, "low")
    print("\nHigh effort (same task, more reasoning tokens spent):")
    await run_with_effort(prompt, "high")
asyncio.run(main())
effort trades latency and token cost for reasoning depth. It is independent of Extended Thinking (Part 14). You can set effort: "low" with extended thinking enabled, or effort: "max" without it. Use lower effort for simple, well-scoped tasks to reduce cost.
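One way to apply this in practice is to pick the effort level from a coarse task category instead of hard-coding it per call. The sketch below assumes the categories from the table above; `EFFORT_BY_TASK` and `pick_effort` are hypothetical names, and the mapping is a starting point to tune, not a recommendation from the SDK.

```python
# Task categories and levels follow the "Use Case" column of the effort table.
EFFORT_BY_TASK = {
    "file_lookup": "low",
    "classification": "low",
    "summarization": "medium",
    "routine_edit": "medium",
    "code_review": "high",
    "debugging": "high",
}


def pick_effort(task_kind: str, default: str = "medium") -> str:
    """Return the effort level for a task category, or the default if unknown."""
    return EFFORT_BY_TASK.get(task_kind, default)


for kind in ["file_lookup", "code_review", "something_else"]:
    print(f"{kind:15} -> {pick_effort(kind)}")
```

The returned string can then be passed as the effort option shown in the example above.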
8.5 Raw Client SDK vs Agent SDK — Side-by-Side
Here’s the same task implemented both ways. The raw approach (Sections 1–7) requires ~40 lines of loop management. The Agent SDK requires ~15 lines focused on your actual logic:
# === RAW CLIENT SDK (from Sections 1–2) ===
# You build and manage the entire loop yourself
import anthropic
import json
client = anthropic.Anthropic()
def execute_tool(name: str, input_data: dict) -> dict:
    """You must implement EVERY tool yourself."""
    if name == "get_weather":
        return {"temp": "18°C", "condition": "cloudy"}
    return {"error": "unknown tool"}
def run_agent_raw(prompt: str) -> str:
    """Manual loop: you handle everything."""
    tools = [{
        "name": "get_weather",
        "description": "Get weather.",
        "input_schema": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    }]
    messages = [{"role": "user", "content": prompt}]
    for _ in range(25):  # Safety cap
        response = client.messages.create(
            model="claude-sonnet-4-6", max_tokens=4096,
            tools=tools, messages=messages
        )
        if response.stop_reason == "end_turn":
            return "\n".join(b.text for b in response.content if b.type == "text")
        if response.stop_reason != "tool_use":
            # e.g. max_tokens: stop instead of resending the same request until the cap
            return f"[stopped: {response.stop_reason}]"
        messages.append({"role": "assistant", "content": response.content})
        results = []
        for block in response.content:
            if block.type == "tool_use":
                result = execute_tool(block.name, block.input)
                results.append({"type": "tool_result", "tool_use_id": block.id, "content": json.dumps(result)})
        messages.append({"role": "user", "content": results})
    return "[max iterations]"
# Usage: raw approach
result = run_agent_raw("What's the weather in London?")
print(f"Raw result: {result}")
# === AGENT SDK (production approach) ===
# The SDK handles the loop, tools, retries, and context for you
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions, ResultMessage
async def run_agent_sdk(prompt: str) -> str:
    """SDK loop: tools are built-in, no implementation needed."""
    async for message in query(
        prompt=prompt,
        options=ClaudeAgentOptions(
            allowed_tools=["WebSearch"],  # Built-in web search
            permission_mode="acceptEdits",
            max_turns=25,  # Safety cap (same as raw)
            effort="medium",  # Balanced reasoning
        ),
    ):
        if isinstance(message, ResultMessage):
            if message.subtype == "success":
                return message.result
            return f"[{message.subtype}]"
    return "[no result]"
# Usage: SDK approach
result = asyncio.run(run_agent_sdk("What's the weather in London?"))
print(f"SDK result: {result}")
| Aspect | Raw Client SDK | Agent SDK |
|---|---|---|
| Loop management | You build while True + stop_reason checks | SDK runs internally |
| Tool execution | You implement execute_tool() | 15+ built-in tools (Read, Edit, Bash, WebSearch…) |
| Message array | You manage messages.append() | SDK tracks conversation automatically |
| Parallel tools | You implement asyncio.gather() | Automatic (read-only tools run concurrently) |
| Error handling | Manual retries + exception handling | Automatic retries + typed ResultMessage.subtype |
| Cost tracking | Sum response.usage yourself | ResultMessage.total_cost_usd |
| Context overflow | Manual summarization (Part 12) | Automatic compaction |
| Safety caps | Manual iteration counter | max_turns + max_budget_usd options |
| When to use | Custom tool implementations, learning, edge cases | Production agents, rapid prototyping, standard patterns |
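For the "Cost tracking" row, summing usage by hand in the raw approach might look like the sketch below. `Usage` is a stand-in that models only the input_tokens and output_tokens counts, cache-related token fields are ignored, and the prices are placeholder assumptions rather than real pricing.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    """Stand-in for response.usage; only the two basic token counts are modeled."""
    input_tokens: int
    output_tokens: int


# ASSUMPTION: placeholder prices in USD per million tokens, not real pricing.
INPUT_PRICE_PER_MTOK = 3.00
OUTPUT_PRICE_PER_MTOK = 15.00


def total_cost_usd(usages: list) -> float:
    """Sum token counts across all loop iterations and convert to dollars."""
    input_total = sum(u.input_tokens for u in usages)
    output_total = sum(u.output_tokens for u in usages)
    return (
        input_total * INPUT_PRICE_PER_MTOK + output_total * OUTPUT_PRICE_PER_MTOK
    ) / 1_000_000


# One Usage per loop iteration, as would be collected after each API call.
turns = [Usage(1200, 300), Usage(1800, 150)]
cost = total_cost_usd(turns)
print(f"Estimated cost: ${cost:.5f}")
```

In a real loop, each iteration would append its response.usage to the list before the next request is sent.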
create_sdk_mcp_server() (covered in Part 6).
Next in the SDK Track
In Part 4: Multi-Agent Orchestration, we’ll scale from single agents to multi-agent systems — hub-and-spoke coordinators, the Agent tool for subagent spawning, parallel execution, prerequisite gates, and structured handoffs. Covers CCA Domain 1 Tasks 1.2, 1.3, and 1.4.