tool_choice modes and confidence-based routing.
1. Use Cases Overview
The CCA exam organizes Claude applications into distinct use case families, each with preferred architectural patterns. Choosing the wrong pattern leads to over-engineering simple tasks or under-powering complex workflows. The key insight: start simple and escalate only when the task demands it.
1.1 Use Case Families
Anthropic groups production use cases into five families, each mapping naturally to specific API features:
| Family | Examples | Key API Features | Typical Pattern |
|---|---|---|---|
| Classification | Ticket routing, content moderation, sentiment | tool_use with forced tool, structured output | Direct API |
| Generation | Email drafts, code generation, summaries | System prompts, temperature control, streaming | Direct API |
| Extraction | Invoice parsing, entity recognition, data mining | Structured output, vision (PDFs), tool_use | Direct API |
| Conversation | Customer support, tutoring, sales assistants | Multi-turn, tools, memory management | Agentic Loop |
| Code | Code review, refactoring, test generation | Extended thinking, tools, file operations | Agentic / Multi-Agent |
1.2 Choosing the Right Pattern
The decision between direct API calls, an agentic loop, and a multi-agent system depends on three factors: task complexity, tool requirements, and coordination needs.
flowchart TD
A["New Use Case"] --> B{"Single API call sufficient?"}
B -->|"Yes"| C["Direct API Call"]
B -->|"No"| D{"Needs multiple steps or tools?"}
D -->|"Yes"| E{"Steps need different specializations?"}
D -->|"No"| C
E -->|"No"| F["Agentic Loop<br/>(single agent + tools)"]
E -->|"Yes"| G{"Parallel execution needed?"}
G -->|"No"| H["Pipeline<br/>(sequential agents)"]
G -->|"Yes"| I["Multi-Agent System<br/>(orchestrator + specialists)"]
style C fill:#3B9797,color:#fff
style F fill:#16476A,color:#fff
style H fill:#132440,color:#fff
style I fill:#BF092F,color:#fff
Direct API: Use when a single Claude call produces the complete answer. Classification, extraction, and simple generation tasks fit here. There is no loop, no tool-result round trip, and no state management; a forced tool call used only to obtain structured output still counts as a single direct call.
Agentic Loop: Use when the task requires multiple steps, tool calls, or iterative refinement, but a single agent can handle all steps. Customer support bots, research assistants, and code editors fit here.
Multi-Agent: Use when distinct specializations are needed (e.g., a router plus domain experts) or when parallel execution improves latency. This suits complex workflows where no single prompt can cover all domains.
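The decision flow above can be written down as a small function. This is only a sketch: the `UseCase` fields and the returned pattern names are hypothetical labels chosen here to mirror the flowchart's questions, not part of any SDK.

```python
from dataclasses import dataclass


@dataclass
class UseCase:
    """Hypothetical description of a use case, one field per flowchart question."""
    single_call_sufficient: bool
    needs_steps_or_tools: bool = False
    needs_specializations: bool = False
    needs_parallel: bool = False


def choose_pattern(uc: UseCase) -> str:
    """Walk the decision tree top to bottom and return a pattern name."""
    if uc.single_call_sufficient or not uc.needs_steps_or_tools:
        return "direct_api"
    if not uc.needs_specializations:
        return "agentic_loop"
    if not uc.needs_parallel:
        return "pipeline"
    return "multi_agent"


print(choose_pattern(UseCase(single_call_sufficient=True)))
print(choose_pattern(UseCase(False, needs_steps_or_tools=True)))
print(choose_pattern(UseCase(False, True, True, True)))
```

Because each check returns early, the simplest pattern that satisfies the task always wins, which matches the "start simple and escalate" guidance.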
1.3 Scope & Complexity Assessment
Before building, assess your use case on two axes:
| Complexity | Scope | Pattern | Example |
|---|---|---|---|
| Low | Single domain, no tools | Direct API | Classify email sentiment |
| Medium | Single domain, 2-5 tools | Agentic Loop | Customer support with FAQ + order lookup |
| High | Multi-domain, 5+ tools | Agentic Loop (advanced) | Research assistant with web + DB + files |
| Very High | Multi-domain, parallel, specialized | Multi-Agent | Code review pipeline (lint + security + style) |
import anthropic

# Pattern 1: Direct API, single-call classification
client = anthropic.Anthropic()

def classify_sentiment(text: str) -> str:
    """Direct API pattern: one call, one answer, no tools needed."""
    message = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=50,
        system="Classify the sentiment as positive, negative, or neutral. Reply with only the label.",
        messages=[{"role": "user", "content": text}]
    )
    return message.content[0].text.strip().lower()

# Test
result = classify_sentiment("I love this product! Best purchase ever.")
print(f"Sentiment: {result}")
# Output: Sentiment: positive
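A free-text reply is not guaranteed to be exactly one of the three labels, so it may help to normalize it before using it. The helper below is a minimal sketch; `normalize_label`, `VALID_LABELS`, and the `"unknown"` fallback are names assumed here for illustration.

```python
VALID_LABELS = {"positive", "negative", "neutral"}


def normalize_label(raw: str, fallback: str = "unknown") -> str:
    """Map a free-text model reply onto one of the expected labels."""
    cleaned = raw.strip().lower().strip(".!\"'")
    if cleaned in VALID_LABELS:
        return cleaned
    # Tolerate replies such as "Sentiment: positive" by scanning for a label word.
    words = {w.strip(".,:;!\"'") for w in cleaned.split()}
    matches = VALID_LABELS & words
    # Accept only an unambiguous match; otherwise return the fallback.
    return matches.pop() if len(matches) == 1 else fallback


print(normalize_label(" Positive.\n"))
print(normalize_label("Sentiment: Negative"))
print(normalize_label("positive or negative"))
```

Returning a fallback instead of raising lets the caller decide whether to retry or send the item for review.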
2. Ticket Routing with Intent Classification
Ticket routing is the canonical CCA exam use case for classification + tool_use + structured output. The pattern: Claude classifies user intent using a forced tool call that returns structured JSON, then your application routes based on the classification result.
2.1 Intent Classification with Forced Tool
The key technique is defining a classification tool and forcing Claude to use it via tool_choice. This guarantees a tool call whose input is structured JSON shaped by your schema, instead of hoping Claude follows formatting instructions in free text.
import anthropic
import json

client = anthropic.Anthropic()

# Define the classification tool
classify_ticket_tool = {
    "name": "classify_ticket",
    "description": "Classify a support ticket into a department with priority and confidence.",
    "input_schema": {
        "type": "object",
        "properties": {
            "department": {
                "type": "string",
                "enum": ["billing", "technical", "sales", "general"],
                "description": "The department best suited to handle this ticket"
            },
            "priority": {
                "type": "string",
                "enum": ["urgent", "high", "medium", "low"],
                "description": "Priority level based on issue severity and customer impact"
            },
            "confidence": {
                "type": "number",
                "minimum": 0.0,
                "maximum": 1.0,
                "description": "Confidence in the classification (0.0 to 1.0)"
            },
            "reasoning": {
                "type": "string",
                "description": "Brief explanation of why this classification was chosen"
            }
        },
        "required": ["department", "priority", "confidence", "reasoning"]
    }
}

# Force Claude to use the classification tool
message = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=300,
    system="You are a ticket routing system. Classify incoming support tickets accurately.",
    tools=[classify_ticket_tool],
    tool_choice={"type": "tool", "name": "classify_ticket"},
    messages=[{
        "role": "user",
        "content": "I've been charged twice for my subscription this month and I need a refund immediately!"
    }]
)

# Extract structured classification from tool_use block
tool_use_block = next(b for b in message.content if b.type == "tool_use")
classification = tool_use_block.input
print(json.dumps(classification, indent=2))
# {
#   "department": "billing",
#   "priority": "urgent",
#   "confidence": 0.97,
#   "reasoning": "Customer reports duplicate charge requiring immediate refund - clear billing issue"
# }
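Before routing on the tool input, an application can check it against the same constraints the schema declares. The following is a hand-rolled sketch that assumes the `classify_ticket` schema shown above; `validate_classification` is a hypothetical helper, and a JSON Schema validation library could be used instead.

```python
DEPARTMENTS = {"billing", "technical", "sales", "general"}
PRIORITIES = {"urgent", "high", "medium", "low"}


def validate_classification(data: dict) -> list[str]:
    """Return a list of problems; an empty list means the payload looks usable."""
    problems = []
    if data.get("department") not in DEPARTMENTS:
        problems.append("department missing or not in enum")
    if data.get("priority") not in PRIORITIES:
        problems.append("priority missing or not in enum")
    confidence = data.get("confidence")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        problems.append("confidence is not a number")
    elif not 0.0 <= confidence <= 1.0:
        problems.append("confidence outside [0.0, 1.0]")
    if not isinstance(data.get("reasoning"), str):
        problems.append("reasoning missing or not a string")
    return problems


ok = {"department": "billing", "priority": "urgent",
      "confidence": 0.97, "reasoning": "Duplicate charge"}
bad = {"department": "legal", "priority": "urgent",
       "confidence": 1.7, "reasoning": "n/a"}
print(validate_classification(ok))
print(validate_classification(bad))
```

Collecting every problem instead of stopping at the first makes the result more useful for logging.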
2.2 tool_choice Modes
The tool_choice parameter controls how Claude uses available tools. Understanding the three modes is critical for the CCA exam:
| Mode | Syntax | Behavior | Use Case |
|---|---|---|---|
| auto | {"type": "auto"} | Claude decides whether to use tools | Conversational agents that sometimes need tools |
| any | {"type": "any"} | Claude must use at least one tool (chooses which) | Multi-tool scenarios where a tool call is always needed |
| forced | {"type": "tool", "name": "X"} | Claude must use the specified tool | Classification, extraction — guaranteed structured output |
tool_choice modes. Key rules: (1) {"type": "tool", "name": "X"} forces a specific tool, which yields structured output for classification. (2) {"type": "any"} forces some tool use but lets Claude choose which; use it when multiple tools exist and one must be called. (3) {"type": "auto"} (default) lets Claude decide; use it in agentic loops where text responses are valid. (4) With a forced tool, Claude emits no explanatory text before the tool_use block, and stop_reason is "tool_use" unless the response is cut off by max_tokens.
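The three modes can be selected from two simple questions: must a tool be called, and if so, must it be a specific one? The helper below is a sketch; `build_tool_choice` and its parameters are hypothetical, while the returned dictionaries use the syntax from the table above.

```python
from typing import Optional


def build_tool_choice(require_tool: bool = False,
                      tool_name: Optional[str] = None) -> dict:
    """Return a tool_choice value for the given requirements."""
    if tool_name is not None:
        # A named tool implies forced mode.
        return {"type": "tool", "name": tool_name}
    if require_tool:
        # Some tool must be called, but Claude picks which one.
        return {"type": "any"}
    # Default: Claude may answer in text or call a tool.
    return {"type": "auto"}


print(build_tool_choice())
print(build_tool_choice(require_tool=True))
print(build_tool_choice(tool_name="classify_ticket"))
```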
import anthropic

client = anthropic.Anthropic()

# Demonstrate all three tool_choice modes
tools = [{
    "name": "route_ticket",
    "description": "Route a ticket to a department",
    "input_schema": {
        "type": "object",
        "properties": {
            "department": {"type": "string", "enum": ["billing", "technical", "sales"]},
            "priority": {"type": "string", "enum": ["high", "medium", "low"]}
        },
        "required": ["department", "priority"]
    }
}, {
    "name": "escalate_to_human",
    "description": "Escalate to a human agent when confidence is low",
    "input_schema": {
        "type": "object",
        "properties": {
            "reason": {"type": "string"},
            "suggested_department": {"type": "string"}
        },
        "required": ["reason"]
    }
}]

ticket = "My API keeps returning 503 errors and my team is blocked"

# Mode 1 (auto): Claude decides whether to use a tool
response_auto = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=200,
    tools=tools,
    tool_choice={"type": "auto"},
    messages=[{"role": "user", "content": ticket}]
)
print(f"Auto - stop_reason: {response_auto.stop_reason}")

# Mode 2 (any): Claude must use a tool but picks which one
response_any = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=200,
    tools=tools,
    tool_choice={"type": "any"},
    messages=[{"role": "user", "content": ticket}]
)
print(f"Any - stop_reason: {response_any.stop_reason}")
tool_used = next(b for b in response_any.content if b.type == "tool_use")
print(f"Any - tool chosen: {tool_used.name}")

# Mode 3 (forced): Claude must use the specified tool
response_forced = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=200,
    tools=tools,
    tool_choice={"type": "tool", "name": "route_ticket"},
    messages=[{"role": "user", "content": ticket}]
)
print(f"Forced - stop_reason: {response_forced.stop_reason}")
forced_result = next(b for b in response_forced.content if b.type == "tool_use")
print(f"Forced - result: {forced_result.input}")
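In auto mode the response may contain no tool_use block at all, and a response cut off by max_tokens may carry incomplete tool input, so a bare `next(...)` can raise or return partial data. The sketch below shows one defensive way to read a response; `extract_tool_call` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins shaped like SDK responses rather than real API output.

```python
from types import SimpleNamespace


def extract_tool_call(response, expected_name=None):
    """Return (name, input) for the first matching tool_use block, else None."""
    if response.stop_reason == "max_tokens":
        # Output was cut off, so the tool input may be incomplete.
        return None
    for block in response.content:
        if block.type == "tool_use" and expected_name in (None, block.name):
            return block.name, block.input
    return None  # e.g. auto mode where Claude answered in plain text


# Stand-in objects for illustration only.
text_only = SimpleNamespace(
    stop_reason="end_turn",
    content=[SimpleNamespace(type="text", text="Have you tried restarting?")],
)
routed = SimpleNamespace(
    stop_reason="tool_use",
    content=[SimpleNamespace(type="tool_use", name="route_ticket",
                             input={"department": "technical", "priority": "high"})],
)

print(extract_tool_call(text_only))
print(extract_tool_call(routed, expected_name="route_ticket"))
```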
2.3 Structured Triage Output
In production, the classification tool’s schema defines your routing contract. Design schemas to include everything downstream systems need: department, priority, confidence, tags, and suggested actions.
import anthropic
import json

client = anthropic.Anthropic()

# Production-grade triage tool with comprehensive schema
triage_tool = {
    "name": "triage_ticket",
    "description": "Perform comprehensive ticket triage with routing, priority, and metadata.",
    "input_schema": {
        "type": "object",
        "properties": {
            "intent": {
                "type": "string",
                "enum": ["billing_dispute", "technical_bug", "feature_request",
                         "account_access", "cancellation", "general_inquiry"],
                "description": "Primary intent of the customer message"
            },
            "department": {
                "type": "string",
                "enum": ["billing", "engineering", "product", "account_ops", "retention", "general"]
            },
            "priority": {
                "type": "string",
                "enum": ["p0_critical", "p1_high", "p2_medium", "p3_low"]
            },
            "confidence": {
                "type": "number",
                "minimum": 0.0,
                "maximum": 1.0
            },
            "tags": {
                "type": "array",
                "items": {"type": "string"},
                "description": "Relevant tags for filtering and analytics"
            },
            "suggested_action": {
                "type": "string",
                "description": "Recommended first action for the handling agent"
            },
            "requires_human": {
                "type": "boolean",
                "description": "Whether this ticket needs human review regardless of routing"
            }
        },
        "required": ["intent", "department", "priority", "confidence", "tags",
                     "suggested_action", "requires_human"]
    }
}

def triage_ticket(user_message: str) -> dict:
    """Classify and triage a support ticket with full metadata."""
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=400,
        system=(
            "You are an expert ticket triage system. Classify tickets accurately. "
            "Set requires_human=true for: legal threats, data breaches, VIP customers, "
            "or ambiguous multi-intent messages."
        ),
        tools=[triage_tool],
        tool_choice={"type": "tool", "name": "triage_ticket"},
        messages=[{"role": "user", "content": user_message}]
    )
    tool_block = next(b for b in response.content if b.type == "tool_use")
    return tool_block.input

# Test with various ticket types
tickets = [
    "I was charged $99 but my plan is $49. Please fix this now.",
    "The dashboard keeps showing a blank page after the latest update.",
    "I'd like to cancel my account. Your competitor offers better pricing.",
]
for ticket in tickets:
    result = triage_ticket(ticket)
    print(f"\nTicket: {ticket[:50]}...")
    print(f"  Intent: {result['intent']} | Dept: {result['department']}")
    print(f"  Priority: {result['priority']} | Confidence: {result['confidence']}")
    print(f"  Tags: {result['tags']}")
    print(f"  Human needed: {result['requires_human']}")
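Since the schema is the routing contract, downstream code can consume the triage dictionary directly. The sketch below assumes the `triage_ticket` schema above; `to_routing_record` and the `"human_review"` queue name are hypothetical and only show how `requires_human` can override the department.

```python
def to_routing_record(ticket_id: str, triage: dict) -> dict:
    """Convert a triage result into a record a ticketing system could store."""
    # requires_human takes precedence over the suggested department.
    queue = "human_review" if triage["requires_human"] else triage["department"]
    return {
        "ticket_id": ticket_id,
        "queue": queue,
        "priority": triage["priority"],
        "tags": sorted(set(triage["tags"])),  # de-duplicate for analytics
        "first_action": triage["suggested_action"],
    }


# Hand-written triage result, not real model output.
sample = {
    "intent": "billing_dispute", "department": "billing", "priority": "p1_high",
    "confidence": 0.93, "tags": ["refund", "overcharge", "refund"],
    "suggested_action": "Verify the invoice amount", "requires_human": False,
}
print(to_routing_record("ticket-001", sample))
```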
3. Routing Architecture Patterns
In production, classification is just the first step. The routing architecture determines what happens after classification — direct routing, clarification, escalation, or splitting multi-intent tickets.
sequenceDiagram
participant U as User
participant R as Router Agent
participant C as Classifier
participant D as Department Agent
participant H as Human Agent
U->>R: Submit ticket
R->>C: Classify intent (forced tool)
C-->>R: {department, confidence}
alt confidence > 0.9
R->>D: Route directly
D-->>U: Automated response
else confidence 0.7-0.9
R->>U: Ask clarifying question
U->>R: Provide clarification
R->>C: Reclassify with context
C-->>R: Updated classification
R->>D: Route with context
else confidence < 0.7
R->>H: Escalate to human
H-->>U: Human handles ticket
end
3.1 Confidence Threshold Logic
Confidence thresholds are the bridge between AI classification and business routing. Set them based on your tolerance for misrouting:
| Threshold | Action | Rationale |
|---|---|---|
| > 0.9 | Route directly to department | High confidence: automated handling is safe |
| 0.7 – 0.9 | Ask clarifying question | Moderate confidence: one more signal resolves ambiguity |
| < 0.7 | Escalate to human agent | Low confidence: risk of misrouting exceeds automation benefit |
import anthropic
import json

client = anthropic.Anthropic()

# Classification tool (same as before)
classify_tool = {
    "name": "classify_ticket",
    "description": "Classify a support ticket with confidence score.",
    "input_schema": {
        "type": "object",
        "properties": {
            "department": {"type": "string", "enum": ["billing", "technical", "sales", "general"]},
            "confidence": {"type": "number", "minimum": 0.0, "maximum": 1.0},
            "reasoning": {"type": "string"}
        },
        "required": ["department", "confidence", "reasoning"]
    }
}

def route_ticket(user_message: str) -> dict:
    """Route a ticket using confidence-based thresholds."""
    # Step 1: Classify
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=200,
        system="Classify support tickets. Be calibrated with confidence: use 0.9+ only when very certain.",
        tools=[classify_tool],
        tool_choice={"type": "tool", "name": "classify_ticket"},
        messages=[{"role": "user", "content": user_message}]
    )
    tool_block = next(b for b in response.content if b.type == "tool_use")
    classification = tool_block.input

    # Step 2: Route based on confidence
    confidence = classification["confidence"]
    department = classification["department"]
    if confidence > 0.9:
        action = "route_directly"
        target = department
    elif confidence >= 0.7:
        action = "ask_clarification"
        target = department  # tentative department
    else:
        action = "escalate_to_human"
        target = "human_queue"
    return {
        "action": action,
        "target": target,
        "classification": classification
    }

# Test routing decisions
test_tickets = [
    "I need a refund for order #12345",              # Clear billing → high confidence
    "Something weird is happening with my account",  # Ambiguous → medium confidence
    "asdf help please broken thing",                 # Unclear → low confidence
]
for ticket in test_tickets:
    result = route_ticket(ticket)
    print(f"\nTicket: '{ticket}'")
    print(f"  Action: {result['action']} → {result['target']}")
    print(f"  Confidence: {result['classification']['confidence']}")
    print(f"  Reasoning: {result['classification']['reasoning']}")
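The middle branch of the sequence diagram (ask a clarifying question, then reclassify with context) can be sketched with the classifier and the user prompt passed in as callables, which keeps the logic testable without API calls. `route_with_clarification`, the wording of the question, and the choice to escalate when the second pass is still below the lower threshold are assumptions made here, not part of the original flow.

```python
from typing import Callable


def route_with_clarification(
    message: str,
    classify: Callable[[str], dict],
    ask_user: Callable[[str], str],
    high: float = 0.9,
    low: float = 0.7,
) -> dict:
    """Classify once, and ask a single follow-up question when confidence is moderate."""
    first = classify(message)
    if first["confidence"] > high:
        return {"action": "route_directly", "target": first["department"]}
    if first["confidence"] < low:
        return {"action": "escalate_to_human", "target": "human_queue"}
    # Moderate confidence: gather one more signal, then reclassify with it.
    answer = ask_user("Could you tell us a bit more about the issue?")
    second = classify(f"{message}\n\nCustomer clarification: {answer}")
    if second["confidence"] >= low:
        return {"action": "route_with_context", "target": second["department"]}
    # Assumption: still unclear after clarification, so hand off to a person.
    return {"action": "escalate_to_human", "target": "human_queue"}


def fake_classify(text: str) -> dict:
    """Stand-in classifier: becomes confident once clarification is present."""
    if "clarification" in text:
        return {"department": "technical", "confidence": 0.95}
    return {"department": "general", "confidence": 0.8}


print(route_with_clarification("Something weird is happening",
                               fake_classify, lambda q: "Login page crashes"))
```

In a real system `classify` would wrap the forced-tool call and `ask_user` would send a message through the support channel.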
3.2 Multi-Intent Detection
Real-world tickets often contain multiple intents: “I need a refund AND my dashboard is broken.” Handle this by allowing the classification tool to return multiple intents, then split into separate routing actions.
Prefer an intents array rather than a single intent field for production systems.
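One possible shape for multi-intent output is a tool whose input holds an array of intents, each with its own department and confidence. The schema and the `split_intents` helper below are illustrative assumptions (including the tool name and the sub-ticket id format), and the example payload is hand-written rather than real model output.

```python
multi_intent_tool = {
    "name": "classify_ticket_intents",  # hypothetical tool name
    "description": "List every distinct intent found in a support ticket.",
    "input_schema": {
        "type": "object",
        "properties": {
            "intents": {
                "type": "array",
                "minItems": 1,
                "items": {
                    "type": "object",
                    "properties": {
                        "department": {"type": "string",
                                       "enum": ["billing", "technical", "sales", "general"]},
                        "confidence": {"type": "number", "minimum": 0.0, "maximum": 1.0},
                        "summary": {"type": "string"},
                    },
                    "required": ["department", "confidence", "summary"],
                },
            }
        },
        "required": ["intents"],
    },
}


def split_intents(ticket_id: str, classification: dict) -> list[dict]:
    """Turn one multi-intent classification into one routing action per intent."""
    return [
        {
            "ticket_id": f"{ticket_id}-{index}",
            "department": intent["department"],
            "confidence": intent["confidence"],
            "summary": intent["summary"],
        }
        for index, intent in enumerate(classification["intents"], start=1)
    ]


example = {"intents": [
    {"department": "billing", "confidence": 0.95, "summary": "Refund request"},
    {"department": "technical", "confidence": 0.9, "summary": "Dashboard is broken"},
]}
for action in split_intents("ticket-042", example):
    print(action["ticket_id"], "->", action["department"])
```

Each resulting action can then go through the same confidence thresholds as a single-intent ticket.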
4. Production Patterns
4.1 Batch Classification with Message Batches API
For high-volume classification (e.g., processing an overnight ticket backlog), use the Message Batches API for 50% cost savings. Batch requests run asynchronously, and results become available once the batch finishes processing, within 24 hours.
import anthropic
import json
client = anthropic.Anthropic()
# Batch classification for cost efficiency (50% discount)
# Prepare batch requests for multiple tickets
tickets_to_classify = [
    {"id": "ticket-001", "text": "Double charged on my credit card this month"},
    {"id": "ticket-002", "text": "API returns 500 error when uploading files over 10MB"},
    {"id": "ticket-003", "text": "Can I get a demo of your enterprise features?"},
    {"id": "ticket-004", "text": "Need to update the billing email on our account"},
    {"id": "ticket-005", "text": "Your service has been down for 2 hours - we're losing revenue"},
]

classify_tool = {
    "name": "classify_ticket",
    "description": "Classify a support ticket.",
    "input_schema": {
        "type": "object",
        "properties": {
            "department": {"type": "string", "enum": ["billing", "technical", "sales", "general"]},
            "priority": {"type": "string", "enum": ["urgent", "high", "medium", "low"]},
            "confidence": {"type": "number"}
        },
        "required": ["department", "priority", "confidence"]
    }
}

# Build batch requests
batch_requests = []
for ticket in tickets_to_classify:
    batch_requests.append({
        "custom_id": ticket["id"],
        "params": {
            "model": "claude-sonnet-4-20250514",
            "max_tokens": 200,
            "system": "Classify support tickets accurately by department and priority.",
            "tools": [classify_tool],
            "tool_choice": {"type": "tool", "name": "classify_ticket"},
            "messages": [{"role": "user", "content": ticket["text"]}]
        }
    })
# Submit batch (asynchronous; results available within 24h, 50% cost savings)
batch = client.messages.batches.create(requests=batch_requests)
print(f"Batch submitted: {batch.id}")
print(f"Status: {batch.processing_status}")
print(f"Total requests: {len(batch_requests)}")
print("Cost savings: 50% vs individual API calls")

# Retrieve results once processing has ended: poll
# client.messages.batches.retrieve(batch.id) until processing_status == "ended".
# for result in client.messages.batches.results(batch.id):
#     if result.result.type == "succeeded":
#         print(f"{result.custom_id}: {result.result.message.content}")
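Once a batch has ended, each result entry needs to be matched back to its ticket via custom_id, and entries that did not succeed need separate handling. The sketch below assumes result entries expose `custom_id`, `result.type`, and `result.message.content` as in the commented code above; `collect_batch_results` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins rather than real API output.

```python
from types import SimpleNamespace


def collect_batch_results(results) -> tuple[dict, list]:
    """Split batch entries into {custom_id: tool input} and a list of failed ids."""
    classified, failed = {}, []
    for entry in results:
        if entry.result.type != "succeeded":
            failed.append(entry.custom_id)
            continue
        block = next(
            (b for b in entry.result.message.content if b.type == "tool_use"),
            None,
        )
        if block is None:
            failed.append(entry.custom_id)
        else:
            classified[entry.custom_id] = block.input
    return classified, failed


# Stand-in entries for illustration only.
fake_results = [
    SimpleNamespace(
        custom_id="ticket-001",
        result=SimpleNamespace(
            type="succeeded",
            message=SimpleNamespace(content=[SimpleNamespace(
                type="tool_use", name="classify_ticket",
                input={"department": "billing", "priority": "high", "confidence": 0.94},
            )]),
        ),
    ),
    SimpleNamespace(custom_id="ticket-002", result=SimpleNamespace(type="errored")),
]

classified, failed = collect_batch_results(fake_results)
print(classified)
print(failed)
```

Failed ids can be resubmitted in a follow-up batch or sent through the synchronous path.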
4.2 Eval-Driven Iteration
Classification accuracy must be measured and improved iteratively. Build an eval set of labeled tickets, run your classifier against them, and iterate on the system prompt until you hit your accuracy target.
E-Commerce Support Routing at Scale
A mid-size e-commerce support team was routing a few thousand tickets per day. Before Claude-based routing, tickets often sat in a shared queue for hours before manual triage. In a representative rollout, teams look for outcomes like:
- Classification quality: Low-90s accuracy on a held-out human-labeled eval set
- Routing speed: Queue-based manual triage replaced by near-real-time automated classification
- Operating cost: Haiku-based routing stayed inexpensive relative to staffing a fully manual triage layer
- Architecture: Forced tool_choice with Haiku for classification, explicit confidence thresholds, and human escalation for the ambiguous minority of tickets
Key insight: Haiku, the lowest-cost model tier, combined with forced tool_choice is often sufficient for classification tasks. Reserve Sonnet/Opus for generation and complex reasoning.
The eval-driven approach follows a simple loop:
- Label — Create a gold-standard set of 50-200 tickets with human-assigned labels
- Run — Classify all eval tickets with your current prompt/model
- Measure — Calculate accuracy, precision, recall per category
- Iterate — Adjust system prompt, add edge-case examples, tune confidence calibration
- Repeat — Until accuracy meets your production threshold (typically >90%)
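The Measure step of the loop above can be sketched in plain Python. The `evaluate` function is a hypothetical helper that compares gold labels with predictions and reports overall accuracy plus per-category precision and recall; the labels in the example are made up.

```python
from collections import Counter


def evaluate(gold: list[str], predicted: list[str]) -> dict:
    """Compute accuracy and per-class precision/recall for a labeled eval set."""
    if not gold or len(gold) != len(predicted):
        raise ValueError("gold and predicted must be non-empty and the same length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted class gained a false positive
            fn[g] += 1  # true class gained a false negative
    per_class = {}
    for label in sorted(set(gold) | set(predicted)):
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        per_class[label] = {
            "precision": tp[label] / p_den if p_den else 0.0,
            "recall": tp[label] / r_den if r_den else 0.0,
        }
    return {"accuracy": sum(tp.values()) / len(gold), "per_class": per_class}


gold = ["billing", "billing", "technical", "sales"]
pred = ["billing", "technical", "technical", "sales"]
report = evaluate(gold, pred)
print(report["accuracy"])
print(report["per_class"]["billing"])
```

Per-category numbers matter because a high overall accuracy can hide a category that is routinely misrouted.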
A/B testing routing prompts: In production, run two prompt variants simultaneously (50/50 split), measure downstream resolution time and customer satisfaction, then promote the winner. This is especially valuable when your classification categories evolve.
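A 50/50 split is easiest to analyze when each ticket is assigned to a variant deterministically, so retries and reprocessing land in the same arm. The sketch below hashes the ticket id; `assign_variant` and the experiment name are hypothetical, and this is one common approach rather than a prescribed one.

```python
import hashlib


def assign_variant(ticket_id: str, experiment: str = "routing-prompt-test") -> str:
    """Deterministically assign a ticket to prompt variant A or B."""
    key = f"{experiment}:{ticket_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # The first byte's parity gives an approximately even split.
    return "A" if digest[0] % 2 == 0 else "B"


counts = {"A": 0, "B": 0}
for i in range(1000):
    counts[assign_variant(f"ticket-{i}")] += 1
print(counts)
```

Including the experiment name in the hash key means a new experiment reshuffles tickets instead of reusing the previous split.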
Next in the SDK Track
In Part 18: Customer Support Agent, we’ll build an end-to-end support agent with MCP tools for CRM integration, identity verification gates, escalation criteria, human handoff patterns, and production guardrails.