
Anthropic SDK Track Part 1: Platform & SDK Setup

May 22, 2026 Wasil Zafar 40 min read

Master the Anthropic platform and Claude SDK fundamentals — API key management, Python and TypeScript client installation, model families and versioning, service tiers, rate limits, regions, beta headers, error handling, and retry strategies for production applications.

Table of Contents

  1. Platform Overview
  2. SDK Installation
  3. Client Configuration
  4. Error Handling & Retries
  5. Production Checklist
  6. Console Prototyping
  7. Models API
What You’ll Learn: This article gets you from zero to your first working Claude agent in under 10 minutes. You’ll install the SDK, understand how API keys and model selection work, make your first API call, and learn the patterns that every Anthropic application uses — whether it’s a simple chatbot or a complex multi-agent system.

1. Anthropic Platform Overview

The Anthropic platform provides access to Claude models through a straightforward API. Unlike multi-layered organizational hierarchies, Anthropic uses a flat structure: your account holds API keys, each with workspace-level access. The Console at console.anthropic.com is your control plane for key management, usage monitoring, and model access.

1.1 Console & API Keys

The Anthropic Console is where you create and manage API keys, monitor usage, set spending limits, and prototype with the Workbench. Each API key is scoped to a workspace and can be revoked independently.

Anthropic Platform Structure
flowchart TD
    A["Anthropic Account"] --> B["Workspace"]
    B --> C["API Key: production"]
    B --> D["API Key: development"]
    B --> E["API Key: ci-pipeline"]
    C --> F["Rate Limits & Usage"]
    D --> F
    E --> F
    B --> G["Console Workbench"]
    B --> H["Usage Dashboard"]
                        
Concept | Purpose | Key Actions
Account | Top-level entity (individual or org) | Manage billing, spending limits
Workspace | Isolated environment for teams | Separate keys, usage tracking
API Key | Authentication credential | Create, rotate, revoke per workspace
Workbench | Console prompt playground | Prototype prompts, test tools

1.2 Model Families & Versioning

Anthropic organizes models into families optimized for different tradeoffs between capability, speed, and cost. As of May 2026, the most relevant first-party model families for new Claude API work are Opus 4.8, Sonnet 4.6, and Haiku 4.5.

Model | Strengths | Context | Best For
Claude Opus 4.8 | Highest capability, strongest long-horizon reasoning | Up to 1M | Research, complex analysis, high-autonomy coding
Claude Sonnet 4.6 | Best balance of speed and intelligence | Up to 1M | Production workloads, agents, coding, general tasks
Claude Haiku 4.5 | Fastest, lowest cost | 200K | Classification, routing, summarization, simple extraction
Model Versioning: As of May 2026, starting with the Claude 4.6 generation, Anthropic uses dateless IDs (for example, claude-sonnet-4-6) that are pinned snapshots — not rolling aliases. Older model families still use dated snapshot IDs and convenience aliases. Always verify the currently active IDs in the models overview before changing production defaults.

Here is a minimal example showing how to pin a model version and make your first API call:

import anthropic

# Pin to a specific model version for production stability
client = anthropic.Anthropic()

# Production: pinned version
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello, Claude!"}]
)
print(response.content[0].text)

1.3 Service Tiers & Rate Limits

Anthropic applies rate limits per-workspace based on your usage tier. Limits are measured in requests per minute (RPM), tokens per minute (TPM), and tokens per day (TPD). Higher tiers unlock greater throughput as you build usage history and increase spending.

Tier | Requirement | RPM | TPM (Input) | TPD (Input)
Tier 1 | Credit purchase ($5+) | 50 | 40,000 | 1,000,000
Tier 2 | $40+ spend, 7+ days | 1,000 | 80,000 | 2,500,000
Tier 3 | $200+ spend, 30+ days | 2,000 | 160,000 | 5,000,000
Tier 4 | $400+ spend, 60+ days | 4,000 | 400,000 | 50,000,000
Important: Rate limits apply per-workspace, not per-key. Distributing requests across multiple API keys in the same workspace does not increase your rate limit. For higher throughput, request a tier upgrade through the Console.
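
Because the limit is shared across the workspace, it can help to throttle on the client side before requests ever reach the API. The following is a minimal sketch of a sliding-window limiter for requests per minute, not an Anthropic SDK feature. The class name SlidingWindowLimiter and its methods are hypothetical, and the 50 RPM figure is simply the Tier 1 value from the table above.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` calls in any rolling `window_seconds` window."""

    def __init__(
        self,
        max_requests: int,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock  # injectable so the limiter can be tested without sleeping
        self._timestamps: Deque[float] = deque()

    def wait_time(self) -> float:
        """Seconds to wait before the next request fits (0.0 if it fits now)."""
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) < self.max_requests:
            return 0.0
        return self.window_seconds - (now - self._timestamps[0])

    def record(self) -> None:
        """Record that a request is being sent now."""
        self._timestamps.append(self._clock())


# Usage: share one limiter across every key in the workspace.
limiter = SlidingWindowLimiter(max_requests=50)
delay = limiter.wait_time()
if delay > 0:
    time.sleep(delay)
limiter.record()
# ... send the API request here ...
```

This only tracks request counts. A fuller version would also budget input tokens per minute, since either limit can trigger a 429.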

2. SDK Installation

Anthropic provides official SDKs for Python and TypeScript. Both are fully typed, support streaming, and include built-in retry logic with exponential backoff.

2.1 Python SDK

The Python SDK (anthropic) is the primary SDK for server-side applications, data pipelines, and agent systems. It supports both synchronous and asynchronous usage patterns.

# Install the Anthropic Python SDK
pip install anthropic

# Streaming support is built in; optional extras add cloud-platform clients
pip install "anthropic[bedrock]"   # Amazon Bedrock
pip install "anthropic[vertex]"    # Google Vertex AI

# Verify installation
python -c "import anthropic; print(anthropic.__version__)"

Once installed, initialize the client and make your first request. The SDK reads your API key from the ANTHROPIC_API_KEY environment variable by default:

import anthropic

# The SDK reads ANTHROPIC_API_KEY from environment by default
client = anthropic.Anthropic()

# Or pass the key explicitly (not recommended for production)
client = anthropic.Anthropic(api_key="sk-ant-api03-...")

# Make your first API call
message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "What is the Anthropic API?"}
    ]
)
print(message.content[0].text)
print(f"Usage: {message.usage.input_tokens} in, {message.usage.output_tokens} out")

2.2 TypeScript SDK

The TypeScript SDK provides the same functionality with full type safety. It works in Node.js, Deno, and edge runtimes (Cloudflare Workers, Vercel Edge).

# Install the Anthropic TypeScript SDK
npm install @anthropic-ai/sdk

# Or with yarn/pnpm
yarn add @anthropic-ai/sdk
pnpm add @anthropic-ai/sdk

After installation, create a client instance and make a request. The TypeScript SDK mirrors the Python SDK’s interface:

import Anthropic from "@anthropic-ai/sdk";

// Reads ANTHROPIC_API_KEY from environment
const client = new Anthropic();

async function main() {
    const message = await client.messages.create({
        model: "claude-sonnet-4-6",
        max_tokens: 1024,
        messages: [
            { role: "user", content: "What is the Anthropic API?" }
        ]
    });

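    // content is a union of block types; narrow to a text block before reading .text
    const first = message.content[0];
    if (first.type === "text") {
        console.log(first.text);
    }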
    console.log(`Usage: ${message.usage.input_tokens} in, ${message.usage.output_tokens} out`);
}

main();

2.3 API Key Management

API keys should never be hardcoded or committed to version control. Use environment variables or a secrets manager for all environments.

# Set the API key as an environment variable
export ANTHROPIC_API_KEY="sk-ant-api03-your-key-here"

# For production: use a secrets manager
# AWS Secrets Manager, Azure Key Vault, GCP Secret Manager, HashiCorp Vault

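For production deployments, fetch the API key from a secrets manager at startup rather than storing it in config files or shell profiles. You can either inject it into the process environment for the SDK to read or pass it to the client explicitly: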

import os
import anthropic

# Simplest option: let the SDK read the key from the environment
client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY automatically

# Alternative: load from a secrets manager
def get_api_key():
    """Load API key from your secrets provider."""
    # Example: AWS Secrets Manager
    # import boto3
    # client = boto3.client('secretsmanager')
    # response = client.get_secret_value(SecretId='anthropic-api-key')
    # return response['SecretString']
    return os.environ["ANTHROPIC_API_KEY"]

client = anthropic.Anthropic(api_key=get_api_key())
Security: Anthropic API keys start with sk-ant-api03-. If you accidentally expose a key, revoke it immediately in the Console and generate a new one. Use separate keys for development, staging, and production so compromised dev keys don’t affect prod.
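
A small startup check can catch a missing or malformed key before the first request fails, and a masking helper keeps full keys out of logs. This is a minimal sketch: load_api_key and mask_key are hypothetical helpers, and the loose "sk-ant-" prefix check is an assumption based on the key format described above, not an official validation rule.

```python
import os

KEY_PREFIX = "sk-ant-"  # assumption: loose check; the article cites keys starting with sk-ant-api03-


def mask_key(key: str, visible: int = 4) -> str:
    """Return a log-safe form of the key showing only its start and last few characters."""
    if len(key) <= len(KEY_PREFIX) + visible:
        return "***"
    return f"{key[:len(KEY_PREFIX)]}...{key[-visible:]}"


def load_api_key(env_var: str = "ANTHROPIC_API_KEY") -> str:
    """Read the key from the environment and fail fast if it is missing or malformed."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"{env_var} is not set")
    if not key.startswith(KEY_PREFIX):
        raise RuntimeError(f"{env_var} does not look like an Anthropic API key")
    return key
```

At startup you might call load_api_key() once, log mask_key(key) for traceability, and pass the key to the client constructor.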
Real-World Application

Building a Customer FAQ Bot

A startup used Claude to build a FAQ bot that reduced support tickets by 40%. Key patterns: system prompts for consistent tone, temperature=0 for more repeatable answers (a low temperature reduces variation but does not guarantee identical output), and structured output for ticket routing. The team started with Claude Haiku for speed during prototyping, then moved to Claude Sonnet for production, where answer quality mattered more than latency.

System Prompts · Deterministic Output · Ticket Routing

3. Client Configuration

3.1 Sync vs Async Clients

The Python SDK provides both synchronous (Anthropic) and asynchronous (AsyncAnthropic) clients. Use async for high-concurrency applications (web servers, agent systems) and sync for scripts and notebooks.

import anthropic
import asyncio

# Synchronous client — for scripts, notebooks, simple applications
sync_client = anthropic.Anthropic()
response = sync_client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=512,
    messages=[{"role": "user", "content": "Hello!"}]
)

print(f"Synchronous response: {response.content[0].text}")

# Asynchronous client — for web servers, high-concurrency workloads
async_client = anthropic.AsyncAnthropic()

async def generate_response(prompt: str) -> str:
    response = await async_client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=512,
        messages=[{"role": "user", "content": prompt}]
    )
    return response.content[0].text

# Run multiple requests concurrently
async def main():
    prompts = ["Summarize AI safety", "Explain RAG", "Define MCP"]
    results = await asyncio.gather(*[generate_response(p) for p in prompts])
    for prompt, result in zip(prompts, results):
        print(f"Asynchronous response for: {prompt} ->\n{result[:80]}...\n")

# Top-level await works in Jupyter/IPython; in a plain script, call asyncio.run(main()) instead
await main()
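
Unbounded asyncio.gather can send every request at once, which makes it easy to hit a rate limit. One common way to cap the number of in-flight calls is an asyncio.Semaphore. The sketch below is illustrative only: gather_bounded is a hypothetical helper, and fake_worker stands in for generate_response() so the example runs without an API key.

```python
import asyncio
from typing import Awaitable, Callable, List


async def gather_bounded(
    worker: Callable[[str], Awaitable[str]],
    prompts: List[str],
    max_concurrency: int = 5,
) -> List[str]:
    """Run worker(prompt) for every prompt with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run_one(prompt: str) -> str:
        async with semaphore:
            return await worker(prompt)

    # gather returns results in the same order as the inputs.
    return await asyncio.gather(*(run_one(p) for p in prompts))


async def fake_worker(prompt: str) -> str:
    """Stand-in for generate_response(); sleeps instead of calling the API."""
    await asyncio.sleep(0.01)
    return prompt.upper()


if __name__ == "__main__":
    print(asyncio.run(gather_bounded(fake_worker, ["a", "b", "c"], max_concurrency=2)))
```

In a real application you would pass generate_response as the worker and pick max_concurrency from your tier's limits.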

3.2 Beta Headers

Some Anthropic features are released as betas. Access them by passing the anthropic-beta header or using the SDK’s beta interface. Beta features may change without notice, so pin versions and test thoroughly.

import anthropic

client = anthropic.Anthropic()

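# Access beta features via the beta namespace and the betas parameter.
# The flag below only illustrates the mechanism: prompt caching has since become
# generally available, so check the current beta list for the flag you need.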
response = client.beta.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    betas=["prompt-caching-2024-07-31"],
    system=[{
        "type": "text",
        "text": "You are a helpful assistant.",
        "cache_control": {"type": "ephemeral"}
    }],
    messages=[{"role": "user", "content": "What features are in beta?"}]
)
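print(f"What features are in beta?: {response.content[0].text}")

# Or pass the header directly through extra_headers.
# The value shown is an older flag tied to Claude 3.5 Sonnet; substitute a current beta flag.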
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    extra_headers={"anthropic-beta": "max-tokens-3-5-sonnet-2024-07-15"},
    messages=[{"role": "user", "content": "Hello!"}]
)
print(f"Hello!: {response.content[0].text}")

3.3 Region Configuration

Where regional API endpoints are available for your account, you can set the base URL to route requests to a specific region for data residency or latency reasons. The same setting points the client at a proxy or gateway.

import anthropic

# Default: US region (api.anthropic.com)
client = anthropic.Anthropic()

# EU region for data residency requirements
# (hostname shown for illustration; confirm the current regional endpoint in the docs)
eu_client = anthropic.Anthropic(
    base_url="https://api.eu.anthropic.com"
)

# Custom base URL (e.g., for proxy or gateway)
proxied_client = anthropic.Anthropic(
    base_url="https://your-gateway.company.com/anthropic"
)

# Verify configured endpoints
print(f"US client base URL: {client.base_url}")
print(f"EU client base URL: {eu_client.base_url}")
print(f"Proxy client base URL: {proxied_client.base_url}")

4. Error Handling & Retries

Production applications must handle API errors gracefully. The Anthropic SDK provides typed exception classes and built-in retry logic with exponential backoff for transient failures.

4.1 Error Types

Exception | HTTP Code | Cause | Action
AuthenticationError | 401 | Invalid or expired API key | Check/rotate key
PermissionDeniedError | 403 | Key lacks required permissions | Check workspace access
NotFoundError | 404 | Invalid model or endpoint | Verify model name
RateLimitError | 429 | Too many requests | Back off, retry with delay
APIStatusError | Other 4xx, 500+ | Base class for status errors, including server errors | Retry 5xx with backoff; do not retry other 4xx
APIConnectionError | n/a | Network failure | Check connectivity, retry
APITimeoutError | n/a | Request timed out | Increase timeout or retry

Here is an error-handling pattern covering the main exception types with appropriate recovery actions:

import anthropic

client = anthropic.Anthropic()

try:
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        messages=[{"role": "user", "content": "Hello!"}]
    )
    print(response.content[0].text)

except anthropic.AuthenticationError as e:
    print(f"Authentication failed: {e.message}")
    # Action: check API key, rotate if needed

except anthropic.RateLimitError as e:
    print(f"Rate limited: {e.message}")
    # Action: the SDK retries automatically, but you may want custom logic
    # Check response headers: retry-after, anthropic-ratelimit-requests-remaining

except anthropic.APIStatusError as e:
    print(f"API error {e.status_code}: {e.message}")
    # Action: log and retry for 5xx, don't retry for 4xx

except anthropic.APITimeoutError as e:
    # APITimeoutError subclasses APIConnectionError, so it must be caught first
    print(f"Request timed out: {e.message}")
    # Action: increase timeout or reduce max_tokens

except anthropic.APIConnectionError as e:
    print(f"Connection error: {e.message}")
    # Action: check network, retry with backoff
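
The retry-or-not decision in the comments above can be pulled into one small function. This is a sketch of the article's guidance (retry 429, 5xx, and network failures; do not retry other 4xx responses), not the SDK's internal rule set. The name is_retryable is hypothetical.

```python
from typing import Optional


def is_retryable(status_code: Optional[int]) -> bool:
    """Return True when a failed request is worth retrying.

    status_code is None for failures with no HTTP response
    (connection errors and timeouts).
    """
    if status_code is None:
        return True  # network-level failure: usually transient
    if status_code == 429:
        return True  # rate limited: retry after a delay
    return status_code >= 500  # server errors are retryable; other 4xx are not


for code in (401, 404, 429, 500, 503, None):
    print(code, is_retryable(code))
```

Inside an except block you could call is_retryable(getattr(e, "status_code", None)), since connection errors carry no status code.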

4.2 Retry Strategies

The Anthropic SDK includes built-in retry logic with exponential backoff for transient errors (429, 5xx, connection errors). You can configure the maximum number of retries and customize the behavior.

import anthropic

# Default: 2 retries with exponential backoff
client = anthropic.Anthropic(max_retries=2)

# Increase retries for critical production workloads
resilient_client = anthropic.Anthropic(max_retries=5)

# Disable automatic retries (handle manually)
no_retry_client = anthropic.Anthropic(max_retries=0)

For more sophisticated retry logic, implement custom backoff with jitter. Exponential backoff doubles the wait after each failure (1s → 2s → 4s → 8s), which eases pressure on a struggling server. On its own, though, it does not prevent the thundering herd problem: when many clients fail at the same moment (for example, after a brief API outage) and all follow the same schedule, their retries arrive in synchronized bursts that can overload the server again. Jitter adds a random offset to each delay so that clients retry at slightly different times, spreading the load instead of concentrating it:

import anthropic
import time
import random

def call_with_retry(client, max_attempts=5, base_delay=1.0, **kwargs):
    """Custom retry with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return client.messages.create(**kwargs)
        except anthropic.RateLimitError:
            if attempt == max_attempts - 1:
                raise
            delay = base_delay * (2 ** attempt) + random.uniform(0, 1)
            print(f"Rate limited. Retrying in {delay:.1f}s (attempt {attempt + 1})")
            time.sleep(delay)
        except anthropic.APIStatusError as e:
            if e.status_code >= 500 and attempt < max_attempts - 1:
                delay = base_delay * (2 ** attempt) + random.uniform(0, 1)  # jitter for 5xx too
                time.sleep(delay)
            else:
                raise

client = anthropic.Anthropic(max_retries=0)  # disable built-in retries
response = call_with_retry(
    client,
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}]
)
print(f"Response: {response.content[0].text}")
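
The function above adds up to one second of jitter on top of the exponential delay. A common alternative, often called "full jitter", draws the whole delay at random between zero and the exponential ceiling, and caps the ceiling so waits do not grow without bound. The sketch below shows only the delay calculation. The function name backoff_delays and the 30-second cap are assumptions for illustration.

```python
import random
from typing import List, Optional


def backoff_delays(
    attempts: int,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Return one delay per retry, each drawn uniformly from
    [0, min(max_delay, base_delay * 2**attempt)]."""
    rng = rng or random.Random()
    delays = []
    for attempt in range(attempts):
        ceiling = min(max_delay, base_delay * (2 ** attempt))
        delays.append(rng.uniform(0, ceiling))
    return delays


# Ceilings for five retries with the defaults are 1s, 2s, 4s, 8s, 16s;
# the actual delays fall somewhere below each ceiling.
print([round(d, 2) for d in backoff_delays(5, rng=random.Random(42))])
```

Full jitter trades a slightly shorter average wait for much better spreading of retries across clients. Either variant can be dropped into call_with_retry by replacing the delay calculation.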

4.3 Timeout Configuration

Configure timeouts based on your use case. Long-running requests (extended thinking, large outputs) need higher timeouts than simple classification tasks.

import anthropic
import httpx

# Default timeout: 10 minutes
client = anthropic.Anthropic()

# Custom timeout for all requests
client = anthropic.Anthropic(
    timeout=httpx.Timeout(300.0, connect=5.0)  # 5s to connect; 300s each for read, write, and pool
)

# Per-request timeout override
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=4096,
    messages=[{"role": "user", "content": "Write a detailed essay..."}],
    timeout=600.0  # 10 minutes for this specific request
)

5. Production Checklist

Before deploying your Claude-powered application to production, verify these configurations:

Production Readiness

Deployment Checklist

  1. API Key Security: Key sourced from a secrets manager at runtime, never hardcoded or committed to version control
  2. Model Pinning: Using versioned model ID (not -latest alias)
  3. Error Handling: All exception types caught with appropriate recovery actions
  4. Rate Limit Awareness: Know your tier limits; implement backoff and queuing
  5. Timeout Configuration: Timeouts set appropriately for the use case (not left at the default)
  6. Retry Logic: max_retries configured; custom logic for critical paths
  7. Logging: Request IDs logged for debugging with Anthropic support
  8. Cost Monitoring: Usage dashboard alerts set; per-request token tracking
  9. Separate Keys: Different keys for dev, staging, production
  10. Region Selection: Correct region for data residency and latency needs
Security · Reliability · Observability
import anthropic
import httpx
import logging

# Production-ready client configuration
logger = logging.getLogger(__name__)

def create_production_client() -> anthropic.Anthropic:
    """Create a production-configured Anthropic client."""
    return anthropic.Anthropic(
        # Key loaded from secrets manager (SDK reads ANTHROPIC_API_KEY)
        max_retries=3,
        timeout=httpx.Timeout(
            timeout=120.0,   # 2 min default for read, write, and pool
            connect=5.0      # 5s connection timeout
        ),
    )

def safe_generate(client: anthropic.Anthropic, prompt: str) -> str:
    """Generate a response with comprehensive error handling."""
    try:
        response = client.messages.create(
            model="claude-sonnet-4-6",
            max_tokens=2048,
            messages=[{"role": "user", "content": prompt}]
        )
        logger.info(
            "Request succeeded",
            extra={
                # _request_id comes from the request-id response header and is the
                # value to quote to Anthropic support; response.id is the message ID.
                "request_id": response._request_id,
                "message_id": response.id,
                "input_tokens": response.usage.input_tokens,
                "output_tokens": response.usage.output_tokens,
                "model": response.model,
                "stop_reason": response.stop_reason,
            }
        )

        print(
            "Request succeeded. extras:\n"
            f"request_id: {response._request_id}\n"
            f"input_tokens: {response.usage.input_tokens}\n"
            f"output_tokens: {response.usage.output_tokens}\n"
            f"model: {response.model}\n"
            f"stop_reason: {response.stop_reason}\n"
        )
        return response.content[0].text

    except anthropic.RateLimitError:
        # By the time this is raised, the SDK has already used up its retries
        logger.warning("Rate limited: SDK retries exhausted")
        raise

    except anthropic.APIStatusError as e:
        logger.error(f"API error: {e.status_code} - {e.message}")
        raise

    except anthropic.APIConnectionError:
        logger.error("Connection failed — check network")
        raise

# Demo: create client and generate a response
client = create_production_client()
result = safe_generate(client, "What is exponential backoff?")
print(f"Response: {result[:120]}...")
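
For the cost-monitoring item on the checklist, per-request token counts can be accumulated in a small tracker. This is a minimal sketch: UsageTracker is a hypothetical class, and the per-million-token prices are placeholders, not actual Anthropic rates. Substitute the current prices for the model you use.

```python
from dataclasses import dataclass


@dataclass
class UsageTracker:
    """Accumulate token counts across requests and estimate spend."""

    # Placeholder prices in USD per million tokens; replace with current rates.
    input_price_per_mtok: float = 3.0
    output_price_per_mtok: float = 15.0
    input_tokens: int = 0
    output_tokens: int = 0
    requests: int = 0

    def record(self, input_tokens: int, output_tokens: int) -> None:
        """Add one response's usage (response.usage.input_tokens / output_tokens)."""
        self.input_tokens += input_tokens
        self.output_tokens += output_tokens
        self.requests += 1

    @property
    def estimated_cost_usd(self) -> float:
        """Estimated spend so far, based on the configured prices."""
        return (
            self.input_tokens * self.input_price_per_mtok
            + self.output_tokens * self.output_price_per_mtok
        ) / 1_000_000


tracker = UsageTracker()
tracker.record(input_tokens=1_000, output_tokens=500)
tracker.record(input_tokens=2_000, output_tokens=1_500)
print(f"{tracker.requests} requests, ~${tracker.estimated_cost_usd:.4f}")
```

In safe_generate you could call tracker.record(response.usage.input_tokens, response.usage.output_tokens) right after the request succeeds.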
Try It Yourself: Create a simple agent that takes a topic as input and returns a haiku about it. Use claude-haiku-4-5 for speed. Then modify it to return the haiku in JSON format with fields for lines 1, 2, and 3. Bonus: add error handling for rate limits.

6. Console Prototyping (CCA 0.2)

Before writing code, the Anthropic Console (console.anthropic.com) lets you prototype prompts, test tool schemas, and configure agents interactively. Think of it as a playground for experimentation — you can iterate on system prompts 10x faster in the Console than in code because there’s no deploy cycle.

Analogy: The Console is like a REPL for AI development. Just as you wouldn’t write a complex Python script without first testing ideas in the REPL, you shouldn’t write production agent code without first validating your approach in the Console.

6.1 Workbench & Agent Setup

When you first open the Workbench, it shows a clean interface with guidance on getting started — write a prompt, click Run, and see Claude’s response immediately:

Figure: The Workbench welcome screen. Write a prompt, select a model, use variables with {{VARIABLE_NAME}} syntax, and click Run.

Here’s a real example: a Poetry Generator prompt using a {{THEME}} variable. Notice how the prompt is structured with clear constraints and output format, and the response appears instantly in the right panel:

Figure: Prompt tab showing a Poetry Generator that uses a {{THEME}} variable. Claude produces a couplet about “autumn leaves” in real time. Use Get Code (top right) to export as Python/TypeScript.
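
Once a prompt moves from the Workbench into code, the {{VARIABLE_NAME}} placeholders need to be filled in by your own application. The following sketch shows one way to do that locally. render_prompt is a hypothetical helper, and restricting names to uppercase letters, digits, and underscores is an assumption based on the examples shown here.

```python
import re
from typing import Dict

# Matches {{NAME}} with optional spaces inside the braces.
_VARIABLE = re.compile(r"\{\{\s*([A-Z][A-Z0-9_]*)\s*\}\}")


def render_prompt(template: str, variables: Dict[str, str]) -> str:
    """Replace each {{NAME}} placeholder; raise KeyError if a value is missing."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"No value supplied for {{{{{name}}}}}")
        return variables[name]

    return _VARIABLE.sub(substitute, template)


prompt = render_prompt("Write a couplet about {{THEME}}.", {"THEME": "autumn leaves"})
print(prompt)  # Write a couplet about autumn leaves.
```

Failing loudly on a missing variable is a deliberate choice here: sending a prompt with a literal {{THEME}} left in it tends to produce confusing output that is harder to debug.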
Console Workbench → Production Code Pipeline
flowchart TD
    subgraph Console["Anthropic Console"]
        PE["Prompt Editor<br/>Write & test system prompts"]
        TT["Tool Tester<br/>Define schemas, see tool calls"]
        AB["Agent Builder<br/>Configure multi-tool agents"]
        EV["Evaluation Tool<br/>Run test cases against prompts"]
    end
    subgraph Workflow["Development Pipeline"]
        S1["Step 1: Prototype<br/>Try prompts, test inputs,<br/>iterate until correct"]
        S2["Step 2: Export to Code<br/>Python / TypeScript / curl<br/>snippets auto-generated"]
        S3["Step 3: Production Features<br/>Error handling, streaming,<br/>caching, integration"]
    end
    Console --> S1
    S1 --> S2
    S2 --> S3
    subgraph Extras["Also Available"]
        MCP[MCP Connector]
        VH[Version History]
        TS[Team Sharing]
        CE[Cost Estimation]
    end

Key Console URLs:

  • console.anthropic.com/workbench — Prompt playground
  • console.anthropic.com/settings — API keys and org settings
  • console.anthropic.com/usage — Token usage dashboard
  • console.anthropic.com/agents — Agent configuration

6.2 Console Evaluation Tool

Switch to the Evaluate tab to run your prompt against multiple test inputs simultaneously. Each row tests a different variable value, and you can compare outputs, add scoring, and export results:

Figure: Evaluate tab testing the Poetry Generator with 3 themes (“autumn leaves”, “bright sun”, “morning coffee”). Use + Add Comparison to A/B test prompt versions. Export to CSV for CI/CD integration.
Console Evaluation Cycle
flowchart LR
    A["Define Test Cases<br/>(input + expected behavior)"] --> B["Run All Cases<br/>Against Current Prompt"]
    B --> C{"All Pass?"}
    C -->|Yes| D["Export to JSON<br/>for CI/CD Pipeline"]
    C -->|No| E["Review Failures<br/>+ Response Details"]
    E --> F["Iterate on Prompt"]
    F --> B
# Test case format in Console:
console_test_cases = [
    {
        "name": "Billing classification",
        "user_message": "I was charged twice for my subscription",
        "expected": {
            "tool_called": "classify_ticket",
            "category": "billing",
            "confidence_above": 0.8
        }
    },
    {
        "name": "Ambiguous — should trigger low confidence",
        "user_message": "I have a question about my account and also a tech issue",
        "expected": {
            "tool_called": "classify_ticket",
            "confidence_below": 0.7  # Should trigger human review
        }
    },
    {
        "name": "Out-of-scope rejection",
        "user_message": "What's the weather like today?",
        "expected": {
            "contains": "I can only help with",
            "tool_called": None  # Should NOT call any tool
        }
    }
]

# Console eval features:
# - Batch run: test all cases at once
# - Comparison mode: run same cases on two different prompts (A/B)
# - Cost tracking: see total tokens used for the eval suite
# - Export: download test cases as JSON (import into CI/CD evals)
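
Test cases in this shape can also be checked locally once they leave the Console. The sketch below compares one observed result against a case's expected block. Both check_case and the shape of the observed record are hypothetical: your own harness would build that record from the model's response.

```python
from typing import Any, Dict, List


def check_case(expected: Dict[str, Any], observed: Dict[str, Any]) -> List[str]:
    """Compare one observed result with a test case's `expected` block.

    `observed` is a hypothetical record built by your own harness:
    {"text": str, "tool_called": str or None,
     "tool_input": {"category": str, "confidence": float}}.
    Returns a list of failure messages; an empty list means the case passed.
    """
    failures: List[str] = []
    tool_input = observed.get("tool_input") or {}
    confidence = tool_input.get("confidence")

    if "tool_called" in expected and observed.get("tool_called") != expected["tool_called"]:
        failures.append(f"tool_called was {observed.get('tool_called')!r}")
    if "category" in expected and tool_input.get("category") != expected["category"]:
        failures.append(f"category was {tool_input.get('category')!r}")
    if "confidence_above" in expected:
        if confidence is None or confidence <= expected["confidence_above"]:
            failures.append(f"confidence {confidence!r} not above {expected['confidence_above']}")
    if "confidence_below" in expected:
        if confidence is None or confidence >= expected["confidence_below"]:
            failures.append(f"confidence {confidence!r} not below {expected['confidence_below']}")
    if "contains" in expected and expected["contains"] not in observed.get("text", ""):
        failures.append(f"text does not contain {expected['contains']!r}")
    return failures


# Example: the billing case against a made-up observation.
billing_expected = {"tool_called": "classify_ticket", "category": "billing", "confidence_above": 0.8}
billing_observed = {
    "text": "",
    "tool_called": "classify_ticket",
    "tool_input": {"category": "billing", "confidence": 0.93},
}
print(check_case(billing_expected, billing_observed))  # [] means the case passed
```

Returning a list of failures rather than a boolean makes it easier to see why a case failed when you run the whole suite.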
CCA Exam Pattern (0.2) — Console Prototyping:
  1. Console Workbench is for prototyping BEFORE writing code. The exam tests whether you know the correct workflow order: prototype in Workbench first, validate behaviour interactively, then export to SDK code. A question might present a scenario where a developer jumps straight to coding — the correct answer is “use the Console first to iterate faster.” The Workbench provides a zero-deploy-cycle feedback loop that makes prompt engineering 10x faster than code-test-deploy cycles.
  2. Console Workbench tool tester lets you define tool schemas and see how Claude calls them. The { } button in the Workbench lets you define JSON tool schemas and observe exactly what JSON Claude generates for tool calls — without writing any client code. You manually provide mock responses to simulate the tool. This is distinct from claude.ai Connectors (see Section 6.3 below), which connect to live remote MCP servers.
  3. Console eval tool generates test cases exportable to CI/CD. You can define input/expected-output pairs in the Console, run them interactively, then export as JSON. That exported JSON feeds directly into your CI/CD evaluation pipeline (covered in Part 9). The key insight: Console evals are a starting point, not a replacement for automated evals — they bootstrap your golden dataset.
  4. Version history allows rolling back system prompt changes. Every edit to a system prompt in the Console is versioned. If a new prompt degrades performance, you roll back to the previous version instantly. Exam questions test this as a safety mechanism: “How do you recover from a prompt regression?” → Answer: Console version history (not manually reverting code commits).

6.3 claude.ai Connectors (Remote MCP Testing)

While the Console Workbench lets you test tool schemas, claude.ai Connectors let you connect Claude to live remote MCP servers — so you can test actual tool execution end-to-end. This feature is at claude.ai/customize/connectors (not in the developer Console).

Figure: Settings → Connectors redirects to Customize. Available on Free (1 connector), Pro, Max, Team, and Enterprise plans.
Figure: Customize → Connectors. Browse pre-built connectors (GitHub, Jira, etc.) or add a custom connector pointing to your own remote MCP server.
Figure: Add custom connector dialog. Provide your remote MCP server URL and optional OAuth credentials. Claude connects from Anthropic’s cloud infrastructure (your server must be publicly reachable).
Key Distinction: The Console Workbench (console.anthropic.com) is where you prototype prompts and tool schemas — you define tools manually and provide mock responses. claude.ai Connectors (claude.ai/customize/connectors) connect to live remote MCP servers for end-to-end testing. For the CCA exam: “fastest way to validate tool schema correctness” = Console Workbench. “Fastest way to test a live MCP server with Claude” = claude.ai Connectors.

7. Models API (CCA 0.3)

The Models API lets you programmatically discover which models are available, their capabilities, and pricing — instead of hardcoding model names. This is essential for production systems that need to adapt when new models launch or old ones are deprecated.

7.1 List & Get Models

import anthropic

client = anthropic.Anthropic()

# LIST available models — GET /v1/models
models = client.models.list()
for model in models.data:
    print(f"{model.id}: {model.display_name} (max output: {model.max_tokens})")

# Example output (as of May 2026, abbreviated):
# claude-opus-4-8: Claude Opus 4.8 (max output: 128000)
# claude-sonnet-4-6: Claude Sonnet 4.6 (max output: 64000)
# claude-haiku-4-5-20251001: Claude Haiku 4.5 (max output: 64000)
# ...

# GET a single model: GET /v1/models/{model_id}
# The SDK method is client.models.retrieve(); there is no client.models.get().
# retrieve() raises anthropic.NotFoundError if the ID does not exist.
target_id = "claude-sonnet-4-6"
match = client.models.retrieve(target_id)
print(f"\nFound: {match.display_name}")
print(f"  Max output: {match.max_tokens} tokens")
print(f"  ID: {match.id}")
Model Versioning Strategy: As of May 2026, model IDs come in two common patterns: pinned dateless snapshots for Claude 4.6+ and older alias-plus-dated-snapshot patterns for pre-4.6 families.
Format | Example | Behaviour | Use When
Pinned dateless snapshot (4.6+) | claude-sonnet-4-6 | Frozen to a specific release even without a date suffix | Production and general use on Claude 4.6+ families
Legacy alias (pre-4.6) | claude-sonnet-4-5 | Convenience pointer that resolves to a dated snapshot | Development or legacy migrations where automatic pointer behavior is acceptable
Dated snapshot (pre-4.6) | claude-sonnet-4-5-20250929 | Frozen to a specific historical release | Legacy production workloads that must stay on an older exact version
Production Rule: For Claude 4.6+ families, the dateless ID is already pinned (for example, claude-sonnet-4-6). For older families, prefer dated snapshots in production instead of convenience aliases. Re-check the model-versioning docs before adopting a new family, because the naming rules changed starting with 4.6.
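
The naming rule above can be turned into a quick lint for configuration files. The sketch below classifies an ID purely from its shape, following the rule as described in this article. It is an assumption-laden helper (classify_model_id is hypothetical), it only understands IDs of the form claude-<family>-<major>[-<minor>][-<date>], and it is no substitute for checking the model-versioning docs.

```python
import re
from typing import Optional

# claude-<family>-<major>[-<minor>][-<YYYYMMDD>]
_MODEL_ID = re.compile(
    r"^claude-(?P<family>[a-z]+)-(?P<major>\d+)"
    r"(?:-(?P<minor>\d{1,2}))?(?:-(?P<date>\d{8}))?$"
)


def classify_model_id(model_id: str) -> Optional[str]:
    """Classify an ID by the naming rule described above; None if the shape is unknown."""
    m = _MODEL_ID.match(model_id)
    if m is None:
        return None
    if m.group("date"):
        return "dated snapshot"
    version = (int(m.group("major")), int(m.group("minor") or 0))
    # Per the article: dateless IDs are pinned from 4.6 onward, aliases before that.
    return "pinned dateless snapshot" if version >= (4, 6) else "legacy alias"


for model_id in ("claude-sonnet-4-6", "claude-sonnet-4-5", "claude-sonnet-4-5-20250929"):
    print(model_id, "->", classify_model_id(model_id))
```

A deployment check could warn whenever a production config contains an ID classified as "legacy alias".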

7.2 Capability-Based Model Selection

Each model family trades off capability, speed, and cost. As of May 2026, choose the cheapest current model that meets your task requirements — don’t default to the most capable model for every request.

Model | Vision | Tools | Thinking | Max Output | Context | Best For | Cost
claude-opus-4-8 |  |  |  | 128K | 1M | Complex reasoning, high-autonomy coding, architecture | $$$
claude-sonnet-4-6 |  |  |  | 64K | 1M | Coding, analysis, general tasks | $$
claude-haiku-4-5 |  |  |  | 64K | 200K | Classification, routing, extraction | $
Model Selection by Task Type
flowchart LR
    TASK{"Task Type?"} -->|"Classification<br/>Routing<br/>Extraction<br/>Summarization"| H["Haiku<br/>Fast · Cheap · Sufficient"]
    TASK -->|"Coding<br/>Analysis<br/>Conversation"| S["Sonnet<br/>Balanced · Default choice"]
    TASK -->|"Architecture<br/>Research<br/>Complex reasoning"| O["Opus + Thinking<br/>Maximum quality"]
    style H fill:#3B9797,color:#fff
    style S fill:#16476A,color:#fff
    style O fill:#132440,color:#fff
CCA Exam Pattern (0.3): Questions test: (1) Model naming/versioning changed starting with Claude 4.6, so check whether a family uses pinned dateless IDs or legacy aliases. (2) Haiku fits simple low-cost tasks, Sonnet is the common default, and Opus is for the highest-complexity work. (3) Models API lets you check capabilities programmatically. (4) All current first-party Claude models support vision, but context and output limits vary by model family.
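
The task-type flowchart above translates directly into a lookup table. This is a minimal sketch: select_model and the task labels are hypothetical, the model IDs are the ones used in this article, and unknown task types fall back to Sonnet as the default choice. For the pre-4.6 Haiku family, the versioning guidance above suggests swapping in the dated snapshot ID for production use.

```python
from typing import Dict

# Task-to-model mapping taken from the flowchart; IDs as used in this article.
MODEL_BY_TASK: Dict[str, str] = {
    "classification": "claude-haiku-4-5",
    "routing": "claude-haiku-4-5",
    "extraction": "claude-haiku-4-5",
    "summarization": "claude-haiku-4-5",
    "coding": "claude-sonnet-4-6",
    "analysis": "claude-sonnet-4-6",
    "conversation": "claude-sonnet-4-6",
    "architecture": "claude-opus-4-8",
    "research": "claude-opus-4-8",
    "complex reasoning": "claude-opus-4-8",
}
DEFAULT_MODEL = "claude-sonnet-4-6"


def select_model(task_type: str) -> str:
    """Return the cheapest model the flowchart assigns to this task type."""
    return MODEL_BY_TASK.get(task_type.strip().lower(), DEFAULT_MODEL)


print(select_model("Classification"))
print(select_model("research"))
print(select_model("something else"))
```

Keeping the mapping in one place means a model upgrade becomes a single-line change instead of a search through every call site.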

Next in the SDK Track

In Part 2: Messages API & Content Blocks, we’ll explore the Messages API in depth — system prompts as a separate parameter, content block architecture (TextBlock, ToolUseBlock, ToolResultBlock, ThinkingBlock), streaming events, stop_reason handling, and token counting.