1. The Agent Harness
Deep Agents is a standalone library built on top of LangChain’s core building blocks and the LangGraph runtime. It provides an opinionated but extensible agent harness — the same core tool-calling loop as other agent frameworks, but with built-in capabilities for planning, virtual filesystems, context management, subagent delegation, and code execution.
Rather than manually wiring LangGraph nodes, edges, and state schemas, Deep Agents gives you one function — create_deep_agent() — that assembles a production-ready agent graph with sensible defaults, fully customizable at every layer.
flowchart TD
    A["create_deep_agent()"] --> B["Agent Harness"]
    B --> C["Planning Engine<br/>(write_todos)"]
    B --> D["Virtual Filesystem<br/>(ls, read, write, edit, glob, grep)"]
    B --> E["Context Manager<br/>(offload, summarize, compress)"]
    B --> F["Subagent Orchestrator<br/>(task delegation)"]
    B --> G["Sandbox Runtime<br/>(code execution)"]
    C --> H["To-Do List State"]
    D --> I["Backend Protocol<br/>(State / Store / Composite)"]
    F --> J["Custom Subagents"]
    F --> K["General-Purpose Subagent"]
    G --> L["Interpreters / Sandbox"]
1.1 The create_deep_agent() API
# pip install deepagents langchain-anthropic
from deepagents import create_deep_agent
# Minimal Deep Agent — just a model and a system prompt
agent = create_deep_agent(
model="anthropic:claude-sonnet-4-6",
system_prompt="You are a research assistant. Be thorough and cite sources.",
)
# Invoke the agent
result = agent.invoke(
{"messages": [{"role": "user", "content": "Summarize the latest advances in protein folding."}]}
)
print(result["messages"][-1].content)
The full API signature with all configuration options:
| Parameter | Type | Description |
|---|---|---|
| model | str \| BaseChatModel | Model in provider:model format or pre-configured instance |
| tools | list[BaseTool \| Callable] | Custom tools alongside built-in filesystem tools |
| system_prompt | str | Prepended to the built-in harness prompt |
| middleware | list[AgentMiddleware] | Model call middleware for runtime interception |
| subagents | list[SubAgent \| CompiledSubAgent] | Subagent definitions for task delegation |
| backend | BackendProtocol | Storage backend for the virtual filesystem |
| interrupt_on | dict[str, bool] | Tools requiring human approval before execution |
| permissions | list[FilesystemPermission] | Filesystem access control rules |
| skills | list[str] | Paths to SKILL.md files for progressive disclosure |
| memory | list[str] | Paths to AGENTS.md memory files (always loaded) |
| context_schema | type[ContextT] | Typed runtime context shape |
| checkpointer | Checkpointer | State persistence for conversations |
| store | BaseStore | Cross-thread persistent storage |
1.2 Planning with To-Do Lists
Every Deep Agent has a built-in write_todos tool that maintains a structured task list. For complex requests, the agent breaks work into discrete tasks and works through each item systematically:
# pip install deepagents langchain-openai
from deepagents import create_deep_agent
agent = create_deep_agent(
model="openai:gpt-4.1-mini",
system_prompt="You are a technical writer. Break complex requests into clear steps.",
)
result = agent.invoke({
"messages": [{
"role": "user",
"content": "Write a comprehensive comparison of PostgreSQL vs MySQL for a new SaaS app."
}]
})
# The agent internally creates a to-do list:
# [x] Research PostgreSQL strengths for SaaS workloads
# [x] Research MySQL strengths for SaaS workloads
# [x] Compare performance benchmarks
# [x] Write recommendation summary
# Then works through each item, checking them off as it goes.
print(result["messages"][-1].content)
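The text does not show the shape of the to-do state, so the following is only a rough sketch. It assumes each item carries a content string and one of three statuses; `Todo`, `render`, and `next_open` are hypothetical names, not part of the deepagents API.

```python
from dataclasses import dataclass
from typing import Literal, Optional

Status = Literal["pending", "in_progress", "completed"]


@dataclass
class Todo:
    content: str
    status: Status = "pending"


def render(todos: list[Todo]) -> str:
    """Render the list as checkbox lines, similar to the comments above."""
    marks = {"pending": "[ ]", "in_progress": "[~]", "completed": "[x]"}
    return "\n".join(f"{marks[t.status]} {t.content}" for t in todos)


def next_open(todos: list[Todo]) -> Optional[Todo]:
    """Return the first item that is not completed, or None when done."""
    return next((t for t in todos if t.status != "completed"), None)


todos = [
    Todo("Research PostgreSQL strengths", "completed"),
    Todo("Research MySQL strengths", "in_progress"),
    Todo("Write recommendation summary"),
]
print(render(todos))
```

Keeping the list as plain data means it can be written back in full on every update, which is one simple way for a planning tool to keep state consistent.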
2. Context Engineering
The most critical challenge for long-running agents is context management. An agent executing a 50-step task may generate hundreds of tool calls and thousands of tokens. Without active management, the conversation exceeds the context window and the agent fails. Deep Agents solves this with a layered context engineering system.
| Context Type | When Loaded | Persistence | Example |
|---|---|---|---|
| Input Context | Every turn | Per-thread | System prompt, memory files (AGENTS.md), skill frontmatter |
| Runtime Context | Per invocation | Per-call | User identity, API keys, session metadata |
| Compression | At 85% capacity | Automatic | Offloading tool results, summarizing history |
| Isolation | Per subagent | Scoped | Subagent work stays in its own context |
| Long-term Memory | On demand | Cross-thread | Preferences in /memories/ path |
2.1 Input Context
Input context is the foundation — content loaded into every turn. The system prompt you provide is prepended to the harness’s built-in prompt:
# pip install deepagents langchain-anthropic
from deepagents import create_deep_agent
agent = create_deep_agent(
model="anthropic:claude-sonnet-4-6",
system_prompt="""You are a senior data engineer specializing in ETL pipelines.
Rules:
- Always validate data schemas before transformation
- Use incremental loading over full refreshes when possible
- Log all data quality issues to the /quality-reports/ directory""",
memory=["AGENTS.md"], # Always loaded into context alongside system prompt
)
result = agent.invoke({
"messages": [{"role": "user", "content": "Design an ETL pipeline for daily sales data."}]
})
print(result["messages"][-1].content)
2.2 Runtime Context
Runtime context carries per-invocation data (user identity, API keys, roles) that tools can read but the LLM never sees, which creates a security boundary for sensitive values:
# pip install deepagents langchain-anthropic
from dataclasses import dataclass
from deepagents import create_deep_agent
from langchain.tools import tool, ToolRuntime
@dataclass
class UserContext:
    user_id: str
    user_role: str  # "admin" | "viewer"
    api_key: str  # Never visible to the LLM


@tool
def fetch_user_orders(query: str, runtime: ToolRuntime[UserContext]) -> str:
    """Fetch orders for the currently authenticated user.

    Args:
        query: Search filter for orders
    """
    user_id = runtime.context.user_id
    return f"Orders for {user_id} matching '{query}': [order-001, order-002]"


@tool
def update_order_status(order_id: str, status: str, runtime: ToolRuntime[UserContext]) -> str:
    """Update an order's status. Requires admin role.

    Args:
        order_id: The order to update
        status: New status value
    """
    if runtime.context.user_role != "admin":
        return f"Permission denied: admin role required (your role: {runtime.context.user_role})"
    return f"Order {order_id} updated to '{status}'"
agent = create_deep_agent(
model="anthropic:claude-sonnet-4-6",
tools=[fetch_user_orders, update_order_status],
context_schema=UserContext,
system_prompt="You are an order management assistant.",
)
# Context is passed per invocation — different per user/session
result = agent.invoke(
{"messages": [{"role": "user", "content": "Show my recent orders and ship order-001."}]},
context=UserContext(user_id="user-789", user_role="admin", api_key="sk-secret"),
)
print(result["messages"][-1].content)
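One way to picture the security boundary: the model only ever sees the tool arguments it is allowed to fill in, while the runtime argument is injected by the harness at call time. The sketch below uses only the standard library and is a simplification; `model_visible_args` and `call_tool` are hypothetical helpers, and the context is passed directly rather than through a `ToolRuntime` wrapper.

```python
import inspect
from dataclasses import dataclass


@dataclass
class UserContext:
    user_id: str
    api_key: str  # sensitive value that must not reach the model


def fetch_user_orders(query: str, runtime: UserContext) -> str:
    """Tool body: reads identity from the injected context."""
    return f"Orders for {runtime.user_id} matching '{query}'"


def model_visible_args(func, hidden=("runtime",)) -> list[str]:
    """Argument names that would appear in the schema shown to the model."""
    return [name for name in inspect.signature(func).parameters if name not in hidden]


def call_tool(func, model_args: dict, context: UserContext) -> str:
    """Merge model-supplied arguments with the harness-injected context."""
    return func(**model_args, runtime=context)


ctx = UserContext(user_id="user-789", api_key="sk-secret")
print(model_visible_args(fetch_user_orders))
print(call_tool(fetch_user_orders, {"query": "recent"}, ctx))
```

The secret stays out of the prompt as long as tools avoid echoing it in their return values, since tool results do flow back into the conversation.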
2.3 Compression & Offloading
The harness automatically manages context window usage with two mechanisms:
- Offloading (at 20k tokens): Large tool results are saved to the virtual filesystem, replaced with a reference pointer
- Summarization (at 85% capacity): Conversation history is compressed into a structured summary preserving the to-do list, key findings, and file paths
stateDiagram-v2
[*] --> Normal: Agent starts
Normal --> Offloading: Tool result > 20k tokens
Offloading --> Normal: Content saved to filesystem
Normal --> Summarization: Context at 85% capacity
Summarization --> Compressed: Summary replaces history
Compressed --> Normal: Agent continues with summary
Normal --> [*]: Task complete
# pip install deepagents langchain-anthropic
from deepagents import create_deep_agent
# Context compression is automatic — no configuration needed
# When a tool returns a large result:
# 1. Result saved to filesystem at a generated path
# 2. Message history replaces result with reference pointer
# 3. Agent can read_file to retrieve specific sections as needed
agent = create_deep_agent(
model="anthropic:claude-sonnet-4-6",
system_prompt="""You are a research agent. When summarizing progress, preserve:
1. Current to-do list with status
2. Key findings and decisions
3. All file paths created/modified
4. Pending tasks and dependencies""",
)
result = agent.invoke({
"messages": [{
"role": "user",
"content": "Research all OWASP Top 10 vulnerabilities with code examples for each."
}]
})
print(result["messages"][-1].content)
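The two thresholds described in this section can be sketched with plain Python. This is an illustration under stated assumptions, not the harness's implementation: the 4-characters-per-token estimate, the `/offloaded/` path scheme, and the function names are all hypothetical.

```python
OFFLOAD_THRESHOLD_TOKENS = 20_000
SUMMARIZE_PERCENT = 85


def estimate_tokens(text: str) -> int:
    """Crude token estimate (assumption: roughly 4 characters per token)."""
    return len(text) // 4


def offload_if_large(tool_name: str, result: str, files: dict[str, str]) -> str:
    """Store oversized results in `files` and return a short pointer instead."""
    if estimate_tokens(result) <= OFFLOAD_THRESHOLD_TOKENS:
        return result
    path = f"/offloaded/{tool_name}_{len(files)}.txt"
    files[path] = result
    return f"[Result offloaded to {path}; read the file to view it]"


def needs_summarization(used_tokens: int, window_tokens: int) -> bool:
    """True once usage reaches 85% of the context window (integer math)."""
    return used_tokens * 100 >= window_tokens * SUMMARIZE_PERCENT


files: dict[str, str] = {}
small = offload_if_large("search", "x" * 100, files)
large = offload_if_large("search", "x" * 100_000, files)
print(small == "x" * 100, large, list(files))
```

The message history keeps only the pointer, so the cost of a large result is paid once on disk rather than on every subsequent model call.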
3. Backends & Virtual Filesystem
3.1 Virtual Filesystem Tools
Every Deep Agent has access to a configurable virtual filesystem, regardless of which backend is configured:
| Tool | Description | Example |
|---|---|---|
| ls | List directory contents with metadata | ls(path="/") |
| read_file | Read file contents (text + multimodal) | read_file(path="/report.md") |
| write_file | Create new files | write_file(path="/output.csv", content="...") |
| edit_file | Exact string replacements in files | edit_file(path="/main.py", old="...", new="...") |
| glob | Find files matching patterns | glob(pattern="**/*.py") |
| grep | Search file contents | grep(pattern="TODO", path="/") |
| execute | Run shell commands (sandbox only) | execute(command="python main.py") |
3.2 Backend Types
# pip install deepagents langchain-anthropic
from deepagents import create_deep_agent
from deepagents.backends import StateBackend, CompositeBackend, StoreBackend
from langgraph.store.memory import InMemoryStore
# 1. StateBackend (default) — in-memory, thread-scoped
# Files exist only for the duration of the conversation
agent_memory = create_deep_agent(
model="anthropic:claude-sonnet-4-6",
# backend=StateBackend() # This is the default
)
# 2. StoreBackend — persists files across threads via LangGraph Store
store = InMemoryStore() # Use PostgresStore for production
agent_persistent = create_deep_agent(
model="anthropic:claude-sonnet-4-6",
backend=StoreBackend(store=store, namespace=["project", "docs"]),
)
# 3. CompositeBackend — routes paths to different backends
agent_composite = create_deep_agent(
model="anthropic:claude-sonnet-4-6",
backend=CompositeBackend(
StateBackend(), # Default for unmatched paths
routes={
"/memories/": StoreBackend(store=store, namespace=["user", "memory"]),
},
),
)
result = agent_composite.invoke({
"messages": [{"role": "user", "content": "Remember that I prefer Python over JavaScript."}]
})
print(result["messages"][-1].content)
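Path routing in a composite backend can be pictured as a prefix lookup with a fallback. The sketch below assumes longest-prefix matching, which the text does not confirm; `DictBackend` and `PrefixRouter` are hypothetical stand-ins, not deepagents classes.

```python
class DictBackend:
    """Toy backend that keeps files in a dictionary."""

    def __init__(self, name: str):
        self.name = name
        self.files: dict[str, str] = {}

    def write(self, path: str, content: str) -> None:
        self.files[path] = content

    def read(self, path: str) -> str:
        return self.files[path]


class PrefixRouter:
    """Send each path to the backend with the longest matching prefix."""

    def __init__(self, default: DictBackend, routes: dict[str, DictBackend]):
        self.default = default
        # Longest prefixes first so "/memories/team/" would beat "/memories/"
        self.routes = sorted(routes.items(), key=lambda kv: len(kv[0]), reverse=True)

    def backend_for(self, path: str) -> DictBackend:
        for prefix, backend in self.routes:
            if path.startswith(prefix):
                return backend
        return self.default

    def write(self, path: str, content: str) -> None:
        self.backend_for(path).write(path, content)

    def read(self, path: str) -> str:
        return self.backend_for(path).read(path)


scratch = DictBackend("thread-scoped")
durable = DictBackend("cross-thread")
fs = PrefixRouter(scratch, {"/memories/": durable})
fs.write("/memories/prefs.md", "Prefers Python over JavaScript")
fs.write("/draft.md", "temporary notes")
print(fs.backend_for("/memories/prefs.md").name, fs.backend_for("/draft.md").name)
```

With this layout, anything under `/memories/` outlives the conversation while scratch files disappear with the thread.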
3.3 Sandbox Execution
For agents that need to run generated code, sandbox backends provide isolated execution environments with an execute tool:
# pip install deepagents langchain-anthropic langsmith
from deepagents import create_deep_agent
from deepagents.backends import LangSmithSandbox
from langsmith.sandbox import Sandbox
# Create a sandboxed agent
sandbox = Sandbox() # Requires LANGSMITH_API_KEY
backend = LangSmithSandbox(sandbox)
agent = create_deep_agent(
model="anthropic:claude-sonnet-4-6",
backend=backend,
system_prompt="""You are a data scientist. When you need to run code:
1. Write the script to the filesystem
2. Use the execute tool to run it
3. Analyze the output""",
)
result = agent.invoke({
"messages": [{
"role": "user",
"content": "Generate a Monte Carlo simulation for option pricing and run it."
}]
})
print(result["messages"][-1].content)
From Notebook to 10K Users
A startup launched their LangChain-powered product to 10,000 users. Key production decisions: LangServe for the API layer, Redis for response caching (saved 40% of API costs), a circuit breaker for graceful degradation during OpenAI outages, and LangSmith monitoring with alerts on latency spikes. They went from prototype to production in 3 weeks with zero downtime incidents in the first month.
4. Subagents & Task Delegation
Subagents solve the context bloat problem. When the main agent encounters a heavy task (web search, large file analysis), it spawns a subagent with a fresh context window. The subagent completes the task and returns only the final result — keeping the main agent’s context clean.
flowchart LR
    A["Main Agent<br/>(clean context)"] -->|"task tool"| B["Subagent<br/>(fresh context)"]
    B --> C["Tool Call 1<br/>50k tokens"]
    B --> D["Tool Call 2<br/>30k tokens"]
    B --> E["Tool Call N<br/>25k tokens"]
    C & D & E --> F["Final Summary<br/>~300 words"]
    F -->|"single result"| A
4.1 SubAgent Configuration
# pip install deepagents langchain-anthropic langchain-openai tavily-python
import os
from deepagents import create_deep_agent
from tavily import TavilyClient
tavily_client = TavilyClient(api_key=os.environ["TAVILY_API_KEY"])
def internet_search(query: str, max_results: int = 5) -> str:
    """Run a web search using Tavily."""
    results = tavily_client.search(query, max_results=max_results)
    return str(results)
# Define specialized subagents
research_subagent = {
"name": "research-agent",
"description": "Researches topics using web search. Delegate here for information gathering.",
"system_prompt": "You are a research specialist. Search thoroughly and synthesize findings.",
"tools": [internet_search],
"model": "openai:gpt-4.1-mini", # Cheap and fast for search tasks
}
code_subagent = {
"name": "code-agent",
"description": "Writes, tests, and debugs code. Delegate here for programming tasks.",
"system_prompt": "You are a senior Python developer. Write clean, tested code.",
"tools": [], # Uses filesystem tools by default
"model": "anthropic:claude-sonnet-4-6", # Capable model for code
}
# Main agent delegates to specialized subagents
agent = create_deep_agent(
model="anthropic:claude-haiku-4-5", # Cheap model for orchestration
subagents=[research_subagent, code_subagent],
system_prompt="You are a project lead. Delegate research to research-agent and coding to code-agent.",
)
result = agent.invoke({
"messages": [{
"role": "user",
"content": "Research rate limiting best practices, then write a Python implementation."
}]
})
print(result["messages"][-1].content)
4.2 CompiledSubAgent (Custom LangGraph Graphs)
For complex workflows, provide a pre-compiled LangGraph graph as a subagent:
# pip install deepagents langchain-anthropic
from deepagents import create_deep_agent, CompiledSubAgent
from langchain.agents import create_agent
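# Create a custom agent graph using LangChain's create_agent.
# internet_search is the Tavily search function defined in section 4.1.
custom_graph = create_agent(
    model="anthropic:claude-sonnet-4-6",
    tools=[internet_search],
    system_prompt="You are a specialized data analysis agent...",  # create_agent takes system_prompt, not prompt
)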
# Wrap it as a CompiledSubAgent
data_analyzer = CompiledSubAgent(
name="data-analyzer",
description="Specialized agent for complex data analysis tasks",
runnable=custom_graph,
)
agent = create_deep_agent(
model="anthropic:claude-haiku-4-5",
subagents=[data_analyzer],
system_prompt="Delegate data analysis tasks to the data-analyzer subagent.",
)
result = agent.invoke({
"messages": [{"role": "user", "content": "Analyze Q1 sales trends by region."}]
})
print(result["messages"][-1].content)
4.3 Context Isolation & Propagation
Runtime context propagates to subagents: each subagent receives the parent's UserContext (user_id, role, etc.), so multi-agent workflows stay consistently authorized without manual plumbing. Conversation history does not propagate: each subagent starts with a fresh context window.
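The split between what propagates and what stays isolated can be sketched as follows. This is a toy model with hypothetical function names; real subagents run a full tool-calling loop rather than replaying canned tool outputs.

```python
def run_subagent(task: str, context: dict, tool_outputs: list[str]) -> str:
    """Run in a fresh history; only the short summary leaves this function."""
    messages = [{"role": "user", "content": task}]  # no parent history here
    for output in tool_outputs:
        messages.append({"role": "tool", "content": output})
    return f"Completed '{task}' for {context['user_id']} using {len(tool_outputs)} tool calls"


def delegate(parent_messages: list[dict], task: str, context: dict,
             tool_outputs: list[str]) -> list[dict]:
    """Pass the runtime context down, append only the summary to the parent."""
    summary = run_subagent(task, context, tool_outputs)
    return parent_messages + [{"role": "tool", "content": summary}]


parent = [{"role": "user", "content": "Research rate limiting"}]
context = {"user_id": "user-789", "user_role": "admin"}
bulky = ["A" * 50_000, "B" * 30_000]
updated = delegate(parent, "web research", context, bulky)
print(updated[-1]["content"])
```

The parent history grows by one short message no matter how much material the subagent processed.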
5. Skills & Memory
5.1 Skills System
Skills provide specialized workflows and domain knowledge using progressive disclosure — the agent reads only the frontmatter at startup and loads full skill content on demand when relevant:
# pip install deepagents langchain-anthropic
from deepagents import create_deep_agent
# Skills are SKILL.md files with YAML frontmatter
# The agent loads frontmatter at startup, reads full content when needed
agent = create_deep_agent(
model="anthropic:claude-sonnet-4-6",
skills=[
"/skills/database-optimization/",
"/skills/api-design/",
"/skills/testing-patterns/",
],
system_prompt="You are a backend engineer. Use your skills when relevant.",
)
# The agent will only load the full "database-optimization" skill
# content when the task requires database work
result = agent.invoke({
"messages": [{"role": "user", "content": "Optimize the slow query on the orders table."}]
})
print(result["messages"][-1].content)
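Progressive disclosure depends on separating a skill's short frontmatter from its full body. The sketch below is a minimal parser that assumes simple `key: value` frontmatter between `---` lines; the sample file content and the `split_frontmatter` helper are illustrative, and a real loader would likely use a YAML parser.

```python
SKILL_MD = """---
name: database-optimization
description: Diagnose and fix slow SQL queries
---
# Steps
1. Run EXPLAIN ANALYZE
2. Check indexes
"""


def split_frontmatter(text: str) -> tuple[dict[str, str], str]:
    """Return (metadata, body) for a document with '---' delimited frontmatter."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    try:
        end = lines.index("---", 1)  # closing delimiter
    except ValueError:
        return {}, text
    meta: dict[str, str] = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta, "\n".join(lines[end + 1:]).strip()


meta, body = split_frontmatter(SKILL_MD)
# At startup only `meta` would be placed in context; `body` is read on demand.
print(meta["name"], "->", meta["description"])
```

Only a few dozen tokens per skill are paid up front, and the full instructions cost context only when a task calls for them.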
5.2 Memory Files (AGENTS.md)
Memory files are always loaded into context (unlike skills, which use progressive disclosure). Use them for persistent instructions, coding standards, and domain knowledge:
# pip install deepagents langchain-anthropic
from deepagents import create_deep_agent
# Memory files are always loaded — they survive context compression
agent = create_deep_agent(
model="anthropic:claude-sonnet-4-6",
memory=["AGENTS.md"], # Project-level conventions and preferences
system_prompt="You are a code assistant for the acme-api project.",
)
# AGENTS.md might contain:
# - Code style preferences
# - Project architecture decisions
# - Naming conventions
# - Testing requirements
# All of this is available every turn, regardless of context compression.
result = agent.invoke({
"messages": [{"role": "user", "content": "Add a new endpoint for user preferences."}]
})
print(result["messages"][-1].content)
6. Middleware, HITL & Permissions
6.1 Custom Middleware
Middleware intercepts every model call for routing, logging, or transformation:
# pip install deepagents langchain-anthropic langchain-openai
from langchain.agents.middleware.types import AgentMiddleware
from langchain.chat_models import init_chat_model
from deepagents import create_deep_agent
class RouteByComplexity(AgentMiddleware):
    """Route simple messages to a cheaper model, complex ones to a capable model."""

    def wrap_model_call(self, request, handler):
        # Use the length of the latest message as a rough complexity signal
        last_msg = request.messages[-1].content if request.messages else ""
        if len(last_msg) < 200:
            request = request.override(model=init_chat_model("openai:gpt-4.1-mini"))
        else:
            request = request.override(model=init_chat_model("anthropic:claude-sonnet-4-6"))
        return handler(request)
agent = create_deep_agent(
model="anthropic:claude-haiku-4-5", # Default (cheap)
middleware=[RouteByComplexity()],
system_prompt="You are a research assistant.",
)
result = agent.invoke({
"messages": [{"role": "user", "content": "What is the capital of France?"}]
})
print(result["messages"][-1].content)
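The wrap-and-delegate pattern behind model-call middleware can be shown without any framework. In this sketch requests are plain dictionaries and the model is a stub; `compose`, `route_by_length`, and `log_calls` are hypothetical names, and the ordering rule (first listed is outermost) is an assumption.

```python
from typing import Callable

Handler = Callable[[dict], str]


def route_by_length(handler: Handler) -> Handler:
    """Pick a model name from prompt length, then pass the request on."""
    def wrapped(request: dict) -> str:
        model = "small-model" if len(request["prompt"]) < 200 else "large-model"
        return handler({**request, "model": model})
    return wrapped


def log_calls(log: list[str]) -> Callable[[Handler], Handler]:
    """Build a middleware that records each prompt before delegating."""
    def middleware(handler: Handler) -> Handler:
        def wrapped(request: dict) -> str:
            log.append(request["prompt"])
            return handler(request)
        return wrapped
    return middleware


def fake_model(request: dict) -> str:
    """Stub standing in for the real model call."""
    return f"{request['model']} answered"


def compose(handler: Handler, *middlewares: Callable[[Handler], Handler]) -> Handler:
    """Wrap the handler so the first middleware listed runs outermost."""
    for middleware in reversed(middlewares):
        handler = middleware(handler)
    return handler


log: list[str] = []
pipeline = compose(fake_model, log_calls(log), route_by_length)
print(pipeline({"prompt": "What is the capital of France?"}))
```

Each layer sees the request on the way in and the response on the way out, which is what makes routing, logging, and transformation composable.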
6.2 Human-in-the-Loop
The interrupt_on parameter pauses execution before critical tool calls, returning control for human approval:
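# pip install deepagents langchain-anthropic
from deepagents import create_deep_agent
from langgraph.checkpoint.memory import MemorySaver
from langgraph.types import Command

agent = create_deep_agent(
    model="anthropic:claude-sonnet-4-6",
    interrupt_on={
        "execute": True,  # Pause before running shell commands
        "write_file": True,  # Pause before creating files
        "edit_file": True,  # Pause before editing files
    },
    checkpointer=MemorySaver(),  # Interrupts need a checkpointer to pause and resume
    system_prompt="You are a code assistant. Explain what you plan to do before doing it.",
)
# The thread_id ties the resumed call to the paused run
config = {"configurable": {"thread_id": "hitl-demo"}}

# First invocation: the agent plans work and hits the interrupt
result = agent.invoke(
    {"messages": [{"role": "user", "content": "Fix the bug in main.py"}]},
    config=config,
)

# LangGraph reports pending interrupts under the "__interrupt__" key.
# The payload and resume shapes below follow LangChain's human-in-the-loop
# middleware and may differ between versions, so inspect the payload first.
if result.get("__interrupt__"):
    pending = result["__interrupt__"][0].value
    for request in pending["action_requests"]:
        print(f"Agent wants to call: {request['name']}")
        print(f"Request details: {request}")
    # Approve and continue:
    # result = agent.invoke(Command(resume={"decisions": [{"type": "approve"}]}), config=config)
    # Reject with feedback:
    # result = agent.invoke(
    #     Command(resume={"decisions": [{"type": "reject", "message": "Don't edit that."}]}),
    #     config=config,
    # )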
6.3 Filesystem Permissions
Declarative rules restrict filesystem access with first-match-wins semantics:
# pip install deepagents langchain-anthropic
from deepagents import create_deep_agent
from deepagents.middleware.permissions import FilesystemPermission
agent = create_deep_agent(
model="anthropic:claude-sonnet-4-6",
    permissions=[
        # Deny access to secrets. This rule must come first: with first-match-wins,
        # a path such as /workspace/.env would otherwise match the allow rule below.
        FilesystemPermission(operations=["read", "write"], paths=["**/.env", "**/secrets/**"], mode="deny"),
        # Allow read/write to workspace
        FilesystemPermission(operations=["read", "write"], paths=["/workspace/**"], mode="allow"),
        # Allow read-only access to docs
        FilesystemPermission(operations=["read"], paths=["/docs/**"], mode="allow"),
    ],
system_prompt="You are a code assistant with restricted file access.",
)
result = agent.invoke({
"messages": [{"role": "user", "content": "Read the .env file and show me the API keys."}]
})
# Agent will be denied access to .env files
print(result["messages"][-1].content)
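First-match-wins evaluation makes rule order significant: a deny rule listed after a broader allow rule never fires for paths the allow rule already covers. The sketch below illustrates this with the standard library's `fnmatchcase`; the `Rule` class, the default-deny fallback, and the glob semantics are assumptions for illustration, not the library's implementation.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Rule:
    operations: tuple[str, ...]
    paths: tuple[str, ...]
    mode: str  # "allow" or "deny"


def is_allowed(rules: list[Rule], operation: str, path: str, default: bool = False) -> bool:
    """Return the verdict of the first rule matching both operation and path."""
    for rule in rules:
        if operation in rule.operations and any(fnmatchcase(path, p) for p in rule.paths):
            return rule.mode == "allow"
    return default  # assumption: deny when nothing matches


DENY_SECRETS = Rule(("read", "write"), ("**/.env", "**/secrets/**"), "deny")
ALLOW_WORKSPACE = Rule(("read", "write"), ("/workspace/**",), "allow")
ALLOW_DOCS_READ = Rule(("read",), ("/docs/**",), "allow")

safe_order = [DENY_SECRETS, ALLOW_WORKSPACE, ALLOW_DOCS_READ]
risky_order = [ALLOW_WORKSPACE, DENY_SECRETS, ALLOW_DOCS_READ]

print(is_allowed(safe_order, "read", "/workspace/.env"))
print(is_allowed(risky_order, "read", "/workspace/.env"))
```

Placing the most specific deny rules first keeps secrets protected even inside directories that are otherwise writable.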
7. Production Deployment
| Capability | Development | Production |
|---|---|---|
| Backend | StateBackend (in-memory) | StoreBackend with PostgresStore |
| Checkpointer | MemorySaver | AsyncPostgresSaver |
| Sandbox | Local execution | LangSmith Sandbox (isolated) |
| Tracing | Optional | LangSmith with LANGCHAIN_TRACING_V2=true |
| Hosting | Local invocation | Managed Deep Agents or Agent Server |
# pip install deepagents langchain-anthropic langgraph-checkpoint-postgres
import os
from deepagents import create_deep_agent
from deepagents.backends import CompositeBackend, StateBackend, StoreBackend
from langgraph.checkpoint.postgres.aio import AsyncPostgresSaver
from langgraph.store.postgres import AsyncPostgresStore
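from contextlib import asynccontextmanager


# Production configuration
@asynccontextmanager
async def create_production_agent():
    db_url = os.environ["DATABASE_URL"]
    # from_conn_string returns an async context manager for both classes,
    # so the agent is yielded while the connections are still open
    async with AsyncPostgresStore.from_conn_string(db_url) as store:
        async with AsyncPostgresSaver.from_conn_string(db_url) as checkpointer:
            await store.setup()  # Create tables on first run
            await checkpointer.setup()
            # Persistent store for cross-thread data, durable checkpointer for conversation state
            yield create_deep_agent(
                model="anthropic:claude-sonnet-4-6",
                system_prompt="You are a production assistant.",
                backend=CompositeBackend(
                    StateBackend(),
                    routes={"/memories/": StoreBackend(store=store, namespace=["user", "prefs"])},
                ),
                checkpointer=checkpointer,
                store=store,
                interrupt_on={"execute": True},  # Safety gate for code execution
            )


# Usage:
# async with create_production_agent() as agent:
#     result = await agent.ainvoke({"messages": [...]}, config={"configurable": {"thread_id": "t1"}})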
LangChain vs LangGraph vs Deep Agents
LangChain provides core building blocks (models, tools, prompts). LangGraph adds a stateful runtime (graphs, persistence, human-in-the-loop). Deep Agents adds the harness layer on top — opinionated built-in tools (planning, filesystem, subagents) with automatic context management. Choose the level of abstraction that matches your needs.