We use cookies to enhance your browsing experience, serve personalized content, and analyze our traffic.
By clicking "Accept All", you consent to our use of cookies. See our
Privacy Policy
for more information.
AI Application Development Mastery Part 11: AI Application Design Patterns
April 1, 2026Wasil Zafar44 min read
A complete catalog of proven AI application design patterns — from simple prompt-response to complex multi-agent orchestration. Learn which pattern to apply for each use case, which framework implements each pattern best, and how to avoid the anti-patterns that plague production AI systems.
Series Overview: This is Part 11 of our 18-part AI Application Development Mastery series. Having mastered agents and multi-agent systems, we now catalog the complete set of design patterns that underpin every AI application — giving you a systematic framework for architectural decisions.
Just as the Gang of Four cataloged design patterns for object-oriented programming, the AI application ecosystem has developed a set of recurring architectural patterns that solve common problems. Understanding these patterns lets you make faster, better architectural decisions — instead of reinventing solutions, you can apply proven patterns and focus on what makes your application unique.
In this installment, we catalog every major AI application design pattern, organized by complexity tier. For each pattern, we explain what it is, when to use it, how to implement it, and which framework does it best.
Key Insight: Most AI applications are compositions of 2-3 patterns. A customer support bot combines the Chat + Memory pattern with the RAG pattern and possibly the Tool Use pattern. Understanding the building blocks lets you compose them effectively.
1. Core Patterns
Core patterns are the foundational building blocks. Every AI application uses at least one of these. They are simple to implement, well-understood, and form the basis for more complex patterns.
1.1 Prompt-Response Pattern
The simplest AI pattern: send a prompt to an LLM, receive a response. Despite its simplicity, this pattern powers a surprising number of production applications when combined with well-engineered prompts.
# Pattern: Prompt-Response
# Complexity: Low | Latency: Low | Cost: Low
# pip install langchain-openai
import os
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
# Requires OPENAI_API_KEY environment variable
# export OPENAI_API_KEY="sk-..."
# Simple prompt-response with structured output
llm = ChatOpenAI(model="gpt-4", temperature=0)
# Email classifier — a pure prompt-response application
email_classifier = ChatPromptTemplate.from_messages([
("system", """You are an email classifier. Classify the
incoming email into exactly one category:
- URGENT: Requires immediate action
- SUPPORT: Customer support request
- SALES: Sales inquiry or lead
- INTERNAL: Internal team communication
- SPAM: Unwanted or irrelevant
Respond with JSON: {{"category": "...", "confidence": 0.0-1.0,
"reasoning": "..."}}"""),
("human", "{email_text}")
])
chain = email_classifier | llm
# Usage
result = chain.invoke({
"email_text": "Our production database is down and customers "
"cannot access their accounts. Please fix ASAP!"
})
# Output: {"category": "URGENT", "confidence": 0.95,
# "reasoning": "Production outage affecting customers"}
Use Prompt-Response when: Classification, summarization, translation, formatting, extraction from small inputs, and any task where the LLM's training data is sufficient and no external data is needed.
1.2 RAG (Retrieval-Augmented Generation) Pattern
The most important pattern in AI application development. RAG augments the LLM's knowledge with external data by retrieving relevant documents before generating a response.
# Pattern: RAG (Retrieval-Augmented Generation)
# Complexity: Medium | Latency: Medium | Cost: Medium
# pip install langchain-openai langchain-chroma
import os
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_chroma import Chroma
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
# Requires OPENAI_API_KEY environment variable
# export OPENAI_API_KEY="sk-..."
# Components
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vectorstore = Chroma(
collection_name="company_docs",
embedding_function=embeddings,
persist_directory="./chroma_db"
)
retriever = vectorstore.as_retriever(
search_type="mmr", # Maximum Marginal Relevance
search_kwargs={"k": 5, "fetch_k": 20}
)
llm = ChatOpenAI(model="gpt-4", temperature=0)
# RAG prompt template
rag_prompt = ChatPromptTemplate.from_messages([
("system", """Answer the user's question using ONLY the
provided context. If the answer is not in the context,
say "I don't have information about that in our docs."
Context:
{context}"""),
("human", "{question}")
])
# RAG chain
def format_docs(docs):
return "\n\n---\n\n".join(
f"Source: {d.metadata.get('source', 'Unknown')}\n"
f"{d.page_content}" for d in docs
)
rag_chain = (
{"context": retriever | format_docs,
"question": RunnablePassthrough()}
| rag_prompt
| llm
)
# Usage
answer = rag_chain.invoke("What is our refund policy?")
1.3 Tool Use Pattern
The LLM decides which external tools to call, constructs the arguments, interprets the results, and formulates a response. This pattern gives LLMs the ability to take actions in the real world.
# Pattern: Tool Use
# Complexity: Medium | Latency: Medium-High | Cost: Medium
# pip install langchain langchain-openai
import os
from langchain_openai import ChatOpenAI
from langchain_core.tools import tool
from langchain.agents import create_tool_calling_agent, AgentExecutor
from langchain_core.prompts import ChatPromptTemplate
# Requires OPENAI_API_KEY environment variable
# export OPENAI_API_KEY="sk-..."
@tool
def get_weather(city: str) -> str:
"""Get the current weather for a city."""
# In production: call a real weather API
return f"Weather in {city}: 72F, sunny, humidity 45%"
@tool
def calculate(expression: str) -> str:
"""Evaluate a mathematical expression safely."""
allowed = set("0123456789+-*/.()")
if all(c in allowed or c == ' ' for c in expression):
return str(eval(expression))
return "Invalid expression"
@tool
def search_database(query: str) -> str:
"""Search the company database for information."""
# In production: query your actual database
return f"Database results for '{query}': 42 matching records"
# Create tool-calling agent
llm = ChatOpenAI(model="gpt-4", temperature=0)
tools = [get_weather, calculate, search_database]
prompt = ChatPromptTemplate.from_messages([
("system", "You are a helpful assistant with access to tools. "
"Use them when needed to answer questions accurately."),
("human", "{input}"),
("placeholder", "{agent_scratchpad}")
])
agent = create_tool_calling_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
# The LLM decides which tools to call
result = executor.invoke({
"input": "What is the weather in Tokyo and what is 15% of 2500?"
})
2. Intermediate Patterns
Intermediate patterns combine core patterns with state management, persistence, and more complex data flows. These patterns power most production AI applications.
2.1 Chat + Memory Pattern
Maintains conversation history across turns, enabling contextual multi-turn conversations. The LLM references previous messages to provide coherent, context-aware responses.
# Pattern: Chat + Memory
# Complexity: Medium | Latency: Low-Medium | Cost: Medium
# pip install langchain-openai
import os
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.chat_history import InMemoryChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory
# Requires OPENAI_API_KEY environment variable
# export OPENAI_API_KEY="sk-..."
llm = ChatOpenAI(model="gpt-4", temperature=0.7)
# Prompt with memory placeholder
prompt = ChatPromptTemplate.from_messages([
("system", "You are a helpful AI assistant. Use the "
"conversation history to provide contextual answers."),
MessagesPlaceholder(variable_name="history"),
("human", "{input}")
])
chain = prompt | llm
# Session-based memory store
store = {}
def get_session_history(session_id: str):
if session_id not in store:
store[session_id] = InMemoryChatMessageHistory()
return store[session_id]
# Wrap chain with message history
chat_with_memory = RunnableWithMessageHistory(
chain,
get_session_history,
input_messages_key="input",
history_messages_key="history"
)
# Multi-turn conversation
config = {"configurable": {"session_id": "user-123"}}
r1 = chat_with_memory.invoke(
{"input": "My name is Alice and I work at Acme Corp."},
config=config
)
r2 = chat_with_memory.invoke(
{"input": "What company do I work at?"}, # Uses memory
config=config
)
# r2 correctly answers "Acme Corp" from memory
2.2 Document QA Pattern
A specialized combination of RAG and Chat + Memory optimized for question-answering over a specific document corpus. Includes source citation, confidence scoring, and follow-up question handling.
# Pattern: Document QA
# Complexity: Medium | Latency: Medium | Cost: Medium
# Combines: RAG + Chat + Memory + Source Citation
# pip install langchain-openai langchain-chroma pydantic
import os
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_chroma import Chroma
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import JsonOutputParser
from pydantic import BaseModel, Field
# Requires OPENAI_API_KEY environment variable
# export OPENAI_API_KEY="sk-..."
class QAResponse(BaseModel):
answer: str = Field(description="The answer to the question")
sources: list[str] = Field(description="Source documents used")
confidence: float = Field(description="Confidence score 0-1")
follow_up: list[str] = Field(
description="Suggested follow-up questions"
)
llm = ChatOpenAI(model="gpt-4", temperature=0)
parser = JsonOutputParser(pydantic_object=QAResponse)
doc_qa_prompt = ChatPromptTemplate.from_messages([
("system", """You are a document QA assistant. Answer questions
using ONLY the provided context documents. Always cite which
source document(s) you used. Rate your confidence from 0 to 1.
Suggest 2-3 follow-up questions the user might ask.
{format_instructions}
Context Documents:
{context}"""),
("human", "{question}")
])
doc_qa_chain = (
doc_qa_prompt.partial(
format_instructions=parser.get_format_instructions()
)
| llm
| parser
)
# Returns structured response with sources, confidence, follow-ups
2.3 Workflow Automation Pattern
Chains multiple AI steps together with conditional logic, data transformations, and external system integrations. This pattern is the backbone of AI-powered business process automation.
# Pattern: Workflow Automation
# Complexity: Medium-High | Latency: High | Cost: Medium-High
# Best implemented with: n8n, Zapier, LangGraph
# pip install langgraph langchain-openai
import os
from typing import TypedDict
from langgraph.graph import StateGraph, END
from langchain_openai import ChatOpenAI
# Requires OPENAI_API_KEY environment variable
# export OPENAI_API_KEY="sk-..."
class WorkflowState(TypedDict):
email_text: str
classification: str
sentiment: str
response_draft: str
approved: bool
llm = ChatOpenAI(model="gpt-4", temperature=0)
def classify_email(state: WorkflowState) -> dict:
"""Step 1: Classify the incoming email."""
response = llm.invoke(
f"Classify this email as SUPPORT, SALES, or BILLING: "
f"{state['email_text']}"
)
return {"classification": response.content.strip()}
def analyze_sentiment(state: WorkflowState) -> dict:
"""Step 2: Analyze customer sentiment."""
response = llm.invoke(
f"Rate sentiment as POSITIVE, NEUTRAL, or NEGATIVE: "
f"{state['email_text']}"
)
return {"sentiment": response.content.strip()}
def draft_response(state: WorkflowState) -> dict:
"""Step 3: Draft an appropriate response."""
response = llm.invoke(
f"Draft a {state['sentiment'].lower()} tone response "
f"for this {state['classification']} email: "
f"{state['email_text']}"
)
return {"response_draft": response.content}
def route_by_sentiment(state: WorkflowState) -> str:
"""Route negative sentiment to human review."""
if state["sentiment"] == "NEGATIVE":
return "human_review"
return "auto_send"
# Build workflow graph
wf = StateGraph(WorkflowState)
wf.add_node("classify", classify_email)
wf.add_node("sentiment", analyze_sentiment)
wf.add_node("draft", draft_response)
wf.set_entry_point("classify")
wf.add_edge("classify", "sentiment")
wf.add_edge("sentiment", "draft")
wf.add_conditional_edges("draft", route_by_sentiment, {
"human_review": END, # Human reviews negative responses
"auto_send": END # Auto-send positive/neutral
})
workflow = wf.compile()
3. Advanced Patterns
Advanced patterns involve autonomous decision-making, iterative self-improvement, and multi-agent coordination. They are the most powerful but also the most complex and expensive to run.
3.1 Agent Loop Pattern
The agent receives a goal, then enters a think-act-observe loop, iteratively taking actions and refining its approach until the goal is achieved or a termination condition is met.
# Pattern: Agent Loop (ReAct / Think-Act-Observe)
# Complexity: High | Latency: High | Cost: High
# The agent iterates until the goal is achieved
# pip install langgraph langchain-openai
import os
from typing import TypedDict, Annotated
from langgraph.graph import StateGraph, END
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, SystemMessage
import operator
# Requires OPENAI_API_KEY environment variable
# export OPENAI_API_KEY="sk-..."
class AgentLoopState(TypedDict):
messages: Annotated[list, operator.add]
goal: str
plan: str
iteration: int
completed: bool
llm = ChatOpenAI(model="gpt-4", temperature=0)
def think_node(state: AgentLoopState) -> dict:
"""Agent thinks about what to do next."""
messages = [
SystemMessage(content=f"""You are working on: {state['goal']}
Current plan: {state.get('plan', 'None')}
Iteration: {state['iteration']}
Analyze the current situation and decide:
1. What has been accomplished so far?
2. What remains to be done?
3. What is the next specific action?
4. Should we FINISH? (say GOAL_COMPLETE if yes)"""),
*state["messages"][-5:]
]
response = llm.invoke(messages)
completed = "GOAL_COMPLETE" in response.content
return {
"messages": [response],
"completed": completed,
"iteration": state["iteration"] + 1
}
def act_node(state: AgentLoopState) -> dict:
"""Agent takes an action based on its thinking."""
last_thought = state["messages"][-1].content
messages = [
SystemMessage(content="Execute the next action from "
"the plan. Provide concrete output."),
HumanMessage(content=f"Action to take: {last_thought}")
]
response = llm.invoke(messages)
return {"messages": [response]}
def should_continue(state: AgentLoopState) -> str:
if state.get("completed", False):
return "end"
if state.get("iteration", 0) >= 10:
return "end"
return "act"
# Build agent loop graph
loop = StateGraph(AgentLoopState)
loop.add_node("think", think_node)
loop.add_node("act", act_node)
loop.set_entry_point("think")
loop.add_conditional_edges("think", should_continue, {
"act": "act",
"end": END
})
loop.add_edge("act", "think") # Loop back
agent_loop = loop.compile()
3.2 Planning Pattern
The agent first creates an explicit plan (a sequence of steps), then executes the plan step-by-step, re-planning when necessary. This separates strategic thinking from tactical execution.
# Pattern: Planning (Plan-and-Execute)
# Complexity: High | Latency: High | Cost: High
# Separates planning from execution for better results
# pip install langgraph langchain-openai
import os
from typing import TypedDict
from langgraph.graph import StateGraph, END
from langchain_openai import ChatOpenAI
from langchain_core.output_parsers import JsonOutputParser
# Requires OPENAI_API_KEY environment variable
# export OPENAI_API_KEY="sk-..."
class PlanState(TypedDict):
goal: str
plan: list # List of step descriptions
current_step: int
step_results: dict
needs_replan: bool
llm = ChatOpenAI(model="gpt-4", temperature=0)
def planner_node(state: PlanState) -> dict:
"""Create or revise the plan."""
if state.get("needs_replan", False):
prompt = f"""The original plan failed at step
{state['current_step']}. Results so far:
{state['step_results']}
Revise the plan for goal: {state['goal']}
Return a JSON list of remaining steps."""
else:
prompt = f"""Create a step-by-step plan to achieve:
{state['goal']}
Return a JSON list of 3-7 specific, actionable steps.
Example: ["Research topic X", "Write outline", ...]"""
response = llm.invoke(prompt)
# Parse JSON list from response
import json
try:
plan = json.loads(response.content)
except json.JSONDecodeError:
plan = [response.content]
return {
"plan": plan,
"current_step": 0,
"needs_replan": False
}
def executor_node(state: PlanState) -> dict:
"""Execute the current step of the plan."""
step_idx = state["current_step"]
step = state["plan"][step_idx]
response = llm.invoke(
f"Execute this step thoroughly: {step}\n"
f"Previous results: {state.get('step_results', {})}"
)
results = state.get("step_results", {})
results[f"step_{step_idx}"] = response.content
return {
"step_results": results,
"current_step": step_idx + 1
}
def check_progress(state: PlanState) -> str:
if state["current_step"] >= len(state["plan"]):
return "done"
return "execute"
# Build plan-and-execute graph
plan_graph = StateGraph(PlanState)
plan_graph.add_node("planner", planner_node)
plan_graph.add_node("executor", executor_node)
plan_graph.set_entry_point("planner")
plan_graph.add_edge("planner", "executor")
plan_graph.add_conditional_edges("executor", check_progress, {
"execute": "executor",
"done": END
})
plan_execute = plan_graph.compile()
3.3 Multi-Agent Orchestration Pattern
Multiple specialized agents collaborate on a task, coordinated by a supervisor or through peer-to-peer communication. This pattern was covered extensively in Part 10, but here we place it in the context of the full pattern catalog.
# Pattern: Multi-Agent Orchestration
# Complexity: Very High | Latency: Very High | Cost: High
# See Part 10 for detailed implementations
# Summary of multi-agent sub-patterns:
MULTI_AGENT_PATTERNS = {
"supervisor": {
"description": "Central coordinator routes to specialists",
"agents": "3-8",
"best_for": "Quality-gated pipelines, dev workflows",
"framework": "LangGraph (best), CrewAI, AutoGen"
},
"swarm": {
"description": "Autonomous agents with handoff protocols",
"agents": "2-6",
"best_for": "Customer service routing, triage",
"framework": "LangGraph, OpenAI Swarm"
},
"debate": {
"description": "Adversarial agents improve via argumentation",
"agents": "2-4",
"best_for": "Analysis, evaluation, high-stakes decisions",
"framework": "LangGraph, AutoGen"
},
"hierarchical": {
"description": "Tree of managers delegating to workers",
"agents": "5-20+",
"best_for": "Large projects, enterprise automation",
"framework": "CrewAI (hierarchical process), LangGraph"
}
}
4. Pattern Selection Guide
Choosing the right pattern and framework is the most impactful architectural decision you will make. This section provides a systematic approach to pattern selection.
4.1 Framework-Pattern Matrix
This matrix shows which framework is best suited for each design pattern:
Pattern
LangChain
LangGraph
CrewAI
n8n
Zapier
Prompt-Response
Excellent
Overkill
Overkill
Good (AI node)
Good (AI action)
RAG
Excellent
Good
Limited
Good (with plugins)
Limited
Tool Use
Excellent
Excellent
Good
Excellent (native)
Excellent (native)
Chat + Memory
Excellent
Excellent
Good
Good
Limited
Document QA
Excellent
Good
Limited
Good
Limited
Workflow Automation
Good
Excellent
Good
Excellent
Excellent
Agent Loop
Good
Excellent
Limited
Limited
Not supported
Planning
Limited
Excellent
Good (planning=True)
Not supported
Not supported
Multi-Agent
Limited
Excellent
Excellent
Limited
Not supported
4.2 Decision Tree
Follow this decision tree to select the right pattern for your use case:
# AI Pattern Selection Decision Tree
START: What does your application need to do?
|
|-- Single LLM call sufficient?
| |-- YES: Do you need external data?
| | |-- NO --> Prompt-Response (LangChain/direct API)
| | |-- YES --> RAG Pattern (LangChain + vector DB)
| |-- NO: Continue...
|
|-- Does it need conversation history?
| |-- YES: Is it over a document corpus?
| | |-- YES --> Document QA (LangChain)
| | |-- NO --> Chat + Memory (LangChain)
| |-- NO: Continue...
|
|-- Does it need to take actions (APIs, DB, files)?
| |-- YES: Is it a single action or a chain?
| | |-- Single --> Tool Use (LangChain)
| | |-- Chain --> Workflow Automation (LangGraph/n8n)
| |-- NO: Continue...
|
|-- Does it need autonomous multi-step reasoning?
| |-- YES: Does it need explicit planning?
| | |-- YES --> Planning Pattern (LangGraph)
| | |-- NO --> Agent Loop (LangGraph)
| |-- NO: Continue...
|
|-- Does it need multiple specialized agents?
| |-- YES --> Multi-Agent Orchestration (LangGraph/CrewAI)
|
|-- Is the user non-technical?
| |-- YES --> n8n (self-hosted) or Zapier (cloud)
|
|-- None of the above?
--> Start with Prompt-Response and iterate
The Golden Rule of Pattern Selection: Always start with the simplest pattern that could possibly work. Upgrade to a more complex pattern only when you hit a specific limitation. A well-crafted prompt-response can often replace a complex agent loop at 1/10th the cost and latency.
Framework Selection Summary
If you need...
Use this framework
Why
RAG, chains, basic agents
LangChain
Best ecosystem for retrieval and chain composition
Complex agents, stateful workflows, cycles
LangGraph
Graph-based architecture handles any topology
Role-based multi-agent teams
CrewAI
Intuitive team metaphor, task dependencies
No-code visual workflow (self-hosted)
n8n
Visual builder, 400+ integrations, AI nodes
No-code workflow (cloud, simplest)
Zapier
Easiest setup, 6000+ app integrations
5. Anti-Patterns
Anti-patterns are common mistakes that seem reasonable but lead to poor outcomes. Recognizing them saves time, money, and frustration.
5.1 Common Mistakes
Anti-Pattern
What It Looks Like
Why It Fails
Better Approach
The God Prompt
One massive prompt that handles all logic, edge cases, and formatting
Exceeds context limits, becomes fragile, impossible to debug
Break into chains: classify -> route -> handle -> format
Premature Agent-ification
Using a ReAct agent loop for what a simple chain could do
10x more tokens, unpredictable behavior, much higher latency
Start with prompt-response, add agent only when needed
RAG Everything
Putting all data into a vector DB even when not needed
Use RAG only for large, dynamic knowledge bases. Small static data goes in the prompt.
Infinite Agent Loop
No max iterations or termination condition on agent loops
Runaway costs, hangs, and no useful output
Always set max_iterations. Add explicit DONE/FINISH signals.
Memory Bloat
Storing entire conversation history in context forever
Context window overflow, irrelevant old context pollutes responses
Use summary memory, sliding window, or vector memory for long conversations
Framework Overload
Using LangChain + LangGraph + CrewAI + LlamaIndex in one project
Dependency conflicts, debugging nightmare, team confusion
Pick one primary framework. Add others only for specific capabilities.
Tool Explosion
Giving an agent access to 50+ tools
LLM cannot reliably choose from too many options, tool descriptions bloat context
Limit to 5-10 tools per agent. Use a tool router if you need more.
5.2 Pattern Smells
These signs indicate you may be using the wrong pattern:
# Pattern smell detection checklist
PATTERN_SMELLS = {
"High latency for simple tasks": {
"likely_cause": "Using Agent Loop for Prompt-Response tasks",
"fix": "Downgrade to a simpler pattern",
"metric": "If avg response > 10s for classification tasks"
},
"Inconsistent outputs": {
"likely_cause": "Missing structured output parsing",
"fix": "Add Pydantic output parser or function calling",
"metric": "If JSON parse failures > 5% of requests"
},
"Context window errors": {
"likely_cause": "Memory Bloat or God Prompt anti-pattern",
"fix": "Implement summary memory or break prompt into chains",
"metric": "If token count regularly exceeds 80% of limit"
},
"Agent loops without progress": {
"likely_cause": "Infinite Agent Loop anti-pattern",
"fix": "Add progress tracking and termination conditions",
"metric": "If agent takes > 5 iterations for simple tasks"
},
"High cost per query": {
"likely_cause": "Premature Agent-ification",
"fix": "Audit each agent step - can it be a simple chain?",
"metric": "If cost/query > $0.10 for routine operations"
},
"Retrieval returns irrelevant docs": {
"likely_cause": "Poor chunking, wrong embedding model, or "
"RAG Everything anti-pattern",
"fix": "Tune chunk size, use re-ranking, audit what needs RAG",
"metric": "If retrieval relevance score < 0.7 average"
}
}
The Most Expensive Anti-Pattern: Building a multi-agent system when a well-crafted prompt would suffice. A 5-agent system costs 5-20x more per query than a single prompt-response. Always justify the added complexity with measurable quality improvements.
6. Frontend Patterns
Building AI-powered UIs requires specialized frontend patterns for handling streaming data, rendering tool calls, and managing complex chat state. The LangChain ecosystem provides framework-specific hooks that abstract the complexity of real-time LLM communication into declarative, reactive components.
6.1 The useStream Hook
The useStream hook is the core building block for connecting a frontend to a LangGraph backend. Available for all major frameworks, it manages the WebSocket/SSE connection, handles streaming state updates, and exposes a reactive API for rendering messages as they arrive:
Framework
Package
Hook/Function
React
@langchain/react
useStream()
Vue
@langchain/vue
useStream()
Svelte
@langchain/svelte
useStream()
Angular
@langchain/angular
useStream()
import { useStream } from "@langchain/react";
function ChatApp() {
const {
messages, // Reactive message list
submit, // Send a message
isStreaming, // Whether a response is in progress
stop, // Cancel the current stream
error, // Error state
} = useStream({
apiUrl: "http://localhost:8000",
assistantId: "agent",
threadId: "thread-001",
});
const handleSend = (text) => {
submit({ messages: [{ role: "user", content: text }] });
};
return (
{messages.map((msg, i) => (
{msg.content}
))}
{isStreaming &&
Thinking...
}
{
if (e.key === "Enter") handleSend(e.target.value);
}} />
);
}
Key Insight: The useStream hook manages the full lifecycle — connection, reconnection, state synchronization, and cleanup. It handles optimistic updates (showing user messages immediately) and server reconciliation (replacing optimistic state with confirmed server state) automatically.
For branching conversations (allowing users to edit previous messages and explore alternative paths), useStream supports a branch option that creates a new conversation fork from any checkpoint:
import { useStream } from "@langchain/react";
function BranchingChat() {
const { messages, submit, branch } = useStream({
apiUrl: "http://localhost:8000",
assistantId: "agent",
});
const handleEdit = (messageIndex, newContent) => {
// Branch from this message — creates a new thread fork
branch({
messages: [{ role: "user", content: newContent }],
checkpoint: messages[messageIndex].checkpoint_id,
});
};
return (
{messages.map((msg, i) => (
{msg.content}
{msg.role === "user" && (
)}
))}
);
}
6.2 Tool Calling UI
When agents use tools, the frontend needs to render tool calls and their results as distinct UI elements — not just plain text. The useStream hook exposes tool call state through the messages array, where each tool call includes a name, args, and result with a state indicator (pending, completed, or error):
import { useStream } from "@langchain/react";
function ToolCallCard({ toolCall }) {
const { name, args, result, state } = toolCall;
return (
For human-in-the-loop patterns, the frontend can render approval UI when the agent requests permission. The useStream hook pauses on interrupt events, and you resume execution by calling submit with a Command:
import { useStream } from "@langchain/react";
function ApprovalChat() {
const { messages, submit, interrupt } = useStream({
apiUrl: "http://localhost:8000",
assistantId: "agent",
});
// When interrupt is set, agent is waiting for approval
if (interrupt) {
return (
Agent needs approval
{interrupt.value}
);
}
return (
{messages.map((msg, i) => (
{msg.content}
))}
);
}
Security Note: Never execute tool call results as code on the frontend. Tool results from the agent should be treated as untrusted data — always sanitize and escape before rendering in the DOM. Use textContent or a sanitization library, never innerHTML with raw agent output.
6.3 Integrations & Libraries
The LangChain frontend ecosystem integrates with popular UI component libraries for production-ready chat interfaces:
Library
Type
Best For
assistant-ui
React component library
Full-featured chat UI with tool rendering, branching, markdown
AI Elements / shadcn/ui
Headless components
Custom-styled chat with Tailwind CSS, full design control
CopilotKit
React framework
Copilot-style sidebars and embedded AI assistants
OpenUI
Web components
Framework-agnostic AI chat widgets
Integration
Generative UI Pattern
The Generative UI pattern lets the agent return structured data that the frontend renders as rich components — charts, forms, tables, maps — instead of plain text. This requires:
Structured output from the agent (JSON schema defining UI component type and data)
Component registry on the frontend mapping component types to React/Vue/Svelte components
Streaming support to progressively render complex UIs as data arrives
This pattern powers applications like data dashboards, interactive reports, and visual workflow builders where plain text responses are insufficient.
7. Exercises & Self-Assessment
Exercise 1
Pattern Identification
Identify which design pattern(s) each of these real-world applications uses:
ChatGPT — What patterns does the free version use? What about ChatGPT Plus with plugins?
GitHub Copilot — Is it prompt-response, RAG, or something more complex?
Notion AI — What patterns power its "Ask AI" feature vs its "Write with AI" feature?
Perplexity AI — How does it combine search with generation?
Cursor IDE — What pattern enables its multi-file code editing capability?
Exercise 2
Implement Three Patterns
Build these three applications, each using a different pattern:
Prompt-Response: An email subject line generator that produces 5 variants from an email body
RAG: A FAQ bot that answers questions from a set of at least 20 FAQ documents
Agent Loop: A research agent that searches the web, evaluates sources, and writes a summary
Compare: lines of code, tokens per query, latency, and output quality.
Exercise 3
Anti-Pattern Audit
Review this hypothetical AI application and identify all anti-patterns:
A customer support bot that uses a ReAct agent loop for every query (including "What are your hours?")
It stores the entire conversation history (no limit) in the prompt
It has 35 tools available including a calculator, web search, database query, email sender, and 31 others
All company data (50,000 docs) is in one vector store with 2048-token chunks
It uses LangChain + LlamaIndex + CrewAI + a custom framework simultaneously
For each anti-pattern, explain the problem and recommend a fix.
Exercise 4
Framework Selection Challenge
For each scenario, recommend a framework AND a design pattern. Justify both choices:
A legal firm wants to search across 100,000 contracts and answer natural language questions with source citations
A marketing team (non-technical) wants AI to auto-generate social posts from blog articles
A DevOps team wants an AI that autonomously investigates production alerts, checks logs, and suggests fixes
A startup wants to add an AI chatbot to their SaaS product that remembers previous conversations
A research institute wants multiple AI perspectives to debate policy recommendations
Exercise 5
Reflective Questions
Why is the "start simple, upgrade when needed" principle so important for AI applications specifically? How does it differ from traditional software development?
How do you quantify when a prompt-response pattern is "not good enough" and you need to upgrade to RAG or an agent?
What would a "design patterns" library for AI applications look like? How is it different from traditional GoF patterns?
Can you combine anti-patterns to create something worse than each individually? Give an example.
If you were building an AI application framework from scratch, which pattern would you make the easiest to implement and why?
AI Pattern Document Generator
Document an AI application design pattern for your project. Download as Word, Excel, PDF, or PowerPoint.
Draft auto-saved
All data stays in your browser. Nothing is sent to or stored on any server.
Conclusion & Next Steps
You now have a complete catalog of AI application design patterns and a systematic framework for choosing the right one. Here are the key takeaways from Part 11:
Core patterns — Prompt-Response, RAG, and Tool Use are the fundamental building blocks that every AI developer must master
Intermediate patterns — Chat + Memory, Document QA, and Workflow Automation combine core patterns with state management and persistence for production-grade applications
Advanced patterns — Agent Loops, Planning, and Multi-Agent Orchestration enable autonomous, multi-step reasoning but come with significantly higher cost and complexity
Framework-pattern matrix — LangChain excels at RAG and chains, LangGraph at complex agents, CrewAI at multi-agent teams, n8n/Zapier at no-code workflow automation
Start simple — The golden rule is to always start with the simplest pattern that could work, then upgrade only when you hit a specific limitation
Anti-patterns — The God Prompt, Premature Agent-ification, RAG Everything, Infinite Agent Loop, Memory Bloat, Framework Overload, and Tool Explosion are the most common and costly mistakes
Next in the Series
In Part 12: Ecosystem & Frameworks, we provide a comprehensive deep-dive comparison of every major framework and tool in the AI application ecosystem — LangChain, LangGraph, AutoGen, CrewAI, LlamaIndex, n8n, Zapier, HuggingFace, MCP, vLLM, vector databases, and more.
Continue the Series
Part 12: Ecosystem & Frameworks
Comprehensive comparison of LangChain, LangGraph, AutoGen, CrewAI, LlamaIndex, n8n, Zapier, and the entire AI ecosystem.