AI Application Development Mastery Part 11: AI Application Design Patterns

Introduction: Why Design Patterns Matter

                        
                        Series Overview: This is Part 11 of our 18-part AI Application Development Mastery series. Having mastered agents and multi-agent systems, we now catalog the complete set of design patterns that underpin every AI application — giving you a systematic framework for architectural decisions.
                    

AI Application Development Mastery

Your 20-step learning path • Currently on Step 11

1

11

AI Application Design Patterns

RAG, chat+memory, workflow automation, agent loops

You Are Here

12

Ecosystem & Frameworks

LlamaIndex, Haystack, HuggingFace, vLLM

13

MCP Foundations & Architecture

Protocol design, Host/Client/Server, primitives, security

14

MCP in Production

Building servers, integrations, scaling, agent systems

15

Evaluation & LLMOps

Prompt eval, tracing, LangSmith, experiment tracking

16

Production AI Systems

APIs, queues, caching, streaming, scaling

17

Safety, Guardrails & Reliability

Input filtering, hallucination mitigation, prompt injection

18

Advanced Topics

Fine-tuning, tool learning, hybrid LLM+symbolic

19

Building Real AI Applications

Chatbot, document QA, coding assistant, full-stack

20

Future of AI Applications

Autonomous agents, self-improving, multi-modal, AI OS

Just as the Gang of Four cataloged design patterns for object-oriented programming, the AI application ecosystem has developed a set of recurring architectural patterns that solve common problems. Understanding these patterns lets you make faster, better architectural decisions — instead of reinventing solutions, you can apply proven patterns and focus on what makes your application unique.

In this installment, we catalog every major AI application design pattern, organized by complexity tier. For each pattern, we explain what it is, when to use it, how to implement it, and which framework does it best.

                        
                        Key Insight: Most AI applications are compositions of 2-3 patterns. A customer support bot combines the Chat + Memory pattern with the RAG pattern and possibly the Tool Use pattern. Understanding the building blocks lets you compose them effectively.
                    

1. Core Patterns

Core patterns are the foundational building blocks. Every AI application uses at least one of these. They are simple to implement, well-understood, and form the basis for more complex patterns.

1.1 Prompt-Response Pattern

The simplest AI pattern: send a prompt to an LLM, receive a response. Despite its simplicity, this pattern powers a surprising number of production applications when combined with well-engineered prompts.

# Pattern: Prompt-Response
# Complexity: Low | Latency: Low | Cost: Low
# pip install langchain-openai

import os
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate

# Requires OPENAI_API_KEY environment variable
# export OPENAI_API_KEY="sk-..."

# Simple prompt-response with structured output
llm = ChatOpenAI(model="gpt-4", temperature=0)

# Email classifier — a pure prompt-response application
email_classifier = ChatPromptTemplate.from_messages([
    ("system", """You are an email classifier. Classify the
    incoming email into exactly one category:
    - URGENT: Requires immediate action
    - SUPPORT: Customer support request
    - SALES: Sales inquiry or lead
    - INTERNAL: Internal team communication
    - SPAM: Unwanted or irrelevant

    Respond with JSON: {{"category": "...", "confidence": 0.0-1.0,
    "reasoning": "..."}}"""),
    ("human", "{email_text}")
])

chain = email_classifier | llm

# Usage
result = chain.invoke({
    "email_text": "Our production database is down and customers "
                  "cannot access their accounts. Please fix ASAP!"
})
# Output: {"category": "URGENT", "confidence": 0.95,
#          "reasoning": "Production outage affecting customers"}

                        
                        Use Prompt-Response when: Classification, summarization, translation, formatting, extraction from small inputs, and any task where the LLM's training data is sufficient and no external data is needed.
                    

1.2 RAG (Retrieval-Augmented Generation) Pattern

The most important pattern in AI application development. RAG augments the LLM's knowledge with external data by retrieving relevant documents before generating a response.

# Pattern: RAG (Retrieval-Augmented Generation)
# Complexity: Medium | Latency: Medium | Cost: Medium
# pip install langchain-openai langchain-chroma

import os
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_chroma import Chroma
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough

# Requires OPENAI_API_KEY environment variable
# export OPENAI_API_KEY="sk-..."

# Components
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vectorstore = Chroma(
    collection_name="company_docs",
    embedding_function=embeddings,
    persist_directory="./chroma_db"
)
retriever = vectorstore.as_retriever(
    search_type="mmr",  # Maximum Marginal Relevance
    search_kwargs={"k": 5, "fetch_k": 20}
)

llm = ChatOpenAI(model="gpt-4", temperature=0)

# RAG prompt template
rag_prompt = ChatPromptTemplate.from_messages([
    ("system", """Answer the user's question using ONLY the
    provided context. If the answer is not in the context,
    say "I don't have information about that in our docs."

    Context:
    {context}"""),
    ("human", "{question}")
])

# RAG chain
def format_docs(docs):
    return "\n\n---\n\n".join(
        f"Source: {d.metadata.get('source', 'Unknown')}\n"
        f"{d.page_content}" for d in docs
    )

rag_chain = (
    {"context": retriever | format_docs,
     "question": RunnablePassthrough()}
    | rag_prompt
    | llm
)

# Usage
answer = rag_chain.invoke("What is our refund policy?")

1.3 Tool Use Pattern

The LLM decides which external tools to call, constructs the arguments, interprets the results, and formulates a response. This pattern gives LLMs the ability to take actions in the real world.

# Pattern: Tool Use
# Complexity: Medium | Latency: Medium-High | Cost: Medium
# pip install langchain langchain-openai

import os
from langchain_openai import ChatOpenAI
from langchain_core.tools import tool
from langchain.agents import create_tool_calling_agent, AgentExecutor
from langchain_core.prompts import ChatPromptTemplate

# Requires OPENAI_API_KEY environment variable
# export OPENAI_API_KEY="sk-..."

@tool
def get_weather(city: str) -> str:
    """Get the current weather for a city."""
    # In production: call a real weather API
    return f"Weather in {city}: 72F, sunny, humidity 45%"

@tool
def calculate(expression: str) -> str:
    """Evaluate a mathematical expression safely."""
    allowed = set("0123456789+-*/.()")
    if all(c in allowed or c == ' ' for c in expression):
        return str(eval(expression))
    return "Invalid expression"

@tool
def search_database(query: str) -> str:
    """Search the company database for information."""
    # In production: query your actual database
    return f"Database results for '{query}': 42 matching records"

# Create tool-calling agent
llm = ChatOpenAI(model="gpt-4", temperature=0)
tools = [get_weather, calculate, search_database]

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant with access to tools. "
               "Use them when needed to answer questions accurately."),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}")
])

agent = create_tool_calling_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

# The LLM decides which tools to call
result = executor.invoke({
    "input": "What is the weather in Tokyo and what is 15% of 2500?"
})

2. Intermediate Patterns

Intermediate patterns combine core patterns with state management, persistence, and more complex data flows. These patterns power most production AI applications.

2.1 Chat + Memory Pattern

Maintains conversation history across turns, enabling contextual multi-turn conversations. The LLM references previous messages to provide coherent, context-aware responses.

# Pattern: Chat + Memory
# Complexity: Medium | Latency: Low-Medium | Cost: Medium
# pip install langchain-openai

import os
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.chat_history import InMemoryChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory

# Requires OPENAI_API_KEY environment variable
# export OPENAI_API_KEY="sk-..."

llm = ChatOpenAI(model="gpt-4", temperature=0.7)

# Prompt with memory placeholder
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful AI assistant. Use the "
               "conversation history to provide contextual answers."),
    MessagesPlaceholder(variable_name="history"),
    ("human", "{input}")
])

chain = prompt | llm

# Session-based memory store
store = {}

def get_session_history(session_id: str):
    if session_id not in store:
        store[session_id] = InMemoryChatMessageHistory()
    return store[session_id]

# Wrap chain with message history
chat_with_memory = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="input",
    history_messages_key="history"
)

# Multi-turn conversation
config = {"configurable": {"session_id": "user-123"}}
r1 = chat_with_memory.invoke(
    {"input": "My name is Alice and I work at Acme Corp."},
    config=config
)
r2 = chat_with_memory.invoke(
    {"input": "What company do I work at?"},  # Uses memory
    config=config
)
# r2 correctly answers "Acme Corp" from memory

2.2 Document QA Pattern

A specialized combination of RAG and Chat + Memory optimized for question-answering over a specific document corpus. Includes source citation, confidence scoring, and follow-up question handling.

# Pattern: Document QA
# Complexity: Medium | Latency: Medium | Cost: Medium
# Combines: RAG + Chat + Memory + Source Citation
# pip install langchain-openai langchain-chroma pydantic

import os
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_chroma import Chroma
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import JsonOutputParser
from pydantic import BaseModel, Field

# Requires OPENAI_API_KEY environment variable
# export OPENAI_API_KEY="sk-..."

class QAResponse(BaseModel):
    answer: str = Field(description="The answer to the question")
    sources: list[str] = Field(description="Source documents used")
    confidence: float = Field(description="Confidence score 0-1")
    follow_up: list[str] = Field(
        description="Suggested follow-up questions"
    )

llm = ChatOpenAI(model="gpt-4", temperature=0)
parser = JsonOutputParser(pydantic_object=QAResponse)

doc_qa_prompt = ChatPromptTemplate.from_messages([
    ("system", """You are a document QA assistant. Answer questions
    using ONLY the provided context documents. Always cite which
    source document(s) you used. Rate your confidence from 0 to 1.
    Suggest 2-3 follow-up questions the user might ask.

    {format_instructions}

    Context Documents:
    {context}"""),
    ("human", "{question}")
])

doc_qa_chain = (
    doc_qa_prompt.partial(
        format_instructions=parser.get_format_instructions()
    )
    | llm
    | parser
)

# Returns structured response with sources, confidence, follow-ups

2.3 Workflow Automation Pattern

Chains multiple AI steps together with conditional logic, data transformations, and external system integrations. This pattern is the backbone of AI-powered business process automation.

# Pattern: Workflow Automation
# Complexity: Medium-High | Latency: High | Cost: Medium-High
# Best implemented with: n8n, Zapier, LangGraph
# pip install langgraph langchain-openai

import os
from typing import TypedDict
from langgraph.graph import StateGraph, END
from langchain_openai import ChatOpenAI

# Requires OPENAI_API_KEY environment variable
# export OPENAI_API_KEY="sk-..."

class WorkflowState(TypedDict):
    email_text: str
    classification: str
    sentiment: str
    response_draft: str
    approved: bool

llm = ChatOpenAI(model="gpt-4", temperature=0)

def classify_email(state: WorkflowState) -> dict:
    """Step 1: Classify the incoming email."""
    response = llm.invoke(
        f"Classify this email as SUPPORT, SALES, or BILLING: "
        f"{state['email_text']}"
    )
    return {"classification": response.content.strip()}

def analyze_sentiment(state: WorkflowState) -> dict:
    """Step 2: Analyze customer sentiment."""
    response = llm.invoke(
        f"Rate sentiment as POSITIVE, NEUTRAL, or NEGATIVE: "
        f"{state['email_text']}"
    )
    return {"sentiment": response.content.strip()}

def draft_response(state: WorkflowState) -> dict:
    """Step 3: Draft an appropriate response."""
    response = llm.invoke(
        f"Draft a {state['sentiment'].lower()} tone response "
        f"for this {state['classification']} email: "
        f"{state['email_text']}"
    )
    return {"response_draft": response.content}

def route_by_sentiment(state: WorkflowState) -> str:
    """Route negative sentiment to human review."""
    if state["sentiment"] == "NEGATIVE":
        return "human_review"
    return "auto_send"

# Build workflow graph
wf = StateGraph(WorkflowState)
wf.add_node("classify", classify_email)
wf.add_node("sentiment", analyze_sentiment)
wf.add_node("draft", draft_response)

wf.set_entry_point("classify")
wf.add_edge("classify", "sentiment")
wf.add_edge("sentiment", "draft")
wf.add_conditional_edges("draft", route_by_sentiment, {
    "human_review": END,  # Human reviews negative responses
    "auto_send": END      # Auto-send positive/neutral
})

workflow = wf.compile()

3. Advanced Patterns

Advanced patterns involve autonomous decision-making, iterative self-improvement, and multi-agent coordination. They are the most powerful but also the most complex and expensive to run.

3.1 Agent Loop Pattern

The agent receives a goal, then enters a think-act-observe loop, iteratively taking actions and refining its approach until the goal is achieved or a termination condition is met.

# Pattern: Agent Loop (ReAct / Think-Act-Observe)
# Complexity: High | Latency: High | Cost: High
# The agent iterates until the goal is achieved
# pip install langgraph langchain-openai

import os
from typing import TypedDict, Annotated
from langgraph.graph import StateGraph, END
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, SystemMessage
import operator

# Requires OPENAI_API_KEY environment variable
# export OPENAI_API_KEY="sk-..."

class AgentLoopState(TypedDict):
    messages: Annotated[list, operator.add]
    goal: str
    plan: str
    iteration: int
    completed: bool

llm = ChatOpenAI(model="gpt-4", temperature=0)

def think_node(state: AgentLoopState) -> dict:
    """Agent thinks about what to do next."""
    messages = [
        SystemMessage(content=f"""You are working on: {state['goal']}
        Current plan: {state.get('plan', 'None')}
        Iteration: {state['iteration']}

        Analyze the current situation and decide:
        1. What has been accomplished so far?
        2. What remains to be done?
        3. What is the next specific action?
        4. Should we FINISH? (say GOAL_COMPLETE if yes)"""),
        *state["messages"][-5:]
    ]
    response = llm.invoke(messages)
    completed = "GOAL_COMPLETE" in response.content
    return {
        "messages": [response],
        "completed": completed,
        "iteration": state["iteration"] + 1
    }

def act_node(state: AgentLoopState) -> dict:
    """Agent takes an action based on its thinking."""
    last_thought = state["messages"][-1].content
    messages = [
        SystemMessage(content="Execute the next action from "
                      "the plan. Provide concrete output."),
        HumanMessage(content=f"Action to take: {last_thought}")
    ]
    response = llm.invoke(messages)
    return {"messages": [response]}

def should_continue(state: AgentLoopState) -> str:
    if state.get("completed", False):
        return "end"
    if state.get("iteration", 0) >= 10:
        return "end"
    return "act"

# Build agent loop graph
loop = StateGraph(AgentLoopState)
loop.add_node("think", think_node)
loop.add_node("act", act_node)

loop.set_entry_point("think")
loop.add_conditional_edges("think", should_continue, {
    "act": "act",
    "end": END
})
loop.add_edge("act", "think")  # Loop back

agent_loop = loop.compile()

3.2 Planning Pattern

The agent first creates an explicit plan (a sequence of steps), then executes the plan step-by-step, re-planning when necessary. This separates strategic thinking from tactical execution.

# Pattern: Planning (Plan-and-Execute)
# Complexity: High | Latency: High | Cost: High
# Separates planning from execution for better results
# pip install langgraph langchain-openai

import os
from typing import TypedDict
from langgraph.graph import StateGraph, END
from langchain_openai import ChatOpenAI
from langchain_core.output_parsers import JsonOutputParser

# Requires OPENAI_API_KEY environment variable
# export OPENAI_API_KEY="sk-..."

class PlanState(TypedDict):
    goal: str
    plan: list  # List of step descriptions
    current_step: int
    step_results: dict
    needs_replan: bool

llm = ChatOpenAI(model="gpt-4", temperature=0)

def planner_node(state: PlanState) -> dict:
    """Create or revise the plan."""
    if state.get("needs_replan", False):
        prompt = f"""The original plan failed at step
        {state['current_step']}. Results so far:
        {state['step_results']}

        Revise the plan for goal: {state['goal']}
        Return a JSON list of remaining steps."""
    else:
        prompt = f"""Create a step-by-step plan to achieve:
        {state['goal']}

        Return a JSON list of 3-7 specific, actionable steps.
        Example: ["Research topic X", "Write outline", ...]"""

    response = llm.invoke(prompt)
    # Parse JSON list from response
    import json
    try:
        plan = json.loads(response.content)
    except json.JSONDecodeError:
        plan = [response.content]

    return {
        "plan": plan,
        "current_step": 0,
        "needs_replan": False
    }

def executor_node(state: PlanState) -> dict:
    """Execute the current step of the plan."""
    step_idx = state["current_step"]
    step = state["plan"][step_idx]

    response = llm.invoke(
        f"Execute this step thoroughly: {step}\n"
        f"Previous results: {state.get('step_results', {})}"
    )

    results = state.get("step_results", {})
    results[f"step_{step_idx}"] = response.content

    return {
        "step_results": results,
        "current_step": step_idx + 1
    }

def check_progress(state: PlanState) -> str:
    if state["current_step"] >= len(state["plan"]):
        return "done"
    return "execute"

# Build plan-and-execute graph
plan_graph = StateGraph(PlanState)
plan_graph.add_node("planner", planner_node)
plan_graph.add_node("executor", executor_node)

plan_graph.set_entry_point("planner")
plan_graph.add_edge("planner", "executor")
plan_graph.add_conditional_edges("executor", check_progress, {
    "execute": "executor",
    "done": END
})

plan_execute = plan_graph.compile()

3.3 Multi-Agent Orchestration Pattern

Multiple specialized agents collaborate on a task, coordinated by a supervisor or through peer-to-peer communication. This pattern was covered extensively in Part 10, but here we place it in the context of the full pattern catalog.

# Pattern: Multi-Agent Orchestration
# Complexity: Very High | Latency: Very High | Cost: High
# See Part 10 for detailed implementations

# Summary of multi-agent sub-patterns:
MULTI_AGENT_PATTERNS = {
    "supervisor": {
        "description": "Central coordinator routes to specialists",
        "agents": "3-8",
        "best_for": "Quality-gated pipelines, dev workflows",
        "framework": "LangGraph (best), CrewAI, AutoGen"
    },
    "swarm": {
        "description": "Autonomous agents with handoff protocols",
        "agents": "2-6",
        "best_for": "Customer service routing, triage",
        "framework": "LangGraph, OpenAI Swarm"
    },
    "debate": {
        "description": "Adversarial agents improve via argumentation",
        "agents": "2-4",
        "best_for": "Analysis, evaluation, high-stakes decisions",
        "framework": "LangGraph, AutoGen"
    },
    "hierarchical": {
        "description": "Tree of managers delegating to workers",
        "agents": "5-20+",
        "best_for": "Large projects, enterprise automation",
        "framework": "CrewAI (hierarchical process), LangGraph"
    }
}

4. Pattern Selection Guide

Choosing the right pattern and framework is the most impactful architectural decision you will make. This section provides a systematic approach to pattern selection.

4.1 Framework-Pattern Matrix

This matrix shows which framework is best suited for each design pattern:

Pattern	LangChain	LangGraph	CrewAI	n8n	Zapier
Prompt-Response	Excellent	Overkill	Overkill	Good (AI node)	Good (AI action)
RAG	Excellent	Good	Limited	Good (with plugins)	Limited
Tool Use	Excellent	Excellent	Good	Excellent (native)	Excellent (native)
Chat + Memory	Excellent	Excellent	Good	Good	Limited
Document QA	Excellent	Good	Limited	Good	Limited
Workflow Automation	Good	Excellent	Good	Excellent	Excellent
Agent Loop	Good	Excellent	Limited	Limited	Not supported
Planning	Limited	Excellent	Good (planning=True)	Not supported	Not supported
Multi-Agent	Limited	Excellent	Excellent	Limited	Not supported

4.2 Decision Tree

Follow this decision tree to select the right pattern for your use case:

# AI Pattern Selection Decision Tree

START: What does your application need to do?
|
|-- Single LLM call sufficient?
|   |-- YES: Do you need external data?
|   |   |-- NO  --> Prompt-Response (LangChain/direct API)
|   |   |-- YES --> RAG Pattern (LangChain + vector DB)
|   |-- NO: Continue...
|
|-- Does it need conversation history?
|   |-- YES: Is it over a document corpus?
|   |   |-- YES --> Document QA (LangChain)
|   |   |-- NO  --> Chat + Memory (LangChain)
|   |-- NO: Continue...
|
|-- Does it need to take actions (APIs, DB, files)?
|   |-- YES: Is it a single action or a chain?
|   |   |-- Single  --> Tool Use (LangChain)
|   |   |-- Chain   --> Workflow Automation (LangGraph/n8n)
|   |-- NO: Continue...
|
|-- Does it need autonomous multi-step reasoning?
|   |-- YES: Does it need explicit planning?
|   |   |-- YES --> Planning Pattern (LangGraph)
|   |   |-- NO  --> Agent Loop (LangGraph)
|   |-- NO: Continue...
|
|-- Does it need multiple specialized agents?
|   |-- YES --> Multi-Agent Orchestration (LangGraph/CrewAI)
|
|-- Is the user non-technical?
|   |-- YES --> n8n (self-hosted) or Zapier (cloud)
|
|-- None of the above?
    --> Start with Prompt-Response and iterate

                        
                        The Golden Rule of Pattern Selection: Always start with the simplest pattern that could possibly work. Upgrade to a more complex pattern only when you hit a specific limitation. A well-crafted prompt-response can often replace a complex agent loop at 1/10th the cost and latency.
                    

Framework Selection Summary

If you need...	Use this framework	Why
RAG, chains, basic agents	LangChain	Best ecosystem for retrieval and chain composition
Complex agents, stateful workflows, cycles	LangGraph	Graph-based architecture handles any topology
Role-based multi-agent teams	CrewAI	Intuitive team metaphor, task dependencies
No-code visual workflow (self-hosted)	n8n	Visual builder, 400+ integrations, AI nodes
No-code workflow (cloud, simplest)	Zapier	Easiest setup, 6000+ app integrations

5. Anti-Patterns

Anti-patterns are common mistakes that seem reasonable but lead to poor outcomes. Recognizing them saves time, money, and frustration.

5.1 Common Mistakes

Anti-Pattern	What It Looks Like	Why It Fails	Better Approach
The God Prompt	One massive prompt that handles all logic, edge cases, and formatting	Exceeds context limits, becomes fragile, impossible to debug	Break into chains: classify -> route -> handle -> format
Premature Agent-ification	Using a ReAct agent loop for what a simple chain could do	10x more tokens, unpredictable behavior, much higher latency	Start with prompt-response, add agent only when needed
RAG Everything	Putting all data into a vector DB even when not needed	Retrieval noise degrades answers, embedding costs accumulate	Use RAG only for large, dynamic knowledge bases. Small static data goes in the prompt.
Infinite Agent Loop	No max iterations or termination condition on agent loops	Runaway costs, hangs, and no useful output	Always set max_iterations. Add explicit DONE/FINISH signals.
Memory Bloat	Storing entire conversation history in context forever	Context window overflow, irrelevant old context pollutes responses	Use summary memory, sliding window, or vector memory for long conversations
Framework Overload	Using LangChain + LangGraph + CrewAI + LlamaIndex in one project	Dependency conflicts, debugging nightmare, team confusion	Pick one primary framework. Add others only for specific capabilities.
Tool Explosion	Giving an agent access to 50+ tools	LLM cannot reliably choose from too many options, tool descriptions bloat context	Limit to 5-10 tools per agent. Use a tool router if you need more.

5.2 Pattern Smells

These signs indicate you may be using the wrong pattern:

# Pattern smell detection checklist

PATTERN_SMELLS = {
    "High latency for simple tasks": {
        "likely_cause": "Using Agent Loop for Prompt-Response tasks",
        "fix": "Downgrade to a simpler pattern",
        "metric": "If avg response > 10s for classification tasks"
    },
    "Inconsistent outputs": {
        "likely_cause": "Missing structured output parsing",
        "fix": "Add Pydantic output parser or function calling",
        "metric": "If JSON parse failures > 5% of requests"
    },
    "Context window errors": {
        "likely_cause": "Memory Bloat or God Prompt anti-pattern",
        "fix": "Implement summary memory or break prompt into chains",
        "metric": "If token count regularly exceeds 80% of limit"
    },
    "Agent loops without progress": {
        "likely_cause": "Infinite Agent Loop anti-pattern",
        "fix": "Add progress tracking and termination conditions",
        "metric": "If agent takes > 5 iterations for simple tasks"
    },
    "High cost per query": {
        "likely_cause": "Premature Agent-ification",
        "fix": "Audit each agent step - can it be a simple chain?",
        "metric": "If cost/query > $0.10 for routine operations"
    },
    "Retrieval returns irrelevant docs": {
        "likely_cause": "Poor chunking, wrong embedding model, or "
                        "RAG Everything anti-pattern",
        "fix": "Tune chunk size, use re-ranking, audit what needs RAG",
        "metric": "If retrieval relevance score < 0.7 average"
    }
}

                        
                        The Most Expensive Anti-Pattern: Building a multi-agent system when a well-crafted prompt would suffice. A 5-agent system costs 5-20x more per query than a single prompt-response. Always justify the added complexity with measurable quality improvements.
                    

6. Frontend Patterns

Building AI-powered UIs requires specialized frontend patterns for handling streaming data, rendering tool calls, and managing complex chat state. The LangChain ecosystem provides framework-specific hooks that abstract the complexity of real-time LLM communication into declarative, reactive components.

6.1 The useStream Hook

The useStream hook is the core building block for connecting a frontend to a LangGraph backend. Available for all major frameworks, it manages the WebSocket/SSE connection, handles streaming state updates, and exposes a reactive API for rendering messages as they arrive:

Framework	Package	Hook/Function
React	`@langchain/react`	`useStream()`
Vue	`@langchain/vue`	`useStream()`
Svelte	`@langchain/svelte`	`useStream()`
Angular	`@langchain/angular`	`useStream()`

import { useStream } from "@langchain/react";

function ChatApp() {
    const {
        messages,     // Reactive message list
        submit,       // Send a message
        isStreaming,  // Whether a response is in progress
        stop,         // Cancel the current stream
        error,        // Error state
    } = useStream({
        apiUrl: "http://localhost:8000",
        assistantId: "agent",
        threadId: "thread-001",
    });

    const handleSend = (text) => {
        submit({ messages: [{ role: "user", content: text }] });
    };

    return (
        
            {messages.map((msg, i) => (
                
                    {msg.content}
                
            ))}
            {isStreaming && Thinking...}
             {
                if (e.key === "Enter") handleSend(e.target.value);
            }} />
        
    );
}

                        
                        Key Insight: The useStream hook manages the full lifecycle — connection, reconnection, state synchronization, and cleanup. It handles optimistic updates (showing user messages immediately) and server reconciliation (replacing optimistic state with confirmed server state) automatically.
                    

For branching conversations (allowing users to edit previous messages and explore alternative paths), useStream supports a branch option that creates a new conversation fork from any checkpoint:

import { useStream } from "@langchain/react";

function BranchingChat() {
    const { messages, submit, branch } = useStream({
        apiUrl: "http://localhost:8000",
        assistantId: "agent",
    });

    const handleEdit = (messageIndex, newContent) => {
        // Branch from this message — creates a new thread fork
        branch({
            messages: [{ role: "user", content: newContent }],
            checkpoint: messages[messageIndex].checkpoint_id,
        });
    };

    return (
        
            {messages.map((msg, i) => (
                
                    {msg.content}
                    {msg.role === "user" && (
                        
                    )}
                
            ))}
        
    );
}

6.2 Tool Calling UI

When agents use tools, the frontend needs to render tool calls and their results as distinct UI elements — not just plain text. The useStream hook exposes tool call state through the messages array, where each tool call includes a name, args, and result with a state indicator (pending, completed, or error):

import { useStream } from "@langchain/react";

function ToolCallCard({ toolCall }) {
    const { name, args, result, state } = toolCall;

    return (
        
            
                
                    {state === "pending" ? "⏳" : state === "completed" ? "✅" : "❌"}
                
                {name}
            
            {JSON.stringify(args, null, 2)}
            {result && (
                
                    Result: {result}
                
            )}
        
    );
}

function AgentChat() {
    const { messages, submit } = useStream({
        apiUrl: "http://localhost:8000",
        assistantId: "agent",
    });

    return (
        
            {messages.map((msg, i) => (
                
                    {msg.content && {msg.content}}
                    {msg.tool_calls?.map((tc) => (
                        
                    ))}
                
            ))}
        
    );
}

For human-in-the-loop patterns, the frontend can render approval UI when the agent requests permission. The useStream hook pauses on interrupt events, and you resume execution by calling submit with a Command:

import { useStream } from "@langchain/react";

function ApprovalChat() {
    const { messages, submit, interrupt } = useStream({
        apiUrl: "http://localhost:8000",
        assistantId: "agent",
    });

    // When interrupt is set, agent is waiting for approval
    if (interrupt) {
        return (
            
                Agent needs approval
                {interrupt.value}
                
                
            
        );
    }

    return (
        
            {messages.map((msg, i) => (
                {msg.content}
            ))}
        
    );
}

                        
                        Security Note: Never execute tool call results as code on the frontend. Tool results from the agent should be treated as untrusted data — always sanitize and escape before rendering in the DOM. Use textContent or a sanitization library, never innerHTML with raw agent output.
                    

6.3 Integrations & Libraries

The LangChain frontend ecosystem integrates with popular UI component libraries for production-ready chat interfaces:

Library	Type	Best For
assistant-ui	React component library	Full-featured chat UI with tool rendering, branching, markdown
AI Elements / shadcn/ui	Headless components	Custom-styled chat with Tailwind CSS, full design control
CopilotKit	React framework	Copilot-style sidebars and embedded AI assistants
OpenUI	Web components	Framework-agnostic AI chat widgets

Integration

Generative UI Pattern

The Generative UI pattern lets the agent return structured data that the frontend renders as rich components — charts, forms, tables, maps — instead of plain text. This requires:

Structured output from the agent (JSON schema defining UI component type and data)
Component registry on the frontend mapping component types to React/Vue/Svelte components
Streaming support to progressively render complex UIs as data arrives

This pattern powers applications like data dashboards, interactive reports, and visual workflow builders where plain text responses are insufficient.

7. Exercises & Self-Assessment

Exercise 1

Pattern Identification

Identify which design pattern(s) each of these real-world applications uses:

ChatGPT — What patterns does the free version use? What about ChatGPT Plus with plugins?
GitHub Copilot — Is it prompt-response, RAG, or something more complex?
Notion AI — What patterns power its "Ask AI" feature vs its "Write with AI" feature?
Perplexity AI — How does it combine search with generation?
Cursor IDE — What pattern enables its multi-file code editing capability?

Exercise 2

Implement Three Patterns

Build these three applications, each using a different pattern:

Prompt-Response: An email subject line generator that produces 5 variants from an email body
RAG: A FAQ bot that answers questions from a set of at least 20 FAQ documents
Agent Loop: A research agent that searches the web, evaluates sources, and writes a summary

Compare: lines of code, tokens per query, latency, and output quality.

Exercise 3

Anti-Pattern Audit

Review this hypothetical AI application and identify all anti-patterns:

A customer support bot that uses a ReAct agent loop for every query (including "What are your hours?")
It stores the entire conversation history (no limit) in the prompt
It has 35 tools available including a calculator, web search, database query, email sender, and 31 others
All company data (50,000 docs) is in one vector store with 2048-token chunks
It uses LangChain + LlamaIndex + CrewAI + a custom framework simultaneously

For each anti-pattern, explain the problem and recommend a fix.

Exercise 4

Framework Selection Challenge

For each scenario, recommend a framework AND a design pattern. Justify both choices:

A legal firm wants to search across 100,000 contracts and answer natural language questions with source citations
A marketing team (non-technical) wants AI to auto-generate social posts from blog articles
A DevOps team wants an AI that autonomously investigates production alerts, checks logs, and suggests fixes
A startup wants to add an AI chatbot to their SaaS product that remembers previous conversations
A research institute wants multiple AI perspectives to debate policy recommendations

Exercise 5

Reflective Questions

Why is the "start simple, upgrade when needed" principle so important for AI applications specifically? How does it differ from traditional software development?
How do you quantify when a prompt-response pattern is "not good enough" and you need to upgrade to RAG or an agent?
What would a "design patterns" library for AI applications look like? How is it different from traditional GoF patterns?
Can you combine anti-patterns to create something worse than each individually? Give an example.
If you were building an AI application framework from scratch, which pattern would you make the easiest to implement and why?

AI Pattern Document Generator

Document an AI application design pattern for your project. Download as Word, Excel, PDF, or PowerPoint.

Draft auto-saved

All data stays in your browser. Nothing is sent to or stored on any server.

Pattern Name *

Pattern Type *

Use Cases *

Components & Architecture

Trade-offs & Considerations

Additional Notes

Author Name

Conclusion & Next Steps

You now have a complete catalog of AI application design patterns and a systematic framework for choosing the right one. Here are the key takeaways from Part 11:

Core patterns — Prompt-Response, RAG, and Tool Use are the fundamental building blocks that every AI developer must master
Intermediate patterns — Chat + Memory, Document QA, and Workflow Automation combine core patterns with state management and persistence for production-grade applications
Advanced patterns — Agent Loops, Planning, and Multi-Agent Orchestration enable autonomous, multi-step reasoning but come with significantly higher cost and complexity
Framework-pattern matrix — LangChain excels at RAG and chains, LangGraph at complex agents, CrewAI at multi-agent teams, n8n/Zapier at no-code workflow automation
Start simple — The golden rule is to always start with the simplest pattern that could work, then upgrade only when you hit a specific limitation
Anti-patterns — The God Prompt, Premature Agent-ification, RAG Everything, Infinite Agent Loop, Memory Bloat, Framework Overload, and Tool Explosion are the most common and costly mistakes

Next in the Series

In Part 12: Ecosystem & Frameworks, we provide a comprehensive deep-dive comparison of every major framework and tool in the AI application ecosystem — LangChain, LangGraph, AutoGen, CrewAI, LlamaIndex, n8n, Zapier, HuggingFace, MCP, vLLM, vector databases, and more.

Cookie Consent

Cookie Preferences