
AI Application Development Mastery Part 13: MCP Foundations & Architecture

April 1, 2026 · Wasil Zafar · 48 min read

The Model Context Protocol (MCP) is the open standard that finally solves AI integration fragmentation. Master the Host/Client/Server architecture, core primitives (Resources, Tools, Prompts, Sampling), protocol lifecycle, authentication patterns, and build production-ready MCP servers and clients from scratch.

Table of Contents

  1. What MCP Solves & Why It Matters
  2. Architecture Deep Dive
  3. Core MCP Primitives
  4. The FastMCP Python SDK
  5. Protocol Flow & Lifecycle
  6. Authentication & Security
  7. Exercises & Self-Assessment
  8. MCP Framework Comparison Generator
  9. Conclusion & Next Steps

Introduction: The Universal Protocol for AI Integration

Series Overview: This is Part 13 of our 20-part AI Application Development Mastery series. We now enter the MCP chapters — the open protocol that standardizes how LLMs connect to tools, data, and the outside world. This part covers foundations and architecture; Part 14 takes MCP into production.

AI Application Development Mastery

Your 20-step learning path • Currently on Step 13
  1. Foundations & Evolution of AI Apps: Pre-LLM era, transformers, LLM revolution
  2. LLM Fundamentals for Developers: Tokens, context windows, sampling, API patterns
  3. Prompt Engineering Mastery: Zero/few-shot, CoT, ReAct, structured outputs
  4. LangChain Core Concepts: Chains, prompts, LLMs, tools, LCEL
  5. Retrieval-Augmented Generation (RAG): Embeddings, vector DBs, retrievers, RAG pipelines
  6. Memory & Context Engineering: Buffer/summary/vector memory, chunking, re-ranking
  7. Agents — Core of Modern AI Apps: ReAct, tool-calling, planner-executor agents
  8. LangGraph — Stateful Agent Workflows: Nodes, edges, state, graph execution, cycles
  9. Deep Agents & Autonomous Systems: Multi-step reasoning, self-reflection, planning
  10. Multi-Agent Systems: Supervisor, swarm, debate, role-based collaboration
  11. AI Application Design Patterns: RAG, chat+memory, workflow automation, agent loops
  12. Ecosystem & Frameworks: LlamaIndex, Haystack, HuggingFace, vLLM
  13. MCP Foundations & Architecture: Protocol design, Host/Client/Server, primitives, security (You Are Here)
  14. MCP in Production: Building servers, integrations, scaling, agent systems
  15. Evaluation & LLMOps: Prompt eval, tracing, LangSmith, experiment tracking
  16. Production AI Systems: APIs, queues, caching, streaming, scaling
  17. Safety, Guardrails & Reliability: Input filtering, hallucination mitigation, prompt injection
  18. Advanced Topics: Fine-tuning, tool learning, hybrid LLM+symbolic
  19. Building Real AI Applications: Chatbot, document QA, coding assistant, full-stack
  20. Future of AI Applications: Autonomous agents, self-improving, multi-modal, AI OS
Every generation of computing has required a universal integration protocol to unlock its full potential. The web needed HTTP to connect browsers to servers. Mobile needed REST APIs to connect apps to backends. Peripherals needed USB to connect devices to computers. Now, the age of AI agents needs a protocol to connect LLMs to the rest of the world.

That protocol is the Model Context Protocol (MCP) — an open standard originally developed by Anthropic and now adopted across the industry. MCP defines a structured, vendor-neutral way for AI applications to discover tools, access data, execute actions, and interact with external systems through a clean client-server architecture.

Before MCP, every AI integration was a bespoke engineering effort. Connecting Claude to a database required different code than connecting GPT-4 to the same database. Every framework had its own tool definition format, its own transport mechanism, its own error handling. The result was a fragmented ecosystem where integration work consumed more engineering time than building actual AI capabilities.

Key Insight: MCP is not a replacement for LangChain, LangGraph, or any orchestration framework. It operates at a different layer entirely — it standardizes the interface between AI applications and external capabilities. Think of MCP as the protocol layer that orchestration frameworks build on top of, just as HTTP is the protocol that web frameworks build on top of.

1. What MCP Solves & Why It Matters

Every AI application that interacts with external data or tools faces the same integration challenge: building and maintaining custom connectors for each combination of AI model and external service. The Model Context Protocol (MCP) solves this with a universal, open standard — analogous to how USB standardized hardware peripherals. Instead of N×M custom integrations, MCP provides a single protocol that any AI client can use to connect to any MCP-compatible server, dramatically reducing integration complexity.

1.1 The Integration Problem

Consider the landscape before MCP. You want your AI agent to access a database, read files, call a REST API, and search the web. Here is what that looked like:

# BEFORE MCP: Every integration is custom, vendor-specific, and fragile
# Each tool requires its own implementation for each LLM provider

# pip install openai anthropic langchain langchain-community
import os
import json
import sqlite3
import requests

# --- OpenAI function calling (vendor-specific format) ---
openai_tools = [
    {
        "type": "function",
        "function": {
            "name": "query_database",
            "description": "Run a SQL query against the customer database",
            "parameters": {
                "type": "object",
                "properties": {
                    "sql": {"type": "string", "description": "SQL query to execute"},
                    "database": {"type": "string", "description": "Database name"}
                },
                "required": ["sql"]
            }
        }
    }
]

# --- Anthropic tool calling (different format for the same tool) ---
anthropic_tools = [
    {
        "name": "query_database",
        "description": "Run a SQL query against the customer database",
        "input_schema": {
            "type": "object",
            "properties": {
                "sql": {"type": "string", "description": "SQL query to execute"},
                "database": {"type": "string", "description": "Database name"}
            },
            "required": ["sql"]
        }
    }
]

# --- LangChain tool definition (yet another format) ---
from langchain.tools import tool

@tool
def query_database(sql: str, database: str = "customers") -> str:
    """Run a SQL query against the customer database."""
    conn = sqlite3.connect(database)
    cursor = conn.cursor()
    cursor.execute(sql)
    results = cursor.fetchall()
    conn.close()
    return json.dumps(results)

# The SAME tool defined THREE different ways for THREE different systems
# Multiply this by 50 tools and 5 LLM providers = 250 definitions to maintain
# Change the tool schema? Update it in all 250 places.

This fragmentation creates cascading problems: vendor lock-in (your tools only work with one provider), maintenance overhead (N tools × M providers = N×M implementations), no interoperability (tools built for Claude cannot be used with GPT-4), and zero standardization (every integration is a snowflake).
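The N×M arithmetic is worth making concrete. A quick sketch (the tool and provider counts below are illustrative, matching the example in the code comments above):

```python
# Without a shared protocol, every tool needs a bespoke adapter per provider.
tools = 50        # illustrative number of tools
providers = 5     # illustrative number of LLM providers/frameworks

without_mcp = tools * providers   # one definition per (tool, provider) pair
with_mcp = tools + providers      # one MCP server per tool, one client per host

print(without_mcp)  # 250 definitions to write and keep in sync
print(with_mcp)     # 55 components, each written once
```

Changing a tool's schema under the first model means touching every provider-specific copy; under MCP it means updating one server.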

1.2 HTTP for AI Agents — The USB-C Analogy

The best way to understand MCP's role is through analogies to protocols that solved similar fragmentation problems in other domains:

| Domain | Before Standardization | After Standardization | Protocol |
|---|---|---|---|
| Web | Custom protocols per service (Gopher, FTP, WAIS) | Universal browser-to-server communication | HTTP/HTTPS |
| Peripherals | Serial, parallel, PS/2, FireWire, proprietary ports | One connector for everything | USB / USB-C |
| Databases | Vendor-specific query languages per database | Universal query language across all databases | SQL / ODBC |
| APIs | SOAP, XML-RPC, custom binary protocols | Uniform resource-based API design | REST / OpenAPI |
| AI Agents | Custom tool formats per provider (OpenAI, Anthropic, LangChain) | Universal tool/resource/prompt protocol | MCP |

The USB-C Moment: Before USB-C, you needed different cables for your phone, laptop, headphones, and monitor. MCP is the USB-C moment for AI integrations — one protocol that lets any AI host connect to any capability server. Build an MCP server once, and it works with Claude Desktop, Cursor, Windsurf, VS Code, and any future MCP-compatible host.

1.3 MCP vs Alternatives — Comprehensive Comparison

MCP did not emerge in a vacuum. Several approaches to LLM-tool integration existed before it. Here is how they compare across the dimensions that matter most for production systems:

| Criterion | OpenAI Function Calling | ChatGPT Plugins | LangChain Tools | AutoGen Tools | MCP |
|---|---|---|---|---|---|
| Vendor Lock-in | High — OpenAI only | Total — ChatGPT only | Medium — LangChain ecosystem | Medium — AutoGen ecosystem | None — vendor-neutral open standard |
| Modularity | Low — tools embedded in API call | Low — monolithic plugin manifest | Medium — Python decorators | Medium — function registration | High — decoupled server per capability |
| Interoperability | None across providers | None — deprecated by OpenAI | Within LangChain only | Within AutoGen only | Universal — any host to any server |
| Standardization | De facto standard for OpenAI | Abandoned (2024) | Community conventions | Microsoft conventions | Open specification with formal schema |
| Transport Options | HTTPS only | HTTPS only | In-process Python | In-process Python | STDIO, HTTP/SSE, WebSocket, gRPC |
| Capability Discovery | None — tools hardcoded | Static manifest file | Runtime introspection | Runtime introspection | Dynamic discovery protocol |
| Primitives | Tools only | Tools + auth | Tools + retrievers | Tools + code exec | Resources + Tools + Prompts + Sampling |
| Security Model | API key per call | OAuth (limited) | Application-level | Application-level | OAuth 2.0, JWT, mTLS, RBAC, sandboxing |

Why ChatGPT Plugins Failed: OpenAI launched ChatGPT Plugins in March 2023 with great fanfare, then quietly deprecated them by early 2024. The fundamental flaw was centralization — plugins had to be approved by OpenAI, hosted on specific infrastructure, and only worked within ChatGPT. MCP learned from this failure by being fully open, decentralized, and host-agnostic.

1.4 Core Design Principles

MCP was designed around five principles that distinguish it from every prior approach to AI-tool integration:

Design Principles

The Five Pillars of MCP Design

  1. Separation of Concerns: Hosts manage LLM interaction and UI. Clients manage protocol connections. Servers expose capabilities. Each component has a single responsibility and can be developed, deployed, and scaled independently.
  2. Composability: An agent can connect to multiple MCP servers simultaneously — a database server, a web search server, a file system server — and the host orchestrates them seamlessly. Capabilities compose like UNIX pipes.
  3. Least Privilege: Each MCP server declares exactly what capabilities it exposes, and clients can restrict which capabilities they request. A file-reading server never needs database write access. Permissions are granular and explicit.
  4. Deterministic Tool Interfaces: Every tool has a JSON Schema definition that specifies its inputs and outputs precisely. There is no ambiguity about what a tool expects or returns. The LLM sees the schema and can generate valid invocations reliably.
  5. Transport Agnosticism: MCP works over STDIO (for local processes), HTTP with Server-Sent Events (for web services), WebSocket (for bidirectional streaming), and gRPC (for high-performance). The protocol is the same regardless of transport.
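Principle 4 can be seen in miniature: given a tool's JSON Schema, a host can check an LLM-generated invocation before ever executing it. A minimal sketch (a hand-rolled checker for illustration; a real host would use a full JSON Schema validator):

```python
# Minimal structural check of a tool call against its declared inputSchema.
# Hand-rolled for illustration; production hosts use a complete JSON Schema library.
schema = {
    "type": "object",
    "properties": {
        "sql": {"type": "string"},
        "database": {"type": "string"},
    },
    "required": ["sql"],
}

TYPE_MAP = {"string": str, "integer": int, "object": dict, "boolean": bool}

def validate_call(arguments: dict, schema: dict) -> list:
    """Return a list of validation errors; an empty list means the call is valid."""
    errors = []
    for field in schema.get("required", []):
        if field not in arguments:
            errors.append(f"missing required field: {field}")
    for field, value in arguments.items():
        prop = schema["properties"].get(field)
        if prop is None:
            errors.append(f"unexpected field: {field}")
        elif not isinstance(value, TYPE_MAP[prop["type"]]):
            errors.append(f"wrong type for {field}: expected {prop['type']}")
    return errors

print(validate_call({"sql": "SELECT 1"}, schema))        # -> []
print(validate_call({"database": "customers"}, schema))  # -> ['missing required field: sql']
```

Because the schema is deterministic, an invalid invocation is rejected before it reaches the server, and the error can be fed back to the LLM for correction.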
# AFTER MCP: Define a tool ONCE, use it everywhere
# The same MCP server works with Claude, GPT-4, Gemini, Llama, any host

# pip install mcp
from mcp.server import Server
from mcp.types import Tool, TextContent
import json
import sqlite3

# Create an MCP server — one tool definition, universal compatibility
server = Server("database-server")

@server.list_tools()
async def list_tools():
    """Declare available tools via the MCP discovery protocol."""
    return [
        Tool(
            name="query_database",
            description="Run a read-only SQL query against the customer database",
            inputSchema={
                "type": "object",
                "properties": {
                    "sql": {
                        "type": "string",
                        "description": "SQL SELECT query to execute"
                    },
                    "database": {
                        "type": "string",
                        "description": "Database name",
                        "default": "customers.db"
                    }
                },
                "required": ["sql"]
            }
        )
    ]

@server.call_tool()
async def call_tool(name: str, arguments: dict):
    """Execute a tool invocation from any MCP-compatible host."""
    if name == "query_database":
        sql = arguments["sql"]
        database = arguments.get("database", "customers.db")

        # Security: only allow SELECT queries
        if not sql.strip().upper().startswith("SELECT"):
            return [TextContent(
                type="text",
                text="Error: Only SELECT queries are allowed for safety."
            )]

        conn = sqlite3.connect(database)
        cursor = conn.cursor()
        cursor.execute(sql)
        results = cursor.fetchall()
        columns = [desc[0] for desc in cursor.description]
        conn.close()

        # Return structured results
        formatted = [dict(zip(columns, row)) for row in results]
        return [TextContent(type="text", text=json.dumps(formatted, indent=2))]

    raise ValueError(f"Unknown tool: {name}")

# This SINGLE server definition works with:
# - Claude Desktop (via STDIO transport)
# - Cursor IDE (via STDIO transport)
# - Windsurf (via STDIO transport)
# - Any custom MCP host (via HTTP/SSE or WebSocket)
# - Future AI hosts that adopt MCP
# Define once. Run everywhere.
History

How MCP Evolved from Anthropic's Tool-Use Experience

MCP grew out of Anthropic's internal experience building Claude's tool-use capabilities. When Anthropic launched Claude's tool calling in 2024, they encountered the same integration fragmentation that plagued the entire industry. Every enterprise customer had to write custom glue code to connect Claude to their systems.

Anthropic recognized that this was not a Claude-specific problem — it was an industry problem. In late 2024, they open-sourced MCP as a vendor-neutral specification, explicitly designing it so that competitors could adopt it. By early 2025, Cursor, Windsurf, VS Code, Replit, and dozens of other tools had adopted the protocol, validating the design. By 2026, MCP has become the de facto standard for AI-tool integration, with over 3,000 community-built MCP servers covering databases, APIs, developer tools, business applications, and more.


2. Architecture Deep Dive

MCP follows a client-server architecture built on JSON-RPC 2.0, with clearly defined roles: hosts (AI applications like Claude Desktop or VS Code), clients (protocol connectors that maintain 1:1 server connections), and servers (lightweight services exposing tools, resources, and prompts). This layered design enables secure, composable integrations where each server is sandboxed and the host controls which capabilities are exposed to the AI model.

2.1 System Overview

MCP follows a layered architecture with clear separation between four components. Every MCP interaction flows through this chain:

# MCP Architecture Overview
#
#  +------------------+     +------------------+     +------------------+     +------------------+
#  |                  |     |                  |     |                  |     |                  |
#  |    MCP HOST      |<--->|    MCP CLIENT    |<--->|    MCP SERVER    |<--->|   DATA / APIs    |
#  |                  |     |                  |     |                  |     |                  |
#  |  Claude Desktop  |     |  Protocol Layer  |     |  Capability      |     |  Databases       |
#  |  Cursor IDE      |     |  Session Mgmt    |     |  Provider        |     |  REST APIs       |
#  |  Windsurf        |     |  Retry Logic     |     |  Tools           |     |  File Systems    |
#  |  Custom App      |     |  Transport       |     |  Resources       |     |  SaaS Services   |
#  |                  |     |                  |     |  Prompts         |     |  Vector DBs      |
#  +------------------+     +------------------+     +------------------+     +------------------+
#
#  A single HOST manages multiple CLIENTs.
#  Each CLIENT connects to exactly ONE SERVER.
#  Each SERVER exposes capabilities from one or more DATA sources.
#
#  Example: Claude Desktop (host) manages 5 clients, each connected to
#  a different server: filesystem, database, GitHub, Slack, web-search.
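On the wire, every hop in this chain speaks JSON-RPC 2.0. As a sketch, here is roughly what one tool invocation looks like as a request/response pair (the method name follows the MCP spec; the payload values are illustrative), built as Python dicts:

```python
import json

# JSON-RPC 2.0 request a client sends to invoke a tool (illustrative payload)
request = {
    "jsonrpc": "2.0",
    "id": 42,
    "method": "tools/call",
    "params": {
        "name": "query_database",
        "arguments": {"sql": "SELECT name FROM customers LIMIT 5"},
    },
}

# Matching server response: the same id, with result carrying content blocks
response = {
    "jsonrpc": "2.0",
    "id": 42,
    "result": {
        "content": [{"type": "text", "text": "[{\"name\": \"Ada\"}]"}],
    },
}

print(json.dumps(request, indent=2))
```

The `id` field correlates responses to requests, which is what lets a client multiplex many in-flight calls over a single transport.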

2.2 MCP Hosts

An MCP Host is the application that the user interacts with directly. It manages the LLM, renders the UI, and orchestrates one or more MCP clients. The host is responsible for the overall user experience and for deciding how to route capabilities from multiple servers.

| Host | Application Type | MCP Integration | Notable Features |
|---|---|---|---|
| Claude Desktop | Desktop App | Native MCP support via STDIO | First MCP host; JSON config for server registration; auto-starts servers |
| Cursor | IDE | MCP servers for code tools | Integrates MCP tools into code completion and chat; project-level config |
| Windsurf | IDE | MCP for IDE extensions | Cascade agent uses MCP tools; supports multi-server configurations |
| VS Code + Copilot | IDE Extension | MCP via extension API | GitHub Copilot Chat integrates MCP servers for workspace context |
| Custom Applications | Any app | MCP SDK integration | Build your own host using the mcp Python or TypeScript SDK |
// Claude Desktop MCP configuration (claude_desktop_config.json in the app's config directory)
// This tells the host which MCP servers to launch and how to connect
{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/Users/dev/projects"],
      "env": {}
    },
    "database": {
      "command": "python",
      "args": ["-m", "mcp_server_sqlite", "--db-path", "/Users/dev/data/app.db"],
      "env": {}
    },
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": {
        "GITHUB_TOKEN": "ghp_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
      }
    },
    "web-search": {
      "command": "python",
      "args": ["-m", "mcp_server_brave_search"],
      "env": {
        "BRAVE_API_KEY": "BSAxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
      }
    }
  }
}
Host Responsibilities: The host manages the LLM conversation loop, presents discovered tools to the LLM, routes tool calls to the appropriate client/server, handles user consent for sensitive operations, and aggregates results back into the conversation context. The host is the orchestrator — it never executes tools directly.
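The routing described above can be sketched as a minimal host-side dispatcher. Everything here (the client registry, the tool table, the stand-in client class) is illustrative, not the real SDK API:

```python
# Minimal sketch of host-side tool routing: the host owns a tool -> client map
# and forwards each model-issued tool call to the client that registered it.
class FakeClient:
    """Stand-in for an MCP client; a real one would proxy to a server process."""
    def __init__(self, name, tools):
        self.name = name
        self.tools = tools  # {tool_name: handler}

    def call_tool(self, tool, arguments):
        return self.tools[tool](**arguments)

class Host:
    def __init__(self):
        self.routes = {}  # tool name -> owning client

    def register(self, client):
        # Aggregate discovery results: every tool maps back to its client
        for tool in client.tools:
            self.routes[tool] = client

    def dispatch(self, tool, arguments):
        client = self.routes.get(tool)
        if client is None:
            raise ValueError(f"No server exposes tool: {tool}")
        return client.call_tool(tool, arguments)

host = Host()
host.register(FakeClient("fs", {"read_file": lambda path: f"<contents of {path}>"}))
host.register(FakeClient("db", {"query": lambda sql: "[]"}))

print(host.dispatch("read_file", {"path": "notes.txt"}))  # routed to the fs client
```

The key property: the LLM sees one flat tool list, while the host keeps each capability isolated behind its own client connection.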

2.3 MCP Clients

An MCP Client is the protocol layer that sits between the host and a server. Each client manages a single connection to a single server. The host creates one client per server it needs to communicate with.

# Building an MCP client that connects to a server
# This demonstrates the client lifecycle: connect, discover, invoke, disconnect

# pip install mcp
import asyncio
import os
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def run_mcp_client():
    """Demonstrate the full MCP client lifecycle."""

    # Step 1: Define server connection parameters
    # STDIO transport — the server runs as a child process
    server_params = StdioServerParameters(
        command="python",                          # Command to launch the server
        args=["-m", "mcp_server_sqlite",           # Server module
              "--db-path", "customers.db"],         # Server-specific arguments
        env={                                      # Environment variables
            "PATH": os.getenv("PATH", ""),
            "LOG_LEVEL": "INFO"
        }
    )

    # Step 2: Connect to the server via STDIO transport
    async with stdio_client(server_params) as (read_stream, write_stream):
        async with ClientSession(read_stream, write_stream) as session:

            # Step 3: Initialize the session (protocol handshake)
            await session.initialize()
            print("Session initialized successfully")

            # Step 4: Discover available capabilities
            # List all tools the server exposes
            tools_response = await session.list_tools()
            print(f"\nAvailable tools ({len(tools_response.tools)}):")
            for tool in tools_response.tools:
                print(f"  - {tool.name}: {tool.description}")
                print(f"    Schema: {tool.inputSchema}")

            # List all resources the server exposes
            resources_response = await session.list_resources()
            print(f"\nAvailable resources ({len(resources_response.resources)}):")
            for resource in resources_response.resources:
                print(f"  - {resource.uri}: {resource.name}")

            # List all prompt templates
            prompts_response = await session.list_prompts()
            print(f"\nAvailable prompts ({len(prompts_response.prompts)}):")
            for prompt in prompts_response.prompts:
                print(f"  - {prompt.name}: {prompt.description}")

            # Step 5: Invoke a tool
            result = await session.call_tool(
                "query_database",
                arguments={"sql": "SELECT name, email FROM customers LIMIT 5"}
            )
            print(f"\nTool result: {result.content[0].text}")

            # Step 6: Read a resource
            resource_content = await session.read_resource("sqlite:///customers/schema")
            print(f"\nResource content: {resource_content.contents[0].text}")

            # Step 7: Get a prompt template
            prompt_result = await session.get_prompt(
                "analyze-table",
                arguments={"table_name": "customers"}
            )
            print(f"\nPrompt template: {prompt_result.messages[0].content.text}")

    print("\nSession closed. Client disconnected.")

# Run the client
# asyncio.run(run_mcp_client())

Key client responsibilities include:

  • Connection management: Establishing, maintaining, and gracefully closing connections to servers
  • Session handling: Managing the protocol handshake (Initialize), capability negotiation, and session state
  • Streaming: Handling streamed responses for long-running operations via Server-Sent Events or WebSocket
  • Retry and fault tolerance: Implementing exponential backoff, connection pooling, and circuit-breaker patterns for unreliable servers
  • Message serialization: Converting between the host's internal format and MCP's JSON-RPC message format
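The retry bullet above can be sketched as a generic async wrapper. The backoff constants and the `flaky_call` stand-in are illustrative, not part of the MCP SDK:

```python
import asyncio
import random

async def with_retries(coro_factory, max_attempts=4, base_delay=0.1):
    """Retry an async operation with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await coro_factory()
        except ConnectionError:
            if attempt == max_attempts:
                raise  # give up; surface the failure to the host
            delay = base_delay * (2 ** (attempt - 1)) * (1 + random.random() * 0.1)
            await asyncio.sleep(delay)

# Stand-in for a session.call_tool(...) against an unreliable server:
# fails twice, then succeeds.
attempts = 0
async def flaky_call():
    global attempts
    attempts += 1
    if attempts < 3:
        raise ConnectionError("server dropped the connection")
    return "tool result"

print(asyncio.run(with_retries(flaky_call)))  # succeeds on the third attempt
```

A production client would layer a circuit breaker on top, so a persistently failing server is taken out of rotation instead of retried forever.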

2.4 MCP Servers

An MCP Server is the component that exposes actual capabilities — tools, resources, and prompts — to clients. Each server is a focused, single-purpose service that wraps one domain (a database, an API, a file system) in the MCP protocol.

# Complete MCP server implementation with all four primitive types
# This server provides database access with full MCP capability exposure

# pip install mcp aiosqlite
import json
import os
import asyncio
import aiosqlite
from mcp.server import Server
from mcp.types import (
    Tool, Resource, Prompt, PromptMessage,
    TextContent, PromptArgument, ResourceTemplate
)

# Database path from environment variable
DB_PATH = os.getenv("MCP_DB_PATH", "app.db")

# Create the MCP server instance
server = Server("enterprise-database-server")

# --- TOOLS: Actions the LLM can execute ---
@server.list_tools()
async def list_tools():
    """Expose database query and write tools."""
    return [
        Tool(
            name="query",
            description="Execute a read-only SQL SELECT query",
            inputSchema={
                "type": "object",
                "properties": {
                    "sql": {"type": "string", "description": "SQL SELECT query"},
                    "limit": {"type": "integer", "description": "Max rows", "default": 100}
                },
                "required": ["sql"]
            }
        ),
        Tool(
            name="insert_record",
            description="Insert a new record into a table",
            inputSchema={
                "type": "object",
                "properties": {
                    "table": {"type": "string", "description": "Target table name"},
                    "data": {
                        "type": "object",
                        "description": "Column-value pairs to insert",
                        "additionalProperties": True
                    }
                },
                "required": ["table", "data"]
            }
        )
    ]

@server.call_tool()
async def call_tool(name: str, arguments: dict):
    """Handle tool invocations with safety checks."""
    if name == "query":
        sql = arguments["sql"].strip()
        limit = arguments.get("limit", 100)

        # Security: only allow SELECT statements
        if not sql.upper().startswith("SELECT"):
            return [TextContent(type="text", text="Error: Only SELECT queries allowed.")]

        # Enforce row limit
        if "LIMIT" not in sql.upper():
            sql = f"{sql} LIMIT {limit}"

        async with aiosqlite.connect(DB_PATH) as db:
            db.row_factory = aiosqlite.Row
            cursor = await db.execute(sql)
            rows = await cursor.fetchall()
            columns = [d[0] for d in cursor.description]
            results = [dict(zip(columns, row)) for row in rows]

        return [TextContent(type="text", text=json.dumps(results, indent=2, default=str))]

    elif name == "insert_record":
        table = arguments["table"]
        data = arguments["data"]

        # Security: validate table name (prevent SQL injection)
        if not table.isalnum():
            return [TextContent(type="text", text="Error: Invalid table name.")]

        columns = ", ".join(data.keys())
        placeholders = ", ".join(["?"] * len(data))
        values = list(data.values())

        async with aiosqlite.connect(DB_PATH) as db:
            await db.execute(
                f"INSERT INTO {table} ({columns}) VALUES ({placeholders})",
                values
            )
            await db.commit()

        return [TextContent(type="text", text=f"Successfully inserted record into {table}.")]

    raise ValueError(f"Unknown tool: {name}")

# --- RESOURCES: Data the LLM can read ---
@server.list_resources()
async def list_resources():
    """Expose database schema and table data as readable resources."""
    resources = [
        Resource(
            uri="db://schema",
            name="Database Schema",
            description="Complete schema of all tables in the database",
            mimeType="application/json"
        )
    ]

    # Dynamically list all tables as resources
    async with aiosqlite.connect(DB_PATH) as db:
        cursor = await db.execute(
            "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
        )
        tables = await cursor.fetchall()
        for (table_name,) in tables:
            resources.append(Resource(
                uri=f"db://tables/{table_name}",
                name=f"Table: {table_name}",
                description=f"Sample data and schema for the {table_name} table",
                mimeType="application/json"
            ))

    return resources

@server.read_resource()
async def read_resource(uri: str):
    """Return resource content for a given URI."""
    if uri == "db://schema":
        async with aiosqlite.connect(DB_PATH) as db:
            cursor = await db.execute(
                "SELECT sql FROM sqlite_master WHERE type='table'"
            )
            schemas = await cursor.fetchall()
            return json.dumps([s[0] for s in schemas], indent=2)

    if uri.startswith("db://tables/"):
        table_name = uri.split("/")[-1]
        if not table_name.isalnum():
            return "Error: Invalid table name"

        async with aiosqlite.connect(DB_PATH) as db:
            # Return schema + sample rows
            cursor = await db.execute(f"PRAGMA table_info({table_name})")
            columns = await cursor.fetchall()
            cursor = await db.execute(f"SELECT * FROM {table_name} LIMIT 10")
            rows = await cursor.fetchall()
            col_names = [c[1] for c in columns]

            return json.dumps({
                "table": table_name,
                "columns": [{"name": c[1], "type": c[2], "nullable": not c[3]} for c in columns],
                "sample_rows": [dict(zip(col_names, row)) for row in rows],
                "row_count": len(rows)
            }, indent=2, default=str)

    raise ValueError(f"Unknown resource URI: {uri}")

# --- PROMPTS: Reusable templates for the LLM ---
@server.list_prompts()
async def list_prompts():
    """Expose reusable prompt templates."""
    return [
        Prompt(
            name="analyze-table",
            description="Generate a comprehensive analysis prompt for a database table",
            arguments=[
                PromptArgument(
                    name="table_name",
                    description="Name of the table to analyze",
                    required=True
                ),
                PromptArgument(
                    name="focus",
                    description="Analysis focus: 'quality', 'patterns', or 'summary'",
                    required=False
                )
            ]
        ),
        Prompt(
            name="write-query",
            description="Generate a prompt to help write a SQL query for a specific question",
            arguments=[
                PromptArgument(
                    name="question",
                    description="Natural language question to answer with SQL",
                    required=True
                )
            ]
        )
    ]

@server.get_prompt()
async def get_prompt(name: str, arguments: dict):
    """Return a populated prompt template."""
    if name == "analyze-table":
        table_name = arguments["table_name"]
        focus = arguments.get("focus", "summary")

        # Fetch schema for context
        async with aiosqlite.connect(DB_PATH) as db:
            cursor = await db.execute(f"PRAGMA table_info({table_name})")
            columns = await cursor.fetchall()
            schema_info = ", ".join([f"{c[1]} ({c[2]})" for c in columns])

        return {
            "messages": [
                PromptMessage(
                    role="user",
                    content=TextContent(
                        type="text",
                        text=f"Analyze the '{table_name}' table with focus on {focus}.\n\n"
                             f"Schema: {schema_info}\n\n"
                             f"Please provide:\n"
                             f"1. Data quality assessment\n"
                             f"2. Key patterns and distributions\n"
                             f"3. Potential issues or anomalies\n"
                             f"4. Recommended queries for deeper analysis"
                    )
                )
            ]
        }

    elif name == "write-query":
        question = arguments["question"]

        # Fetch full schema for context
        async with aiosqlite.connect(DB_PATH) as db:
            cursor = await db.execute(
                "SELECT sql FROM sqlite_master WHERE type='table'"
            )
            schemas = await cursor.fetchall()
            schema_text = "\n".join([s[0] for s in schemas if s[0]])

        return {
            "messages": [
                PromptMessage(
                    role="user",
                    content=TextContent(
                        type="text",
                        text=f"Write a SQL query to answer: {question}\n\n"
                             f"Database schema:\n{schema_text}\n\n"
                             f"Requirements:\n"
                             f"- Use only SELECT statements\n"
                             f"- Include appropriate JOINs if needed\n"
                             f"- Add LIMIT clause for safety\n"
                             f"- Explain the query logic"
                    )
                )
            ]
        }

    raise ValueError(f"Unknown prompt: {name}")

Server design considerations for production:

  • Stateless vs Stateful: Prefer stateless servers where possible. State should live in the data layer (database, cache), not the server process. This enables horizontal scaling and fault tolerance.
  • Latency: Tool invocations should complete within 5 seconds for interactive use. For long-running operations, use streaming responses to provide progress updates.
  • Observability: Instrument every tool call with structured logging, request tracing (correlation IDs), and metrics (latency histograms, error rates).
  • Idempotency: Write operations should be idempotent when possible. If the client retries a failed insert, it should not create duplicate records.
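
The observability and idempotency points above can be combined in a small instrumentation wrapper. A minimal sketch, assuming structured logging to stdout; the decorator and tool names are illustrative, not part of the MCP SDK:

```python
import functools
import logging
import time
import uuid

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("mcp.tools")

def instrumented(func):
    """Wrap an async tool handler with a correlation ID and latency logging."""
    @functools.wraps(func)
    async def wrapper(*args, **kwargs):
        correlation_id = uuid.uuid4().hex[:12]
        start = time.perf_counter()
        try:
            result = await func(*args, **kwargs)
            status = "ok"
            return result
        except Exception:
            status = "error"
            raise
        finally:
            # One structured log line per invocation feeds latency histograms
            latency_ms = (time.perf_counter() - start) * 1000
            logger.info(
                "tool=%s correlation_id=%s status=%s latency_ms=%.1f",
                func.__name__, correlation_id, status, latency_ms,
            )
    return wrapper

# Usage: decorate any async tool handler (hypothetical example tool)
@instrumented
async def lookup_user(user_id: str) -> dict:
    return {"user_id": user_id, "name": "example"}
```

The correlation ID can be propagated into downstream HTTP calls so a single tool invocation is traceable across services.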

2.5 Data Layer Integration

MCP servers bridge the gap between AI agents and the data they need. The data layer spans both local and remote sources:

| Data Source | Type | MCP Integration Pattern | Example Server |
|---|---|---|---|
| Local Filesystem | Local | Resources for reading, Tools for writing | @modelcontextprotocol/server-filesystem |
| SQLite / PostgreSQL | Local / Remote | Resources for schema, Tools for queries | mcp-server-sqlite, mcp-server-postgres |
| REST / GraphQL APIs | Remote | Tools that wrap HTTP calls | Custom server per API |
| SaaS Platforms | Remote | Tools for CRUD operations on SaaS entities | mcp-server-github, mcp-server-slack |
| Vector Databases | Local / Remote | Resources for similarity search, Tools for indexing | Custom server wrapping Chroma, Pinecone, Qdrant |
| Knowledge Graphs | Remote | Resources for traversal, Tools for queries | Custom server wrapping Neo4j, Amazon Neptune |
# MCP server wrapping a vector database for semantic search
# Demonstrates the Resources pattern for RAG-style retrieval

# pip install mcp chromadb sentence-transformers
import os
import json
import chromadb
from mcp.server import Server
from mcp.types import Tool, Resource, TextContent

# Initialize ChromaDB with persistent storage
CHROMA_PATH = os.getenv("CHROMA_DB_PATH", "./chroma_data")
chroma_client = chromadb.PersistentClient(path=CHROMA_PATH)

server = Server("vector-search-server")

@server.list_tools()
async def list_tools():
    """Expose semantic search and indexing tools."""
    return [
        Tool(
            name="semantic_search",
            description="Search the knowledge base using natural language",
            inputSchema={
                "type": "object",
                "properties": {
                    "query": {"type": "string", "description": "Natural language search query"},
                    "collection": {"type": "string", "description": "Collection name", "default": "documents"},
                    "top_k": {"type": "integer", "description": "Number of results", "default": 5}
                },
                "required": ["query"]
            }
        ),
        Tool(
            name="index_document",
            description="Add a document to the knowledge base",
            inputSchema={
                "type": "object",
                "properties": {
                    "text": {"type": "string", "description": "Document text to index"},
                    "metadata": {"type": "object", "description": "Document metadata"},
                    "collection": {"type": "string", "description": "Target collection", "default": "documents"}
                },
                "required": ["text"]
            }
        )
    ]

@server.call_tool()
async def call_tool(name: str, arguments: dict):
    """Handle vector search and indexing operations."""
    if name == "semantic_search":
        query = arguments["query"]
        collection_name = arguments.get("collection", "documents")
        top_k = arguments.get("top_k", 5)

        collection = chroma_client.get_or_create_collection(collection_name)
        results = collection.query(query_texts=[query], n_results=top_k)

        # Format results with metadata and relevance scores
        formatted = []
        for i, (doc, meta, dist) in enumerate(zip(
            results["documents"][0],
            results["metadatas"][0],
            results["distances"][0]
        )):
            formatted.append({
                "rank": i + 1,
                "text": doc,
                "metadata": meta,
                "similarity_score": round(1 - dist, 4)  # Convert distance to similarity
            })

        return [TextContent(type="text", text=json.dumps(formatted, indent=2))]

    elif name == "index_document":
        text = arguments["text"]
        metadata = arguments.get("metadata", {})
        collection_name = arguments.get("collection", "documents")

        collection = chroma_client.get_or_create_collection(collection_name)

        # Generate a deterministic ID for idempotency
        import hashlib
        doc_id = hashlib.sha256(text.encode()).hexdigest()[:16]

        collection.upsert(
            documents=[text],
            metadatas=[metadata],
            ids=[doc_id]
        )

        return [TextContent(
            type="text",
            text=f"Document indexed successfully. ID: {doc_id}, Collection: {collection_name}"
        )]

    raise ValueError(f"Unknown tool: {name}")

@server.list_resources()
async def list_resources():
    """Expose collection metadata as resources."""
    collections = chroma_client.list_collections()
    return [
        Resource(
            uri=f"vector://collections/{col.name}",
            name=f"Collection: {col.name}",
            description=f"Metadata and stats for the {col.name} vector collection",
            mimeType="application/json"
        )
        for col in collections
    ]
Case Study

Claude Desktop's MCP Ecosystem

Claude Desktop was the first production MCP host, and its ecosystem demonstrates the power of the protocol. A typical power user's Claude Desktop configuration connects to 5-10 MCP servers simultaneously:

  • Filesystem server — read and write project files directly from chat
  • GitHub server — create issues, review PRs, search repositories
  • Slack server — send messages, search conversations, manage channels
  • PostgreSQL server — query production databases (read-only)
  • Brave Search server — real-time web search with citations

Claude can seamlessly combine capabilities: "Search GitHub for open issues about authentication, check the relevant code files, query the database for affected users, and draft a Slack message to the team with your findings." One prompt, five MCP servers, zero custom integration code.

[Diagram: Claude Desktop multi-server ecosystem, zero custom integration code]

3. Core MCP Primitives

MCP defines four core primitives that cover the full spectrum of AI-system interactions. Together they form a complete vocabulary: read data (Resources), take actions (Tools), reuse templates (Prompts), and delegate reasoning (Sampling).

3.1 Resources (READ) — Structured Data Access

Resources represent data that the LLM can read but not modify. They are identified by URIs and return structured or unstructured content. Think of Resources as a read-only API for the LLM's knowledge.

| Resource Type | URI Pattern | Content | Use Case |
|---|---|---|---|
| Documents | file:///docs/guide.md | Markdown, PDF, text | Knowledge base articles, documentation |
| Database Queries | db://tables/users/schema | JSON schema, sample rows | Schema discovery, data previews |
| API Responses | api://weather/current | JSON data | Real-time data feeds |
| Vector Search Results | vector://search?q=deployment | Ranked document chunks | Semantic retrieval for RAG |
| Configuration | config://app/settings | JSON/YAML config | Application state, feature flags |

Advanced resource patterns include: pagination (using cursor-based or offset parameters in the URI), filtering (query parameters that narrow results), chunking (splitting large documents into LLM-friendly sizes), and caching (ETags or last-modified headers to avoid re-fetching unchanged data).
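
The pagination pattern can be sketched as cursor-style paging encoded in the resource URI. The `docs://` scheme and in-memory store below are hypothetical stand-ins for a real resource handler:

```python
import json
from urllib.parse import urlparse, parse_qs

# Stand-in document store; a real server would page through a database query
DOCUMENTS = [f"doc-{i}" for i in range(100)]

def read_paginated_resource(uri: str) -> str:
    """Read a docs:// resource, honoring cursor and limit query parameters."""
    parsed = urlparse(uri)
    params = parse_qs(parsed.query)
    cursor = int(params.get("cursor", ["0"])[0])
    limit = min(int(params.get("limit", ["10"])[0]), 50)  # cap page size

    page = DOCUMENTS[cursor:cursor + limit]
    next_cursor = cursor + limit if cursor + limit < len(DOCUMENTS) else None

    return json.dumps({
        "items": page,
        # The client passes next_cursor back in the next read to continue paging
        "next_cursor": next_cursor,
    })

result = json.loads(read_paginated_resource("docs://articles?cursor=20&limit=5"))
print(result["items"])        # ['doc-20', 'doc-21', 'doc-22', 'doc-23', 'doc-24']
print(result["next_cursor"])  # 25
```

Filtering works the same way: additional query parameters narrow the result set before the cursor window is applied.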

3.2 Tools (ACT) — Actions with Side Effects

Tools are the workhorse of MCP — they let the LLM do things. Unlike Resources (read-only), Tools can have side effects: writing to databases, calling APIs, sending emails, creating files. Every Tool is defined by a JSON Schema that makes its interface completely explicit.

# Advanced tool patterns: idempotency, side-effect control, tool chaining
# Demonstrates production-grade tool implementation

# pip install mcp httpx
import os
import json
import hashlib
import httpx
from datetime import datetime, timezone
from mcp.server import Server
from mcp.types import Tool, TextContent

# API key from environment
GITHUB_TOKEN = os.getenv("GITHUB_TOKEN", "")
server = Server("github-tools-server")

@server.list_tools()
async def list_tools():
    """Expose GitHub operations as MCP tools."""
    return [
        Tool(
            name="create_issue",
            description="Create a GitHub issue with title, body, and labels",
            inputSchema={
                "type": "object",
                "properties": {
                    "repo": {
                        "type": "string",
                        "description": "Repository in 'owner/name' format"
                    },
                    "title": {
                        "type": "string",
                        "description": "Issue title",
                        "maxLength": 256
                    },
                    "body": {
                        "type": "string",
                        "description": "Issue body (supports Markdown)"
                    },
                    "labels": {
                        "type": "array",
                        "items": {"type": "string"},
                        "description": "Labels to apply",
                        "default": []
                    },
                    "idempotency_key": {
                        "type": "string",
                        "description": "Unique key to prevent duplicate creation"
                    }
                },
                "required": ["repo", "title", "body"]
            }
        ),
        Tool(
            name="search_code",
            description="Search for code across GitHub repositories",
            inputSchema={
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "Search query (supports GitHub search syntax)"
                    },
                    "language": {
                        "type": "string",
                        "description": "Filter by programming language"
                    },
                    "max_results": {
                        "type": "integer",
                        "description": "Maximum results to return",
                        "default": 10,
                        "maximum": 50
                    }
                },
                "required": ["query"]
            }
        )
    ]

@server.call_tool()
async def call_tool(name: str, arguments: dict):
    """Execute GitHub tool invocations with safety and idempotency."""
    headers = {
        "Authorization": f"Bearer {GITHUB_TOKEN}",
        "Accept": "application/vnd.github.v3+json",
        "X-GitHub-Api-Version": "2022-11-28"
    }

    async with httpx.AsyncClient(base_url="https://api.github.com") as client:

        if name == "create_issue":
            repo = arguments["repo"]
            title = arguments["title"]
            body = arguments["body"]
            labels = arguments.get("labels", [])

            # Idempotency: check if issue with same title already exists
            idempotency_key = arguments.get(
                "idempotency_key",
                hashlib.sha256(f"{repo}:{title}".encode()).hexdigest()[:12]
            )

            # Best-effort dedupe: scan recent issues for an identical title
            search_response = await client.get(
                f"/repos/{repo}/issues",
                headers=headers,
                params={"state": "all", "per_page": 5}
            )

            if search_response.status_code == 200:
                existing = [
                    i for i in search_response.json()
                    if i.get("title") == title
                ]
                if existing:
                    return [TextContent(
                        type="text",
                        text=json.dumps({
                            "status": "already_exists",
                            "issue_number": existing[0]["number"],
                            "url": existing[0]["html_url"],
                            "message": "Issue with identical title already exists."
                        }, indent=2)
                    )]

            # Create the issue
            response = await client.post(
                f"/repos/{repo}/issues",
                headers=headers,
                json={
                    "title": title,
                    "body": f"{body}\n\n---\n_idempotency_key: {idempotency_key}_",
                    "labels": labels
                }
            )

            if response.status_code == 201:
                issue = response.json()
                return [TextContent(
                    type="text",
                    text=json.dumps({
                        "status": "created",
                        "issue_number": issue["number"],
                        "url": issue["html_url"],
                        "created_at": issue["created_at"]
                    }, indent=2)
                )]
            else:
                return [TextContent(
                    type="text",
                    text=f"Error creating issue: {response.status_code} - {response.text}"
                )]

        elif name == "search_code":
            query = arguments["query"]
            language = arguments.get("language", "")
            max_results = arguments.get("max_results", 10)

            # Build GitHub search query
            search_query = query
            if language:
                search_query += f" language:{language}"

            response = await client.get(
                "/search/code",
                headers=headers,
                params={"q": search_query, "per_page": min(max_results, 50)}
            )

            if response.status_code == 200:
                data = response.json()
                results = []
                for item in data.get("items", [])[:max_results]:
                    results.append({
                        "repository": item["repository"]["full_name"],
                        "path": item["path"],
                        "url": item["html_url"],
                        "score": item.get("score", 0)
                    })
                return [TextContent(
                    type="text",
                    text=json.dumps({
                        "total_count": data.get("total_count", 0),
                        "results": results
                    }, indent=2)
                )]
            else:
                return [TextContent(
                    type="text",
                    text=f"Search error: {response.status_code} - {response.text}"
                )]

    raise ValueError(f"Unknown tool: {name}")

3.3 Prompts (REUSE) — Reusable Templates

MCP Prompts are reusable, parameterized templates that servers expose for common interaction patterns. They are not just strings — they are structured message sequences that can include system prompts, user messages, and even pre-filled assistant responses.

# Advanced prompt patterns: versioning, parameterization, injection defense
# Demonstrates production-grade prompt templates

# pip install mcp
from mcp.server import Server
from mcp.types import (
    Prompt, PromptArgument, PromptMessage, TextContent
)

server = Server("prompt-library-server")

# Prompt template registry with versioning
PROMPT_TEMPLATES = {
    "code-review": {
        "version": "2.1",
        "description": "Generate a thorough code review with security analysis",
        "system_prompt": (
            "You are a senior software engineer conducting a code review. "
            "Focus on: correctness, security vulnerabilities, performance, "
            "readability, and adherence to best practices. "
            "IMPORTANT: Never execute code suggestions. Only analyze and recommend."
        ),
        "user_template": (
            "Please review the following {language} code:\n\n"
            "```{language}\n{code}\n```\n\n"
            "Context: {context}\n\n"
            "Focus areas: {focus_areas}\n\n"
            "Provide your review in the following format:\n"
            "1. Summary (1-2 sentences)\n"
            "2. Critical Issues (security, correctness)\n"
            "3. Improvements (performance, readability)\n"
            "4. Positive Aspects\n"
            "5. Suggested Refactoring (with code examples)"
        )
    },
    "incident-response": {
        "version": "1.3",
        "description": "Guide incident response and root cause analysis",
        "system_prompt": (
            "You are an SRE incident commander. Help analyze the incident, "
            "identify root causes, and recommend mitigations. "
            "Be systematic and prioritize by severity. "
            "CRITICAL: Do not suggest running destructive commands."
        ),
        "user_template": (
            "Incident: {incident_title}\n"
            "Severity: {severity}\n"
            "Service: {service_name}\n"
            "Symptoms: {symptoms}\n"
            "Timeline: {timeline}\n\n"
            "Please provide:\n"
            "1. Initial assessment and severity validation\n"
            "2. Likely root causes (ranked by probability)\n"
            "3. Immediate mitigation steps\n"
            "4. Investigation queries to run\n"
            "5. Post-incident action items"
        )
    }
}

@server.list_prompts()
async def list_prompts():
    """Expose versioned prompt templates with parameter definitions."""
    return [
        Prompt(
            name="code-review",
            description=f"[v{PROMPT_TEMPLATES['code-review']['version']}] "
                        f"{PROMPT_TEMPLATES['code-review']['description']}",
            arguments=[
                PromptArgument(name="code", description="Code to review", required=True),
                PromptArgument(name="language", description="Programming language", required=True),
                PromptArgument(name="context", description="PR context or description", required=False),
                PromptArgument(name="focus_areas", description="Specific areas to focus on", required=False)
            ]
        ),
        Prompt(
            name="incident-response",
            description=f"[v{PROMPT_TEMPLATES['incident-response']['version']}] "
                        f"{PROMPT_TEMPLATES['incident-response']['description']}",
            arguments=[
                PromptArgument(name="incident_title", description="Incident title", required=True),
                PromptArgument(name="severity", description="P0-P4", required=True),
                PromptArgument(name="service_name", description="Affected service", required=True),
                PromptArgument(name="symptoms", description="Observed symptoms", required=True),
                PromptArgument(name="timeline", description="Event timeline", required=False)
            ]
        )
    ]

@server.get_prompt()
async def get_prompt(name: str, arguments: dict):
    """Return populated prompt with injection defense."""
    if name not in PROMPT_TEMPLATES:
        raise ValueError(f"Unknown prompt: {name}")

    template = PROMPT_TEMPLATES[name]

    # Injection defense: sanitize all user-provided arguments
    sanitized_args = {}
    for key, value in arguments.items():
        if isinstance(value, str):
            # Strip common prompt-injection markers and role tags
            sanitized = value.replace("IGNORE PREVIOUS INSTRUCTIONS", "[FILTERED]")
            sanitized = sanitized.replace("<system>", "[FILTERED]")
            sanitized = sanitized.replace("</system>", "[FILTERED]")
            sanitized_args[key] = sanitized
        else:
            sanitized_args[key] = value

    # Fill defaults for optional arguments
    sanitized_args.setdefault("context", "No additional context provided")
    sanitized_args.setdefault("focus_areas", "All areas")
    sanitized_args.setdefault("timeline", "Not provided")

    # Build the message sequence
    user_text = template["user_template"].format(**sanitized_args)

    return {
        "messages": [
            # MCP prompt messages only support "user" and "assistant" roles,
            # so the system prompt travels as a leading assistant message
            PromptMessage(
                role="assistant",
                content=TextContent(type="text", text=template["system_prompt"])
            ),
            PromptMessage(
                role="user",
                content=TextContent(type="text", text=user_text)
            )
        ]
    }

3.4 Sampling (THINK) — Delegated Reasoning

Sampling is MCP's most distinctive primitive. It inverts the normal flow: instead of the host asking the server to execute a tool, the server asks the host to perform an LLM completion. This enables servers to leverage AI reasoning without needing their own LLM access.

Important Distinction: Sampling requests flow from server to host, the opposite direction of tool calls. The server says "I need the LLM to reason about this data before I can continue." The host fulfills the request using its LLM and returns the result. This keeps all LLM interaction centralized in the host while letting servers participate in the reasoning chain.

Key sampling use cases:

  • Tool result summarization: A database server returns 500 rows, then asks the host to summarize the key findings before returning to the user
  • Multi-step planning: A code analysis server asks the host to plan which files to examine based on an initial code scan
  • Content classification: A content moderation server asks the host to classify user input before deciding which tool to invoke
  • Error interpretation: A deployment server encounters an error log and asks the host to interpret the stack trace before suggesting remediation

3.5 Transports (CONNECT) — Communication Channels

MCP is transport-agnostic — the protocol messages are the same regardless of how they are delivered. The choice of transport depends on the deployment context:

| Transport | Mechanism | Best For | Latency | Scalability |
|---|---|---|---|---|
| STDIO | Standard input/output of a child process | Local development, desktop apps (Claude Desktop, Cursor) | Lowest (~1 ms) | Single user only |
| HTTP + SSE | HTTP POST for requests, Server-Sent Events for streaming | Web services, multi-user, cloud deployments | Low (~10-50 ms) | Horizontal scaling via load balancer |
| WebSocket | Persistent bidirectional connection | Real-time streaming, long-lived sessions | Low for sustained connections (~5 ms) | Good with connection managers |
| gRPC | Protocol Buffers over HTTP/2 | High-performance microservices, large payloads | Low at scale (~2-5 ms) | Excellent; built for microservices |
# Running the same MCP server over different transports
# The server logic is identical — only the transport layer changes

# pip install mcp uvicorn
import asyncio
from mcp.server import Server
from mcp.server.stdio import stdio_server
from mcp.server.sse import SseServerTransport
from mcp.types import Tool, TextContent

# Create the server (transport-independent)
server = Server("multi-transport-demo")

@server.list_tools()
async def list_tools():
    """Same tools regardless of transport."""
    return [
        Tool(
            name="greet",
            description="Generate a greeting message",
            inputSchema={
                "type": "object",
                "properties": {
                    "name": {"type": "string", "description": "Name to greet"}
                },
                "required": ["name"]
            }
        )
    ]

@server.call_tool()
async def call_tool(name: str, arguments: dict):
    if name == "greet":
        return [TextContent(type="text", text=f"Hello, {arguments['name']}!")]
    raise ValueError(f"Unknown tool: {name}")


# --- Transport Option 1: STDIO (for local/desktop use) ---
async def run_stdio():
    """Run as a child process communicating via stdin/stdout."""
    async with stdio_server() as (read_stream, write_stream):
        await server.run(read_stream, write_stream, server.create_initialization_options())


# --- Transport Option 2: HTTP + SSE (for web/cloud use) ---
def create_sse_app():
    """Create an HTTP app with Server-Sent Events transport."""
    from starlette.applications import Starlette
    from starlette.routing import Route

    sse_transport = SseServerTransport("/messages")

    async def handle_sse(request):
        """Handle SSE connections from MCP clients."""
        async with sse_transport.connect_sse(
            request.scope, request.receive, request._send
        ) as streams:
            await server.run(
                streams[0], streams[1],
                server.create_initialization_options()
            )

    async def handle_messages(request):
        """Handle incoming JSON-RPC messages via HTTP POST."""
        await sse_transport.handle_post_message(
            request.scope, request.receive, request._send
        )

    app = Starlette(routes=[
        Route("/sse", endpoint=handle_sse),
        Route("/messages", endpoint=handle_messages, methods=["POST"]),
    ])
    return app

# To run STDIO: asyncio.run(run_stdio())
# To run HTTP/SSE: uvicorn.run(create_sse_app(), host="0.0.0.0", port=8000)

4. The FastMCP Python SDK — Hands-On Guide

Now that you understand the MCP architecture and its core primitives conceptually, let us write real code. The FastMCP SDK is the official high-level Python library that makes building MCP servers simple and intuitive. If the previous sections explained the what and why of MCP, this section covers the how.

Analogy: FastMCP is to MCP what Flask is to HTTP — it handles the protocol plumbing (JSON-RPC messages, schema generation, transport negotiation) so you can focus entirely on your tools and data. You write normal Python functions; FastMCP converts them into fully compliant MCP capabilities automatically.

Getting Started with FastMCP

FastMCP lives in the mcp Python package. Install it with pip and create your first server in just three lines:

# pip install "mcp[cli]" httpx
from mcp.server.fastmcp import FastMCP

# Create an MCP server instance — the name identifies your server to clients
mcp = FastMCP("my-awesome-server")

That is it. The FastMCP constructor takes a server name (this appears in client UIs like Claude Desktop or Cursor) and returns a server instance. You then add capabilities to this instance using decorators.

The mcp[cli] install extra includes the mcp command-line tool for testing and debugging your servers. The httpx library is commonly used for making async HTTP requests inside your tools.

@mcp.tool() — Exposing Functions to LLMs

The @mcp.tool() decorator is the most important concept in FastMCP. It takes a normal Python function and registers it as a tool that any connected LLM can call. The key insight is that you write a normal Python function with good docstrings and type hints, and FastMCP converts it into a fully-described, schema-validated MCP tool automatically.

Here is a concrete example — a weather alerts tool from the Anthropic documentation:

@mcp.tool()
async def get_alerts(state: str) -> str:
    """Get weather alerts for a US state.

    Args:
        state: Two-letter US state code (e.g. CA, NY)
    """
    url = f"{NWS_API_BASE}/alerts/active/area/{state}"
    data = await make_nws_request(url)
    if not data or "features" not in data:
        return "Unable to fetch alerts or no alerts found."
    alerts = [format_alert(f) for f in data["features"]]
    return "\n---\n".join(alerts)

When an LLM sees this tool, it knows: name=get_alerts, takes a state string, returns weather alerts. The docstring tells the LLM when to use it, and the Args section tells it how to call it correctly.

Here is exactly how FastMCP translates your Python code into the MCP protocol:

| Your Code | MCP Schema | Purpose |
|---|---|---|
| Function name | tool.name | How the LLM identifies the tool |
| Docstring | tool.description | How the LLM decides when to use it |
| Type hints | inputSchema (JSON Schema) | Input validation |
| Args docstring | Parameter descriptions | Helps the LLM provide correct arguments |
| Return type | Output format | What the LLM receives back |

This is the magic of FastMCP: good Python practices (type hints, docstrings) directly become good MCP tool descriptions. There is no separate schema file, no configuration YAML, no manual JSON Schema authoring.
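
To make the mapping concrete, here is roughly the tool definition a client would see for get_alerts. This is a hand-written approximation of what FastMCP derives, not actual SDK output:

```python
import json

# Approximate MCP tool definition derived from the get_alerts function
get_alerts_tool = {
    "name": "get_alerts",  # from the function name
    "description": "Get weather alerts for a US state.",  # from the docstring
    "inputSchema": {  # from the type hints
        "type": "object",
        "properties": {
            "state": {
                "type": "string",
                # from the Args section of the docstring
                "description": "Two-letter US state code (e.g. CA, NY)",
            }
        },
        "required": ["state"],
    },
}

print(json.dumps(get_alerts_tool, indent=2))
```

This is the payload a client receives from tools/list, and it is everything the LLM knows about the tool when deciding whether and how to call it.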

@mcp.resource() — Exposing Data to LLMs

The @mcp.resource() decorator exposes read-only data that the application can retrieve. Unlike tools (which the LLM decides to call), resources are application-controlled — the host application decides when to read them. Think of resources as files or API endpoints that provide context to the LLM.

Resources use URI patterns — static URIs for fixed data and URI templates for dynamic data:

@mcp.resource("config://app")
def get_config() -> str:
    """Get the current application configuration."""
    return json.dumps({"theme": "dark", "language": "en", "version": "2.1.0"})

@mcp.resource("users://{user_id}/profile")
def get_user_profile(user_id: str) -> str:
    """Get a user's profile by their ID."""
    # In production, this would query a database
    profiles = {"alice": "Alice Smith - Engineer", "bob": "Bob Jones - Designer"}
    return profiles.get(user_id, f"User {user_id} not found")

The first resource uses a static URI (config://app) — there is exactly one configuration. The second uses a URI template (users://{user_id}/profile) — the {user_id} placeholder means the application can request any user's profile by substituting the ID.

@mcp.prompt() — Reusable Templates

The @mcp.prompt() decorator creates reusable prompt templates that users can invoke. These are user-controlled — think of them like slash commands in Slack or Discord. Prompts let you package complex, well-crafted instructions into simple, parameterized templates.

@mcp.prompt()
def review_code(code: str, language: str = "python") -> str:
    """Review code for bugs and improvements.

    Args:
        code: The source code to review
        language: Programming language of the code
    """
    return f"""Please review this {language} code for:
1. Bugs and potential errors
2. Performance improvements
3. Security vulnerabilities
4. Code style and best practices

Code to review:
```{language}
{code}
```"""

When a user selects this prompt in their MCP client, they are asked to provide the code and optionally the language. The template then generates a well-structured review request for the LLM.

Running Your MCP Server

With your tools, resources, and prompts defined, running your server is a single line. The transport parameter determines how clients connect:

# Run with STDIO transport (for local clients like Claude Desktop, Cursor)
mcp.run(transport="stdio")
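
To connect a local client, you then register the server in the host's configuration. For Claude Desktop this is claude_desktop_config.json; the command and path below are illustrative, so adjust them for your environment:

```json
{
  "mcpServers": {
    "weather": {
      "command": "python",
      "args": ["/absolute/path/to/weather_server.py"]
    }
  }
}
```

The host launches the server as a child process and speaks MCP over its stdin/stdout.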

Here is a complete, copy-paste-ready weather server that ties everything together:

# weather_server.py — Complete MCP Weather Server
# pip install "mcp[cli]" httpx

import json
import httpx
from mcp.server.fastmcp import FastMCP

# Create the MCP server
mcp = FastMCP("weather-server")

NWS_API_BASE = "https://api.weather.gov"
USER_AGENT = "weather-app/1.0"

async def make_nws_request(url: str) -> dict | None:
    """Make a request to the NWS API with proper headers."""
    headers = {"User-Agent": USER_AGENT, "Accept": "application/geo+json"}
    async with httpx.AsyncClient() as client:
        try:
            response = await client.get(url, headers=headers, timeout=30.0)
            response.raise_for_status()
            return response.json()
        except httpx.HTTPError:
            return None  # callers treat None as "unable to fetch"

def format_alert(feature: dict) -> str:
    """Format a single weather alert for display."""
    props = feature["properties"]
    return f"""
Event: {props.get('event', 'Unknown')}
Area: {props.get('areaDesc', 'Unknown')}
Severity: {props.get('severity', 'Unknown')}
Description: {props.get('description', 'No description')}
Instructions: {props.get('instruction', 'No instructions')}
"""

@mcp.tool()
async def get_alerts(state: str) -> str:
    """Get weather alerts for a US state.

    Args:
        state: Two-letter US state code (e.g. CA, NY)
    """
    url = f"{NWS_API_BASE}/alerts/active/area/{state}"
    data = await make_nws_request(url)
    if not data or "features" not in data:
        return "Unable to fetch alerts or no alerts found."
    alerts = [format_alert(f) for f in data["features"]]
    return "\n---\n".join(alerts) if alerts else "No active alerts for this state."

@mcp.tool()
async def get_forecast(latitude: float, longitude: float) -> str:
    """Get the weather forecast for a location.

    Args:
        latitude: Latitude of the location
        longitude: Longitude of the location
    """
    # First get the forecast grid endpoint
    points_url = f"{NWS_API_BASE}/points/{latitude},{longitude}"
    points_data = await make_nws_request(points_url)
    if not points_data:
        return "Unable to fetch forecast data for this location."
    forecast_url = points_data["properties"]["forecast"]
    forecast_data = await make_nws_request(forecast_url)
    if not forecast_data:
        return "Unable to fetch forecast."
    periods = forecast_data["properties"]["periods"][:5]
    forecasts = [f"{p['name']}: {p['detailedForecast']}" for p in periods]
    return "\n---\n".join(forecasts)

@mcp.resource("config://weather")
def get_weather_config() -> str:
    """Get the weather server configuration and supported regions."""
    return json.dumps({
        "api": "National Weather Service",
        "coverage": "United States",
        "update_frequency": "Every 15 minutes"
    })

@mcp.prompt()
def weather_briefing(state: str) -> str:
    """Generate a comprehensive weather briefing for a state.

    Args:
        state: Two-letter US state code (e.g. CA, NY)
    """
    return f"""Please provide a comprehensive weather briefing for {state}:
1. Check current weather alerts
2. Summarize any severe weather warnings
3. Provide a general outlook
Use the available weather tools to gather this information."""

# Start the server
if __name__ == "__main__":
    mcp.run(transport="stdio")

To connect this server to Claude Desktop, add it to your Claude Desktop configuration file (claude_desktop_config.json):

{
  "mcpServers": {
    "weather": {
      "command": "python",
      "args": ["weather_server.py"]
    }
  }
}
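Before wiring the server into Claude Desktop, you can exercise it interactively with the MCP Inspector, the official debugging tool for MCP servers (requires Node.js):

```shell
# Launch the MCP Inspector against the STDIO weather server
npx @modelcontextprotocol/inspector python weather_server.py
```

The Inspector opens a browser UI where you can list the server's tools, call get_alerts and get_forecast with test arguments, and watch the raw JSON-RPC traffic.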
Key Insight: The entire FastMCP SDK is designed around one principle — write normal Python functions with good docstrings and type hints, and FastMCP handles everything else: schema generation, validation, transport, and protocol compliance. The three decorators map directly to the three core MCP primitives: @mcp.tool() = LLM calls it, @mcp.resource() = app reads it, @mcp.prompt() = user invokes it.

5. Protocol Flow & Lifecycle

Understanding MCP’s protocol lifecycle is essential for building reliable integrations. Every MCP session follows a structured flow: initialization (capability negotiation), operation (request/response and notification exchanges), and shutdown (graceful disconnection). This section traces the complete message flow from connection establishment through tool invocation, showing how JSON-RPC messages coordinate between client and server at each stage.

5.1 End-to-End Flow

Tracing a single user request through the entire MCP stack is the fastest way to build the intuition you need for debugging and optimizing MCP-based systems. Here is the full lifecycle:

# End-to-End MCP Flow: User asks "How many active users do we have?"
#
# Step 1: USER INPUT
#   User types: "How many active users do we have?"
#
# Step 2: HOST PROCESSING
#   Claude Desktop receives the message.
#   Host adds the message to the conversation context.
#   Host includes tool descriptions from all connected MCP servers.
#
# Step 3: LLM DECISION
#   Claude sees available tools: [query_database, search_web, read_file, ...]
#   Claude decides to use "query_database" tool.
#   Claude generates: {"tool": "query_database", "args": {"sql": "SELECT COUNT(*) ..."}}
#
# Step 4: HOST -> CLIENT -> SERVER
#   Host identifies which client manages the database server.
#   Client sends JSON-RPC tool invocation to the server.
#   Server executes the SQL query against the actual database.
#
# Step 5: SERVER -> CLIENT -> HOST
#   Server returns: {"result": [{"count": 12847}]}
#   Client passes result back to host.
#   Host injects tool result into the conversation context.
#
# Step 6: LLM SYNTHESIS
#   Claude sees the tool result in context.
#   Claude generates: "You currently have 12,847 active users."
#
# Step 7: USER RESPONSE
#   Host renders Claude's response in the chat UI.
#   Total time: ~2-4 seconds (LLM inference dominates)

5.2 Message Types

MCP uses JSON-RPC 2.0 as its message format. Every interaction between client and server is a JSON-RPC message. The key message types in the protocol lifecycle are:

| Phase | Message Type | Direction | Purpose |
| --- | --- | --- | --- |
| Initialization | initialize | Client -> Server | Protocol handshake, version negotiation, capability exchange |
| Initialization | notifications/initialized | Client -> Server | Notification confirming initialization is complete |
| Discovery | tools/list | Client -> Server | Request list of available tools with schemas |
| Discovery | resources/list | Client -> Server | Request list of available resources with URIs |
| Discovery | prompts/list | Client -> Server | Request list of available prompt templates |
| Invocation | tools/call | Client -> Server | Execute a tool with provided arguments |
| Invocation | resources/read | Client -> Server | Fetch content of a resource by URI |
| Invocation | prompts/get | Client -> Server | Get a populated prompt template |
| Sampling | sampling/createMessage | Server -> Client | Request LLM completion from the host |
| Notifications | notifications/tools/list_changed | Server -> Client | Server's available tools have changed |
| Error | JSON-RPC error | Either direction | Structured error with code, message, and data |
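One detail worth internalizing: JSON-RPC requests carry an id and expect a matching response, while notifications omit the id entirely and are fire-and-forget. A minimal illustration:

```python
import json

# A request: carries an "id", so the receiver must reply with the same id
request = {"jsonrpc": "2.0", "id": 1, "method": "tools/list", "params": {}}

# A notification: no "id" field, so no response is expected or allowed
notification = {"jsonrpc": "2.0", "method": "notifications/initialized"}

print(json.dumps(request))
print(json.dumps(notification))
```

This is why the initialized confirmation in the lifecycle never receives a reply: it is a notification, not a request.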

5.3 State Management

MCP carefully separates stateless and stateful concerns across the architecture:

  • Servers are preferably stateless: Each tool invocation should be self-contained. The server receives all necessary context in the request and returns a complete response. This allows servers to be restarted, scaled horizontally, or replaced without losing state.
  • Clients are session-aware: Clients maintain connection state (session ID, negotiated capabilities, transport state) for the duration of a session. If a server restarts, the client re-initializes and re-discovers capabilities.
  • Hosts manage conversation context: The host is responsible for managing the conversation history, context window budget, and deciding which tool results to include in the LLM prompt. This is where context window management becomes critical — a tool that returns 10,000 tokens of data may need to be summarized before injection.
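The context-window concern in the last bullet can be sketched as a simple host-side guard. The 4-characters-per-token heuristic and the MAX_RESULT_TOKENS budget below are illustrative assumptions, not part of the protocol:

```python
MAX_RESULT_TOKENS = 2_000  # illustrative per-result budget, not an MCP constant

def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English text
    return len(text) // 4

def fit_tool_result(result: str) -> str:
    """Truncate an oversized tool result before injecting it into the LLM prompt."""
    if estimate_tokens(result) <= MAX_RESULT_TOKENS:
        return result
    keep_chars = MAX_RESULT_TOKENS * 4
    return result[:keep_chars] + "\n[truncated: result exceeded the context budget]"

big_result = "row\n" * 10_000                # ~10,000 tokens of raw query output
print(estimate_tokens(big_result))           # -> 10000
print(len(fit_tool_result(big_result)) < len(big_result))  # -> True
```

A production host would summarize rather than hard-truncate, but the budgeting decision happens at exactly this point in the pipeline.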

5.4 Complete MCP Message Flow Simulation

To see the full protocol in action, this simulation traces every JSON-RPC message exchanged during a complete MCP session — from initialization handshake through tool discovery, resource listing, tool execution, and graceful shutdown. Running this simulation reveals the exact message structure and sequencing that real MCP clients and servers use, making it an invaluable reference for debugging protocol-level issues.

# Complete MCP protocol flow simulation
# This demonstrates every message type in the correct sequence

# (pure standard library -- no MCP SDK required to run this simulation)
import json
from datetime import datetime, timezone
from dataclasses import dataclass, field
from typing import Any

@dataclass
class MCPMessage:
    """Represents a single MCP JSON-RPC message."""
    jsonrpc: str = "2.0"
    method: str = ""
    params: dict = field(default_factory=dict)
    result: Any = None
    error: dict | None = None
    id: int | None = None

    def to_json(self):
        """Serialize to JSON-RPC format."""
        msg = {"jsonrpc": self.jsonrpc}
        if self.method:
            msg["method"] = self.method
        if self.params:
            msg["params"] = self.params
        if self.result is not None:
            msg["result"] = self.result
        if self.error:
            msg["error"] = self.error
        if self.id is not None:
            msg["id"] = self.id
        return json.dumps(msg, indent=2)


def simulate_mcp_flow():
    """Simulate the complete MCP protocol flow with all message types."""

    messages = []
    msg_id = 0

    # --- Phase 1: Initialization ---
    msg_id += 1
    init_request = MCPMessage(
        method="initialize",
        params={
            "protocolVersion": "2025-03-26",
            "capabilities": {
                "tools": {},            # Client supports tool invocations
                "resources": {},        # Client supports resource reading
                "prompts": {},          # Client supports prompt templates
                "sampling": {}          # Client supports sampling requests
            },
            "clientInfo": {
                "name": "claude-desktop",
                "version": "1.5.0"
            }
        },
        id=msg_id
    )
    messages.append(("CLIENT -> SERVER", init_request))

    # Server responds with its capabilities
    init_response = MCPMessage(
        result={
            "protocolVersion": "2025-03-26",
            "capabilities": {
                "tools": {"listChanged": True},     # Server supports tool change notifications
                "resources": {"subscribe": True},   # Server supports resource subscriptions
                "prompts": {"listChanged": True},   # Server supports prompt change notifications
                "sampling": {}                      # Server may request LLM completions
            },
            "serverInfo": {
                "name": "enterprise-database-server",
                "version": "2.1.0"
            }
        },
        id=msg_id
    )
    messages.append(("SERVER -> CLIENT", init_response))

    # Client confirms initialization
    initialized = MCPMessage(method="notifications/initialized")
    messages.append(("CLIENT -> SERVER", initialized))

    # --- Phase 2: Capability Discovery ---
    msg_id += 1
    list_tools_request = MCPMessage(
        method="tools/list",
        params={},
        id=msg_id
    )
    messages.append(("CLIENT -> SERVER", list_tools_request))

    list_tools_response = MCPMessage(
        result={
            "tools": [
                {
                    "name": "query_database",
                    "description": "Execute a read-only SQL query",
                    "inputSchema": {
                        "type": "object",
                        "properties": {
                            "sql": {"type": "string", "description": "SQL SELECT query"},
                            "limit": {"type": "integer", "default": 100}
                        },
                        "required": ["sql"]
                    }
                },
                {
                    "name": "list_tables",
                    "description": "List all tables in the database",
                    "inputSchema": {
                        "type": "object",
                        "properties": {},
                        "required": []
                    }
                }
            ]
        },
        id=msg_id
    )
    messages.append(("SERVER -> CLIENT", list_tools_response))

    # --- Phase 3: Tool Invocation ---
    msg_id += 1
    tool_call = MCPMessage(
        method="tools/call",
        params={
            "name": "query_database",
            "arguments": {
                "sql": "SELECT COUNT(*) as active_users FROM users WHERE status = 'active'",
                "limit": 1
            }
        },
        id=msg_id
    )
    messages.append(("CLIENT -> SERVER", tool_call))

    tool_result = MCPMessage(
        result={
            "content": [
                {
                    "type": "text",
                    "text": json.dumps([{"active_users": 12847}])
                }
            ],
            "isError": False
        },
        id=msg_id
    )
    messages.append(("SERVER -> CLIENT", tool_result))

    # --- Phase 4: Resource Read ---
    msg_id += 1
    resource_read = MCPMessage(
        method="resources/read",
        params={"uri": "db://tables/users/schema"},
        id=msg_id
    )
    messages.append(("CLIENT -> SERVER", resource_read))

    resource_response = MCPMessage(
        result={
            "contents": [
                {
                    "uri": "db://tables/users/schema",
                    "mimeType": "application/json",
                    "text": json.dumps({
                        "columns": ["id", "name", "email", "status", "created_at"],
                        "types": ["INTEGER", "TEXT", "TEXT", "TEXT", "TIMESTAMP"]
                    })
                }
            ]
        },
        id=msg_id
    )
    messages.append(("SERVER -> CLIENT", resource_response))

    # --- Phase 5: Error Handling ---
    msg_id += 1
    bad_tool_call = MCPMessage(
        method="tools/call",
        params={
            "name": "query_database",
            "arguments": {"sql": "DROP TABLE users"}  # Dangerous query
        },
        id=msg_id
    )
    messages.append(("CLIENT -> SERVER", bad_tool_call))

    error_response = MCPMessage(
        error={
            "code": -32602,
            "message": "Invalid params: Only SELECT queries are allowed",
            "data": {"attempted_query": "DROP TABLE users", "policy": "read_only"}
        },
        id=msg_id
    )
    messages.append(("SERVER -> CLIENT", error_response))

    # --- Print the complete flow ---
    print("=" * 70)
    print("MCP PROTOCOL FLOW SIMULATION")
    print(f"Timestamp: {datetime.now(timezone.utc).isoformat()}")
    print("=" * 70)

    for i, (direction, msg) in enumerate(messages, 1):
        print(f"\n--- Message {i}: {direction} ---")
        print(msg.to_json())

    print(f"\n{'=' * 70}")
    print(f"Total messages exchanged: {len(messages)}")
    print(f"Phases covered: Initialization, Discovery, Invocation, Resource Read, Error Handling")
    print(f"{'=' * 70}")


# Run the simulation
simulate_mcp_flow()

6. Authentication & Security

Security is not an afterthought in MCP — it is built into the protocol's design. Because MCP servers can access databases, filesystems, APIs, and other sensitive systems, a robust security model is essential.

6.1 Authentication Mechanisms

| Mechanism | How It Works | Best For | Complexity |
| --- | --- | --- | --- |
| API Keys | Static secret passed as environment variable or header | Local development, single-user servers | Low |
| OAuth 2.0 | Token-based flow with scopes and refresh | Multi-user, SaaS integrations, delegated access | Medium-High |
| JWT Tokens | Signed tokens with claims (user, permissions, expiry) | Stateless auth across microservices | Medium |
| mTLS | Mutual TLS with client certificates | Zero-trust environments, inter-service auth | High |
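For the simplest mechanism, API keys, a minimal server-side sketch using constant-time comparison. MCP_API_KEY is an assumed environment variable name, not something the protocol mandates:

```python
import hmac
import os

# The expected key comes from the environment, never from source code
os.environ["MCP_API_KEY"] = "demo-secret"  # demo only; deployments set this externally
EXPECTED_KEY = os.environ["MCP_API_KEY"]

def check_api_key(presented_key: str) -> bool:
    """Constant-time comparison to avoid leaking the key via timing."""
    if not EXPECTED_KEY:
        return False  # fail closed if the server is misconfigured
    return hmac.compare_digest(presented_key, EXPECTED_KEY)

print(check_api_key("demo-secret"))  # -> True
print(check_api_key("wrong-key"))    # -> False
```

hmac.compare_digest matters here: a naive == comparison can leak how many leading characters matched through response timing.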

6.2 Authorization & RBAC

Authentication verifies who you are; authorization determines what you can do. MCP supports fine-grained access control at the tool, resource, and argument level:

# MCP auth middleware: JWT validation + RBAC authorization
# Demonstrates production-grade security for MCP servers

# pip install pyjwt
import os
import jwt
import time

# Security configuration from environment
JWT_SECRET = os.getenv("MCP_JWT_SECRET", "change-me-in-production")
JWT_ALGORITHM = "HS256"

# Role-based access control matrix
# Maps roles to allowed tools and their permitted argument patterns
RBAC_POLICY = {
    "analyst": {
        "allowed_tools": ["query", "list_tables", "describe_table"],
        "restrictions": {
            "query": {
                "sql_must_start_with": "SELECT",       # Read-only
                "forbidden_tables": ["audit_logs", "credentials"]  # Sensitive tables
            }
        }
    },
    "developer": {
        "allowed_tools": ["query", "list_tables", "describe_table", "insert_record"],
        "restrictions": {
            "query": {
                "sql_must_start_with": "SELECT",
                "forbidden_tables": ["credentials"]
            },
            "insert_record": {
                "allowed_tables": ["logs", "events", "metrics"]
            }
        }
    },
    "admin": {
        "allowed_tools": ["*"],  # All tools
        "restrictions": {}       # No restrictions
    }
}


def validate_jwt_token(token: str) -> dict:
    """Validate a JWT token and extract claims."""
    try:
        payload = jwt.decode(token, JWT_SECRET, algorithms=[JWT_ALGORITHM])
        # Note: jwt.decode verifies the "exp" claim automatically and raises
        # ExpiredSignatureError (a subclass of InvalidTokenError) if expired

        return {
            "user_id": payload["sub"],
            "role": payload.get("role", "analyst"),  # Default to least privilege
            "permissions": payload.get("permissions", []),
            "issued_at": payload.get("iat"),
            "expires_at": payload.get("exp")
        }

    except jwt.InvalidTokenError as e:
        raise ValueError(f"Invalid token: {e}")


def authorize_tool_call(user_claims: dict, tool_name: str, arguments: dict) -> bool:
    """Check if a user is authorized to call a specific tool with given arguments."""
    role = user_claims["role"]

    if role not in RBAC_POLICY:
        return False  # Unknown role — deny by default

    policy = RBAC_POLICY[role]

    # Check if tool is allowed for this role
    allowed = policy["allowed_tools"]
    if "*" not in allowed and tool_name not in allowed:
        return False

    # Check tool-specific restrictions
    restrictions = policy.get("restrictions", {}).get(tool_name, {})

    if tool_name == "query":
        sql = arguments.get("sql", "").strip().upper()

        # Check SQL command restriction
        required_prefix = restrictions.get("sql_must_start_with", "")
        if required_prefix and not sql.startswith(required_prefix):
            return False

        # Check forbidden tables
        forbidden = restrictions.get("forbidden_tables", [])
        for table in forbidden:
            if table.upper() in sql:
                return False

    elif tool_name == "insert_record":
        table = arguments.get("table", "")
        allowed_tables = restrictions.get("allowed_tables", [])
        if allowed_tables and table not in allowed_tables:
            return False

    return True


def generate_sample_jwt(user_id: str, role: str, hours_valid: int = 8) -> str:
    """Generate a sample JWT token for testing."""
    now = int(time.time())
    payload = {
        "sub": user_id,
        "role": role,
        "iat": now,
        "exp": now + (hours_valid * 3600),
        "permissions": RBAC_POLICY.get(role, {}).get("allowed_tools", [])
    }
    return jwt.encode(payload, JWT_SECRET, algorithm=JWT_ALGORITHM)


# --- Demonstration ---
def demonstrate_auth():
    """Show the auth system in action."""

    # Generate tokens for different roles
    analyst_token = generate_sample_jwt("alice", "analyst")
    developer_token = generate_sample_jwt("bob", "developer")
    admin_token = generate_sample_jwt("charlie", "admin")

    print("=== MCP Auth Middleware Demonstration ===\n")

    # Test scenarios
    scenarios = [
        ("Analyst: SELECT query", analyst_token, "query",
         {"sql": "SELECT * FROM users LIMIT 10"}),
        ("Analyst: SELECT from credentials", analyst_token, "query",
         {"sql": "SELECT * FROM credentials"}),
        ("Analyst: INSERT (forbidden tool)", analyst_token, "insert_record",
         {"table": "logs", "data": {"msg": "test"}}),
        ("Developer: SELECT query", developer_token, "query",
         {"sql": "SELECT * FROM users LIMIT 10"}),
        ("Developer: INSERT into logs", developer_token, "insert_record",
         {"table": "logs", "data": {"msg": "test"}}),
        ("Developer: INSERT into users (forbidden)", developer_token, "insert_record",
         {"table": "users", "data": {"name": "hack"}}),
        ("Admin: DELETE query (all allowed)", admin_token, "query",
         {"sql": "DELETE FROM temp_data"}),
    ]

    for description, token, tool, args in scenarios:
        claims = validate_jwt_token(token)
        authorized = authorize_tool_call(claims, tool, args)
        status = "ALLOWED" if authorized else "DENIED"
        print(f"  [{status}] {description}")
        print(f"          Role: {claims['role']}, Tool: {tool}")
        if not authorized:
            print(f"          Reason: Policy violation for role '{claims['role']}'")
        print()

demonstrate_auth()

6.3 Security Best Practices

Production MCP Security Checklist

  1. Least Privilege: Each server should expose the minimum set of tools required. A file-reading server should never expose file-writing tools unless explicitly needed.
  2. Input Validation: Validate all tool arguments against their JSON Schema before execution. Reject malformed inputs at the protocol level, not the application level.
  3. Sandboxing: Run MCP servers in isolated environments (Docker containers, Firecracker VMs, or OS-level sandboxes). Limit filesystem access, network access, and system calls.
  4. Rate Limiting: Implement per-user and per-tool rate limits to prevent abuse. An agent stuck in a loop could make thousands of tool calls per minute without rate limiting.
  5. Audit Logging: Log every tool invocation with timestamp, user identity, tool name, arguments (sanitized), result status, and execution duration. These logs are essential for security forensics and compliance.
  6. Prompt Injection Mitigation: Never pass raw tool results directly into system prompts. Sanitize tool outputs to remove potential injection strings. Mark tool results as "untrusted" in the context window.
  7. Secret Management: Never embed API keys, database passwords, or other secrets in server code. Use environment variables, secret managers (HashiCorp Vault, AWS Secrets Manager), or secure key stores.
  8. TLS Everywhere: All HTTP-based MCP transports should use TLS 1.3. For internal service-to-service communication, use mTLS with short-lived certificates.
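Item 4 on the checklist, rate limiting, is commonly implemented as a per-user token bucket. A minimal sketch; the capacity and refill numbers are illustrative:

```python
import time

class TokenBucket:
    """Allow bursts up to `capacity` calls, refilling at `rate` tokens/second."""

    def __init__(self, capacity: float, rate: float):
        self.capacity = capacity
        self.tokens = capacity          # start full
        self.rate = rate
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# A real server keeps one bucket per (user, tool) pair; here, a single demo bucket
bucket = TokenBucket(capacity=3, rate=0.0)  # no refill, to show exhaustion
print([bucket.allow() for _ in range(5)])   # -> [True, True, True, False, False]
```

When allow() returns False, the server should return a structured JSON-RPC error rather than silently dropping the call, so the agent can back off.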

6.4 Data Privacy

MCP's architecture inherently supports privacy-preserving patterns because the server (which accesses data) is separate from the host (which runs the LLM). This separation enables several important privacy architectures:

  • Local-first architectures: STDIO transport keeps all data on the user's machine. The MCP server reads local files and databases; data never leaves the device (only the LLM inference call goes to the cloud).
  • Secure enclaves: MCP servers can run inside trusted execution environments (Intel SGX, AWS Nitro Enclaves) where even the server operator cannot access the data being processed.
  • Encryption at rest: Servers should encrypt any cached or persisted data using AES-256-GCM. Encryption keys should be managed via a KMS, never hardcoded.
  • Encryption in transit: All MCP transports (except STDIO, which uses OS process isolation) should use TLS 1.3 for encryption in transit. HTTP/SSE and WebSocket transports must enforce HTTPS/WSS.
  • Data minimization: Tools should return only the data the LLM needs, not entire database tables. A query for "active user count" should return the count, not all user records.
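The data-minimization point is easy to make concrete. Using a hypothetical in-memory users table, compare a tool that ships raw rows with one that returns only the aggregate the question needs:

```python
# Hypothetical in-memory "users" table, for illustration only
USERS = [
    {"id": 1, "email": "a@example.com", "status": "active"},
    {"id": 2, "email": "b@example.com", "status": "inactive"},
    {"id": 3, "email": "c@example.com", "status": "active"},
]

def count_active_users_leaky() -> list[dict]:
    # Anti-pattern: every record (emails included) lands in the LLM context
    return [u for u in USERS if u["status"] == "active"]

def count_active_users_minimal() -> dict:
    # Minimal disclosure: only the aggregate the question actually needs
    return {"active_users": sum(u["status"] == "active" for u in USERS)}

print(count_active_users_minimal())  # -> {'active_users': 2}
```

The leaky version also wastes context-window budget, so minimization serves privacy and performance at once.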
Case Study

Cursor's MCP Integration — IDE as MCP Host

Cursor, the AI-powered code editor, demonstrates a compelling MCP integration pattern. As an MCP host, Cursor connects to servers that provide:

  • Project context: An MCP server that indexes the codebase and exposes semantic code search as a resource
  • Build tools: An MCP server that wraps the project's build system (npm, cargo, make) as tools
  • Testing: An MCP server that runs test suites and returns structured results
  • Documentation: An MCP server that provides framework documentation as resources

The key insight is that Cursor's AI assistant seamlessly combines these capabilities: "Find all usages of the deprecated `authenticate()` function, run the tests to confirm they pass, then refactor each call site to use the new `verify_identity()` function." This requires reading code (resource), executing tests (tool), understanding context (sampling), and writing code (tool) — all through MCP.


Exercises & Self-Assessment

Exercise 1

Build Your First MCP Server

Create a minimal MCP server that wraps a local JSON file as both a Resource and a set of Tools:

  1. Create a JSON file with 10+ records (e.g., a product catalog, employee directory, or recipe book)
  2. Implement a Resource that returns the full dataset and a Resource that returns schema information
  3. Implement Tools: search (filter by field), get_by_id (fetch single record), add_record (append new record)
  4. Implement a Prompt template for "analyze this dataset"
  5. Test with the MCP Inspector CLI tool or connect to Claude Desktop
Exercise 2

MCP Architecture Diagram

Draw a complete architecture diagram for the following scenario and label every MCP component:

  1. A customer support application (the Host) that uses Claude as its LLM
  2. Three MCP servers: (a) Zendesk ticket system, (b) product database, (c) knowledge base with vector search
  3. Show the flow when a user asks: "What's the status of ticket #4521 and does our warranty cover the reported issue?"
  4. Label each message type (initialize, tools/list, tools/call, resources/read)
  5. Identify where authentication occurs and what type you would use for each server
Exercise 3

Security Audit

Review the following MCP server configuration and identify all security vulnerabilities:

  1. A server that exposes execute_sql with no query validation (accepts any SQL)
  2. API keys passed as tool arguments instead of environment variables
  3. No rate limiting on tool invocations
  4. Tool results returned without sanitization (could contain prompt injection payloads)
  5. For each vulnerability, write the fix and explain the attack vector it prevents
Exercise 4

Reflective Questions

  1. Why does MCP separate Hosts, Clients, and Servers into three distinct roles instead of combining them? What would break if they were merged?
  2. Compare MCP's Resources primitive to a traditional REST API GET endpoint. What does MCP add that REST does not? What does REST provide that MCP does not?
  3. Explain why Sampling (server-to-host LLM requests) is necessary. Give an example where a server cannot accomplish its task without delegating reasoning to the LLM.
  4. MCP supports four transports: STDIO, HTTP/SSE, WebSocket, gRPC. For each, describe a scenario where it is the best choice and a scenario where it is the worst choice.
  5. ChatGPT Plugins failed, but MCP succeeded. Identify three specific design decisions in MCP that address Plugin weaknesses.
Exercise 5

Transport Comparison Lab

Implement the same simple MCP server (a weather lookup tool) using two different transports:

  1. STDIO transport (for local use with Claude Desktop)
  2. HTTP/SSE transport (for remote use with a web client)
  3. Measure the latency difference between the two transports using 100 sequential tool calls
  4. Write up your findings: When is the latency difference significant? When is it negligible?

MCP Framework Comparison Document Generator

Generate a professional framework comparison document for MCP and related integration approaches. Download as Word, Excel, PDF, or PowerPoint.


Conclusion & Next Steps

You now have a thorough understanding of the Model Context Protocol — the open standard that is becoming the universal integration layer for AI applications. Here are the key takeaways from Part 13:

  • MCP solves AI integration fragmentation the same way HTTP solved web communication and USB-C solved peripheral connectivity — through a vendor-neutral, open protocol
  • The Host/Client/Server architecture separates concerns cleanly: hosts manage LLM interaction, clients manage protocol connections, servers expose capabilities, and data sources provide the underlying information
  • Four core primitives cover every type of AI-system interaction: Resources (read data), Tools (take actions), Prompts (reuse templates), and Sampling (delegate reasoning)
  • Transport agnosticism means the same server works locally (STDIO), on the web (HTTP/SSE), in real-time applications (WebSocket), and in high-performance microservices (gRPC)
  • The protocol lifecycle follows a clear sequence: initialize, discover capabilities, invoke tools/resources, handle errors — all using JSON-RPC 2.0 messages
  • Security is built into the architecture through least privilege, RBAC, JWT/OAuth authentication, input validation, sandboxing, audit logging, and prompt injection mitigation
  • MCP compares favorably with alternatives (OpenAI function calling, LangChain tools, ChatGPT Plugins) on the dimensions that matter most in production: vendor neutrality, modularity, interoperability, and security

Next in the Series

In Part 14: MCP in Production, we will take everything from this foundational chapter and apply it at scale — building production-grade MCP servers, integrating with real-world APIs and databases, implementing observability and monitoring, scaling MCP architectures, and building complete agent systems that combine multiple MCP servers into powerful autonomous workflows.
