
AI Application Development Mastery Part 13: MCP Foundations & Architecture

April 1, 2026 · Wasil Zafar · 48 min read

The Model Context Protocol (MCP) is the open standard that finally solves AI integration fragmentation. Master the Host/Client/Server architecture, core primitives (Resources, Tools, Prompts, Sampling), protocol lifecycle, authentication patterns, and build production-ready MCP servers and clients from scratch.

Table of Contents

  1. What MCP Solves & Why It Matters
  2. Architecture Deep Dive
  3. Core MCP Primitives
  4. The FastMCP Python SDK
  5. Protocol Flow & Lifecycle
  6. Authentication & Security
  7. Exercises & Self-Assessment
  8. MCP Framework Comparison Generator
  9. Conclusion & Next Steps

Introduction: The Universal Protocol for AI Integration

Series Overview: This is Part 13 of our 20-part AI Application Development Mastery series. We now enter the MCP chapters — the open protocol that standardizes how LLMs connect to tools, data, and the outside world. This part covers foundations and architecture; Part 14 takes MCP into production.

AI Application Development Mastery

Your 20-step learning path • Currently on Step 13
  1. Foundations & Evolution of AI Apps: Pre-LLM era, transformers, LLM revolution
  2. LLM Fundamentals for Developers: Tokens, context windows, sampling, API patterns
  3. Prompt Engineering Mastery: Zero/few-shot, CoT, ReAct, structured outputs
  4. LangChain Core Concepts: Chains, prompts, LLMs, tools, LCEL
  5. Retrieval-Augmented Generation (RAG): Embeddings, vector DBs, retrievers, RAG pipelines
  6. Memory & Context Engineering: Buffer/summary/vector memory, chunking, re-ranking
  7. Agents — Core of Modern AI Apps: ReAct, tool-calling, planner-executor agents
  8. LangGraph — Stateful Agent Workflows: Nodes, edges, state, graph execution, cycles
  9. Deep Agents & Autonomous Systems: Multi-step reasoning, self-reflection, planning
  10. Multi-Agent Systems: Supervisor, swarm, debate, role-based collaboration
  11. AI Application Design Patterns: RAG, chat+memory, workflow automation, agent loops
  12. Ecosystem & Frameworks: LlamaIndex, Haystack, HuggingFace, vLLM
  13. MCP Foundations & Architecture: Protocol design, Host/Client/Server, primitives, security (You Are Here)
  14. MCP in Production: Building servers, integrations, scaling, agent systems
  15. Evaluation & LLMOps: Prompt eval, tracing, LangSmith, experiment tracking
  16. Production AI Systems: APIs, queues, caching, streaming, scaling
  17. Safety, Guardrails & Reliability: Input filtering, hallucination mitigation, prompt injection
  18. Advanced Topics: Fine-tuning, tool learning, hybrid LLM+symbolic
  19. Building Real AI Applications: Chatbot, document QA, coding assistant, full-stack
  20. Future of AI Applications: Autonomous agents, self-improving, multi-modal, AI OS
Every generation of computing has required a universal integration protocol to unlock its full potential. The web needed HTTP to connect browsers to servers. Mobile needed REST APIs to connect apps to backends. Peripherals needed USB to connect devices to computers. Now, the age of AI agents needs a protocol to connect LLMs to the rest of the world.

That protocol is the Model Context Protocol (MCP) — an open standard originally developed by Anthropic and now adopted across the industry. MCP defines a structured, vendor-neutral way for AI applications to discover tools, access data, execute actions, and interact with external systems through a clean client-server architecture.

Before MCP, every AI integration was a bespoke engineering effort. Connecting Claude to a database required different code than connecting GPT-4 to the same database. Every framework had its own tool definition format, its own transport mechanism, its own error handling. The result was a fragmented ecosystem where integration work consumed more engineering time than building actual AI capabilities.

Key Insight: MCP is not a replacement for LangChain, LangGraph, or any orchestration framework. It operates at a different layer entirely — it standardizes the interface between AI applications and external capabilities. Think of MCP as the protocol layer that orchestration frameworks build on top of, just as HTTP is the protocol that web frameworks build on top of.

1. What MCP Solves & Why It Matters

Every AI application that interacts with external data or tools faces the same integration challenge: building and maintaining custom connectors for each combination of AI model and external service. The Model Context Protocol (MCP) solves this with a universal, open standard — analogous to how USB standardized hardware peripherals. Instead of N×M custom integrations, MCP provides a single protocol that any AI client can use to connect to any MCP-compatible server, dramatically reducing integration complexity.

1.1 The Integration Problem

Consider the landscape before MCP. You want your AI agent to access a database, read files, call a REST API, and search the web. Here is what that looked like:

# BEFORE MCP: Every integration is custom, vendor-specific, and fragile
# Each tool requires its own implementation for each LLM provider

# pip install openai anthropic langchain langchain-community
import os
import json
import sqlite3
import requests

# --- OpenAI function calling (vendor-specific format) ---
openai_tools = [
    {
        "type": "function",
        "function": {
            "name": "query_database",
            "description": "Run a SQL query against the customer database",
            "parameters": {
                "type": "object",
                "properties": {
                    "sql": {"type": "string", "description": "SQL query to execute"},
                    "database": {"type": "string", "description": "Database name"}
                },
                "required": ["sql"]
            }
        }
    }
]

# --- Anthropic tool calling (different format for the same tool) ---
anthropic_tools = [
    {
        "name": "query_database",
        "description": "Run a SQL query against the customer database",
        "input_schema": {
            "type": "object",
            "properties": {
                "sql": {"type": "string", "description": "SQL query to execute"},
                "database": {"type": "string", "description": "Database name"}
            },
            "required": ["sql"]
        }
    }
]

# --- LangChain tool definition (yet another format) ---
from langchain.tools import tool

@tool
def query_database(sql: str, database: str = "customers") -> str:
    """Run a SQL query against the customer database."""
    conn = sqlite3.connect(database)
    cursor = conn.cursor()
    cursor.execute(sql)
    results = cursor.fetchall()
    conn.close()
    return json.dumps(results)

# The SAME tool defined THREE different ways for THREE different systems
# Multiply this by 50 tools and 5 LLM providers = 250 definitions to maintain
# Change the tool schema? Update it in all 250 places.

This fragmentation creates cascading problems: vendor lock-in (your tools only work with one provider), maintenance overhead (N tools × M providers = N×M implementations), no interoperability (tools built for Claude cannot be used with GPT-4), and zero standardization (every integration is a snowflake).
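The N×M arithmetic is worth making concrete. A quick sketch (the tool and provider counts below are illustrative, matching the example in the code comments above):

```python
# Without a shared protocol, every tool needs a bespoke adapter per provider.
tools = 50        # illustrative number of tools
providers = 5     # illustrative number of LLM providers/frameworks

without_mcp = tools * providers   # one definition per (tool, provider) pair
with_mcp = tools + providers      # one MCP server per tool, one client per host

print(without_mcp)  # 250 definitions to write and keep in sync
print(with_mcp)     # 55 components, each written once
```

Changing a tool's schema under the first model means touching every provider-specific copy; under MCP it means updating one server.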

1.2 HTTP for AI Agents — The USB-C Analogy

The best way to understand MCP's role is through analogies to protocols that solved similar fragmentation problems in other domains:

| Domain | Before Standardization | After Standardization | Protocol |
|---|---|---|---|
| Web | Custom protocols per service (Gopher, FTP, WAIS) | Universal browser-to-server communication | HTTP/HTTPS |
| Peripherals | Serial, parallel, PS/2, FireWire, proprietary ports | One connector for everything | USB / USB-C |
| Databases | Vendor-specific query languages per database | Universal query language across all databases | SQL / ODBC |
| APIs | SOAP, XML-RPC, custom binary protocols | Uniform resource-based API design | REST / OpenAPI |
| AI Agents | Custom tool formats per provider (OpenAI, Anthropic, LangChain) | Universal tool/resource/prompt protocol | MCP |

The USB-C Moment: Before USB-C, you needed different cables for your phone, laptop, headphones, and monitor. MCP is the USB-C moment for AI integrations — one protocol that lets any AI host connect to any capability server. Build an MCP server once, and it works with Claude Desktop, Cursor, Windsurf, VS Code, and any future MCP-compatible host.

1.3 MCP vs Alternatives — Comprehensive Comparison

MCP did not emerge in a vacuum. Several approaches to LLM-tool integration existed before it. Here is how they compare across the dimensions that matter most for production systems:

| Criterion | OpenAI Function Calling | ChatGPT Plugins | LangChain Tools | AutoGen Tools | MCP |
|---|---|---|---|---|---|
| Vendor Lock-in | High — OpenAI only | Total — ChatGPT only | Medium — LangChain ecosystem | Medium — AutoGen ecosystem | None — vendor-neutral open standard |
| Modularity | Low — tools embedded in API call | Low — monolithic plugin manifest | Medium — Python decorators | Medium — function registration | High — decoupled server per capability |
| Interoperability | None across providers | None — deprecated by OpenAI | Within LangChain only | Within AutoGen only | Universal — any host to any server |
| Standardization | De facto standard for OpenAI | Abandoned (2024) | Community conventions | Microsoft conventions | Open specification with formal schema |
| Transport Options | HTTPS only | HTTPS only | In-process Python | In-process Python | STDIO, HTTP/SSE, WebSocket, gRPC |
| Capability Discovery | None — tools hardcoded | Static manifest file | Runtime introspection | Runtime introspection | Dynamic discovery protocol |
| Primitives | Tools only | Tools + auth | Tools + retrievers | Tools + code exec | Resources + Tools + Prompts + Sampling |
| Security Model | API key per call | OAuth (limited) | Application-level | Application-level | OAuth 2.0, JWT, mTLS, RBAC, sandboxing |

Why ChatGPT Plugins Failed: OpenAI launched ChatGPT Plugins in March 2023 with great fanfare, then quietly deprecated them by early 2024. The fundamental flaw was centralization — plugins had to be approved by OpenAI, hosted on specific infrastructure, and only worked within ChatGPT. MCP learned from this failure by being fully open, decentralized, and host-agnostic.

1.4 Core Design Principles

MCP was designed around five principles that distinguish it from every prior approach to AI-tool integration:

Design Principles

The Five Pillars of MCP Design

  1. Separation of Concerns: Hosts manage LLM interaction and UI. Clients manage protocol connections. Servers expose capabilities. Each component has a single responsibility and can be developed, deployed, and scaled independently.
  2. Composability: An agent can connect to multiple MCP servers simultaneously — a database server, a web search server, a file system server — and the host orchestrates them seamlessly. Capabilities compose like UNIX pipes.
  3. Least Privilege: Each MCP server declares exactly what capabilities it exposes, and clients can restrict which capabilities they request. A file-reading server never needs database write access. Permissions are granular and explicit.
  4. Deterministic Tool Interfaces: Every tool has a JSON Schema definition that specifies its inputs and outputs precisely. There is no ambiguity about what a tool expects or returns. The LLM sees the schema and can generate valid invocations reliably.
  5. Transport Agnosticism: MCP works over STDIO (for local processes), HTTP with Server-Sent Events (for web services), WebSocket (for bidirectional streaming), and gRPC (for high-performance). The protocol is the same regardless of transport.
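Principle 4 can be seen in miniature: given a tool's JSON Schema, a host can check an LLM-generated invocation before ever executing it. A minimal sketch (a hand-rolled checker for illustration; a real host would use a full JSON Schema validator):

```python
# Minimal structural check of a tool call against its declared inputSchema.
# Hand-rolled for illustration; production hosts use a complete JSON Schema library.
schema = {
    "type": "object",
    "properties": {
        "sql": {"type": "string"},
        "database": {"type": "string"},
    },
    "required": ["sql"],
}

TYPE_MAP = {"string": str, "integer": int, "object": dict, "boolean": bool}

def validate_call(arguments: dict, schema: dict) -> list:
    """Return a list of validation errors; an empty list means the call is valid."""
    errors = []
    for field in schema.get("required", []):
        if field not in arguments:
            errors.append(f"missing required field: {field}")
    for field, value in arguments.items():
        prop = schema["properties"].get(field)
        if prop is None:
            errors.append(f"unexpected field: {field}")
        elif not isinstance(value, TYPE_MAP[prop["type"]]):
            errors.append(f"wrong type for {field}: expected {prop['type']}")
    return errors

print(validate_call({"sql": "SELECT 1"}, schema))        # -> []
print(validate_call({"database": "customers"}, schema))  # -> ['missing required field: sql']
```

Because the schema is deterministic, an invalid invocation is rejected before it reaches the server, and the error can be fed back to the LLM for correction.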
# AFTER MCP: Define a tool ONCE, use it everywhere
# The same MCP server works with Claude, GPT-4, Gemini, Llama, any host

# pip install mcp
from mcp.server import Server
from mcp.types import Tool, TextContent
import json
import sqlite3

# Create an MCP server — one tool definition, universal compatibility
server = Server("database-server")

@server.list_tools()
async def list_tools():
    """Declare available tools via the MCP discovery protocol."""
    return [
        Tool(
            name="query_database",
            description="Run a read-only SQL query against the customer database",
            inputSchema={
                "type": "object",
                "properties": {
                    "sql": {
                        "type": "string",
                        "description": "SQL SELECT query to execute"
                    },
                    "database": {
                        "type": "string",
                        "description": "Database name",
                        "default": "customers.db"
                    }
                },
                "required": ["sql"]
            }
        )
    ]

@server.call_tool()
async def call_tool(name: str, arguments: dict):
    """Execute a tool invocation from any MCP-compatible host."""
    if name == "query_database":
        sql = arguments["sql"]
        database = arguments.get("database", "customers.db")

        # Security: only allow SELECT queries
        if not sql.strip().upper().startswith("SELECT"):
            return [TextContent(
                type="text",
                text="Error: Only SELECT queries are allowed for safety."
            )]

        conn = sqlite3.connect(database)
        cursor = conn.cursor()
        cursor.execute(sql)
        results = cursor.fetchall()
        columns = [desc[0] for desc in cursor.description]
        conn.close()

        # Return structured results
        formatted = [dict(zip(columns, row)) for row in results]
        return [TextContent(type="text", text=json.dumps(formatted, indent=2))]

    raise ValueError(f"Unknown tool: {name}")

# This SINGLE server definition works with:
# - Claude Desktop (via STDIO transport)
# - Cursor IDE (via STDIO transport)
# - Windsurf (via STDIO transport)
# - Any custom MCP host (via HTTP/SSE or WebSocket)
# - Future AI hosts that adopt MCP
# Define once. Run everywhere.
History

How MCP Evolved from Anthropic's Tool-Use Experience

MCP grew out of Anthropic's internal experience building Claude's tool-use capabilities. When Anthropic launched Claude's tool calling in 2024, they encountered the same integration fragmentation that plagued the entire industry. Every enterprise customer had to write custom glue code to connect Claude to their systems.

Anthropic recognized that this was not a Claude-specific problem — it was an industry problem. In late 2024, they open-sourced MCP as a vendor-neutral specification, explicitly designing it so that competitors could adopt it. By early 2025, Cursor, Windsurf, VS Code, Replit, and dozens of other tools had adopted the protocol, validating the design. By 2026, MCP has become the de facto standard for AI-tool integration, with over 3,000 community-built MCP servers covering databases, APIs, developer tools, business applications, and more.


2. Architecture Deep Dive

MCP follows a client-server architecture built on JSON-RPC 2.0, with clearly defined roles: hosts (AI applications like Claude Desktop or VS Code), clients (protocol connectors that maintain 1:1 server connections), and servers (lightweight services exposing tools, resources, and prompts). This layered design enables secure, composable integrations where each server is sandboxed and the host controls which capabilities are exposed to the AI model.

2.1 System Overview

MCP follows a layered architecture with clear separation between four components. Every MCP interaction flows through this chain:

# MCP Architecture Overview
#
#  +------------------+     +------------------+     +------------------+     +------------------+
#  |                  |     |                  |     |                  |     |                  |
#  |    MCP HOST      |<--->|    MCP CLIENT    |<--->|    MCP SERVER    |<--->|   DATA / APIs    |
#  |                  |     |                  |     |                  |     |                  |
#  |  Claude Desktop  |     |  Protocol Layer  |     |  Capability      |     |  Databases       |
#  |  Cursor IDE      |     |  Session Mgmt    |     |  Provider        |     |  REST APIs       |
#  |  Windsurf        |     |  Retry Logic     |     |  Tools           |     |  File Systems    |
#  |  Custom App      |     |  Transport       |     |  Resources       |     |  SaaS Services   |
#  |                  |     |                  |     |  Prompts         |     |  Vector DBs      |
#  +------------------+     +------------------+     +------------------+     +------------------+
#
#  A single HOST manages multiple CLIENTs.
#  Each CLIENT connects to exactly ONE SERVER.
#  Each SERVER exposes capabilities from one or more DATA sources.
#
#  Example: Claude Desktop (host) manages 5 clients, each connected to
#  a different server: filesystem, database, GitHub, Slack, web-search.
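On the wire, every hop in this chain speaks JSON-RPC 2.0. As a sketch, here is roughly what one tool invocation looks like as a request/response pair (the method name follows the MCP spec; the payload values are illustrative), built as Python dicts:

```python
import json

# JSON-RPC 2.0 request a client sends to invoke a tool (illustrative payload)
request = {
    "jsonrpc": "2.0",
    "id": 42,
    "method": "tools/call",
    "params": {
        "name": "query_database",
        "arguments": {"sql": "SELECT name FROM customers LIMIT 5"},
    },
}

# Matching server response: the same id, with result carrying content blocks
response = {
    "jsonrpc": "2.0",
    "id": 42,
    "result": {
        "content": [{"type": "text", "text": "[{\"name\": \"Ada\"}]"}],
    },
}

print(json.dumps(request, indent=2))
```

The `id` field correlates responses to requests, which is what lets a client multiplex many in-flight calls over a single transport.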

2.2 MCP Hosts

An MCP Host is the application that the user interacts with directly. It manages the LLM, renders the UI, and orchestrates one or more MCP clients. The host is responsible for the overall user experience and for deciding how to route capabilities from multiple servers.

| Host | Application Type | MCP Integration | Notable Features |
|---|---|---|---|
| Claude Desktop | Desktop App | Native MCP support via STDIO | First MCP host; JSON config for server registration; auto-starts servers |
| Cursor | IDE | MCP servers for code tools | Integrates MCP tools into code completion and chat; project-level config |
| Windsurf | IDE | MCP for IDE extensions | Cascade agent uses MCP tools; supports multi-server configurations |
| VS Code + Copilot | IDE Extension | MCP via extension API | GitHub Copilot Chat integrates MCP servers for workspace context |
| Custom Applications | Any app | MCP SDK integration | Build your own host using the mcp Python or TypeScript SDK |
// Claude Desktop MCP configuration (claude_desktop_config.json in the app's config directory)
// This tells the host which MCP servers to launch and how to connect
{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/Users/dev/projects"],
      "env": {}
    },
    "database": {
      "command": "python",
      "args": ["-m", "mcp_server_sqlite", "--db-path", "/Users/dev/data/app.db"],
      "env": {}
    },
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": {
        "GITHUB_TOKEN": "ghp_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
      }
    },
    "web-search": {
      "command": "python",
      "args": ["-m", "mcp_server_brave_search"],
      "env": {
        "BRAVE_API_KEY": "BSAxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
      }
    }
  }
}
Host Responsibilities: The host manages the LLM conversation loop, presents discovered tools to the LLM, routes tool calls to the appropriate client/server, handles user consent for sensitive operations, and aggregates results back into the conversation context. The host is the orchestrator — it never executes tools directly.
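The routing described above can be sketched as a minimal host-side dispatcher. Everything here (the client registry, the tool table, the stand-in client class) is illustrative, not the real SDK API:

```python
# Minimal sketch of host-side tool routing: the host owns a tool -> client map
# and forwards each model-issued tool call to the client that registered it.
class FakeClient:
    """Stand-in for an MCP client; a real one would proxy to a server process."""
    def __init__(self, name, tools):
        self.name = name
        self.tools = tools  # {tool_name: handler}

    def call_tool(self, tool, arguments):
        return self.tools[tool](**arguments)

class Host:
    def __init__(self):
        self.routes = {}  # tool name -> owning client

    def register(self, client):
        # Aggregate discovery results: every tool maps back to its client
        for tool in client.tools:
            self.routes[tool] = client

    def dispatch(self, tool, arguments):
        client = self.routes.get(tool)
        if client is None:
            raise ValueError(f"No server exposes tool: {tool}")
        return client.call_tool(tool, arguments)

host = Host()
host.register(FakeClient("fs", {"read_file": lambda path: f"<contents of {path}>"}))
host.register(FakeClient("db", {"query": lambda sql: "[]"}))

print(host.dispatch("read_file", {"path": "notes.txt"}))  # routed to the fs client
```

The key property: the LLM sees one flat tool list, while the host keeps each capability isolated behind its own client connection.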

2.3 MCP Clients

An MCP Client is the protocol layer that sits between the host and a server. Each client manages a single connection to a single server. The host creates one client per server it needs to communicate with.

# Building an MCP client that connects to a server
# This demonstrates the client lifecycle: connect, discover, invoke, disconnect

# pip install mcp
import asyncio
import os
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def run_mcp_client():
    """Demonstrate the full MCP client lifecycle."""

    # Step 1: Define server connection parameters
    # STDIO transport — the server runs as a child process
    server_params = StdioServerParameters(
        command="python",                          # Command to launch the server
        args=["-m", "mcp_server_sqlite",           # Server module
              "--db-path", "customers.db"],         # Server-specific arguments
        env={                                      # Environment variables
            "PATH": os.getenv("PATH", ""),
            "LOG_LEVEL": "INFO"
        }
    )

    # Step 2: Connect to the server via STDIO transport
    async with stdio_client(server_params) as (read_stream, write_stream):
        async with ClientSession(read_stream, write_stream) as session:

            # Step 3: Initialize the session (protocol handshake)
            await session.initialize()
            print("Session initialized successfully")

            # Step 4: Discover available capabilities
            # List all tools the server exposes
            tools_response = await session.list_tools()
            print(f"\nAvailable tools ({len(tools_response.tools)}):")
            for tool in tools_response.tools:
                print(f"  - {tool.name}: {tool.description}")
                print(f"    Schema: {tool.inputSchema}")

            # List all resources the server exposes
            resources_response = await session.list_resources()
            print(f"\nAvailable resources ({len(resources_response.resources)}):")
            for resource in resources_response.resources:
                print(f"  - {resource.uri}: {resource.name}")

            # List all prompt templates
            prompts_response = await session.list_prompts()
            print(f"\nAvailable prompts ({len(prompts_response.prompts)}):")
            for prompt in prompts_response.prompts:
                print(f"  - {prompt.name}: {prompt.description}")

            # Step 5: Invoke a tool
            result = await session.call_tool(
                "query_database",
                arguments={"sql": "SELECT name, email FROM customers LIMIT 5"}
            )
            print(f"\nTool result: {result.content[0].text}")

            # Step 6: Read a resource
            resource_content = await session.read_resource("sqlite:///customers/schema")
            print(f"\nResource content: {resource_content.contents[0].text}")

            # Step 7: Get a prompt template
            prompt_result = await session.get_prompt(
                "analyze-table",
                arguments={"table_name": "customers"}
            )
            print(f"\nPrompt template: {prompt_result.messages[0].content.text}")

    print("\nSession closed. Client disconnected.")

# Run the client
# asyncio.run(run_mcp_client())

Key client responsibilities include:

  • Connection management: Establishing, maintaining, and gracefully closing connections to servers
  • Session handling: Managing the protocol handshake (Initialize), capability negotiation, and session state
  • Streaming: Handling streamed responses for long-running operations via Server-Sent Events or WebSocket
  • Retry and fault tolerance: Implementing exponential backoff, connection pooling, and circuit-breaker patterns for unreliable servers
  • Message serialization: Converting between the host's internal format and MCP's JSON-RPC message format
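The retry bullet above can be sketched as a generic async wrapper. The backoff constants and the `flaky_call` stand-in are illustrative, not part of the MCP SDK:

```python
import asyncio
import random

async def with_retries(coro_factory, max_attempts=4, base_delay=0.1):
    """Retry an async operation with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await coro_factory()
        except ConnectionError:
            if attempt == max_attempts:
                raise  # give up; surface the failure to the host
            delay = base_delay * (2 ** (attempt - 1)) * (1 + random.random() * 0.1)
            await asyncio.sleep(delay)

# Stand-in for a session.call_tool(...) against an unreliable server:
# fails twice, then succeeds.
attempts = 0
async def flaky_call():
    global attempts
    attempts += 1
    if attempts < 3:
        raise ConnectionError("server dropped the connection")
    return "tool result"

print(asyncio.run(with_retries(flaky_call)))  # succeeds on the third attempt
```

A production client would layer a circuit breaker on top, so a persistently failing server is taken out of rotation instead of retried forever.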

2.4 MCP Servers

An MCP Server is the component that exposes actual capabilities — tools, resources, and prompts — to clients. Each server is a focused, single-purpose service that wraps one domain (a database, an API, a file system) in the MCP protocol.

# Complete MCP server implementation with all four primitive types
# This server provides database access with full MCP capability exposure

# pip install mcp aiosqlite
import json
import os
import asyncio
import aiosqlite
from mcp.server import Server
from mcp.types import (
    Tool, Resource, Prompt, PromptMessage,
    TextContent, PromptArgument, ResourceTemplate
)

# Database path from environment variable
DB_PATH = os.getenv("MCP_DB_PATH", "app.db")

# Create the MCP server instance
server = Server("enterprise-database-server")

# --- TOOLS: Actions the LLM can execute ---
@server.list_tools()
async def list_tools():
    """Expose database query and write tools."""
    return [
        Tool(
            name="query",
            description="Execute a read-only SQL SELECT query",
            inputSchema={
                "type": "object",
                "properties": {
                    "sql": {"type": "string", "description": "SQL SELECT query"},
                    "limit": {"type": "integer", "description": "Max rows", "default": 100}
                },
                "required": ["sql"]
            }
        ),
        Tool(
            name="insert_record",
            description="Insert a new record into a table",
            inputSchema={
                "type": "object",
                "properties": {
                    "table": {"type": "string", "description": "Target table name"},
                    "data": {
                        "type": "object",
                        "description": "Column-value pairs to insert",
                        "additionalProperties": True
                    }
                },
                "required": ["table", "data"]
            }
        )
    ]

@server.call_tool()
async def call_tool(name: str, arguments: dict):
    """Handle tool invocations with safety checks."""
    if name == "query":
        sql = arguments["sql"].strip()
        limit = arguments.get("limit", 100)

        # Security: only allow SELECT statements
        if not sql.upper().startswith("SELECT"):
            return [TextContent(type="text", text="Error: Only SELECT queries allowed.")]

        # Enforce row limit
        if "LIMIT" not in sql.upper():
            sql = f"{sql} LIMIT {limit}"

        async with aiosqlite.connect(DB_PATH) as db:
            db.row_factory = aiosqlite.Row
            cursor = await db.execute(sql)
            rows = await cursor.fetchall()
            columns = [d[0] for d in cursor.description]
            results = [dict(zip(columns, row)) for row in rows]

        return [TextContent(type="text", text=json.dumps(results, indent=2, default=str))]

    elif name == "insert_record":
        table = arguments["table"]
        data = arguments["data"]

        # Security: validate table name (prevent SQL injection)
        if not table.isalnum():
            return [TextContent(type="text", text="Error: Invalid table name.")]

        columns = ", ".join(data.keys())
        placeholders = ", ".join(["?"] * len(data))
        values = list(data.values())

        async with aiosqlite.connect(DB_PATH) as db:
            await db.execute(
                f"INSERT INTO {table} ({columns}) VALUES ({placeholders})",
                values
            )
            await db.commit()

        return [TextContent(type="text", text=f"Successfully inserted record into {table}.")]

    raise ValueError(f"Unknown tool: {name}")

# --- RESOURCES: Data the LLM can read ---
@server.list_resources()
async def list_resources():
    """Expose database schema and table data as readable resources."""
    resources = [
        Resource(
            uri="db://schema",
            name="Database Schema",
            description="Complete schema of all tables in the database",
            mimeType="application/json"
        )
    ]

    # Dynamically list all tables as resources
    async with aiosqlite.connect(DB_PATH) as db:
        cursor = await db.execute(
            "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
        )
        tables = await cursor.fetchall()
        for (table_name,) in tables:
            resources.append(Resource(
                uri=f"db://tables/{table_name}",
                name=f"Table: {table_name}",
                description=f"Sample data and schema for the {table_name} table",
                mimeType="application/json"
            ))

    return resources

@server.read_resource()
async def read_resource(uri: str):
    """Return resource content for a given URI."""
    if uri == "db://schema":
        async with aiosqlite.connect(DB_PATH) as db:
            cursor = await db.execute(
                "SELECT sql FROM sqlite_master WHERE type='table'"
            )
            schemas = await cursor.fetchall()
            return json.dumps([s[0] for s in schemas], indent=2)

    if uri.startswith("db://tables/"):
        table_name = uri.split("/")[-1]
        if not table_name.isalnum():
            return "Error: Invalid table name"

        async with aiosqlite.connect(DB_PATH) as db:
            # Return schema + sample rows
            cursor = await db.execute(f"PRAGMA table_info({table_name})")
            columns = await cursor.fetchall()
            cursor = await db.execute(f"SELECT * FROM {table_name} LIMIT 10")
            rows = await cursor.fetchall()
            col_names = [c[1] for c in columns]

            return json.dumps({
                "table": table_name,
                "columns": [{"name": c[1], "type": c[2], "nullable": not c[3]} for c in columns],
                "sample_rows": [dict(zip(col_names, row)) for row in rows],
                "row_count": len(rows)
            }, indent=2, default=str)

    raise ValueError(f"Unknown resource URI: {uri}")

# --- PROMPTS: Reusable templates for the LLM ---
@server.list_prompts()
async def list_prompts():
    """Expose reusable prompt templates."""
    return [
        Prompt(
            name="analyze-table",
            description="Generate a comprehensive analysis prompt for a database table",
            arguments=[
                PromptArgument(
                    name="table_name",
                    description="Name of the table to analyze",
                    required=True
                ),
                PromptArgument(
                    name="focus",
                    description="Analysis focus: 'quality', 'patterns', or 'summary'",
                    required=False
                )
            ]
        ),
        Prompt(
            name="write-query",
            description="Generate a prompt to help write a SQL query for a specific question",
            arguments=[
                PromptArgument(
                    name="question",
                    description="Natural language question to answer with SQL",
                    required=True
                )
            ]
        )
    ]

@server.get_prompt()
async def get_prompt(name: str, arguments: dict):
    """Return a populated prompt template."""
    if name == "analyze-table":
        table_name = arguments["table_name"]
        focus = arguments.get("focus", "summary")

        # Fetch schema for context
        async with aiosqlite.connect(DB_PATH) as db:
            cursor = await db.execute(f"PRAGMA table_info({table_name})")
            columns = await cursor.fetchall()
            schema_info = ", ".join([f"{c[1]} ({c[2]})" for c in columns])

        return {
            "messages": [
                PromptMessage(
                    role="user",
                    content=TextContent(
                        type="text",
                        text=f"Analyze the '{table_name}' table with focus on {focus}.\n\n"
                             f"Schema: {schema_info}\n\n"
                             f"Please provide:\n"
                             f"1. Data quality assessment\n"
                             f"2. Key patterns and distributions\n"
                             f"3. Potential issues or anomalies\n"
                             f"4. Recommended queries for deeper analysis"
                    )
                )
            ]
        }

    elif name == "write-query":
        question = arguments["question"]

        # Fetch full schema for context
        async with aiosqlite.connect(DB_PATH) as db:
            cursor = await db.execute(
                "SELECT sql FROM sqlite_master WHERE type='table'"
            )
            schemas = await cursor.fetchall()
            schema_text = "\n".join([s[0] for s in schemas if s[0]])

        return {
            "messages": [
                PromptMessage(
                    role="user",
                    content=TextContent(
                        type="text",
                        text=f"Write a SQL query to answer: {question}\n\n"
                             f"Database schema:\n{schema_text}\n\n"
                             f"Requirements:\n"
                             f"- Use only SELECT statements\n"
                             f"- Include appropriate JOINs if needed\n"
                             f"- Add LIMIT clause for safety\n"
                             f"- Explain the query logic"
                    )
                )
            ]
        }

    raise ValueError(f"Unknown prompt: {name}")

Server design considerations for production:

  • Stateless vs Stateful: Prefer stateless servers where possible. State should live in the data layer (database, cache), not the server process. This enables horizontal scaling and fault tolerance.
  • Latency: Tool invocations should complete within 5 seconds for interactive use. For long-running operations, use streaming responses to provide progress updates.
  • Observability: Instrument every tool call with structured logging, request tracing (correlation IDs), and metrics (latency histograms, error rates).
  • Idempotency: Write operations should be idempotent when possible. If the client retries a failed insert, it should not create duplicate records.
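
The observability and idempotency points above can be combined in a small instrumentation wrapper. A minimal sketch, assuming structured logging to stdout; the decorator and tool names are illustrative, not part of the MCP SDK:

```python
import functools
import logging
import time
import uuid

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("mcp.tools")

def instrumented(func):
    """Wrap an async tool handler with a correlation ID and latency logging."""
    @functools.wraps(func)
    async def wrapper(*args, **kwargs):
        correlation_id = uuid.uuid4().hex[:12]
        start = time.perf_counter()
        try:
            result = await func(*args, **kwargs)
            status = "ok"
            return result
        except Exception:
            status = "error"
            raise
        finally:
            # One structured log line per invocation feeds latency histograms
            latency_ms = (time.perf_counter() - start) * 1000
            logger.info(
                "tool=%s correlation_id=%s status=%s latency_ms=%.1f",
                func.__name__, correlation_id, status, latency_ms,
            )
    return wrapper

# Usage: decorate any async tool handler (hypothetical example tool)
@instrumented
async def lookup_user(user_id: str) -> dict:
    return {"user_id": user_id, "name": "example"}
```

The correlation ID can be propagated into downstream HTTP calls so a single tool invocation is traceable across services.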

2.5 Data Layer Integration

MCP servers bridge the gap between AI agents and the data they need. The data layer spans both local and remote sources:

| Data Source | Type | MCP Integration Pattern | Example Server |
|---|---|---|---|
| Local Filesystem | Local | Resources for reading, Tools for writing | @modelcontextprotocol/server-filesystem |
| SQLite / PostgreSQL | Local / Remote | Resources for schema, Tools for queries | mcp-server-sqlite, mcp-server-postgres |
| REST / GraphQL APIs | Remote | Tools that wrap HTTP calls | Custom server per API |
| SaaS Platforms | Remote | Tools for CRUD operations on SaaS entities | mcp-server-github, mcp-server-slack |
| Vector Databases | Local / Remote | Resources for similarity search, Tools for indexing | Custom server wrapping Chroma, Pinecone, Qdrant |
| Knowledge Graphs | Remote | Resources for traversal, Tools for queries | Custom server wrapping Neo4j, Amazon Neptune |
# MCP server wrapping a vector database for semantic search
# Demonstrates the Resources pattern for RAG-style retrieval

# pip install mcp chromadb sentence-transformers
import os
import json
import chromadb
from mcp.server import Server
from mcp.types import Tool, Resource, TextContent

# Initialize ChromaDB with persistent storage
CHROMA_PATH = os.getenv("CHROMA_DB_PATH", "./chroma_data")
chroma_client = chromadb.PersistentClient(path=CHROMA_PATH)

server = Server("vector-search-server")

@server.list_tools()
async def list_tools():
    """Expose semantic search and indexing tools."""
    return [
        Tool(
            name="semantic_search",
            description="Search the knowledge base using natural language",
            inputSchema={
                "type": "object",
                "properties": {
                    "query": {"type": "string", "description": "Natural language search query"},
                    "collection": {"type": "string", "description": "Collection name", "default": "documents"},
                    "top_k": {"type": "integer", "description": "Number of results", "default": 5}
                },
                "required": ["query"]
            }
        ),
        Tool(
            name="index_document",
            description="Add a document to the knowledge base",
            inputSchema={
                "type": "object",
                "properties": {
                    "text": {"type": "string", "description": "Document text to index"},
                    "metadata": {"type": "object", "description": "Document metadata"},
                    "collection": {"type": "string", "description": "Target collection", "default": "documents"}
                },
                "required": ["text"]
            }
        )
    ]

@server.call_tool()
async def call_tool(name: str, arguments: dict):
    """Handle vector search and indexing operations."""
    if name == "semantic_search":
        query = arguments["query"]
        collection_name = arguments.get("collection", "documents")
        top_k = arguments.get("top_k", 5)

        collection = chroma_client.get_or_create_collection(collection_name)
        results = collection.query(query_texts=[query], n_results=top_k)

        # Format results with metadata and relevance scores
        formatted = []
        for i, (doc, meta, dist) in enumerate(zip(
            results["documents"][0],
            results["metadatas"][0],
            results["distances"][0]
        )):
            formatted.append({
                "rank": i + 1,
                "text": doc,
                "metadata": meta,
                "similarity_score": round(1 - dist, 4)  # Convert distance to similarity
            })

        return [TextContent(type="text", text=json.dumps(formatted, indent=2))]

    elif name == "index_document":
        text = arguments["text"]
        metadata = arguments.get("metadata", {})
        collection_name = arguments.get("collection", "documents")

        collection = chroma_client.get_or_create_collection(collection_name)

        # Generate a deterministic ID for idempotency
        import hashlib
        doc_id = hashlib.sha256(text.encode()).hexdigest()[:16]

        collection.upsert(
            documents=[text],
            metadatas=[metadata],
            ids=[doc_id]
        )

        return [TextContent(
            type="text",
            text=f"Document indexed successfully. ID: {doc_id}, Collection: {collection_name}"
        )]

    raise ValueError(f"Unknown tool: {name}")

@server.list_resources()
async def list_resources():
    """Expose collection metadata as resources."""
    collections = chroma_client.list_collections()
    return [
        Resource(
            uri=f"vector://collections/{col.name}",
            name=f"Collection: {col.name}",
            description=f"Metadata and stats for the {col.name} vector collection",
            mimeType="application/json"
        )
        for col in collections
    ]
Case Study

Claude Desktop's MCP Ecosystem

Claude Desktop was the first production MCP host, and its ecosystem demonstrates the power of the protocol. A typical power user's Claude Desktop configuration connects to 5-10 MCP servers simultaneously:

  • Filesystem server — read and write project files directly from chat
  • GitHub server — create issues, review PRs, search repositories
  • Slack server — send messages, search conversations, manage channels
  • PostgreSQL server — query production databases (read-only)
  • Brave Search server — real-time web search with citations

Claude can seamlessly combine capabilities: "Search GitHub for open issues about authentication, check the relevant code files, query the database for affected users, and draft a Slack message to the team with your findings." One prompt, five MCP servers, zero custom integration code.

[Diagram: Claude Desktop multi-server ecosystem, zero custom integration code]

3. Core MCP Primitives

MCP defines four core primitives that cover the full spectrum of AI-system interactions. Together they form a complete vocabulary: read data (Resources), take actions (Tools), reuse templates (Prompts), and delegate reasoning (Sampling).

3.1 Resources (READ) — Structured Data Access

Resources represent data that the LLM can read but not modify. They are identified by URIs and return structured or unstructured content. Think of Resources as a read-only API for the LLM's knowledge.

| Resource Type | URI Pattern | Content | Use Case |
|---|---|---|---|
| Documents | file:///docs/guide.md | Markdown, PDF, text | Knowledge base articles, documentation |
| Database Queries | db://tables/users/schema | JSON schema, sample rows | Schema discovery, data previews |
| API Responses | api://weather/current | JSON data | Real-time data feeds |
| Vector Search Results | vector://search?q=deployment | Ranked document chunks | Semantic retrieval for RAG |
| Configuration | config://app/settings | JSON/YAML config | Application state, feature flags |

Advanced resource patterns include: pagination (using cursor-based or offset parameters in the URI), filtering (query parameters that narrow results), chunking (splitting large documents into LLM-friendly sizes), and caching (ETags or last-modified headers to avoid re-fetching unchanged data).
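
The pagination pattern can be sketched as cursor-style paging encoded in the resource URI. The `docs://` scheme and in-memory store below are hypothetical stand-ins for a real resource handler:

```python
import json
from urllib.parse import urlparse, parse_qs

# Stand-in document store; a real server would page through a database query
DOCUMENTS = [f"doc-{i}" for i in range(100)]

def read_paginated_resource(uri: str) -> str:
    """Read a docs:// resource, honoring cursor and limit query parameters."""
    parsed = urlparse(uri)
    params = parse_qs(parsed.query)
    cursor = int(params.get("cursor", ["0"])[0])
    limit = min(int(params.get("limit", ["10"])[0]), 50)  # cap page size

    page = DOCUMENTS[cursor:cursor + limit]
    next_cursor = cursor + limit if cursor + limit < len(DOCUMENTS) else None

    return json.dumps({
        "items": page,
        # The client passes next_cursor back in the next read to continue paging
        "next_cursor": next_cursor,
    })

result = json.loads(read_paginated_resource("docs://articles?cursor=20&limit=5"))
print(result["items"])        # ['doc-20', 'doc-21', 'doc-22', 'doc-23', 'doc-24']
print(result["next_cursor"])  # 25
```

Filtering works the same way: additional query parameters narrow the result set before the cursor window is applied.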

3.2 Tools (ACT) — Actions with Side Effects

Tools are the workhorse of MCP — they let the LLM do things. Unlike Resources (read-only), Tools can have side effects: writing to databases, calling APIs, sending emails, creating files. Every Tool is defined by a JSON Schema that makes its interface completely explicit.

# Advanced tool patterns: idempotency, side-effect control, tool chaining
# Demonstrates production-grade tool implementation

# pip install mcp httpx
import os
import json
import hashlib
import httpx
from datetime import datetime, timezone
from mcp.server import Server
from mcp.types import Tool, TextContent

# API key from environment
GITHUB_TOKEN = os.getenv("GITHUB_TOKEN", "")
server = Server("github-tools-server")

@server.list_tools()
async def list_tools():
    """Expose GitHub operations as MCP tools."""
    return [
        Tool(
            name="create_issue",
            description="Create a GitHub issue with title, body, and labels",
            inputSchema={
                "type": "object",
                "properties": {
                    "repo": {
                        "type": "string",
                        "description": "Repository in 'owner/name' format"
                    },
                    "title": {
                        "type": "string",
                        "description": "Issue title",
                        "maxLength": 256
                    },
                    "body": {
                        "type": "string",
                        "description": "Issue body (supports Markdown)"
                    },
                    "labels": {
                        "type": "array",
                        "items": {"type": "string"},
                        "description": "Labels to apply",
                        "default": []
                    },
                    "idempotency_key": {
                        "type": "string",
                        "description": "Unique key to prevent duplicate creation"
                    }
                },
                "required": ["repo", "title", "body"]
            }
        ),
        Tool(
            name="search_code",
            description="Search for code across GitHub repositories",
            inputSchema={
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "Search query (supports GitHub search syntax)"
                    },
                    "language": {
                        "type": "string",
                        "description": "Filter by programming language"
                    },
                    "max_results": {
                        "type": "integer",
                        "description": "Maximum results to return",
                        "default": 10,
                        "maximum": 50
                    }
                },
                "required": ["query"]
            }
        )
    ]

@server.call_tool()
async def call_tool(name: str, arguments: dict):
    """Execute GitHub tool invocations with safety and idempotency."""
    headers = {
        "Authorization": f"Bearer {GITHUB_TOKEN}",
        "Accept": "application/vnd.github.v3+json",
        "X-GitHub-Api-Version": "2022-11-28"
    }

    async with httpx.AsyncClient(base_url="https://api.github.com") as client:

        if name == "create_issue":
            repo = arguments["repo"]
            title = arguments["title"]
            body = arguments["body"]
            labels = arguments.get("labels", [])

            # Idempotency: check if issue with same title already exists
            idempotency_key = arguments.get(
                "idempotency_key",
                hashlib.sha256(f"{repo}:{title}".encode()).hexdigest()[:12]
            )

            # Best-effort dedupe: scan recent issues for an identical title
            search_response = await client.get(
                f"/repos/{repo}/issues",
                headers=headers,
                params={"state": "all", "per_page": 5}
            )

            if search_response.status_code == 200:
                existing = [
                    i for i in search_response.json()
                    if i.get("title") == title
                ]
                if existing:
                    return [TextContent(
                        type="text",
                        text=json.dumps({
                            "status": "already_exists",
                            "issue_number": existing[0]["number"],
                            "url": existing[0]["html_url"],
                            "message": "Issue with identical title already exists."
                        }, indent=2)
                    )]

            # Create the issue
            response = await client.post(
                f"/repos/{repo}/issues",
                headers=headers,
                json={
                    "title": title,
                    "body": f"{body}\n\n---\n_idempotency_key: {idempotency_key}_",
                    "labels": labels
                }
            )

            if response.status_code == 201:
                issue = response.json()
                return [TextContent(
                    type="text",
                    text=json.dumps({
                        "status": "created",
                        "issue_number": issue["number"],
                        "url": issue["html_url"],
                        "created_at": issue["created_at"]
                    }, indent=2)
                )]
            else:
                return [TextContent(
                    type="text",
                    text=f"Error creating issue: {response.status_code} - {response.text}"
                )]

        elif name == "search_code":
            query = arguments["query"]
            language = arguments.get("language", "")
            max_results = arguments.get("max_results", 10)

            # Build GitHub search query
            search_query = query
            if language:
                search_query += f" language:{language}"

            response = await client.get(
                "/search/code",
                headers=headers,
                params={"q": search_query, "per_page": min(max_results, 50)}
            )

            if response.status_code == 200:
                data = response.json()
                results = []
                for item in data.get("items", [])[:max_results]:
                    results.append({
                        "repository": item["repository"]["full_name"],
                        "path": item["path"],
                        "url": item["html_url"],
                        "score": item.get("score", 0)
                    })
                return [TextContent(
                    type="text",
                    text=json.dumps({
                        "total_count": data.get("total_count", 0),
                        "results": results
                    }, indent=2)
                )]
            else:
                return [TextContent(
                    type="text",
                    text=f"Search error: {response.status_code} - {response.text}"
                )]

    raise ValueError(f"Unknown tool: {name}")

3.3 Prompts (REUSE) — Reusable Templates

MCP Prompts are reusable, parameterized templates that servers expose for common interaction patterns. They are not just strings — they are structured message sequences that can include system prompts, user messages, and even pre-filled assistant responses.

# Advanced prompt patterns: versioning, parameterization, injection defense
# Demonstrates production-grade prompt templates

# pip install mcp
from mcp.server import Server
from mcp.types import (
    Prompt, PromptArgument, PromptMessage, TextContent
)

server = Server("prompt-library-server")

# Prompt template registry with versioning
PROMPT_TEMPLATES = {
    "code-review": {
        "version": "2.1",
        "description": "Generate a thorough code review with security analysis",
        "system_prompt": (
            "You are a senior software engineer conducting a code review. "
            "Focus on: correctness, security vulnerabilities, performance, "
            "readability, and adherence to best practices. "
            "IMPORTANT: Never execute code suggestions. Only analyze and recommend."
        ),
        "user_template": (
            "Please review the following {language} code:\n\n"
            "```{language}\n{code}\n```\n\n"
            "Context: {context}\n\n"
            "Focus areas: {focus_areas}\n\n"
            "Provide your review in the following format:\n"
            "1. Summary (1-2 sentences)\n"
            "2. Critical Issues (security, correctness)\n"
            "3. Improvements (performance, readability)\n"
            "4. Positive Aspects\n"
            "5. Suggested Refactoring (with code examples)"
        )
    },
    "incident-response": {
        "version": "1.3",
        "description": "Guide incident response and root cause analysis",
        "system_prompt": (
            "You are an SRE incident commander. Help analyze the incident, "
            "identify root causes, and recommend mitigations. "
            "Be systematic and prioritize by severity. "
            "CRITICAL: Do not suggest running destructive commands."
        ),
        "user_template": (
            "Incident: {incident_title}\n"
            "Severity: {severity}\n"
            "Service: {service_name}\n"
            "Symptoms: {symptoms}\n"
            "Timeline: {timeline}\n\n"
            "Please provide:\n"
            "1. Initial assessment and severity validation\n"
            "2. Likely root causes (ranked by probability)\n"
            "3. Immediate mitigation steps\n"
            "4. Investigation queries to run\n"
            "5. Post-incident action items"
        )
    }
}

@server.list_prompts()
async def list_prompts():
    """Expose versioned prompt templates with parameter definitions."""
    return [
        Prompt(
            name="code-review",
            description=f"[v{PROMPT_TEMPLATES['code-review']['version']}] "
                        f"{PROMPT_TEMPLATES['code-review']['description']}",
            arguments=[
                PromptArgument(name="code", description="Code to review", required=True),
                PromptArgument(name="language", description="Programming language", required=True),
                PromptArgument(name="context", description="PR context or description", required=False),
                PromptArgument(name="focus_areas", description="Specific areas to focus on", required=False)
            ]
        ),
        Prompt(
            name="incident-response",
            description=f"[v{PROMPT_TEMPLATES['incident-response']['version']}] "
                        f"{PROMPT_TEMPLATES['incident-response']['description']}",
            arguments=[
                PromptArgument(name="incident_title", description="Incident title", required=True),
                PromptArgument(name="severity", description="P0-P4", required=True),
                PromptArgument(name="service_name", description="Affected service", required=True),
                PromptArgument(name="symptoms", description="Observed symptoms", required=True),
                PromptArgument(name="timeline", description="Event timeline", required=False)
            ]
        )
    ]

@server.get_prompt()
async def get_prompt(name: str, arguments: dict):
    """Return populated prompt with injection defense."""
    if name not in PROMPT_TEMPLATES:
        raise ValueError(f"Unknown prompt: {name}")

    template = PROMPT_TEMPLATES[name]

    # Injection defense: sanitize all user-provided arguments
    sanitized_args = {}
    for key, value in arguments.items():
        if isinstance(value, str):
            # Strip common prompt-injection markers and role tags
            sanitized = value.replace("IGNORE PREVIOUS INSTRUCTIONS", "[FILTERED]")
            sanitized = sanitized.replace("<system>", "[FILTERED]")
            sanitized = sanitized.replace("</system>", "[FILTERED]")
            sanitized_args[key] = sanitized
        else:
            sanitized_args[key] = value

    # Fill defaults for optional arguments
    sanitized_args.setdefault("context", "No additional context provided")
    sanitized_args.setdefault("focus_areas", "All areas")
    sanitized_args.setdefault("timeline", "Not provided")

    # Build the message sequence
    user_text = template["user_template"].format(**sanitized_args)

    return {
        "messages": [
            # MCP prompt messages only support "user" and "assistant" roles,
            # so the system prompt travels as a leading assistant message
            PromptMessage(
                role="assistant",
                content=TextContent(type="text", text=template["system_prompt"])
            ),
            PromptMessage(
                role="user",
                content=TextContent(type="text", text=user_text)
            )
        ]
    }

3.4 Sampling (THINK) — Delegated Reasoning

Sampling is MCP's most distinctive primitive. It inverts the normal flow: instead of the host asking the server to execute a tool, the server asks the host to perform an LLM completion. This enables servers to leverage AI reasoning without needing their own LLM access.

Important Distinction: Sampling requests flow from server to host, the opposite direction of tool calls. The server says "I need the LLM to reason about this data before I can continue." The host fulfills the request using its LLM and returns the result. This keeps all LLM interaction centralized in the host while letting servers participate in the reasoning chain.

Key sampling use cases:

  • Tool result summarization: A database server returns 500 rows, then asks the host to summarize the key findings before returning to the user
  • Multi-step planning: A code analysis server asks the host to plan which files to examine based on an initial code scan
  • Content classification: A content moderation server asks the host to classify user input before deciding which tool to invoke
  • Error interpretation: A deployment server encounters an error log and asks the host to interpret the stack trace before suggesting remediation

3.5 Transports (CONNECT) — Communication Channels

MCP is transport-agnostic — the protocol messages are the same regardless of how they are delivered. The choice of transport depends on the deployment context:

| Transport | Mechanism | Best For | Latency | Scalability |
|---|---|---|---|---|
| STDIO | Standard input/output of a child process | Local development, desktop apps (Claude Desktop, Cursor) | Lowest (~1 ms) | Single user only |
| HTTP + SSE | HTTP POST for requests, Server-Sent Events for streaming | Web services, multi-user, cloud deployments | Low (~10-50 ms) | Horizontal scaling via load balancer |
| WebSocket | Persistent bidirectional connection | Real-time streaming, long-lived sessions | Low for sustained connections (~5 ms) | Good with connection managers |
| gRPC | Protocol Buffers over HTTP/2 | High-performance microservices, large payloads | Low at scale (~2-5 ms) | Excellent; built for microservices |
# Running the same MCP server over different transports
# The server logic is identical — only the transport layer changes

# pip install mcp uvicorn
import asyncio
from mcp.server import Server
from mcp.server.stdio import stdio_server
from mcp.server.sse import SseServerTransport
from mcp.types import Tool, TextContent

# Create the server (transport-independent)
server = Server("multi-transport-demo")

@server.list_tools()
async def list_tools():
    """Same tools regardless of transport."""
    return [
        Tool(
            name="greet",
            description="Generate a greeting message",
            inputSchema={
                "type": "object",
                "properties": {
                    "name": {"type": "string", "description": "Name to greet"}
                },
                "required": ["name"]
            }
        )
    ]

@server.call_tool()
async def call_tool(name: str, arguments: dict):
    if name == "greet":
        return [TextContent(type="text", text=f"Hello, {arguments['name']}!")]
    raise ValueError(f"Unknown tool: {name}")


# --- Transport Option 1: STDIO (for local/desktop use) ---
async def run_stdio():
    """Run as a child process communicating via stdin/stdout."""
    async with stdio_server() as (read_stream, write_stream):
        await server.run(read_stream, write_stream, server.create_initialization_options())


# --- Transport Option 2: HTTP + SSE (for web/cloud use) ---
def create_sse_app():
    """Create an HTTP app with Server-Sent Events transport."""
    from starlette.applications import Starlette
    from starlette.routing import Route

    sse_transport = SseServerTransport("/messages")

    async def handle_sse(request):
        """Handle SSE connections from MCP clients."""
        async with sse_transport.connect_sse(
            request.scope, request.receive, request._send
        ) as streams:
            await server.run(
                streams[0], streams[1],
                server.create_initialization_options()
            )

    async def handle_messages(request):
        """Handle incoming JSON-RPC messages via HTTP POST."""
        await sse_transport.handle_post_message(
            request.scope, request.receive, request._send
        )

    app = Starlette(routes=[
        Route("/sse", endpoint=handle_sse),
        Route("/messages", endpoint=handle_messages, methods=["POST"]),
    ])
    return app

# To run STDIO: asyncio.run(run_stdio())
# To run HTTP/SSE: uvicorn.run(create_sse_app(), host="0.0.0.0", port=8000)

4. The FastMCP Python SDK — Hands-On Guide

Now that you understand the MCP architecture and its core primitives conceptually, let us write real code. The FastMCP SDK is the official high-level Python library that makes building MCP servers simple and intuitive. If the previous sections explained the what and why of MCP, this section covers the how.

Analogy: FastMCP is to MCP what Flask is to HTTP — it handles the protocol plumbing (JSON-RPC messages, schema generation, transport negotiation) so you can focus entirely on your tools and data. You write normal Python functions; FastMCP converts them into fully compliant MCP capabilities automatically.

Getting Started with FastMCP

FastMCP lives in the mcp Python package. Install it with pip and create your first server in just three lines:

# pip install "mcp[cli]" httpx
from mcp.server.fastmcp import FastMCP

# Create an MCP server instance — the name identifies your server to clients
mcp = FastMCP("my-awesome-server")

That is it. The FastMCP constructor takes a server name (this appears in client UIs like Claude Desktop or Cursor) and returns a server instance. You then add capabilities to this instance using decorators.

The mcp[cli] install extra includes the mcp command-line tool for testing and debugging your servers. The httpx library is commonly used for making async HTTP requests inside your tools.

@mcp.tool() — Exposing Functions to LLMs

The @mcp.tool() decorator is the most important concept in FastMCP. It takes a normal Python function and registers it as a tool that any connected LLM can call. The key insight is that you write a normal Python function with good docstrings and type hints, and FastMCP converts it into a fully-described, schema-validated MCP tool automatically.

Here is a concrete example — a weather alerts tool from the Anthropic documentation:

@mcp.tool()
async def get_alerts(state: str) -> str:
    """Get weather alerts for a US state.

    Args:
        state: Two-letter US state code (e.g. CA, NY)
    """
    url = f"{NWS_API_BASE}/alerts/active/area/{state}"
    data = await make_nws_request(url)
    if not data or "features" not in data:
        return "Unable to fetch alerts or no alerts found."
    alerts = [format_alert(f) for f in data["features"]]
    return "\n---\n".join(alerts)

When an LLM sees this tool, it knows: name=get_alerts, takes a state string, returns weather alerts. The docstring tells the LLM when to use it, and the Args section tells it how to call it correctly.

Here is exactly how FastMCP translates your Python code into the MCP protocol:

| Your Code | MCP Schema | Purpose |
|---|---|---|
| Function name | tool.name | How the LLM identifies the tool |
| Docstring | tool.description | How the LLM decides when to use it |
| Type hints | inputSchema (JSON Schema) | Input validation |
| Args docstring | Parameter descriptions | Helps the LLM provide correct arguments |
| Return type | Output format | What the LLM receives back |

This is the magic of FastMCP: good Python practices (type hints, docstrings) directly become good MCP tool descriptions. There is no separate schema file, no configuration YAML, no manual JSON Schema authoring.
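
To make the mapping concrete, here is roughly the tool definition a client would see for get_alerts. This is a hand-written approximation of what FastMCP derives, not actual SDK output:

```python
import json

# Approximate MCP tool definition derived from the get_alerts function
get_alerts_tool = {
    "name": "get_alerts",  # from the function name
    "description": "Get weather alerts for a US state.",  # from the docstring
    "inputSchema": {  # from the type hints
        "type": "object",
        "properties": {
            "state": {
                "type": "string",
                # from the Args section of the docstring
                "description": "Two-letter US state code (e.g. CA, NY)",
            }
        },
        "required": ["state"],
    },
}

print(json.dumps(get_alerts_tool, indent=2))
```

This is the payload a client receives from tools/list, and it is everything the LLM knows about the tool when deciding whether and how to call it.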

@mcp.resource() — Exposing Data to LLMs

The @mcp.resource() decorator exposes read-only data that the application can retrieve. Unlike tools (which the LLM decides to call), resources are application-controlled — the host application decides when to read them. Think of resources as files or API endpoints that provide context to the LLM.

Resources use URI patterns — static URIs for fixed data and URI templates for dynamic data:

@mcp.resource("config://app")
def get_config() -> str:
    """Get the current application configuration."""
    return json.dumps({"theme": "dark", "language": "en", "version": "2.1.0"})

@mcp.resource("users://{user_id}/profile")
def get_user_profile(user_id: str) -> str:
    """Get a user's profile by their ID."""
    # In production, this would query a database
    profiles = {"alice": "Alice Smith - Engineer", "bob": "Bob Jones - Designer"}
    return profiles.get(user_id, f"User {user_id} not found")

The first resource uses a static URI (config://app) — there is exactly one configuration. The second uses a URI template (users://{user_id}/profile) — the {user_id} placeholder means the application can request any user's profile by substituting the ID.

@mcp.prompt() — Reusable Templates

The @mcp.prompt() decorator creates reusable prompt templates that users can invoke. These are user-controlled — think of them like slash commands in Slack or Discord. Prompts let you package complex, well-crafted instructions into simple, parameterized templates.

@mcp.prompt()
def review_code(code: str, language: str = "python") -> str:
    """Review code for bugs and improvements.

    Args:
        code: The source code to review
        language: Programming language of the code
    """
    return f"""Please review this {language} code for:
1. Bugs and potential errors
2. Performance improvements
3. Security vulnerabilities
4. Code style and best practices

Code to review:
```{language}
{code}
```"""

When a user selects this prompt in their MCP client, they are asked to provide the code and optionally the language. The template then generates a well-structured review request for the LLM.

Running Your MCP Server

With your tools, resources, and prompts defined, running your server is a single line. The transport parameter determines how clients connect:

# Run with STDIO transport (for local clients like Claude Desktop, Cursor)
mcp.run(transport="stdio")
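
To connect a local client, you then register the server in the host's configuration. For Claude Desktop this is claude_desktop_config.json; the command and path below are illustrative, so adjust them for your environment:

```json
{
  "mcpServers": {
    "weather": {
      "command": "python",
      "args": ["/absolute/path/to/weather_server.py"]
    }
  }
}
```

The host launches the server as a child process and speaks MCP over its stdin/stdout.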

Here is a complete, copy-paste-ready weather server that ties everything together:

# weather_server.py — Complete MCP Weather Server
# pip install "mcp[cli]" httpx

import json
import httpx
from mcp.server.fastmcp import FastMCP

# Create the MCP server
mcp = FastMCP("weather-server")

NWS_API_BASE = "https://api.weather.gov"
USER_AGENT = "weather-app/1.0"

async def make_nws_request(url: str) -> dict | None:
    """Make a request to the NWS API with proper headers."""
    headers = {"User-Agent": USER_AGENT, "Accept": "application/geo+json"}
    async with httpx.AsyncClient() as client:
        try:
            response = await client.get(url, headers=headers, timeout=30.0)
            response.raise_for_status()
            return response.json()
        except httpx.HTTPError:
            return None  # callers treat None as "unable to fetch"

def format_alert(feature: dict) -> str:
    """Format a single weather alert for display."""
    props = feature["properties"]
    return f"""
Event: {props.get('event', 'Unknown')}
Area: {props.get('areaDesc', 'Unknown')}
Severity: {props.get('severity', 'Unknown')}
Description: {props.get('description', 'No description')}
Instructions: {props.get('instruction', 'No instructions')}
"""

@mcp.tool()
async def get_alerts(state: str) -> str:
    """Get weather alerts for a US state.

    Args:
        state: Two-letter US state code (e.g. CA, NY)
    """
    url = f"{NWS_API_BASE}/alerts/active/area/{state}"
    data = await make_nws_request(url)
    if not data or "features" not in data:
        return "Unable to fetch alerts or no alerts found."
    alerts = [format_alert(f) for f in data["features"]]
    return "\n---\n".join(alerts) if alerts else "No active alerts for this state."

@mcp.tool()
async def get_forecast(latitude: float, longitude: float) -> str:
    """Get the weather forecast for a location.

    Args:
        latitude: Latitude of the location
        longitude: Longitude of the location
    """
    # First get the forecast grid endpoint
    points_url = f"{NWS_API_BASE}/points/{latitude},{longitude}"
    points_data = await make_nws_request(points_url)
    if not points_data:
        return "Unable to fetch forecast data for this location."
    forecast_url = points_data["properties"]["forecast"]
    forecast_data = await make_nws_request(forecast_url)
    if not forecast_data:
        return "Unable to fetch forecast."
    periods = forecast_data["properties"]["periods"][:5]
    forecasts = [f"{p['name']}: {p['detailedForecast']}" for p in periods]
    return "\n---\n".join(forecasts)

@mcp.resource("config://weather")
def get_weather_config() -> str:
    """Get the weather server configuration and supported regions."""
    return json.dumps({
        "api": "National Weather Service",
        "coverage": "United States",
        "update_frequency": "Every 15 minutes"
    })

@mcp.prompt()
def weather_briefing(state: str) -> str:
    """Generate a comprehensive weather briefing for a state.

    Args:
        state: Two-letter US state code (e.g. CA, NY)
    """
    return f"""Please provide a comprehensive weather briefing for {state}:
1. Check current weather alerts
2. Summarize any severe weather warnings
3. Provide a general outlook
Use the available weather tools to gather this information."""

# Start the server
if __name__ == "__main__":
    mcp.run(transport="stdio")

To connect this server to Claude Desktop, add it to your Claude Desktop configuration file (claude_desktop_config.json):

{
  "mcpServers": {
    "weather": {
      "command": "python",
      "args": ["weather_server.py"]
    }
  }
}
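Before wiring the server into Claude Desktop, you can exercise it interactively with the MCP Inspector, the official debugging tool for MCP servers (requires Node.js):

```shell
# Launch the MCP Inspector against the STDIO weather server
npx @modelcontextprotocol/inspector python weather_server.py
```

The Inspector opens a browser UI where you can list the server's tools, call get_alerts and get_forecast with test arguments, and watch the raw JSON-RPC traffic.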
Key Insight: The entire FastMCP SDK is designed around one principle — write normal Python functions with good docstrings and type hints, and FastMCP handles everything else: schema generation, validation, transport, and protocol compliance. The three decorators map directly to the three core MCP primitives: @mcp.tool() = LLM calls it, @mcp.resource() = app reads it, @mcp.prompt() = user invokes it.

5. Protocol Flow & Lifecycle

Understanding MCP’s protocol lifecycle is essential for building reliable integrations. Every MCP session follows a structured flow: initialization (capability negotiation), operation (request/response and notification exchanges), and shutdown (graceful disconnection). This section traces the complete message flow from connection establishment through tool invocation, showing how JSON-RPC messages coordinate between client and server at each stage.

5.1 End-to-End Flow

Tracing a single user request through the entire MCP stack is the fastest way to build the intuition you need for debugging and optimizing MCP-based systems. Here is the full lifecycle:

# End-to-End MCP Flow: User asks "How many active users do we have?"
#
# Step 1: USER INPUT
#   User types: "How many active users do we have?"
#
# Step 2: HOST PROCESSING
#   Claude Desktop receives the message.
#   Host adds the message to the conversation context.
#   Host includes tool descriptions from all connected MCP servers.
#
# Step 3: LLM DECISION
#   Claude sees available tools: [query_database, search_web, read_file, ...]
#   Claude decides to use "query_database" tool.
#   Claude generates: {"tool": "query_database", "args": {"sql": "SELECT COUNT(*) ..."}}
#
# Step 4: HOST -> CLIENT -> SERVER
#   Host identifies which client manages the database server.
#   Client sends JSON-RPC tool invocation to the server.
#   Server executes the SQL query against the actual database.
#
# Step 5: SERVER -> CLIENT -> HOST
#   Server returns: {"result": [{"count": 12847}]}
#   Client passes result back to host.
#   Host injects tool result into the conversation context.
#
# Step 6: LLM SYNTHESIS
#   Claude sees the tool result in context.
#   Claude generates: "You currently have 12,847 active users."
#
# Step 7: USER RESPONSE
#   Host renders Claude's response in the chat UI.
#   Total time: ~2-4 seconds (LLM inference dominates)

5.2 Message Types

MCP uses JSON-RPC 2.0 as its message format. Every interaction between client and server is a JSON-RPC message. The key message types in the protocol lifecycle are:

| Phase | Message Type | Direction | Purpose |
| --- | --- | --- | --- |
| Initialization | initialize | Client -> Server | Protocol handshake, version negotiation, capability exchange |
| Initialization | notifications/initialized | Client -> Server | Notification confirming initialization is complete |
| Discovery | tools/list | Client -> Server | Request list of available tools with schemas |
| Discovery | resources/list | Client -> Server | Request list of available resources with URIs |
| Discovery | prompts/list | Client -> Server | Request list of available prompt templates |
| Invocation | tools/call | Client -> Server | Execute a tool with provided arguments |
| Invocation | resources/read | Client -> Server | Fetch content of a resource by URI |
| Invocation | prompts/get | Client -> Server | Get a populated prompt template |
| Sampling | sampling/createMessage | Server -> Client | Request LLM completion from the host |
| Notifications | notifications/tools/list_changed | Server -> Client | Server's available tools have changed |
| Error | JSON-RPC error | Either direction | Structured error with code, message, and data |
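One detail worth internalizing: JSON-RPC requests carry an id and expect a matching response, while notifications omit the id entirely and are fire-and-forget. A minimal illustration:

```python
import json

# A request: carries an "id", so the receiver must reply with the same id
request = {"jsonrpc": "2.0", "id": 1, "method": "tools/list", "params": {}}

# A notification: no "id" field, so no response is expected or allowed
notification = {"jsonrpc": "2.0", "method": "notifications/initialized"}

print(json.dumps(request))
print(json.dumps(notification))
```

This is why the initialized confirmation in the lifecycle never receives a reply: it is a notification, not a request.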

5.3 State Management

MCP carefully separates stateless and stateful concerns across the architecture:

  • Servers are preferably stateless: Each tool invocation should be self-contained. The server receives all necessary context in the request and returns a complete response. This allows servers to be restarted, scaled horizontally, or replaced without losing state.
  • Clients are session-aware: Clients maintain connection state (session ID, negotiated capabilities, transport state) for the duration of a session. If a server restarts, the client re-initializes and re-discovers capabilities.
  • Hosts manage conversation context: The host is responsible for managing the conversation history, context window budget, and deciding which tool results to include in the LLM prompt. This is where context window management becomes critical — a tool that returns 10,000 tokens of data may need to be summarized before injection.
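The context-window concern in the last bullet can be sketched as a simple host-side guard. The 4-characters-per-token heuristic and the MAX_RESULT_TOKENS budget below are illustrative assumptions, not part of the protocol:

```python
MAX_RESULT_TOKENS = 2_000  # illustrative per-result budget, not an MCP constant

def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English text
    return len(text) // 4

def fit_tool_result(result: str) -> str:
    """Truncate an oversized tool result before injecting it into the LLM prompt."""
    if estimate_tokens(result) <= MAX_RESULT_TOKENS:
        return result
    keep_chars = MAX_RESULT_TOKENS * 4
    return result[:keep_chars] + "\n[truncated: result exceeded the context budget]"

big_result = "row\n" * 10_000                # ~10,000 tokens of raw query output
print(estimate_tokens(big_result))           # -> 10000
print(len(fit_tool_result(big_result)) < len(big_result))  # -> True
```

A production host would summarize rather than hard-truncate, but the budgeting decision happens at exactly this point in the pipeline.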

5.4 Complete MCP Message Flow Simulation

To see the full protocol in action, this simulation traces every JSON-RPC message exchanged during a complete MCP session — from initialization handshake through tool discovery, resource listing, tool execution, and graceful shutdown. Running this simulation reveals the exact message structure and sequencing that real MCP clients and servers use, making it an invaluable reference for debugging protocol-level issues.

# Complete MCP protocol flow simulation
# This demonstrates every message type in the correct sequence

# (pure standard library -- no MCP SDK required to run this simulation)
import json
from datetime import datetime, timezone
from dataclasses import dataclass, field
from typing import Any

@dataclass
class MCPMessage:
    """Represents a single MCP JSON-RPC message."""
    jsonrpc: str = "2.0"
    method: str = ""
    params: dict = field(default_factory=dict)
    result: Any = None
    error: dict | None = None
    id: int | None = None

    def to_json(self):
        """Serialize to JSON-RPC format."""
        msg = {"jsonrpc": self.jsonrpc}
        if self.method:
            msg["method"] = self.method
        if self.params:
            msg["params"] = self.params
        if self.result is not None:
            msg["result"] = self.result
        if self.error:
            msg["error"] = self.error
        if self.id is not None:
            msg["id"] = self.id
        return json.dumps(msg, indent=2)


def simulate_mcp_flow():
    """Simulate the complete MCP protocol flow with all message types."""

    messages = []
    msg_id = 0

    # --- Phase 1: Initialization ---
    msg_id += 1
    init_request = MCPMessage(
        method="initialize",
        params={
            "protocolVersion": "2025-03-26",
            "capabilities": {
                "tools": {},            # Client supports tool invocations
                "resources": {},        # Client supports resource reading
                "prompts": {},          # Client supports prompt templates
                "sampling": {}          # Client supports sampling requests
            },
            "clientInfo": {
                "name": "claude-desktop",
                "version": "1.5.0"
            }
        },
        id=msg_id
    )
    messages.append(("CLIENT -> SERVER", init_request))

    # Server responds with its capabilities
    init_response = MCPMessage(
        result={
            "protocolVersion": "2025-03-26",
            "capabilities": {
                "tools": {"listChanged": True},     # Server supports tool change notifications
                "resources": {"subscribe": True},   # Server supports resource subscriptions
                "prompts": {"listChanged": True},   # Server supports prompt change notifications
                "sampling": {}                      # Server may request LLM completions
            },
            "serverInfo": {
                "name": "enterprise-database-server",
                "version": "2.1.0"
            }
        },
        id=msg_id
    )
    messages.append(("SERVER -> CLIENT", init_response))

    # Client confirms initialization
    initialized = MCPMessage(method="notifications/initialized")
    messages.append(("CLIENT -> SERVER", initialized))

    # --- Phase 2: Capability Discovery ---
    msg_id += 1
    list_tools_request = MCPMessage(
        method="tools/list",
        params={},
        id=msg_id
    )
    messages.append(("CLIENT -> SERVER", list_tools_request))

    list_tools_response = MCPMessage(
        result={
            "tools": [
                {
                    "name": "query_database",
                    "description": "Execute a read-only SQL query",
                    "inputSchema": {
                        "type": "object",
                        "properties": {
                            "sql": {"type": "string", "description": "SQL SELECT query"},
                            "limit": {"type": "integer", "default": 100}
                        },
                        "required": ["sql"]
                    }
                },
                {
                    "name": "list_tables",
                    "description": "List all tables in the database",
                    "inputSchema": {
                        "type": "object",
                        "properties": {},
                        "required": []
                    }
                }
            ]
        },
        id=msg_id
    )
    messages.append(("SERVER -> CLIENT", list_tools_response))

    # --- Phase 3: Tool Invocation ---
    msg_id += 1
    tool_call = MCPMessage(
        method="tools/call",
        params={
            "name": "query_database",
            "arguments": {
                "sql": "SELECT COUNT(*) as active_users FROM users WHERE status = 'active'",
                "limit": 1
            }
        },
        id=msg_id
    )
    messages.append(("CLIENT -> SERVER", tool_call))

    tool_result = MCPMessage(
        result={
            "content": [
                {
                    "type": "text",
                    "text": json.dumps([{"active_users": 12847}])
                }
            ],
            "isError": False
        },
        id=msg_id
    )
    messages.append(("SERVER -> CLIENT", tool_result))

    # --- Phase 4: Resource Read ---
    msg_id += 1
    resource_read = MCPMessage(
        method="resources/read",
        params={"uri": "db://tables/users/schema"},
        id=msg_id
    )
    messages.append(("CLIENT -> SERVER", resource_read))

    resource_response = MCPMessage(
        result={
            "contents": [
                {
                    "uri": "db://tables/users/schema",
                    "mimeType": "application/json",
                    "text": json.dumps({
                        "columns": ["id", "name", "email", "status", "created_at"],
                        "types": ["INTEGER", "TEXT", "TEXT", "TEXT", "TIMESTAMP"]
                    })
                }
            ]
        },
        id=msg_id
    )
    messages.append(("SERVER -> CLIENT", resource_response))

    # --- Phase 5: Error Handling ---
    msg_id += 1
    bad_tool_call = MCPMessage(
        method="tools/call",
        params={
            "name": "query_database",
            "arguments": {"sql": "DROP TABLE users"}  # Dangerous query
        },
        id=msg_id
    )
    messages.append(("CLIENT -> SERVER", bad_tool_call))

    error_response = MCPMessage(
        error={
            "code": -32602,
            "message": "Invalid params: Only SELECT queries are allowed",
            "data": {"attempted_query": "DROP TABLE users", "policy": "read_only"}
        },
        id=msg_id
    )
    messages.append(("SERVER -> CLIENT", error_response))

    # --- Print the complete flow ---
    print("=" * 70)
    print("MCP PROTOCOL FLOW SIMULATION")
    print(f"Timestamp: {datetime.now(timezone.utc).isoformat()}")
    print("=" * 70)

    for i, (direction, msg) in enumerate(messages, 1):
        print(f"\n--- Message {i}: {direction} ---")
        print(msg.to_json())

    print(f"\n{'=' * 70}")
    print(f"Total messages exchanged: {len(messages)}")
    print(f"Phases covered: Initialization, Discovery, Invocation, Resource Read, Error Handling")
    print(f"{'=' * 70}")


# Run the simulation
simulate_mcp_flow()

6. Authentication & Security

Security is not an afterthought in MCP — it is built into the protocol's design. Because MCP servers can access databases, filesystems, APIs, and other sensitive systems, a robust security model is essential.

6.1 Authentication Mechanisms

| Mechanism | How It Works | Best For | Complexity |
| --- | --- | --- | --- |
| API Keys | Static secret passed as environment variable or header | Local development, single-user servers | Low |
| OAuth 2.0 | Token-based flow with scopes and refresh | Multi-user, SaaS integrations, delegated access | Medium-High |
| JWT Tokens | Signed tokens with claims (user, permissions, expiry) | Stateless auth across microservices | Medium |
| mTLS | Mutual TLS with client certificates | Zero-trust environments, inter-service auth | High |
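For the simplest mechanism, API keys, a minimal server-side sketch using constant-time comparison. MCP_API_KEY is an assumed environment variable name, not something the protocol mandates:

```python
import hmac
import os

# The expected key comes from the environment, never from source code
os.environ["MCP_API_KEY"] = "demo-secret"  # demo only; deployments set this externally
EXPECTED_KEY = os.environ["MCP_API_KEY"]

def check_api_key(presented_key: str) -> bool:
    """Constant-time comparison to avoid leaking the key via timing."""
    if not EXPECTED_KEY:
        return False  # fail closed if the server is misconfigured
    return hmac.compare_digest(presented_key, EXPECTED_KEY)

print(check_api_key("demo-secret"))  # -> True
print(check_api_key("wrong-key"))    # -> False
```

hmac.compare_digest matters here: a naive == comparison can leak how many leading characters matched through response timing.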

6.2 Authorization & RBAC

Authentication verifies who you are; authorization determines what you can do. MCP supports fine-grained access control at the tool, resource, and argument level:

# MCP auth middleware: JWT validation + RBAC authorization
# Demonstrates production-grade security for MCP servers

# pip install pyjwt
import os
import jwt
import time

# Security configuration from environment
JWT_SECRET = os.getenv("MCP_JWT_SECRET", "change-me-in-production")
JWT_ALGORITHM = "HS256"

# Role-based access control matrix
# Maps roles to allowed tools and their permitted argument patterns
RBAC_POLICY = {
    "analyst": {
        "allowed_tools": ["query", "list_tables", "describe_table"],
        "restrictions": {
            "query": {
                "sql_must_start_with": "SELECT",       # Read-only
                "forbidden_tables": ["audit_logs", "credentials"]  # Sensitive tables
            }
        }
    },
    "developer": {
        "allowed_tools": ["query", "list_tables", "describe_table", "insert_record"],
        "restrictions": {
            "query": {
                "sql_must_start_with": "SELECT",
                "forbidden_tables": ["credentials"]
            },
            "insert_record": {
                "allowed_tables": ["logs", "events", "metrics"]
            }
        }
    },
    "admin": {
        "allowed_tools": ["*"],  # All tools
        "restrictions": {}       # No restrictions
    }
}


def validate_jwt_token(token: str) -> dict:
    """Validate a JWT token and extract claims."""
    try:
        payload = jwt.decode(token, JWT_SECRET, algorithms=[JWT_ALGORITHM])
        # Note: jwt.decode verifies the "exp" claim automatically and raises
        # ExpiredSignatureError (a subclass of InvalidTokenError) if expired

        return {
            "user_id": payload["sub"],
            "role": payload.get("role", "analyst"),  # Default to least privilege
            "permissions": payload.get("permissions", []),
            "issued_at": payload.get("iat"),
            "expires_at": payload.get("exp")
        }

    except jwt.InvalidTokenError as e:
        raise ValueError(f"Invalid token: {e}")


def authorize_tool_call(user_claims: dict, tool_name: str, arguments: dict) -> bool:
    """Check if a user is authorized to call a specific tool with given arguments."""
    role = user_claims["role"]

    if role not in RBAC_POLICY:
        return False  # Unknown role — deny by default

    policy = RBAC_POLICY[role]

    # Check if tool is allowed for this role
    allowed = policy["allowed_tools"]
    if "*" not in allowed and tool_name not in allowed:
        return False

    # Check tool-specific restrictions
    restrictions = policy.get("restrictions", {}).get(tool_name, {})

    if tool_name == "query":
        sql = arguments.get("sql", "").strip().upper()

        # Check SQL command restriction
        required_prefix = restrictions.get("sql_must_start_with", "")
        if required_prefix and not sql.startswith(required_prefix):
            return False

        # Check forbidden tables
        forbidden = restrictions.get("forbidden_tables", [])
        for table in forbidden:
            if table.upper() in sql:
                return False

    elif tool_name == "insert_record":
        table = arguments.get("table", "")
        allowed_tables = restrictions.get("allowed_tables", [])
        if allowed_tables and table not in allowed_tables:
            return False

    return True


def generate_sample_jwt(user_id: str, role: str, hours_valid: int = 8) -> str:
    """Generate a sample JWT token for testing."""
    now = int(time.time())
    payload = {
        "sub": user_id,
        "role": role,
        "iat": now,
        "exp": now + (hours_valid * 3600),
        "permissions": RBAC_POLICY.get(role, {}).get("allowed_tools", [])
    }
    return jwt.encode(payload, JWT_SECRET, algorithm=JWT_ALGORITHM)


# --- Demonstration ---
def demonstrate_auth():
    """Show the auth system in action."""

    # Generate tokens for different roles
    analyst_token = generate_sample_jwt("alice", "analyst")
    developer_token = generate_sample_jwt("bob", "developer")
    admin_token = generate_sample_jwt("charlie", "admin")

    print("=== MCP Auth Middleware Demonstration ===\n")

    # Test scenarios
    scenarios = [
        ("Analyst: SELECT query", analyst_token, "query",
         {"sql": "SELECT * FROM users LIMIT 10"}),
        ("Analyst: SELECT from credentials", analyst_token, "query",
         {"sql": "SELECT * FROM credentials"}),
        ("Analyst: INSERT (forbidden tool)", analyst_token, "insert_record",
         {"table": "logs", "data": {"msg": "test"}}),
        ("Developer: SELECT query", developer_token, "query",
         {"sql": "SELECT * FROM users LIMIT 10"}),
        ("Developer: INSERT into logs", developer_token, "insert_record",
         {"table": "logs", "data": {"msg": "test"}}),
        ("Developer: INSERT into users (forbidden)", developer_token, "insert_record",
         {"table": "users", "data": {"name": "hack"}}),
        ("Admin: DELETE query (all allowed)", admin_token, "query",
         {"sql": "DELETE FROM temp_data"}),
    ]

    for description, token, tool, args in scenarios:
        claims = validate_jwt_token(token)
        authorized = authorize_tool_call(claims, tool, args)
        status = "ALLOWED" if authorized else "DENIED"
        print(f"  [{status}] {description}")
        print(f"          Role: {claims['role']}, Tool: {tool}")
        if not authorized:
            print(f"          Reason: Policy violation for role '{claims['role']}'")
        print()

demonstrate_auth()

6.3 Security Best Practices

Production MCP Security Checklist

  1. Least Privilege: Each server should expose the minimum set of tools required. A file-reading server should never expose file-writing tools unless explicitly needed.
  2. Input Validation: Validate all tool arguments against their JSON Schema before execution. Reject malformed inputs at the protocol level, not the application level.
  3. Sandboxing: Run MCP servers in isolated environments (Docker containers, Firecracker VMs, or OS-level sandboxes). Limit filesystem access, network access, and system calls.
  4. Rate Limiting: Implement per-user and per-tool rate limits to prevent abuse. An agent stuck in a loop could make thousands of tool calls per minute without rate limiting.
  5. Audit Logging: Log every tool invocation with timestamp, user identity, tool name, arguments (sanitized), result status, and execution duration. These logs are essential for security forensics and compliance.
  6. Prompt Injection Mitigation: Never pass raw tool results directly into system prompts. Sanitize tool outputs to remove potential injection strings. Mark tool results as "untrusted" in the context window.
  7. Secret Management: Never embed API keys, database passwords, or other secrets in server code. Use environment variables, secret managers (HashiCorp Vault, AWS Secrets Manager), or secure key stores.
  8. TLS Everywhere: All HTTP-based MCP transports should use TLS 1.3. For internal service-to-service communication, use mTLS with short-lived certificates.
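Item 4 on the checklist, rate limiting, is commonly implemented as a per-user token bucket. A minimal sketch; the capacity and refill numbers are illustrative:

```python
import time

class TokenBucket:
    """Allow bursts up to `capacity` calls, refilling at `rate` tokens/second."""

    def __init__(self, capacity: float, rate: float):
        self.capacity = capacity
        self.tokens = capacity          # start full
        self.rate = rate
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# A real server keeps one bucket per (user, tool) pair; here, a single demo bucket
bucket = TokenBucket(capacity=3, rate=0.0)  # no refill, to show exhaustion
print([bucket.allow() for _ in range(5)])   # -> [True, True, True, False, False]
```

When allow() returns False, the server should return a structured JSON-RPC error rather than silently dropping the call, so the agent can back off.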

6.4 Data Privacy

MCP's architecture inherently supports privacy-preserving patterns because the server (which accesses data) is separate from the host (which runs the LLM). This separation enables several important privacy architectures:

  • Local-first architectures: STDIO transport keeps all data on the user's machine. The MCP server reads local files and databases; data never leaves the device (only the LLM inference call goes to the cloud).
  • Secure enclaves: MCP servers can run inside trusted execution environments (Intel SGX, AWS Nitro Enclaves) where even the server operator cannot access the data being processed.
  • Encryption at rest: Servers should encrypt any cached or persisted data using AES-256-GCM. Encryption keys should be managed via a KMS, never hardcoded.
  • Encryption in transit: All MCP transports (except STDIO, which uses OS process isolation) should use TLS 1.3 for encryption in transit. HTTP/SSE and WebSocket transports must enforce HTTPS/WSS.
  • Data minimization: Tools should return only the data the LLM needs, not entire database tables. A query for "active user count" should return the count, not all user records.
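The data-minimization point is easy to make concrete. Using a hypothetical in-memory users table, compare a tool that ships raw rows with one that returns only the aggregate the question needs:

```python
# Hypothetical in-memory "users" table, for illustration only
USERS = [
    {"id": 1, "email": "a@example.com", "status": "active"},
    {"id": 2, "email": "b@example.com", "status": "inactive"},
    {"id": 3, "email": "c@example.com", "status": "active"},
]

def count_active_users_leaky() -> list[dict]:
    # Anti-pattern: every record (emails included) lands in the LLM context
    return [u for u in USERS if u["status"] == "active"]

def count_active_users_minimal() -> dict:
    # Minimal disclosure: only the aggregate the question actually needs
    return {"active_users": sum(u["status"] == "active" for u in USERS)}

print(count_active_users_minimal())  # -> {'active_users': 2}
```

The leaky version also wastes context-window budget, so minimization serves privacy and performance at once.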
Case Study

Cursor's MCP Integration — IDE as MCP Host

Cursor, the AI-powered code editor, demonstrates a compelling MCP integration pattern. As an MCP host, Cursor connects to servers that provide:

  • Project context: An MCP server that indexes the codebase and exposes semantic code search as a resource
  • Build tools: An MCP server that wraps the project's build system (npm, cargo, make) as tools
  • Testing: An MCP server that runs test suites and returns structured results
  • Documentation: An MCP server that provides framework documentation as resources

The key insight is that Cursor's AI assistant seamlessly combines these capabilities: "Find all usages of the deprecated `authenticate()` function, run the tests to confirm they pass, then refactor each call site to use the new `verify_identity()` function." This requires reading code (resource), executing tests (tool), understanding context (sampling), and writing code (tool) — all through MCP.


Exercises & Self-Assessment

Exercise 1

Build Your First MCP Server

Create a minimal MCP server that wraps a local JSON file as both a Resource and a set of Tools:

  1. Create a JSON file with 10+ records (e.g., a product catalog, employee directory, or recipe book)
  2. Implement a Resource that returns the full dataset and a Resource that returns schema information
  3. Implement Tools: search (filter by field), get_by_id (fetch single record), add_record (append new record)
  4. Implement a Prompt template for "analyze this dataset"
  5. Test with the MCP Inspector CLI tool or connect to Claude Desktop
Exercise 2

MCP Architecture Diagram

Draw a complete architecture diagram for the following scenario and label every MCP component:

  1. A customer support application (the Host) that uses Claude as its LLM
  2. Three MCP servers: (a) Zendesk ticket system, (b) product database, (c) knowledge base with vector search
  3. Show the flow when a user asks: "What's the status of ticket #4521 and does our warranty cover the reported issue?"
  4. Label each message type (initialize, tools/list, tools/call, resources/read)
  5. Identify where authentication occurs and what type you would use for each server
Exercise 3

Security Audit

Review the following MCP server configuration and identify all security vulnerabilities:

  1. A server that exposes execute_sql with no query validation (accepts any SQL)
  2. API keys passed as tool arguments instead of environment variables
  3. No rate limiting on tool invocations
  4. Tool results returned without sanitization (could contain prompt injection payloads)
  5. For each vulnerability, write the fix and explain the attack vector it prevents
Exercise 4

Reflective Questions

  1. Why does MCP separate Hosts, Clients, and Servers into three distinct roles instead of combining them? What would break if they were merged?
  2. Compare MCP's Resources primitive to a traditional REST API GET endpoint. What does MCP add that REST does not? What does REST provide that MCP does not?
  3. Explain why Sampling (server-to-host LLM requests) is necessary. Give an example where a server cannot accomplish its task without delegating reasoning to the LLM.
  4. MCP supports four transports: STDIO, HTTP/SSE, WebSocket, gRPC. For each, describe a scenario where it is the best choice and a scenario where it is the worst choice.
  5. ChatGPT Plugins failed, but MCP succeeded. Identify three specific design decisions in MCP that address Plugin weaknesses.
Exercise 5

Transport Comparison Lab

Implement the same simple MCP server (a weather lookup tool) using two different transports:

  1. STDIO transport (for local use with Claude Desktop)
  2. HTTP/SSE transport (for remote use with a web client)
  3. Measure the latency difference between the two transports using 100 sequential tool calls
  4. Write up your findings: When is the latency difference significant? When is it negligible?

MCP Framework Comparison Document Generator

Generate a professional framework comparison document for MCP and related integration approaches. Download as Word, Excel, PDF, or PowerPoint.


Conclusion & Next Steps

You now have a thorough understanding of the Model Context Protocol — the open standard that is becoming the universal integration layer for AI applications. Here are the key takeaways from Part 13:

  • MCP solves AI integration fragmentation the same way HTTP solved web communication and USB-C solved peripheral connectivity — through a vendor-neutral, open protocol
  • The Host/Client/Server architecture separates concerns cleanly: hosts manage LLM interaction, clients manage protocol connections, servers expose capabilities, and data sources provide the underlying information
  • Four core primitives cover every type of AI-system interaction: Resources (read data), Tools (take actions), Prompts (reuse templates), and Sampling (delegate reasoning)
  • Transport agnosticism means the same server works locally (STDIO), on the web (HTTP/SSE), in real-time applications (WebSocket), and in high-performance microservices (gRPC)
  • The protocol lifecycle follows a clear sequence: initialize, discover capabilities, invoke tools/resources, handle errors — all using JSON-RPC 2.0 messages
  • Security is built into the architecture through least privilege, RBAC, JWT/OAuth authentication, input validation, sandboxing, audit logging, and prompt injection mitigation
  • MCP compares favorably with alternatives (OpenAI function calling, LangChain tools, ChatGPT Plugins) on the dimensions that matter most in production: vendor neutrality, modularity, interoperability, and security

Next in the Series

In Part 14: MCP in Production, we will take everything from this foundational chapter and apply it at scale — building production-grade MCP servers, integrating with real-world APIs and databases, implementing observability and monitoring, scaling MCP architectures, and building complete agent systems that combine multiple MCP servers into powerful autonomous workflows.
