Introduction: The Universal Protocol for AI Integration
Series Overview: This is Part 13 of our 20-part AI Application Development Mastery series. We now enter the MCP chapters — the open protocol that standardizes how LLMs connect to tools, data, and the outside world. This part covers foundations and architecture; Part 14 takes MCP into production.
| # | Part | Topics |
|---|------|--------|
| 1 | Foundations & Evolution of AI Apps | Pre-LLM era, transformers, LLM revolution |
| 2 | LLM Fundamentals for Developers | Tokens, context windows, sampling, API patterns |
| 3 | Prompt Engineering Mastery | Zero/few-shot, CoT, ReAct, structured outputs |
| 4 | LangChain Core Concepts | Chains, prompts, LLMs, tools, LCEL |
| 5 | Retrieval-Augmented Generation (RAG) | Embeddings, vector DBs, retrievers, RAG pipelines |
| 6 | Memory & Context Engineering | Buffer/summary/vector memory, chunking, re-ranking |
| 7 | Agents — Core of Modern AI Apps | ReAct, tool-calling, planner-executor agents |
| 8 | LangGraph — Stateful Agent Workflows | Nodes, edges, state, graph execution, cycles |
| 9 | Deep Agents & Autonomous Systems | Multi-step reasoning, self-reflection, planning |
| 10 | Multi-Agent Systems | Supervisor, swarm, debate, role-based collaboration |
| 11 | AI Application Design Patterns | RAG, chat+memory, workflow automation, agent loops |
| 12 | Ecosystem & Frameworks | LlamaIndex, Haystack, HuggingFace, vLLM |
| 13 | MCP Foundations & Architecture (you are here) | Protocol design, Host/Client/Server, primitives, security |
| 14 | MCP in Production | Building servers, integrations, scaling, agent systems |
| 15 | Evaluation & LLMOps | Prompt eval, tracing, LangSmith, experiment tracking |
| 16 | Production AI Systems | APIs, queues, caching, streaming, scaling |
| 17 | Safety, Guardrails & Reliability | Input filtering, hallucination mitigation, prompt injection |
| 18 | Advanced Topics | Fine-tuning, tool learning, hybrid LLM+symbolic |
| 19 | Building Real AI Applications | Chatbot, document QA, coding assistant, full-stack |
| 20 | Future of AI Applications | Autonomous agents, self-improving, multi-modal, AI OS |
Every generation of computing has required a universal integration protocol to unlock its full potential. The web needed HTTP to connect browsers to servers. Mobile needed REST APIs to connect apps to backends. Peripherals needed USB to connect devices to computers. Now, the age of AI agents needs a protocol to connect LLMs to the rest of the world.
That protocol is the Model Context Protocol (MCP) — an open standard originally developed by Anthropic and now adopted across the industry. MCP defines a structured, vendor-neutral way for AI applications to discover tools, access data, execute actions, and interact with external systems through a clean client-server architecture.
Before MCP, every AI integration was a bespoke engineering effort. Connecting Claude to a database required different code than connecting GPT-4 to the same database. Every framework had its own tool definition format, its own transport mechanism, its own error handling. The result was a fragmented ecosystem where integration work consumed more engineering time than building actual AI capabilities.
Key Insight: MCP is not a replacement for LangChain, LangGraph, or any orchestration framework. It operates at a different layer entirely — it standardizes the interface between AI applications and external capabilities. Think of MCP as the protocol layer that orchestration frameworks build on top of, just as HTTP is the protocol that web frameworks build on top of.
1. What MCP Solves & Why It Matters
Every AI application that interacts with external data or tools faces the same integration challenge: building and maintaining custom connectors for each combination of AI model and external service. The Model Context Protocol (MCP) solves this with a universal, open standard — analogous to how USB standardized hardware peripherals. Instead of N×M custom integrations, MCP provides a single protocol that any AI client can use to connect to any MCP-compatible server, dramatically reducing integration complexity.
1.1 The Integration Problem
Consider the landscape before MCP. You want your AI agent to access a database, read files, call a REST API, and search the web. Here is what that looked like:
```python
# BEFORE MCP: Every integration is custom, vendor-specific, and fragile
# Each tool requires its own implementation for each LLM provider
# pip install openai anthropic langchain langchain-community
import json
import sqlite3

# --- OpenAI function calling (vendor-specific format) ---
openai_tools = [
    {
        "type": "function",
        "function": {
            "name": "query_database",
            "description": "Run a SQL query against the customer database",
            "parameters": {
                "type": "object",
                "properties": {
                    "sql": {"type": "string", "description": "SQL query to execute"},
                    "database": {"type": "string", "description": "Database name"}
                },
                "required": ["sql"]
            }
        }
    }
]

# --- Anthropic tool calling (different format for the same tool) ---
anthropic_tools = [
    {
        "name": "query_database",
        "description": "Run a SQL query against the customer database",
        "input_schema": {
            "type": "object",
            "properties": {
                "sql": {"type": "string", "description": "SQL query to execute"},
                "database": {"type": "string", "description": "Database name"}
            },
            "required": ["sql"]
        }
    }
]

# --- LangChain tool definition (yet another format) ---
from langchain.tools import tool

@tool
def query_database(sql: str, database: str = "customers") -> str:
    """Run a SQL query against the customer database."""
    conn = sqlite3.connect(database)
    cursor = conn.cursor()
    cursor.execute(sql)
    results = cursor.fetchall()
    conn.close()
    return json.dumps(results)

# The SAME tool defined THREE different ways for THREE different systems
# Multiply this by 50 tools and 5 LLM providers = 250 definitions to maintain
# Change the tool schema? Update it in all 250 places.
```
This fragmentation creates cascading problems: vendor lock-in (your tools only work with one provider), maintenance overhead (N tools × M providers = N×M implementations), no interoperability (tools built for Claude cannot be reused with GPT-4), and zero standardization (every integration is a snowflake).
1.2 HTTP for AI Agents — The USB-C Analogy
The best way to understand MCP's role is through analogies to protocols that solved similar fragmentation problems in other domains:
| Domain | Before Standardization | After Standardization | Protocol |
|---|---|---|---|
| Web | Custom protocols per service (Gopher, FTP, WAIS) | Universal browser-to-server communication | HTTP/HTTPS |
| Peripherals | Serial, parallel, PS/2, FireWire, proprietary ports | One connector for everything | USB / USB-C |
| Databases | Vendor-specific query languages per database | Universal query language across all databases | SQL / ODBC |
| APIs | SOAP, XML-RPC, custom binary protocols | Uniform resource-based API design | REST / OpenAPI |
| AI Agents | Custom tool formats per provider (OpenAI, Anthropic, LangChain) | Universal tool/resource/prompt protocol | MCP |
The USB-C Moment: Before USB-C, you needed different cables for your phone, laptop, headphones, and monitor. MCP is the USB-C moment for AI integrations — one protocol that lets any AI host connect to any capability server. Build an MCP server once, and it works with Claude Desktop, Cursor, Windsurf, VS Code, and any future MCP-compatible host.
1.3 MCP vs Alternatives — Comprehensive Comparison
MCP did not emerge in a vacuum. Several approaches to LLM-tool integration existed before it. Here is how they compare across the dimensions that matter most for production systems:
| Criterion | OpenAI Function Calling | ChatGPT Plugins | LangChain Tools | AutoGen Tools | MCP |
|---|---|---|---|---|---|
| Vendor Lock-in | High — OpenAI only | Total — ChatGPT only | Medium — LangChain ecosystem | Medium — AutoGen ecosystem | None — vendor-neutral open standard |
| Modularity | Low — tools embedded in API call | Low — monolithic plugin manifest | Medium — Python decorators | Medium — function registration | High — decoupled server per capability |
| Interoperability | None across providers | None — deprecated by OpenAI | Within LangChain only | Within AutoGen only | Universal — any host to any server |
| Standardization | De facto standard for OpenAI | Abandoned (2024) | Community conventions | Microsoft conventions | Open specification with formal schema |
| Transport Options | HTTPS only | HTTPS only | In-process Python | In-process Python | STDIO, HTTP/SSE, WebSocket, gRPC |
| Capability Discovery | None — tools hardcoded | Static manifest file | Runtime introspection | Runtime introspection | Dynamic discovery protocol |
| Primitives | Tools only | Tools + auth | Tools + retrievers | Tools + code exec | Resources + Tools + Prompts + Sampling |
| Security Model | API key per call | OAuth (limited) | Application-level | Application-level | OAuth 2.0, JWT, mTLS, RBAC, sandboxing |
Why ChatGPT Plugins Failed: OpenAI launched ChatGPT Plugins in March 2023 with great fanfare, then quietly deprecated them by early 2024. The fundamental flaw was centralization — plugins had to be approved by OpenAI, hosted on specific infrastructure, and only worked within ChatGPT. MCP learned from this failure by being fully open, decentralized, and host-agnostic.
1.4 Core Design Principles
MCP was designed around five principles that distinguish it from every prior approach to AI-tool integration:
Design Principles
The Five Pillars of MCP Design
- Separation of Concerns: Hosts manage LLM interaction and UI. Clients manage protocol connections. Servers expose capabilities. Each component has a single responsibility and can be developed, deployed, and scaled independently.
- Composability: An agent can connect to multiple MCP servers simultaneously — a database server, a web search server, a file system server — and the host orchestrates them seamlessly. Capabilities compose like UNIX pipes.
- Least Privilege: Each MCP server declares exactly what capabilities it exposes, and clients can restrict which capabilities they request. A file-reading server never needs database write access. Permissions are granular and explicit.
- Deterministic Tool Interfaces: Every tool has a JSON Schema definition that specifies its inputs and outputs precisely. There is no ambiguity about what a tool expects or returns. The LLM sees the schema and can generate valid invocations reliably.
- Transport Agnosticism: MCP works over STDIO (for local processes), HTTP with Server-Sent Events (for web services), WebSocket (for bidirectional streaming), and gRPC (for high-performance). The protocol is the same regardless of transport.
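The "deterministic tool interfaces" principle can be made concrete: before executing anything, a host can check an LLM-generated invocation against the tool's declared schema. The sketch below is a minimal stdlib-only illustration, not part of the MCP SDK; a real host would use a full JSON Schema validator, and the `is_valid_invocation` helper (covering only `type`, `required`, and `properties`) is hypothetical.

```python
# Illustrative sketch: reject tool calls that do not match the declared schema.
# Only "required", "properties", and primitive "type" checks are implemented.
TYPE_MAP = {"string": str, "integer": int, "object": dict, "boolean": bool}

def is_valid_invocation(arguments: dict, schema: dict) -> bool:
    """Check required keys and primitive types against the tool schema."""
    for key in schema.get("required", []):
        if key not in arguments:
            return False
    for key, value in arguments.items():
        prop = schema.get("properties", {}).get(key)
        if prop is None:
            return False  # reject arguments the schema does not declare
        expected = TYPE_MAP.get(prop.get("type"))
        if expected is not None and not isinstance(value, expected):
            return False
    return True

tool_schema = {
    "type": "object",
    "properties": {
        "sql": {"type": "string", "description": "SQL SELECT query"},
        "limit": {"type": "integer", "description": "Max rows"},
    },
    "required": ["sql"],
}

print(is_valid_invocation({"sql": "SELECT * FROM users", "limit": 10}, tool_schema))  # True
print(is_valid_invocation({"limit": 10}, tool_schema))  # False: required "sql" missing
```

Because the schema travels with the tool definition, this validation can happen in the host before any server is contacted, turning malformed LLM output into a cheap local error instead of a remote failure.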
```python
# AFTER MCP: Define a tool ONCE, use it everywhere
# The same MCP server works with Claude, GPT-4, Gemini, Llama, any host
# pip install mcp
import json
import sqlite3

from mcp.server import Server
from mcp.types import Tool, TextContent

# Create an MCP server — one tool definition, universal compatibility
server = Server("database-server")

@server.list_tools()
async def list_tools():
    """Declare available tools via the MCP discovery protocol."""
    return [
        Tool(
            name="query_database",
            description="Run a read-only SQL query against the customer database",
            inputSchema={
                "type": "object",
                "properties": {
                    "sql": {
                        "type": "string",
                        "description": "SQL SELECT query to execute"
                    },
                    "database": {
                        "type": "string",
                        "description": "Database name",
                        "default": "customers.db"
                    }
                },
                "required": ["sql"]
            }
        )
    ]

@server.call_tool()
async def call_tool(name: str, arguments: dict):
    """Execute a tool invocation from any MCP-compatible host."""
    if name == "query_database":
        sql = arguments["sql"]
        database = arguments.get("database", "customers.db")
        # Security: only allow SELECT queries
        if not sql.strip().upper().startswith("SELECT"):
            return [TextContent(
                type="text",
                text="Error: Only SELECT queries are allowed for safety."
            )]
        conn = sqlite3.connect(database)
        cursor = conn.cursor()
        cursor.execute(sql)
        results = cursor.fetchall()
        columns = [desc[0] for desc in cursor.description]
        conn.close()
        # Return structured results
        formatted = [dict(zip(columns, row)) for row in results]
        return [TextContent(type="text", text=json.dumps(formatted, indent=2))]
    raise ValueError(f"Unknown tool: {name}")

# This SINGLE server definition works with:
# - Claude Desktop (via STDIO transport)
# - Cursor IDE (via STDIO transport)
# - Windsurf (via STDIO transport)
# - Any custom MCP host (via HTTP/SSE or WebSocket)
# - Future AI hosts that adopt MCP
# Define once. Run everywhere.
```
History
How MCP Evolved from Anthropic's Tool-Use Experience
MCP grew out of Anthropic's internal experience building Claude's tool-use capabilities. When Anthropic launched Claude's tool calling in 2024, they encountered the same integration fragmentation that plagued the entire industry. Every enterprise customer had to write custom glue code to connect Claude to their systems.
Anthropic recognized that this was not a Claude-specific problem — it was an industry problem. In late 2024, they open-sourced MCP as a vendor-neutral specification, explicitly designing it so that competitors could adopt it. By early 2025, Cursor, Windsurf, VS Code, Replit, and dozens of other tools had adopted the protocol, validating the design. By 2026, MCP has become the de facto standard for AI-tool integration, with over 3,000 community-built MCP servers covering databases, APIs, developer tools, business applications, and more.
2. Architecture Deep Dive
MCP follows a client-server architecture built on JSON-RPC 2.0, with clearly defined roles: hosts (AI applications like Claude Desktop or VS Code), clients (protocol connectors that maintain 1:1 server connections), and servers (lightweight services exposing tools, resources, and prompts). This layered design enables secure, composable integrations where each server is sandboxed and the host controls which capabilities are exposed to the AI model.
2.1 System Overview
MCP follows a layered architecture with clear separation between four components. Every MCP interaction flows through this chain:
```text
MCP Architecture Overview

+------------------+     +------------------+     +------------------+     +------------------+
|                  |     |                  |     |                  |     |                  |
|    MCP HOST      |<--->|   MCP CLIENT     |<--->|   MCP SERVER     |<--->|   DATA / APIs    |
|                  |     |                  |     |                  |     |                  |
|  Claude Desktop  |     |  Protocol Layer  |     |  Capability      |     |  Databases       |
|  Cursor IDE      |     |  Session Mgmt    |     |  Provider        |     |  REST APIs       |
|  Windsurf        |     |  Retry Logic     |     |  Tools           |     |  File Systems    |
|  Custom App      |     |  Transport       |     |  Resources       |     |  SaaS Services   |
|                  |     |                  |     |  Prompts         |     |  Vector DBs      |
+------------------+     +------------------+     +------------------+     +------------------+

A single HOST manages multiple CLIENTs.
Each CLIENT connects to exactly ONE SERVER.
Each SERVER exposes capabilities from one or more DATA sources.

Example: Claude Desktop (host) manages 5 clients, each connected to
a different server: filesystem, database, GitHub, Slack, web-search.
```
2.2 MCP Hosts
An MCP Host is the application that the user interacts with directly. It manages the LLM, renders the UI, and orchestrates one or more MCP clients. The host is responsible for the overall user experience and for deciding how to route capabilities from multiple servers.
| Host Application | Type | MCP Integration | Notable Features |
|---|---|---|---|
| Claude Desktop | Desktop App | Native MCP support via STDIO | First MCP host; JSON config for server registration; auto-starts servers |
| Cursor | IDE | MCP servers for code tools | Integrates MCP tools into code completion and chat; project-level config |
| Windsurf | IDE | MCP for IDE extensions | Cascade agent uses MCP tools; supports multi-server configurations |
| VS Code + Copilot | IDE Extension | MCP via extension API | GitHub Copilot Chat integrates MCP servers for workspace context |
| Custom Applications | Any app | MCP SDK integration | Build your own host using the mcp Python or TypeScript SDK |
```json
// Claude Desktop MCP configuration (claude_desktop_config.json, e.g.
// ~/Library/Application Support/Claude/ on macOS)
// This tells the host which MCP servers to launch and how to connect
{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/Users/dev/projects"],
      "env": {}
    },
    "database": {
      "command": "python",
      "args": ["-m", "mcp_server_sqlite", "--db-path", "/Users/dev/data/app.db"],
      "env": {}
    },
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": {
        "GITHUB_PERSONAL_ACCESS_TOKEN": "ghp_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
      }
    },
    "web-search": {
      "command": "python",
      "args": ["-m", "mcp_server_brave_search"],
      "env": {
        "BRAVE_API_KEY": "BSAxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
      }
    }
  }
}
```
Host Responsibilities: The host manages the LLM conversation loop, presents discovered tools to the LLM, routes tool calls to the appropriate client/server, handles user consent for sensitive operations, and aggregates results back into the conversation context. The host is the orchestrator — it never executes tools directly.
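The routing responsibility above can be sketched in a few lines: at discovery time the host builds a table mapping each tool name to the client that owns it, then dispatches every LLM tool call through that table. The `MockClient` and `Host` classes below are illustrative stand-ins, not the SDK's API.

```python
# Sketch: a host routing tool calls to the owning client.
# MockClient stands in for an MCP ClientSession; all names are illustrative.
class MockClient:
    def __init__(self, tools: dict):
        self._tools = tools  # tool name -> callable (simulates a server)

    def list_tool_names(self):
        return list(self._tools)

    def call_tool(self, name: str, arguments: dict):
        return self._tools[name](**arguments)

class Host:
    """Builds a tool-name -> client routing table at discovery time."""
    def __init__(self):
        self._routes = {}

    def register(self, client: MockClient):
        for tool_name in client.list_tool_names():
            self._routes[tool_name] = client

    def dispatch(self, name: str, arguments: dict):
        # The host never executes tools itself; it forwards to the right client
        return self._routes[name].call_tool(name, arguments)

host = Host()
host.register(MockClient({"read_file": lambda path: f"<contents of {path}>"}))
host.register(MockClient({"query": lambda sql: f"<rows for {sql}>"}))
print(host.dispatch("query", {"sql": "SELECT 1"}))  # <rows for SELECT 1>
```

Because the table is rebuilt whenever a server's tool list changes, servers can add or remove capabilities at runtime without the host hardcoding anything.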
2.3 MCP Clients
An MCP Client is the protocol layer that sits between the host and a server. Each client manages a single connection to a single server. The host creates one client per server it needs to communicate with.
```python
# Building an MCP client that connects to a server
# This demonstrates the client lifecycle: connect, discover, invoke, disconnect
# pip install mcp
import asyncio
import os

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def run_mcp_client():
    """Demonstrate the full MCP client lifecycle."""
    # Step 1: Define server connection parameters
    # STDIO transport — the server runs as a child process
    server_params = StdioServerParameters(
        command="python",                    # Command to launch the server
        args=["-m", "mcp_server_sqlite",     # Server module
              "--db-path", "customers.db"],  # Server-specific arguments
        env={                                # Environment variables
            "PATH": os.getenv("PATH", ""),
            "LOG_LEVEL": "INFO"
        }
    )
    # Step 2: Connect to the server via STDIO transport
    async with stdio_client(server_params) as (read_stream, write_stream):
        async with ClientSession(read_stream, write_stream) as session:
            # Step 3: Initialize the session (protocol handshake)
            await session.initialize()
            print("Session initialized successfully")

            # Step 4: Discover available capabilities
            # List all tools the server exposes
            tools_response = await session.list_tools()
            print(f"\nAvailable tools ({len(tools_response.tools)}):")
            for tool in tools_response.tools:
                print(f"  - {tool.name}: {tool.description}")
                print(f"    Schema: {tool.inputSchema}")

            # List all resources the server exposes
            resources_response = await session.list_resources()
            print(f"\nAvailable resources ({len(resources_response.resources)}):")
            for resource in resources_response.resources:
                print(f"  - {resource.uri}: {resource.name}")

            # List all prompt templates
            prompts_response = await session.list_prompts()
            print(f"\nAvailable prompts ({len(prompts_response.prompts)}):")
            for prompt in prompts_response.prompts:
                print(f"  - {prompt.name}: {prompt.description}")

            # Step 5: Invoke a tool
            result = await session.call_tool(
                "query_database",
                arguments={"sql": "SELECT name, email FROM customers LIMIT 5"}
            )
            print(f"\nTool result: {result.content[0].text}")

            # Step 6: Read a resource
            resource_content = await session.read_resource("sqlite:///customers/schema")
            print(f"\nResource content: {resource_content.contents[0].text}")

            # Step 7: Get a prompt template
            prompt_result = await session.get_prompt(
                "analyze-table",
                arguments={"table_name": "customers"}
            )
            print(f"\nPrompt template: {prompt_result.messages[0].content.text}")

    print("\nSession closed. Client disconnected.")

# Run the client
# asyncio.run(run_mcp_client())
```
Key client responsibilities include:
- Connection management: Establishing, maintaining, and gracefully closing connections to servers
- Session handling: Managing the protocol handshake (Initialize), capability negotiation, and session state
- Streaming: Handling streamed responses for long-running operations via Server-Sent Events or WebSocket
- Retry and fault tolerance: Implementing exponential backoff, connection pooling, and circuit-breaker patterns for unreliable servers
- Message serialization: Converting between the host's internal format and MCP's JSON-RPC message format
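The retry responsibility above is worth a concrete sketch. Below is a minimal exponential-backoff wrapper with jitter; the `call_with_backoff` helper, its timings, and the `flaky_call` stand-in are illustrative, not part of the MCP SDK.

```python
# Sketch: exponential backoff with jitter for retrying a flaky server call.
# The helper and its parameters are illustrative, not SDK API.
import random
import time

def call_with_backoff(fn, max_attempts=4, base_delay=0.5):
    """Retry fn(), doubling the delay after each failed attempt."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the host
            # 0.5s, 1s, 2s, ... plus a little jitter to avoid thundering herds
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1)
            time.sleep(delay)

# Simulate a server that fails twice, then succeeds
attempts = {"n": 0}
def flaky_call():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("server unavailable")
    return "ok"

print(call_with_backoff(flaky_call, base_delay=0.01))  # "ok" after two retries
```

In a production client this wrapper would sit around the transport layer, often combined with a circuit breaker so a persistently failing server is taken out of rotation rather than retried forever.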
2.4 MCP Servers
An MCP Server is the component that exposes actual capabilities — tools, resources, and prompts — to clients. Each server is a focused, single-purpose service that wraps one domain (a database, an API, a file system) in the MCP protocol.
```python
# Complete MCP server implementation with all four primitive types
# This server provides database access with full MCP capability exposure
# pip install mcp aiosqlite
import json
import os

import aiosqlite
from mcp.server import Server
from mcp.types import (
    GetPromptResult, Prompt, PromptArgument, PromptMessage,
    Resource, TextContent, Tool
)

# Database path from environment variable
DB_PATH = os.getenv("MCP_DB_PATH", "app.db")

# Create the MCP server instance
server = Server("enterprise-database-server")

# --- TOOLS: Actions the LLM can execute ---
@server.list_tools()
async def list_tools():
    """Expose database query and write tools."""
    return [
        Tool(
            name="query",
            description="Execute a read-only SQL SELECT query",
            inputSchema={
                "type": "object",
                "properties": {
                    "sql": {"type": "string", "description": "SQL SELECT query"},
                    "limit": {"type": "integer", "description": "Max rows", "default": 100}
                },
                "required": ["sql"]
            }
        ),
        Tool(
            name="insert_record",
            description="Insert a new record into a table",
            inputSchema={
                "type": "object",
                "properties": {
                    "table": {"type": "string", "description": "Target table name"},
                    "data": {
                        "type": "object",
                        "description": "Column-value pairs to insert",
                        "additionalProperties": True
                    }
                },
                "required": ["table", "data"]
            }
        )
    ]

@server.call_tool()
async def call_tool(name: str, arguments: dict):
    """Handle tool invocations with safety checks."""
    if name == "query":
        sql = arguments["sql"].strip()
        limit = arguments.get("limit", 100)
        # Security: only allow SELECT statements
        if not sql.upper().startswith("SELECT"):
            return [TextContent(type="text", text="Error: Only SELECT queries allowed.")]
        # Enforce row limit
        if "LIMIT" not in sql.upper():
            sql = f"{sql} LIMIT {limit}"
        async with aiosqlite.connect(DB_PATH) as db:
            db.row_factory = aiosqlite.Row
            cursor = await db.execute(sql)
            rows = await cursor.fetchall()
            columns = [d[0] for d in cursor.description]
        results = [dict(zip(columns, row)) for row in rows]
        return [TextContent(type="text", text=json.dumps(results, indent=2, default=str))]
    elif name == "insert_record":
        table = arguments["table"]
        data = arguments["data"]
        # Security: validate table name (prevent SQL injection)
        if not table.isalnum():
            return [TextContent(type="text", text="Error: Invalid table name.")]
        columns = ", ".join(data.keys())
        placeholders = ", ".join(["?"] * len(data))
        values = list(data.values())
        async with aiosqlite.connect(DB_PATH) as db:
            await db.execute(
                f"INSERT INTO {table} ({columns}) VALUES ({placeholders})",
                values
            )
            await db.commit()
        return [TextContent(type="text", text=f"Successfully inserted record into {table}.")]
    raise ValueError(f"Unknown tool: {name}")

# --- RESOURCES: Data the LLM can read ---
@server.list_resources()
async def list_resources():
    """Expose database schema and table data as readable resources."""
    resources = [
        Resource(
            uri="db://schema",
            name="Database Schema",
            description="Complete schema of all tables in the database",
            mimeType="application/json"
        )
    ]
    # Dynamically list all tables as resources
    async with aiosqlite.connect(DB_PATH) as db:
        cursor = await db.execute(
            "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
        )
        tables = await cursor.fetchall()
    for (table_name,) in tables:
        resources.append(Resource(
            uri=f"db://tables/{table_name}",
            name=f"Table: {table_name}",
            description=f"Sample data and schema for the {table_name} table",
            mimeType="application/json"
        ))
    return resources

@server.read_resource()
async def read_resource(uri) -> str:
    """Return resource content for a given URI."""
    uri = str(uri)  # the SDK may pass a pydantic AnyUrl; normalize to str
    if uri == "db://schema":
        async with aiosqlite.connect(DB_PATH) as db:
            cursor = await db.execute(
                "SELECT sql FROM sqlite_master WHERE type='table'"
            )
            schemas = await cursor.fetchall()
        return json.dumps([s[0] for s in schemas], indent=2)
    if uri.startswith("db://tables/"):
        table_name = uri.split("/")[-1]
        if not table_name.isalnum():
            return "Error: Invalid table name"
        async with aiosqlite.connect(DB_PATH) as db:
            # Return schema + sample rows
            cursor = await db.execute(f"PRAGMA table_info({table_name})")
            columns = await cursor.fetchall()
            cursor = await db.execute(f"SELECT * FROM {table_name} LIMIT 10")
            rows = await cursor.fetchall()
        col_names = [c[1] for c in columns]
        return json.dumps({
            "table": table_name,
            "columns": [{"name": c[1], "type": c[2], "nullable": not c[3]} for c in columns],
            "sample_rows": [dict(zip(col_names, row)) for row in rows],
            "row_count": len(rows)
        }, indent=2, default=str)
    raise ValueError(f"Unknown resource URI: {uri}")

# --- PROMPTS: Reusable templates for the LLM ---
@server.list_prompts()
async def list_prompts():
    """Expose reusable prompt templates."""
    return [
        Prompt(
            name="analyze-table",
            description="Generate a comprehensive analysis prompt for a database table",
            arguments=[
                PromptArgument(
                    name="table_name",
                    description="Name of the table to analyze",
                    required=True
                ),
                PromptArgument(
                    name="focus",
                    description="Analysis focus: 'quality', 'patterns', or 'summary'",
                    required=False
                )
            ]
        ),
        Prompt(
            name="write-query",
            description="Generate a prompt to help write a SQL query for a specific question",
            arguments=[
                PromptArgument(
                    name="question",
                    description="Natural language question to answer with SQL",
                    required=True
                )
            ]
        )
    ]

@server.get_prompt()
async def get_prompt(name: str, arguments: dict) -> GetPromptResult:
    """Return a populated prompt template."""
    if name == "analyze-table":
        table_name = arguments["table_name"]
        focus = arguments.get("focus", "summary")
        # Fetch schema for context
        async with aiosqlite.connect(DB_PATH) as db:
            cursor = await db.execute(f"PRAGMA table_info({table_name})")
            columns = await cursor.fetchall()
        schema_info = ", ".join([f"{c[1]} ({c[2]})" for c in columns])
        return GetPromptResult(messages=[
            PromptMessage(
                role="user",
                content=TextContent(
                    type="text",
                    text=f"Analyze the '{table_name}' table with focus on {focus}.\n\n"
                         f"Schema: {schema_info}\n\n"
                         f"Please provide:\n"
                         f"1. Data quality assessment\n"
                         f"2. Key patterns and distributions\n"
                         f"3. Potential issues or anomalies\n"
                         f"4. Recommended queries for deeper analysis"
                )
            )
        ])
    elif name == "write-query":
        question = arguments["question"]
        # Fetch full schema for context
        async with aiosqlite.connect(DB_PATH) as db:
            cursor = await db.execute(
                "SELECT sql FROM sqlite_master WHERE type='table'"
            )
            schemas = await cursor.fetchall()
        schema_text = "\n".join([s[0] for s in schemas if s[0]])
        return GetPromptResult(messages=[
            PromptMessage(
                role="user",
                content=TextContent(
                    type="text",
                    text=f"Write a SQL query to answer: {question}\n\n"
                         f"Database schema:\n{schema_text}\n\n"
                         f"Requirements:\n"
                         f"- Use only SELECT statements\n"
                         f"- Include appropriate JOINs if needed\n"
                         f"- Add LIMIT clause for safety\n"
                         f"- Explain the query logic"
                )
            )
        ])
    raise ValueError(f"Unknown prompt: {name}")
```
Server design considerations for production:
- Stateless vs Stateful: Prefer stateless servers where possible. State should live in the data layer (database, cache), not the server process. This enables horizontal scaling and fault tolerance.
- Latency: Tool invocations should complete within 5 seconds for interactive use. For long-running operations, use streaming responses to provide progress updates.
- Observability: Instrument every tool call with structured logging, request tracing (correlation IDs), and metrics (latency histograms, error rates).
- Idempotency: Write operations should be idempotent when possible. If the client retries a failed insert, it should not create duplicate records.
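The idempotency consideration above can be sketched with a client-supplied idempotency key: hash the request, and if the same key arrives again (a retry), return the cached result instead of writing twice. The in-memory `seen` store and the `insert_record` helper below are illustrative; a production server would persist the keys in its database.

```python
# Sketch: making a write tool idempotent via a derived idempotency key.
# The in-memory `seen` dict is illustrative; production would persist keys.
import hashlib
import json

seen: dict = {}  # idempotency key -> previously returned result

def insert_record(table: str, data: dict) -> str:
    """Insert a record at most once per unique (table, data) payload."""
    key = hashlib.sha256(
        json.dumps({"table": table, "data": data}, sort_keys=True).encode()
    ).hexdigest()
    if key in seen:
        return seen[key]  # retried call: return the cached result, no new row
    result = f"inserted into {table}"  # ...perform the actual INSERT here...
    seen[key] = result
    return result

first = insert_record("users", {"name": "Ada"})
retry = insert_record("users", {"name": "Ada"})  # retry creates no duplicate
print(first == retry)  # True
```

Canonicalizing with `sort_keys=True` matters: two JSON payloads with the same fields in different order hash to the same key, so a retry is recognized even if the client re-serializes the arguments.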
2.5 Data Layer Integration
MCP servers bridge the gap between AI agents and the data they need. The data layer spans both local and remote sources:
| Data Source | Type | MCP Integration Pattern | Example Server |
|---|---|---|---|
| Local Filesystem | Local | Resources for reading, Tools for writing | @modelcontextprotocol/server-filesystem |
| SQLite / PostgreSQL | Local / Remote | Resources for schema, Tools for queries | mcp-server-sqlite, mcp-server-postgres |
| REST / GraphQL APIs | Remote | Tools that wrap HTTP calls | Custom server per API |
| SaaS Platforms | Remote | Tools for CRUD operations on SaaS entities | mcp-server-github, mcp-server-slack |
| Vector Databases | Local / Remote | Resources for similarity search, Tools for indexing | Custom server wrapping Chroma, Pinecone, Qdrant |
| Knowledge Graphs | Remote | Resources for traversal, Tools for queries | Custom server wrapping Neo4j, Amazon Neptune |
```python
# MCP server wrapping a vector database for semantic search
# Demonstrates the Resources pattern for RAG-style retrieval
# pip install mcp chromadb sentence-transformers
import hashlib
import json
import os

import chromadb
from mcp.server import Server
from mcp.types import Resource, TextContent, Tool

# Initialize ChromaDB with persistent storage
CHROMA_PATH = os.getenv("CHROMA_DB_PATH", "./chroma_data")
chroma_client = chromadb.PersistentClient(path=CHROMA_PATH)

server = Server("vector-search-server")

@server.list_tools()
async def list_tools():
    """Expose semantic search and indexing tools."""
    return [
        Tool(
            name="semantic_search",
            description="Search the knowledge base using natural language",
            inputSchema={
                "type": "object",
                "properties": {
                    "query": {"type": "string", "description": "Natural language search query"},
                    "collection": {"type": "string", "description": "Collection name", "default": "documents"},
                    "top_k": {"type": "integer", "description": "Number of results", "default": 5}
                },
                "required": ["query"]
            }
        ),
        Tool(
            name="index_document",
            description="Add a document to the knowledge base",
            inputSchema={
                "type": "object",
                "properties": {
                    "text": {"type": "string", "description": "Document text to index"},
                    "metadata": {"type": "object", "description": "Document metadata"},
                    "collection": {"type": "string", "description": "Target collection", "default": "documents"}
                },
                "required": ["text"]
            }
        )
    ]

@server.call_tool()
async def call_tool(name: str, arguments: dict):
    """Handle vector search and indexing operations."""
    if name == "semantic_search":
        query = arguments["query"]
        collection_name = arguments.get("collection", "documents")
        top_k = arguments.get("top_k", 5)
        collection = chroma_client.get_or_create_collection(collection_name)
        results = collection.query(query_texts=[query], n_results=top_k)
        # Format results with metadata and relevance scores
        formatted = []
        for i, (doc, meta, dist) in enumerate(zip(
            results["documents"][0],
            results["metadatas"][0],
            results["distances"][0]
        )):
            formatted.append({
                "rank": i + 1,
                "text": doc,
                "metadata": meta,
                "similarity_score": round(1 - dist, 4)  # Convert distance to similarity
            })
        return [TextContent(type="text", text=json.dumps(formatted, indent=2))]
    elif name == "index_document":
        text = arguments["text"]
        metadata = arguments.get("metadata", {})
        collection_name = arguments.get("collection", "documents")
        collection = chroma_client.get_or_create_collection(collection_name)
        # Generate a deterministic ID for idempotency
        doc_id = hashlib.sha256(text.encode()).hexdigest()[:16]
        collection.upsert(
            documents=[text],
            metadatas=[metadata],
            ids=[doc_id]
        )
        return [TextContent(
            type="text",
            text=f"Document indexed successfully. ID: {doc_id}, Collection: {collection_name}"
        )]
    raise ValueError(f"Unknown tool: {name}")

@server.list_resources()
async def list_resources():
    """Expose collection metadata as resources."""
    collections = chroma_client.list_collections()
    return [
        Resource(
            uri=f"vector://collections/{col.name}",
            name=f"Collection: {col.name}",
            description=f"Metadata and stats for the {col.name} vector collection",
            mimeType="application/json"
        )
        for col in collections
    ]
```
Case Study
Claude Desktop's MCP Ecosystem
Claude Desktop was the first production MCP host, and its ecosystem demonstrates the power of the protocol. A typical power user's Claude Desktop configuration connects to 5-10 MCP servers simultaneously:
- Filesystem server — read and write project files directly from chat
- GitHub server — create issues, review PRs, search repositories
- Slack server — send messages, search conversations, manage channels
- PostgreSQL server — query production databases (read-only)
- Brave Search server — real-time web search with citations
Claude can seamlessly combine capabilities: "Search GitHub for open issues about authentication, check the relevant code files, query the database for affected users, and draft a Slack message to the team with your findings." One prompt, five MCP servers, zero custom integration code.
3. Core MCP Primitives
MCP defines four core primitives that cover the full spectrum of AI-system interactions. Together they form a complete vocabulary: read data (Resources), take actions (Tools), reuse templates (Prompts), and delegate reasoning (Sampling).
3.1 Resources (READ) — Structured Data Access
Resources represent data that the LLM can read but not modify. They are identified by URIs and return structured or unstructured content. Think of Resources as a read-only API for the LLM's knowledge.
| Resource Type | URI Pattern | Content | Use Case |
|---|---|---|---|
| Documents | file:///docs/guide.md | Markdown, PDF, text | Knowledge base articles, documentation |
| Database Queries | db://tables/users/schema | JSON schema, sample rows | Schema discovery, data previews |
| API Responses | api://weather/current | JSON data | Real-time data feeds |
| Vector Search Results | vector://search?q=deployment | Ranked document chunks | Semantic retrieval for RAG |
| Configuration | config://app/settings | JSON/YAML config | Application state, feature flags |
Advanced resource patterns include: pagination (using cursor-based or offset parameters in the URI), filtering (query parameters that narrow results), chunking (splitting large documents into LLM-friendly sizes), and caching (ETags or last-modified headers to avoid re-fetching unchanged data).
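As a sketch of the pagination pattern, here is a cursor-based resource handler. The docs:// scheme, the in-memory data store, and the helper name are all illustrative, not part of the MCP SDK:

```python
# Illustrative sketch: cursor-based pagination for a large resource.
# The docs:// scheme and in-memory store are hypothetical stand-ins.
import json
from urllib.parse import urlparse, parse_qs

DOCS = [f"Document chunk {i}" for i in range(25)]  # stand-in data store
PAGE_SIZE = 10

def read_paginated_resource(uri: str) -> str:
    """Return one page of items plus a cursor for fetching the next page."""
    query = parse_qs(urlparse(uri).query)
    cursor = int(query.get("cursor", ["0"])[0])
    page = DOCS[cursor:cursor + PAGE_SIZE]
    has_more = cursor + PAGE_SIZE < len(DOCS)
    return json.dumps({
        "items": page,
        # The client passes nextCursor back as ?cursor=... to page forward
        "nextCursor": cursor + PAGE_SIZE if has_more else None,
    })

first_page = json.loads(read_paginated_resource("docs://all"))
print(first_page["nextCursor"])  # 10
```

Returning the cursor inside the payload keeps the server stateless: each request carries everything needed to produce the next page.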
3.2 Tools (ACT) — Executable Actions
Tools are the workhorse of MCP — they let the LLM do things. Unlike Resources (read-only), Tools can have side effects: writing to databases, calling APIs, sending emails, creating files. Every Tool is defined by a JSON Schema that makes its interface completely explicit.
# Advanced tool patterns: idempotency, side-effect control, tool chaining
# Demonstrates production-grade tool implementation
# pip install mcp httpx
import os
import json
import hashlib
import httpx
from datetime import datetime, timezone
from mcp.server import Server
from mcp.types import Tool, TextContent
# API key from environment
GITHUB_TOKEN = os.getenv("GITHUB_TOKEN", "")
server = Server("github-tools-server")
@server.list_tools()
async def list_tools():
"""Expose GitHub operations as MCP tools."""
return [
Tool(
name="create_issue",
description="Create a GitHub issue with title, body, and labels",
inputSchema={
"type": "object",
"properties": {
"repo": {
"type": "string",
"description": "Repository in 'owner/name' format"
},
"title": {
"type": "string",
"description": "Issue title",
"maxLength": 256
},
"body": {
"type": "string",
"description": "Issue body (supports Markdown)"
},
"labels": {
"type": "array",
"items": {"type": "string"},
"description": "Labels to apply",
"default": []
},
"idempotency_key": {
"type": "string",
"description": "Unique key to prevent duplicate creation"
}
},
"required": ["repo", "title", "body"]
}
),
Tool(
name="search_code",
description="Search for code across GitHub repositories",
inputSchema={
"type": "object",
"properties": {
"query": {
"type": "string",
"description": "Search query (supports GitHub search syntax)"
},
"language": {
"type": "string",
"description": "Filter by programming language"
},
"max_results": {
"type": "integer",
"description": "Maximum results to return",
"default": 10,
"maximum": 50
}
},
"required": ["query"]
}
)
]
@server.call_tool()
async def call_tool(name: str, arguments: dict):
"""Execute GitHub tool invocations with safety and idempotency."""
headers = {
"Authorization": f"Bearer {GITHUB_TOKEN}",
"Accept": "application/vnd.github.v3+json",
"X-GitHub-Api-Version": "2022-11-28"
}
async with httpx.AsyncClient(base_url="https://api.github.com") as client:
if name == "create_issue":
repo = arguments["repo"]
title = arguments["title"]
body = arguments["body"]
labels = arguments.get("labels", [])
# Idempotency: check if issue with same title already exists
idempotency_key = arguments.get(
"idempotency_key",
hashlib.sha256(f"{repo}:{title}".encode()).hexdigest()[:12]
)
            # List recent issues and check for a duplicate title (best-effort dedupe)
search_response = await client.get(
f"/repos/{repo}/issues",
headers=headers,
params={"state": "all", "per_page": 5}
)
if search_response.status_code == 200:
existing = [
i for i in search_response.json()
if i.get("title") == title
]
if existing:
return [TextContent(
type="text",
text=json.dumps({
"status": "already_exists",
"issue_number": existing[0]["number"],
"url": existing[0]["html_url"],
"message": "Issue with identical title already exists."
}, indent=2)
)]
# Create the issue
response = await client.post(
f"/repos/{repo}/issues",
headers=headers,
json={
"title": title,
"body": f"{body}\n\n---\n_idempotency_key: {idempotency_key}_",
"labels": labels
}
)
if response.status_code == 201:
issue = response.json()
return [TextContent(
type="text",
text=json.dumps({
"status": "created",
"issue_number": issue["number"],
"url": issue["html_url"],
"created_at": issue["created_at"]
}, indent=2)
)]
else:
return [TextContent(
type="text",
text=f"Error creating issue: {response.status_code} - {response.text}"
)]
elif name == "search_code":
query = arguments["query"]
language = arguments.get("language", "")
max_results = arguments.get("max_results", 10)
# Build GitHub search query
search_query = query
if language:
search_query += f" language:{language}"
response = await client.get(
"/search/code",
headers=headers,
params={"q": search_query, "per_page": min(max_results, 50)}
)
if response.status_code == 200:
data = response.json()
results = []
for item in data.get("items", [])[:max_results]:
results.append({
"repository": item["repository"]["full_name"],
"path": item["path"],
"url": item["html_url"],
"score": item.get("score", 0)
})
return [TextContent(
type="text",
text=json.dumps({
"total_count": data.get("total_count", 0),
"results": results
}, indent=2)
)]
else:
return [TextContent(
type="text",
text=f"Search error: {response.status_code} - {response.text}"
)]
raise ValueError(f"Unknown tool: {name}")
3.3 Prompts (REUSE) — Reusable Templates
MCP Prompts are reusable, parameterized templates that servers expose for common interaction patterns. They are not just strings — they are structured message sequences that can include system prompts, user messages, and even pre-filled assistant responses.
# Advanced prompt patterns: versioning, parameterization, injection defense
# Demonstrates production-grade prompt templates
# pip install mcp
from mcp.server import Server
from mcp.types import (
Prompt, PromptArgument, PromptMessage, TextContent
)
server = Server("prompt-library-server")
# Prompt template registry with versioning
PROMPT_TEMPLATES = {
"code-review": {
"version": "2.1",
"description": "Generate a thorough code review with security analysis",
"system_prompt": (
"You are a senior software engineer conducting a code review. "
"Focus on: correctness, security vulnerabilities, performance, "
"readability, and adherence to best practices. "
"IMPORTANT: Never execute code suggestions. Only analyze and recommend."
),
"user_template": (
"Please review the following {language} code:\n\n"
"```{language}\n{code}\n```\n\n"
"Context: {context}\n\n"
"Focus areas: {focus_areas}\n\n"
"Provide your review in the following format:\n"
"1. Summary (1-2 sentences)\n"
"2. Critical Issues (security, correctness)\n"
"3. Improvements (performance, readability)\n"
"4. Positive Aspects\n"
"5. Suggested Refactoring (with code examples)"
)
},
"incident-response": {
"version": "1.3",
"description": "Guide incident response and root cause analysis",
"system_prompt": (
"You are an SRE incident commander. Help analyze the incident, "
"identify root causes, and recommend mitigations. "
"Be systematic and prioritize by severity. "
"CRITICAL: Do not suggest running destructive commands."
),
"user_template": (
"Incident: {incident_title}\n"
"Severity: {severity}\n"
"Service: {service_name}\n"
"Symptoms: {symptoms}\n"
"Timeline: {timeline}\n\n"
"Please provide:\n"
"1. Initial assessment and severity validation\n"
"2. Likely root causes (ranked by probability)\n"
"3. Immediate mitigation steps\n"
"4. Investigation queries to run\n"
"5. Post-incident action items"
)
}
}
@server.list_prompts()
async def list_prompts():
"""Expose versioned prompt templates with parameter definitions."""
return [
Prompt(
name="code-review",
description=f"[v{PROMPT_TEMPLATES['code-review']['version']}] "
f"{PROMPT_TEMPLATES['code-review']['description']}",
arguments=[
PromptArgument(name="code", description="Code to review", required=True),
PromptArgument(name="language", description="Programming language", required=True),
PromptArgument(name="context", description="PR context or description", required=False),
PromptArgument(name="focus_areas", description="Specific areas to focus on", required=False)
]
),
Prompt(
name="incident-response",
description=f"[v{PROMPT_TEMPLATES['incident-response']['version']}] "
f"{PROMPT_TEMPLATES['incident-response']['description']}",
arguments=[
PromptArgument(name="incident_title", description="Incident title", required=True),
PromptArgument(name="severity", description="P0-P4", required=True),
PromptArgument(name="service_name", description="Affected service", required=True),
PromptArgument(name="symptoms", description="Observed symptoms", required=True),
PromptArgument(name="timeline", description="Event timeline", required=False)
]
)
]
@server.get_prompt()
async def get_prompt(name: str, arguments: dict):
"""Return populated prompt with injection defense."""
if name not in PROMPT_TEMPLATES:
raise ValueError(f"Unknown prompt: {name}")
template = PROMPT_TEMPLATES[name]
    # Injection defense: sanitize all user-provided string arguments.
    # The marker list is illustrative; real defenses need more than
    # substring filtering.
    injection_markers = [
        "IGNORE PREVIOUS INSTRUCTIONS",
        "<system>",
        "</system>",
    ]
    sanitized_args = {}
    for key, value in arguments.items():
        if isinstance(value, str):
            sanitized = value
            for marker in injection_markers:
                sanitized = sanitized.replace(marker, "[FILTERED]")
            sanitized_args[key] = sanitized
        else:
            sanitized_args[key] = value
# Fill defaults for optional arguments
sanitized_args.setdefault("context", "No additional context provided")
sanitized_args.setdefault("focus_areas", "All areas")
sanitized_args.setdefault("timeline", "Not provided")
# Build the message sequence
user_text = template["user_template"].format(**sanitized_args)
return {
"messages": [
PromptMessage(
role="assistant",
content=TextContent(type="text", text=template["system_prompt"])
),
PromptMessage(
role="user",
content=TextContent(type="text", text=user_text)
)
]
}
3.4 Sampling (THINK) — Delegated Reasoning
Sampling is MCP's most distinctive primitive. It inverts the normal flow: instead of the host asking the server to execute a tool, the server asks the host to perform an LLM completion. This enables servers to leverage AI reasoning without needing their own LLM access.
Important Distinction: Sampling requests flow from server to host, the opposite direction of tool calls. The server says "I need the LLM to reason about this data before I can continue." The host fulfills the request using its LLM and returns the result. This keeps all LLM interaction centralized in the host while letting servers participate in the reasoning chain.
Key sampling use cases:
- Tool result summarization: A database server returns 500 rows, then asks the host to summarize the key findings before returning to the user
- Multi-step planning: A code analysis server asks the host to plan which files to examine based on an initial code scan
- Content classification: A content moderation server asks the host to classify user input before deciding which tool to invoke
- Error interpretation: A deployment server encounters an error log and asks the host to interpret the stack trace before suggesting remediation
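On the wire, such a request is a sampling/createMessage call traveling server-to-client. A minimal sketch of constructing that JSON-RPC message (the builder function is ours; the field names follow the MCP specification):

```python
# Hypothetical helper that assembles a sampling/createMessage request;
# the params shape follows the MCP specification.
import json

def build_sampling_request(prompt: str, request_id: int, max_tokens: int = 500) -> str:
    """Build the JSON-RPC message a server sends to delegate reasoning to the host."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "sampling/createMessage",
        "params": {
            "messages": [
                {"role": "user", "content": {"type": "text", "text": prompt}}
            ],
            "systemPrompt": "You are a concise data analyst.",
            "maxTokens": max_tokens,
        },
    })

# A database server asking the host to summarize a large query result:
print(build_sampling_request("Summarize the key trends in these 500 rows: ...", request_id=7))
```

The host fulfills the request with its own LLM and returns the completion in the JSON-RPC response; the server never touches model credentials.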
3.5 Transports (CONNECT) — Communication Channels
MCP is transport-agnostic — the same protocol messages flow regardless of how they are delivered. The specification standardizes STDIO and HTTP-based transports (HTTP + SSE, superseded by Streamable HTTP in the 2025-03-26 spec revision); others, such as WebSocket and gRPC, can be implemented as custom transports. The choice depends on the deployment context:
| Transport | Mechanism | Best For | Latency | Scalability |
|---|---|---|---|---|
| STDIO | Standard input/output of child process | Local development, desktop apps (Claude Desktop, Cursor) | Lowest (~1ms) | Single user only |
| HTTP + SSE | HTTP POST for requests, Server-Sent Events for streaming | Web services, multi-user, cloud deployments | Low (~10-50ms) | Horizontal scaling via load balancer |
| WebSocket | Persistent bidirectional connection | Real-time streaming, long-lived sessions | Lowest for sustained connections (~5ms) | Good with connection managers |
| gRPC | Protocol Buffers over HTTP/2 | High-performance microservices, large payloads | Lowest at scale (~2-5ms) | Excellent — built for microservices |
# Running the same MCP server over different transports
# The server logic is identical — only the transport layer changes
# pip install mcp uvicorn
import asyncio
from mcp.server import Server
from mcp.server.stdio import stdio_server
from mcp.server.sse import SseServerTransport
from mcp.types import Tool, TextContent
# Create the server (transport-independent)
server = Server("multi-transport-demo")
@server.list_tools()
async def list_tools():
"""Same tools regardless of transport."""
return [
Tool(
name="greet",
description="Generate a greeting message",
inputSchema={
"type": "object",
"properties": {
"name": {"type": "string", "description": "Name to greet"}
},
"required": ["name"]
}
)
]
@server.call_tool()
async def call_tool(name: str, arguments: dict):
if name == "greet":
return [TextContent(type="text", text=f"Hello, {arguments['name']}!")]
raise ValueError(f"Unknown tool: {name}")
# --- Transport Option 1: STDIO (for local/desktop use) ---
async def run_stdio():
"""Run as a child process communicating via stdin/stdout."""
async with stdio_server() as (read_stream, write_stream):
await server.run(read_stream, write_stream, server.create_initialization_options())
# --- Transport Option 2: HTTP + SSE (for web/cloud use) ---
def create_sse_app():
"""Create an HTTP app with Server-Sent Events transport."""
from starlette.applications import Starlette
from starlette.routing import Route
sse_transport = SseServerTransport("/messages")
async def handle_sse(request):
"""Handle SSE connections from MCP clients."""
async with sse_transport.connect_sse(
request.scope, request.receive, request._send
) as streams:
await server.run(
streams[0], streams[1],
server.create_initialization_options()
)
async def handle_messages(request):
"""Handle incoming JSON-RPC messages via HTTP POST."""
await sse_transport.handle_post_message(
request.scope, request.receive, request._send
)
app = Starlette(routes=[
Route("/sse", endpoint=handle_sse),
Route("/messages", endpoint=handle_messages, methods=["POST"]),
])
return app
# To run STDIO: asyncio.run(run_stdio())
# To run HTTP/SSE: uvicorn.run(create_sse_app(), host="0.0.0.0", port=8000)
4. The FastMCP Python SDK — Hands-On Guide
Now that you understand the MCP architecture and its core primitives conceptually, let us write real code. The FastMCP SDK is the official high-level Python library that makes building MCP servers simple and intuitive. If the previous sections explained the what and why of MCP, this section covers the how.
Analogy: FastMCP is to MCP what Flask is to HTTP — it handles the protocol plumbing (JSON-RPC messages, schema generation, transport negotiation) so you can focus entirely on your tools and data. You write normal Python functions; FastMCP converts them into fully compliant MCP capabilities automatically.
Getting Started with FastMCP
FastMCP lives in the mcp Python package. Install it with pip and create your first server in just three lines:
# pip install "mcp[cli]" httpx
from mcp.server.fastmcp import FastMCP
# Create an MCP server instance — the name identifies your server to clients
mcp = FastMCP("my-awesome-server")
That is it. The FastMCP constructor takes a server name (this appears in client UIs like Claude Desktop or Cursor) and returns a server instance. You then add capabilities to this instance using decorators.
The mcp[cli] install extra includes the mcp command-line tool for testing and debugging your servers. The httpx library is commonly used for making async HTTP requests inside your tools.
@mcp.tool() — Exposing Functions as Tools
The @mcp.tool() decorator is the most important concept in FastMCP. It takes a normal Python function and registers it as a tool that any connected LLM can call. The key insight is that you write a normal Python function with good docstrings and type hints, and FastMCP converts it into a fully-described, schema-validated MCP tool automatically.
Here is a concrete example — a weather alerts tool from the Anthropic documentation:
@mcp.tool()
async def get_alerts(state: str) -> str:
"""Get weather alerts for a US state.
Args:
state: Two-letter US state code (e.g. CA, NY)
"""
url = f"{NWS_API_BASE}/alerts/active/area/{state}"
data = await make_nws_request(url)
if not data or "features" not in data:
return "Unable to fetch alerts or no alerts found."
alerts = [format_alert(f) for f in data["features"]]
return "\n---\n".join(alerts)
When an LLM sees this tool, it knows: name=get_alerts, takes a state string, returns weather alerts. The docstring tells the LLM when to use it, and the Args section tells it how to call it correctly.
Here is exactly how FastMCP translates your Python code into the MCP protocol:
| Your Code | MCP Schema | Purpose |
|---|---|---|
| Function name | tool.name | How the LLM identifies the tool |
| Docstring | tool.description | How the LLM decides to use it |
| Type hints | inputSchema (JSON Schema) | Input validation |
| Args docstring | Parameter descriptions | Helps LLM provide correct arguments |
| Return type | Output format | What the LLM receives back |
This is the magic of FastMCP: good Python practices (type hints, docstrings) directly become good MCP tool descriptions. There is no separate schema file, no configuration YAML, no manual JSON Schema authoring.
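To make the mapping concrete, here is a toy approximation of the hint-to-schema conversion. This mimics, and is not, the SDK's actual generator; real schemas may be richer and vary by version:

```python
# Toy approximation of FastMCP's hint-to-schema conversion (illustrative;
# the real SDK generator may emit richer schemas).
import inspect
from typing import get_type_hints

def convert_temperature(celsius: float, unit: str = "F") -> str:
    """Convert a Celsius temperature to Fahrenheit or Kelvin."""
    ...

PY_TO_JSON = {float: "number", int: "integer", str: "string", bool: "boolean"}

def approximate_input_schema(fn) -> dict:
    """Mimic the mapping: parameter names, hints, and defaults -> JSON Schema."""
    sig = inspect.signature(fn)
    hints = get_type_hints(fn)
    props, required = {}, []
    for name, param in sig.parameters.items():
        props[name] = {"type": PY_TO_JSON.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default means the LLM must supply it
    return {"type": "object", "properties": props, "required": required}

print(approximate_input_schema(convert_temperature))
# {'type': 'object', 'properties': {'celsius': {'type': 'number'},
#  'unit': {'type': 'string'}}, 'required': ['celsius']}
```

Note how the default value on unit makes it optional in the schema, exactly the behavior the table above describes.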
@mcp.resource() — Exposing Data to LLMs
The @mcp.resource() decorator exposes read-only data that the application can retrieve. Unlike tools (which the LLM decides to call), resources are application-controlled — the host application decides when to read them. Think of resources as files or API endpoints that provide context to the LLM.
Resources use URI patterns — static URIs for fixed data and URI templates for dynamic data:
@mcp.resource("config://app")
def get_config() -> str:
"""Get the current application configuration."""
return json.dumps({"theme": "dark", "language": "en", "version": "2.1.0"})
@mcp.resource("users://{user_id}/profile")
def get_user_profile(user_id: str) -> str:
"""Get a user's profile by their ID."""
# In production, this would query a database
profiles = {"alice": "Alice Smith - Engineer", "bob": "Bob Jones - Designer"}
return profiles.get(user_id, f"User {user_id} not found")
The first resource uses a static URI (config://app) — there is exactly one configuration. The second uses a URI template (users://{user_id}/profile) — the {user_id} placeholder means the application can request any user's profile by substituting the ID.
@mcp.prompt() — Reusable Templates
The @mcp.prompt() decorator creates reusable prompt templates that users can invoke. These are user-controlled — think of them like slash commands in Slack or Discord. Prompts let you package complex, well-crafted instructions into simple, parameterized templates.
@mcp.prompt()
def review_code(code: str, language: str = "python") -> str:
"""Review code for bugs and improvements.
Args:
code: The source code to review
language: Programming language of the code
"""
return f"""Please review this {language} code for:
1. Bugs and potential errors
2. Performance improvements
3. Security vulnerabilities
4. Code style and best practices
Code to review:
```{language}
{code}
```"""
When a user selects this prompt in their MCP client, they are asked to provide the code and optionally the language. The template then generates a well-structured review request for the LLM.
Running Your MCP Server
With your tools, resources, and prompts defined, running your server is a single line. The transport parameter determines how clients connect:
# Run with STDIO transport (for local clients like Claude Desktop, Cursor)
mcp.run(transport="stdio")
Here is a complete, copy-paste-ready weather server that ties everything together:
# weather_server.py — Complete MCP Weather Server
# pip install "mcp[cli]" httpx
import json
import httpx
from mcp.server.fastmcp import FastMCP
# Create the MCP server
mcp = FastMCP("weather-server")
NWS_API_BASE = "https://api.weather.gov"
USER_AGENT = "weather-app/1.0"
async def make_nws_request(url: str) -> dict | None:
    """Make a request to the NWS API with proper headers; return None on failure."""
    headers = {"User-Agent": USER_AGENT, "Accept": "application/geo+json"}
    async with httpx.AsyncClient() as client:
        try:
            response = await client.get(url, headers=headers, timeout=30.0)
            response.raise_for_status()
            return response.json()
        except httpx.HTTPError:
            return None  # Callers treat None as "unable to fetch"
def format_alert(feature: dict) -> str:
"""Format a single weather alert for display."""
props = feature["properties"]
return f"""
Event: {props.get('event', 'Unknown')}
Area: {props.get('areaDesc', 'Unknown')}
Severity: {props.get('severity', 'Unknown')}
Description: {props.get('description', 'No description')}
Instructions: {props.get('instruction', 'No instructions')}
"""
@mcp.tool()
async def get_alerts(state: str) -> str:
"""Get weather alerts for a US state.
Args:
state: Two-letter US state code (e.g. CA, NY)
"""
url = f"{NWS_API_BASE}/alerts/active/area/{state}"
data = await make_nws_request(url)
if not data or "features" not in data:
return "Unable to fetch alerts or no alerts found."
alerts = [format_alert(f) for f in data["features"]]
return "\n---\n".join(alerts) if alerts else "No active alerts for this state."
@mcp.tool()
async def get_forecast(latitude: float, longitude: float) -> str:
"""Get the weather forecast for a location.
Args:
latitude: Latitude of the location
longitude: Longitude of the location
"""
# First get the forecast grid endpoint
points_url = f"{NWS_API_BASE}/points/{latitude},{longitude}"
points_data = await make_nws_request(points_url)
if not points_data:
return "Unable to fetch forecast data for this location."
forecast_url = points_data["properties"]["forecast"]
forecast_data = await make_nws_request(forecast_url)
if not forecast_data:
return "Unable to fetch forecast."
periods = forecast_data["properties"]["periods"][:5]
forecasts = [f"{p['name']}: {p['detailedForecast']}" for p in periods]
return "\n---\n".join(forecasts)
@mcp.resource("config://weather")
def get_weather_config() -> str:
"""Get the weather server configuration and supported regions."""
return json.dumps({
"api": "National Weather Service",
"coverage": "United States",
"update_frequency": "Every 15 minutes"
})
@mcp.prompt()
def weather_briefing(state: str) -> str:
"""Generate a comprehensive weather briefing for a state.
Args:
state: Two-letter US state code (e.g. CA, NY)
"""
return f"""Please provide a comprehensive weather briefing for {state}:
1. Check current weather alerts
2. Summarize any severe weather warnings
3. Provide a general outlook
Use the available weather tools to gather this information."""
# Start the server
if __name__ == "__main__":
mcp.run(transport="stdio")
To connect this server to Claude Desktop, add it to your Claude Desktop configuration file:
{
"mcpServers": {
"weather": {
"command": "python",
"args": ["weather_server.py"]
}
}
}
Key Insight: The entire FastMCP SDK is designed around one principle — write normal Python functions with good docstrings and type hints, and FastMCP handles everything else: schema generation, validation, transport, and protocol compliance. The three decorators map directly to three of the four core MCP primitives: @mcp.tool() = LLM calls it, @mcp.resource() = app reads it, @mcp.prompt() = user invokes it. (The fourth, Sampling, flows the other way: servers request it from the host rather than exposing it via a decorator.)
5. Protocol Flow & Lifecycle
Understanding MCP’s protocol lifecycle is essential for building reliable integrations. Every MCP session follows a structured flow: initialization (capability negotiation), operation (request/response and notification exchanges), and shutdown (graceful disconnection). This section traces the complete message flow from connection establishment through tool invocation, showing how JSON-RPC messages coordinate between client and server at each stage.
5.1 End-to-End Flow
Understanding the complete message flow is essential for debugging and optimizing MCP-based systems. Here is the full lifecycle of a user request through the MCP stack:
# End-to-End MCP Flow: User asks "How many active users do we have?"
#
# Step 1: USER INPUT
# User types: "How many active users do we have?"
#
# Step 2: HOST PROCESSING
# Claude Desktop receives the message.
# Host adds the message to the conversation context.
# Host includes tool descriptions from all connected MCP servers.
#
# Step 3: LLM DECISION
# Claude sees available tools: [query_database, search_web, read_file, ...]
# Claude decides to use "query_database" tool.
# Claude generates: {"tool": "query_database", "args": {"sql": "SELECT COUNT(*) ..."}}
#
# Step 4: HOST -> CLIENT -> SERVER
# Host identifies which client manages the database server.
# Client sends JSON-RPC tool invocation to the server.
# Server executes the SQL query against the actual database.
#
# Step 5: SERVER -> CLIENT -> HOST
# Server returns: {"result": [{"count": 12847}]}
# Client passes result back to host.
# Host injects tool result into the conversation context.
#
# Step 6: LLM SYNTHESIS
# Claude sees the tool result in context.
# Claude generates: "You currently have 12,847 active users."
#
# Step 7: USER RESPONSE
# Host renders Claude's response in the chat UI.
# Total time: ~2-4 seconds (LLM inference dominates)
5.2 Message Types
MCP uses JSON-RPC 2.0 as its message format. Every interaction between client and server is a JSON-RPC message. The key message types in the protocol lifecycle are:
| Phase | Message Type | Direction | Purpose |
|---|---|---|---|
| Initialization | initialize | Client -> Server | Protocol handshake, version negotiation, capability exchange |
| Initialization | notifications/initialized | Client -> Server | Confirms initialization is complete |
| Discovery | tools/list | Client -> Server | Request list of available tools with schemas |
| Discovery | resources/list | Client -> Server | Request list of available resources with URIs |
| Discovery | prompts/list | Client -> Server | Request list of available prompt templates |
| Invocation | tools/call | Client -> Server | Execute a tool with provided arguments |
| Invocation | resources/read | Client -> Server | Fetch content of a resource by URI |
| Invocation | prompts/get | Client -> Server | Get a populated prompt template |
| Sampling | sampling/createMessage | Server -> Client | Request LLM completion from the host |
| Notifications | notifications/tools/list_changed | Server -> Client | Server's available tools have changed |
| Error | JSON-RPC error | Either direction | Structured error with code, message, and data |
5.3 State Management
MCP carefully separates stateless and stateful concerns across the architecture:
- Servers are preferably stateless: Each tool invocation should be self-contained. The server receives all necessary context in the request and returns a complete response. This allows servers to be restarted, scaled horizontally, or replaced without losing state.
- Clients are session-aware: Clients maintain connection state (session ID, negotiated capabilities, transport state) for the duration of a session. If a server restarts, the client re-initializes and re-discovers capabilities.
- Hosts manage conversation context: The host is responsible for managing the conversation history, context window budget, and deciding which tool results to include in the LLM prompt. This is where context window management becomes critical — a tool that returns 10,000 tokens of data may need to be summarized before injection.
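A host-side sketch of that budget check. The 4-characters-per-token heuristic, the function names, and the blunt truncation policy are all illustrative simplifications:

```python
# Simplified host-side guard: decide whether a tool result fits the
# context budget before injecting it into the LLM prompt.
def estimate_tokens(text: str) -> int:
    """Crude heuristic: roughly 4 characters per token for English text."""
    return len(text) // 4

def prepare_tool_result(result: str, budget_tokens: int = 2000) -> str:
    """Pass the raw result through if it fits; otherwise truncate and flag it.

    A production host would summarize instead, possibly by issuing a
    sampling request back through MCP.
    """
    if estimate_tokens(result) <= budget_tokens:
        return result
    keep_chars = budget_tokens * 4
    return result[:keep_chars] + "\n[... truncated: result exceeded context budget]"

# A 100k-character tool result gets cut down before prompt injection
print(prepare_tool_result("x" * 100_000, budget_tokens=100)[-50:])
```

Keeping this logic in the host, rather than the server, is what lets the same server work with hosts that have very different context windows.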
5.4 Complete MCP Message Flow Simulation
To see the full protocol in action, this simulation traces every JSON-RPC message exchanged during a complete MCP session — from initialization handshake through tool discovery, resource listing, tool execution, and graceful shutdown. Running this simulation reveals the exact message structure and sequencing that real MCP clients and servers use, making it an invaluable reference for debugging protocol-level issues.
# Complete MCP protocol flow simulation
# This demonstrates every message type in the correct sequence
# pip install mcp
import json
import asyncio
from datetime import datetime, timezone
from dataclasses import dataclass, field
from typing import Any
@dataclass
class MCPMessage:
"""Represents a single MCP JSON-RPC message."""
jsonrpc: str = "2.0"
method: str = ""
params: dict = field(default_factory=dict)
result: Any = None
    error: dict | None = None
    id: int | None = None
def to_json(self):
"""Serialize to JSON-RPC format."""
msg = {"jsonrpc": self.jsonrpc}
if self.method:
msg["method"] = self.method
if self.params:
msg["params"] = self.params
if self.result is not None:
msg["result"] = self.result
if self.error:
msg["error"] = self.error
if self.id is not None:
msg["id"] = self.id
return json.dumps(msg, indent=2)
def simulate_mcp_flow():
    """Simulate the complete MCP protocol flow with all message types."""
    messages = []
    msg_id = 0

    # --- Phase 1: Initialization ---
    msg_id += 1
    init_request = MCPMessage(
        method="initialize",
        params={
            "protocolVersion": "2025-03-26",
            "capabilities": {
                "tools": {},      # Client supports tool invocations
                "resources": {},  # Client supports resource reading
                "prompts": {},    # Client supports prompt templates
                "sampling": {}    # Client supports sampling requests
            },
            "clientInfo": {
                "name": "claude-desktop",
                "version": "1.5.0"
            }
        },
        id=msg_id
    )
    messages.append(("CLIENT -> SERVER", init_request))

    # Server responds with its capabilities
    init_response = MCPMessage(
        result={
            "protocolVersion": "2025-03-26",
            "capabilities": {
                "tools": {"listChanged": True},    # Server supports tool change notifications
                "resources": {"subscribe": True},  # Server supports resource subscriptions
                "prompts": {"listChanged": True},  # Server supports prompt change notifications
                "sampling": {}                     # Server may request LLM completions
            },
            "serverInfo": {
                "name": "enterprise-database-server",
                "version": "2.1.0"
            }
        },
        id=msg_id
    )
    messages.append(("SERVER -> CLIENT", init_response))

    # Client confirms initialization (a notification, so it carries no id)
    initialized = MCPMessage(method="notifications/initialized")
    messages.append(("CLIENT -> SERVER", initialized))

    # --- Phase 2: Capability Discovery ---
    msg_id += 1
    list_tools_request = MCPMessage(method="tools/list", params={}, id=msg_id)
    messages.append(("CLIENT -> SERVER", list_tools_request))

    list_tools_response = MCPMessage(
        result={
            "tools": [
                {
                    "name": "query_database",
                    "description": "Execute a read-only SQL query",
                    "inputSchema": {
                        "type": "object",
                        "properties": {
                            "sql": {"type": "string", "description": "SQL SELECT query"},
                            "limit": {"type": "integer", "default": 100}
                        },
                        "required": ["sql"]
                    }
                },
                {
                    "name": "list_tables",
                    "description": "List all tables in the database",
                    "inputSchema": {"type": "object", "properties": {}, "required": []}
                }
            ]
        },
        id=msg_id
    )
    messages.append(("SERVER -> CLIENT", list_tools_response))

    # --- Phase 3: Tool Invocation ---
    msg_id += 1
    tool_call = MCPMessage(
        method="tools/call",
        params={
            "name": "query_database",
            "arguments": {
                "sql": "SELECT COUNT(*) as active_users FROM users WHERE status = 'active'",
                "limit": 1
            }
        },
        id=msg_id
    )
    messages.append(("CLIENT -> SERVER", tool_call))

    tool_result = MCPMessage(
        result={
            "content": [
                {"type": "text", "text": json.dumps([{"active_users": 12847}])}
            ],
            "isError": False
        },
        id=msg_id
    )
    messages.append(("SERVER -> CLIENT", tool_result))

    # --- Phase 4: Resource Read ---
    msg_id += 1
    resource_read = MCPMessage(
        method="resources/read",
        params={"uri": "db://tables/users/schema"},
        id=msg_id
    )
    messages.append(("CLIENT -> SERVER", resource_read))

    resource_response = MCPMessage(
        result={
            "contents": [
                {
                    "uri": "db://tables/users/schema",
                    "mimeType": "application/json",
                    "text": json.dumps({
                        "columns": ["id", "name", "email", "status", "created_at"],
                        "types": ["INTEGER", "TEXT", "TEXT", "TEXT", "TIMESTAMP"]
                    })
                }
            ]
        },
        id=msg_id
    )
    messages.append(("SERVER -> CLIENT", resource_response))

    # --- Phase 5: Error Handling ---
    msg_id += 1
    bad_tool_call = MCPMessage(
        method="tools/call",
        params={
            "name": "query_database",
            "arguments": {"sql": "DROP TABLE users"}  # Dangerous query
        },
        id=msg_id
    )
    messages.append(("CLIENT -> SERVER", bad_tool_call))

    error_response = MCPMessage(
        error={
            "code": -32602,
            "message": "Invalid params: Only SELECT queries are allowed",
            "data": {"attempted_query": "DROP TABLE users", "policy": "read_only"}
        },
        id=msg_id
    )
    messages.append(("SERVER -> CLIENT", error_response))

    # --- Print the complete flow ---
    print("=" * 70)
    print("MCP PROTOCOL FLOW SIMULATION")
    print(f"Timestamp: {datetime.now(timezone.utc).isoformat()}")
    print("=" * 70)
    for i, (direction, msg) in enumerate(messages, 1):
        print(f"\n--- Message {i}: {direction} ---")
        print(msg.to_json())
    print(f"\n{'=' * 70}")
    print(f"Total messages exchanged: {len(messages)}")
    print("Phases covered: Initialization, Discovery, Invocation, Resource Read, Error Handling")
    print("=" * 70)

# Run the simulation
simulate_mcp_flow()
6. Authentication & Security
Security is not an afterthought in MCP — it is built into the protocol's design. Because MCP servers can access databases, filesystems, APIs, and other sensitive systems, a robust security model is essential.
6.1 Authentication Mechanisms
| Mechanism | How It Works | Best For | Complexity |
| --- | --- | --- | --- |
| API Keys | Static secret passed as environment variable or header | Local development, single-user servers | Low |
| OAuth 2.0 | Token-based flow with scopes and refresh | Multi-user, SaaS integrations, delegated access | Medium-High |
| JWT Tokens | Signed tokens with claims (user, permissions, expiry) | Stateless auth across microservices | Medium |
| mTLS | Mutual TLS with client certificates | Zero-trust environments, inter-service auth | High |
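The simplest of these mechanisms, an API key check, can be sketched in a few lines. This is an illustrative sketch, not part of the MCP specification: the header name `X-MCP-API-Key`, the environment variable `MCP_API_KEY`, and the `check_api_key` helper are all assumptions for the example.

```python
import hmac
import os

# Hypothetical header name and env var; real deployments will differ.
API_KEY_HEADER = "X-MCP-API-Key"
EXPECTED_KEY = os.getenv("MCP_API_KEY", "dev-only-key")

def check_api_key(headers: dict) -> bool:
    """Reject requests whose API key header does not match the configured secret.

    hmac.compare_digest performs a constant-time comparison,
    which avoids leaking key prefixes via timing differences.
    """
    supplied = headers.get(API_KEY_HEADER, "")
    return hmac.compare_digest(supplied, EXPECTED_KEY)

print(check_api_key({API_KEY_HEADER: EXPECTED_KEY}))  # True
print(check_api_key({API_KEY_HEADER: "wrong"}))       # False
```

Even in this minimal form, note the two habits worth keeping: the secret comes from the environment rather than the source code, and the comparison is constant-time.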
6.2 Authorization & RBAC
Authentication verifies who you are; authorization determines what you can do. MCP supports fine-grained access control at the tool, resource, and argument level:
# MCP auth middleware: JWT validation + RBAC authorization
# Demonstrates production-grade security for MCP servers
# pip install mcp pyjwt cryptography
import os
import json
import jwt
import time
import functools
from typing import Callable

from mcp.server import Server
from mcp.types import Tool, TextContent

# Security configuration from environment
JWT_SECRET = os.getenv("MCP_JWT_SECRET", "change-me-in-production")
JWT_ALGORITHM = "HS256"

# Role-based access control matrix
# Maps roles to allowed tools and their permitted argument patterns
RBAC_POLICY = {
    "analyst": {
        "allowed_tools": ["query", "list_tables", "describe_table"],
        "restrictions": {
            "query": {
                "sql_must_start_with": "SELECT",                   # Read-only
                "forbidden_tables": ["audit_logs", "credentials"]  # Sensitive tables
            }
        }
    },
    "developer": {
        "allowed_tools": ["query", "list_tables", "describe_table", "insert_record"],
        "restrictions": {
            "query": {
                "sql_must_start_with": "SELECT",
                "forbidden_tables": ["credentials"]
            },
            "insert_record": {
                "allowed_tables": ["logs", "events", "metrics"]
            }
        }
    },
    "admin": {
        "allowed_tools": ["*"],  # All tools
        "restrictions": {}       # No restrictions
    }
}

def validate_jwt_token(token: str) -> dict:
    """Validate a JWT token and extract claims."""
    try:
        payload = jwt.decode(token, JWT_SECRET, algorithms=[JWT_ALGORITHM])
        # Check expiration
        if payload.get("exp", 0) < time.time():
            raise ValueError("Token expired")
        return {
            "user_id": payload["sub"],
            "role": payload.get("role", "analyst"),  # Default to least privilege
            "permissions": payload.get("permissions", []),
            "issued_at": payload.get("iat"),
            "expires_at": payload.get("exp")
        }
    except jwt.InvalidTokenError as e:
        raise ValueError(f"Invalid token: {e}")

def authorize_tool_call(user_claims: dict, tool_name: str, arguments: dict) -> bool:
    """Check if a user is authorized to call a specific tool with given arguments."""
    role = user_claims["role"]
    if role not in RBAC_POLICY:
        return False  # Unknown role — deny by default
    policy = RBAC_POLICY[role]

    # Check if tool is allowed for this role
    allowed = policy["allowed_tools"]
    if "*" not in allowed and tool_name not in allowed:
        return False

    # Check tool-specific restrictions
    restrictions = policy.get("restrictions", {}).get(tool_name, {})
    if tool_name == "query":
        sql = arguments.get("sql", "").strip().upper()
        # Check SQL command restriction
        required_prefix = restrictions.get("sql_must_start_with", "")
        if required_prefix and not sql.startswith(required_prefix):
            return False
        # Check forbidden tables
        forbidden = restrictions.get("forbidden_tables", [])
        for table in forbidden:
            if table.upper() in sql:
                return False
    elif tool_name == "insert_record":
        table = arguments.get("table", "")
        allowed_tables = restrictions.get("allowed_tables", [])
        if allowed_tables and table not in allowed_tables:
            return False
    return True

def generate_sample_jwt(user_id: str, role: str, hours_valid: int = 8) -> str:
    """Generate a sample JWT token for testing."""
    now = int(time.time())
    payload = {
        "sub": user_id,
        "role": role,
        "iat": now,
        "exp": now + (hours_valid * 3600),
        "permissions": RBAC_POLICY.get(role, {}).get("allowed_tools", [])
    }
    return jwt.encode(payload, JWT_SECRET, algorithm=JWT_ALGORITHM)

# --- Demonstration ---
def demonstrate_auth():
    """Show the auth system in action."""
    # Generate tokens for different roles
    analyst_token = generate_sample_jwt("alice", "analyst")
    developer_token = generate_sample_jwt("bob", "developer")
    admin_token = generate_sample_jwt("charlie", "admin")

    print("=== MCP Auth Middleware Demonstration ===\n")

    # Test scenarios
    scenarios = [
        ("Analyst: SELECT query", analyst_token, "query",
         {"sql": "SELECT * FROM users LIMIT 10"}),
        ("Analyst: SELECT from credentials", analyst_token, "query",
         {"sql": "SELECT * FROM credentials"}),
        ("Analyst: INSERT (forbidden tool)", analyst_token, "insert_record",
         {"table": "logs", "data": {"msg": "test"}}),
        ("Developer: SELECT query", developer_token, "query",
         {"sql": "SELECT * FROM users LIMIT 10"}),
        ("Developer: INSERT into logs", developer_token, "insert_record",
         {"table": "logs", "data": {"msg": "test"}}),
        ("Developer: INSERT into users (forbidden)", developer_token, "insert_record",
         {"table": "users", "data": {"name": "hack"}}),
        ("Admin: DELETE query (all allowed)", admin_token, "query",
         {"sql": "DELETE FROM temp_data"}),
    ]

    for description, token, tool, args in scenarios:
        claims = validate_jwt_token(token)
        authorized = authorize_tool_call(claims, tool, args)
        status = "ALLOWED" if authorized else "DENIED"
        print(f"  [{status}] {description}")
        print(f"    Role: {claims['role']}, Tool: {tool}")
        if not authorized:
            print(f"    Reason: Policy violation for role '{claims['role']}'")
        print()

demonstrate_auth()
6.3 Security Best Practices
Production MCP Security Checklist
- Least Privilege: Each server should expose the minimum set of tools required. A file-reading server should never expose file-writing tools unless explicitly needed.
- Input Validation: Validate all tool arguments against their JSON Schema before execution. Reject malformed inputs at the protocol level, not the application level.
- Sandboxing: Run MCP servers in isolated environments (Docker containers, Firecracker VMs, or OS-level sandboxes). Limit filesystem access, network access, and system calls.
- Rate Limiting: Implement per-user and per-tool rate limits to prevent abuse. An agent stuck in a loop could make thousands of tool calls per minute without rate limiting.
- Audit Logging: Log every tool invocation with timestamp, user identity, tool name, arguments (sanitized), result status, and execution duration. These logs are essential for security forensics and compliance.
- Prompt Injection Mitigation: Never pass raw tool results directly into system prompts. Sanitize tool outputs to remove potential injection strings. Mark tool results as "untrusted" in the context window.
- Secret Management: Never embed API keys, database passwords, or other secrets in server code. Use environment variables, secret managers (HashiCorp Vault, AWS Secrets Manager), or secure key stores.
- TLS Everywhere: All HTTP-based MCP transports should use TLS 1.3. For internal service-to-service communication, use mTLS with short-lived certificates.
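To make the rate-limiting item above concrete, here is a minimal per-user token-bucket sketch. The `TokenBucket` class, its parameters, and the user IDs are illustrative choices for this example, not anything mandated by MCP; a production server would typically reach for a shared store (e.g. Redis) rather than in-process state.

```python
import time
from collections import defaultdict

class TokenBucket:
    """Per-user token bucket: refills `rate` tokens/second, bursts up to `capacity`."""

    def __init__(self, rate: float = 5.0, capacity: int = 10):
        self.rate = rate
        self.capacity = capacity
        self.tokens = defaultdict(lambda: float(capacity))  # Each user starts full
        self.last = defaultdict(time.monotonic)             # Last refill timestamp

    def allow(self, user_id: str) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity
        elapsed = now - self.last[user_id]
        self.last[user_id] = now
        self.tokens[user_id] = min(self.capacity, self.tokens[user_id] + elapsed * self.rate)
        # Spend one token per tool call if available
        if self.tokens[user_id] >= 1.0:
            self.tokens[user_id] -= 1.0
            return True
        return False

limiter = TokenBucket(rate=1.0, capacity=3)
results = [limiter.allow("alice") for _ in range(5)]
print(results)  # [True, True, True, False, False] — burst of 3, then denied
```

A looping agent hitting this bucket gets its first burst through and is then throttled to the refill rate, which bounds the damage of a runaway tool-call loop.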
6.4 Data Privacy
MCP's architecture inherently supports privacy-preserving patterns because the server (which accesses data) is separate from the host (which runs the LLM). This separation enables several important privacy architectures:
- Local-first architectures: STDIO transport keeps all data on the user's machine. The MCP server reads local files and databases; data never leaves the device (only the LLM inference call goes to the cloud).
- Secure enclaves: MCP servers can run inside trusted execution environments (Intel SGX, AWS Nitro Enclaves) where even the server operator cannot access the data being processed.
- Encryption at rest: Servers should encrypt any cached or persisted data using AES-256-GCM. Encryption keys should be managed via a KMS, never hardcoded.
- Encryption in transit: All MCP transports (except STDIO, which uses OS process isolation) should use TLS 1.3 for encryption in transit. HTTP/SSE and WebSocket transports must enforce HTTPS/WSS.
- Data minimization: Tools should return only the data the LLM needs, not entire database tables. A query for "active user count" should return the count, not all user records.
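The data-minimization point can be sketched directly. In this hypothetical tool handler (the `USERS` table and `count_active_users` name are invented for the example), the server computes the aggregate itself and returns only that, so no names or email addresses ever enter the LLM's context window:

```python
# Illustrative in-memory stand-in for a real users table.
USERS = [
    {"id": 1, "name": "Ada", "email": "ada@example.com", "status": "active"},
    {"id": 2, "name": "Bob", "email": "bob@example.com", "status": "inactive"},
    {"id": 3, "name": "Cy",  "email": "cy@example.com",  "status": "active"},
]

def count_active_users() -> dict:
    """Return only the aggregate the LLM asked for, never the raw PII rows."""
    return {"active_users": sum(1 for u in USERS if u["status"] == "active")}

print(count_active_users())  # {'active_users': 2}
```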
Case Study
Cursor's MCP Integration — IDE as MCP Host
Cursor, the AI-powered code editor, demonstrates a compelling MCP integration pattern. As an MCP host, Cursor connects to servers that provide:
- Project context: An MCP server that indexes the codebase and exposes semantic code search as a resource
- Build tools: An MCP server that wraps the project's build system (npm, cargo, make) as tools
- Testing: An MCP server that runs test suites and returns structured results
- Documentation: An MCP server that provides framework documentation as resources
The key insight is that Cursor's AI assistant seamlessly combines these capabilities: "Find all usages of the deprecated `authenticate()` function, run the tests to confirm they pass, then refactor each call site to use the new `verify_identity()` function." This requires reading code (resource), executing tests (tool), understanding context (sampling), and writing code (tool) — all through MCP.
Exercises & Self-Assessment
Exercise 1
Build Your First MCP Server
Create a minimal MCP server that wraps a local JSON file as both a Resource and a set of Tools:
- Create a JSON file with 10+ records (e.g., a product catalog, employee directory, or recipe book)
- Implement a Resource that returns the full dataset and a Resource that returns schema information
- Implement Tools: search (filter by field), get_by_id (fetch single record), add_record (append new record)
- Implement a Prompt template for "analyze this dataset"
- Test with the MCP Inspector CLI tool or connect to Claude Desktop
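If you want a starting point for the tool logic in Exercise 1, here is one possible sketch. The dataset, field names, and function signatures are placeholders you would replace with your own JSON file and wire into an MCP server's tool handlers:

```python
# Placeholder records standing in for the JSON file you will create.
CATALOG = [
    {"id": 1, "name": "Espresso Machine", "category": "kitchen"},
    {"id": 2, "name": "Standing Desk",    "category": "office"},
    {"id": 3, "name": "Desk Lamp",        "category": "office"},
]

def search(field: str, value) -> list:
    """Tool: return all records where `field` equals `value`."""
    return [r for r in CATALOG if r.get(field) == value]

def get_by_id(record_id: int):
    """Tool: fetch a single record by id, or None if absent."""
    return next((r for r in CATALOG if r["id"] == record_id), None)

def add_record(record: dict) -> dict:
    """Tool: append a new record, assigning the next free id."""
    record = {**record, "id": max(r["id"] for r in CATALOG) + 1}
    CATALOG.append(record)
    return record

print(len(search("category", "office")))  # 2
print(get_by_id(1)["name"])               # Espresso Machine
```

Once these work standalone, wrapping them as MCP tools is mostly a matter of declaring JSON Schemas for their arguments and registering the handlers.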
Exercise 2
MCP Architecture Diagram
Draw a complete architecture diagram for the following scenario and label every MCP component:
- A customer support application (the Host) that uses Claude as its LLM
- Three MCP servers: (a) Zendesk ticket system, (b) product database, (c) knowledge base with vector search
- Show the flow when a user asks: "What's the status of ticket #4521 and does our warranty cover the reported issue?"
- Label each message type (initialize, tools/list, tools/call, resources/read)
- Identify where authentication occurs and what type you would use for each server
Exercise 3
Security Audit
Review the following MCP server configuration and identify all security vulnerabilities:
- A server that exposes execute_sql with no query validation (accepts any SQL)
- API keys passed as tool arguments instead of environment variables
- No rate limiting on tool invocations
- Tool results returned without sanitization (could contain prompt injection payloads)
- For each vulnerability, write the fix and explain the attack vector it prevents
Exercise 4
Reflective Questions
- Why does MCP separate Hosts, Clients, and Servers into three distinct roles instead of combining them? What would break if they were merged?
- Compare MCP's Resources primitive to a traditional REST API GET endpoint. What does MCP add that REST does not? What does REST provide that MCP does not?
- Explain why Sampling (server-to-host LLM requests) is necessary. Give an example where a server cannot accomplish its task without delegating reasoning to the LLM.
- MCP supports four transports: STDIO, HTTP/SSE, WebSocket, gRPC. For each, describe a scenario where it is the best choice and a scenario where it is the worst choice.
- ChatGPT Plugins failed, but MCP succeeded. Identify three specific design decisions in MCP that address Plugin weaknesses.
Exercise 5
Transport Comparison Lab
Implement the same simple MCP server (a weather lookup tool) using two different transports:
- STDIO transport (for local use with Claude Desktop)
- HTTP/SSE transport (for remote use with a web client)
- Measure the latency difference between the two transports using 100 sequential tool calls
- Write up your findings: When is the latency difference significant? When is it negligible?
Conclusion & Next Steps
You now have a thorough understanding of the Model Context Protocol — the open standard that is becoming the universal integration layer for AI applications. Here are the key takeaways from Part 13:
- MCP solves AI integration fragmentation the same way HTTP solved web communication and USB-C solved peripheral connectivity — through a vendor-neutral, open protocol
- The Host/Client/Server architecture separates concerns cleanly: hosts manage LLM interaction, clients manage protocol connections, servers expose capabilities, and data sources provide the underlying information
- Four core primitives cover every type of AI-system interaction: Resources (read data), Tools (take actions), Prompts (reuse templates), and Sampling (delegate reasoning)
- Transport agnosticism means the same server works locally (STDIO), on the web (HTTP/SSE), in real-time applications (WebSocket), and in high-performance microservices (gRPC)
- The protocol lifecycle follows a clear sequence: initialize, discover capabilities, invoke tools/resources, handle errors — all using JSON-RPC 2.0 messages
- Security is built into the architecture through least privilege, RBAC, JWT/OAuth authentication, input validation, sandboxing, audit logging, and prompt injection mitigation
- MCP outperforms alternatives (OpenAI function calling, LangChain tools, ChatGPT Plugins) on every dimension that matters for production: vendor neutrality, modularity, interoperability, and security
Next in the Series
In Part 14: MCP in Production, we will take everything from this foundational chapter and apply it at scale — building production-grade MCP servers, integrating with real-world APIs and databases, implementing observability and monitoring, scaling MCP architectures, and building complete agent systems that combine multiple MCP servers into powerful autonomous workflows.
Continue the Series
Part 14: MCP in Production
Building production-grade MCP servers, real-world integrations, observability, scaling patterns, and complete agent systems.
Part 12: Ecosystem & Frameworks
LlamaIndex, Haystack, HuggingFace, vLLM — the broader AI framework landscape that MCP integrates with.
Part 15: Evaluation & LLMOps
Prompt evaluation, tracing, LangSmith integration, and experiment tracking for AI applications.