πŸ€– AI Explained
how agents work / 20 min read

MCP: Model Context Protocol

The open protocol that standardizes how AI agents connect to external systems. JSON-RPC internals, transports, the three primitives, and how to build a custom server.

Why MCP Exists

Before MCP, every AI integration was bespoke glue code. The problem:

N models Γ— M services = NΓ—M custom integrations.

MCP solves this: N models + M servers = N+M things to build.

Any MCP client (Claude, Cursor, your own agent) talks to any MCP server (GitHub, filesystem, Postgres) the same way.


What MCP Is at the Protocol Level

MCP is JSON-RPC 2.0 β€” remote procedure calls over JSON.

// Request (client β†’ server)
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": { "name": "read_file", "arguments": { "path": "/etc/hosts" } }
}

// Response (server β†’ client)
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": { "content": [{ "type": "text", "text": "127.0.0.1 localhost\n..." }] }
}

// Notification (either direction, no id, no response expected)
{
  "jsonrpc": "2.0",
  "method": "notifications/progress",
  "params": { "progressToken": "abc", "progress": 50 }
}

No REST, no GraphQL, no custom framing. Just JSON-RPC.
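The three message shapes above are distinguished purely by their fields: a request has a method and an id, a notification has a method but no id, and a response has an id but no method. A minimal Python sketch of that classification (the `classify_message` helper is illustrative, not part of any SDK):

```python
import json

def classify_message(raw: str) -> str:
    """Classify a JSON-RPC 2.0 message by its fields alone."""
    msg = json.loads(raw)
    assert msg.get("jsonrpc") == "2.0"
    if "method" in msg and "id" in msg:
        return "request"        # expects a response carrying the same id
    if "method" in msg:
        return "notification"   # fire-and-forget: no id, no response
    return "response"           # carries "result" or "error" plus the id

classify_message('{"jsonrpc": "2.0", "id": 1, "method": "tools/call"}')    # "request"
classify_message('{"jsonrpc": "2.0", "method": "notifications/progress"}') # "notification"
```

This is why notifications must never carry an id: the id is what correlates a response back to its request.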


Transport Mechanisms

stdio (local)

Claude Code spawns the MCP server as a child process. JSON-RPC travels over stdin/stdout pipes.

// .mcp.json (project root)
{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/home/user"]
    }
  }
}

Use for: local tools, filesystem access, running CLIs.
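The mechanics are plain pipes: the stdio transport frames messages as newline-delimited JSON. A sketch of the client side, using an inline stand-in script instead of a real MCP server (a real host would spawn the command from its config):

```python
import json
import subprocess
import sys

# Stand-in "server": reads one JSON-RPC message per line from stdin, answers
# on stdout. This mimics the stdio framing only; it is not a real MCP server.
FAKE_SERVER = r'''
import json, sys
req = json.loads(sys.stdin.readline())
resp = {"jsonrpc": "2.0", "id": req["id"],
        "result": {"content": [{"type": "text", "text": "hello"}]}}
print(json.dumps(resp), flush=True)
'''

proc = subprocess.Popen([sys.executable, "-c", FAKE_SERVER],
                        stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True)

request = {"jsonrpc": "2.0", "id": 1, "method": "tools/call",
           "params": {"name": "greet", "arguments": {}}}
proc.stdin.write(json.dumps(request) + "\n")   # one message per line
proc.stdin.flush()
response = json.loads(proc.stdout.readline())  # matched by id
proc.wait()
```

Because the child runs as your user with your environment, stdio servers inherit your permissions β€” which is exactly the trust assumption noted in the security section.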

HTTP + SSE (remote)

Client POSTs requests, server pushes responses via Server-Sent Events.

Use for: shared servers, remote infra, multi-client scenarios.

Note: A newer Streamable HTTP transport (spec 2025-03-26) replaces SSE with single-endpoint bidirectional streaming. Not yet universal.


Connection Lifecycle

Client                               Server
  │──── initialize ────────────────▶│  { protocolVersion, capabilities }
  │◀─── initialize result ─────────│  { protocolVersion, capabilities }
  │──── notifications/initialized ─▶│  (client says "ready")
  │         [normal operation]      │
  │──── ping ──────────────────────▶│
  │◀─── (empty result) ────────────│

The capabilities exchange negotiates what features are supported β€” whether the server can notify on tool list changes, whether resources support subscriptions, etc.
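In practice, negotiation means checking the other side's declared capabilities before using a feature. A sketch (the capability dict shape follows the examples in this section; `supports` is an illustrative helper, not an SDK function):

```python
# Capabilities a server might declare in its initialize result.
server_capabilities = {
    "tools": {"listChanged": True},    # server may send tool-list-changed notifications
    "resources": {"subscribe": True},  # client may subscribe to resource updates
}

def supports(capabilities: dict, *path: str) -> bool:
    """Walk a capability path; a feature is usable only if it was declared."""
    node = capabilities
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return False
        node = node[key]
    return bool(node)

supports(server_capabilities, "resources", "subscribe")  # True
supports(server_capabilities, "prompts")                 # False: don't send prompts/list
```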


The Three Primitives

Tools β€” Model-controlled

The LLM decides when to call these.

// tools/list response
{
  "tools": [{
    "name": "query_database",
    "description": "Run a read-only SQL query against the production DB",
    "inputSchema": {
      "type": "object",
      "properties": {
        "sql": { "type": "string" },
        "limit": { "type": "number", "default": 100 }
      },
      "required": ["sql"]
    }
  }]
}
// tools/call response
{
  "content": [{ "type": "text", "text": "[{\"id\": 1, \"name\": \"Alice\"}]" }],
  "isError": false
}
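Because inputSchema is plain JSON Schema, a host can check arguments before dispatching a call. A minimal hand-rolled sketch of that check β€” real hosts would use a full JSON Schema validator β€” enforcing required keys and filling declared defaults:

```python
def check_arguments(arguments: dict, input_schema: dict) -> dict:
    """Enforce required keys and apply declared defaults from a tool's inputSchema."""
    missing = [k for k in input_schema.get("required", []) if k not in arguments]
    if missing:
        raise ValueError(f"missing required arguments: {missing}")
    merged = dict(arguments)
    for key, prop in input_schema.get("properties", {}).items():
        if key not in merged and "default" in prop:
            merged[key] = prop["default"]
    return merged

schema = {
    "type": "object",
    "properties": {"sql": {"type": "string"},
                   "limit": {"type": "number", "default": 100}},
    "required": ["sql"],
}
check_arguments({"sql": "SELECT 1"}, schema)  # {'sql': 'SELECT 1', 'limit': 100}
```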

Resources β€” Application-controlled

Data the LLM or host reads β€” not called like functions, fetched like files.

// resources/read response
{
  "contents": [{
    "uri": "postgres://prod/schema",
    "mimeType": "application/json",
    "text": "{ \"tables\": [...] }"
  }]
}

Resources can be subscribed to β€” the server pushes notifications/resources/updated when content changes. Useful for live log tailing, file watching.
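The subscription exchange in message form β€” a sketch, valid only if the server advertised `resources.subscribe` in its capabilities:

```python
# Client asks to watch a resource.
subscribe_request = {
    "jsonrpc": "2.0", "id": 5,
    "method": "resources/subscribe",
    "params": {"uri": "postgres://prod/schema"},
}

# Later, when the content changes, the server pushes a notification.
# Note: no id β€” the client re-reads the resource rather than replying.
updated_notification = {
    "jsonrpc": "2.0",
    "method": "notifications/resources/updated",
    "params": {"uri": "postgres://prod/schema"},
}
```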

Prompts β€” User-controlled

Reusable templates surfaced to the user, not auto-invoked by the LLM. In Claude Code these appear as slash commands.


Building a Custom MCP Server

Python

from mcp.server import Server
from mcp.server.stdio import stdio_server
from mcp import types

app = Server("my-server")

@app.list_tools()
async def list_tools() -> list[types.Tool]:
    return [
        types.Tool(
            name="get_system_info",
            description="Get basic system information",
            inputSchema={
                "type": "object",
                "properties": {
                    "metric": { "type": "string", "enum": ["cpu", "memory", "disk"] }
                },
                "required": ["metric"]
            }
        )
    ]

@app.call_tool()
async def call_tool(name: str, arguments: dict) -> list[types.TextContent]:
    if name == "get_system_info":
        import psutil
        metric = arguments["metric"]
        if metric == "cpu":
            return [types.TextContent(type="text", text=f"CPU: {psutil.cpu_percent()}%")]
        if metric == "memory":
            return [types.TextContent(type="text", text=f"Memory: {psutil.virtual_memory().percent}%")]
        if metric == "disk":
            return [types.TextContent(type="text", text=f"Disk: {psutil.disk_usage('/').percent}%")]
    raise ValueError(f"Unknown tool: {name}")

async def main():
    async with stdio_server() as (read_stream, write_stream):
        await app.run(read_stream, write_stream, app.create_initialization_options())

if __name__ == "__main__":
    import asyncio
    asyncio.run(main())

TypeScript

import { Server } from "@modelcontextprotocol/sdk/server/index.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { CallToolRequestSchema, ListToolsRequestSchema } from "@modelcontextprotocol/sdk/types.js";

const server = new Server(
  { name: "my-server", version: "1.0.0" },
  { capabilities: { tools: {} } }
);

server.setRequestHandler(ListToolsRequestSchema, async () => ({
  tools: [{
    name: "echo",
    description: "Echoes input back",
    inputSchema: {
      type: "object",
      properties: { message: { type: "string" } },
      required: ["message"]
    }
  }]
}));

server.setRequestHandler(CallToolRequestSchema, async (request) => {
  if (request.params.name === "echo") {
    const { message } = request.params.arguments as { message: string };
    return { content: [{ type: "text", text: message }] };
  }
  throw new Error(`Unknown tool: ${request.params.name}`);
});

await server.connect(new StdioServerTransport());

Error Handling

Two layers:

Layer                        When                          Example
Protocol error               Infrastructure broke          -32601 Method not found
Tool error (isError: true)   Tool ran, reported failure    "Permission denied"

Protocol errors mean the plumbing is broken. Tool errors mean the tool ran β€” the model should reason about them (retry, ask user, try differently).
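The two layers are mechanically distinguishable: a protocol error arrives in the JSON-RPC `error` field, a tool error arrives as a normal `result` with `isError: true`. A sketch of how a host might branch on that (the `classify` helper is illustrative):

```python
protocol_error = {"jsonrpc": "2.0", "id": 7,
                  "error": {"code": -32601, "message": "Method not found"}}

tool_error = {"jsonrpc": "2.0", "id": 8,
              "result": {"content": [{"type": "text", "text": "Permission denied"}],
                         "isError": True}}

def classify(msg: dict) -> str:
    if "error" in msg:
        return "protocol"  # plumbing broke: surface to the host, don't show the model
    if msg["result"].get("isError"):
        return "tool"      # tool ran and failed: hand the text back to the model
    return "ok"
```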


Security Model

  • stdio servers: trusted by definition (you spawned them, they run as your user)
  • HTTP servers: OAuth 2.0 recommended, or API keys / mTLS
  • Capability scoping: only expose what’s needed
  • Roots: mechanism for clients to tell servers which paths are in scope

The Full Flow

User: "Query the prod DB and summarize the top users"

Claude (LLM)
  emits: tool_use { name: "query_database", input: { sql: "SELECT..." } }

Claude Code
  routes to MCP server "postgres"
  sends: tools/call { name: "query_database", arguments: {...} }

postgres-mcp server
  runs query β†’ returns rows

Claude Code
  wraps as tool_result

Claude (LLM)
  reads rows, summarizes, responds
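The host's part of the flow above reduces to a lookup plus a translation: map the tool name to the server that registered it, then turn the model's tool_use into a tools/call. A sketch (the registry and ids are hypothetical):

```python
# Host-side routing table, built at startup from each server's tools/list.
TOOL_REGISTRY = {"query_database": "postgres"}  # tool name -> server name

def route(tool_use: dict) -> tuple[str, dict]:
    """Translate an LLM tool_use block into a tools/call for the owning server."""
    server = TOOL_REGISTRY[tool_use["name"]]
    rpc = {"jsonrpc": "2.0", "id": 1, "method": "tools/call",
           "params": {"name": tool_use["name"],
                      "arguments": tool_use["input"]}}
    return server, rpc

server, rpc = route({"name": "query_database", "input": {"sql": "SELECT 1"}})
# server == "postgres"; rpc is the tools/call to send over that server's transport
```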

MCP is USB-C for AI integrations. The LLM never knows or cares whether a tool is native or MCP β€” it just sees definitions and calls them. The host handles routing.