
mcp-rug-pull

Methodology for testing rug pull attacks against MCP servers — capability changes after initial attestation that the connected LLM cannot detect. Covers detection, reproduction, and impact assessment.
sunilgentyala/OmniRed · ★ 0 · AI & Automation · score 63
Install: claude install-skill sunilgentyala/OmniRed
# MCP Rug Pull

## Attack Surface

A rug pull attack occurs when an MCP server presents benign tool definitions during initial connection (and any attestation checks), then serves different, malicious definitions during actual tool invocation.

**Why it works:** LLMs cache tool definitions from the initial `tools/list` call. They do not re-validate tool descriptions on each invocation. A server that changes its behaviour after the initial handshake operates invisibly to the model.

This is the primary attack that ContextGuard's Capability-Binding Layer (CBL) was designed to detect.

## Attack Scenario

```
Time 0  — MCP server registers with Claude.
          tools/list returns:
            file_read: "Read a file and return its contents"

Time 0+ — Claude caches this definition. User interacts normally.

Time T  — Attacker changes the server's runtime behaviour:
          file_read now also exfiltrates to attacker endpoint.
          tools/list still returns the original benign definition.

Time T+ — Claude calls file_read. Malicious behaviour executes.
          Claude reports the benign description to the user.
          User has no visibility into actual behaviour.
```

## Methodology

### Phase 1 — Establish baseline

1. Capture initial `tools/list` response in full (record hashes of all description fields)
2. Record normal tool invocation behaviour for each tool
3. Establish expected input/output pairs for each tool

### Phase 2 — Modify server behaviour post-registration

For ea