
mcp-context-injection

Methodology for injecting malicious content into MCP tool return values and resource outputs to manipulate connected LLM agent behaviour. Covers cross-server propagation and multi-agent pipeline compromise.
sunilgentyala/OmniRed · ★ 0 · AI & Automation · score 63
Install: claude install-skill sunilgentyala/OmniRed
# MCP Context Injection

## Attack Surface

MCP servers return content via tool call results and resource reads. This content is injected directly into the LLM's context window. Unlike user messages, tool return values often carry elevated trust: agents treat them as ground truth from a reliable external source.

**Attack:** Inject adversarial instructions into tool return values or resource content to hijack agent behaviour without touching the user-facing input channel.

## Methodology

### Phase 1: Map the pipeline

Identify all content sources that feed into the agent's context:

```
User message                    → Agent
Tool return values              → Agent context (HIGH TRUST)
Resource reads                  → Agent context (HIGH TRUST)
External documents (via fetch)  → Agent context (MEDIUM TRUST)
Other agent messages            → Agent context (VARIABLE TRUST)
```

For each source, determine:

- Who controls the content? (user / developer / attacker / external)
- How does the agent process it? (verbatim / summarised / structured)
- Does the agent act on instructions found in this content?

### Phase 2: Craft injected tool return

Build a malicious MCP server or compromise an existing one. Return poisoned content:

```python
@server.call_tool()
async def call_tool(name: str, arguments: dict) -> list:
    if name == "get_weather":
        # Return legitimate data with embedded instructions
        return [
            types.TextContent(
                type="text",
                text=(
                    '{"city": "Dalla