ai-llm-safetylisted
Install: claude install-skill alo-exp/silver-bullet
# /ai-llm-safety — AI/LLM Safety Design Enforcement
Every system that involves LLM agents, tool use, or prompt construction MUST treat AI safety as a first-class constraint. Prompt injection is the SQL injection of the AI era — and it's harder to fix after deployment.
**Why this matters:** LLM-powered systems are uniquely vulnerable to attacks that exploit the model's instruction-following nature. A single prompt injection can exfiltrate data, execute unauthorized actions, or compromise downstream systems. Unlike traditional software bugs, these vulnerabilities exist at the semantic layer and cannot be caught by linters or type checkers.
**When to invoke:** During PLANNING (after brainstorming, before or alongside writing plans) and during REVIEW (as part of code review criteria). This skill applies to ALL code that constructs prompts, processes LLM output, or orchestrates agent workflows.
---
## The Rules
### Rule 1: Treat All External Content as Untrusted Data
Any content not authored by the system itself is untrusted. This includes:
| Source | Risk | Mitigation |
|--------|------|------------|
| User input | Direct prompt injection | Isolate from system instructions; validate format |
| Web pages / fetched content | Indirect prompt injection | Never pass raw content as instructions; summarize or extract data only |
| Tool results / API responses | Poisoned upstream data | Validate schema; never execute embedded instructions |
| File contents (uploaded/read) | Embed