ai-security

Featured

Use when assessing AI/ML systems for prompt injection, jailbreak vulnerabilities, model inversion risk, data poisoning exposure, or agent tool abuse. Covers MITRE ATLAS technique mapping, injection signature detection, and adversarial robustness scoring.

AI & Automation 23,342 stars 3210 forks Updated 1 weeks ago MIT

Install

View on GitHub

Quality Score: 91/100

Stars 20%

100

Recency 20%

Frontmatter 20%

Documentation 15%

100

Issue Health 10%

License 10%

100

Description 5%

100

Skill Content

# AI Security AI and LLM security assessment skill for detecting prompt injection, jailbreak vulnerabilities, model inversion risk, data poisoning exposure, and agent tool abuse. This is NOT general application security (see security-pen-testing) or behavioral anomaly detection in infrastructure (see threat-detection) — this is about security assessment of AI/ML systems and LLM-based agents specifically. --- ## Table of Contents - [Overview](#overview) - [AI Threat Scanner Tool](#ai-threat-scanner-tool) - [Prompt Injection Detection](#prompt-injection-detection) - [Jailbreak Assessment](#jailbreak-assessment) - [Model Inversion Risk](#model-inversion-risk) - [Data Poisoning Risk](#data-poisoning-risk) - [Agent Tool Abuse](#agent-tool-abuse) - [MITRE ATLAS Coverage](#mitre-atlas-coverage) - [Guardrail Design Patterns](#guardrail-design-patterns) - [Workflows](#workflows) - [Anti-Patterns](#anti-patterns) - [Cross-References](#cross-references) --- ## Overview ### What This Skill Does This skill provides the methodology and tooling for **AI/ML security assessment** — scanning for prompt injection signatures, scoring model inversion and data poisoning risk, mapping findings to MITRE ATLAS techniques, and recommending guardrail controls. It supports LLMs, classifiers, and embedding models. ### Distinction from Other Security Skills | Skill | Focus | Approach | |-------|-------|----------| | **ai-security** (this) | AI/ML system security | Specialized — LLM injection, mo...

Details

Author: alirezarezvani
Repository: alirezarezvani/claude-skills
Created: 9 months ago
Last Updated: 1 weeks ago
Language: Python
License: MIT

Integrates with

OpenAI · AI Anthropic · AI

Bundled in these plugins

claude-skills

Similar Skills

Semantically similar based on skill content — not just same category

AI & Automation Featured

ai--llm-security

LLM and AI application security testing — prompt injection, jailbreak resistance, OWASP LLM Top 10 (2025), RAG and agent/tool-use security, model supply chain, and AI red teaming for authorized assessments

207 Updated 1 weeks ago

Masriyan

AI & Automation Featured

moai-ref-llm-security

AI/LLM defensive security reference: prompt-injection defense, OWASP LLM Top 10 defensive mapping, MCP and agentic tool-call hardening, training-data poisoning detection, model-output validation and guardrails, MITRE ATLAS defensive correlation, and NIST AI RMF governance. Agent-extending skill that amplifies backend, security, and AI-application engineering with production-grade defensive patterns for LLM-backed systems. NOT for: offensive techniques (jailbreak authoring, attack-payload crafting, red-team exploitation), model training or fine-tuning methodology, prompt optimization for capability, web-app OWASP Top 10 (see moai-ref-owasp-checklist), or general API design (see moai-ref-api-patterns).

1,143 Updated today

modu-ai

AI & Automation Listed

senior-ai-safety-engineer

Use when threat modeling an LLM or agent system, defending against prompt injection (direct and indirect), designing output safety pipelines, hardening tool use authorization, running an authorized red team set, classifying a system under EU AI Act / NIST AI RMF / ISO 42001, responding to an AI safety incident (jailbreak gone public, harmful output reported, system prompt leak), or evaluating training data privacy risk. Triggers: AI safety, LLM security, prompt injection, indirect prompt injection, jailbreak, output safety, content filter, moderation, model exfiltration, prompt extraction, system prompt leak, agent safety, tool safety, EU AI Act, NIST AI RMF, ISO 42001, OWASP LLM Top 10, red team AI, refusal, harmful content. Produces AI threat models, defense in depth diagrams, red team sets, output safety pipelines, tool authorization matrices, regulatory classification docs, incident response plans. Not for the agent loop itself, see senior-ai-agent-engineer.

0 Updated 1 weeks ago

iamdemetris