
seo-robots-ai

Audit robots.txt for AI crawler access policies. Checks GPTBot, ClaudeBot, PerplexityBot, Google-Extended, and other AI crawlers. Use when user says "robots AI", "AI crawlers", "block AI", "allow AI bots", "AI crawl policy".
YogeshKu7877/claude-seo-skills · ★ 4 · AI & Automation · score 80
Install: claude install-skill YogeshKu7877/claude-seo-skills
# AI Crawler Robots.txt Audit

Analyzes a site's robots.txt specifically for AI crawler access policies. Complements `/seo-technical` (which does a broad robots.txt check) with deep AI-specific analysis.

@skills/seo/references/ai-crawlers-guide.md

## AI Crawler Registry

| Bot Name | Owner | Purpose |
|---|---|---|
| GPTBot | OpenAI | Training data + ChatGPT web search |
| OAI-SearchBot | OpenAI | ChatGPT search only (not training) |
| ChatGPT-User | OpenAI | ChatGPT browsing (real-time) |
| ClaudeBot | Anthropic | Training data collection |
| anthropic-ai | Anthropic | Anthropic web crawler |
| PerplexityBot | Perplexity | AI search engine |
| Google-Extended | Google | Gemini / AI training (not Search) |
| Bytespider | ByteDance | TikTok / AI training |
| CCBot | Common Crawl | Open dataset used by many AI models |
| Applebot-Extended | Apple | Apple Intelligence training |
| cohere-ai | Cohere | AI model training |
| FacebookBot | Meta | Meta AI training |
| Meta-ExternalAgent | Meta | Meta AI browsing agent |
| Amazonbot | Amazon | Alexa / AI training |
| Diffbot | Diffbot | AI knowledge graph |
| ImagesiftBot | ImagesiftBot | AI image training |
| Omgili | Webz.io | AI data feeds |

## Inputs

- `url`: The website URL to audit (will fetch `/robots.txt` from site root)
- Normalize to domain root: `example.com/page` → `https://example.com/robots.txt`

## Execution

1. **Fetch robots.txt**: WebFetch `<domain>/robots.txt`
   - If 404 → report "No robots.txt found — all c