
seo-sitemap

Pull a domain's XML sitemap (and sitemap-of-sitemaps), then compare against what's actually crawled/indexed (GSC indexed pages + a DataForSEO On-Page fetch loop + GSC sitemap ingestion). Surfaces (a) sitemap entries the crawler couldn't find (orphans from the sitemap), (b) crawled/indexed pages missing from the sitemap (probably an oversight), (c) sitemap entries that are now 404, (d) lastmod inconsistencies. Use when the user asks for "sitemap analysis", "check my sitemap", "sitemap vs audit", "missing pages", "orphan pages", or "sitemap health".
amirjahfar1/automate-seo-with-claude · ★ 0 · AI & Automation · score 72
Install: claude install-skill amirjahfar1/automate-seo-with-claude
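As a rough illustration of the two-way comparison described above, the sketch below parses a `<urlset>` sitemap and diffs it against a crawl baseline using plain set operations. It is not the skill's implementation, which works through the GSC and DataForSEO MCP tools. The function names `parse_urlset` and `compare_sitemap`, the `crawled_status` mapping of URL to HTTP status, and the sample URLs are all hypothetical. The lastmod check (d) is left out because the description does not define what counts as an inconsistency.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def parse_urlset(xml_text):
    """Return {url: lastmod or None} for every <url> entry in a <urlset>."""
    root = ET.fromstring(xml_text)
    entries = {}
    for url in root.iter(f"{SITEMAP_NS}url"):
        loc = url.findtext(f"{SITEMAP_NS}loc")
        if loc:
            entries[loc.strip()] = url.findtext(f"{SITEMAP_NS}lastmod")
    return entries


def compare_sitemap(sitemap_entries, crawled_status):
    """Diff sitemap URLs against a crawl baseline.

    sitemap_entries: {url: lastmod} as returned by parse_urlset.
    crawled_status:  {url: http_status} (hypothetical shape for the baseline).
    """
    sitemap_urls = set(sitemap_entries)
    crawled_urls = set(crawled_status)
    # Assumption: only pages that returned 200 count as "missing from the sitemap".
    live_urls = {u for u, status in crawled_status.items() if status == 200}
    return {
        "in_sitemap_not_crawled": sorted(sitemap_urls - crawled_urls),
        "crawled_not_in_sitemap": sorted(live_urls - sitemap_urls),
        "sitemap_404": sorted(
            u for u in sitemap_urls & crawled_urls if crawled_status[u] == 404
        ),
    }


SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc><lastmod>2026-05-01</lastmod></url>
  <url><loc>https://example.com/old</loc></url>
  <url><loc>https://example.com/hidden</loc></url>
</urlset>"""

crawled = {
    "https://example.com/": 200,
    "https://example.com/old": 404,
    "https://example.com/blog": 200,
}

report = compare_sitemap(parse_urlset(SAMPLE), crawled)
for bucket, urls in report.items():
    print(bucket, urls)
```

Exact string matching is used here for brevity. A real comparison would likely normalise URLs first (trailing slashes, scheme, host case) so that trivially different forms of the same page are not reported as mismatches.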
> Example output: [examples/seo-sitemap-notion-so-20260514/SITEMAP.md](../../examples/seo-sitemap-notion-so-20260514/SITEMAP.md)

# Sitemap Analysis

Compare a domain's XML sitemap against what's actually crawled and indexed: GSC indexed pages (`get_search_analytics` dimensions=["page"]), GSC sitemap ingestion (`get_sitemap_details`), and a DataForSEO `on_page_instant_pages` fetch loop over the declared/indexed URL set. Surface what the sitemap claims vs what's really reachable, in both directions.

## Prerequisites

- DataForSEO MCP server connected.
- GSC (`mcp__gscServer__*`) recommended: it's the authoritative source for which pages Google indexes and which sitemaps Google ingested (the comparison baseline). Firecrawl optional, for URL discovery (`firecrawl_map`) when the sitemap is missing or suspect.
- Claude's `WebFetch` tool available.
- User provides: a target domain. Optional: the sitemap URL if not at `/sitemap.xml` (auto-discovery from `robots.txt` is attempted first).
- **Predecessor (recommended):** `seo-technical-audit` on this domain. Its discovered URL set + On-Page fetch results can be reused as the crawl baseline. Without it, this skill builds its own baseline from GSC indexed pages + an On-Page fetch loop.

> *Optional accelerator:* if you have a hosted-crawl MCP you can substitute it for the URL-discovery + fetch loop; it is not required.

## Process

1. **Validate target & build the crawl baseline**
   - Normalise the domain.
   - Build the "what's really