seo-internal-linkslisted
Install: claude install-skill YogeshKu7877/claude-seo-skills
# Internal Link Audit
Crawls a domain to build an internal link graph. Identifies orphan pages, underlinked pages,
and broken internal links. Suggests anchor text improvements.
## Inputs
- `domain`: Target domain URL (e.g., `https://example.com`). Include protocol.
- `max_pages` (optional): Max pages to crawl. Default: 100. Maximum: 200.
## Execution
**Step 1: Crawl Site**
Use `scripts/fetch_page.py` and `scripts/parse_html.py` to crawl the site:
```bash
# Fetch homepage
python3 scripts/fetch_page.py <domain>
# Parse HTML to extract links
python3 scripts/parse_html.py <html_content>
```
Follow internal links only (same domain hostname). Normalize URLs:
- Remove trailing slashes (treat `/about` and `/about/` as same)
- Remove URL fragments (`#section`)
- Preserve query strings only if they appear to be content (not tracking: strip `utm_*`, `ref=`, `source=`)
Respect robots.txt: fetch `<domain>/robots.txt` first, skip disallowed paths.
Cap crawl at `max_pages`. Track crawl queue (BFS order from homepage).
**Step 2: Build Link Graph**
For each crawled page, record:
- Source URL
- Target URL (all internal links found)
- Anchor text for each link
Build adjacency structure:
- `outbound[url]` = list of (target, anchor_text)
- `inbound[url]` = list of (source, anchor_text)
**Step 3: Calculate Per-Page Metrics**
For each crawled page:
- Inbound internal links count (links from other crawled pages pointing here)
- Outbound internal links count (links from this page to oth