firecrawl-policy-guardrails

Featured

Implement Firecrawl scraping policy enforcement: domain blocklists, credit budgets, content filtering, and robots.txt compliance guardrails. Use when setting up scraping policies, enforcing crawl limits, or preventing accidental scraping of prohibited domains. Trigger with phrases like "firecrawl policy", "firecrawl guardrails", "firecrawl domain blocklist", "firecrawl scraping rules", "firecrawl compliance".

AI & Automation 2,266 stars 315 forks Updated today MIT


Quality Score: 99/100

Stars (20%): 100
Recency (20%): 100
Frontmatter (20%): 70
Documentation (15%): 100
Issue Health (10%): 50
License (10%): 100
Description (5%): 100

Skill Content

# Firecrawl Policy Guardrails

## Overview

Automated guardrails for Firecrawl scraping pipelines. Web scraping carries legal (robots.txt, ToS), ethical (rate limiting, attribution), and cost (credit burn) risks. This skill implements domain blocklists, credit budgets, content quality gates, and per-domain rate limits as enforceable policies.

## Instructions

### Step 1: Domain Policy Enforcement

```typescript
import FirecrawlApp from "@mendable/firecrawl-js";

const firecrawl = new FirecrawlApp({
  apiKey: process.env.FIRECRAWL_API_KEY!,
});

class ScrapePolicy {
  // Domains that explicitly prohibit scraping in their ToS
  static BLOCKED_DOMAINS = [
    "facebook.com", "instagram.com", // Meta ToS
    "linkedin.com", // LinkedIn ToS
    "twitter.com", "x.com", // X/Twitter ToS
  ];

  // Domains with sensitive/regulated content
  static SENSITIVE_DOMAINS = [
    "*.gov", "*.mil", // Government
    "*.edu", // Educational (FERPA)
  ];

  static validateUrl(url: string): void {
    const hostname = new URL(url).hostname;

    for (const blocked of this.BLOCKED_DOMAINS) {
      if (hostname === blocked || hostname.endsWith(`.${blocked}`)) {
        throw new PolicyViolation(`Domain "${hostname}" is blocked: ToS prohibits scraping`);
      }
    }

    for (const pattern of this.SENSITIVE_DOMAINS) {
      const regex = new RegExp("^" + pattern.replace("*.", ".*\\.") + "$");
      if (regex.test(hostname)) {
        ...
```
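The Overview also names credit budgets and per-domain rate limits, which the excerpt above is cut off before reaching. The following is a minimal sketch of how those two guardrails might look, not the skill's own code: the class names `CreditBudget`, `DomainRateLimiter`, and `BudgetExceeded`, the limits, and the idea of reserving credits before each request are all assumptions made for illustration. It makes no Firecrawl API calls.

```typescript
// Hypothetical sketch: every name and limit below is an assumption,
// not taken from the skill or from the Firecrawl SDK.

class BudgetExceeded extends Error {
  constructor(message: string) {
    super(message);
    this.name = "BudgetExceeded";
  }
}

// Tracks credits spent against a fixed ceiling.
class CreditBudget {
  private spent = 0;

  constructor(private readonly limit: number) {}

  // Reserve credits before issuing a request; throws if the
  // reservation would push spending past the limit.
  reserve(credits: number): void {
    if (credits <= 0) {
      throw new RangeError("credits must be positive");
    }
    if (this.spent + credits > this.limit) {
      throw new BudgetExceeded(
        `Budget exceeded: ${this.spent} + ${credits} > ${this.limit}`
      );
    }
    this.spent += credits;
  }

  get remaining(): number {
    return this.limit - this.spent;
  }
}

// Enforces a minimum interval between requests to the same hostname.
class DomainRateLimiter {
  private lastRequest = new Map<string, number>();

  constructor(private readonly minIntervalMs: number) {}

  // Returns 0 if the request may proceed now (and records it),
  // otherwise the number of milliseconds the caller should wait.
  tryAcquire(url: string, now: number = Date.now()): number {
    const host = new URL(url).hostname;
    const last = this.lastRequest.get(host);
    if (last !== undefined && now - last < this.minIntervalMs) {
      return this.minIntervalMs - (now - last);
    }
    this.lastRequest.set(host, now);
    return 0;
  }
}
```

In a pipeline, a caller would typically run the URL validation first, then `tryAcquire`, then `reserve`, and only then issue the scrape, so that a blocked or rate-limited URL never consumes budget. The `now` parameter exists only to make the limiter deterministic in tests.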

Details

Author: jeremylongshore
Repository: jeremylongshore/claude-code-plugins-plus-skills
Created: 7 months ago
Last Updated: today
Language: Python
License: MIT
