delegate-locallisted

Use this skill to offload non-reasoning text work to locally-installed models (Ollama or MLX) via `delegate.sh`, keeping content on-device and freeing the main-agent context window. Saves API tokens too, though context protection and privacy are the headline values, not cost. MUST use whenever the user asks to summarise a log/diff/file/PR/issue, draft a commit message/changelog/release note, triage or classify many items, extract structured fields from free text, skim many files for a one-liner, or rewrite or reformat prose. MUST also use when the user explicitly asks to delegate, names a local backend or model (Ollama, MLX, "a local model"), wants content kept on-device or offline for privacy, or wants to save API tokens. Bare "run it locally" usually means running the app, server, or tests on the user's own machine, not delegating to a local model, so treat "locally" as a delegation signal only when paired with an explicit delegate verb, a named local model, or a keep-it-private reason. MUST also use after
IsmaelMartinez/delegate-local · ★ 1 · AI & Automation · score 67

Install: claude install-skill IsmaelMartinez/delegate-local

# Delegate Local Offload non-agentic text tasks to locally-installed models via the local inference backend. The headline value is privacy (content stays on-device) and context protection (the main-agent window is not consumed by paragraph-fills); token cost savings are real but typically pennies per session and a side effect, not the reason to reach. Local models are strong summarisers and weak reasoners — scope accordingly. **Core insight (from local-brain):** you do not need a framework. You need `context | ollama run model`. ## Operating mode Default to auto-delegate. When this skill is loaded into the conversation and the user's task matches the Fits list below, delegate immediately without asking permission. After every successful `delegate.sh` call, read the `delegate-meta:` line emitted to the tool's stderr — it carries `model`, `tier`, `backend`, `tokens_local`, and `duration_ms` as space-separated `key=value` pairs (plus `recipe="NAME"` when a `--recipe` flag was used). String-typed fields are double-quoted (`model="qwen3.6:35b-a3b-q8_0"`); integer fields are bare. In your reply, surface the model and the local-token count so the user can both spot a bad answer and see what stayed on-device: e.g. "Delegated to qwen3.6:35b-a3b-q8_0 (prose tier) — ~578 tokens kept local in 1.4 s." Frame it as "kept local," not "saved from Claude" — `tokens_local` is the local model's tokenizer view of total chars in + out divided by 4 (the same value `scripts/metrics-summary.sh` r