skill-improver

Solid

Autoresearch loop for Claude Code skills: greedy keep/discard hill climbing on a 10-dimension quality rubric, with blind subagent validation to counter self-scoring bias. A `freshen` mode probes external references (release notes, docs, deprecation signals) and applies verified updates. A `trigger` mode measures and tunes the skill's frontmatter description until it reliably fires when it should and stays silent when it shouldn't (60/40 train/test split, 7 runs per query, blinded test scores).
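To make the keep/discard idea concrete, here is a minimal sketch of a greedy hill-climbing loop. It is not the skill's actual implementation: `hill_climb`, `toy_score`, and `make_toy_propose` are hypothetical names, and the toy scorer (count of distinct words) merely stands in for the 10-dimension rubric.

```python
from typing import Callable


def hill_climb(candidate: str,
               score: Callable[[str], float],
               propose: Callable[[str], str],
               iterations: int = 15) -> tuple[str, float]:
    """Greedy keep/discard: keep a proposal only if it strictly beats the best score."""
    best, best_score = candidate, score(candidate)
    for _ in range(iterations):
        trial = propose(best)
        trial_score = score(trial)
        if trial_score > best_score:
            best, best_score = trial, trial_score  # keep the improvement
        # otherwise discard the trial and continue from the current best
    return best, best_score


def toy_score(text: str) -> float:
    """Hypothetical scalar metric: number of distinct words (stand-in for a rubric)."""
    return float(len(set(text.split())))


def make_toy_propose() -> Callable[[str], str]:
    """Hypothetical proposer: appends one word per call from a fixed sequence."""
    additions = iter(["examples", "usage", "flags", "examples"])

    def propose(text: str) -> str:
        return text + " " + next(additions, "usage")

    return propose


improved, final = hill_climb("usage", toy_score, make_toy_propose(), iterations=4)
print(improved, final)
```

Proposals that repeat an existing word do not raise the toy score, so they are discarded and the next proposal starts again from the last kept version.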

AI & Automation · 3 stars · 1 fork · Updated yesterday · MIT

Quality Score: 79/100

Stars 20%
Recency 20%: 100
Frontmatter 20%
Documentation 15%: 100
Issue Health 10%
License 10%: 100
Description 5%: 100
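As a rough check on the breakdown above, the sketch below works through the arithmetic under two assumptions that the page does not state: that the total is a plain weighted sum of per-component scores, and that each `100` belongs to the component listed just before it. The names `WEIGHTS`, `SHOWN`, and `weighted_points` are illustrative only.

```python
# Weights as listed on the page (percent of the total score).
WEIGHTS = {
    "Stars": 20, "Recency": 20, "Frontmatter": 20, "Documentation": 15,
    "Issue Health": 10, "License": 10, "Description": 5,
}
# Component scores that are visible on the page (assumed pairing, see lead-in).
SHOWN = {"Recency": 100, "Documentation": 100, "License": 100, "Description": 100}
TOTAL = 79  # reported Quality Score


def weighted_points(scores: dict, weights: dict) -> float:
    """Points contributed to the total by the given components."""
    return sum(scores[name] * weights[name] / 100 for name in scores)


known = weighted_points(SHOWN, WEIGHTS)            # points from visible components
remaining = TOTAL - known                          # points left for the other three
hidden_weight = sum(w for n, w in WEIGHTS.items() if n not in SHOWN)
avg_hidden = remaining * 100 / hidden_weight       # implied weighted-average score

print(known, remaining, hidden_weight, avg_hidden)
```

Under those assumptions the four visible components account for 50 of the 79 points, so Stars, Frontmatter, and Issue Health together would contribute the remaining 29.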

Skill Content

# Skill Improver: Autoresearch for SKILL.md

> **Core Philosophy:** The human programs the researcher, not the research.
> Apply Karpathy's autoresearch methodology (greedy hill climbing with
> keep/discard against a scalar metric) to autonomously improve Claude Code
> skills.

## Invocation

Argument grammar:

```
/skill-improver <mode> <target> [--opts]
```

- `<mode>`: `improve` (default) | `score` | `freshen` | `trigger` | `philosophy` | `batch`
- `<target>`: skill name (e.g. `gh-cli`), absolute SKILL.md path, `--all`, or glob (e.g. `vllm-*`)
- `[--opts]`: mode-specific flags (e.g. `--iterations 15`, `--probe-budget 30`, `--runs-per-query 5`)

Examples:

```
/skill-improver freshen autoresearch
/skill-improver score gh-cli
/skill-improver improve ~/.claude/skills/helm
/skill-improver trigger vllm-caching
/skill-improver trigger gh-cli --missed "find issue with label X"
/skill-improver batch freshen --all
/skill-improver freshen --group 'vllm-*'
```

If `<mode>` is omitted, default to `improve`. If `<target>` is omitted and mode is not `batch`, prompt the user. For `batch`, the target after `batch` selects the sub-mode (`freshen`, `improve`, `trigger`, or `philosophy`, default `improve`); the target list comes from `scripts/scan-skills.sh`.

The `--missed "<phrase>"` flag (trigger mode only, repeatable) seeds the eval set with user-reported failures as gold should-trigger queries.

## The Improvement Loop

### Phase 0: Setup

1. Identify the target skill. Accept a pat...
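The argument grammar in the Invocation section of the excerpt can be illustrated with a small parser. This is only a sketch of one plausible reading of that grammar, not code from the skill: `Invocation`, `parse`, and `MODES` are hypothetical names, and flag validation is left out.

```python
import shlex
from dataclasses import dataclass, field
from typing import List, Optional

# Modes named in the excerpt's grammar.
MODES = {"improve", "score", "freshen", "trigger", "philosophy", "batch"}


@dataclass
class Invocation:
    mode: str = "improve"             # default when <mode> is omitted
    target: Optional[str] = None      # None means the caller should prompt the user
    opts: List[str] = field(default_factory=list)


def parse(command: str) -> Invocation:
    """Split a /skill-improver command into mode, target, and remaining options."""
    tokens = shlex.split(command)     # honours quoted phrases such as --missed "..."
    if tokens and tokens[0] == "/skill-improver":
        tokens = tokens[1:]
    inv = Invocation()
    if tokens and tokens[0] in MODES:
        inv.mode = tokens.pop(0)
    # A target is a bare word (name, path, glob) or the literal --all.
    if tokens and (not tokens[0].startswith("--") or tokens[0] == "--all"):
        inv.target = tokens.pop(0)
    inv.opts = tokens                 # everything else is treated as mode-specific flags
    return inv


print(parse('/skill-improver trigger gh-cli --missed "find issue with label X"'))
print(parse("/skill-improver batch freshen --all"))
```

For `batch`, this reading treats the word after `batch` as the target (the sub-mode), which matches the excerpt's description, and leaves `--all` in the options.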

Details

Author: air-gapped
Repository: air-gapped/skills
Created: 3 months ago
Last Updated: yesterday
Language: Python
License: MIT

Integrates with

Anthropic · AI
Kubernetes · Infrastructure

Bundled in these plugins

skills

Similar Skills

Semantically similar based on skill content — not just same category

AI & Automation Solid

skill-improve

Improve a skill using a test-fix-retest loop. Runs static checks, proposes targeted fixes, rewrites the skill, re-tests, and keeps or reverts based on score change.

65 Updated 3 days ago

striderZA

AI & Automation Listed

skill-improver

Audit and improve Claude skills against the Anthropic skill guide. Use when creating new skills, improving existing ones, or preparing skills for ClawHub distribution. Triggers on skill audit, improve skill, new skill, skill quality, or ClawHub publish.

67 Updated today

atrislabs

Data & Documents Listed

skill-optimizer

Use when the user wants to analyze, audit, or improve their Agent Skills (SKILL.md files). Triggers on /optimize-skill, /skill-audit, 'optimize skills', 'analyze skills', 'check my skills', 'skill quality'. Also use proactively when the user mentions skills aren't triggering, skills feel broken, or asks why a skill didn't fire.

1 Updated today

Bennaco7539