
bm25

Ranked content search over any text corpus using BM25 (via xhluca/bm25s). Corpus-agnostic: works on cloned repos, project knowledge stores, uploaded files/archives, and any local directory. Stateless — builds an in-memory index each invocation, no cache, no persistence. Use when you need ranked multi-word content search beyond grep, or when picking the "most relevant files for these terms" across a corpus. Triggers on "rank these documents", "search this corpus", "find content about X", "which files are most about Y", or multi-word concept queries against a known body of text.
oaustegard/claude-skills · ★ 124 · Data & Documents · score 84
Install: claude install-skill oaustegard/claude-skills
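
For orientation, the kind of ranking BM25 performs can be sketched in a few lines of standard-library Python. This is an illustrative sketch only, not the skill's code and not the bm25s implementation: the function names `tokenize` and `bm25_rank`, the parameter defaults `k1=1.5` and `b=0.75`, the regex tokenizer, and the sample documents are all assumptions chosen for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (assumed tokenizer)."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def bm25_rank(docs, query, k1=1.5, b=0.75):
    """Return (name, score) pairs sorted by descending BM25 score.

    docs maps a document name to its text. k1 and b are the usual BM25
    saturation and length-normalization parameters (values assumed here).
    """
    if not docs:
        return []
    tokens = {name: tokenize(text) for name, text in docs.items()}
    n_docs = len(tokens)
    avg_len = sum(len(t) for t in tokens.values()) / n_docs
    # Document frequency: how many documents contain each term at least once.
    df = Counter(term for t in tokens.values() for term in set(t))

    query_terms = tokenize(query)
    scores = {}
    for name, toks in tokens.items():
        tf = Counter(toks)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Rare terms get a higher inverse document frequency weight.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates, and long documents are penalized.
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores[name] = score
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical three-file corpus, built in memory and discarded afterwards.
docs = {
    "middleware.py": "csrf middleware checks the csrf token on each request",
    "sessions.py": "session backend stores session data in the cache",
    "models.py": "queryset filter builds sql for the database",
}
ranking = bm25_rank(docs, "csrf middleware")
print(ranking[0][0])  # top hit: middleware.py
```

Unlike grep, every document gets a score, so a multi-word query surfaces the files that are most about the terms rather than every file that merely contains one of them.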
# bm25

Ranked content search over any text corpus. One CLI, in-memory BM25 index per process, with a session-local disk cache so repeat invocations against the same corpus load in tens of milliseconds instead of rebuilding.

## Setup

```bash
uv pip install --system --break-system-packages bm25s
```

Install is sub-second on a warm uv cache. That's the entire dependency.

## Usage

```bash
BM25=/mnt/skills/user/bm25/scripts/bm25.py

# Local directory
python3 $BM25 ./repo 'csrf middleware'

# Multiple queries against the same in-memory index (build once, query many)
python3 $BM25 ./repo 'csrf middleware' 'session backend' 'queryset filter'

# Cloned GitHub repo via tarball (one HTTP call)
python3 $BM25 'github.com/django/django' 'atomic transaction'
python3 $BM25 'github.com/django/django@stable/5.0.x' 'atomic transaction'

# Project knowledge or uploads
python3 $BM25 project 'RAG scaling laws'
python3 $BM25 uploads 'tax loss harvesting'

# Filters
python3 $BM25 ./repo 'auth flow' --exclude 'tests/*' --exclude '*/tests/*'
python3 $BM25 ./repo 'config' --include '*.py' --include '*.toml'

# Interactive (REPL: single corpus, many queries)
python3 $BM25 ./repo --interactive

# JSON output for piping
python3 $BM25 ./repo 'auth flow' --json
```

## Corpus types

| Spec | Meaning |
|------|---------|
| `./path` or `/abs/path` | Local directory |
| `uploads` | `/mnt/user-data/uploads/` |
| `project` | `/mnt/project/` |
| `github.com/owner/repo[@ref]` | Tarball fetch via GitHub API