bare-eval

Solid

Run isolated eval and grading calls using the `--bare` mode of Claude Code (CC) 2.1.81 or later. Constructs `claude -p --bare` invocations for skill evaluation, trigger testing, and LLM grading without plugin or hook interference. Use when running eval pipelines, grading skill outputs, benchmarking prompt quality, or testing trigger accuracy in isolation.

AI & Automation · 188 stars · 15 forks · Updated today · MIT

Quality Score: 86/100

Stars (20%): 76
Recency (20%): 100
Frontmatter (20%): 70
Documentation (15%): 100
Issue Health (10%): 50
License (10%): 100
Description (5%): 100

Skill Content

# Bare Eval: Isolated Evaluation Calls

Run `claude -p --bare` for fast, clean eval/grading without plugin overhead.

**CC 2.1.81 required.** The `--bare` flag skips hooks, LSP, plugin sync, and skill directory walks.

## When to Use

- Grading skill outputs against assertions
- Trigger classification (which skill matches a prompt)
- Description optimization iterations
- Any scripted `-p` call that doesn't need plugins

## When NOT to Use

- Testing skill routing (needs `--plugin-dir`)
- Testing agent orchestration (needs full plugin context)
- Interactive sessions

## Prerequisites

```bash
# --bare requires ANTHROPIC_API_KEY (OAuth/keychain disabled)
export ANTHROPIC_API_KEY="sk-ant-..."

# Verify CC version
claude --version  # Must be >= 2.1.81
```

## Quick Reference

| Call Type | Command Pattern |
|-----------|----------------|
| Grading | `claude -p "$prompt" --bare --max-turns 1 --output-format text` |
| Trigger | `claude -p "$prompt" --bare --json-schema "$schema" --output-format json` |
| Streaming grade | `claude -p "$prompt" --bare --max-turns 1 --output-format stream-json` |
| Optimize | `echo "$prompt" \| claude -p --bare --max-turns 1 --output-format text` |
| Force-skill | `claude -p "$prompt" --bare --print --append-system-prompt "$content"` |
| @-file in prompt | `claude -p "grade @fixtures/case-1.md against rubric" --bare` (CC 2.1.113 Remote Control autocomplete) |

### `--output-format stream-json`

Newline-delimited JSON events (one per token/tool-call) ...
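The two prerequisites above (an exported `ANTHROPIC_API_KEY` and CC 2.1.81 or later) can be checked once at the top of an eval script before any `--bare` call is made. The following is a minimal sketch, not part of the skill itself: the helper names `version_ge` and `check_bare_prereqs` are hypothetical, and it assumes `claude --version` prints a dotted `MAJOR.MINOR.PATCH` number somewhere in its output.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for illustration; they are not shipped with the skill.

# Return 0 when dotted version $1 is >= dotted version $2.
version_ge() {
  local IFS=.
  # Unquoted on purpose: split "2.1.81" into (2 1 81) using IFS=.
  local -a have=($1) need=($2)
  local i h n
  for i in 0 1 2; do
    h=${have[i]:-0}
    n=${need[i]:-0}
    # 10# forces base 10 so a component like "08" is not read as octal.
    if (( 10#$h > 10#$n )); then return 0; fi
    if (( 10#$h < 10#$n )); then return 1; fi
  done
  return 0  # all three components are equal
}

# Fail early if either prerequisite for --bare is missing.
check_bare_prereqs() {
  local min_version="2.1.81" installed
  if [ -z "${ANTHROPIC_API_KEY:-}" ]; then
    echo "ANTHROPIC_API_KEY is not set (--bare disables OAuth/keychain)" >&2
    return 1
  fi
  # Assumption: the first dotted triple in the output is the CC version.
  installed="$(claude --version 2>/dev/null | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1)"
  if [ -z "$installed" ] || ! version_ge "$installed" "$min_version"; then
    echo "Claude Code >= $min_version required, found: ${installed:-none}" >&2
    return 1
  fi
}
```

A script would typically call `check_bare_prereqs || exit 1` before its first grading or trigger call, so a missing key or an old CLI surfaces as one clear message rather than as a failure partway through a batch.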

Details

Author: yonatangross
Repository: yonatangross/orchestkit
Created: 5 months ago
Last Updated: today
Language: TypeScript
License: MIT
