evaluate-scenarios

Decompose each scenario into clean-context forks, measure framework-overhead bytes and hops per fork, and report a feasibility signal (heaviest fork's net load) and a cost signal (overhead summed across forks). Use to measure the operational overhead the framework imposes per agent.
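
The two signals in the description can be sketched in a few lines of Python. This is a minimal sketch, not the harness's own code. The `Load` record, its `size` field, and the sample byte counts are hypothetical. It also assumes that a fork's net load is its `overhead` plus `content` bytes, with `spared` loads excluded because the fork never reads them. The note referenced in the skill content defines the actual model.

```python
from dataclasses import dataclass


@dataclass
class Load:
    name: str
    kind: str  # "overhead", "content", or "spared"
    size: int  # bytes; hypothetical field, not a column named in the source
    hops: int


def net_load(fork):
    # Assumption: spared loads are skipped, so they add nothing to the load.
    return sum(l.size for l in fork if l.kind in ("overhead", "content"))


def overhead(fork):
    return sum(l.size for l in fork if l.kind == "overhead")


def signals(forks, budget=50_000):
    """Return the feasibility and cost signals for one scenario."""
    heaviest = max(forks, key=lambda name: net_load(forks[name]))
    feasibility = net_load(forks[heaviest])
    cost = sum(overhead(fork) for fork in forks.values())
    return {
        "heaviest_fork": heaviest,
        "feasibility_bytes": feasibility,
        "cost_bytes": cost,
        "within_budget": feasibility <= budget,
    }


# Made-up scenario with two forks.
forks = {
    "write": [
        Load("skill instructions", "overhead", 3000, 1),
        Load("candidate notes", "content", 6000, 3),
        Load("bodies skipped via index", "spared", 6000, 0),
    ],
    "validate": [
        Load("skill instructions", "overhead", 1500, 1),
        Load("validate output", "overhead", 500, 1),
    ],
}
result = signals(forks)
```

Feasibility is a maximum because each fork starts from a fresh context, so only the heaviest fork has to fit the window. Cost is a sum because every fork pays its overhead again.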

AI & Automation 69 stars 9 forks Updated today CC-BY-4.0

Quality Score: 88/100

- Stars (20%): 61
- Recency (20%): 100
- Frontmatter (20%): 70
- Documentation (15%): 100
- Issue Health (10%): 50
- License (10%): 100
- Description (5%): 100

Skill Content

## EXECUTE NOW

**Target: $ARGUMENTS**

Parse immediately:
- empty → evaluate all scenarios in `test/scenarios/`
- a scenario name → evaluate only that scenario
- `compare` → evaluate all and compare against a previous run if one exists

This harness measures **operational overhead**: the framework instructions an agent must read on top of the task's own content. The unit is the **fork**, not the operation: every `cp-skill-*` runs `context: fork`, so each fork pays its overhead from a fresh context. See `kb/notes/feasibility-is-the-heaviest-forks-net-load.md` for the model.

### 1. Discover scenario files

```bash
ls test/scenarios/*.md
```

Read each scenario. Each has a `## Forks` section with one subsection per fork; each fork has a table of loads: `load | kind | source | hops`, where `kind` is `overhead`, `content`, or `spared`.

### 2. Config (override via $ARGUMENTS, e.g. `notesize=3000 candidates=4 budget=50000 agents_per_fork=on`)

| Knob | Default | Meaning |
|---|---|---|
| `notesize` | 2,000 B | average note/body read |
| `candidates` | 3 | content notes opened where a fork prospects bodies |
| `spared_bodies` | 3 | bodies an index or description-listing read lets a fork skip |
| `index_size` | 3,000 B | one curated index read or scoped description listing |
| `validate_out` | 500 B | bytes a `commonplace-validate` run returns into context |
| `budget` | 50,000 B | usable-window soft ceiling for the feasibility flag (overhead + content + room to reason) |
| `agent...
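
The argument handling described above (empty, a scenario name, `compare`, plus `key=value` knob overrides) could be sketched as follows. This is an illustration under assumptions, not the skill's implementation. The function name `parse_arguments` and the returned `(mode, target, config)` shape are hypothetical. Only the knobs whose defaults are visible in the table are given defaults here. Any other `key=value` pair, such as `agents_per_fork=on`, is passed through as a string.

```python
# Defaults copied from the visible rows of the config table (bytes or counts).
DEFAULTS = {
    "notesize": 2000,
    "candidates": 3,
    "spared_bodies": 3,
    "index_size": 3000,
    "validate_out": 500,
    "budget": 50000,
}


def parse_arguments(arguments: str):
    """Split $ARGUMENTS into a run mode, an optional scenario, and knobs."""
    mode, target = "all", None
    config = dict(DEFAULTS)
    for token in arguments.split():
        if "=" in token:
            key, value = token.split("=", 1)
            # Numeric overrides become ints; anything else stays a string.
            config[key] = int(value) if value.isdigit() else value
        elif token == "compare":
            mode = "compare"
        else:
            mode, target = "single", token
    return mode, target, config


mode, target, config = parse_arguments(
    "notesize=3000 candidates=4 budget=50000 agents_per_fork=on"
)
```

With only overrides supplied, `mode` stays `"all"`, so every scenario in `test/scenarios/` would be evaluated with the adjusted knobs.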

Details

- Author: zby
- Repository: zby/commonplace
- Created: 3 months ago
- Last Updated: today
- Language: Python
- License: CC-BY-4.0
