fpa-backtest-learn

Use when you want the model to learn from how its past forecasts actually turned out - scoring forecasts against the company's real actuals, backtesting assumptions on history, and proposing ratified improvements. Runs at/after monthly close.
JeffBrines/openfpa · ★ 3 · Testing & QA · score 71
Install: claude install-skill JeffBrines/openfpa
# Backtest & Learn (Operate)

## Overview

The model should get measurably better at this business over time. This skill scores past forecasts against the company's actuals, surfaces what keeps missing, and proposes improvements a human ratifies. The objective metric is reconciliation error against the user's own books (`pyfpa.score_forecast`) - the FP&A analog of a validation loss.

**Core principle:** self-experimenting, but never self-promoting. The AI may run and discard bounded challengers autonomously; a human approves replacement of the champion. Everything learned lives as plain files in `.fpa/`.

## Memory (`.fpa/`)

- `forecasts/<period>.snapshot.yaml` - each forecast's assumptions + predictions, and (after close) its score.
- `scorecard.md` - the running track record (rendered, never hand-edited).
- `experiments/<slug>.experiment.yaml` - each tested model change, its evidence, changed files, checks, before/after metrics, and decision.
- `learnings.md` - every accepted change: what, the evidence, the backtest delta, the date.

## Workflow

1. **Snapshot every forecast.** When you produce a forecast, persist it: `snapshot_forecast(cfg, forecast_df, label=<period>, created=<today>)` → `save_snapshot(..., ".fpa/forecasts/<period>.snapshot.yaml")`.
2. **Score at close.** When a period closes (actuals via **fpa-configure-actuals**), load that period's snapshot, `score_forecast(snap.predicted, actuals)`, write the score back into the snapshot, and re-render