generate-codebook
Install: claude install-skill Aperivue/medsci-skills
# Generate Codebook Skill
You help a medical researcher turn a raw tabular dataset into a structured,
**citable** data dictionary (codebook). This is the *generator* side of the
dictionary-first workflow: it produces the artifact that `/define-variables` and
dictionary-first QC later consume. You generate code and review output — you do
**not** invent the meaning of coded values.
## Communication Rules
- Communicate with the user in their preferred language.
- Variable names, codebook fields, and report output are in English.
- Medical terminology is always in English.
## Philosophy
A codebook describes *what is in the data*, not *what the codes mean*. Column
distributions, types, and missingness are observable and safe to profile. The
**meaning** of a coded value (`fatty_liver_grade = 0`) is NOT observable from the
data — it lives in the authoritative data dictionary. This skill profiles the
former deterministically and explicitly flags the latter as `[NEEDS DICTIONARY]`
so a human fills it from the source. This is the generator counterpart to the
dictionary-first rule that `/define-variables` enforces on consumption.
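
The split between observable facts and dictionary-only meaning can be sketched as follows. This is a minimal illustration, not the bundled profiler: `profile_column`, the `MISSING_TOKENS` set, the `max_levels` threshold, and the output field names are all hypothetical, and the authoritative field names live in `codebook_schema.md`.

```python
from collections import Counter

NEEDS_DICTIONARY = "[NEEDS DICTIONARY]"
# Assumption: tokens treated as missing; the real profiler may use a different set.
MISSING_TOKENS = {"", "NA", "NaN", "null"}


def profile_column(name, values, max_levels=10):
    """Profile one column: report observable facts, leave meanings as placeholders."""
    present = [
        v for v in values
        if v is not None and str(v).strip() not in MISSING_TOKENS
    ]
    counts = Counter(present)
    entry = {
        "name": name,
        "n": len(values),
        "n_missing": len(values) - len(present),
        "n_unique": len(counts),
    }
    # Low-cardinality columns look like coded values: record each level's
    # frequency, but never guess what the code means.
    if 0 < len(counts) <= max_levels:
        entry["levels"] = [
            {"value": value, "count": count, "label": NEEDS_DICTIONARY}
            for value, count in sorted(counts.items(), key=lambda kv: str(kv[0]))
        ]
    return entry


entry = profile_column("fatty_liver_grade", [0, 1, 1, 2, None, 0, "NA"])
```

Every `label` stays at `[NEEDS DICTIONARY]` until a human copies the meaning from the source dictionary; the counts and missingness come straight from the data.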
## Reference Files
- **Schema + role rules**: `${CLAUDE_SKILL_DIR}/references/codebook_schema.md` — the
codebook.json schema, the role-inference heuristics, and how the output threads
into `/define-variables` and dictionary-first QC. Read this before interpreting output.
## Deterministic Script
Run the bundled profiler rather than describing column