Install: `claude install-skill RAFCERAY/claude-skills-data-tasks`
# Feature Engineer
A standardized feature engineering skill. Same logic, every dataset, fully traceable.
## When to use this skill
Activate when the user has a **clean-ish tabular dataset** and wants to make it model-ready. Typical signals:
- "Prepare these features for a model"
- "Encode my categorical variables"
- "Build a feature pipeline"
- "I want to train a model on this — what should I do first?"
- "Fais du feature engineering" / "encode mes variables" (French for "Do some feature engineering" / "encode my variables")
**Pre-condition:** the data should already be reasonably clean. If more than 30% of values are missing in many columns, or there are obvious quality issues, **call `eda-explorer` first** and tell the user to clean the data before feature engineering.
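The pre-condition can be checked mechanically before any phase runs. The snippet below is a minimal sketch, assuming pandas: the 30% threshold comes from the rule above, but `MANY_COLUMNS_SHARE` and the `missingness_report` helper are hypothetical, since this skill does not define how many columns count as "many".

```python
import pandas as pd

MISSING_THRESHOLD = 0.30   # per-column limit taken from the pre-condition
MANY_COLUMNS_SHARE = 0.25  # hypothetical cut-off: "many" is not defined by the skill


def missingness_report(df: pd.DataFrame) -> dict:
    """Summarise which columns exceed the missing-value threshold."""
    # Fraction of missing values in each column.
    missing_share = df.isna().mean()
    flagged = missing_share[missing_share > MISSING_THRESHOLD]
    flagged = flagged.sort_values(ascending=False)
    # Share of all columns that are above the threshold.
    share_flagged = len(flagged) / max(df.shape[1], 1)
    return {
        "flagged_columns": {col: round(float(v), 3) for col, v in flagged.items()},
        "share_of_columns_flagged": share_flagged,
        "needs_eda_first": share_flagged >= MANY_COLUMNS_SHARE,
    }


example = pd.DataFrame({
    "age": [23, None, None, 41],        # 50% missing
    "city": ["Paris", None, None, None],  # 75% missing
    "plan": ["a", "b", "a", "c"],       # complete
    "score": [0.1, 0.4, None, 0.9],     # 25% missing, under the threshold
})
report = missingness_report(example)
print(report)
```

If `needs_eda_first` is true, the skill would hand off to `eda-explorer` instead of continuing. The cut-off for "many" should be tuned to the dataset rather than taken from this sketch.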
**Do NOT activate this skill for:**
- Initial exploration (use `eda-explorer`)
- Time-series-specific features (lags, rolling windows): wait for the `time-series-features` skill
- Text NLP feature extraction (out of scope)
## Workflow
For every dataset, follow these 5 phases in order:
### Phase 1 — Type detection
Auto-classify each column into one of:
- `numeric` (int, float, bool)
- `categorical_low_card` (object, < 10 unique values)
- `categorical_high_card` (object, 10–50 unique values)
- `categorical_very_high_card` (object, > 50 unique values)
- `datetime`
- `id_or_constant` (drop these — `n_unique == n_rows` or `n_unique == 1`)
- `text` (object with average string length > 30 chars — out of scope, drop with warning)
Print the classification table to the user **before** proceeding.
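The classification rules above could be implemented along these lines. This is a sketch assuming pandas, not the skill's actual code: the function names are hypothetical, and three choices are assumptions the rules leave open. The ID check (`n_unique == n_rows`) is applied only to integer and string columns so that continuous floats are not dropped, the text check runs before the ID check so free-text columns still get their warning, and an all-null column is treated as constant.

```python
import pandas as pd
from pandas.api import types as ptypes

TEXT_AVG_LEN = 30  # average string length above which a column counts as text


def classify_column(s: pd.Series) -> str:
    """Return the Phase 1 label for a single column."""
    n_rows = len(s)
    n_unique = s.nunique(dropna=True)

    # Constant columns carry no signal, whatever their dtype.
    # Assumption: an all-null column (n_unique == 0) is treated as constant.
    if n_unique <= 1:
        return "id_or_constant"
    if ptypes.is_datetime64_any_dtype(s):
        return "datetime"
    if ptypes.is_numeric_dtype(s):  # covers int, float and bool
        # Assumption: only integer columns can be row IDs; a float column
        # with all-distinct values is more likely a continuous measurement.
        if ptypes.is_integer_dtype(s) and n_unique == n_rows:
            return "id_or_constant"
        return "numeric"

    # Everything else is handled as an object/string column.
    avg_len = s.dropna().astype(str).str.len().mean()
    if avg_len > TEXT_AVG_LEN:
        return "text"  # out of scope: drop with a warning
    if n_unique == n_rows:
        return "id_or_constant"
    if n_unique < 10:
        return "categorical_low_card"
    if n_unique <= 50:
        return "categorical_high_card"
    return "categorical_very_high_card"


def classify_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Build the classification table that is shown to the user."""
    rows = [
        {
            "column": col,
            "dtype": str(df[col].dtype),
            "n_unique": int(df[col].nunique()),
            "detected_type": classify_column(df[col]),
        }
        for col in df.columns
    ]
    return pd.DataFrame(rows)


df = pd.DataFrame({
    "user_id": [101, 102, 103, 104, 105, 106],
    "age": [23, 35, 35, 41, 23, 52],
    "income": [1.5, 2.5, 3.5, 4.5, 5.5, 6.5],
    "plan": ["a", "b", "a", "c", "b", "a"],
    "country": ["FR"] * 6,
    "signup": pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03",
                              "2024-01-04", "2024-01-05", "2024-01-06"]),
})
table = classify_columns(df)
print(table.to_string(index=False))
```

Printing the table before any transformation gives the user a chance to correct a misclassified column, for example an integer code that is really a category.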
### Phase 2 —