RAFCERAY

data-quality-report

Use this skill when the user wants to generate an auditable PDF data quality report from a tabular dataset (CSV, Excel, Parquet). Triggers include "generate a quality report", "audit this dataset", "produce a compliance PDF", "DAMA quality report", "génère un rapport qualité", "audit data quality", "rapport de conformité données", "PDF auditable", "rapport gouvernance". Produces a multi-page PDF aligned to the six DAMA-DMBOK2 data quality dimensions (completeness, uniqueness, validity, consistency, timeliness, accuracy), a machine-readable JSON file, and an audit log. Includes GDPR (RGPD), ISO 8000, and BCBS 239 compliance checks.
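As a rough illustration of how two of the six dimensions could be scored, here is a minimal sketch using pandas. The function name, the sample columns, and the exact formulas are assumptions made for this example; the skill's own scoring rules are not described here.

```python
import pandas as pd


def completeness_and_uniqueness(df: pd.DataFrame, key: str) -> dict:
    """Hypothetical helper: score two dimensions on a 0..1 scale."""
    total_cells = df.size
    # Completeness (assumed definition): share of cells that are not missing.
    completeness = 1.0 - df.isna().sum().sum() / total_cells if total_cells else 1.0
    # Uniqueness (assumed definition): distinct key values over row count.
    uniqueness = df[key].nunique(dropna=True) / len(df) if len(df) else 1.0
    return {
        "completeness": round(float(completeness), 4),
        "uniqueness": round(float(uniqueness), 4),
    }


# Illustrative data only; the column names are made up for this sketch.
sample = pd.DataFrame(
    {
        "customer_id": [1, 2, 2, 4],
        "email": ["a@x.org", None, "c@x.org", "d@x.org"],
    }
)
scores = completeness_and_uniqueness(sample, key="customer_id")
print(scores)
```

One missing cell out of eight and one duplicated key out of four rows are what drive the two scores in this sample.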

eda-explorer

Use this skill when the user uploads or references a tabular dataset (CSV, Excel, Parquet, TSV) and asks to explore, profile, summarize, understand, or do EDA on it. Triggers include "explore this dataset", "what's in this data", "EDA on", "profile this", "describe this dataset", "tell me about these data", "fais une exploration", "profile-moi", "fais l'EDA". Generates a standardized 9-section EDA report covering shape, schema, missing values, descriptive statistics, distributions, correlations, outliers, a data quality score, and recommendations.
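A few of the report sections listed above (shape, missing values, outliers) can be approximated in a handful of pandas calls. This is a sketch under assumptions: the sample data is invented, and the 1.5 × IQR rule is one common outlier convention, not necessarily the one the skill applies.

```python
import pandas as pd

# Illustrative data only; the column names are made up for this sketch.
df = pd.DataFrame(
    {
        "price": [10.0, 11.0, 12.0, 11.5, 10.5, 95.0],
        "qty": [1, 2, None, 2, 1, 3],
    }
)

profile = {
    "shape": df.shape,
    "dtypes": df.dtypes.astype(str).to_dict(),
    # Fraction of missing values per column.
    "missing_rate": df.isna().mean().round(3).to_dict(),
}

# Outliers via the 1.5 * IQR rule (an assumed convention).
q1, q3 = df["price"].quantile([0.25, 0.75])
iqr = q3 - q1
outliers = df[(df["price"] < q1 - 1.5 * iqr) | (df["price"] > q3 + 1.5 * iqr)]

print(profile)
print(outliers)
```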

feature-engineer

Use this skill when the user wants to prepare a dataset for a machine learning model — encoding categorical variables, scaling numeric features, decomposing datetime columns, creating interactions, or building a reusable feature pipeline. Triggers include "prepare features", "encode my data", "feature engineering", "build a pipeline", "make this ML-ready", "fais du feature engineering", "encode mes variables", "prépare mes features pour un modèle". Outputs a fitted scikit-learn pipeline (pickled), a feature dictionary (JSON), and the transformed dataset.

missing-data-imputation

Use this skill when the user wants to fill missing values in a tabular dataset and obtain a reusable, fitted scikit-learn imputer plus an auditable report of what was imputed. Triggers include "impute missing values", "fill the NaNs", "handle missing data", "KNN imputation", "iterative / MICE imputation", "remplir les valeurs manquantes", "impute mes données", "gérer les données manquantes", "imputation KNN", "imputation itérative". Supports numeric strategies (mean, median, KNN, iterative/MICE) and categorical strategies (mode, constant), exports a pickled fitted imputer for reuse on new data, and produces a JSON report mapping each column to its strategy, fill value, and missing counts. Aligned to the DAMA-DMBOK2 Completeness dimension.
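The per-column report described above might look roughly like the following, built here with scikit-learn's `SimpleImputer`. The column names, the strategy mapping, and the report keys are assumptions for this sketch. Note that scikit-learn calls the mode strategy `most_frequent`.

```python
import json
import pickle

import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Illustrative data only; the column names are made up for this sketch.
df = pd.DataFrame(
    {
        "income": [30.0, np.nan, 50.0, 70.0],
        "segment": pd.Series(["a", "b", np.nan, "b"], dtype=object),
    }
)

# Hypothetical mapping of column -> strategy.
strategies = {"income": "median", "segment": "most_frequent"}

filled = df.copy()
imputers, report = {}, {}
for column, strategy in strategies.items():
    imputer = SimpleImputer(strategy=strategy)
    values = imputer.fit_transform(df[[column]])
    filled[column] = values[:, 0]
    imputers[column] = imputer

    fill = imputer.statistics_[0]
    report[column] = {
        "strategy": strategy,
        # Convert NumPy scalars to plain Python values for JSON.
        "fill_value": fill.item() if hasattr(fill, "item") else fill,
        "missing_count": int(df[column].isna().sum()),
    }

report_json = json.dumps(report, indent=2)
imputer_blob = pickle.dumps(imputers)  # reusable on new data after loading
print(report_json)
```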


time-series-features

Use this skill when the user has a time-series dataset (rows ordered chronologically by a datetime column) and wants to engineer time-aware features for forecasting or modeling. Triggers include "create lag features", "rolling window features", "time series feature engineering", "forecast features", "make this dataset ready for time series modeling", "génère des features temporelles", "feature engineering séries temporelles", "lags et rolling", "détecte la saisonnalité". Generates lag features, rolling aggregations, datetime decomposition, business calendar variables (holidays), and stationarity diagnostics. Includes a chronological train/test split and an Augmented Dickey-Fuller (ADF) test.
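A minimal pandas sketch of lag features, a rolling aggregation, datetime decomposition, and a chronological split follows. The data, the column names, the window size, and the 80/20 cutoff are assumptions for illustration. Holiday variables and the ADF test are left out because they need additional libraries.

```python
import pandas as pd

# Illustrative data only; the column names are made up for this sketch.
ts = pd.DataFrame(
    {
        "date": pd.date_range("2024-01-01", periods=10, freq="D"),
        "sales": [10, 12, 13, 15, 14, 18, 20, 19, 22, 25],
    }
)
ts = ts.sort_values("date").reset_index(drop=True)

# Lag feature: the previous day's value.
ts["sales_lag_1"] = ts["sales"].shift(1)
# Rolling mean over the three previous days; shifting first keeps the
# current day's value out of its own feature (no leakage).
ts["sales_roll_mean_3"] = ts["sales"].shift(1).rolling(window=3).mean()
# Datetime decomposition: Monday is 0.
ts["day_of_week"] = ts["date"].dt.dayofweek

# Drop the warm-up rows that have no complete history.
ts = ts.dropna().reset_index(drop=True)

# Chronological split: the earliest 80% of rows train, the rest test.
cutoff = int(len(ts) * 0.8)
train, test = ts.iloc[:cutoff], ts.iloc[cutoff:]
```

Splitting by position after sorting, rather than sampling at random, keeps every test row later in time than every training row.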