
eda-explorer

Use this skill when the user uploads or references a tabular dataset (CSV, Excel, Parquet, TSV) and asks to explore, profile, summarize, understand, or do EDA on it. Triggers include "explore this dataset", "what's in this data", "EDA on", "profile this", "describe this dataset", "tell me about these data", "fais une exploration", "profile-moi", "fais l'EDA". Generates a standardized 9-section EDA report covering shape, schema, missing values, descriptive statistics, distributions, correlations, outliers, a data quality score, and recommendations.
RAFCERAY/claude-skills-data-tasks · ★ 0 · Data & Documents · score 60
Install: claude install-skill RAFCERAY/claude-skills-data-tasks
# EDA Explorer

A reproducible exploratory data analysis skill. Same dataset → same report structure → no surprises.

## When to use this skill

Activate this skill whenever the user wants to **understand a tabular dataset** before doing anything else with it. Typical signals:

- User uploads a `.csv`, `.xlsx`, `.parquet`, or `.tsv` file
- User asks "what's in this data?", "explore this", "do an EDA", "fais une exploration"
- User mentions a dataset path or DataFrame variable and asks to "profile" it
- User mentions wanting to understand the shape, quality, or distributions

**Do NOT activate this skill for:**

- Pure modeling questions (use feature-engineer or model-trainer instead)
- Quick one-off questions about a single column (just answer directly)
- Datasets with > 1M rows without sampling first

## Workflow

For every dataset, follow these steps in order:

1. **Load & inspect**: read the file with the right pandas reader, check shape and dtypes
2. **Run all 9 sections** of the standard report (below); never skip a section
3. **Output as a single markdown report** with clear `##` headers per section
4. **End with concrete recommendations**, not generic advice

## Standard EDA Report Structure

The report **must** have exactly these 9 sections, in this order:

### 1. Dataset Overview

- File name, file size on disk, file format
- Shape: `(n_rows, n_cols)`
- Memory footprint (`df.memory_usage(deep=True).sum()` in MB)
- A small `head(5)` preview

### 2. Schema & Types

- Ta