data-explorationlisted
Install: claude install-skill Safen99/opencode-cowork-plugins
# Data Exploration Skill
Systematic methodology for profiling datasets, assessing data quality, discovering patterns, and understanding schemas.
## Data Profiling Methodology
### Phase 1: Structural Understanding
Before analyzing any data, understand its structure:
**Table-level questions:**
- How many rows and columns?
- What is the grain (one row per what)?
- What is the primary key? Is it unique?
- When was the data last updated?
- How far back does the data go?
**Column classification:**
Categorize each column as one of:
- **Identifier**: Unique keys, foreign keys, entity IDs
- **Dimension**: Categorical attributes for grouping/filtering (status, type, region, category)
- **Metric**: Quantitative values for measurement (revenue, count, duration, score)
- **Temporal**: Dates and timestamps (created_at, updated_at, event_date)
- **Text**: Free-form text fields (description, notes, name)
- **Boolean**: True/false flags
- **Structural**: JSON, arrays, nested structures
### Phase 2: Column-Level Profiling
For each column, compute:
**All columns:**
- Null count and null rate
- Distinct count and cardinality ratio (distinct / total)
- Most common values (top 5-10 with frequencies)
- Least common values (bottom 5 to spot anomalies)
**Numeric columns (metrics):**
```
min, max, mean, median (p50)
standard deviation
percentiles: p1, p5, p25, p75, p95, p99
zero count
negative count (if unexpected)
```
**String columns (dimensions, text):**
```
min length, max length, avg l