data-wrangler

Solid

Production-grade tabular data manipulation using pandas & openpyxl. This skill should be used when editing, creating, filtering, sorting, merging, pivoting, deduplicating, validating, or transforming CSV, Excel (xlsx/xls), JSON, Parquet, or TSV files. Supports 18 operations via CLI scripts, advanced Excel formatting (multi-sheet, freeze, auto-filter, validation, styling), and file-converter integration for format pipelines.

Data & Documents 26 stars 8 forks Updated 1 week ago MIT

Quality Score: 84/100

Stars (20%): 48
Recency (20%): 90
Frontmatter (20%): 70
Documentation (15%): 100
Issue Health (10%): 50
License (10%): 100
Description (5%): 100

Skill Content

# Data Wrangler

Manipulate tabular data (CSV, Excel, JSON, Parquet, TSV) w/ pandas-powered scripts. Two scripts cover all operations: `data_wrangler.py` for data ops, `excel_toolkit.py` for Excel-specific features.

## When to Use

- User asks to read, edit, filter, sort, or transform CSV/Excel/JSON/Parquet/TSV files
- User asks to merge/join datasets, deduplicate, fill missing values, or validate data
- User asks to create Excel workbooks w/ formatting, dropdowns, freeze panes, or multi-sheet
- User asks to pivot, unpivot, group-by, aggregate, sample, or split datasets
- User asks to add computed columns, rename columns, cast types, or apply formulas
- User asks to convert between data formats (CSV -> Excel, JSON -> Parquet, etc.)
- User asks to inspect/profile data structure, types, nulls, stats

## Prerequisites

```bash
# Required
pip install pandas openpyxl

# Optional (per feature)
pip install pyarrow       # Parquet support
pip install xlrd          # Legacy .xls read
pip install pandasql      # SQL queries on DataFrames
pip install fastparquet   # Alternative Parquet engine
```

## Quick Routing

| Task | Script | Command |
|------|--------|---------|
| Inspect/profile data | `data_wrangler.py` | `inspect` |
| Filter rows | `data_wrangler.py` | `filter --where "expr"` |
| Sort by columns | `data_wrangler.py` | `sort --by Col --desc` |
| Group & aggregate | `data_wrangler.py` | `group --by Col --agg "Col:func"` |
| Merge/join files | `data_wrangler.py` | `...
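
The excerpt above is cut off before it shows how the scripts work internally, so the following is only a rough sketch of the kind of pandas calls that filter, sort, and group operations typically map to. It does not reproduce `data_wrangler.py`; the sample data, column names, and variable names are hypothetical, and the mapping from CLI flags to pandas calls is an assumption.

```python
import pandas as pd

# Hypothetical sample data; the column names are illustrative only.
orders = pd.DataFrame({
    "region": ["north", "south", "north", "east", "south"],
    "amount": [120.0, 80.0, 200.0, 50.0, 150.0],
})

# Filter rows with an expression string (assumed analogue of: filter --where "expr").
large = orders.query("amount >= 100")

# Sort by a column in descending order (assumed analogue of: sort --by Col --desc).
ranked = large.sort_values(by="amount", ascending=False)

# Group by one column and aggregate another
# (assumed analogue of: group --by Col --agg "Col:func").
totals = orders.groupby("region", as_index=False).agg(total=("amount", "sum"))

print(ranked)
print(totals)
```

Writing the result with `to_csv` or `to_excel` would complete the round trip. The script's actual output options are not visible in the excerpt.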

Details

Author: georgekhananaev
Repository: georgekhananaev/claude-skills-vault
Created: 6 months ago
Last Updated: 1 week ago
Language: Python
License: MIT

Similar Skills

Semantically similar based on skill content — not just same category

Data & Documents Solid

xlsx

Use this skill any time a spreadsheet file is the primary input or output. This means any task where the user wants to: open, read, edit, or fix an existing .xlsx, .xlsm, .xltx, .csv, or .tsv file (e.g., adding columns, computing formulas, formatting, charting, cleaning messy data); create a new spreadsheet from scratch or from other data sources; or convert between tabular file formats. Trigger especially when the user references a spreadsheet file by name or path — even casually (like "the xlsx in my downloads") — and wants something done to it or produced from it. Also trigger for cleaning or restructuring messy tabular data files (malformed rows, misplaced headers, junk data) into proper spreadsheets. The deliverable must be a spreadsheet file. Do NOT trigger when the primary deliverable is a Word document, HTML report, standalone Python script, database pipeline, or Google Sheets API integration, even if tabular data is involved.

17 Updated yesterday

Data & Documents Featured

xlsx

Use this skill any time a spreadsheet file is the primary input or output. This means any task where the user wants to: open, read, edit, or fix an existing .xlsx, .xlsm, .xltx, .csv, or .tsv file (e.g., adding columns, computing formulas, formatting, charting, cleaning messy data); create a new spreadsheet from scratch or from other data sources; or convert between tabular file formats. Trigger especially when the user references a spreadsheet file by name or path — even casually (like "the xlsx in my downloads") — and wants something done to it or produced from it. Also trigger for cleaning or restructuring messy tabular data files (malformed rows, misplaced headers, junk data) into proper spreadsheets. The deliverable must be a spreadsheet file. Do NOT trigger when the primary deliverable is a Word document, HTML report, standalone Python script, database pipeline, or Google Sheets API integration, even if tabular data is involved.

822 Updated 2 days ago

Data & Documents Listed

clean-data-xls

Clean up messy spreadsheet data — trim whitespace, fix inconsistent casing, convert numbers-stored-as-text, standardize dates, remove duplicates, and flag mixed-type columns. Use when data is messy, inconsistent, or needs prep before analysis. Triggers on "clean this data", "clean up this sheet", "normalize this data", "fix formatting", "dedupe", "standardize this column", and "this data is messy".

1 Updated yesterday