data-pipeline-builder

Solid

Designs and builds ETL/ELT data pipelines. Takes data sources, a destination, and transformation requirements as input. Generates pipeline code (Python/SQL), scheduling config, error handling, monitoring setup, and data quality checks. Outputs data-pipeline-spec.md plus implementation files.

Data & Documents | 180 stars | 30 forks | Updated 4 days ago | MIT

Quality Score: 91/100

Stars (20%): 75
Recency (20%): 100
Frontmatter (20%): 70
Documentation (15%): 100
Issue Health (10%): 50
License (10%): 100
Description (5%): 100

Skill Content

# Data Pipeline Builder

Design and implement production-grade ETL/ELT data pipelines: take data sources, a destination, and transformation requirements, then produce a complete pipeline specification plus all implementation files needed to run it.

## Contents

- `references/project-structure.md` -- output file layout, architecture pattern selection, component selection.
- `references/python-patterns.md` -- Python code standards and base extractor/transformer/loader/retry patterns.
- `references/quality-checks.md` -- composable data quality check framework and built-in checks.
- `references/orchestration-config.md` -- Airflow DAG, pipeline config YAML, and monitoring/alerting patterns.
- `references/spec-template.md` -- the `data-pipeline-spec.md` output template.

## Workflow

1. Gather requirements. If the user gave clear requirements, proceed to design. Otherwise ask targeted questions: data sources (databases, APIs, files, streams); destination (warehouse, lake, database); transformations (joins, aggregations, filters, business rules); freshness requirement (real-time, hourly, daily); technology preferences (Airflow, dbt, Spark, cloud provider); data quality and compliance requirements.
2. Analyze and design. Catalog each source (connection type, auth, schema, volume, CDC availability, rate limits). Define the destination (platform, schema design, partitioning, clustering, access patterns). Map transformations (field mappings, business logic, type conversions, joins, ag...
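
As an illustration of the source-cataloging step in the workflow above, here is a minimal Python sketch of what a catalog entry might look like. It is not taken from the skill's reference files: the class name `SourceCatalogEntry`, its field names, the `extraction_mode` helper, and the rule "prefer incremental extraction when CDC is available" are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SourceCatalogEntry:
    """Hypothetical record for one data source (names are illustrative)."""
    name: str
    connection_type: str                      # e.g. "postgres", "rest_api", "files"
    auth: str                                 # e.g. "service_account", "api_key"
    schema: dict[str, str]                    # column name -> declared type
    est_rows_per_day: int                     # rough volume estimate
    cdc_available: bool = False               # change data capture supported?
    rate_limit_per_min: Optional[int] = None  # None means no known limit


def extraction_mode(source: SourceCatalogEntry) -> str:
    """Assumed rule of thumb: use incremental extraction when CDC exists."""
    return "incremental" if source.cdc_available else "full_refresh"


# Example entries (made-up values, for illustration only).
orders = SourceCatalogEntry(
    name="orders",
    connection_type="postgres",
    auth="service_account",
    schema={"order_id": "int", "amount": "decimal", "created_at": "timestamp"},
    est_rows_per_day=50_000,
    cdc_available=True,
)
rates = SourceCatalogEntry(
    name="fx_rates",
    connection_type="rest_api",
    auth="api_key",
    schema={"currency": "str", "rate": "float"},
    est_rows_per_day=200,
    rate_limit_per_min=60,
)

for src in (orders, rates):
    print(src.name, extraction_mode(src))
```

Keeping the catalog as plain data makes it easy to review with the user before any pipeline code is generated; a real design would likely capture more detail per source than this sketch does.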

Details

Author: OneWave-AI
Repository: OneWave-AI/claude-skills
Created: 7 months ago
Last Updated: 4 days ago
Language: N/A
License: MIT
