data-pipeline-spec

Solid

Design an ETL/ELT data pipeline specification. Use when asked to design a data pipeline, spec an ETL or ELT process, document a data ingestion workflow, or plan a data integration. Produces a complete pipeline spec with sources, transforms, destinations, SLAs, error handling, and data quality rules.

Data & Documents 915 stars 165 forks Updated 3 days ago MIT

Quality Score: 93/100

Stars (20%): 99
Recency (20%): 100
Frontmatter (20%): 70
Documentation (15%): 100
Issue Health (10%): 50
License (10%): 100
Description (5%): 100

Skill Content

# Data Pipeline Spec Skill

This skill produces a complete data pipeline specification covering sources, transformations, destinations, scheduling, SLAs, error handling, data quality checks, and monitoring requirements. Output is ready for engineering handoff or architecture review.

## Required Inputs

Ask the user for these if not provided:

- **Pipeline purpose**: what business question or workflow does this pipeline serve?
- **Source systems**: where does data come from? (databases, APIs, files, event streams)
- **Destination**: where does data land? (data warehouse, data lake, downstream DB, reporting tool)
- **Transformation type**: ETL (transform before loading) or ELT (load raw, transform in warehouse)?
- **Frequency / SLA**: how often must data be fresh? (real-time / hourly / daily / weekly)
- **Volume estimate**: approximate rows/events per run
- **Data quality requirements**: completeness, deduplication, freshness, schema enforcement
- **Team or stack**: any specific tools in use? (Airflow, dbt, Fivetran, Spark, Kafka, etc.)

## Output Structure

---

# Data Pipeline Spec: [Pipeline Name]

**Purpose:** [One sentence: what decision or workflow does this pipeline enable?]
**Type:** [ETL / ELT / Streaming / Batch]
**Owner:** [Team or individual]
**Version:** [1.0]
**Date:** [Date]
**Status:** [Draft / Under Review / Approved]

---

## 1. Overview

[2–3 sentences describing the pipeline end-to-end: what data moves, from where to where, at what cadence, and why....
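
As an illustration of the Required Inputs checklist in the skill content above, the eight inputs could be tracked in a small structure that reports which ones the user still has to supply. This is a minimal sketch, not part of the skill itself: the class name `PipelineInputs`, its field names, the helper `missing_inputs`, and the sample values are all hypothetical.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class PipelineInputs:
    """Hypothetical container for the eight required inputs (names are illustrative)."""
    purpose: Optional[str] = None
    source_systems: Optional[str] = None
    destination: Optional[str] = None
    transformation_type: Optional[str] = None  # "ETL" or "ELT"
    frequency_sla: Optional[str] = None
    volume_estimate: Optional[str] = None
    data_quality_requirements: Optional[str] = None
    team_or_stack: Optional[str] = None


def missing_inputs(inputs: PipelineInputs) -> List[str]:
    """Return the names of inputs that are still empty, in declaration order."""
    return [f.name for f in fields(inputs) if not getattr(inputs, f.name)]


# Made-up sample values: only the first four inputs have been provided.
example = PipelineInputs(
    purpose="Daily revenue reporting",
    source_systems="orders database",
    destination="data warehouse",
    transformation_type="ELT",
)
print(missing_inputs(example))
```

Anything returned by `missing_inputs` corresponds to a question that would still need to be asked before the spec is drafted.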

Details

Author: mohitagw15856
Repository: mohitagw15856/pm-claude-skills
Created: 4 months ago
Last Updated: 3 days ago
Language: Shell
License: MIT

Similar Skills

Semantically similar based on skill content — not just same category

Data & Documents Listed

pipeline-design

Design ETL/ELT pipelines end-to-end — source connectors, extraction strategies, transform logic, load patterns, idempotency, scheduling, and error handling. Use this skill whenever the user is starting a new ingestion job, planning how data moves from a source (REST API, database, file, webhook, message queue) into a data warehouse or data lake. Also trigger when the user asks about pipeline architecture, incremental vs. full loads, backfill strategies, CDC, retry logic, or orchestration choices (Airflow, Prefect, dbt). This skill should feel like pairing with a senior data engineer on day one of a new pipeline project.

0 Updated 4 days ago
Methasit-Pun
Data & Documents Featured

data-engineering-data-pipeline

You are a data pipeline architecture expert specializing in scalable, reliable, and cost-effective data pipelines for batch and streaming data processing.

39,227 Updated today
sickn33
Data & Documents Listed

data-engineering-data-pipeline

You are a data pipeline architecture expert specializing in scalable, reliable, and cost-effective data pipelines for batch and streaming data processing.

335 Updated today
aiskillstore
Data & Documents Listed

pipeline-architect

Designs and implements data pipelines: ETL/ELT, streaming, batch processing, schema migrations, and data warehouse architecture. Covers Kafka, Airflow, dbt, Spark, ClickHouse, BigQuery, Snowflake, Redis Streams, and more. Use this skill when the user asks about data pipelines, ETL jobs, data transformation, streaming setup, data warehouse design, CDC, schema migrations, data quality checks, or anything involving moving data from source to target. Also triggers on "build a pipeline," "migrate data from X to Y," "set up streaming," "design my data warehouse," or "data quality is bad, help me fix it."

1 Updated 2 days ago
mturac
Data & Documents Solid

etl-pipeline-builder

Build and manage ETL pipelines for data migration with transformation, CDC, and monitoring

1,034 Updated today
a5c-ai