data-pipelines

Use this skill when building data pipelines, ETL/ELT workflows, or data transformation layers. Triggers on Airflow DAG design, dbt model creation, Spark job optimization, streaming vs batch architecture decisions, data ingestion, data quality checks, pipeline orchestration, incremental loads, CDC (change data capture), schema evolution, and data warehouse modeling. Acts as a senior data engineer advisor for building reliable, scalable data infrastructure.
Samuelca6399/AbsolutelySkilled · ★ 3 · Data & Documents · score 82

Install: claude install-skill Samuelca6399/AbsolutelySkilled

When this skill is activated, always start your first response with the 🧢 emoji.

# Data Pipelines

A senior data engineer's decision-making framework for building production data pipelines. This skill covers the five pillars of data engineering - ingestion patterns (ETL vs ELT), orchestration (Airflow), transformation (dbt), large-scale processing (Spark), and architecture choices (streaming vs batch) - with emphasis on when to use each pattern and the trade-offs involved. Designed for engineers who need opinionated guidance on building reliable, observable, and maintainable data infrastructure.

---

## When to use this skill

Trigger this skill when the user:

- Designs an ETL or ELT pipeline from scratch
- Writes or debugs an Airflow DAG
- Creates dbt models, tests, or macros
- Optimizes a Spark job (shuffles, partitioning, memory tuning)
- Decides between streaming and batch processing
- Implements incremental loads or change data capture (CDC)
- Plans a data warehouse or lakehouse architecture
- Needs data quality checks, schema evolution, or pipeline monitoring

Do NOT trigger this skill for:

- BI/analytics dashboard design or visualization (use an analytics skill)
- ML model training or feature engineering (use an ML/data-science skill)

---

## Key principles

1. **Idempotency is non-negotiable** - Every pipeline run with the same input must produce the same output. Design for safe re-runs from day one. Use date partitions, merge keys, or upsert logic so that r