architecting-datalisted
Install: claude install-skill ancoleman/ai-design-components
# Data Architecture
## Purpose
Guide architects and platform engineers through strategic data architecture decisions for modern cloud-native data platforms.
## When to Use This Skill
Invoke this skill when:
- Designing a new data platform or modernizing legacy systems
- Choosing between data lake, data warehouse, or data lakehouse
- Deciding on data modeling approaches (dimensional, normalized, data vault, wide tables)
- Evaluating centralized vs data mesh architecture
- Selecting open table formats (Apache Iceberg, Delta Lake, Apache Hudi)
- Designing medallion architecture (bronze, silver, gold layers)
- Implementing data governance and cataloging
## Core Concepts
### 1. Storage Paradigms
Three primary patterns for analytical data storage:
**Data Lake:** Centralized repository for raw data at scale
- Schema-on-read, cost-optimized ($0.02-0.03/GB/month)
- Use when: Diverse data sources, exploratory analytics, ML/AI training data
**Data Warehouse:** Structured repository optimized for BI
- Schema-on-write, ACID transactions, fast queries
- Use when: Known BI requirements, strong governance needed
**Data Lakehouse:** Hybrid combining lake flexibility with warehouse reliability
- Open table formats (Iceberg, Delta Lake), ACID on object storage
- Use when: Mixed BI + ML workloads, cost optimization (60-80% cheaper than warehouse)
**Decision Framework:**
- BI/Reporting only + Known queries → Data Warehouse
- ML/AI primary + Raw data needed → Data Lake or Lakehouse
- Mi