nw-data-architecture-patterns

Solid

Data architecture patterns (warehouse, lake, lakehouse, mesh), ETL/ELT pipelines, streaming architectures, scaling strategies, and schema design patterns

Data & Documents 523 stars 54 forks Updated 1 weeks ago MIT

Install

View on GitHub

Quality Score: 92/100

Stars 20%
91
Recency 20%
90
Frontmatter 20%
70
Documentation 15%
100
Issue Health 10%
50
License 10%
100
Description 5%
100

Skill Content

# Data Architecture Patterns ## Architecture Selection Decision Tree Structured only -> **Data Warehouse** | Mixed + SQL analytics -> **Data Lakehouse** | Mixed + ML-primary -> **Data Lake** | Large org + autonomous domains -> **Data Mesh** ## Data Warehouse Schema: structured, schema-on-write | Data: tables, rows, columns | Governance: centralized | Query: SQL analytics, BI | Architecture: centralized single source of truth ### Schema Patterns **Star Schema**: Central fact table (measures) surrounded by denormalized dimension tables. Best for BI dashboards, standard reporting. **Snowflake Schema**: Normalized dimensions (dimensions reference other dimensions). Reduces storage, increases JOIN complexity. Best when storage cost matters more than query speed. ### Kimball vs Inmon **Kimball (Bottom-Up)**: Build data marts first, integrate later | Star schema, business-process driven | Faster initial delivery | Best for quick wins, department-level analytics **Inmon (Top-Down)**: Build enterprise DW first, derive data marts | Normalized 3NF enterprise model | Higher upfront effort | Best for large enterprises needing single source of truth Technology: Snowflake | Amazon Redshift | Google BigQuery | Azure Synapse Analytics ## Data Lake Schema-on-read, flexible | All formats (structured, semi-structured, unstructured) | Raw data in native format | Query via Athena, Spark SQL, PySpark, Pandas | Risk: "data swamp" without governance ### Organization Zones: **raw** (land...

Details

Author
nWave-ai
Repository
nWave-ai/nWave
Created
3 months ago
Last Updated
1 weeks ago
Language
Python
License
MIT

Integrates with

Similar Skills

Semantically similar based on skill content — not just same category

AI & Automation Listed

architecting-data

Strategic guidance for designing modern data platforms, covering storage paradigms (data lake, warehouse, lakehouse), modeling approaches (dimensional, normalized, data vault, wide tables), data mesh principles, and medallion architecture patterns. Use when architecting data platforms, choosing between centralized vs decentralized patterns, selecting table formats (Iceberg, Delta Lake), or designing data governance frameworks.

368 Updated 5 months ago
ancoleman
AI & Automation Featured

snowflake-architecture-variants

Choose and implement Snowflake architecture blueprints: data lakehouse, data mesh, data sharing, and Snowpark-native patterns for different scales. Use when designing Snowflake data platforms, choosing between architectures, or implementing data sharing and Snowpark patterns. Trigger with phrases like "snowflake architecture", "snowflake lakehouse", "snowflake data mesh", "snowflake data sharing", "snowflake Snowpark".

2,266 Updated today
jeremylongshore
Data & Documents Listed

data-lake-architect

Provides architectural guidance for data lake design including partitioning strategies, storage layout, schema design, and lakehouse patterns. Activates when users discuss data lake architecture, partitioning, or large-scale data organization.

335 Updated today
aiskillstore
Data & Documents Listed

pipeline-architect

Designs and implements data pipelines: ETL/ELT, streaming, batch processing, schema migrations, and data warehouse architecture. Covers Kafka, Airflow, dbt, Spark, ClickHouse, BigQuery, Snowflake, Redis Streams, and more. Use this skill when the user asks about data pipelines, ETL jobs, data transformation, streaming setup, data warehouse design, CDC, schema migrations, data quality checks, or anything involving moving data from source to target. Also triggers on "build a pipeline," "migrate data from X to Y," "set up streaming," "design my data warehouse," or "data quality is bad, help me fix it."

1 Updated 2 days ago
mturac
Data & Documents Listed

data-warehousing

Use this skill when designing data warehouses, building star or snowflake schemas, implementing slowly changing dimensions (SCDs), writing analytical SQL for Snowflake or BigQuery, creating fact and dimension tables, or planning ETL/ELT pipelines for analytics. Triggers on dimensional modeling, surrogate keys, conformed dimensions, warehouse architecture, data vault, partitioning strategies, materialized views, and any task requiring OLAP schema design or warehouse query optimization.

3 Updated today
Samuelca6399