Install: claude install-skill AltimateAI/data-engineering-skills
# dbt Incremental Model Development
**Choose the right strategy. Design the unique_key carefully. Handle edge cases.**
## When to Use Incremental
| Scenario | Recommendation |
|----------|----------------|
| Source data < 10M rows | Use `table` (simpler, full refresh is fast) |
| Source data > 10M rows | Consider `incremental` |
| Source data updated in place | Use `incremental` with `merge` strategy |
| Append-only source (logs, events) | Use `incremental` with `append` strategy |
| Partitioned warehouse data | Use `incremental` with `insert_overwrite` strategy, if the adapter supports it |
**Default to `table` unless you have a clear performance reason for incremental.**
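
To make the `merge` row of the table concrete, here is a minimal sketch of an incremental model. The model name, the `shop.orders` source, and the `order_id` and `updated_at` columns are assumptions for illustration. The `merge` strategy is only available on adapters that support it.

```sql
-- models/fct_orders.sql (hypothetical model name)
{{
  config(
    materialized='incremental',
    incremental_strategy='merge',
    unique_key='order_id'
  )
}}

select
    order_id,
    customer_id,
    status,
    updated_at
from {{ source('shop', 'orders') }}  -- hypothetical source

{% if is_incremental() %}
-- On incremental runs, only pick up rows changed since the last load
where updated_at > (select max(updated_at) from {{ this }})
{% endif %}
```

On the first run, or with `--full-refresh`, `is_incremental()` is false and the model builds from the whole source. On later runs only rows newer than the target's latest `updated_at` are selected and merged on `order_id`.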
## Critical Rules
1. **ALWAYS test with `--full-refresh` first** before relying on incremental logic
2. **ALWAYS verify unique_key is truly unique** in both source and target
3. **If merge fails 3+ times**, check unique_key for duplicates
4. **Run full refresh periodically** to prevent data drift
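
One way to check rule 2 is a duplicate-key query like the sketch below. The `shop.orders` source and the `order_id` key are hypothetical; substitute your own source and `unique_key`.

```sql
-- Returns one row per duplicated key; an empty result means the key is unique.
select
    order_id,
    count(*) as row_count
from {{ source('shop', 'orders') }}  -- hypothetical source
group by order_id
having count(*) > 1
```

Saved under `tests/`, a query of this shape works as a singular dbt test, which fails whenever it returns rows. To check the target as well, point the same query at `{{ ref('your_model') }}`.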
## Workflow
### 1. Confirm Incremental is Needed
```bash
# Check source table size
dbt show --inline "select count(*) from {{ source('schema', 'table') }}"
```
If the count is under 10 million rows, consider using `table` instead. Incremental logic adds complexity that a fast full refresh avoids.
### 2. Understand the Source Data Pattern
Before choosing a strategy, answer:
- **Is data append-only?** (new rows added, never updated)
- **Are existing rows updated?** (need merge/upsert)
- **Is there a reliable timestamp?** (for filtering new data)
- **What's the unique identifier?** (for merge matching)