Install: claude install-skill Methasit-Pun/data_engineer_claude_skills
# Data Contracts
## Why Contracts Exist
Without contracts, schema changes are discovered by broken dashboards or failed pipeline runs — often hours or days after the change shipped. A data contract is a formal agreement between the team that produces a dataset (producer) and the teams that consume it (consumers). It specifies exactly what the dataset promises, and gives consumers a way to know when that promise is broken.
Contracts aren't bureaucracy — they're the interface boundary for data, the same way an API contract is the interface boundary for a service.
---
## Contract Structure
A minimal contract covers:
1. **Schema** — fields, types, nullability, allowed values
2. **Grain** — what one row represents (e.g., "one row per customer per day")
3. **Freshness** — how recent the data is guaranteed to be
4. **SLA** — when the data is available by (e.g., "available by 6am UTC daily")
5. **Owner** — who to contact when something breaks
6. **Version** — the contract version, so consumers can pin to a stable version
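
The six elements above can also be modeled in code, which makes a contract something a pipeline can check rather than only read. The following is a minimal sketch, not a reference implementation: the class names `FieldSpec` and `DataContract`, the `violations` method, and the `plan_tier` field with its allowed values are all hypothetical and chosen for illustration. It checks only nullability and allowed values for a single row; type, grain, and freshness checks are left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FieldSpec:
    """One schema entry: name, declared type, nullability, allowed values."""
    name: str
    type: str
    nullable: bool = True
    allowed_values: Optional[tuple] = None


@dataclass(frozen=True)
class DataContract:
    """Hypothetical in-memory form of the six contract elements."""
    name: str
    version: str                  # contract version consumers can pin to
    owner: str                    # who to contact when something breaks
    grain: str                    # what one row represents
    freshness_max_lag_hours: int  # maximum guaranteed data lag
    available_by: str             # SLA deadline, e.g. "06:00 UTC"
    schema: tuple                 # tuple of FieldSpec

    def violations(self, row: dict) -> list:
        """Return human-readable schema violations for a single row."""
        problems = []
        for spec in self.schema:
            value = row.get(spec.name)
            if value is None:
                if not spec.nullable:
                    problems.append(f"{spec.name}: null not allowed")
                continue
            if spec.allowed_values is not None and value not in spec.allowed_values:
                problems.append(f"{spec.name}: {value!r} not in allowed values")
        return problems


contract = DataContract(
    name="churn_features",
    version="2.1.0",
    owner="data-engineering@company.com",
    grain="One row per customer_id per feature_date",
    freshness_max_lag_hours=24,
    available_by="06:00 UTC",
    schema=(
        FieldSpec("customer_id", "STRING", nullable=False),
        # plan_tier is an assumed field, used only to show allowed values
        FieldSpec("plan_tier", "STRING", allowed_values=("free", "pro")),
    ),
)

good_row = {"customer_id": "c1", "plan_tier": "pro"}
bad_row = {"customer_id": None, "plan_tier": "gold"}
print(contract.violations(good_row))
print(contract.violations(bad_row))
```

Returning a list of violations instead of raising on the first one lets a producer report every broken promise in one pass, which is usually what a consumer wants to see in an alert.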
### Example contract in YAML
```yaml
# contracts/churn_features_v2.yaml
contract:
  name: churn_features
  version: "2.1.0"
  owner: data-engineering@company.com
  consumers:
    - ml-team@company.com
    - analytics@company.com
  grain: "One row per customer_id per feature_date"
  sla:
    available_by: "06:00 UTC"
    freshness_max_lag_hours: 24
  schema:
    - name: customer_id
      type: STRING
      nullable: false
      description: "FK to dim_customers"
    - n