data-contracts

Define and enforce schema contracts between producer and consumer teams — field types, nullability, allowed values, versioning, breaking vs. non-breaking changes, and change detection patterns. Use this skill whenever two teams or services share a dataset and upstream changes keep breaking the downstream silently, when a team wants to formalize what a dataset "promises" to its consumers, or when setting up schema validation at pipeline boundaries. Also trigger when the user asks about schema evolution, backward/forward compatibility, schema registries, Great Expectations for inter-team contracts, or when a producer is about to make a "small" schema change and you want to assess its downstream impact. If upstream and downstream are owned by different people, this skill should be active.
Methasit-Pun/data_engineer_claude_skills · Data & Documents

Install: claude install-skill Methasit-Pun/data_engineer_claude_skills

# Data Contracts

## Why Contracts Exist

Without contracts, schema changes are discovered by broken dashboards or failed pipeline runs, often hours or days after the change shipped.

A data contract is a formal agreement between the team that produces a dataset (producer) and the teams that consume it (consumers). It specifies exactly what the dataset promises, and gives consumers a way to know when that promise is broken.

Contracts aren't bureaucracy. They're the interface boundary for data, the same way an API contract is the interface boundary for a service.

---

## Contract Structure

A minimal contract covers:

1. **Schema**: fields, types, nullability, allowed values
2. **Grain**: what one row represents (e.g., "one row per customer per day")
3. **Freshness**: how recent the data is guaranteed to be
4. **SLA**: when the data is available by (e.g., "available by 6am UTC daily")
5. **Owner**: who to contact when something breaks
6. **Version**: the contract version, so consumers can pin to a stable version

### Example contract in YAML

```yaml
# contracts/churn_features_v2.yaml
contract:
  name: churn_features
  version: "2.1.0"
  owner: data-engineering@company.com
  consumers:
    - ml-team@company.com
    - analytics@company.com
  grain: "One row per customer_id per feature_date"
  sla:
    available_by: "06:00 UTC"
    freshness_max_lag_hours: 24
  schema:
    - name: customer_id
      type: STRING
      nullable: false
      description: "FK to dim_customers"
    - n
```
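To make the schema part of a contract concrete, here is a minimal Python sketch of how a consumer might check one row against it at a pipeline boundary. It is an illustration under stated assumptions, not the skill's own implementation: the contract is held as a plain dict rather than loaded from YAML, `validate_row` and `PYTHON_TYPES` are hypothetical names, and the `DATE` type and non-nullability of `feature_date` are assumed (the example above only names that column in the grain).

```python
from datetime import date

# Schema portion of a contract as a plain dict. `customer_id` mirrors the
# YAML example; `feature_date` is named in the grain, but its DATE type and
# non-nullability are assumptions made for this sketch.
CONTRACT = {
    "name": "churn_features",
    "version": "2.1.0",
    "schema": [
        {"name": "customer_id", "type": "STRING", "nullable": False},
        {"name": "feature_date", "type": "DATE", "nullable": False},
    ],
}

# Hypothetical mapping from contract type names to Python types.
PYTHON_TYPES = {"STRING": str, "DATE": date}


def validate_row(row: dict, contract: dict) -> list:
    """Return a list of violation messages; an empty list means the row conforms."""
    violations = []
    for field in contract["schema"]:
        name = field["name"]
        if name not in row:
            violations.append(f"{name}: missing field")
            continue
        value = row[name]
        if value is None:
            # Nulls are only a violation when the contract forbids them.
            if not field["nullable"]:
                violations.append(f"{name}: null in non-nullable field")
            continue
        expected = PYTHON_TYPES[field["type"]]
        if not isinstance(value, expected):
            violations.append(
                f"{name}: expected {field['type']}, got {type(value).__name__}"
            )
    return violations


good = {"customer_id": "C-001", "feature_date": date(2024, 1, 15)}
bad = {"customer_id": None, "feature_date": "2024-01-15"}

print(validate_row(good, CONTRACT))
print(validate_row(bad, CONTRACT))
```

Returning a list of violations instead of raising on the first one lets a pipeline report every broken promise in a single run, which tends to be more useful when the producer and consumer are different teams.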