
verify-pipeline

Run a full health check across the MDS pipeline: ingestion (dlt/Airbyte) load status, BigQuery freshness per source, ingest reconciliation (source-vs-destination row counts), dbt model freshness, MCP server health, and raw-vs-staging row count integrity. Invoke when the user wants to confirm the pipeline is healthy or asks 'is everything working?'
pol-cc/agentic-data-engineer · ★ 1 · Data & Documents · score 72
Install: claude install-skill pol-cc/agentic-data-engineer
# verify-pipeline

> **Status**: v0.10.0. References written; read-only health check operational. **Ingest reconciliation is now a first-class layer** (source-vs-destination row counts, dlt `_dlt_loads` freshness, sequence/gap checks), mandatory after every dlt load to catch the silent data gap a mis-set incremental cursor leaves without crashing.

## What this skill does

Runs deterministic checks across every layer of the MDS and produces a one-page report. Read-only: never modifies state. Safe to invoke at any time.

## Preflight

```bash
if [ ! -f .agentic-data-engineer.json ]; then
  echo "[abort] not a managed MDS deployment"
  exit 1
fi
```

## Checks performed

| Layer | Check | Pass criterion |
|---|---|---|
| **Tailscale** | `tailscale status` on the VPS via SSH | VPS reachable, all nodes online |
| **Ingestion** (dlt/Airbyte) | dlt `_dlt_loads` last-load status + age per source (or Airbyte `GET /jobs` when `stack.ingest == "airbyte"`) | Latest load completed within `freshness_thresholds.green_hours` (default 26h) |
| **BigQuery raw** | `__TABLES__` modification time per raw dataset | Updated within `green_hours` |
| **Ingest reconciliation** | Source-vs-destination row count per source; dlt `_dlt_loads` status; sequence/gap check on monotonic keys | Destination matches source within `reconciliation_tolerance` (default 0); no sequence gaps; latest `_dlt_loads.status = 0` |
| **BigQuery integrity** | Row count `raw.<table>` vs `staging.stg_<table>` | Difference with