wap-ingestion

Ingest data from S3 into bauplan using the Write-Audit-Publish pattern for safe data loading. Use when loading new data from S3, performing safe data ingestion, or when the user mentions WAP, data ingestion, importing parquet/csv/jsonl files, or needs to safely load data with quality checks.
aiskillstore/marketplace · ★ 329 · Data & Documents · score 82
Install: claude install-skill aiskillstore/marketplace
# Write-Audit-Publish (WAP) Pattern

Implement WAP by writing a Python script using the `bauplan` SDK. Do NOT use CLI commands.

**The three steps**: Write (ingest to temp branch) → Audit (quality checks) → Publish (merge to main)

**Branch safety**: All operations happen on a temporary branch, NEVER on `main`. By default, branches are kept open for inspection after success or failure.

**Atomic multi-table operations**: `merge_branch` is atomic. You can create or modify multiple tables on a branch, and when you merge, either all changes apply to main or none do. This enables safe multi-table ingestion workflows.

## Required User Input

Before writing the WAP script, you MUST ask the user for the following parameters:

1. **S3 path** (required): The S3 URI pattern for the source data (e.g., `s3://bucket/path/*.parquet`)
2. **Table name** (required): The name for the target table
3. **On success behavior** (optional):
   - `inspect` (default): Keep the branch open for user inspection before merging
   - `merge`: Automatically merge to main and delete the branch
4. **On failure behavior** (optional):
   - `keep` (default): Leave the branch open for inspection/debugging
   - `delete`: Delete the failed branch

## WAP Script Template

See [wap_template.py](wap_template.py) for the complete template. Minimal usage:

```python
from wap_template import wap_ingest

branch, success = wap_ingest(
    table_name="orders",
    s3_path="s3://my-bucket/data/*.parquet",
    namespace="baup
```
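
The control flow described above (write to a temporary branch, audit, then publish or keep the branch depending on the success and failure settings) can be sketched without the real SDK. The example below is only an illustration under stated assumptions: `FakeCatalog` and `wap_sketch` are hypothetical names invented here, the catalog is an in-memory stand-in rather than the `bauplan` client, and its `merge_branch` method only mimics the all-or-nothing behavior the text attributes to the real call. It is not the contents of `wap_template.py`.

```python
from dataclasses import dataclass, field


@dataclass
class FakeCatalog:
    """Hypothetical in-memory stand-in for a branching catalog (NOT the bauplan client)."""

    branches: dict = field(default_factory=lambda: {"main": {}})

    def create_branch(self, name, from_ref="main"):
        # Copy the source branch's table map so later writes stay isolated.
        self.branches[name] = dict(self.branches[from_ref])

    def write_table(self, branch, table, rows):
        # Branch safety: never write to main directly.
        if branch == "main":
            raise ValueError("refusing to write directly to main")
        self.branches[branch][table] = list(rows)

    def merge_branch(self, source, into="main"):
        # Swap in the whole table map in one step: all tables land or none do.
        self.branches[into] = dict(self.branches[source])

    def delete_branch(self, name):
        del self.branches[name]


def wap_sketch(catalog, table, rows, audit, on_success="inspect", on_failure="keep"):
    """Run Write -> Audit -> Publish and return (branch, success)."""
    branch = f"wap_{table}"
    catalog.create_branch(branch, from_ref="main")   # Write: isolate on a temp branch
    catalog.write_table(branch, table, rows)
    ok = audit(catalog.branches[branch][table])      # Audit: caller-supplied check
    if ok and on_success == "merge":                 # Publish: merge, then clean up
        catalog.merge_branch(branch, into="main")
        catalog.delete_branch(branch)
    elif not ok and on_failure == "delete":
        catalog.delete_branch(branch)
    # Otherwise the branch is left open for inspection (the defaults).
    return branch, ok
```

With the defaults (`inspect` and `keep`), the function never touches `main`; only an explicit `on_success="merge"` publishes the data, which mirrors the default behavior listed under Required User Input.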