← ClaudeAtlas

cost-optimization-datalisted

Query cost analysis, partition pruning, slot reservation strategies, storage tiering, and cloud data warehouse cost reduction. Use this skill whenever the cloud data bill is unexpectedly high, a specific query is scanning too much data, the team wants to understand what's driving BigQuery/Snowflake/Redshift costs, or when choosing between on-demand vs. reserved capacity. Also trigger when the user mentions bytes scanned, slot utilization, query cost, storage costs, Redshift concurrency, Snowflake credits, or when trying to set up cost alerts and budgets. If someone says "our BigQuery bill jumped" or "this query is expensive", this skill should be active immediately.
Methasit-Pun/data_engineer_claude_skills · ★ 0 · Data & Documents · score 62
Install: claude install-skill Methasit-Pun/data_engineer_claude_skills
# Cost Optimization for Cloud Data Infrastructure ## Find the Money First Before optimizing anything, identify the actual cost drivers. Cloud consoles lie by omission — the default billing view shows totals, not which queries or tables are responsible. ### BigQuery: find expensive queries ```sql -- Top 20 most expensive queries in the last 7 days SELECT user_email, total_bytes_processed / POW(1024, 4) AS tb_processed, total_bytes_processed / POW(1024, 4) * 6.25 AS estimated_cost_usd, -- on-demand rate query, creation_time FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT WHERE creation_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY) AND job_type = 'QUERY' AND state = 'DONE' ORDER BY total_bytes_processed DESC LIMIT 20; ``` ```sql -- Top tables by bytes scanned (which tables are being read most expensively) SELECT referenced_table.table_id AS table_name, COUNT(*) AS query_count, SUM(total_bytes_processed) / POW(1024, 4) AS total_tb_scanned FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT, UNNEST(referenced_tables) AS referenced_table WHERE creation_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY) GROUP BY 1 ORDER BY total_tb_scanned DESC LIMIT 20; ``` ### Snowflake: find credit-burning warehouses ```sql -- Warehouse credit consumption by day SELECT warehouse_name, DATE(start_time) AS usage_date, SUM(credits_used) AS credits_used, SUM(credits_used) * 3.0 AS estimated_cost_usd --