
fabric-delta-spark-perf

Troubleshoot and optimize Delta Lake and Apache Spark performance in Microsoft Fabric. Use when diagnosing slow Spark jobs, small file problems, data skew, shuffle bottlenecks, out-of-memory errors, V-Order tuning, OPTIMIZE/VACUUM operations, partition strategy, resource profile selection (writeHeavy, readHeavyForSpark, readHeavyForPBI), autotune configuration, Native Execution Engine, broadcast joins, AQE (Adaptive Query Execution), or when Spark notebooks or Spark Job Definitions run slower than expected in Fabric Lakehouse workloads.
PatrickGallucci/fabric-skills · ★ 13 · Data & Documents · score 78
Install: claude install-skill PatrickGallucci/fabric-skills
# Microsoft Fabric Delta Lake Spark Performance Remediation

Systematic workflows for diagnosing and resolving Apache Spark and Delta Lake performance issues in Microsoft Fabric Lakehouse environments.

## When to Use This Skill

Activate when the user mentions any of the following:

- Spark job is slow, taking too long, or timing out
- Small file problem, too many small files, file fragmentation
- Data skew, straggler tasks, unbalanced partitions
- Out of memory (OOM) errors on driver or executor
- Shuffle spill, excessive shuffle read/write
- OPTIMIZE, VACUUM, bin-compaction, or table maintenance
- V-Order, Z-Order, or Parquet optimization
- Resource profiles: writeHeavy, readHeavyForSpark, readHeavyForPBI
- Autotune, Adaptive Query Execution (AQE), broadcast join thresholds
- Native Execution Engine configuration
- Streaming performance, microbatch tuning, checkpoint issues
- Spark pool sizing, autoscale, dynamic executor allocation
- Direct Lake performance tied to Delta table structure
- Capacity throttling, TooManyRequestsForCapacity errors

## Prerequisites

- Microsoft Fabric workspace with Data Engineering or Data Science experience
- Apache Spark notebooks or Spark Job Definitions
- Lakehouse with Delta tables
- Appropriate Fabric capacity SKU (F2 through F2048)

## Quick Diagnostic Workflow

When a user reports slow Spark performance, follow this triage sequence:

### Step 1: Identify the Symptom Category

| Symptom | Likely Root Cause | Jump To |
|---------|--------