databricks-common-errors

Diagnose and fix Databricks common errors and exceptions. Use when encountering Databricks errors, debugging failed jobs, or troubleshooting cluster and notebook issues. Trigger with phrases like "databricks error", "fix databricks", "databricks not working", "debug databricks", "spark error".

AI & Automation 2,266 stars 315 forks Updated today MIT

Quality Score: 99/100

Stars (20%): 100
Recency (20%): 100
Frontmatter (20%): 70
Documentation (15%): 100
Issue Health (10%): 50
License (10%): 100
Description (5%): 100

Skill Content

# Databricks Common Errors

## Overview

Quick-reference diagnostic guide for the most frequent Databricks errors. Covers cluster failures, Spark OOM, Delta Lake conflicts, permissions, schema mismatches, rate limits, and job run failures with real SDK/SQL solutions.

## Prerequisites

- Databricks CLI configured
- Access to cluster/job logs
- `databricks-sdk` installed for programmatic debugging

## Instructions

### Step 1: Identify the Error Source

```bash
# Get failed run details
databricks runs get --run-id $RUN_ID --output json | jq '{
  state: .state.result_state,
  message: .state.state_message,
  tasks: [.tasks[] | {key: .task_key, state: .state.result_state, error: .state.state_message}]
}'
```

### Step 2: Match and Fix

---

### CLUSTER_NOT_READY / INVALID_STATE

```
ClusterNotReadyException: Cluster 0123-456789-abcde is not in a RUNNING state
```

**Cause:** Cluster is starting, terminating, or in error state.

```python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.compute import State

w = WorkspaceClient()
cluster = w.clusters.get(cluster_id="0123-456789-abcde")

if cluster.state in (State.PENDING, State.RESTARTING):
    # Cluster is already on its way up: block until it is running
    w.clusters.ensure_cluster_is_running("0123-456789-abcde")
elif cluster.state == State.TERMINATED:
    # Cluster is stopped: start it and wait for it to come up
    w.clusters.start_and_wait(cluster_id="0123-456789-abcde")
elif cluster.state == State.ERROR:
    # Cluster failed: surface the termination reason for diagnosis
    reason = cluster.termination_reason
    print(f"Cluster error: {reason.code} - {reason.parameters}")
    # Common: CLOUD_PROV...
```
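The jq filter in Step 1 can also be mirrored in plain Python when the run JSON is already loaded as a dictionary. The following is a minimal sketch, not part of the original skill: `summarize_run`, `failed_tasks`, and `SAMPLE_RUN` are hypothetical names, the sample payload is hand-written for illustration, and the payload shape is assumed only from the fields the jq filter reads.

```python
from typing import Any

# Hand-written illustrative payload (an assumption, not real CLI output).
# It contains only the fields that the Step 1 jq filter reads.
SAMPLE_RUN: dict[str, Any] = {
    "state": {"result_state": "FAILED", "state_message": "Task ingest failed"},
    "tasks": [
        {
            "task_key": "ingest",
            "state": {"result_state": "FAILED", "state_message": "Cluster not ready"},
        },
        {
            "task_key": "report",
            "state": {"result_state": "SUCCESS", "state_message": ""},
        },
    ],
}


def summarize_run(run: dict[str, Any]) -> dict[str, Any]:
    """Reduce a run payload to the fields the Step 1 jq filter selects.

    Missing keys become None (or an empty task list) instead of raising,
    which is a defensive choice made for this sketch.
    """
    state = run.get("state") or {}
    tasks = run.get("tasks") or []
    return {
        "state": state.get("result_state"),
        "message": state.get("state_message"),
        "tasks": [
            {
                "key": task.get("task_key"),
                "state": (task.get("state") or {}).get("result_state"),
                "error": (task.get("state") or {}).get("state_message"),
            }
            for task in tasks
        ],
    }


def failed_tasks(summary: dict[str, Any]) -> list[str]:
    """Return the keys of tasks whose result state is FAILED."""
    return [t["key"] for t in summary["tasks"] if t["state"] == "FAILED"]


if __name__ == "__main__":
    summary = summarize_run(SAMPLE_RUN)
    print(summary["state"], failed_tasks(summary))
```

Keeping the summary as a pure function over a dictionary means it can be exercised against saved run JSON without a workspace connection, and the failing task keys can then be matched against the error sections in Step 2.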

Details

Author: jeremylongshore
Repository: jeremylongshore/claude-code-plugins-plus-skills
Created: 7 months ago
Last Updated: today
Language: Python
License: MIT
