azure-diagnostics

Solid

Debug Azure production issues on Azure using AppLens, Azure Monitor, resource health, and safe triage. WHEN: debug production issues, troubleshoot container apps, troubleshoot functions, troubleshoot AKS, kubectl cannot connect, kube-system/CoreDNS failures, pod pending, crashloop, node not ready, upgrade failures, analyze logs, KQL, insights, image pull failures, cold start issues, health probe failures, resource health, root cause of errors.

DevOps & Infrastructure 607 stars 93 forks Updated 2 months ago MIT

Install

View on GitHub

Quality Score: 95/100

Stars 20%
93
Recency 20%
75
Frontmatter 20%
70
Documentation 15%
100
Issue Health 10%
50
License 10%
100
Description 5%
100

Skill Content

# Azure Diagnostics > **AUTHORITATIVE GUIDANCE — MANDATORY COMPLIANCE** > > This document is the **official source** for debugging and troubleshooting Azure production issues. Follow these instructions to diagnose and resolve common Azure service problems systematically. ## Triggers Activate this skill when user wants to: - Debug or troubleshoot production issues - Diagnose errors in Azure services - Analyze application logs or metrics - Fix image pull, cold start, or health probe issues - Investigate why Azure resources are failing - Find root cause of application errors - Troubleshoot Azure Function Apps (invocation failures, timeouts, binding errors) - Find the App Insights or Log Analytics workspace linked to a Function App - Troubleshoot AKS clusters, nodes, pods, ingress, or Kubernetes networking issues ## Rules 1. Start with systematic diagnosis flow 2. Use AppLens (MCP) for AI-powered diagnostics when available 3. Check resource health before deep-diving into logs 4. Select appropriate troubleshooting guide based on service type 5. Document findings and attempted remediation steps 6. Route AKS incidents to the dedicated AKS troubleshooting document --- ## Quick Diagnosis Flow 1. **Identify symptoms** - What's failing? 2. **Check resource health** - Is Azure healthy? 3. **Review logs** - What do logs show? 4. **Analyze metrics** - Performance patterns? 5. **Investigate recent changes** - What changed? --- ## Troubleshooting Guides by Service | Service | Comm...

Details

Author
microsoft
Repository
microsoft/azure-skills
Created
3 months ago
Last Updated
2 months ago
Language
Bicep
License
MIT

Integrates with

Similar Skills

Semantically similar based on skill content — not just same category

DevOps & Infrastructure Solid

azure-diagnostics

Debug Azure production issues on Azure using AppLens, Azure Monitor, resource health, and safe triage. WHEN: debug production issues, troubleshoot container apps, troubleshoot functions, troubleshoot AKS, kubectl cannot connect, kube-system/CoreDNS failures, pod pending, crashloop, node not ready, upgrade failures, analyze logs, KQL, insights, image pull failures, cold start issues, health probe failures, resource health, root cause of errors.

1,998 Updated 2 months ago
microsoft
DevOps & Infrastructure Listed

azure-diagnostics

Debug and troubleshoot production issues on Azure. Covers Container Apps diagnostics, log analysis with KQL, health checks, and common issue resolution for image pulls, cold starts, and health probes. USE FOR: debug production issues, troubleshoot container apps, analyze logs with KQL, fix image pull failures, resolve cold start issues, investigate health probe failures, check resource health, view application logs, find root cause of errors DO NOT USE FOR: deploying applications (use azure-deploy), creating new resources (use azure-prepare), setting up monitoring (use azure-observability), cost optimization (use azure-cost-optimization)

353 Updated today
aiskillstore
DevOps & Infrastructure Listed

diagnosing-azure-deployment-failures

Matches Azure deploy / CI / runtime failures against 37+ documented gotchas with verified fixes (BCP258, AADSTS70021, sqlcmd-not-found, FC1-CLI-silent-fallback, ACS dataLocation quirks, SSE timeouts, and more). Delegates to Microsoft's azure-diagnostics for live log/metric queries. Use when a deploy fails, a deployed app misbehaves, or a CI step errors.

1 Updated 2 weeks ago
alexpizarro