Grafana

Monitoring
grafana.com
158 skills · 57 Featured · 784,888 total stars

Skills using Grafana (158)

AI & Automation Featured

grafana-dashboards

Create and manage production-ready Grafana dashboards for comprehensive system observability.

27,984 Updated today
davila7
DevOps & Infrastructure Featured

observability-engineer

Build production-ready monitoring, logging, and tracing systems. Implements comprehensive observability strategies, SLI/SLO management, and incident response workflows.

27,984 Updated today
davila7
Web & Frontend Featured

prometheus-configuration

Complete guide to Prometheus setup, metric collection, scrape configuration, and recording rules.

27,984 Updated today
davila7
AI & Automation Featured

adobe-incident-runbook

Execute Adobe incident response procedures with triage, mitigation, and postmortem for Firefly Services, PDF Services, and I/O Events outages. Use when responding to Adobe-related incidents, investigating API failures, or running post-incident reviews. Trigger with phrases like "adobe incident", "adobe outage", "adobe down", "adobe on-call", "adobe emergency".

2,359 Updated today
jeremylongshore
AI & Automation Featured

adobe-observability

Set up comprehensive observability for Adobe API integrations with Prometheus metrics, OpenTelemetry traces, structured logging, and alert rules covering Firefly, PDF Services, and Photoshop APIs. Trigger with phrases like "adobe monitoring", "adobe metrics", "adobe observability", "monitor adobe", "adobe alerts", "adobe tracing".

2,359 Updated today
jeremylongshore
AI & Automation Featured

algolia-observability

Set up observability for Algolia: Prometheus metrics for search latency/errors, OpenTelemetry tracing, structured logging, and Grafana dashboards. Trigger: "algolia monitoring", "algolia metrics", "algolia observability", "monitor algolia", "algolia alerts", "algolia tracing", "algolia dashboard".

2,359 Updated today
jeremylongshore
AI & Automation Featured

castai-reference-architecture

CAST AI reference architecture for multi-cluster Kubernetes cost optimization. Use when designing CAST AI deployment across environments, planning Terraform module structure, or establishing team standards. Trigger with phrases like "cast ai architecture", "cast ai best practices", "cast ai multi-cluster", "cast ai terraform structure".

2,359 Updated today
jeremylongshore
AI & Automation Featured

clay-observability

Monitor Clay enrichment pipeline health, credit consumption, and data quality metrics. Use when setting up dashboards for Clay operations, configuring alerts for credit burn, or tracking enrichment success rates. Trigger with phrases like "clay monitoring", "clay metrics", "clay observability", "monitor clay", "clay alerts", "clay dashboard", "clay credit tracking".

2,359 Updated today
jeremylongshore
AI & Automation Featured

clickhouse-observability

Monitor ClickHouse with Prometheus metrics, Grafana dashboards, system table queries, and alerting for query performance, merge health, and resource usage. Use when setting up ClickHouse monitoring, building Grafana dashboards, or configuring alerts for production ClickHouse deployments. Trigger: "clickhouse monitoring", "clickhouse metrics", "clickhouse Grafana", "clickhouse observability", "monitor clickhouse", "clickhouse Prometheus".

2,359 Updated today
jeremylongshore
AI & Automation Featured

clickup-observability

Monitor ClickUp API integrations with metrics, tracing, structured logging, and alerting using Prometheus, OpenTelemetry, and Grafana. Trigger: "clickup monitoring", "clickup metrics", "clickup observability", "monitor clickup", "clickup alerts", "clickup tracing", "clickup dashboard".

2,359 Updated today
jeremylongshore
AI & Automation Featured

cohere-incident-runbook

Execute Cohere incident response procedures with triage, mitigation, and postmortem. Use when responding to Cohere API outages, investigating errors, or running post-incident reviews for Cohere integration failures. Trigger with phrases like "cohere incident", "cohere outage", "cohere down", "cohere on-call", "cohere emergency", "cohere broken".

2,359 Updated today
jeremylongshore
AI & Automation Featured

coreweave-observability

Set up GPU monitoring and observability for CoreWeave workloads. Use when implementing GPU metrics dashboards, configuring alerts, or tracking inference latency and throughput. Trigger with phrases like "coreweave monitoring", "coreweave observability", "coreweave gpu metrics", "coreweave grafana".

2,359 Updated today
jeremylongshore
AI & Automation Featured

coreweave-reference-architecture

Reference architecture for CoreWeave GPU cloud deployments. Use when designing ML infrastructure, planning multi-model serving, or establishing CoreWeave deployment standards. Trigger with phrases like "coreweave architecture", "coreweave design", "coreweave infrastructure", "coreweave best practices".

2,359 Updated today
jeremylongshore
AI & Automation Featured

customerio-observability

Set up Customer.io monitoring and observability. Use when implementing metrics, structured logging, alerting, or Grafana dashboards for Customer.io integrations. Trigger: "customer.io monitoring", "customer.io metrics", "customer.io dashboard", "customer.io alerts", "customer.io observability".

2,359 Updated today
jeremylongshore
AI & Automation Featured

deepgram-observability

Set up comprehensive observability for Deepgram integrations. Use when implementing monitoring, setting up dashboards, or configuring alerting for Deepgram integration health. Trigger: "deepgram monitoring", "deepgram metrics", "deepgram observability", "monitor deepgram", "deepgram alerts", "deepgram dashboard".

2,359 Updated today
jeremylongshore
AI & Automation Featured

documenso-observability

Implement monitoring, logging, and tracing for Documenso integrations. Use when setting up observability, implementing metrics collection, or debugging production issues. Trigger with phrases like "documenso monitoring", "documenso metrics", "documenso logging", "documenso tracing", "documenso observability".

2,359 Updated today
jeremylongshore
AI & Automation Featured

fireflies-observability

Monitor Fireflies.ai integration health with metrics, alerts, and dashboards. Use when implementing monitoring, setting up alerting, or tracking transcript processing reliability. Trigger with phrases like "fireflies monitoring", "fireflies metrics", "fireflies observability", "monitor fireflies", "fireflies alerts".

2,359 Updated today
jeremylongshore
AI & Automation Featured

flexport-observability

Set up observability for Flexport logistics integrations with metrics, structured logging, distributed tracing, and alerting dashboards. Trigger: "flexport monitoring", "flexport observability", "flexport metrics", "flexport alerts".

2,359 Updated today
jeremylongshore
AI & Automation Featured

gamma-observability

Implement comprehensive observability for Gamma integrations. Use when setting up monitoring, logging, tracing, or building dashboards for Gamma API usage. Trigger with phrases like "gamma monitoring", "gamma logging", "gamma metrics", "gamma observability", "gamma dashboard".

2,359 Updated today
jeremylongshore
AI & Automation Featured

intercom-observability

Set up observability for Intercom integrations with metrics, traces, and alerts. Use when implementing monitoring for Intercom API operations, setting up dashboards, or configuring alerting for integration health. Trigger with phrases like "intercom monitoring", "intercom metrics", "intercom observability", "monitor intercom", "intercom alerts", "intercom tracing".

2,359 Updated today
jeremylongshore
AI & Automation Featured

klaviyo-observability

Set up observability for Klaviyo integrations with metrics, traces, and alerts. Use when implementing monitoring for Klaviyo API operations, setting up dashboards, or configuring alerting for Klaviyo integration health. Trigger with phrases like "klaviyo monitoring", "klaviyo metrics", "klaviyo observability", "monitor klaviyo", "klaviyo alerts", "klaviyo tracing".

2,359 Updated today
jeremylongshore
AI & Automation Featured

langchain-observability

Set up comprehensive observability for LangChain applications with LangSmith tracing, OpenTelemetry, Prometheus metrics, and alerts. Trigger: "langchain monitoring", "langchain metrics", "langchain observability", "langchain tracing", "LangSmith", "langchain alerts".

2,359 Updated today
jeremylongshore
AI & Automation Featured

langfuse-observability

Set up comprehensive observability for Langfuse with metrics, dashboards, and alerts. Use when implementing monitoring for LLM operations, setting up dashboards, or configuring alerting for Langfuse integration health. Trigger with phrases like "langfuse monitoring", "langfuse metrics", "langfuse observability", "monitor langfuse", "langfuse alerts", "langfuse dashboard".

2,359 Updated today
jeremylongshore
AI & Automation Featured

lindy-observability

Monitor Lindy AI agent health, task success rates, and credit consumption. Use when setting up monitoring, building dashboards, configuring alerts, or tracking agent performance over time. Trigger with phrases like "lindy monitoring", "lindy observability", "lindy metrics", "lindy logging", "lindy dashboard".

2,359 Updated today
jeremylongshore
AI & Automation Featured

linear-observability

Implement monitoring, logging, and alerting for Linear integrations. Use when setting up metrics collection, dashboards, or configuring alerts for Linear API usage. Trigger: "linear monitoring", "linear observability", "linear metrics", "linear logging", "monitor linear", "linear Prometheus", "linear Grafana".

2,359 Updated today
jeremylongshore
AI & Automation Featured

linktree-performance-tuning

Optimize Linktree API integration performance with caching, batching, and rate limit strategies. Use when Linktree API calls are slow, hitting rate limits, or profile pages serve stale link data. Trigger with "linktree performance tuning".

2,359 Updated today
jeremylongshore
AI & Automation Featured

linktree-prod-checklist

Production checklist for Linktree. Trigger: "linktree prod checklist".

2,359 Updated today
jeremylongshore
Testing & QA Featured

load-testing-apis

Execute comprehensive load and stress testing to validate API performance and scalability. Use when validating API performance under load. Trigger with phrases like "load test the API", "stress test API", or "benchmark API performance".

2,359 Updated today
jeremylongshore
AI & Automation Featured

logging-api-requests

Monitor and log API requests with correlation IDs, performance metrics, and security audit trails. Use when auditing API requests and responses. Trigger with phrases like "log API requests", "add API logging", or "track API calls".

2,359 Updated today
jeremylongshore
AI & Automation Featured

lucidchart-prod-checklist

Production checklist for Lucidchart. Trigger: "lucidchart prod checklist".

2,359 Updated today
jeremylongshore
AI & Automation Featured

maintainx-observability

Implement comprehensive observability for MaintainX integrations. Use when setting up monitoring, logging, tracing, and alerting for MaintainX API integrations. Trigger with phrases like "maintainx monitoring", "maintainx logging", "maintainx metrics", "maintainx observability", "maintainx alerts".

2,359 Updated today
jeremylongshore
AI & Automation Featured

mindtickle-prod-checklist

Production checklist for MindTickle. Trigger: "mindtickle prod checklist".

2,359 Updated today
jeremylongshore
AI & Automation Featured

miro-observability

Set up observability for Miro REST API v2 integrations with Prometheus metrics, OpenTelemetry traces, structured logging, and Grafana dashboards. Trigger with phrases like "miro monitoring", "miro metrics", "miro observability", "monitor miro", "miro alerts", "miro tracing".

2,359 Updated today
jeremylongshore
AI & Automation Featured

mistral-observability

Set up comprehensive observability for Mistral AI with metrics, traces, and alerts. Use when implementing monitoring for Mistral AI operations, setting up dashboards, or configuring alerting for integration health. Trigger with phrases like "mistral monitoring", "mistral metrics", "mistral observability", "monitor mistral", "mistral alerts".

2,359 Updated today
jeremylongshore
AI & Automation Featured

monitoring-apis

Build real-time API monitoring dashboards with metrics, alerts, and health checks. Use when tracking API health and performance metrics. Trigger with phrases like "monitor the API", "add API metrics", or "setup API monitoring".

2,359 Updated today
jeremylongshore
AI & Automation Featured

navan-observability

Use when setting up monitoring, logging, and alerting for Navan API integrations in production environments. Trigger with "navan observability" or "navan monitoring" or "navan api dashboards".

2,359 Updated today
jeremylongshore
AI & Automation Featured

notion-observability

Set up observability for Notion integrations with metrics, traces, and alerts. Use when implementing monitoring for Notion API calls, setting up dashboards, or configuring alerting for Notion integration health. Trigger with phrases like "notion monitoring", "notion metrics", "notion observability", "monitor notion", "notion alerts", "notion tracing".

2,359 Updated today
jeremylongshore
AI & Automation Featured

palantir-observability

Set up observability for Palantir Foundry integrations with metrics, logging, and alerts. Use when implementing monitoring for Foundry API calls, setting up dashboards, or configuring alerting for Foundry integration health. Trigger with phrases like "palantir monitoring", "foundry metrics", "palantir observability", "monitor foundry", "foundry alerts".

2,359 Updated today
jeremylongshore
AI & Automation Featured

posthog-observability

Monitor PostHog integration health: event ingestion rates, feature flag evaluation latency, billing volume tracking, and Prometheus/Grafana alerting. Trigger: "posthog monitoring", "posthog metrics", "posthog observability", "monitor posthog", "posthog alerts", "posthog dashboard".

2,359 Updated today
jeremylongshore
AI & Automation Featured

running-chaos-tests

Execute chaos engineering experiments to test system resilience. Use when performing specialized testing. Trigger with phrases like "run chaos tests", "test resilience", or "inject failures".

2,359 Updated today
jeremylongshore
AI & Automation Featured

running-performance-tests

Execute load testing, stress testing, and performance benchmarking. Use when performing specialized testing. Trigger with phrases like "run load tests", "test performance", or "benchmark the system".

2,359 Updated today
jeremylongshore
AI & Automation Featured

salesforce-observability

Set up observability for Salesforce integrations with API limit monitoring, error tracking, and alerting. Use when implementing monitoring for Salesforce operations, tracking API consumption, or configuring alerting for Salesforce integration health. Trigger with phrases like "salesforce monitoring", "salesforce metrics", "salesforce observability", "monitor salesforce", "salesforce alerts", "salesforce API usage dashboard".

2,359 Updated today
jeremylongshore
AI & Automation Featured

sentry-observability

Integrate Sentry with your observability stack — logging, metrics, APM, and dashboards. Use when connecting Sentry to winston/pino/structlog, correlating errors with business metrics, deciding between Sentry performance and Datadog/New Relic, building Sentry Discover dashboards, or linking events to external tools via extra context. Trigger: "sentry observability", "sentry logging", "sentry metrics", "sentry grafana", "sentry datadog correlation", "sentry discover dashboard".

2,359 Updated today
jeremylongshore
AI & Automation Featured

snowflake-observability

Set up Snowflake observability using ACCOUNT_USAGE views, alerts, and external monitoring. Use when implementing Snowflake monitoring dashboards, setting up query performance tracking, or configuring alerting for warehouse and pipeline health. Trigger with phrases like "snowflake monitoring", "snowflake metrics", "snowflake observability", "snowflake dashboard", "snowflake alerts".

2,359 Updated today
jeremylongshore
AI & Automation Featured

vastai-observability

Monitor Vast.ai GPU instance health, utilization, and costs. Use when setting up monitoring dashboards, configuring alerts, or tracking GPU utilization and spending. Trigger with phrases like "vastai monitoring", "vastai metrics", "vastai observability", "monitor vastai", "vastai alerts".

2,359 Updated today
jeremylongshore
AI & Automation Featured

webflow-observability

Set up observability for Webflow integrations — Prometheus metrics for API calls, OpenTelemetry tracing, structured logging with pino, Grafana dashboards, and alerting for rate limits, errors, and latency. Trigger with phrases like "webflow monitoring", "webflow metrics", "webflow observability", "monitor webflow", "webflow alerts", "webflow tracing".

2,359 Updated today
jeremylongshore
Data & Documents Featured

data-engineering-data-pipeline

You are a data pipeline architecture expert specializing in scalable, reliable, and cost-effective data pipelines for batch and streaming data processing.

40,440 Updated today
sickn33
AI & Automation Featured

devops-troubleshooter

Expert DevOps troubleshooter specializing in rapid incident response, advanced debugging, and modern observability.

40,440 Updated today
sickn33
AI & Automation Featured

grafana-dashboards

Create and manage production-ready Grafana dashboards for comprehensive system observability.

40,440 Updated today
sickn33
AI & Automation Featured

observability-engineer

Build production-ready monitoring, logging, and tracing systems. Implements comprehensive observability strategies, SLI/SLO management, and incident response workflows.

40,440 Updated today
sickn33
AI & Automation Featured

prometheus-configuration

Complete guide to Prometheus setup, metric collection, scrape configuration, and recording rules.

40,440 Updated today
sickn33
DevOps & Infrastructure Featured

golang-observability

Golang everyday observability — the always-on signals in production. Covers structured logging with slog, Prometheus metrics, OpenTelemetry distributed tracing, continuous profiling with pprof/Pyroscope, server-side RUM event tracking, alerting, and Grafana dashboards. Apply when instrumenting Go services for production monitoring, setting up metrics or alerting, adding OpenTelemetry tracing, correlating logs with traces, migrating legacy loggers (zap/logrus/zerolog) to slog, adding observability to new features, or implementing GDPR/CCPA-compliant tracking with Customer Data Platforms (CDP). Not for temporary deep-dive performance investigation (→ See golang-benchmark and golang-performance skills).

2,093 Updated 2 days ago
samber
AI & Automation Featured

grafana-dashboard-creator

Runs Grafana dashboard creator operations. Auto-activating skill in the DevOps Advanced skill category. Use when working with Grafana dashboard creator functionality. Trigger with phrases like "grafana dashboard creator", "grafana creator", "grafana".

2,359 Updated today
jeremylongshore
AI & Automation Featured

building-incident-response-dashboard

Builds real-time incident response dashboards in Splunk, Elastic, or Grafana to provide SOC analysts and leadership with situational awareness during active incidents, tracking affected systems, containment status, IOC spread, and response timeline. Use when IR teams need unified visibility during incident coordination and post-incident reporting.

15,448 Updated 1 week ago
mukul975
AI & Automation Featured

building-soc-metrics-and-kpi-tracking

Builds SOC performance metrics and KPI tracking dashboards measuring Mean Time to Detect (MTTD), Mean Time to Respond (MTTR), alert quality ratios, analyst productivity, and detection coverage using SIEM data. Use when SOC leadership needs operational visibility, continuous improvement tracking, or executive-level reporting on security operations effectiveness.

15,448 Updated 1 week ago
mukul975
AI & Automation Featured

building-vulnerability-aging-and-sla-tracking

Implement a vulnerability aging dashboard and SLA tracking system to measure remediation performance against severity-based timelines and drive accountability.

15,448 Updated 1 week ago
mukul975
AI & Automation Featured

implementing-api-abuse-detection-with-rate-limiting

Implement API abuse detection using token bucket, sliding window, and adaptive rate limiting algorithms to prevent DDoS, brute force, and credential stuffing attacks.

15,448 Updated 1 week ago
mukul975
AI & Automation Solid

dashboard-generator

Generate monitoring dashboards for Grafana and DataDog with alert integration

1,313 Updated today
a5c-ai
AI & Automation Solid

metrics-schema-generator

Generate metrics schemas for Prometheus, OpenTelemetry, and Grafana dashboards

1,313 Updated today
a5c-ai
AI & Automation Solid

prometheus-grafana

Expert skill for Prometheus metrics and Grafana dashboards. Write and validate PromQL queries, generate Grafana dashboard JSON, create alerting and recording rules, analyze metric cardinality, and debug scrape configurations.

1,313 Updated today
a5c-ai
AI & Automation Solid

service-mesh

Service mesh configuration and operations expertise for Istio, Linkerd, and Consul Connect

1,313 Updated today
a5c-ai
AI & Automation Solid

dashboard-builder

Build monitoring dashboards that answer real operator questions for Grafana, SigNoz, and similar platforms. Use when turning metrics into a working dashboard instead of a vanity board.

213,908 Updated today
affaan-m
DevOps & Infrastructure Solid

azure-kubernetes

Plan, create, and configure production-ready Azure Kubernetes Service (AKS) clusters. Covers Day-0 checklist, SKU selection (Automatic vs Standard), networking options (private API server, Azure CNI Overlay, egress configuration), security, and operations (autoscaling, upgrade strategy, cost analysis). WHEN: create AKS environment, provision AKS environment, enable AKS observability, design AKS networking, choose AKS SKU, secure AKS.

1,998 Updated 2 months ago
microsoft
DevOps & Infrastructure Solid

azure-kubernetes

Plan, create, and configure production-ready Azure Kubernetes Service (AKS) clusters. Covers Day-0 checklist, SKU selection (Automatic vs Standard), networking options (private API server, Azure CNI Overlay, egress configuration), security, and operations (autoscaling, upgrade strategy, cost analysis). WHEN: create AKS environment, provision AKS environment, enable AKS observability, design AKS networking, choose AKS SKU, secure AKS.

607 Updated 2 months ago
microsoft
DevOps & Infrastructure Solid

monitoring-expert

Configures monitoring systems, implements structured logging pipelines, creates Prometheus/Grafana dashboards, defines alerting rules, and instruments distributed tracing. Implements Prometheus/Grafana stacks, conducts load testing, performs application profiling, and plans infrastructure capacity. Use when setting up application monitoring, adding observability to services, debugging production issues with logs/metrics/traces, running load tests with k6 or Artillery, profiling CPU/memory bottlenecks, or forecasting capacity needs.

9,846 Updated 3 weeks ago
Jeffallan
Testing & QA Solid

k6-performance-testing

k6 load testing expertise for performance validation and analysis

1,313 Updated today
a5c-ai
AI & Automation Solid

opentelemetry-integrator

Integrate OpenTelemetry tracing and metrics into SDKs

1,313 Updated today
a5c-ai
AI & Automation Solid

oma-observability

Intent-based observability + traceability router across layers, boundaries, and signals. Routes to vendor-specific skills via category taxonomy; owns transport tuning, meta-observability, incident forensics. Use for observability, traceability, telemetry, APM, RUM, metrics, logs, traces, profiles, SLO, incident forensics, tracing architecture work.

1,081 Updated today
first-fluke
AI & Automation Solid

creating-apm-dashboards

This skill enables Claude to create Application Performance Monitoring (APM) dashboards. It is triggered when the user requests the creation of a new APM dashboard, monitoring dashboard, or a dashboard for application performance. The skill helps define key metrics and visualizations for monitoring application health, performance, and user experience across multiple platforms like Grafana and Datadog. Use this skill when the user needs assistance setting up a new monitoring solution or expanding an existing one. The plugin supports the creation of dashboards focusing on golden signals, request metrics, resource utilization, database metrics, cache metrics, business metrics, and error tracking.

2,359 Updated today
jeremylongshore
DevOps & Infrastructure Solid

deploying-monitoring-stacks

This skill deploys monitoring stacks, including Prometheus, Grafana, and Datadog. It is used when the user needs to set up or configure monitoring infrastructure for applications or systems. The skill generates production-ready configurations, implements best practices, and supports multi-platform deployments. Use this when the user explicitly requests to deploy a monitoring stack, or mentions Prometheus, Grafana, or Datadog in the context of infrastructure setup.

2,359 Updated today
jeremylongshore
Data & Documents Solid

dashboard-brief

Convert a business question into a complete dashboard specification. Use when asked to design a dashboard, create a dashboard spec or brief, plan a BI report, or define what charts and metrics a dashboard should include. Produces a structured spec with metrics, dimensions, chart types, filters, and layout guidance.

960 Updated 3 days ago
mohitagw15856
AI & Automation Solid

grafana-dashboards

Create and manage production Grafana dashboards for real-time visualization of system and application metrics. Use when building monitoring dashboards, visualizing metrics, or creating operational observability interfaces.

36,649 Updated today
wshobson
AI & Automation Solid

prometheus-configuration

Set up Prometheus for comprehensive metric collection, storage, and monitoring of infrastructure and applications. Use when implementing metrics collection, setting up monitoring infrastructure, or configuring alerting systems.

36,649 Updated today
wshobson
DevOps & Infrastructure Solid

azure-iot-operations

Expert knowledge for Azure IoT Operations development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when configuring the MQTT broker, data flows/graphs, OPC UA/ONVIF connectors, WASM transforms, or Prometheus/Grafana, and for other Azure IoT Operations related development tasks. Not for Azure IoT (use azure-iot), Azure IoT Hub (use azure-iot-hub), Azure IoT Edge (use azure-iot-edge), Azure Defender for IoT (use azure-defender-for-iot).

604 Updated 3 days ago
MicrosoftDocs
DevOps & Infrastructure Solid

azure-managed-grafana

Expert knowledge for Azure Managed Grafana development including troubleshooting, decision making, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when integrating Azure Monitor/Prometheus, configuring data sources/alerts, Entra auth, private endpoints, or HA workspaces, and other Azure Managed Grafana related development tasks. Not for Azure Monitor (use azure-monitor), Azure Application Gateway (use azure-application-gateway), Azure Virtual Machines (use azure-virtual-machines), Azure Kubernetes Service (AKS) (use azure-kubernetes-service).

604 Updated 3 days ago
MicrosoftDocs
DevOps & Infrastructure Solid

azure-monitor

Expert knowledge for Azure Monitor development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when working with Log Analytics workspaces, DCRs, AMA/agents, Application Insights, or Prometheus/Container Insights, and other Azure Monitor related development tasks. Not for Azure Managed Grafana (use azure-managed-grafana), Azure Network Watcher (use azure-network-watcher), Azure Service Health (use azure-service-health), Azure Defender For Cloud (use azure-defender-for-cloud).

604 Updated 3 days ago
MicrosoftDocs
AI & Automation Solid

otel-queries

Analyze gh-aw OpenTelemetry traces from JSONL mirrors or OTLP backends.

4,612 Updated today
github
DevOps & Infrastructure Solid

devops

DevOps - Docker, CI/CD, cloud infra, monitoring.

569 Updated today
sipyourdrink-ltd
AI & Automation Solid

observability

Structured logging with Pino/Winston, OpenTelemetry tracing, metrics collection, Grafana dashboards, and alerting rules.

501 Updated yesterday
vibeeval
DevOps & Infrastructure Solid

automating-devops

DevOps knowledge reference covering Git workflows, testing strategies, DevSecOps, release pipeline orchestration (release.yml, multi-arch images, cosign integration), CI/CD pipelines, database management, observability, and performance optimization. Use when working with Git, CI/CD, release pipelines, ghcr image publishing, testing, monitoring, or infrastructure automation.

228 Updated today
telagod
AI & Automation Solid

observability-sre

Observability and SRE expert. Use when setting up monitoring, logging, tracing, defining SLOs, or managing incidents. Covers Prometheus, Grafana, OpenTelemetry, and incident response best practices.

154 Updated 1 week ago
majiayu000
AI & Automation Listed

ops-monitor

Unified APM and monitoring surface. Polls Datadog, New Relic, and OpenTelemetry backends for active alerts, error traces, and entity health. Use --watch for live polling every 60 seconds. Use --setup to configure monitoring credentials.

18 Updated today
Lifecycle-Innovations-Limited
AI & Automation Solid

hunt-csrf

Hunting skill for CSRF vulnerabilities. Built from 15 public bug bounty reports including modern variants — SameSite=Lax sibling-subdomain bypass (Argo CD CVE-2024-22424), GraphQL mutations-via-GET (GitLab $3,370), framework-wide CSRF middleware disabled (Stripe Dashboard $5,000), path-traversal CSRF-token bypass (GitHub Enterprise CVE-2022-23732 $10k), Origin-omission bypass (TikTok $2,500), OAuth-state null-byte (Streamlabs), WebSocket CSRF / CSWSH (Coda), default-SameSite email-change → ATO (YoYo Games $400), social-account-link CSRF (HackerOne), JSON-CSRF via text/plain on email-change (TikTok $500). Use when hunting modern CSRF — heavy emphasis on chain-to-ATO patterns.

1,912 Updated 3 days ago
elementalsouls
AI & Automation Listed

distributed-tracing

Implement distributed tracing with OpenTelemetry, Tempo/Jaeger — instrumentation, sampling, and trace-to-log correlation. Use when the user asks about distributed tracing, OpenTelemetry setup, span instrumentation, trace propagation, or connecting traces to logs and metrics.

15 Updated today
sawrus
AI & Automation Listed

grafana-dashboards

Design and maintain Grafana dashboards — service overview panels, SLO tracking, variable templates, dashboard-as-code with Grafonnet/Jsonnet.

15 Updated today
sawrus
AI & Automation Listed

log-aggregation

Set up Loki or ELK log aggregation for K8s workloads — structured logging, log routing, and log-based alerting.

15 Updated today
sawrus
AI & Automation Listed

service-mesh

Implement service mesh for mTLS, traffic management, and observability — Istio and Linkerd patterns for Kubernetes.

15 Updated today
sawrus
AI & Automation Listed

slo-implementation

Implement SLOs end-to-end in Prometheus — recording rules, burn rate alerts, error budget dashboards, and Sloth/pyrra integration.

15 Updated today
sawrus
AI & Automation Listed

attack-path-architect

Generates strategic attack trees and kill chains from reconnaissance data or domain input. Maps MITRE ATT&CK TTPs, identifies chaining opportunities, trust relationships, and prioritizes attack paths by feasibility and impact. Use when user asks for "attack path", "kill chain", "attack tree", "threat modeling from recon", "attack surface analysis", or "prioritize targets". Requires prior recon data or a domain to analyze. For authorized pentesting and red team engagements only.

32 Updated 2 days ago
KaQus
Data & Documents Listed

grafana-foundation-sdk

Build Grafana dashboards as code with the grafana-foundation-sdk typed builders (TypeScript or Go). Use when creating, modifying, or generating Grafana dashboard JSON programmatically, converting hand-written dashboard JSON to typed code, building monitoring dashboards, or working with Prometheus/Loki queries in dashboards.

29 Updated yesterday
tenequm
AI & Automation Listed

magic-mouth

Magic Mouth is trigger → message. The entire craft is specifying the trigger boundary, the message payload, and the suppression/escalation rules. It is NOT: - One-time messages: "Send a Slack message now saying X" (no trigger, no automation) - Full chatbots: "Build a conversational AI that understands context and handles open-ended questions" (requires NLU, dialogue management) - Monitoring dashboards: "Set up Grafana with real-time graphs and trend analysis" (visualization, not message routing) - Silent traps/wards: "Revert changes silently and log who tried" (defensive code manipulation, not messaging) - Encryption: "Encrypt a message so only the recipient can read it" (cryptography, not event-driven delivery) - Multimedia presentations: "Play a 20-slide deck with animations and narration" (media playback, not conditional messaging)

92 Updated 2 months ago
Hmbown
AI & Automation Listed

dashboard-design

Use when designing monitoring dashboards — visualization selection, layout principles, observability strategies (RED/USE/Golden Signals), and data storytelling.

7 Updated today
event4u-app
AI & Automation Listed

observability-sre

Observability and SRE expert. Use when setting up monitoring, logging, tracing, defining SLOs, or managing incidents. Covers Prometheus, Grafana, OpenTelemetry, and incident response best practices.

72 Updated 2 weeks ago
majiayu000
AI & Automation Listed

beacon

Engineering observability and reliability through SLO/SLI design, distributed tracing, alerting, dashboards, capacity planning, toil automation, and reliability review. Use when designing observability instrumentation, defining SLOs/SLIs, building dashboards/alerts, or reviewing reliability posture.

49 Updated today
simota
DevOps & Infrastructure Listed

cli-forge-infra

Ops integration assistant — reads service docs, finds the simplest config path (CLI/Helm/Operator/Terraform), builds dependency trees, proposes upgrade paths, and tracks decisions in ADRs. Use when debugging infra, integrating services, bootstrapping platforms, upgrading versions, simplifying config, or reviewing infrastructure code. Triggers on ops tool names (OpenBao, Vault, Consul, Traefik, Gitea, ArgoCD, Prometheus, Grafana, cert-manager, Istio, Linkerd, Terraform, OpenTofu, Podman, Docker, K8s, etc.) or keywords like "bootstrap", "integrate", "simplify config", "upgrade infra", "ops stack", "service mesh", "dependency tree".

4 Updated yesterday
Destynova2
AI & Automation Listed

logging-observability-standards

Use when setting up telemetry, debugging distributed systems, or standardizing application output.

5 Updated 3 days ago
KraitDev
DevOps & Infrastructure Listed

azure-kubernetes

Plan, create, and configure production-ready Azure Kubernetes Service (AKS) clusters. Covers Day-0 checklist, SKU selection (Automatic vs Standard), networking options (private API server, Azure CNI Overlay, egress configuration), security, and operations (autoscaling, upgrade strategy, cost analysis). WHEN: create AKS environment, provision AKS environment, enable AKS observability, design AKS networking, choose AKS SKU, secure AKS.

353 Updated today
aiskillstore
DevOps & Infrastructure Listed

devops-troubleshooter

Expert DevOps troubleshooter specializing in rapid incident response, advanced debugging, and modern observability. Masters log analysis, distributed tracing, Kubernetes debugging, performance optimization, and root cause analysis. Handles production outages, system reliability, and preventive monitoring. Use PROACTIVELY for debugging, incident response, or system troubleshooting.

353 Updated today
aiskillstore
DevOps & Infrastructure Listed

monitoring-observability

Set up monitoring, logging, and observability for applications and infrastructure. Use when implementing health checks, metrics collection, log aggregation, or alerting systems. Handles Prometheus, Grafana, ELK Stack, Datadog, and monitoring best practices.

353 Updated today
aiskillstore
DevOps & Infrastructure Listed

observability-engineer

Build production-ready monitoring, logging, and tracing systems. Implements comprehensive observability strategies, SLI/SLO management, and incident response workflows. Use PROACTIVELY for monitoring infrastructure, performance optimization, or production reliability.

353 Updated today
aiskillstore
AI & Automation Listed

nav-init

Initialize Navigator documentation structure in a project. Auto-invokes when user says "Initialize Navigator", "Set up Navigator", "Create Navigator structure", or "Bootstrap Navigator".

190 Updated 3 days ago
alekspetrov
DevOps & Infrastructure Listed

hunt-cloud-misconfig

Hunt cloud / infrastructure misconfigurations. AWS: public S3 buckets (s3:GetObject anonymous), permissive bucket policies (PutObjectAcl public-write), exposed CloudFront origin, public Lambda function URL, public RDS snapshot, IAM credentials in JS bundles, AWS metadata accessible via SSRF. GCP: public GCS buckets, exposed Cloud Run services, leaked service account JSON. Azure: public blob containers, exposed Function App. (Kubernetes/Docker exposure is owned by hunt-k8s; CI/CD pipeline attacks by hunt-cicd; post-credential IAM escalation by cloud-iam-deep.) Detection: targeted dorking, certificate transparency, JS bundle secret extraction, port scan for known service ports. Validate: actual data read / write / RCE. Use when hunting cloud-native storage and compute misconfig (S3/GCS/Blob, IMDS-via-SSRF, serverless, public managed services).

1,912 Updated 3 days ago
elementalsouls
Web & Frontend Listed

building-soc-metrics-and-kpi-tracking

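Build SOC performance metrics and KPI tracking dashboards, using SIEM data to measure mean time to detect (MTTD), mean time to respond (MTTR), alert quality ratio, analyst productivity, and detection coverage. Use when SOC leadership needs operational visibility, continuous-improvement tracking, or executive-level reporting on security operations effectiveness.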

26 Updated 1 month ago
killvxk
DevOps & Infrastructure Listed

devops-automator

Expert DevOps engineer specializing in infrastructure automation, CI/CD pipeline development, and cloud operations

9 Updated today
LiHongwei-cn
AI & Automation Listed

llm-self-loop

Restructure Web-UI / human-triggered tasks into CLI + file-output loops the LLM can iterate alone, with structured logs and addressable scratchpads. Apply trap-or-abandon: if a step cannot be looped, improve the harness rather than babysit. Trigger on iterative grunt-work, "push a button in a web UI to trigger this", monitoring dashboards, or any workflow whose inner loop requires a human in the middle.

27 Updated 2 days ago
OutlineDriven
AI & Automation Listed

observability-audit

Score observability across the four pillars — logs, metrics, traces, and alerts/dashboards — with per-service coverage heatmap. Cross-cutting synthesis. Static, live (Prometheus/Grafana/OTel/Datadog), and runtime (synthetic alert) modes.

3 Updated 3 days ago
anthril
Testing & QA Listed

application-performance-performance-engineer

Expert performance engineer specializing in modern observability, application optimization, and scalable system performance. Masters OpenTelemetry, distributed tracing, load testing, multi-tier caching, Core Web Vitals, and performance monitoring. Handles end-to-end optimization, real user monitoring, and scalability patterns. Use PROACTIVELY for performance optimization, observability, or scalability challenges.

2 Updated yesterday
Adnova-Group
AI & Automation Listed

monitoring-expert

Configures monitoring systems, implements structured logging pipelines, creates Prometheus/Grafana dashboards, defines alerting rules, and instruments distributed tracing. Implements Prometheus/Grafana stacks, conducts load testing, performs application profiling, and plans infrastructure capacity. Use when setting up application monitoring, adding observability to services, debugging production issues with logs/metrics/traces, running load tests with k6 or Artillery, profiling CPU/memory bottlenecks, or forecasting capacity needs.

7 Updated yesterday
ankurCES
AI & Automation Listed

monitor-scaffold

Drop in supervisor config + /healthz endpoint + restart runbook for each service in profile.monitors.targets, per supervisor (systemd / pm2 / k8s / docker-compose)

2 Updated today
bakw00ds
AI & Automation Listed

promql-cli

CLI for querying Prometheus and PromQL-compatible engines (Thanos, Cortex, VictoriaMetrics, Grafana Mimir, Grafana Tempo...): instant queries, range queries, metric discovery (metrics/labels/meta subcommands), output formats (table/csv/json/graph). Apply when executing PromQL queries, troubleshooting performance issues in instrumented software, investigating latency/error rates/saturation, or analyzing time series data.

2 Updated today
hssh8917
AI & Automation Listed

grafana-dashboards

Create and manage production Grafana dashboards for real-time visualization of system and application metrics. Use when building monitoring dashboards, visualizing metrics, or creating operational observability interfaces.

29 Updated 2 weeks ago
HermeticOrmus
DevOps & Infrastructure Listed

prometheus-configuration

Set up Prometheus for comprehensive metric collection, storage, and monitoring of infrastructure and applications. Use when implementing metrics collection, setting up monitoring infrastructure, or configuring alerting systems.

29 Updated 2 weeks ago
HermeticOrmus
DevOps & Infrastructure Listed

openstack-monitoring

OpenStack monitoring operations skill for deploying, configuring, and operating the cloud health monitoring stack. Covers Prometheus metric collection and scrape targets, Grafana dashboard provisioning and visualization, Alertmanager notification channels and routing, alerting rules for service health and resource exhaustion, service endpoint health checks, log aggregation strategies, SLA tracking with availability and response time percentiles, and capacity trend analysis from historical metrics. Use when deploying monitoring via Kolla-Ansible, configuring alert thresholds, troubleshooting blank dashboards, tuning noisy alerts, or analyzing cloud performance trends.

65 Updated today
Tibsfox
Data & Documents Listed

data-engineering-data-pipeline

You are a data pipeline architecture expert specializing in scalable, reliable, and cost-effective data pipelines for batch and streaming data processing.

353 Updated today
aiskillstore
DevOps & Infrastructure Listed

devops-orchestrator

Coordinates infrastructure, CI/CD, and deployment tasks. Use when provisioning infrastructure, setting up pipelines, configuring monitoring, or managing deployments. Applies devops-standard.md with DORA metrics.

353 Updated today
aiskillstore
AI & Automation Listed

grafana-dashboards

Create and manage production Grafana dashboards for real-time visualization of system and application metrics. Use when building monitoring dashboards, visualizing metrics, or creating operational observability interfaces.

353 Updated today
aiskillstore
DevOps & Infrastructure Listed

prometheus-configuration

Set up Prometheus for comprehensive metric collection, storage, and monitoring of infrastructure and applications. Use when implementing metrics collection, setting up monitoring infrastructure, or configuring alerting systems.

353 Updated today
aiskillstore
DevOps & Infrastructure Listed

service-mesh-observability

Implement comprehensive observability for service meshes including distributed tracing, metrics, and visualization. Use when setting up mesh monitoring, debugging latency issues, or implementing SLOs for service communication.

353 Updated today
aiskillstore
DevOps & Infrastructure Listed

ccc-devops

Complete DevOps ecosystem: 21 skills in one. Deployments, CI/CD, containers, AWS, monitoring, security, IaC, networking, and runbooks.

3 Updated today
KevinZai
DevOps & Infrastructure Listed

k8s-components-checker

Survey an RKE2 community cluster against an embedded compatibility registry of 19 stack components and produce a verdict for upgrade-readiness, drift-review, and version-skew questions. Components: RKE2, Rancher, Harvester, Cilium, Tetragon, cert-manager, Kyverno, KEDA, Argo CD, Harbor, Traefik, Rook, Ceph, OpenEBS, GitLab, ECK, Zalando postgres-operator, Grafana Mimir, NVIDIA GPU Operator. Works air-gapped — compatibility data lives in `references/compat/`. Surveys run via `kubectl` + `helm` + `pluto` + the apiserver `apiserver_requested_deprecated_apis` metric from the operator's workstation. Community editions only — Prime/EE-gated content is ignored. NOT for installing components, NOT for executing upgrades, NOT for tracking per-cluster running state (the registry is methodology, not inventory).

3 Updated today
air-gapped
AI & Automation Listed

prometheus-mimir-grafana

Query Prometheus and Grafana Mimir, write and debug PromQL, and build or fix Grafana dashboards — for agents solving problems from metrics. Covers the Prometheus HTTP API (`/api/v1/query`, `query_range`, `series`, `labels`, `metadata`), Mimir multi-tenancy (`X-Scope-OrgID`, federation `a|b|c`, per-tenant 422/429 limits), the PromQL surface (selectors, rate family, classic + native histograms, `histogram_quantile`, vector matching `on()`/`group_left`, recording rules), Grafana dashboard JSON (panels, targets, variables + interpolation specifiers, legacy `/api/dashboards/db` vs Grafana-12 `/apis/dashboard.grafana.app/v1beta1/…`), KPI frameworks (RED, USE, Golden Signals, SLO burn-rate), connection recipes, MCP servers vs curl, and the PromQL trap list.

3 Updated today
air-gapped
AI & Automation Listed

vllm-observability

Observe production vLLM — `/metrics` Prometheus surface (V1 engine), SLO-driven alerting on TTFT/ITL/queue/KV/preemption/aborts/corrupted-logits, shipping Grafana dashboards in `examples/observability/`, OTLP tracing with `--otlp-traces-endpoint` and `--collect-detailed-traces={model,worker,all}`, diagnostic rules to triage from /metrics alone — queue-grows + TPOT-stable means capacity, queue-stable + TPOT-grows means context/model, DCGM `SM_OCCUPANCY` is the real GPU-saturation signal not `GPU_UTIL`. V1 metric names (kv_cache_usage_perc), gpu_→kv_ rename saga, DCGM-exporter pairing, dashboard-lying pitfalls.

3 Updated today
air-gapped
API & Backend Listed

grafana-platform-dashboard

Validate OpenShift Grafana dashboards.

389 Updated today
boshu2
AI & Automation Listed

backend-developer

Backend Developer (/be, alias: James, /james) - Senior Backend Developer with 10+ years experience. Covers Java/Spring Boot (default), Kotlin, Python/FastAPI, PHP/Laravel, Quarkus, and Kafka/messaging - detects the project's stack and loads the matching reference. Use when implementing server features, REST APIs, business logic, persistence, messaging, or unit/integration tests in any of these stacks.

10 Updated today
olehsvyrydov
AI & Automation Listed

sre-engineer

SRE / Observability Engineer (/sre) — reliability engineering: SLOs/SLIs & error budgets, monitoring & alerting (Prometheus, Grafana, OpenTelemetry), incident response & runbooks, on-call, capacity & load, chaos/resilience, and post-incident reviews. Use when defining reliability targets, instrumenting observability, setting up alerting, writing runbooks, doing incident response, or reviewing a change for production readiness. Invoke alongside /arch for reliability NFRs and devops-engineer for the underlying infra/CI-CD. NOT for provisioning infra or pipelines (that's devops-engineer) — /sre owns reliability, not the cluster.

10 Updated today
olehsvyrydov
DevOps & Infrastructure Listed

docker

Manage the workflow engine's Docker Compose stack. Use when starting, stopping, rebuilding containers, or resetting the database.

163 Updated today
Altinn
DevOps & Infrastructure Listed

golang-observability

Golang everyday observability — the always-on signals in production. Covers structured logging with slog, Prometheus metrics, OpenTelemetry distributed tracing, continuous profiling with pprof/Pyroscope, server-side RUM event tracking, alerting, and Grafana dashboards. Apply when instrumenting Go services for production monitoring, setting up metrics or alerting, adding OpenTelemetry tracing, correlating logs with traces, migrating legacy loggers (zap/logrus/zerolog) to slog, adding observability to new features, or implementing GDPR/CCPA-compliant tracking with Customer Data Platforms (CDP). Not for temporary deep-dive performance investigation (→ See golang-benchmark and golang-performance skills).

0 Updated today
guynhsichngeodiec
AI & Automation Listed

kookr-oss-contribution-gate

Rate limiting and blocked-repo enforcement for OSS contributions — hook behavior, oss-gate CLI, ledger format, configuration

2 Updated today
kookr-ai
Data & Documents Listed

pr-contribution-excellence

Patterns for excellent open-source PR contributions, distilled from analyzing real PRs across repositories

2 Updated today
kookr-ai
DevOps & Infrastructure Listed

devops

DevOps practices, CI/CD, and infrastructure management

0 Updated today
murtazatouqeer
AI & Automation Listed

monitoring-expert

Use when setting up monitoring systems, logging, metrics, tracing, or alerting. Invoke for dashboards, Prometheus/Grafana, load testing, profiling, capacity planning.

2 Updated today
zacklecon
Web & Frontend Listed

dashboard-designer

Use this skill when designing a data dashboard—choosing KPIs, structuring layout, applying visual hierarchy, or deciding which BI tool to use. Trigger phrases: 'design a dashboard', 'build a KPI dashboard', 'what should my dashboard show', 'help me layout a dashboard', 'dashboard for monitoring'. Not for building chart code from scratch (use chart-builder), writing SQL queries (use sql-analyst), or designing marketing landing pages.

15 Updated 2 days ago
NickCrew
Code & Development Listed

monitor-setup

Set up error tracking, alerting, and health checks. Use AFTER deploy to ensure observability. Step 7 of 7-step workflow. Maps to H7 (Sharpen the Saw).

3 Updated 2 days ago
pitimon
Code & Development Listed

targeted-debug

Focused debug of a specific production issue — read only files named in the stack trace, error message, or user input; form a hypothesis from observable evidence; do NOT explore the codebase broadly. Use when the user wants to understand a specific bug or error WITHOUT spinning up a full /investigate pipeline.

0 Updated 3 days ago
Tamircohen28
AI & Automation Listed

evan-insight-blog-writer

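Write investment analysis posts for the evan-insight blog. Attention-grabbing, conclusion-first structure, plain language, natural style. Triggers on keywords related to investment analysis, stock analysis, company analysis, blog writing, investment posts, evan-insight, 100x stocks, Next Google, Next NVIDIA.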

3 Updated 5 days ago
U2SY26
AI & Automation Listed

hanun-observability-craft

How Hanun wires hardening overlays, secrets ops, and observability (Prometheus / Grafana / Loki / Sentry / GlitchTip / OpenTelemetry) — the always-reapply-on-recreate rule, the metric / log / trace separation, the alerting discipline, and the no-prod-execution boundary. Invoke when observability wiring or hardening setup is in scope.

3 Updated 2 days ago
Y4NN777
Data & Documents Listed

observability

This skill should be used when the user asks about "observability" or "monitoring", what "metrics, logs, and traces" to collect, "health checks" (liveness/readiness), "alerting" or "on-call", "SLO/SLI" or "error budgets", the "RED" or "USE" method, "dashboards", or names a tool like "Prometheus", "Grafana", or "Datadog". Use it whenever a design has no answer to "how would we know this is broken?" or "what do we alert on?" — i.e. any time failure would be invisible until users complain, even if the user doesn't say "observability".

6 Updated 1 week ago
proyecto26
AI & Automation Listed

reject-job

This skill should be used when the user wants to reject, hide, or filter out a remote job from future email digests. Triggers on phrases like "reject this job", "hide [company]", "add to reject list", "don't show [company] again", "remove [company] from results", or when reviewing remote job emails and marking jobs as not relevant.

5 Updated yesterday
Mahashwetha
AI & Automation Listed

service-mesh-observability

Implement comprehensive observability for service meshes including distributed tracing, metrics, and visualization. Use when setting up mesh monitoring, debugging latency issues, or implementing SLOs for service communication.

2 Updated today
Mohammadibrahim55
Web & Frontend Listed

implementing-observability

Monitoring, logging, and tracing implementation using OpenTelemetry as the unified standard. Use when building production systems requiring visibility into performance, errors, and behavior. Covers OpenTelemetry (metrics, logs, traces), Prometheus, Grafana, Loki, Jaeger, Tempo, structured logging (structlog, tracing, slog, pino), and alerting.

374 Updated 6 months ago
ancoleman
AI & Automation Listed

couchbase-observability

Monitor, alert on, and observe Couchbase clusters in production. Use whenever the user asks about Couchbase metrics, Prometheus, Grafana, alerting, alert thresholds, memory high watermark, disk usage, replication lag, query latency, index build progress, DCP lag, ops/sec, cache miss ratio, Couchbase Exporter, admin_stats_* tools, log aggregation, SIEM shipping, health checks, or 'how do I know if my Couchbase cluster is healthy.' Distinct from couchbase-mcp (calling the tools) and couchbase-security-hardening (audit log shipping). Use proactively for new production deployments needing an observability stack, incident response setup, and SLO definition.

1 Updated 2 weeks ago
celticht32
AI & Automation Listed

observability-and-growth

Full instrumentation from day one. PostHog consolidates product analytics + feature flags + error tracking (one platform, one bill). GA4 via GTM (14-step automation, custom dimensions over events, server-side tagging). Sentry (deep error tracking + performance). Stripe (webhook-first with idempotent processing). Listmonk on Coolify (newsletters via Resend SMTP relay). PLG 7-layer framework. Programmatic SEO (5 page types). Incident auto-remediation via Sentry→Inngest pipeline. AI search (GEO) awareness. Local business conversions (phone_click, direction_click, form_submit, booking_click) with CRO patterns for both SaaS and local.

11 Updated today
heymegabyte
DevOps & Infrastructure Listed

grafana-architect

Grafana dashboards + alerts — dashboards-as-code (Grizzly), per-service folders, one-question-per-panel, unified alerting with runbooks, low-cardinality discipline. Use when designing dashboards, writing alert rules, or auditing.

2 Updated yesterday
ralvarezdev
Code & Development Listed

observability-architect

Application-side observability — structured logs, Prometheus metrics, OTel traces, signal correlation, head sampling, PII discipline, RED+USE. Use when instrumenting code, naming metrics, or auditing what a service emits.

2 Updated yesterday
ralvarezdev
DevOps & Infrastructure Listed

monitoring

Monitoring and alerting

1 Updated today
ryukyagamilight
AI & Automation Listed

monitoring-observability

Provides monitoring and observability best practices covering the three pillars (logs, metrics, traces), OpenTelemetry instrumentation, Prometheus/Grafana dashboards, SLO-based alerting, and APM strategies. Use when setting up monitoring, observability, prometheus, grafana, opentelemetry, alerting, tracing, logging, metrics, dashboards, SLOs, or APM.

65 Updated today
Tibsfox
AI & Automation Listed

grafana-alert-router

Routes Grafana alerting webhook payloads to Slack, PagerDuty, and OpsGenie channels based on label matching rules. Supports alert grouping and silence management via the Grafana Alerting API.

13 Updated today
agentskillexchange
AI & Automation Listed

obs-bootstrap

Step-by-step OpenTelemetry and uFawkesObs setup: SDK init patterns for TypeScript, Python, Go; DORA metric spans; Grafana dashboard spec. Use when adding observability to a service.

2 Updated today
paruff
Data & Documents Listed

pipeline-bootstrap

Step-by-step guide to connect a uFawkesAI project to uFawkesPipe and fawkes platform: Dockerfile, ArgoCD manifest, DORA deployment spans. Use when setting up CI/CD for a new service.

2 Updated today
paruff
DevOps & Infrastructure Listed

monitoring-patterns

Application monitoring patterns covering Prometheus metrics (Counter, Gauge, Histogram, Summary), the prometheus-client Python library, metric naming conventions, labels, and health check endpoints. Use whenever a Python project instruments metrics, uses prometheus-client, or the user asks about Prometheus, metrics, monitoring, health checks, or observability, even if "Prometheus" is not mentioned by name.

0 Updated today
ku5ic
AI & Automation Listed

mcp-agent-manager

Route user runtime requests to a scoped MCP and call the best matching tool. Also manages MCP setup, health checks, Teleport sync, and agent rendering.

1 Updated 1 week ago
lkhung09
AI & Automation Listed

obs-guardian

Builds observability, monitoring, alerting, and incident visibility for production systems. Covers OpenTelemetry instrumentation for traces, metrics, and logs; structured logging with JSON, correlation IDs, and sampling; Prometheus and Grafana scrape configs, dashboards, and recording rules; distributed tracing with Jaeger and Tempo; SLO/SLA definition, error budgets, burn-rate alerts; PagerDuty and OpsGenie alerting rules; and on-call runbook templates. Use this skill when the user says "set up monitoring," "instrument with OpenTelemetry," "add structured logging," "set up Grafana dashboards," "define SLOs," "no visibility into my app," "tracing across microservices," "alerting rules," or "production incident with no logs."

1 Updated 2 weeks ago
mturac
DevOps & Infrastructure Listed

devops-best-practices

Opinionated production-grade DevOps defaults for Terraform, Kubernetes, CI/CD, Docker, cloud security, observability, cost, and disaster recovery. ALWAYS use when generating, reviewing, or modifying any infrastructure code, Kubernetes manifests (Deployment, Service, StatefulSet, Helm, Kustomize), Terraform (.tf, modules, state), Dockerfiles, docker-compose, CI/CD pipelines (.github/workflows, .gitlab-ci.yml, Jenkinsfile), cloud resources (AWS/GCP/Azure), IAM policies, security groups, observability setup (Prometheus, Grafana, OpenTelemetry), or DNS/TLS/CDN config — even if the user does not explicitly ask for best practices. Prevents the failure modes that hurt production teams most often: missing PDBs, single replicas in prod, latest image tags, public S3 buckets, long-lived credentials, missing observability, and CI/CD supply-chain risks. Apply opinionated defaults by default; surface tradeoffs when the user has reason to deviate.

0 Updated 2 days ago
ronalships
AI & Automation Listed

office-docs

Generate PPTX presentations, DOCX reports, and XLSX spreadsheets from structured data — using python-pptx, python-docx, and openpyxl without requiring Microsoft Office

4 Updated 2 days ago
Claudient
AI & Automation Listed

observability

Backend observability patterns — structured logging, Micrometer metrics, OpenTelemetry tracing, Spring Boot Actuator, Kubernetes health probes, alerting, and dashboards. Use when user mentions logging, metrics, tracing, monitoring, health checks, or Prometheus.

0 Updated today
IuliaIvanaPatras
AI & Automation Listed

aio-grafana-diagram

Create Grafana diagrams for system visualization — analyzes codebase to auto-generate Mermaid diagrams with metric binding. For standalone Mermaid diagrams use aio-mermaid instead.

3 Updated 1 week ago
aiocean
DevOps & Infrastructure Listed

spring-microservices-architect

Production-grade governance agent for Spring Boot microservices. Scaffolds projects iteratively using capability-based layering, enforces coding standards, and validates against battle-tested reference patterns. Fully portable — works with any domain. USE FOR: microservice, Spring Boot, scaffold, Docker compose, kubernetes, helm, eureka, gateway, resilience4j, reactive, spring cloud, openapi, persistence, security, oauth, tracing, zipkin, monitoring, prometheus, grafana, native compilation, graalvm, code review, architecture review, quality gate, governance, spring cloud stream, rabbitmq, kafka, testcontainers, mapstruct, service discovery, edge server, config server, circuit breaker, distributed tracing, entity, entities, domain model, generate entity, persistence model, create entity, MongoDB document, JPA entity, MapStruct mapper, repository, test, verify, validate, TDD, test-driven, failing test, integration test, build check, regression test, quality check, security database, MFA, multi-factor, WebAuthn,

0 Updated 1 months ago
jaykumarpatil
Data & Documents Listed

managing-context

Discovers and loads relevant project context from markdown documentation before each task. Matches context documents based on keywords, file paths, and task types. Use at task start to access project plans, architecture, and implementation status.

1 Updated 1 week ago
Open330

Integration detected automatically from skill content. Some results may be false positives.