groq-observability

Featured

Set up observability for Groq integrations: latency histograms, token throughput, rate limit gauges, cost tracking, and Prometheus alerts. Trigger with phrases like "groq monitoring", "groq metrics", "groq observability", "monitor groq", "groq alerts", "groq dashboard".

AI & Automation 2,266 stars 315 forks Updated today MIT


Quality Score: 99/100

Stars (20%): 100
Recency (20%): 100
Frontmatter (20%): 70
Documentation (15%): 100
Issue Health (10%): 50
License (10%): 100
Description (5%): 100

Skill Content

# Groq Observability

## Overview

Monitor Groq LPU inference for latency, token throughput, rate limit utilization, and cost. Groq's defining advantage is speed (280-560 tok/s), so latency degradation is the highest-priority signal. The API returns rich timing metadata (`queue_time`, `prompt_time`, `completion_time`) and rate limit headers on every response.

## Key Metrics to Track

| Metric | Type | Source | Why |
|--------|------|--------|-----|
| TTFT (time to first token) | Histogram | Client-side timing | Groq's main value prop |
| Tokens/second | Gauge | `usage.completion_time` | Throughput degradation |
| Total latency | Histogram | Client-side timing | End-to-end performance |
| Rate limit remaining | Gauge | `x-ratelimit-remaining-*` headers | Prevent 429s |
| Token usage | Counter | `usage.total_tokens` | Cost attribution |
| Error rate by code | Counter | Error handler | Availability |
| Estimated cost | Counter | Tokens * model price | Budget tracking |

## Instructions

### Step 1: Instrumented Groq Client

```typescript
import Groq from "groq-sdk";

const groq = new Groq();

interface GroqMetrics {
  model: string;
  latencyMs: number;
  ttftMs: number;
  tokensPerSec: number;
  promptTokens: number;
  completionTokens: number;
  totalTokens: number;
  queueTimeMs: number;
  estimatedCostUsd: number;
}

const PRICE_PER_1M: Record<string, { input: number; output: number }> = {
  "llama-3.1-8b-instant": { input: 0.05, output: 0.08 },
  "llama-3.3-70b-versatile": {...
```
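
The listing cuts the skill's code off before it shows how the derived metrics are computed. As a minimal sketch, not the skill's actual implementation, the three derived values from the metrics table (tokens/second, estimated cost, rate limit remaining) could be computed with pure helpers like the ones below. The names `UsageLike`, `tokensPerSecond`, `estimateCostUsd`, and `rateLimitRemaining` are hypothetical. The sketch assumes the `usage` timing fields are reported in seconds and that the `x-ratelimit-remaining-*` headers use the `-requests` and `-tokens` suffixes. The single price entry is copied from the skill's table and may not match current pricing.

```typescript
// Subset of the response `usage` object this sketch relies on.
// Assumption: timing fields are expressed in seconds.
interface UsageLike {
  prompt_tokens: number;
  completion_tokens: number;
  completion_time?: number;
}

// USD per 1M tokens, mirroring the shape of PRICE_PER_1M in the skill.
interface ModelPrice {
  input: number;
  output: number;
}

// Only the entry visible in the source; verify against current pricing.
const EXAMPLE_PRICES: Record<string, ModelPrice> = {
  "llama-3.1-8b-instant": { input: 0.05, output: 0.08 },
};

// Generation throughput: completion tokens divided by generation time.
// Returns 0 when the timing field is missing or zero, to avoid Infinity/NaN.
function tokensPerSecond(usage: UsageLike): number {
  const seconds = usage.completion_time ?? 0;
  return seconds > 0 ? usage.completion_tokens / seconds : 0;
}

// Estimated cost: tokens * per-million price, summed over input and output.
// Unknown models cost 0 here; a real setup might log or alert instead.
function estimateCostUsd(
  model: string,
  usage: UsageLike,
  prices: Record<string, ModelPrice> = EXAMPLE_PRICES,
): number {
  const price = prices[model];
  if (!price) return 0;
  const micros =
    usage.prompt_tokens * price.input + usage.completion_tokens * price.output;
  return micros / 1_000_000;
}

// Parses remaining-quota headers into numbers suitable for a gauge.
// Assumption: header names end in "-requests" and "-tokens".
function rateLimitRemaining(
  headers: Record<string, string | undefined>,
): { requests: number | null; tokens: number | null } {
  const toNumber = (value: string | undefined): number | null => {
    if (value === undefined) return null;
    const parsed = Number(value);
    return Number.isFinite(parsed) ? parsed : null;
  };
  return {
    requests: toNumber(headers["x-ratelimit-remaining-requests"]),
    tokens: toNumber(headers["x-ratelimit-remaining-tokens"]),
  };
}
```

Keeping these helpers free of SDK and Prometheus imports makes them easy to unit test. An instrumented client would call them with each response's `usage` object and headers, then feed the results into whichever gauge and counter types the metrics library provides.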

Details

Author: jeremylongshore
Repository: jeremylongshore/claude-code-plugins-plus-skills
Created: 7 months ago
Last Updated: today
Language: Python
License: MIT

Similar Skills

Semantically similar based on skill content — not just same category

AI & Automation Featured

groq-cost-tuning

Optimize Groq costs through model routing, token management, and usage monitoring. Use when analyzing Groq billing, reducing API costs, or implementing usage monitoring and budget alerts. Trigger with phrases like "groq cost", "groq billing", "reduce groq costs", "groq pricing", "groq expensive", "groq budget".

2,266 Updated today
jeremylongshore
AI & Automation Featured

groq-performance-tuning

Optimize Groq API performance with model selection, caching, streaming, and parallel requests. Use when experiencing slow responses, implementing caching strategies, or optimizing request throughput for Groq integrations. Trigger with phrases like "groq performance", "optimize groq", "groq latency", "groq caching", "groq slow", "groq speed".

2,266 Updated today
jeremylongshore
AI & Automation Featured

groq-rate-limits

Implement Groq rate limit handling with backoff, queuing, and header parsing. Use when handling rate limit errors, implementing retry logic, or optimizing API request throughput for Groq. Trigger with phrases like "groq rate limit", "groq throttling", "groq 429", "groq retry", "groq backoff".

2,266 Updated today
jeremylongshore
AI & Automation Featured

algolia-observability

Set up observability for Algolia: Prometheus metrics for search latency/errors, OpenTelemetry tracing, structured logging, and Grafana dashboards. Trigger: "algolia monitoring", "algolia metrics", "algolia observability", "monitor algolia", "algolia alerts", "algolia tracing", "algolia dashboard".

2,266 Updated today
jeremylongshore
AI & Automation Featured

groq-enterprise-rbac

Configure Groq organization management, API key scoping, spending controls, and team access patterns. Trigger with phrases like "groq organization", "groq RBAC", "groq enterprise", "groq team access", "groq spending limits", "groq multi-team".

2,266 Updated today
jeremylongshore