coreweave-rate-limits

Solid

Handle CoreWeave API and GPU quota limits. Use when hitting quota limits, managing GPU resource allocation, or implementing request queuing for inference endpoints. Trigger with phrases like "coreweave quota", "coreweave limits", "coreweave gpu allocation", "coreweave throttle".

AI & Automation 2,266 stars 315 forks Updated today MIT

Install

View on GitHub

Quality Score: 97/100

Stars 20%

100

Recency 20%

100

Frontmatter 20%

Documentation 15%

Issue Health 10%

License 10%

100

Description 5%

100

Skill Content

# CoreWeave Rate Limits ## Overview CoreWeave limits are primarily GPU quota-based rather than API rate limits. Each namespace has allocated GPU quotas per type. ## Check GPU Quota ```bash kubectl describe resourcequota -n my-namespace kubectl get resourcequota -o json | jq '.items[].status' ``` ## Inference Request Queuing ```python import asyncio from collections import deque class InferenceQueue: def __init__(self, max_concurrent: int = 10): self.semaphore = asyncio.Semaphore(max_concurrent) self.queue_depth = 0 async def inference(self, client, prompt: str) -> str: self.queue_depth += 1 async with self.semaphore: try: return await asyncio.to_thread(client.generate, prompt) finally: self.queue_depth -= 1 ``` ## Resources - [CoreWeave Node Pools](https://docs.coreweave.com/products/cks/nodes/manage) ## Next Steps For security, see `coreweave-security-basics`.

Details

Author: jeremylongshore
Repository: jeremylongshore/claude-code-plugins-plus-skills
Created: 7 months ago
Last Updated: today
Language: Python
License: MIT

Integrates with

Anthropic · AI Kubernetes · Infrastructure

Similar Skills

Semantically similar based on skill content — not just same category

AI & Automation Solid

coreweave-cost-tuning

Optimize CoreWeave GPU cloud costs with right-sizing and scheduling. Use when reducing GPU spend, selecting cost-effective instances, or implementing scale-to-zero for dev workloads. Trigger with phrases like "coreweave cost", "coreweave pricing", "reduce coreweave spend", "coreweave budget".

2,266 Updated today

jeremylongshore

AI & Automation Solid

coreweave-common-errors

Diagnose and fix CoreWeave GPU scheduling, pod, and networking errors. Use when pods are stuck Pending, GPUs are not allocated, or experiencing CUDA and NCCL errors. Trigger with phrases like "coreweave error", "coreweave pod pending", "coreweave gpu not found", "coreweave debug", "fix coreweave".

2,266 Updated today

jeremylongshore

AI & Automation Solid

coreweave-performance-tuning

Optimize CoreWeave GPU inference latency and throughput. Use when reducing inference latency, maximizing GPU utilization, or tuning batch sizes and concurrency. Trigger with phrases like "coreweave performance", "coreweave latency", "coreweave throughput", "optimize coreweave inference".

2,266 Updated today

jeremylongshore

AI & Automation Featured

coreweave-observability

Set up GPU monitoring and observability for CoreWeave workloads. Use when implementing GPU metrics dashboards, configuring alerts, or tracking inference latency and throughput. Trigger with phrases like "coreweave monitoring", "coreweave observability", "coreweave gpu metrics", "coreweave grafana".

2,266 Updated today

jeremylongshore

AI & Automation Featured

coreweave-sdk-patterns

Production-ready patterns for CoreWeave GPU workload management with kubectl and Python. Use when building inference clients, managing GPU deployments programmatically, or creating reusable CoreWeave deployment templates. Trigger with phrases like "coreweave patterns", "coreweave client", "coreweave Python", "coreweave deployment template".

2,266 Updated today

jeremylongshore