coreweave-cost-tuning

Solid

Optimize CoreWeave GPU cloud costs with right-sizing and scheduling. Use when reducing GPU spend, selecting cost-effective instances, or implementing scale-to-zero for dev workloads. Trigger with phrases like "coreweave cost", "coreweave pricing", "reduce coreweave spend", "coreweave budget".

AI & Automation 2,266 stars 315 forks Updated today MIT

Install

View on GitHub

Quality Score: 97/100

Stars 20%

100

Recency 20%

100

Frontmatter 20%

Documentation 15%

Issue Health 10%

License 10%

100

Description 5%

100

Skill Content

# CoreWeave Cost Tuning ## GPU Pricing Reference (approximate) | GPU | Per GPU/hour | Best For | |-----|-------------|----------| | A100 40GB PCIe | ~$1.50 | Development, smaller models | | A100 80GB PCIe | ~$2.21 | Production inference | | H100 80GB PCIe | ~$4.76 | High-throughput inference | | H100 SXM5 (8x) | ~$6.15/GPU | Training, multi-GPU | | L40 | ~$1.10 | Image generation, light inference | ## Cost Optimization Strategies ### Scale-to-Zero for Dev/Staging ```yaml autoscaling.knative.dev/minScale: "0" autoscaling.knative.dev/scaleDownDelay: "5m" ``` ### Right-Size GPU Selection ```python def recommend_gpu(model_size_b: float, inference_only: bool = True) -> str: if model_size_b <= 7: return "L40" if inference_only else "A100_PCIE_80GB" elif model_size_b <= 13: return "A100_PCIE_80GB" elif model_size_b <= 70: return "A100_PCIE_80GB (4x tensor parallel)" else: return "H100_SXM5 (8x tensor parallel)" ``` ### Quantization to Use Smaller GPUs Use AWQ or GPTQ quantization to fit larger models on smaller GPUs: ```bash # 70B model at 4-bit fits on single A100-80GB instead of 4x vllm serve meta-llama/Llama-3.1-70B-Instruct-AWQ --quantization awq ``` ## Resources - [CoreWeave Pricing](https://www.coreweave.com/pricing) - [CoreWeave GPU Instances](https://docs.coreweave.com/docs/platform/instances/gpu-instances) ## Next Steps For architecture patterns, see `coreweave-reference-architecture`.

Details

Author: jeremylongshore
Repository: jeremylongshore/claude-code-plugins-plus-skills
Created: 7 months ago
Last Updated: today
Language: Python
License: MIT

coreweave-common-errors

Diagnose and fix CoreWeave GPU scheduling, pod, and networking errors. Use when pods are stuck Pending, GPUs are not allocated, or experiencing CUDA and NCCL errors. Trigger with phrases like "coreweave error", "coreweave pod pending", "coreweave gpu not found", "coreweave debug", "fix coreweave".

2,266 Updated today

jeremylongshore