coreweave-deploy-integration

Featured

Deploy inference services on CoreWeave with Helm charts and Kustomize. Use when deploying multi-model inference, managing GPU deployments at scale, or templating CoreWeave manifests. Trigger with phrases like "deploy coreweave", "coreweave helm", "coreweave kustomize", "coreweave deployment patterns".

AI & Automation 2,266 stars 315 forks Updated today MIT

Install

View on GitHub

Quality Score: 99/100

Stars 20%

100

Recency 20%

100

Frontmatter 20%

Documentation 15%

100

Issue Health 10%

License 10%

100

Description 5%

100

Skill Content

# CoreWeave Deploy Integration ## Overview Deploy GPU-accelerated inference services on CoreWeave Kubernetes (CKS). This skill covers containerizing inference workloads with NVIDIA CUDA base images, configuring GPU resource limits and node affinity for A100/H100 scheduling, setting up health checks that validate GPU availability and model loading, and executing rolling updates that respect GPU node draining. CoreWeave's scheduler requires explicit GPU resource requests to place pods on the correct hardware tier. ## Docker Configuration ```dockerfile FROM nvidia/cuda:12.4.0-runtime-ubuntu22.04 AS base RUN apt-get update && apt-get install -y --no-install-recommends \ python3 python3-pip curl && rm -rf /var/lib/apt/lists/* WORKDIR /app COPY requirements.txt ./ RUN pip3 install --no-cache-dir -r requirements.txt FROM base RUN groupadd -r app && useradd -r -g app app COPY --chown=app:app src/ ./src/ COPY --chown=app:app models/ ./models/ USER app EXPOSE 8080 HEALTHCHECK --interval=30s --timeout=10s --retries=3 \ CMD curl -f http://localhost:8080/health || exit 1 CMD ["python3", "src/server.py"] ``` ## Environment Variables ```bash export COREWEAVE_API_KEY="cw_xxxxxxxxxxxx" export COREWEAVE_NAMESPACE="tenant-my-org" export MODEL_NAME="meta-llama/Llama-3.1-8B-Instruct" export GPU_TYPE="A100_PCIE_80GB" export GPU_COUNT="1" export LOG_LEVEL="info" export PORT="8080" ``` ## Health Check Endpoint ```typescript import express from 'express'; import { execSync } from 'chil...

Details

Author: jeremylongshore
Repository: jeremylongshore/claude-code-plugins-plus-skills
Created: 7 months ago
Last Updated: today
Language: Python
License: MIT

coreweave-observability

Set up GPU monitoring and observability for CoreWeave workloads. Use when implementing GPU metrics dashboards, configuring alerts, or tracking inference latency and throughput. Trigger with phrases like "coreweave monitoring", "coreweave observability", "coreweave gpu metrics", "coreweave grafana".

2,266 Updated today

jeremylongshore