coreweave-deploy-integration

Featured

Deploy inference services on CoreWeave with Helm charts and Kustomize. Use when deploying multi-model inference, managing GPU deployments at scale, or templating CoreWeave manifests. Trigger with phrases like "deploy coreweave", "coreweave helm", "coreweave kustomize", "coreweave deployment patterns".

AI & Automation 2,266 stars 315 forks Updated today MIT

Install

View on GitHub

Quality Score: 99/100

Stars 20%
100
Recency 20%
100
Frontmatter 20%
70
Documentation 15%
100
Issue Health 10%
50
License 10%
100
Description 5%
100

Skill Content

# CoreWeave Deploy Integration ## Overview Deploy GPU-accelerated inference services on CoreWeave Kubernetes (CKS). This skill covers containerizing inference workloads with NVIDIA CUDA base images, configuring GPU resource limits and node affinity for A100/H100 scheduling, setting up health checks that validate GPU availability and model loading, and executing rolling updates that respect GPU node draining. CoreWeave's scheduler requires explicit GPU resource requests to place pods on the correct hardware tier. ## Docker Configuration ```dockerfile FROM nvidia/cuda:12.4.0-runtime-ubuntu22.04 AS base RUN apt-get update && apt-get install -y --no-install-recommends \ python3 python3-pip curl && rm -rf /var/lib/apt/lists/* WORKDIR /app COPY requirements.txt ./ RUN pip3 install --no-cache-dir -r requirements.txt FROM base RUN groupadd -r app && useradd -r -g app app COPY --chown=app:app src/ ./src/ COPY --chown=app:app models/ ./models/ USER app EXPOSE 8080 HEALTHCHECK --interval=30s --timeout=10s --retries=3 \ CMD curl -f http://localhost:8080/health || exit 1 CMD ["python3", "src/server.py"] ``` ## Environment Variables ```bash export COREWEAVE_API_KEY="cw_xxxxxxxxxxxx" export COREWEAVE_NAMESPACE="tenant-my-org" export MODEL_NAME="meta-llama/Llama-3.1-8B-Instruct" export GPU_TYPE="A100_PCIE_80GB" export GPU_COUNT="1" export LOG_LEVEL="info" export PORT="8080" ``` ## Health Check Endpoint ```typescript import express from 'express'; import { execSync } from 'chil...

Details

Author
jeremylongshore
Repository
jeremylongshore/claude-code-plugins-plus-skills
Created
7 months ago
Last Updated
today
Language
Python
License
MIT

Integrates with

Similar Skills

Semantically similar based on skill content — not just same category

AI & Automation Featured

coreweave-hello-world

Deploy a GPU workload on CoreWeave with kubectl. Use when running your first GPU job, testing inference, or verifying CoreWeave cluster access. Trigger with phrases like "coreweave hello world", "coreweave first deploy", "coreweave gpu test", "run on coreweave".

2,266 Updated today
jeremylongshore
DevOps & Infrastructure Featured

coreweave-ci-integration

Integrate CoreWeave deployments into CI/CD pipelines with GitHub Actions. Use when automating container builds, deploying inference services from CI, or validating GPU manifests in pull requests. Trigger with phrases like "coreweave CI", "coreweave github actions", "coreweave pipeline", "automate coreweave deploy".

2,266 Updated today
jeremylongshore
AI & Automation Featured

coreweave-local-dev-loop

Set up local development workflow for CoreWeave GPU deployments. Use when building containers locally, testing YAML manifests, or iterating on model serving configurations before deploying. Trigger with phrases like "coreweave dev setup", "coreweave local testing", "develop for coreweave", "coreweave container build".

2,266 Updated today
jeremylongshore
AI & Automation Featured

coreweave-install-auth

Configure CoreWeave Kubernetes Service (CKS) access with kubeconfig and API tokens. Use when setting up kubectl access to CoreWeave, configuring CKS clusters, or authenticating with CoreWeave cloud services. Trigger with phrases like "install coreweave", "setup coreweave", "coreweave kubeconfig", "coreweave auth", "connect to coreweave".

2,266 Updated today
jeremylongshore
AI & Automation Featured

coreweave-observability

Set up GPU monitoring and observability for CoreWeave workloads. Use when implementing GPU metrics dashboards, configuring alerts, or tracking inference latency and throughput. Trigger with phrases like "coreweave monitoring", "coreweave observability", "coreweave gpu metrics", "coreweave grafana".

2,266 Updated today
jeremylongshore