coreweave-core-workflow-a

Deploy KServe InferenceService on CoreWeave with autoscaling and GPU scheduling. Use when serving ML models with KServe, configuring scale-to-zero, or deploying production inference endpoints on CoreWeave. Trigger with phrases like "coreweave inference service", "coreweave kserve", "coreweave model serving", "deploy model on coreweave".

AI & Automation 2,266 stars 315 forks Updated today MIT

Quality Score: 99/100 (Stars 20%: 100; Recency 20%: 100; Frontmatter 20%; Documentation 15%: 100; Issue Health 10%; License 10%: 100; Description 5%: 100)

Skill Content

# CoreWeave Core Workflow: KServe Inference

## Overview

Deploy production inference services on CoreWeave using KServe InferenceService with GPU scheduling, autoscaling, and scale-to-zero. CKS natively integrates with KServe for serverless GPU inference.

## Prerequisites

- Completed `coreweave-install-auth` setup
- KServe available on your CKS cluster
- Model stored in S3, GCS, or HuggingFace

## Instructions

### Step 1: Deploy an InferenceService

```yaml
# inference-service.yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: llama-inference
  annotations:
    autoscaling.knative.dev/class: "kpa.autoscaling.knative.dev"
    autoscaling.knative.dev/metric: "concurrency"
    autoscaling.knative.dev/target: "1"
    autoscaling.knative.dev/minScale: "1"
    autoscaling.knative.dev/maxScale: "5"
spec:
  predictor:
    minReplicas: 1
    maxReplicas: 5
    containers:
      - name: kserve-container
        image: vllm/vllm-openai:latest
        args:
          - "--model"
          - "meta-llama/Llama-3.1-8B-Instruct"
          - "--port"
          - "8080"
        ports:
          - containerPort: 8080
            protocol: TCP
        resources:
          limits:
            nvidia.com/gpu: "1"
            memory: 48Gi
            cpu: "8"
          requests:
            nvidia.com/gpu: "1"
            memory: 32Gi
            cpu: "4"
        env:
          - name: HUGGING_FACE_HUB_TOKEN
            valueFrom:
              secretKeyRef: ...
```
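
To make the autoscaling annotations in Step 1 more concrete, here is a rough Python sketch of how a concurrency target and min/max scale bounds could translate into a replica count. It is a simplification written for illustration, not Knative's actual algorithm: it ignores the target-utilization factor, averaging windows, and panic mode. The function name `desired_replicas` and its parameters are hypothetical; the defaults simply mirror the `target`, `minScale`, and `maxScale` values in the manifest.

```python
import math


def desired_replicas(in_flight: int, target: int = 1,
                     min_scale: int = 1, max_scale: int = 5) -> int:
    """Estimate the replica count for a given number of in-flight requests.

    Assumption: a simplified model of concurrency-based scaling. Real
    Knative autoscaling also applies a target-utilization factor,
    averaging windows, and panic mode, all of which are ignored here.
    """
    if target <= 0:
        raise ValueError("target must be positive")
    # One replica per `target` concurrent requests, rounded up.
    wanted = math.ceil(max(in_flight, 0) / target)
    # Clamp to the configured scale bounds.
    return max(min_scale, min(wanted, max_scale))


if __name__ == "__main__":
    for load in (0, 3, 12):
        print(f"{load} in-flight requests -> {desired_replicas(load)} replica(s)")
```

Under this model, the manifest's floor of 1 means an idle service still keeps one GPU replica running, and load beyond five concurrent requests queues rather than adding replicas. A scale-to-zero configuration would correspond to a floor of 0 (`min_scale=0` in the sketch).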

Details

Author: jeremylongshore
Repository: jeremylongshore/claude-code-plugins-plus-skills
Created: 7 months ago
Last Updated: today
Language: Python
License: MIT

coreweave-install-auth

Configure CoreWeave Kubernetes Service (CKS) access with kubeconfig and API tokens. Use when setting up kubectl access to CoreWeave, configuring CKS clusters, or authenticating with CoreWeave cloud services. Trigger with phrases like "install coreweave", "setup coreweave", "coreweave kubeconfig", "coreweave auth", "connect to coreweave".
