air-gapped

Organization

Claude Code plugin marketplace — 58 installable reference skills across vLLM/SGLang inference, Kubernetes & Harvester, GPU host bring-up, observability, security, and agent workflows.

58 indexed · 0 Featured · 3 stars · avg score 79

Prolific

Categories

AI & Automation (43) · API & Backend (1) · Data & Documents (2) · DevOps & Infrastructure (12)

Indexed Skills (58)

AI & Automation Solid

autoresearch

Autonomous experiment loops that hill-climb a measurable metric — apply one change, measure, keep it only if the number improved, revert if not, repeat unattended. Also deep multi-perspective research producing a saved report, and research-then-optimize when no metric exists yet.

3 Updated 2 days ago
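
A minimal sketch of the keep-or-revert loop that the autoresearch description outlines, assuming a metric where higher is better. The names `hill_climb`, `propose`, and `measure` are hypothetical and are not taken from the skill itself.

```python
from typing import Callable, Tuple, TypeVar

S = TypeVar("S")


def hill_climb(
    state: S,
    propose: Callable[[S, int], S],   # hypothetical: returns one changed copy of the state
    measure: Callable[[S], float],    # hypothetical: the metric, higher is better
    iterations: int,
) -> Tuple[S, float]:
    """Greedy loop: apply one change, measure, keep only on improvement."""
    best = measure(state)
    for step in range(iterations):
        candidate = propose(state, step)
        score = measure(candidate)
        if score > best:
            # The number improved, so the change is kept.
            state, best = candidate, score
        # Otherwise the candidate is dropped, which is the "revert" branch.
    return state, best


# Toy run: the metric peaks at x == 3 and proposals alternate +1 / -1.
final_state, final_score = hill_climb(
    0,
    lambda x, step: x + 1 if step % 2 == 0 else x - 1,
    lambda x: -float((x - 3) ** 2),
    10,
)
```

Because a rejected candidate is never assigned to `state`, reverting needs no extra bookkeeping in this sketch; a real loop that edits files in place would need an explicit undo step.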

AI & Automation Solid

aiperf

NVIDIA AIPerf — vendor-neutral generative-AI inference benchmarking (genai-perf successor). Covers `aiperf profile` with concurrency / request-rate / fixed-schedule trace replay / user-centric / multi-run confidence, 17 endpoint types (chat, completions, embeddings, rankings, responses, image-gen, image-edit, video-gen, NIM, HF-TGI, raw, template, etc.), 10 custom dataset formats (single_turn, multi_turn, mooncake_trace, bailian_trace, burst_gpt_trace, random_pool, dag_jsonl, raw_payload, inputs_json, sagemaker_data_capture) plus the SPEED-Bench family, 20+ public datasets, goodput SLOs, GPU + Prometheus telemetry, plot/analyze-trace/synthesize/service subcommands, plugin extensibility, and reasoning-token TTFT/TTFO split.

3 Updated 2 days ago

AI & Automation Solid

airgap-vetting

Vet an open-source product for air-gap readiness BEFORE adoption. Answers eight questions — telemetry, does-opt-out-actually-work, proxy-in-disguise over a hosted API, runtime downloads, custom-CA support, feature-level content egress / offline degradation, day-two sustainment (feed mirroring, staleness), and a 4-grade verdict (air-gap-native / possible-with-mirror / proxy-in-disguise / no-go). Two-pass: static grep of source + container image (bundled fingerprint tables work offline), then an optional dynamic harness (--network=none, egress deny+log, mitmproxy CA injection, faketime). Writes AIRGAP-VETTING.json + .md. Use for any "should we adopt X?" question in a disconnected environment, not just when the user says "air gap".

3 Updated 2 days ago

DevOps & Infrastructure Solid

ansible-idrac-9-10

Run and debug `dellemc.openmanage` Ansible playbooks against Dell PowerEdge **iDRAC 9** (14G–16G) and **iDRAC 10** (17G — R670, R770, R870, R970, XE9780, XE9785). Covers the iDRAC 10 / iDRAC 9 ≥ 7.30.10.50 `BasicAuthState: Unadvertised` default that silently 401s `ansible.builtin.uri` (Dell KB 000437501), the `idrac_session` + `x_auth_token` lifecycle with `block:/always:`, `force_basic_auth: true` fallback for raw Redfish, OMSDK modules (`idrac_firmware`, `idrac_server_config_profile`) that cannot use tokens, iDRAC 10 attribute deltas (`iDRAC.IPv4Static.*` → `iDRAC.IPv4.Static*`, `iDRAC.NIC.*` → `iDRAC.Network.*`, ACME+SCEP → `iDRAC.ACE`, `BIOS.SysSecurity.AcPwrRcvry*` → `System.ServerPwr.*`), iDRAC 9-only modules (`idrac_network` → `idrac_network_attributes`, `idrac_syslog`, `idrac_timezone_ntp`), iDRAC 10 Redfish Jobs URI under `/Oem/Dell/Jobs/`, WS-MAN removed on 17G, and version pins (collection ≥9.12.3 broad / ≥10.0.2 full; 9.12.1 for iDRAC 8).

3 Updated 2 days ago

DevOps & Infrastructure Solid

argo-cd-apps

Author and maintain Argo CD `Application` and `ApplicationSet` manifests as a GitOps consumer (publisher), targeting Argo CD v3.3 / v3.4 (May 2026). Covers source types (Helm, Kustomize, OCI, multi-source, plugin), sync policies + options + waves + hooks, ApplicationSet generators (List, Cluster, Git, Matrix, Merge, SCMProvider, PullRequest, Plugin, ClusterDecisionResource), Progressive Sync (Beta), Source Hydrator (still Alpha), AppProjects, RBAC, sync impersonation (`destinationServiceAccounts`), GPG/cosign signature verification, GitOps repo layout (mono vs poly, app-of-apps vs ApplicationSet — Argo recommends ApplicationSet first), troubleshooting drift / OutOfSync / sync loops / stuck-deletion / hook failures, and v3.0→v3.4 changes (annotation tracking default, SSA-migration regression, CVE-2026-42880 Secret leak). NOT for installing or operating the Argo CD control plane (HA, Dex, repo-server tuning, UI customization).

3 Updated 2 days ago

AI & Automation Solid

confluence-best-practices

Advise on USING Confluence well, not operating it: make the structural call (is this a space, a page, or a child page?), diagnose why a wiki is dreaded (can't find anything, content rots, duplicates, hidden by permissions, unreadable), and recommend the lean fix. Built FIRST for an agent that ACTS on Confluence (creates/organises/governs content via REST/CQL or an MCP server) and SECOND for helping humans author readable pages. Self-hosted Server/Data Center first (storage format NOT ADF; no native page archive; REST v1), but works for Cloud too. Adapt to the org's own space conventions and working language; never auto-translate content. Covers ALL content types: knowledge base, docs, intranet, meeting notes, runbooks, decision records.

3 Updated 2 days ago

AI & Automation Solid

gpu-host-tuning

Audit AND tune Linux/GPU inference hosts — read-only host snapshot (CPU power state, C-states, NUMA topology, PCIe link state, GPU settings, kernel boot params, sysctl, ulimits, IRQ affinity, container runtime), optional pinned-host↔GPU memcpy bench (torch + numactl), and per-lever cheat-sheets to flip settings (governor, EPP, cpuidle, persistence, ECC, hugepages, intel_iommu, NCCL env, tuned-adm profiles, Dell/Supermicro/HPE BIOS guidance). Sits beneath any inference framework (vLLM, sglang, TensorRT-LLM) — about the host, not the framework.

3 Updated 2 days ago

DevOps & Infrastructure Solid

harvester-upgrade

Plan and run a controlled, COMMUNITY-edition Harvester HCI upgrade off an EOL line up to latest stable — the no-skip minor ladder (1.5→1.6→1.7→1.8; embedded RKE2/KubeVirt/Longhorn/SLE-Micro ride along), gated at each hop on first upgrading the EXTERNAL Rancher + a matching Harvester UI-extension (1.6↔Rancher 2.12, 1.7↔2.13, 1.8↔2.14). Covers air-gapped version detection, why node-upgrade order is NOT operator-choosable (forced serial; the pause knob is v1.7.0+ only) and how to protect VM-hosted control planes anyway via anti-affinity spread + N+1 live-migration, making self-managed RKE2 guests Harvester-aware (cloud-provider, CSI, qemu-guest-agent), per-hop breaking changes (wicked→NetworkManager, Intel NIC rename, DHCP IP churn), the enforced pre-flight health gates, and the no-downgrade backup/rollback reality. Companion to k8s-components-checker and rancher-upgrade.

3 Updated 2 days ago
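
The no-skip ladder and its Rancher pairing from the harvester-upgrade description can be sketched as a small planner. The version strings come from the description above; the function name `plan_hops` and the data layout are illustrative assumptions, not part of the skill.

```python
from typing import List, Tuple

# Minor ladder and Harvester-to-Rancher pairings as listed in the description.
LADDER = ["1.5", "1.6", "1.7", "1.8"]
RANCHER_FOR = {"1.6": "2.12", "1.7": "2.13", "1.8": "2.14"}


def plan_hops(current: str, target: str) -> List[Tuple[str, str]]:
    """Return every hop as (harvester_minor, rancher_minor_to_reach_first)."""
    start, end = LADDER.index(current), LADDER.index(target)
    if end <= start:
        raise ValueError("target must be newer than current; downgrades are not planned")
    # Slicing the ladder guarantees that no minor is skipped.
    return [(minor, RANCHER_FOR[minor]) for minor in LADDER[start + 1 : end + 1]]


hops = plan_hops("1.5", "1.8")
```

Each tuple reads as "bring the external Rancher to the second value, then upgrade Harvester to the first".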

DevOps & Infrastructure Solid

helm

This skill should be used when authoring or maintaining Helm charts — creating charts, writing templates and _helpers.tpl, values.yaml patterns, Chart.yaml, values.schema.json, helm-docs, and library charts. Covers Helm 4 (SSA, WASM, OCI digest), chart CI/CD, OpenShift compatibility, chart security, CRD management, and production templates. NOT for installing or consuming third-party charts.

3 Updated 2 days ago

AI & Automation Solid

jinja-expert

Author, read, and debug Jinja2 templates across the three places Jinja lives in 2026 — HuggingFace `chat_template.jinja` (rendered by `apply_chat_template` for vLLM / sglang), Ansible playbooks + `.j2` files, and Jinja-adjacent Kubernetes workflows (`values.yaml.j2`, `kubernetes.core.k8s + template`, Helm post-renderers). Companion to the `helm` skill — Helm charts are Go `text/template` + Sprig, not Jinja, and this skill makes that disambiguation explicit.

3 Updated 2 days ago

AI & Automation Solid

jira-best-practices

Advise on USING Jira well, not operating it: make the structural call (is this an epic, a story, a task, or a sub-task?), diagnose why a Jira instance is dreaded, and recommend the lean fix. Adapt to the organisation's OWN hierarchy names, conventions, and working language instead of imposing a methodology. Self-hosted-first: Jira Data Center 10.3/11.x (no Cloud AI; dual Epic Link + Parent Link). Built for an agent that ACTS on Jira through the jira-cli tool or the mcp-atlassian MCP server while advising the user; Jira web-UI and admin-schema guidance is secondary. Covers ALL project types: software AND non-software (operations, engineering, services, business).

3 Updated 2 days ago

Data & Documents Solid

jira-cli

Drive Atlassian Jira from the terminal with the `jira` CLI (jira-cli, v1.7.0) against ANY Jira — Cloud or on-premise/Data Center. Covers the full command surface (issue / epic / sprint / board / project / release), the non-interactive automation contract (`--no-input` + `--plain`/`--raw`/`--csv` for agent-safe, parseable output), JQL filtering, GitHub/Jira markdown → Atlassian Document Format (ADF) conversion, authentication for every backend (Cloud API token, on-prem basic, PAT/bearer, mTLS), and live-discovery of instance-specific values (project keys, issue types, statuses, priorities, link types, custom fields) instead of guessing them.

3 Updated 2 days ago

AI & Automation Solid

jira-confluence-mcp

Install, configure, secure, and troubleshoot the mcp-atlassian MCP server (sooperset/mcp-atlassian) that connects an agent to Jira/Confluence — including AIR-GAPPED setup (mirror the prebuilt image by digest; no PyPI/git mirror) and internal-CA / TLS handling (mount the CA vs JIRA_SSL_VERIFY=false). Self-hosted Data Center first: the #1 gotcha is DC uses JIRA_PERSONAL_TOKEN (a PAT), NOT the Cloud username+API-token pattern. Covers `claude mcp add`, the env-var catalog, hardening (READ_ONLY_MODE, TOOLSETS/ENABLED_TOOLS, project filters, the v0.22 default-toolset change), Cloud-vs-DC tool/format divergence, and 401/403/field/rate-limit/SSL fixes. NOT a catalogue of the 72 tools — those self-document at runtime; this is the setup/ops knowledge invisible at call time.

3 Updated 2 days ago

DevOps & Infrastructure Solid

k8s-components-checker

Survey an RKE2 community cluster against an embedded compatibility registry of 19 stack components and produce a verdict for upgrade-readiness, drift-review, and version-skew questions. Components: RKE2, Rancher, Harvester, Cilium, Tetragon, cert-manager, Kyverno, KEDA, Argo CD, Harbor, Traefik, Rook, Ceph, OpenEBS, GitLab, ECK, Zalando postgres-operator, Grafana Mimir, NVIDIA GPU Operator. Works air-gapped — compatibility data lives in `references/compat/`. Surveys run via `kubectl` + `helm` + `pluto` + the apiserver `apiserver_requested_deprecated_apis` metric from the operator's workstation. Community editions only — Prime/EE-gated content is ignored. NOT for installing components, NOT for executing upgrades, NOT for tracking per-cluster running state (the registry is methodology, not inventory).

3 Updated 2 days ago

AI & Automation Solid

keda

Configure, operate, and master KEDA (Kubernetes Event-driven Autoscaling) — ScaledObject, ScaledJob, TriggerAuthentication CRDs, 70+ scalers, HPA behavior tuning, scale-to-zero, the KEDA HTTP Add-on, production hardening, multi-trigger semantics, scalingModifiers formulas, GitOps integration, and troubleshooting stuck scalers. Covers the common traps (cooldownPeriod only applies to N→0, CPU/memory cannot drive scale-to-zero alone, activationThreshold vs threshold, multi-trigger max-of semantics, HPA conflicts).

3 Updated 2 days ago
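
Two of the traps named in the keda description, `activationThreshold` versus `threshold` and max-of semantics across triggers, can be illustrated with a deliberately simplified model. This is not KEDA's or the HPA's code: the `Trigger` class, the field names, and the per-trigger formula `ceil(value / threshold)` are assumptions made for illustration, and cooldown and stabilization behavior are ignored.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Trigger:
    value: float                       # current metric value (hypothetical input)
    threshold: float                   # per-replica target, drives 1..N scaling
    activation_threshold: float = 0.0  # gates the 0 <-> 1 transition only


def desired_replicas(triggers: List[Trigger], min_replicas: int = 0, max_replicas: int = 10) -> int:
    """Simplified model: activation gates scale-from-zero, then the largest proposal wins."""
    active = any(t.value > t.activation_threshold for t in triggers)
    if not active:
        return min_replicas
    # Every trigger proposes a replica count; the maximum is taken.
    wanted = max(math.ceil(t.value / t.threshold) for t in triggers)
    return max(1, min_replicas, min(wanted, max_replicas))


busy = desired_replicas([Trigger(250, 100), Trigger(45, 10)])
idle = desired_replicas([Trigger(4, 10, activation_threshold=5)])
```

In the `idle` case the metric is non-zero but below the activation threshold, so the workload stays at zero even though `threshold` alone would suggest one replica.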

AI & Automation Solid

keycloak-iam

Operate, configure, deploy, secure, and integrate with Keycloak (open-source IAM) — the modern Quarkus distribution (24.x–26.7.x), the Keycloak Operator with `Keycloak` and `KeycloakRealmImport` CRDs, and realm/client/identity-provider configuration.

3 Updated 2 days ago

AI & Automation Solid

lmcache-mp

LMCache multiprocess (MP) mode — standalone LMCache server in its own pod/process that vLLM connects to over ZMQ. Gives process isolation, no GIL contention on the inference path, one cache shared by multiple vLLM pods per node, and CPU-memory scaling independent of GPU memory. Covers the `LMCacheMPConnector` path (vs the in-process `LMCacheConnectorV1`), the DaemonSet+Deployment K8s pattern and LMCache Operator, the L1 (CPU DRAM) + L2 (NIXL, fs, mooncake_store, s3, Redis) cascade, the `lmcache/standalone` + `lmcache/vllm-openai` image pair, hybrid-attention model support (Gemma 3/4, Qwen3.5/3.6 GDN, DeepSeek-V4-Flash, GLM 5.x, MiniMax-M3) via `SupportsHMA`, and the production gotchas (`--no-enable-prefix-caching`, vLLM/lmcache version pins, object-group separation, cache_salt fallback bug).

3 Updated 2 days ago

Data & Documents Solid

makefile-best-practices

Makefile best practices, patterns, and templates for GNU Make 4.x — dependency graphs, task-runner workflows, parallel-safe recipes, self-documenting help targets, and language-specific patterns (Go, Python, Node, Docker, Helm, POSIX).

3 Updated 2 days ago

AI & Automation Solid

nvidia-datacenter-bringup

Bring up NVIDIA HGX/DGX datacenter GPU hosts on Ubuntu 24.04 LTS — air-gapped or connected, Secure Boot enabled. Covers B300/B200/H100/A100/L40S/L4 driver+fabricmanager+NVLSM+DOCA-OFED install order and exact package set from NVIDIA CUDA repo + DOCA repo. Triggers on B300/B200/HGX/DGX install, "fabricmanager won't start", "system not yet initialized" / cudaErrorSystemNotReady, NVLSM missing, ib_umad not loading, DOCA-OFED before NVIDIA driver, nvidia-driver-pinning-XXX, nvlink5-XXX, nvidia-open vs cuda-drivers, "Blackwell requires open kernel modules", ConnectX-7/8 bridge device, FM exact-version-match, gpu-operator cuda-validator CrashLoopBackOff, B300 PCI ID 0x3182, air-gap CUDA + DOCA mirror, three-tier DOCA GPG key, MOK enrollment, DKMS sign, Dell PowerEdge XE9780/XE9785 baseboard firmware v1.4.30, iDRAC Redfish virtual AC cycle DellOemChassis.ExtendedReset, generic "install nvidia driver ubuntu 24.04 datacenter".

3 Updated 2 days ago

DevOps & Infrastructure Solid

nvidia-nixl

NVIDIA Inference Xfer Library (NIXL) operator + developer reference. Point-to-point KV-cache and tensor transport for distributed inference (Dynamo, vLLM, SGLang). Covers the agent API (full Python reference; C++/Rust via upstream pointers), all 15 backend plugins (UCX, GDS, GDS_MT, libfabric, mooncake, posix, hf3fs, obj/S3, azure_blob, infinia/DDN, gusli, uccl, gpunetio/DOCA, telemetry, tracing), AMD ROCm/HIP support, build paths (pip nixl-cu12/cu13, meson+ninja from source), ETCD vs side-channel metadata, telemetry (Prometheus + cyclic shared-memory), NIXL-EP elastic MoE device kernels, and Dynamo / vLLM NixlConnector / SGLang integration patterns.

3 Updated 2 days ago

AI & Automation Solid

open-webui-embeddings

Wire HuggingFace embedding + reranker models (BGE-M3, BGE-Reranker-v2-m3, etc.) into Open WebUI's RAG pipeline via LiteLLM proxying HuggingFace Text Embeddings Inference (TEI). Covers the exact wire shapes Open WebUI sends (URL auto-append on embed but NOT rerank; payload + response shapes for both modes), the LiteLLM-TEI gotchas (encoding_format=null trap, HF-driver task_type misdetection, openai vs huggingface driver tradeoffs), TEI config cliffs (max-client-batch-size 422 under hybrid search, max-batch-tokens AS the auto-truncate boundary, arch-specific Docker images), and the end-to-end production config. BGE-M3 + BGE-Reranker-v2-m3 are worked examples; patterns generalise to any TEI encoder.

3 Updated 2 days ago

AI & Automation Solid

open-webui-valkey-websocket

Deploy Open WebUI multi-pod with WebSockets and Valkey/Redis Sentinel at 1000+ user scale on Kubernetes. Centerpiece is the structural Socket.IO+Redis frame-amplification bug (#23733) that cripples multi-pod streaming, and the maintainer-endorsed mitigation (`CHAT_RESPONSE_STREAM_DELTA_CHUNK_SIZE`). Covers all multi-pod env vars, the custom-model-icon perf history (base64-in-/api/models, fixed late 2025–Apr 2026), the official helm chart's gaps (bundled Redis is unsuitable for production; no HPA/PDB/probes/sticky sessions), and the catalog of known multi-pod issues with current status.

3 Updated 2 days ago

AI & Automation Solid

prometheus-mimir-grafana

Query Prometheus and Grafana Mimir, write and debug PromQL, and build or fix Grafana dashboards — for agents solving problems from metrics. Covers the Prometheus HTTP API (`/api/v1/query`, `query_range`, `series`, `labels`, `metadata`), Mimir multi-tenancy (`X-Scope-OrgID`, federation `a|b|c`, per-tenant 422/429 limits), the PromQL surface (selectors, rate family, classic + native histograms, `histogram_quantile`, vector matching `on()`/`group_left`, recording rules), Grafana dashboard JSON (panels, targets, variables + interpolation specifiers, legacy `/api/dashboards/db` vs Grafana-12 `/apis/dashboard.grafana.app/v1beta1/…`), KPI frameworks (RED, USE, Golden Signals, SLO burn-rate), connection recipes, MCP servers vs curl, and the PromQL trap list.

3 Updated 2 days ago
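
As a sketch of the tenant header mentioned in the prometheus-mimir-grafana description, the snippet below builds (but does not send) an instant query with a federated `X-Scope-OrgID`. The host name and the `/prometheus` path prefix are assumptions for illustration, as is the helper name `build_instant_query`; only `/api/v1/query`, the header name, and the `a|b|c` tenant syntax come from the description.

```python
from typing import List
from urllib.parse import urlencode
from urllib.request import Request


def build_instant_query(base_url: str, promql: str, tenants: List[str]) -> Request:
    """Build a GET request for /api/v1/query scoped to one or more tenants."""
    url = f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"
    # Several tenants are joined with "|" for a federated query.
    return Request(url, headers={"X-Scope-OrgID": "|".join(tenants)})


# Hypothetical endpoint; nothing is sent until the request is passed to urlopen().
req = build_instant_query("http://mimir.example.internal/prometheus", "up", ["a", "b", "c"])
```

Keeping request construction separate from sending makes the URL and headers easy to inspect before anything touches the network.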

AI & Automation Solid

rancher-upgrade

Plan and sequence COMMUNITY-edition Rancher upgrades across air-gapped multi-cluster fleets: a management/"hosting" Rancher cluster plus the downstream RKE2/K3s clusters it provisions. Covers the community release model (2.11→2.14, community-vs-Prime cadence, EOL), the Kontainer Driver Metadata (KDM) matrix deciding which downstream k8s minors each Rancher version can manage (and the stranding risk when a host-Rancher bump outruns its sub-clusters), cross-cluster upgrade ordering, the embedded-CAPI→Rancher-Turtles migration, Fleet coupling, cert-manager/Helm/backup prerequisites, backup-restore-operator + etcd rollback, and the air-gapped upgrade procedure (which images/charts/KDM to mirror). Assumes the Helm-on-Kubernetes install. Community editions only; Prime content excluded. Companion to k8s-components-checker, which owns the management-cluster k8s compatibility verdict; this skill owns the upgrade methodology and downstream coordination.

3 Updated 2 days ago

AI & Automation Solid

secure-boot-cert-rotation

Triage and remediate the Microsoft Secure Boot 2011→2023 UEFI certificate rotation (CAs expiring June/October 2026) across Dell PowerEdge / iDRAC9 bare metal, Ubuntu/Linux servers, and Harvester HCI / KubeVirt guest VMs. Establishes the load-bearing fact that UEFI firmware ignores certificate expiry — nothing stops booting on the deadline; the real risk is forward-compat once a 2023-only-signed shim arrives, plus a dbx/revocation freeze — then routes to the cleanest per-platform fix: iDRAC BIOS-staged keys applied on reboot (Dell), fwupd-free manual `db` append that self-authenticates via the existing 2011 KEK (Linux), and the Harvester virt-launcher OVMF floor (v1.6.0) with ephemeral-vs-persistent NVRAM triage (VMs). Covers the PK→KEK→db trust chain, why no generic Microsoft 2023 KEK payload exists, and audit via mokutil / efi-readvar / racadm bioscert / Redfish.

3 Updated 2 days ago

AI & Automation Solid

sglang-hicache

SGLang HiCache (hierarchical KV cache) — three-tier prefix cache: GPU HBM (L1) → pinned host DRAM (L2) → distributed L3 (Mooncake / 3FS / NIXL / AIBrix / EIC / SiMM / file / LMCache). Covers `--enable-hierarchical-cache`, all `--hicache-*` flags, write policies, page_first* layouts, prefetch policy (best_effort / wait_complete / timeout), per-rank sizing, MHA / MLA / DSA / Mamba / SWA support matrix (SWA + 3FS hybrid shipped in v0.5.11), runtime attach/detach HTTP admin, and auto-rewrite startup log lines that silently substitute layout × IO × storage combinations.

3 Updated 2 days ago

AI & Automation Solid

sglang-model-gateway

SGLang Model Gateway (`sgl-model-gateway`, formerly `sgl-router`) — Rust router fronting vLLM and SGLang inference workers on Kubernetes. Covers first-class vLLM gRPC backend plus HTTP transparent-proxy for vanilla vLLM, the policy set (six `--policy` values, `cache_aware` default), tokenizer-format dispatch (`tokenizer.json` HF-fast vs `tiktoken.model` BPE — including when neither is required because `cache_aware` is text-based), air-gapped recipe (gateway ignores `HF_ENDPOINT`, mount tokenizer files on PVC only when actually needed), K8s manifests with `model_id` labels and per-model RBAC, three HA mitigations (single + PDB, `sessionAffinity: ClientIP`, `--enable-mesh` CRDT sync), and a pitfall catalog covering the Dec 2025 `sgl-router` → `sgl-model-gateway` rename and over-engineered tokenizer init-container traps.

3 Updated 2 days ago

AI & Automation Solid

skill-improver

Autoresearch loop for Claude Code skills — greedy keep/discard hill climbing on a 10-dimension quality rubric, with blind subagent validation for self-scoring bias, plus a `freshen` mode that probes external references (release notes, docs, deprecation signals) and applies verified updates, plus a `trigger` mode that measures and tunes the skill's frontmatter description until it reliably fires when it should and stays silent when it shouldn't (60/40 train/test split, 7 runs/query, blinded test scores).

3 Updated 2 days ago

AI & Automation Solid

transformers-config-tokenizers-expert

Preflight reference for HuggingFace snapshots — what vLLM, sglang, and transformers.generate see at runtime. Covers config-file precedence (tokenizer.json, tokenizer_config.json, generation_config.json, chat_template.jinja), transformers v5 tokenizer-class taxonomy (TokenizersBackend, PythonBackend, MistralCommonBackend, TikTokenTokenizer), special-token discovery (all_special_ids, added_tokens_decoder, extra_special_tokens, backend_tokenizer.get_added_tokens_decoder), chat-template Jinja contract (ImmutableSandboxedEnvironment, loopcontrols, raise_exception, strftime_now, tojson, add_generation_prompt), and engine knobs (skip_special_tokens, trust_request_chat_template, chat_template_kwargs allowlist, adjust_request, incremental detokenizer, EOS merge). Ships verified 2026 hall-of-shame for Kimi-K2.6, GLM-5.1, Gemma-4, Qwen3, DeepSeek-V3, plus drop-in Python for resolving markers to IDs, detecting turn-primer-as-EOS leaks, and cross-referencing tokenizer.json vs tokenizer_config.json.

3 Updated 2 days ago

AI & Automation Solid

ubuntu-autoinstall

Author, validate, and debug Ubuntu Server autoinstall configuration (the Subiquity installer's `autoinstall:` schema, version 1) for Ubuntu Server LTS 24.04 and 26.04, focused on unattended on-premise and air-gapped installs — identity, storage (LVM/direct/ZFS/encryption/RAID), apt mirror-selection + proxy + offline fallback, ssh, packages, kernel, late-commands/early-commands, and zero-touch delivery via a NoCloud seed. The `network:` block is netplan v2 (use the ubuntu-netplan skill); the `user-data:` block is cloud-config for the installed system (use the ubuntu-cloud-init skill).

3 Updated 2 days ago

DevOps & Infrastructure Solid

ubuntu-cloud-init

Author, validate, and debug cloud-init configuration for Ubuntu Server LTS 24.04 and 26.04, focused on on-premise and air-gapped hosts via the NoCloud datasource — `#cloud-config` user-data, the cloud-config modules (users, ssh, write_files, runcmd, apt with local mirrors, ca_certs for internal CAs, ntp, disk_setup), NoCloud seeding (seed dir, `cidata` ISO, `ds=nocloud;s=...` kernel cmdline, SMBIOS serial), boot stages, and pinning `datasource_list`. Public-cloud datasources (EC2/Azure/GCE…) are pointers only. For the `network:` block use the ubuntu-netplan skill; for the installer schema use the ubuntu-autoinstall skill.

3 Updated 2 days ago

AI & Automation Solid

ubuntu-netplan

Author, validate, and debug netplan network configuration (`/etc/netplan/*.yaml`) for Ubuntu Server LTS 24.04 and 26.04, focused on on-premise and air-gapped hosts — static addressing, bonds, bridges, VLANs, VRFs, routing/policy-routing, DNS, interface matching/renaming, and the systemd-networkd renderer (NetworkManager and desktop covered briefly). Also the shared `network:` substrate for the ubuntu-autoinstall and ubuntu-cloud-init skills, which both use netplan v2.

3 Updated 2 days ago

AI & Automation Solid

vllm-benchmarking

Run production vLLM benchmarks — `vllm bench` (serve, throughput, latency, sweep, startup, mm-processor), request-rate vs max-concurrency semantics, TTFT/TPOT/ITL/E2EL percentiles, goodput SLO measurement, prefix-cache workloads, air-gapped operation (HF_ENDPOINT, ModelScope, hf-mirror, offline cache). Methodology split — SLO health checks vs A/B change sweeps — plus pitfalls that produce misleading numbers (no warmup, wrong tokenizer, random-as-prod, `--request-rate inf` alone).

3 Updated 2 days ago

AI & Automation Solid

vllm-caching

vLLM tiered KV cache configuration for production H100/H200 clusters. Covers native CPU offload, LMCache (CPU+NVMe+GDS), NixlConnector (disaggregated prefill), MooncakeConnector (RDMA), and MultiConnector composition, plus version gates, sizing math (the flag is the total across TP, not per-GPU, the opposite of SGLang), and the KV-vs-weights offload distinction that operators most often get wrong.

3 Updated 2 days ago

AI & Automation Solid

vllm-chat-templates

vLLM chat-template (prompt-side Jinja) operator reference. Template resolution precedence (`--chat-template` → AutoProcessor → tokenizer default → bundled fallback), `chat_template_kwargs` allowlist silently dropping `add_generation_prompt`/`enable_thinking`/custom kwargs (PR 27622 fix), 27 shipped `tool_chat_template_*.jinja` files, known template-layer bugs for Qwen3/Qwen3-Coder, DeepSeek-R1/V3/V3.1/V3.2, GPT-OSS, Kimi-K2, Llama-4, Mistral (HF vs mistral mode), Gemma-3/4, Phi-4, GLM. Prompt side only — output parsing lives in sibling skills.

3 Updated 2 days ago

AI & Automation Solid

vllm-configuration

Configure vLLM completely — YAML config file format, CLI arg precedence, full VLLM_*/HF_*/TRANSFORMERS_* env-var catalog, end-to-end recipe for air-gapped environments (internal HF mirrors, hf-mirror.com, ModelScope, HF_HUB_OFFLINE with pre-seeded cache, gated models offline, trust_remote_code supply-chain implications). VLLM_HOST_IP vs API-host confusion, Kubernetes-service-named-`vllm` env-var poisoning, usage-stats triple opt-out, YAML precedence surprises.

3 Updated 2 days ago

AI & Automation Solid

vllm-deployment

Use this skill when authoring, reviewing, or fixing a vLLM Kubernetes manifest, Docker/Podman pod, or OpenShift ServingRuntime — even when the user does not say "vllm". Triggers on: lab cluster performance practices, cache mount + survival across pod restarts (/root/.cache, VLLM_CACHE_ROOT, TORCHINDUCTOR_CACHE_DIR, TRITON_CACHE_DIR, "do we have caches saved"), HF_TOKEN secret in pod env, liveness + readiness probe tuning (initialDelaySeconds, failureThreshold, "pod takes 12 min to boot"), serve_args review, --enforce-eager rationale, MoE deployment ("ep2 dp2", --enable-expert-parallel, expert-parallel sizing), TP/PP sizing, ConfigMap parser-plugin mount, image tag selection, cold-boot reduction, multi-node LWS + Ray, control planes (llm-d, production-stack, AIBrix, NVIDIA Dynamo, KServe), KEDA autoscaling, GAIE routing, disaggregated prefill/decode (Nixl/Mooncake/LMCache/MORI-IO), RHAIIS on OpenShift (SCC, arbitrary UID, Routes 60s, ModelCar, air-gapped). Lead with operator intent, not vendor names.

3 Updated 2 days ago

AI & Automation Solid

vllm-gemma-4-31b

Operating-point reference for serving Gemma 4 31B on vLLM — TP sizing, max_model_len, max_num_seqs, gpu_memory_utilization, kv_cache_dtype, EAGLE3 spec-dec, chat_template choice.

3 Updated 2 days ago

AI & Automation Solid

vllm-input-modalities

vLLM non-chat inference surfaces — text embeddings (`/v1/embeddings`, `/v2/embed`), reranking/scoring (`/rerank`, `/score`), speech-to-text (`/v1/audio/transcriptions`, `/v1/audio/translations`), document OCR via VLMs. Covers 2026 `--runner pooling` (replacing `--task embed`), v0.20 deprecations (`score`→`classify`, multitask pooling, `encode`→`token_embed`+`token_classify`), Matryoshka/MRL, ColBERT/ColPali/ColQwen late-interaction MaxSim, Cohere v2 `/v2/embed`, Jina v3/v4/v5 quirks, cross-encoder score templates, Whisper large-v3-turbo quants, DeepSeek-OCR recipe (NGramPerReqLogitsProcessor, no prefix cache, GUNDAM mode).

3 Updated 2 days ago

AI & Automation Solid

vllm-nvidia-hardware

NVIDIA AI-hardware + vLLM-platform reference covering Hopper (H100/H200), Blackwell (B100/B200/B300) and Blackwell Ultra, Grace-Blackwell superchips and NVL72 racks (GB200, GB300), Vera Rubin (R100/R300) with VR200 NVL144 and Kyber NVL576, Dell PowerEdge XE family and IR5000/IR7000/IR9048 racks. Per-SKU HBM, FP4/FP8/FP16 TFLOPs, NVLink5, TDP, rack power/cooling (135 kW GB300, 180-220 kW NVL144, 600 kW Kyber), DLC vs RDHx, 800 VDC HVDC. Memory-wall roofline, HBM3E→HBM4 supply 2026. vLLM attention-backend × SM matrix, FP4/FP8 paths, KV connectors, Blackwell gotchas (SM103 TRTLLM hang, 270 vs 288 GB B300 bin split).

3 Updated 2 days ago

AI & Automation Solid

vllm-observability

Observe production vLLM — `/metrics` Prometheus surface (V1 engine), SLO-driven alerting on TTFT/ITL/queue/KV/preemption/aborts/corrupted-logits, shipping Grafana dashboards in `examples/observability/`, OTLP tracing with `--otlp-traces-endpoint` and `--collect-detailed-traces={model,worker,all}`, diagnostic rules to triage from /metrics alone — queue-grows + TPOT-stable means capacity, queue-stable + TPOT-grows means context/model, DCGM `SM_OCCUPANCY` is the real GPU-saturation signal not `GPU_UTIL`. V1 metric names (kv_cache_usage_perc), gpu_→kv_ rename saga, DCGM-exporter pairing, dashboard-lying pitfalls.

3 Updated 2 days ago
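
The two triage rules quoted in the vllm-observability description can be written down as a tiny classifier. The helper names, the 10% growth tolerance, and the "inconclusive" fallback are assumptions for illustration; only the two rule outcomes come from the description.

```python
from typing import Sequence


def is_growing(samples: Sequence[float], tolerance: float = 0.10) -> bool:
    """Hypothetical trend test: last sample exceeds the first by more than the tolerance."""
    return samples[-1] > samples[0] * (1.0 + tolerance)


def triage(queue_growing: bool, tpot_growing: bool) -> str:
    if queue_growing and not tpot_growing:
        return "capacity"        # queue grows while TPOT stays stable
    if tpot_growing and not queue_growing:
        return "context/model"   # queue stable while TPOT grows
    return "inconclusive"        # the description gives no rule for the other cases


# Waiting-queue depth climbs while time-per-output-token stays flat.
verdict = triage(is_growing([2, 5, 9]), is_growing([0.030, 0.031, 0.030]))
```

A production rule would more likely compare rates over a window in PromQL; the boolean inputs here only keep the decision table readable.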

AI & Automation Solid

vllm-omni

vLLM-Omni output-side multimodal generation — image (FLUX.1/2, Qwen-Image, GLM-Image, BAGEL, SD3.5, HunyuanImage-3.0), video (Wan2.1/2.2, LTX-2, HunyuanVideo-1.5), TTS (Qwen3-TTS, CosyVoice3, Voxtral-TTS), any-to-any omni (Qwen3-Omni, Qwen2.5-Omni, MiMo-Audio) via `vllm serve --omni`. Stage-based disaggregation (OmniConnector + Mooncake + RDMA), `/v1/images/generations`, async+sync `/v1/videos`, `/v1/audio/speech` with voice-upload, PCM16 WebSocket `/v1/realtime`, Ulysses/Ring SP + CFG-parallel, DiT FP8/INT8/GGUF, CUDA/ROCm/NPU/XPU/MUSA matrix, release pitfalls (v0.19.0rc1 FLUX regression, GLM-Image transformers>=5.0, Qwen3-TTS enforce-eager).

3 Updated 2 days ago

AI & Automation Solid

vllm-performance-tuning

vLLM performance-tuning operator reference — tuning workflow (baseline → bottleneck → knob → re-bench), fused-MoE kernel autotune (`benchmark_moe.py` generates `E=N,N=M,device_name=X.json` configs), DeepEP all-to-all + expert parallelism + EPLB, CUDA graph modes (FULL_AND_PIECEWISE default), torch.compile AOT + compile cache, scheduler knobs (`--max-num-batched-tokens`, `--max-num-seqs`, `--async-scheduling`), TP/EP/DP/PP decision tree, NCCL/DCGM on H100/H200/B200/GB200, PD disaggregation (Nixl/Mooncake/LMCache), known regressions + vendor quirks (v0.14→0.15.1 MiniMax, MI300X FP8<BF16, DeepGEMM M<128 TTFT).

3 Updated 2 days ago

AI & Automation Solid

vllm-quantization

vLLM datacenter-GPU quantization — picking, configuring, troubleshooting NVFP4, FP8, MXFP4, MXFP8, AWQ, GPTQ, INT8, compressed-tensors, modelopt, quark on H100/H200/B200/B300/GB200/GB300. 29 `--quantization` flag values, KV-cache dtypes (fp8_e4m3, nvfp4, per-token-head, turboquant), MoE backend selection (CUTLASS, TRTLLM, FlashInfer, DeepGEMM, Marlin, Qutlass), producing checkpoints with llm-compressor and NVIDIA ModelOpt (NVFP4_DEFAULT_CFG, FP8_DEFAULT_CFG, W4A16, SmoothQuant+GPTQ), online quantization (`fp8_per_tensor`, `fp8_per_block`), training EAGLE-3/dflash drafters on BF16 targets before PTQ, version gates per vLLM release (v0.14 → v0.21).

3 Updated 2 days ago

DevOps & Infrastructure Solid

redis-to-valkey

Migrate Redis deployments (especially Bitnami Redis Helm charts in Sentinel HA mode) to Valkey on Kubernetes, including fully air-gapped clusters. Core knowledge: the RDB-version wall (Valkey replicates/loads only from Redis ≤ 7.2.x; Redis 7.4+ writes RDB v12 which Valkey rejects; Valkey 9 writes its own v80 — a one-way door), the two transfer layers (version-bound REPLICAOF/DUMP-RESTORE vs version-agnostic logical replay with RedisShake or rdb-cli), a side-by-side cutover runbook, Valkey chart selection (groundhog2k / CloudPirates / official valkey-io tradeoffs), Bitnami-redis→valkey values translation, consumer-app reconnection (Sentinel discovery, master-set names, frozen redis_version 7.2.4), Prometheus exporter continuity, air-gap tool/image mirroring, and Argo CD source rewiring away from charts.bitnami.com. Part of the bitnami-exit suite.

3 Updated 2 days ago
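
The RDB-version wall in the redis-to-valkey description can be sketched as a small decision helper. This is only an illustration that assumes the version rule exactly as stated in that description (Valkey replicates/loads from Redis ≤ 7.2.x, while Redis 7.4+ is rejected). The function name and the layer labels are hypothetical and are not part of the skill.

```python
def usable_transfer_layers(redis_version: str) -> list[str]:
    """Return the transfer layers usable for a Redis -> Valkey move.

    Hypothetical helper: the labels are illustrative names for the two
    layers named in the description, not identifiers from any tool.
    """
    major, minor = (int(part) for part in redis_version.split(".")[:2])

    # Logical replay (RedisShake or rdb-cli) is described as version-agnostic,
    # so it is always available.
    layers = ["logical-replay"]

    # REPLICAOF / DUMP-RESTORE is version-bound: per the description it only
    # works when the source is Redis <= 7.2.x. Anything newer is treated
    # conservatively as blocked by the RDB-version wall.
    if (major, minor) <= (7, 2):
        layers.insert(0, "version-bound")
    return layers


if __name__ == "__main__":
    for version in ("6.2.14", "7.2.4", "7.4.0"):
        print(version, usable_transfer_layers(version))
```

Listing the version-bound layer first when it is available is a design choice of this sketch, not a recommendation from the skill itself.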

AI & Automation Solid

chat-completions-api

Reference for the OpenAI Chat Completions API (/v1/chat/completions) and legacy /v1/completions as the lingua-franca compatibility protocol: the official spec, including the deprecation timeline and Responses-only feature delta; how 7 local servers (vLLM, SGLang, llama.cpp, Ollama, mistral.rs, Llama Stack/OGX, Lemonade) actually implement it; gateways (LiteLLM, Bifrost); 10 cloud providers' CC-compat endpoints (Anthropic, Gemini, DeepSeek, xAI, Groq, OpenRouter, Azure...); the reasoning_content/reasoning field schism; finish_reason divergences; and client wire behavior (opencode, Vercel AI SDK). NOT for the Responses API (responses-api skill) or Anthropic Messages protocol (messages-api skill).

3 Updated 2 days ago

AI & Automation Solid

messages-api

Reference for the Anthropic Messages API (/v1/messages) as a third-party compatibility protocol — the 7 inference servers that implement it natively (vLLM, SGLang, llama.cpp, Ollama, mistral.rs, Llama Stack/OGX, Lemonade), gateways that adapt it (LiteLLM, Bifrost, Superagent Gateway), client behavior (Claude Code, opencode anthropic provider), Messages ↔ Chat Completions translation, the thinking-signature seam, stop_reason divergences, and streaming quirks. NOT for official Anthropic API usage (models, pricing, SDK) — that is the claude-api skill.

3 Updated 2 days ago

AI & Automation Solid

mimir-upgrade

Plan and run a controlled, COMMUNITY-edition Grafana Mimir upgrade on the `mimir-distributed` Helm chart, air-gap first — the chart↔app co-pinned ladder (5.7→5.8→6.0.6→6.1.0 = app 2.16→2.17→3.0.4→3.1.2), the classic-vs-ingest-storage decision (the chart ships a supported `classic-architecture.yaml`; `kafka.enabled: false` alone is NOT the switch and causes an ingestion outage), the community-specific nginx→gateway rename that silently moves the proxy's DNS name and breaks every remote_write client, the silent-no-op vs crashloop asymmetry between stale chart keys and stale app config, rollout-operator sequencing and the abort levers that deadlock a namespace, per-hop verification, and air-gap image/CRD/egress work. Companion to k8s-components-checker.

3 Updated 2 days ago
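
The co-pinned chart↔app ladder quoted in the mimir-upgrade description can be written down as data. The sketch below assumes that an upgrade walks the ladder one rung at a time, which is how the description's "ladder" and "per-hop verification" wording reads. The `upgrade_hops` helper is hypothetical, and the version pairs are copied from the description rather than verified against the chart.

```python
# (chart version, app version) pairs, in ladder order, as listed in the description.
LADDER = [
    ("5.7", "2.16"),
    ("5.8", "2.17"),
    ("6.0.6", "3.0.4"),
    ("6.1.0", "3.1.2"),
]


def upgrade_hops(current_chart: str, target_chart: str) -> list[tuple[str, str]]:
    """Return the (chart, app) rungs to apply, in order (hypothetical helper)."""
    charts = [chart for chart, _app in LADDER]
    if current_chart not in charts or target_chart not in charts:
        raise ValueError("chart version is not on the known ladder")
    start, end = charts.index(current_chart), charts.index(target_chart)
    if end < start:
        raise ValueError("target is older than current; downgrades are not modelled")
    # Every rung after the current one, up to and including the target.
    return LADDER[start + 1 : end + 1]


if __name__ == "__main__":
    for chart, app in upgrade_hops("5.7", "6.1.0"):
        print(f"upgrade to chart {chart} (app {app}), then verify before the next hop")
```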

AI & Automation Solid

responses-api

Reference for the OpenAI Responses API (/v1/responses), OpenResponses open standard, and Codex CLI. Covers the request/response schema, previous_response_id, Conversations API, server-side compaction, WebSocket transport, hosted Shell tool, Skills, tool_search, MCP connectors, prompt caching, phase field, 53 typed streaming events, 10-backend support matrix (vLLM, llama.cpp, mistral.rs, Ollama, LiteLLM, SGLang, Llama Stack, TensorRT-LLM, Bifrost, Lemonade), and Chat Completions translation with 17 gotchas.

3 Updated 2 days ago

AI & Automation Solid

open-webui-api

Administer Open WebUI entirely via its REST API (v0.10.x): user/group lifecycle, permissions, model catalog GitOps (export/import/sync), knowledge/RAG pipelines, config-as-code, SCIM provisioning, event webhooks, and backup surfaces. Grounded in the v0.10.2 source — covers the 458-path surface the official docs leave ~96% undocumented, the auth bootstrapping traps (ENABLE_API_KEYS default-off, JWT 4-week expiry, one unscoped key per user), and the 0.10.0 breaking changes (access_control→access_grants with inverted public/private defaults, flat dot-keyed config) that silently break every pre-0.10 script and most LLM training-data knowledge.

3 Updated 2 days ago

AI & Automation Solid

traefik-hardening

Harden a Traefik 2.x/3.x reverse proxy against abusive or unwanted traffic — per-identity rate + concurrency limiting, IP allowlisting, request-body buffering, middleware chaining, keying limits on a JWT claim/header instead of raw source IP, air-gapped plugin loading (localPlugins/WASM), and detection/response via access logs. Built on the principle that a proxy usually CANNOT reliably tell a "bad" client from a "good" one (User-Agent and even TLS/JA3 fingerprints are spoofable; real humans burst while abusive scripts crawl) — so cap per-identity resource use regardless of client type rather than trying to classify. Covers the 2.x→3.x middleware deltas (IPWhiteList→IPAllowList, Redis-backed distributed RateLimit, status-based Retry), the single-leader-counting trap, and where Traefik's job ends and the app/backend must take over.

3 Updated 2 days ago

DevOps & Infrastructure Solid

logging-operator

Configure and operate the kube-logging logging-operator (formerly Banzai Cloud) on Kubernetes — the CRD-driven log pipeline: Fluent Bit collector → fluentd or syslog-ng aggregator → outputs. Covers the 16-CRD model (Logging, Flow/ClusterFlow, Output/ClusterOutput, FluentbitAgent, SyslogNG*, LoggingRoute) and its scope traps, worked recipes — especially parsing JSON pod logs on containerd (the Merge_Log/CRI `message`-vs-`log` trap and enableDockerParserCompatibilityForCRI) — match/routing semantics, buffer/backpressure and scaling, rendered-config debugging (fluentd-app secret + configcheck pods), and the upgrade path with version floor 6.7.0 (CVE-2026-54680 config-injection RCE).

3 Updated 2 days ago

DevOps & Infrastructure Solid

rancher-logging-exit

Migrate off the Rancher-bundled `rancher-logging` chart (cattle-logging-system, rancher/mirrored-kube-logging-* images) to the upstream kube-logging logging-operator ≥6.7.0 — air-gap-first. Rancher 2.11 through 2.15-dev all bundle a frozen operator 4.10.0 that is inside the affected range of CVE-2026-54680 (CVSS 9.9 config-injection RCE, no SUSE fix) — so the exit is security-urgent. Covers the maintainer-endorsed helm-release-secret strategy (near-zero gap; NOT `helm uninstall rancher-logging-crd`, which cascade-deletes every CR and the data plane), CR compatibility 4.10→6.7 (silent field pruning), server-side CRD apply (828KB CRDs), buffer-PVC preservation, air-gap image/chart mirroring, rollback, and stale-CRD debris cleanup.

3 Updated 2 days ago

API & Backend Solid

postgres-operator-cloudnative-pg-migration

Migrate PostgreSQL clusters from the Zalando postgres-operator (acid.zalan.do `postgresql` CRs, Spilo/Patroni, WAL-G) to CloudNativePG (CNPG) on Kubernetes, including fully air-gapped clusters. Core knowledge: the two walls (Spilo↔CNPG glibc/collation divergence that corrupts physically-copied indexes; WAL-G↔Barman archive incompatibility that strands old backups), three migration paths (logical replication default; initdb.import for small DBs/PG≤13; pg_basebackup as the discouraged same-major exception), the acid.zalan.do→Cluster manifest field map with no-equivalent gaps (preparedDatabases, sidecars, logical-backup cron), backup re-plumbing onto the barman-cloud CNPG-I plugin, HA parity (synchronous + failoverQuorum vs Patroni failsafe), consumer cutover (service/secret renames, scram, cnpg_ metrics), air-gap mirroring, stay-vs-migrate evidence.

3 Updated 2 days ago

DevOps & Infrastructure Solid

snmp-exporter

Best practices for Prometheus snmp_exporter (v0.30.x): writing generator.yml modules, curating MIB walks, SNMPv2c/v3 auth, timeout tuning, Kubernetes deployment (Probe/ScrapeConfig CRDs, secrets, UDP egress), local docker testing, and debugging failed scrapes. Includes worked device references for Dell iDRAC 9/10, Cisco CBS250/350 (+ Catalyst 1200/1300), and NVIDIA/Mellanox Onyx switches.

3 Updated 2 days ago

AI & Automation Listed

baml-expert

BAML (Boundary ML) expert for projects defining LLM calls as typed functions in .baml files with a generated Python client. Use whenever the repo contains baml_src/, baml_client/, baml-cli commands, or imports from baml_py / baml_client. Covers .baml syntax (function, class, enum, client, test, retry_policy, attributes), Python integration (baml_client sync/async, streaming, ClientRegistry, Collector, TypeBuilder), Schema-Aligned Parsing, ctx.output_format, @@assert / @@check tests, @stream.done / @stream.not_null / @stream.with_state streaming, multimodal (image/audio/pdf), and debugging via BAML_LOG plus Boundary Studio. Triggers even when BAML is not named, for example on "add an LLM function", "fix a failing parse", "add a test for the prompt", or "stream the response" in a project with baml_src/. Prefer this skill over raw LLM-SDK guidance here; defer to jinja-expert for standalone chat-template / .j2 work.

3 Updated 2 days ago

AI & Automation Listed

netbox-best-practices

NetBox 4.2-4.6 deployment and upgrade knowledge that the official netboxlabs/skills marketplace does NOT cover - use for deploying or upgrading NetBox on Kubernetes with the netbox-community helm chart (netbox-chart), external PostgreSQL/valkey wiring, API token bootstrap on 4.5+ (nbt_ v2 tokens), plugin installation in the official image, version-migration planning between NetBox 4.2 and 4.6, module type profiles, and front/rear port (patch panel) API changes. Trigger on "netbox helm", "netbox chart", "netbox kubernetes", "netbox upgrade", "netbox plugin install", "netbox api token bootstrap", "netbox 4.x breaking changes", "netbox oidc/sso group mapping" or "netbox sso hardening", or seeding/automation that must survive a NetBox version bump. For general NetBox data modeling, IPAM design, Diode, or validation questions - and for turning on an auth backend in the first place - prefer the official netboxlabs/skills marketplace skills (netbox-administration); this skill only covers the gaps.

3 Updated 2 days ago

DevOps & Infrastructure Listed

openshift-app

Package applications for OpenShift deployment: container images (UBI, arbitrary UID, multi-stage builds), packaging formats (Helm, Kustomize, Operators, OLM v1), CI/CD (Tekton, ArgoCD, Shipwright, Conforma), security (SCC, PSA, supply chain, image signing, secrets), operations (Routes, probes, scaling, monitoring, storage), disconnected/air-gapped patterns, and critical gotchas. Also when an app "works on Kubernetes but fails on OpenShift" (SCC denied, random/arbitrary UID, permission errors). Covers OCP 4.14-4.22. NOT for cluster installation or infrastructure management.

3 Updated 2 days ago
