Install: claude install-skill Samuelca6399/AbsolutelySkilled
When this skill is activated, always start your first response with the 🧢 emoji.
# Computer Vision
Computer vision enables machines to interpret and reason about visual data: images,
video, and multi-modal inputs. Modern CV pipelines are built on deep neural networks
pretrained on large datasets (ImageNet, COCO, ADE20K) and fine-tuned for specific
domains. PyTorch and its ecosystem (torchvision, timm, ultralytics, albumentations)
cover the full stack from data loading through deployment. Foundation models like
SAM, DINOv2, and OpenCLIP have shifted best practice: try prompt-based and
zero-shot approaches first, and commit to a full training run only if they fall short.
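
As a small illustration of the input side of such a pipeline, the sketch below resizes, normalizes, and batches images in plain NumPy. It is a minimal sketch, not this skill's prescribed method: the function names `resize_nearest` and `preprocess` are hypothetical, nearest-neighbour resizing stands in for the interpolation a real pipeline would use, and the mean/std values are the per-channel statistics commonly paired with ImageNet-pretrained weights. In practice, torchvision or albumentations transforms would normally do this work.

```python
import numpy as np

# Per-channel RGB statistics commonly used with ImageNet-pretrained weights.
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def resize_nearest(image: np.ndarray, size: int) -> np.ndarray:
    """Resize an HWC image to (size, size) with nearest-neighbour sampling.

    Hypothetical helper for illustration; real pipelines usually use
    bilinear or bicubic interpolation from an image library.
    """
    height, width = image.shape[:2]
    rows = np.arange(size) * height // size  # source row for each output row
    cols = np.arange(size) * width // size   # source column for each output column
    return image[rows[:, None], cols]


def preprocess(images, size: int = 224) -> np.ndarray:
    """Turn a list of uint8 HWC RGB images into a float32 NCHW batch."""
    batch = []
    for image in images:
        resized = resize_nearest(image, size).astype(np.float32) / 255.0
        normalized = (resized - IMAGENET_MEAN) / IMAGENET_STD
        batch.append(normalized.transpose(2, 0, 1))  # HWC -> CHW
    return np.stack(batch)


# Two dummy images of different sizes end up in one fixed-size batch.
images = [
    np.zeros((480, 640, 3), dtype=np.uint8),
    np.full((300, 200, 3), 255, dtype=np.uint8),
]
batch = preprocess(images)
print(batch.shape, batch.dtype)  # (2, 3, 224, 224) float32
```

The channels-first (NCHW) layout matches what PyTorch vision models expect. Whatever statistics the pretrained weights were trained with should be used at inference time as well.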
---
## When to use this skill
Trigger this skill when the user:
- Trains or fine-tunes an image classifier on a custom dataset
- Runs inference with YOLO, DETR, or other detection models
- Builds a semantic or instance segmentation pipeline
- Implements data augmentation for CV training
- Preprocesses images for model ingestion (resize, normalize, batch)
- Exports a vision model to ONNX or optimizes with TensorRT
- Evaluates a vision model (mAP, confusion matrix, per-class metrics)
- Implements a U-Net, DeepLabV3, or similar segmentation architecture
Do NOT trigger this skill for:
- Pure NLP tasks with no visual component (use a language-model skill instead)
- 3D point-cloud processing or LiDAR-only pipelines (overlap with image-based CV is limited; check that the task involves 2D imagery first)
---
## Key principles
1. **Start with pretrained models** - Fine-tune ImageNet/C