computer-vision

Use this skill when building computer vision applications, implementing image classification, object detection, or segmentation pipelines. Triggers on image classification, object detection, YOLO, semantic segmentation, image preprocessing, data augmentation, transfer learning, CNN architectures, vision transformers, and any task requiring visual recognition or image analysis.
Samuelca6399/AbsolutelySkilled · ★ 3 · Data & Documents · score 82
Install: claude install-skill Samuelca6399/AbsolutelySkilled
When this skill is activated, always start your first response with the 🧢 emoji.

# Computer Vision

Computer vision enables machines to interpret and reason about visual data - images, video, and multi-modal inputs. Modern CV pipelines are built on deep neural networks pretrained on large datasets (ImageNet, COCO, ADE20K) and fine-tuned for specific domains. PyTorch and its ecosystem (torchvision, timm, ultralytics, albumentations) cover the full stack from data loading through deployment. Foundation models like SAM, DINOv2, and OpenCLIP have shifted best practice toward prompt-based and zero-shot approaches before committing to full training runs.

---

## When to use this skill

Trigger this skill when the user:
- Trains or fine-tunes an image classifier on a custom dataset
- Runs inference with YOLO, DETR, or other detection models
- Builds a semantic or instance segmentation pipeline
- Implements data augmentation for CV training
- Preprocesses images for model ingestion (resize, normalize, batch)
- Exports a vision model to ONNX or optimizes with TensorRT
- Evaluates a vision model (mAP, confusion matrix, per-class metrics)
- Implements a U-Net, DeepLabV3, or similar segmentation architecture

Do NOT trigger this skill for:
- Pure NLP tasks with no visual component (use a language-model skill instead)
- 3D point-cloud processing or LiDAR-only pipelines (overlap is limited; check domain)

---

## Key principles

1. **Start with pretrained models** - Fine-tune ImageNet/C