computer-vision

Solid

Use this skill when building computer vision applications, implementing image classification, object detection, or segmentation pipelines. Triggers on image classification, object detection, YOLO, semantic segmentation, image preprocessing, data augmentation, transfer learning, CNN architectures, vision transformers, and any task requiring visual recognition or image analysis.

Data & Documents · 164 stars · 28 forks · Updated yesterday · MIT

Quality Score: 92/100

Stars (weight 20%): 74
Recency (weight 20%): 100
Frontmatter (weight 20%): 70
Documentation (weight 15%): 100
Issue Health (weight 10%): 50
License (weight 10%): 100
Description (weight 5%): 100

Skill Content

When this skill is activated, always start your first response with the 🧢 emoji.

# Computer Vision

Computer vision enables machines to interpret and reason about visual data - images, video, and multi-modal inputs. Modern CV pipelines are built on deep neural networks pretrained on large datasets (ImageNet, COCO, ADE20K) and fine-tuned for specific domains. PyTorch and its ecosystem (torchvision, timm, ultralytics, albumentations) cover the full stack from data loading through deployment. Foundation models like SAM, DINOv2, and OpenCLIP have shifted best practice toward prompt-based and zero-shot approaches before committing to full training runs.

---

## When to use this skill

Trigger this skill when the user:

- Trains or fine-tunes an image classifier on a custom dataset
- Runs inference with YOLO, DETR, or other detection models
- Builds a semantic or instance segmentation pipeline
- Implements data augmentation for CV training
- Preprocesses images for model ingestion (resize, normalize, batch)
- Exports a vision model to ONNX or optimizes with TensorRT
- Evaluates a vision model (mAP, confusion matrix, per-class metrics)
- Implements a U-Net, DeepLabV3, or similar segmentation architecture

Do NOT trigger this skill for:

- Pure NLP tasks with no visual component (use a language-model skill instead)
- 3D point-cloud processing or LiDAR-only pipelines (overlap is limited; check domain)

---

## Key principles

1. **Start with pretrained models** - Fine-tune ImageNet/C...
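The evaluation trigger listed above (confusion matrix, per-class metrics) can be illustrated with a minimal, dependency-free sketch. This is not taken from the skill itself: the function names `confusion_matrix` and `per_class_metrics` and the sample labels are hypothetical, and a real pipeline would more likely rely on an established metrics library.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Build a matrix where rows are true classes and columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def per_class_metrics(matrix):
    """Return one (precision, recall) tuple per class; 0.0 where a ratio is undefined."""
    n = len(matrix)
    metrics = []
    for c in range(n):
        tp = matrix[c][c]
        predicted = sum(matrix[r][c] for r in range(n))  # column sum: predicted as c
        actual = sum(matrix[c])                          # row sum: truly c
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        metrics.append((precision, recall))
    return metrics


# Hypothetical integer labels for a 3-class classifier (illustration only).
y_true = [0, 0, 1, 1, 2, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0, 2]

cm = confusion_matrix(y_true, y_pred, num_classes=3)
metrics = per_class_metrics(cm)

for class_id, (precision, recall) in enumerate(metrics):
    print(f"class {class_id}: precision={precision:.2f} recall={recall:.2f}")
```

Reporting precision and recall per class, instead of a single accuracy figure, makes it easier to spot a class the model systematically confuses with another.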

Details

Author: AbsolutelySkilled
Repository: AbsolutelySkilled/AbsolutelySkilled
Created: 2 months ago
Last Updated: yesterday
Language: MDX
License: MIT

Related Skills