clip

Featured

OpenAI's model connecting vision and language. Enables zero-shot image classification, image-text matching, and cross-modal retrieval. Trained on 400M image-text pairs. Use for image search, content moderation, or vision-language tasks without fine-tuning. Best for general-purpose image understanding.

AI & Automation · 27,984 stars · 2,901 forks · Updated today · MIT

Quality Score: 99/100

Stars (20%): 100
Recency (20%): 100
Frontmatter (20%): 70
Documentation (15%): 100
Issue Health (10%): 50
License (10%): 100
Description (5%): 100

Skill Content

# CLIP - Contrastive Language-Image Pre-Training

OpenAI's model that understands images from natural language.

## When to use CLIP

**Use when:**
- Zero-shot image classification (no training data needed)
- Image-text similarity/matching
- Semantic image search
- Content moderation (detect NSFW, violence)
- Visual question answering
- Cross-modal retrieval (image→text, text→image)

**Metrics**:
- **25,300+ GitHub stars**
- Trained on 400M image-text pairs
- Matches ResNet-50 on ImageNet (zero-shot)
- MIT License

**Use alternatives instead**:
- **BLIP-2**: Better captioning
- **LLaVA**: Vision-language chat
- **Segment Anything**: Image segmentation

## Quick start

### Installation

```bash
pip install git+https://github.com/openai/CLIP.git
pip install torch torchvision ftfy regex tqdm
```

### Zero-shot classification

```python
import torch
import clip
from PIL import Image

# Load model
device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

# Load image
image = preprocess(Image.open("photo.jpg")).unsqueeze(0).to(device)

# Define possible labels
text = clip.tokenize(["a dog", "a cat", "a bird", "a car"]).to(device)

# Compute similarity
with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)

    # Cosine similarity
    logits_per_image, logits_per_text = model(image, text)
    probs = logits_per_image.softmax(dim=-1).cpu().numpy()

# Print results
labels ...
```
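
The scoring step in the zero-shot snippet (scaled cosine similarity between the image and each label, followed by a softmax) can be illustrated without downloading the model. The sketch below is a pure-Python stand-in, not CLIP's actual implementation: the three-dimensional embeddings are made-up toy values, `label_probabilities` is a hypothetical helper name, and the scale of 100 is an assumption borrowed from the feature-based example in the CLIP README.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def label_probabilities(image_vec, text_vecs, scale=100.0):
    """Hypothetical helper: softmax over scaled cosine similarities.

    `scale` is assumed to be 100 here; a real model supplies its own
    learned temperature.
    """
    img = l2_normalize(image_vec)
    logits = []
    for text_vec in text_vecs:
        txt = l2_normalize(text_vec)
        cosine = sum(a * b for a, b in zip(img, txt))
        logits.append(scale * cosine)
    return softmax(logits)


# Toy embeddings (made up for illustration, not real CLIP features).
image_vec = [0.9, 0.1, 0.0]
text_vecs = {
    "a dog": [1.0, 0.0, 0.0],
    "a cat": [0.0, 1.0, 0.0],
    "a bird": [0.0, 0.0, 1.0],
}

probs = label_probabilities(image_vec, list(text_vecs.values()))
for label, p in zip(text_vecs, probs):
    print(f"{label}: {p:.4f}")
```

Because both vectors are normalized before the dot product, only their direction matters. The large scale factor is what makes the softmax sharply favor the closest label instead of producing nearly uniform probabilities.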

Details

Author: davila7
Repository: davila7/claude-code-templates
Created: 11 months ago
Last Updated: today
Language: Python
License: MIT