
fiftyone-find-duplicates

Find duplicate or near-duplicate images in FiftyOne datasets using brain similarity computation. Use when users want to deduplicate datasets, find similar images, cluster visually similar content, or remove redundant samples. Requires FiftyOne MCP server with @voxel51/brain plugin installed.
aiskillstore/marketplace · ★ 329 · AI & Automation · score 79
Install: claude install-skill aiskillstore/marketplace
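The description above refers to finding near-duplicates through similarity computed over image embeddings. As a rough, library-independent sketch of that idea (not the FiftyOne brain plugin's actual implementation), the snippet below flags any sample whose embedding has a cosine similarity at or above a threshold with a sample that was already kept. The function names, the toy three-dimensional vectors, and the 0.95 threshold are all hypothetical and chosen only for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_near_duplicates(embeddings, threshold=0.95):
    """Greedy pass: map each flagged sample id to the kept sample it resembles.

    `embeddings` is a dict of sample id -> embedding vector. The first sample
    in each group of similar vectors is kept; later ones are flagged.
    """
    kept = []
    duplicates = {}
    for sample_id, vector in embeddings.items():
        for kept_id in kept:
            if cosine_similarity(vector, embeddings[kept_id]) >= threshold:
                duplicates[sample_id] = kept_id
                break
        else:
            # No kept sample is similar enough, so keep this one.
            kept.append(sample_id)
    return duplicates


# Hypothetical toy embeddings standing in for model outputs.
toy_embeddings = {
    "img_a": [1.0, 0.0, 0.0],
    "img_b": [0.99, 0.05, 0.0],  # nearly the same direction as img_a
    "img_c": [0.0, 1.0, 0.0],    # orthogonal to img_a
    "img_d": [1.0, 0.0, 0.0],    # exact copy of img_a
}
duplicates = find_near_duplicates(toy_embeddings, threshold=0.95)
print(duplicates)
```

Raising the threshold toward 1.0 narrows the result to exact or almost exact copies, while lowering it groups looser visual matches. This pairwise scan is quadratic in the number of samples, so it is only meant to convey the concept on small inputs.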
# Find Duplicates in FiftyOne Datasets

## Overview

Find and remove duplicate or near-duplicate images using FiftyOne's brain similarity operators. Uses deep learning embeddings to identify visually similar images.

**Use this skill when:**

- Removing duplicate images from datasets
- Finding near-duplicate images (similar but not identical)
- Clustering visually similar images
- Cleaning datasets before training

## Prerequisites

- FiftyOne MCP server installed and running
- `@voxel51/brain` plugin installed and enabled
- Dataset with image samples loaded in FiftyOne

## Key Directives

**ALWAYS follow these rules:**

### 1. Set context first

```python
set_context(dataset_name="my-dataset")
```

### 2. Launch FiftyOne App

Brain operators are delegated and require the app:

```python
launch_app()
```

Wait 5-10 seconds for initialization.

### 3. Discover operators dynamically

```python
# List all brain operators
list_operators(builtin_only=False)

# Get schema for specific operator
get_operator_schema(operator_uri="@voxel51/brain/compute_similarity")
```

### 4. Compute embeddings before finding duplicates

```python
execute_operator(
    operator_uri="@voxel51/brain/compute_similarity",
    params={"brain_key": "img_sim", "model": "mobilenet-v2-imagenet-torch"}
)
```

### 5. Close app when done

```python
close_app()
```

## Complete Workflow

### Step 1: Setup

```python
# Set context
set_context(dataset_name="my-dataset")

# Launch app (required for brain operators)
launch_ap