vector-index-tuning

Optimize vector index performance for latency, recall, and memory. Use when tuning HNSW parameters, selecting quantization strategies, or scaling vector search infrastructure.

Install: claude install-skill CodeWithBehnam/cc-docs

# Vector Index Tuning

Guide to optimizing vector indexes for production performance.

## When to Use This Skill

- Tuning HNSW parameters
- Implementing quantization
- Optimizing memory usage
- Reducing search latency
- Balancing recall vs speed
- Scaling to billions of vectors

## Core Concepts

### 1. Index Type Selection

```
Data Size           Recommended Index
────────────────────────────────────────
< 10K vectors    →  Flat (exact search)
10K - 1M         →  HNSW
1M - 100M        →  HNSW + Quantization
> 100M           →  IVF + PQ or DiskANN
```

### 2. HNSW Parameters

| Parameter          | Default | Effect                                               |
| ------------------ | ------- | ---------------------------------------------------- |
| **M**              | 16      | Connections per node, ↑ = better recall, more memory |
| **efConstruction** | 100     | Build quality, ↑ = better index, slower build        |
| **efSearch**       | 50      | Search quality, ↑ = better recall, slower search     |

### 3. Quantization Types

```
Full Precision (FP32): 4 bytes × dimensions
Half Precision (FP16): 2 bytes × dimensions
INT8 Scalar:           1 byte × dimensions
Product Quantization:  ~32-64 bytes total
Binary:                dimensions/8 bytes
```

## Templates

### Template 1: HNSW Parameter Tuning

```python
import numpy as np
from typing import List, Tuple
import time

def benchmark_hnsw_parameters(
    vectors: np.ndarray,
    queries: np.ndarray,
    ground_truth