building-rag-systems

Build production RAG systems with semantic chunking, incremental indexing, and filtered retrieval. Use when implementing document ingestion pipelines, vector search with Qdrant, or context-aware retrieval. Covers chunking strategies, change detection, payload indexing, and context expansion. NOT when doing simple similarity search without production requirements.
aiskillstore/marketplace · ★ 329 · AI & Automation · score 79

Install: claude install-skill aiskillstore/marketplace

# Building RAG Systems

Production-grade RAG with semantic chunking, incremental updates, and filtered retrieval.

## Quick Start

```bash
# Dependencies
pip install qdrant-client openai pydantic python-frontmatter

# Core components
# 1. Crawler  → discovers files, extracts path metadata
# 2. Parser   → extracts frontmatter, computes file hash
# 3. Chunker  → semantic split on ## headers, 400 tokens, 15% overlap
# 4. Embedder → batched OpenAI embeddings
# 5. Uploader → Qdrant upsert with indexed payloads
```

---

## Ingestion Pipeline

### Architecture

```
┌──────────┐    ┌────────┐    ┌─────────┐    ┌──────────┐    ┌──────────┐
│ Crawler  │ -> │ Parser │ -> │ Chunker │ -> │ Embedder │ -> │ Uploader │
└──────────┘    └────────┘    └─────────┘    └──────────┘    └──────────┘
     │              │              │              │               │
 Discovers      Extracts       Splits by      Generates      Upserts to
 files          frontmatter    semantic       vectors        Qdrant
                + file hash    boundaries     (batched)      (batched)
```

### Semantic Chunking (NOT Fixed-Size)

```python
import re


class SemanticChunker:
    """
    Production chunking:
    - Split on ## headers (semantic boundaries)
    - Target 400 tokens (NVIDIA benchmark optimal)
    - 15% overlap for context continuity
    - Track prev/next for context expansion
    """

    # Zero-width lookahead: split before each line that starts with "## "
    SECTION_PATTERN = re.compile(r"(?=^## )", re.MULTILINE)
    # Rough words-to-tokens ratio used for size estimates
    TOKENS_PER_WORD = 1.3

    de