rag

Solid

Implements document chunking, embedding generation, vector storage, and retrieval pipelines for Retrieval-Augmented Generation systems. Use when building RAG applications, creating document Q&A systems, or integrating AI with knowledge bases.

AI & Automation | 256 stars | 28 forks | Updated 6 days ago | MIT

Quality Score: 91/100

Stars (weight 20%): 80
Recency (weight 20%): 100
Frontmatter (weight 20%): 70
Documentation (weight 15%): 100
Issue Health (weight 10%): 50
License (weight 10%): 100
Description (weight 5%): 100

Skill Content

# RAG Implementation

Build Retrieval-Augmented Generation systems that extend AI capabilities with external knowledge sources.

## Overview

This skill covers document processing, embedding generation, vector storage, retrieval configuration, and RAG pipeline implementation.

## When to Use

- Building Q&A systems over proprietary documents
- Creating chatbots with factual information from knowledge bases
- Implementing semantic search with natural language queries
- Reducing hallucinations with grounded, sourced responses
- Building documentation assistants and research tools
- Enabling AI systems to access domain-specific knowledge

## Instructions

### Step 1: Choose Vector Database

Select based on your requirements:

| Requirement | Recommended |
|-------------|-------------|
| Production scalability | Pinecone, Milvus |
| Open-source | Weaviate, Qdrant |
| Local development | Chroma, FAISS |
| Hybrid search | Weaviate with BM25 |

### Step 2: Select Embedding Model

| Use Case | Model |
|----------|-------|
| General purpose | text-embedding-ada-002 |
| Fast and lightweight | all-MiniLM-L6-v2 |
| Multilingual | multilingual-e5-large |
| Best performance | bge-large-en-v1.5 |

### Step 3: Implement Document Processing Pipeline

1. Load documents from source (file system, database, API)
2. Clean and preprocess (remove formatting, normalize text)
3. Split documents into chunks with appropriate strategy
4. Generate embeddings for each chunk
5. Store embeddings in vector database wi...
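The document processing pipeline in Step 3 (load, clean, chunk, embed, store) can be sketched end to end with the standard library alone. This is a minimal illustration and not the skill's own implementation. Every name in it (`clean_text`, `chunk_words`, `toy_embed`, `InMemoryStore`, `ingest`) is hypothetical. `toy_embed` is a hashed bag-of-words stand-in for a real embedding model from Step 2, and `InMemoryStore` is a brute-force stand-in for a vector database from Step 1. The word-window chunking strategy, the chunk sizes, and the `source`/`chunk_index` metadata are assumptions of this sketch.

```python
import hashlib
import math
import re
from dataclasses import dataclass, field


def clean_text(raw: str) -> str:
    """Step 2: collapse whitespace runs so formatting noise does not affect chunking."""
    return re.sub(r"\s+", " ", raw).strip()


def chunk_words(text: str, chunk_size: int = 50, overlap: int = 10) -> list[str]:
    """Step 3: fixed-size word windows; each shares `overlap` words with the previous one."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the current window already reached the end of the text
    return chunks


def toy_embed(text: str, dim: int = 256) -> list[float]:
    """Step 4 stand-in: hashed bag-of-words, L2-normalised. Not a real embedding model."""
    vec = [0.0] * dim
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        # md5 keeps bucket assignment stable across runs (built-in hash() is salted).
        bucket = int(hashlib.md5(token.encode()).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


@dataclass
class InMemoryStore:
    """Step 5 stand-in: rows of (vector, chunk, metadata) searched by brute force."""
    rows: list = field(default_factory=list)

    def add(self, vector: list[float], chunk: str, metadata: dict) -> None:
        self.rows.append((vector, chunk, metadata))

    def search(self, query_vector: list[float], k: int = 3) -> list[tuple]:
        # Vectors are unit length, so the dot product equals cosine similarity.
        scored = [
            (sum(q * v for q, v in zip(query_vector, vector)), chunk, metadata)
            for vector, chunk, metadata in self.rows
        ]
        return sorted(scored, key=lambda row: row[0], reverse=True)[:k]


def ingest(documents: dict[str, str], store: InMemoryStore) -> int:
    """Run steps 2 to 5 over already-loaded documents; returns the number of chunks stored."""
    count = 0
    for source, raw in documents.items():
        for index, chunk in enumerate(chunk_words(clean_text(raw))):
            store.add(toy_embed(chunk), chunk, {"source": source, "chunk_index": index})
            count += 1
    return count


if __name__ == "__main__":
    store = InMemoryStore()
    ingest({"local.txt": "Chroma and FAISS suit local development."}, store)
    for score, chunk, metadata in store.search(toy_embed("local development"), k=1):
        print(round(score, 3), metadata["source"], chunk)
```

In a real system, `toy_embed` would be replaced by a call to the chosen embedding model and `InMemoryStore` by the chosen vector database client. Keeping the two behind small functions like these makes that swap local to one place.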

Details

Author: giuseppe-trisciuoglio
Repository: giuseppe-trisciuoglio/developer-kit
Created: 7 months ago
Last Updated: 6 days ago
Language: Python
License: MIT
