You are an expert in Retrieval-Augmented Generation (RAG) system design. Apply these patterns when building or improving RAG pipelines.
**Indexing Pipeline**
- Document loading: normalize formats early (PDF/HTML/DOCX → plain text + metadata)
- Metadata extraction: always store source URL, title, section, date, author per chunk
- Preprocessing: remove headers/footers, deduplicate near-identical paragraphs, fix encoding
- Use a document fingerprint (SHA-256 of content) to detect and skip unchanged documents on re-index
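The fingerprint check above can be sketched in a few lines. This is a minimal illustration, not a full indexer; `needs_reindex` and the `seen` store are hypothetical names for whatever persistence your pipeline uses.

```python
import hashlib

def fingerprint(text: str) -> str:
    """Stable content hash used to detect unchanged documents."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def needs_reindex(doc_id: str, text: str, seen: dict[str, str]) -> bool:
    """Return True only when the document's content hash changed."""
    fp = fingerprint(text)
    if seen.get(doc_id) == fp:
        return False  # unchanged -> skip chunking/embedding entirely
    seen[doc_id] = fp
    return True
```

On re-index, documents whose hash matches the stored one skip the (expensive) embedding step entirely.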
**Chunking Strategies**
- Fixed-size: 512 tokens with 50-token overlap — baseline, works for homogeneous content
- Semantic: split on paragraph/heading boundaries — better for structured documents
- Hierarchical: store parent section + child chunk; retrieve child, include parent for context
- Sentence-window: embed sentences, but retrieve surrounding 3-sentence window for richer context
- Avoid splitting code blocks, tables, or lists across chunk boundaries
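The fixed-size baseline is simple enough to sketch directly. This assumes tokens are already produced by your tokenizer of choice; `chunk_fixed` is an illustrative name, and the stride arithmetic (`size - overlap`) is the whole trick.

```python
def chunk_fixed(tokens: list[str], size: int = 512, overlap: int = 50) -> list[list[str]]:
    """Fixed-size chunks with overlap; consecutive chunks share `overlap` tokens."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    stride = size - overlap
    return [tokens[i:i + size] for i in range(0, max(len(tokens) - overlap, 1), stride)]
```

For semantic or hierarchical chunking you would split on heading/paragraph boundaries first, then apply a size cap like this within each section.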
**Embedding & Vector Store**
- Use domain-appropriate models: text-embedding-3-large for general text, a code-specific model for code search
- Normalize embeddings before storing (cosine similarity becomes dot product → faster queries)
- Vector store selection: pgvector for <1M docs + existing Postgres; Pinecone/Qdrant for scale
- Always store both the embedding and the raw chunk text; don't reconstruct from embedding
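The normalization point deserves a concrete check: on unit-length vectors, a plain dot product equals cosine similarity, which is why many vector stores run faster on normalized embeddings. A minimal stdlib sketch (real pipelines would use numpy):

```python
import math

def l2_normalize(vec: list[float]) -> list[float]:
    """Scale a vector to unit length; guard against the zero vector."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def dot(a: list[float], b: list[float]) -> float:
    """Dot product == cosine similarity once both inputs are normalized."""
    return sum(x * y for x, y in zip(a, b))
```

Normalize once at index time, so every query needs only the cheap dot product.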
**Retrieval Patterns**
- Hybrid search: dense retrieval (top-20) + BM25 keyword (top-20) → Reciprocal Rank Fusion
- Query expansion: generate 3 paraphrases of the user query, retrieve for each, merge results
- HyDE (Hypothetical Document Embedding): generate a fake answer, embed it, use as query vector
- Reranking: use a cross-encoder (ms-marco-MiniLM) to rerank top-20 → top-5 before LLM call
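Reciprocal Rank Fusion, used above to merge dense and BM25 result lists, is short enough to show in full. This is the standard formulation (score = Σ 1/(k + rank), k = 60 by convention); the `rrf` name and list-of-IDs interface are illustrative.

```python
def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse multiple ranked lists: score(d) = sum over lists of 1/(k + rank)."""
    scores: dict[str, float] = {}
    for ranked in rankings:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

Documents appearing in both lists accumulate score from each, so agreement between dense and keyword retrieval is rewarded without any score calibration between the two systems.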
**Evaluation Metrics**
- Retrieval: NDCG@5, MRR, Recall@K using a labeled question-answer-source test set
- Generation: RAGAS framework — faithfulness (no hallucination), answer relevancy, context precision
- End-to-end: human eval on 50 golden questions per domain quarterly
- Regression tests: any answer quality drop >5% on golden set blocks deployment
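Two of the retrieval metrics above are simple enough to compute by hand, which helps when wiring up the golden-set regression gate. A minimal sketch with hypothetical names (`retrieved` is the ranked chunk/source IDs, `relevant` the labeled ground truth):

```python
def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of labeled relevant sources found in the top-k results."""
    return len(set(retrieved[:k]) & relevant) / len(relevant)

def mrr(results: list[tuple[list[str], set[str]]]) -> float:
    """Mean reciprocal rank of the first relevant hit across queries."""
    total = 0.0
    for retrieved, relevant in results:
        for rank, doc in enumerate(retrieved, start=1):
            if doc in relevant:
                total += 1.0 / rank
                break
    return total / len(results)
```

NDCG@5 and the RAGAS generation metrics need a framework, but Recall@K and MRR like this are enough for a fast pre-deploy check.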
**Production Patterns**
- Cache embedding computations by content hash to reduce API costs
- Implement query routing: classify query type, use specialized indexes per domain
- Add a "no relevant context found" fallback — better to say unknown than hallucinate
- Log all queries + retrieved chunks + LLM responses for offline quality analysis
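The content-hash embedding cache from the first bullet can be sketched as a thin wrapper. `embed_fn` here is a hypothetical stand-in for whatever embedding API you call; the cache dict would be a persistent store (Redis, SQLite) in production.

```python
import hashlib
from typing import Callable

def cached_embed(text: str,
                 cache: dict[str, list[float]],
                 embed_fn: Callable[[str], list[float]]) -> list[float]:
    """Key the cache by content hash; call the embedding API only on a miss."""
    key = hashlib.sha256(text.encode("utf-8")).hexdigest()
    if key not in cache:
        cache[key] = embed_fn(text)
    return cache[key]
```

Because the key is the content hash rather than a document ID, re-indexing runs and duplicate chunks across documents both hit the cache for free.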
`npx mindaxis apply rag-implementation --target cursor --scope project`