Demo, AI/ML

ContextIQ: Production RAG System

Production RAG API supporting 11+ file formats with TF-IDF embeddings, ChromaDB vector search, and smart caching.

FastAPI, ChromaDB, LangChain, Python

The Problem

Most RAG demos handle PDFs only. Real organizations have knowledge spread across 11+ file formats, and retrieval quality degrades without hybrid search strategies.

Architecture & Approach

The system is a production RAG API built around three pieces: hybrid retrieval that combines TF-IDF keyword search with vector similarity search, smart caching for repeated queries, and multi-format document processing.
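To make the keyword half of the hybrid retrieval concrete, here is a minimal standard-library sketch of TF-IDF ranking. It is an illustration, not the project's actual code: the function names `tokenize` and `tfidf_rank`, the smoothed IDF formula, the length-normalized term frequency, and the sample documents are all assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_rank(query, docs):
    """Rank documents by a simple TF-IDF score against the query.

    docs maps a document ID to its text. Returns the IDs of documents
    that share at least one term with the query, best match first.
    """
    tokenized = {doc_id: tokenize(text) for doc_id, text in docs.items()}
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter()
    for tokens in tokenized.values():
        doc_freq.update(set(tokens))

    query_terms = set(tokenize(query))
    scores = {}
    for doc_id, tokens in tokenized.items():
        if not tokens:
            continue
        term_freq = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if term in term_freq:
                # Smoothed IDF (an assumed variant): rarer terms weigh more.
                idf = math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0
                score += (term_freq[term] / len(tokens)) * idf
        if score > 0:
            scores[doc_id] = score

    # Highest score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical sample corpus.
sample_docs = {
    "a": "invoice number 4471 overdue",
    "b": "payment reminder for customer",
    "c": "invoice sent to customer",
}
keyword_ranking = tfidf_rank("invoice 4471", sample_docs)
```

Document "a" ranks first because it matches both query terms, including the rare exact token "4471", which is the kind of match a purely semantic retriever can miss.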

Key Technical Decisions

Hybrid retrieval with Reciprocal Rank Fusion

Pure semantic search misses exact keyword matches that users expect. Combining keyword and vector search with RRF scoring promotes a document when either retriever ranks it highly, so exact-term matches and semantically related passages both surface.
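Reciprocal Rank Fusion can be sketched in a few lines. Each document receives 1 / (k + rank) from every ranked list it appears in, and the contributions are summed. The function name, the sample document IDs, and the constant k = 60 (a commonly used default) are assumptions for illustration; the project's actual parameters are not stated here.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several best-first lists of document IDs into one ranking.

    k dampens the influence of top ranks; 60 is a common default and
    is assumed here rather than taken from the project.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results from the two retrievers, best match first.
keyword_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)  # ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

Because RRF uses only rank positions, it needs no calibration between TF-IDF scores and vector distances, which live on different scales.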

Smart caching layer

Repeated queries against the same document corpus shouldn't re-embed or re-retrieve. Caching the results of earlier queries skips that work, which reduces latency and API costs for common queries.
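One way such a caching layer could look is an in-memory cache keyed on the corpus and a normalized form of the query, with a size limit and a time-to-live. This is a sketch under assumptions: the `QueryCache` class, its parameters, the normalization rule, and the eviction policy are all hypothetical and may differ from what the project does.

```python
import hashlib
import time
from collections import OrderedDict


class QueryCache:
    """Small LRU cache with a time-to-live, keyed by corpus and query."""

    def __init__(self, max_entries=256, ttl_seconds=300.0, clock=time.monotonic):
        self._entries = OrderedDict()  # key -> (stored_at, value)
        self._max_entries = max_entries
        self._ttl = ttl_seconds
        self._clock = clock

    @staticmethod
    def make_key(corpus_id, query):
        # Normalize case and whitespace so trivially different
        # phrasings of the same query share one cache entry.
        normalized = " ".join(query.lower().split())
        raw = f"{corpus_id}\x00{normalized}".encode("utf-8")
        return hashlib.sha256(raw).hexdigest()

    def get_or_compute(self, corpus_id, query, compute):
        """Return a fresh cached result, or call compute(query) and store it."""
        key = self.make_key(corpus_id, query)
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            self._entries.move_to_end(key)  # mark as recently used
            return entry[1]
        value = compute(query)
        self._entries[key] = (now, value)
        self._entries.move_to_end(key)
        # Evict least recently used entries beyond the size limit.
        while len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)
        return value


# Usage: the retrieval function runs once for two equivalent queries.
retrieval_calls = []


def fake_retrieve(query):
    # Stand-in for the real embed-and-retrieve step.
    retrieval_calls.append(query)
    return ["doc_b", "doc_a"]


cache = QueryCache()
first = cache.get_or_compute("handbook", "What is RRF?", fake_retrieve)
second = cache.get_or_compute("handbook", "  what is rrf? ", fake_retrieve)
```

Including the corpus identifier in the key keeps results from one document set from being served for another. A real deployment would also need to invalidate entries when the corpus changes.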

Results

The API supports 11+ file formats, and a demo is deployed on Hugging Face.

