Azure RAG Document Q&A
Production-ready, full-stack document intelligence platform built on Azure — zero to deployed in 3 hours
Problem
Organizations struggle with information discoverability. Employees need fast, accurate answers from internal documents without manual searching. Traditional full-text search yields poor results for semantic queries, while raw LLM chatbots hallucinate when facts matter.
Solution
A web application that ingests documents, generates vector embeddings via Azure OpenAI, stores them in Azure AI Search with hybrid indexing, and answers questions with GPT-4o grounded in retrieved document context. Includes both a web Q&A interface and an Azure Bot Service chat channel.
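The hybrid index pairs a full-text field with an HNSW vector field in the same index. The sketch below expresses such an index definition as a Python dict matching the shape of the Azure AI Search create-index REST body; the field names (`content`, `contentVector`), profile/algorithm names, and index name are assumptions, while 1536 is the default output dimension of text-embedding-3-small.

```python
# Sketch of an Azure AI Search index definition for hybrid (keyword + vector)
# search. Names are illustrative assumptions, not the project's actual schema.
index_definition = {
    "name": "documents-index",
    "fields": [
        {"name": "id", "type": "Edm.String", "key": True, "filterable": True},
        # Full-text field: drives the keyword half of hybrid search.
        {"name": "content", "type": "Edm.String", "searchable": True},
        {"name": "sourceFile", "type": "Edm.String", "filterable": True},
        # Vector field: drives the similarity half; HNSW is the ANN algorithm.
        {
            "name": "contentVector",
            "type": "Collection(Edm.Single)",
            "searchable": True,
            "dimensions": 1536,  # text-embedding-3-small default
            "vectorSearchProfile": "vector-profile",
        },
    ],
    "vectorSearch": {
        "algorithms": [{"name": "hnsw-config", "kind": "hnsw"}],
        "profiles": [{"name": "vector-profile", "algorithm": "hnsw-config"}],
    },
}
```

A query that supplies both free text and a query vector against this index is scored on both fields, which is what makes the search "hybrid".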
Key Features
- Document upload with automatic chunking (800-character chunks, 200-character overlap)
- Vector embeddings via Azure OpenAI text-embedding-3-small
- Hybrid search (vector similarity + keyword matching)
- GPT-4o answer generation with source citations
- Dual interface: web Q&A and Azure Bot Service chat
- Analytics endpoint tracking visitor activity
- Environment-agnostic configuration (local dev to Azure)
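The chunking step above can be sketched as a simple sliding window: each chunk repeats the last 200 characters of the previous one so that sentences spanning a boundary still appear intact in at least one chunk. This is an illustrative sketch, not the project's actual implementation.

```python
def chunk_text(text: str, size: int = 800, overlap: int = 200) -> list[str]:
    """Split text into fixed-size chunks where consecutive chunks
    share `overlap` characters (800 / 200, matching the settings above)."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    step = size - overlap  # 600 new characters per chunk
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # last chunk already reached the end of the text
    return chunks
```

With these defaults a 2,000-character document yields three chunks, and the tail of each chunk equals the head of the next.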
Tech Stack
- Python 3.12, Flask, gunicorn
- Azure OpenAI (GPT-4o, text-embedding-3-small)
- Azure AI Search (hybrid HNSW vector index)
- Azure App Service (Linux), Azure Bot Service
Architecture
RAG pipeline:
- Ingestion: Documents → Chunks → Embeddings (text-embedding-3-small) → Azure AI Search (HNSW vector index)
- Query: Question → Embedding → Hybrid search (top 5) → Context + question → GPT-4o → Grounded answer with citations
- Hosting: Flask app deployed on Azure App Service (Linux, Python 3.12, gunicorn)
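The grounding step in the query path can be sketched as a prompt-assembly function: the top-5 retrieved chunks are numbered and labeled with their source so the model can cite them. The function name, chunk dict keys (`sourceFile`, `content`), and prompt wording below are assumptions for illustration.

```python
def build_grounded_prompt(question: str, chunks: list[dict]) -> str:
    """Format retrieved chunks into a numbered, source-tagged context block
    and instruct the model to answer only from it (illustrative sketch)."""
    context = "\n\n".join(
        # [1], [2], ... become the citation tags the answer refers back to.
        f"[{i + 1}] (source: {c['sourceFile']})\n{c['content']}"
        for i, c in enumerate(chunks)
    )
    return (
        "Answer the question using ONLY the sources below. "
        "Cite sources inline as [1], [2], etc. "
        "If the sources do not contain the answer, say so.\n\n"
        f"Sources:\n{context}\n\n"
        f"Question: {question}"
    )
```

Constraining the model to the retrieved context (and asking it to admit when the context is insufficient) is what curbs the hallucination problem described in the Problem section.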
Screenshots
Screenshots coming soon
Metrics
- Time from zero Azure experience to deployed system: ~3 hours
- Azure services configured: 6
- Deployment time: reduced from 17 minutes to 80 seconds
My Role
Sole developer. Went from zero Azure experience to a deployed, publicly accessible system in approximately 3 hours: configured 6 Azure services, implemented the full RAG pipeline, and cut deployment time from 17 minutes to 80 seconds.