Ask questions about your PDF documents in plain English. Upload any document and get accurate, cited answers — powered by semantic search and Google Gemini.
Built to explore production patterns in LLM systems: vector retrieval, streaming responses, session isolation, and grounded generation.
UPLOAD FLOW
─────────────────────────────────────────────────────────────
PDF ──► PyMuPDF ──► Chunks (512 tok, 50 overlap) ──► Gemini Embeddings ──► ChromaDB
│
(keyed by session ID)
QUERY FLOW
─────────────────────────────────────────────────────────────
User Question ──► Gemini Embeddings ──► Cosine Similarity Search ──► Top 5 Chunks
│
┌────────────▼────────────┐
│ Gemini LLM │
│ + System Prompt │
│ + Conversation History │
│ (last 5 messages) │
└────────────┬────────────┘
│
Streamed Answer + Citations
- Semantic Search — vector similarity retrieval via ChromaDB, not keyword matching
- Streaming Responses — token-by-token generation via Server-Sent Events
- Source Citations — every answer links back to the exact page and chunk it came from
- Conversation History — sliding window of last 5 messages for context-aware follow-ups
- Multi-User Sessions — UUID-based session isolation; each user's documents are namespaced separately
- Grounded Generation — system prompt enforces answers only from retrieved context, eliminating hallucination
- Production Patterns — rate limiting, structured logging, retry logic with exponential backoff
| Layer | Technology |
|---|---|
| Backend | Python, FastAPI |
| Vector Store | ChromaDB |
| LLM + Embeddings | Google Gemini API |
| PDF Parsing | PyMuPDF |
| Token Counting | tiktoken |
| Frontend | React, Tailwind CSS, Vite |
| Containerization | Docker, Docker Compose |
- Docker and Docker Compose
- Google Gemini API key (get one here)
git clone https://github.com/kshitizj03/rag-pdf-chat
cd RAG
cp .env.example .env
# Add your GEMINI_API_KEY to .env
docker compose up --build- Frontend: http://localhost:3000
- Backend API: http://localhost:8000
Backend
cd backend
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
cp .env.example .env
uvicorn main:app --reload --port 8000Frontend
cd frontend
npm install
cp .env.example .env
npm run devMIT