Chat with any documentation using AI. Provide a documentation link, and the system will scrape, process, and convert it into a searchable knowledge base that you can interact with through natural language.
Check out the live website here: DocChat
For detailed documentation, please refer to the files in the docs/ folder:
| Guide | Description |
|---|---|
| 🏗️ Architecture Guide | In-depth pipeline sequences, database details, and Mermaid workflows. |
| ⚙️ Backend Guide | Folder layout, database schemas, BullMQ queuing, and encryption details. |
| 🎨 Frontend Guide | Vite/Tailwind v4 layouts, routing, caching layers, and styling principles. |
| 🔌 API Reference | Complete endpoints map, parameters, JSON shapes, and auth requirements. |
| 🛠️ Setup & Troubleshooting | Detailed listing of 25+ env variables, local Redis/pg configurations, and debug steps. |
DocChat consists of a React UI that talks to an Express server. The server queues background indexing jobs via BullMQ (Redis); embeddings are stored in Qdrant and metadata in PostgreSQL.
```mermaid
sequenceDiagram
    autonumber
    actor User
    participant F as Frontend (Vite + React)
    participant B as Backend (Express API)
    participant R as Redis (BullMQ)
    participant W as Worker (chatWorker)
    participant Q as Qdrant Vector DB
    participant DB as PostgreSQL (Prisma)
    User->>F: Enter Docs URL & Choose Mode
    F->>B: POST /api/v1/chat/create
    B->>DB: Check reuse (existing source URL)
    alt URL is already indexed
        DB-->>B: Return existing source collection
        B-->>F: Chat session ready (Instant)
    else URL is new
        B->>DB: Save Chat (status: QUEUED)
        B->>R: Queue Ingestion Job (chatCreation)
        B-->>F: Return Chat ID (asynchronous processing)
        F->>B: Poll progress status via /api/v1/chat/status/:id
        R->>W: Process job
        W->>W: Crawl, split text, generate embeddings
        W->>Q: Upsert vector chunks
        W->>DB: Mark Chat status: READY
    end
    User->>F: Submit Prompt
    F->>B: POST /api/v1/message/send (SSE Stream)
    B->>Q: Similarity Search (Score >= 0.35)
    Q-->>B: Top 5 Text Chunks
    B->>B: Assemble prompt with context & chat history
    B->>F: Stream chunks back to client
```
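The retrieval step in the diagram (similarity search with a score threshold of 0.35, top 5 chunks, then prompt assembly) can be sketched roughly as below. This is a minimal illustration, not DocChat's actual code: the `ScoredChunk` and `ChatTurn` shapes, the function names, and the prompt wording are all assumptions made for the example. Only the threshold and the chunk count come from the diagram.

```typescript
// Hypothetical shapes for illustration; DocChat's real types may differ.
interface ScoredChunk {
  text: string;
  score: number; // similarity score returned by the vector search
}

interface ChatTurn {
  role: "user" | "assistant";
  content: string;
}

const MIN_SCORE = 0.35; // threshold shown in the diagram
const TOP_K = 5; // number of chunks shown in the diagram

// Keep only chunks at or above the threshold, best first, capped at topK.
function selectContext(
  chunks: ScoredChunk[],
  minScore: number = MIN_SCORE,
  topK: number = TOP_K,
): ScoredChunk[] {
  return chunks
    .filter((c) => c.score >= minScore) // filter() copies, so the input is not mutated
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}

// Combine retrieved context, prior turns, and the new question into one prompt.
// The instruction text here is an assumed placeholder.
function assemblePrompt(
  question: string,
  context: ScoredChunk[],
  history: ChatTurn[],
): string {
  const contextBlock = context.map((c, i) => `[${i + 1}] ${c.text}`).join("\n");
  const historyBlock = history.map((t) => `${t.role}: ${t.content}`).join("\n");
  return [
    "Answer using only the documentation excerpts below.",
    "Context:",
    contextBlock || "(no matching excerpts)",
    "History:",
    historyBlock || "(none)",
    `Question: ${question}`,
  ].join("\n");
}
```

Filtering before sorting keeps low-scoring chunks out of the prompt even when fewer than five results pass the threshold, so the model may receive less context rather than irrelevant context.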
- Accepts a documentation URL as input.
- Recursively crawls internal pages up to config limits.
- Automatically respects robots.txt instructions.
- Vector Mode: Generates vector embeddings for extracted text chunks using OpenRouter, and stores them in Qdrant for semantic search.
- Vectorless Mode (TreeIndex): Builds a documentation structural tree (no vector embeddings required) and retrieves nodes directly from the generated tree. Useful for resource-constrained or offline-friendly index pipelines.
- Reuses existing Qdrant collections or TreeIndex trees when URLs match, facilitating instant chat creation for both the original user and other users.
- Long-Term Memory: An optional connection using a Mem0 key captures and injects user profile context across multiple sessions.
- Token Budget Limits: Tracks and restricts daily user token counts in Redis.
- Audit Event Logs: Automatically audits administrative actions, model usage, and ingestions.
- Admin Control Center: Built-in visual panel to inspect per-user token consumption, review audit events, and sweep orphaned collections in Qdrant.
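The crawling behaviour described in the list above (internal pages only, bounded by configured limits, honouring robots.txt) could look something like the following breadth-first sketch. Everything here is hypothetical: `CrawlLimits`, `crawlInternal`, and the two callback types are invented for illustration, and link extraction and robots.txt evaluation are left to caller-supplied functions rather than any specific library.

```typescript
// Hypothetical crawl limits; DocChat's real configuration keys may differ.
interface CrawlLimits {
  maxPages: number;
  maxDepth: number;
}

// Caller-supplied hooks keep the sketch independent of any HTTP or HTML library.
type LinkFetcher = (url: string) => Promise<string[]>; // hrefs found on the page
type RobotsCheck = (url: string) => boolean; // true if robots.txt permits the URL

// Resolve an href against its page and drop the fragment; null if unparsable.
function normalize(raw: string, base: string): string | null {
  try {
    const u = new URL(raw, base);
    u.hash = "";
    return u.href;
  } catch {
    return null;
  }
}

async function crawlInternal(
  startUrl: string,
  limits: CrawlLimits,
  fetchLinks: LinkFetcher,
  isAllowed: RobotsCheck = () => true,
): Promise<string[]> {
  const start = new URL(startUrl); // throws on an invalid start URL
  start.hash = "";
  const origin = start.origin;

  const visited: string[] = [];
  const seen = new Set<string>([start.href]);
  const queue: Array<{ url: string; depth: number }> = [
    { url: start.href, depth: 0 },
  ];

  // Breadth-first traversal, stopping once the page budget is used up.
  while (queue.length > 0 && visited.length < limits.maxPages) {
    const { url, depth } = queue.shift()!;
    if (!isAllowed(url)) continue; // skip pages disallowed by robots.txt
    visited.push(url);
    if (depth >= limits.maxDepth) continue; // do not expand links past the depth limit

    for (const href of await fetchLinks(url)) {
      const next = normalize(href, url);
      if (next === null || seen.has(next)) continue;
      if (new URL(next).origin !== origin) continue; // internal pages only
      seen.add(next);
      queue.push({ url: next, depth: depth + 1 });
    }
  }
  return visited;
}
```

Passing `fetchLinks` and `isAllowed` as parameters makes the traversal easy to exercise against an in-memory site map; a real worker would presumably plug in an HTTP fetcher and a robots.txt parser at those two points.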
Before running, make sure you have Node.js (v20+ recommended), Docker, and pnpm installed.
```bash
git clone https://github.com/avishek0769/DocChat.git
cd DocChat
```
Copy the backend example file to .env and fill in the required variables (see Setup & Troubleshooting Guide for full descriptions):
```bash
cp backend/.env.example backend/.env
```
Start the local Redis Stack (required for workers and token limits):
```bash
docker compose up -d
```
Install frontend and backend packages:
```bash
# Root (Frontend)
pnpm install
# Backend
cd backend
pnpm install
```
Run Prisma migration commands inside the backend/ directory:
```bash
pnpm dlx prisma migrate dev --name init
pnpm dlx prisma generate
```
Start the application components:
```bash
# 1. Start Frontend (run from repository root)
pnpm run dev
# 2. Start Backend API Server (run from backend/ directory)
pnpm run dev
# 3. Start Background Ingestion Worker (run from backend/ directory)
node chatWorker.js
```
Contributions are welcome! If you would like to help improve DocChat, please review our Contributing Guide to understand branch naming, PR checklist procedures, and codebase guidelines.
This project is licensed under the MIT License.