AskDocs AI is a chatbot that uses Hybrid RAG (Retrieval-Augmented Generation) to answer your questions based on the content of uploaded PDFs. It combines semantic vector search with traditional keyword-based search, so that both paraphrased queries and exact-term matches can be retrieved.
- Hybrid Search: Combines ChromaDB (semantic) and BM25 (keyword) retrieval.
- LLM-Powered: Answers are generated by an LLM served through Groq Cloud.
- Async Processing: PDF ingestion and indexing are offloaded to background threads.
- Multimodal Support: Optimized for PDF extraction and processing.
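
As a rough illustration of the async processing feature, the sketch below hands a slow ingestion job to a worker thread so the caller gets control back immediately. It uses only the Python standard library. `ingest_pdf` and `submit_ingestion` are hypothetical stand-ins, not names from this project, and the real server may use a different mechanism, such as FastAPI background tasks.

```python
import time
from concurrent.futures import Future, ThreadPoolExecutor

# A small pool of worker threads dedicated to ingestion jobs.
_executor = ThreadPoolExecutor(max_workers=2)


def ingest_pdf(path: str) -> str:
    """Hypothetical stand-in for parsing, chunking, and indexing one PDF."""
    time.sleep(0.1)  # simulate slow extraction and embedding work
    return f"indexed {path}"


def submit_ingestion(path: str) -> Future:
    """Queue the job and return at once; the Future reports completion later."""
    return _executor.submit(ingest_pdf, path)
```

The caller can keep serving requests and check `future.done()` or call `future.result()` once the index is needed.
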
- Backend: FastAPI, LangChain (Classic), ChromaDB, Groq Cloud
- Frontend: Streamlit
- Search Engines: BM25 (Keyword), Vector (Cosine Similarity)
- Embeddings: HuggingFace (Sentence Transformers)
- Containerization: Docker & Docker Compose
Control the behavior of the Hybrid Search by adjusting weights in your .env file or server/config.py:
| Variable | Description | Default |
|---|---|---|
| `HYBRID_SEARCH_BM25_WEIGHT` | Weight for keyword search (0.0 to 1.0) | 0.5 |
| `HYBRID_SEARCH_CHROMA_WEIGHT` | Weight for semantic search (0.0 to 1.0) | 0.5 |
| `GROQ_API_KEY` | Your Groq Cloud API key | Required |
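
To illustrate how the two weights could shape the final ranking, here is a minimal sketch. It assumes the weights are read from the environment with the defaults listed above and applied through weighted reciprocal rank fusion, the scheme LangChain's `EnsembleRetriever` uses. The function names `load_weights` and `fuse` and the constant `k=60` are illustrative assumptions, not taken from `server/config.py`.

```python
import os
from collections import defaultdict


def load_weights(env=os.environ):
    """Read the two hybrid-search weights, falling back to the 0.5 defaults."""
    bm25 = float(env.get("HYBRID_SEARCH_BM25_WEIGHT", "0.5"))
    chroma = float(env.get("HYBRID_SEARCH_CHROMA_WEIGHT", "0.5"))
    for name, value in (("BM25", bm25), ("Chroma", chroma)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} weight must be between 0.0 and 1.0, got {value}")
    return bm25, chroma


def fuse(bm25_ranked, chroma_ranked, bm25_weight, chroma_weight, k=60):
    """Merge two ranked lists of document IDs with weighted reciprocal rank fusion."""
    scores = defaultdict(float)
    for weight, ranked in ((bm25_weight, bm25_ranked), (chroma_weight, chroma_ranked)):
        for rank, doc_id in enumerate(ranked, start=1):
            # Each retriever contributes weight / (rank + k) for every hit it returns.
            scores[doc_id] += weight / (rank + k)
    return sorted(scores, key=scores.get, reverse=True)
```

With equal weights, a document that both retrievers rank highly scores above one returned by only a single retriever. Setting one weight to 0.0 reduces the ranking to the other retriever's order.
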
- Split Dependencies: Client and Server have separate requirement files to minimize image sizes.
- CPU-Only Optimization: Server image is optimized for CPU-only environments, reducing size from ~12.8GB to ~2.3GB.
- Persistent Memory: Uses Docker volumes to persist the ChromaDB vector store and uploaded files.
Create a `.env` file in the root directory:

    GROQ_API_KEY=your_api_key_here

Build and start the containers:

    docker-compose up -d --build

- Streamlit UI: http://localhost:8501
- FastAPI Docs: http://localhost:8000/docs
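
The two URLs above can be probed with a short script to confirm the containers are serving traffic. This is a minimal sketch using only the standard library. The helper name `is_up` is an assumption for illustration and is not part of the project.

```python
import urllib.error
import urllib.request


def is_up(url: str, timeout: float = 3.0) -> bool:
    """Return True if the URL answers with an HTTP status below 400."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status < 400
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, and HTTP error responses.
        return False
```

For example, `is_up("http://localhost:8000/docs")` should return `True` once the backend container has finished starting.
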

To run the services locally without Docker:

- Create and activate a virtual environment

      python -m venv venv
      .\venv\Scripts\activate  # Windows

- Install dependencies

      pip install -r requirements.client.txt
      pip install -r requirements.server.txt

- Run the services

      # Backend
      python server/main.py

      # Frontend
      streamlit run client/main.py

To verify that the Hybrid Search mechanism and LLM integration are working correctly:

    python server/tests/test_hybrid_search.py

This script validates:
- Vectorstore connectivity.
- BM25 index reconstruction.
- Ensemble Retriever initialization.

