📚 DocuMind AI Assistant is an intelligent document analysis and interactive AI chatbot that helps users extract key insights, analyze content, and ask questions about the document's information in real-time. This project employs a Retrieval-Augmented Generation (RAG) approach, enabling it to provide accurate and contextually relevant responses by combining document retrieval and AI-generated answers. The app supports PDF, Word, and Image files, making document processing easier and more interactive.
- Smart Document Analysis: Instantly extract key insights, summaries, and important content from your documents.
- Interactive Chat: Ask questions and receive intelligent, context-based answers directly from the document.
- Multi-Format Support: Supports PDF, Word, and Image files for versatile document processing.
To get the DocuMind AI Assistant up and running on your local machine, follow these steps:
Ensure you have the following installed:
- Python 3.7+
- pip (Python package installer)
-
Clone the repository:
git clone https://github.com/man-swi/DocuMind-AI-Assistant.git cd DocuMind-AI-Assistant -
Create a virtual environment:
python -m venv venv source venv/bin/activate # On Windows, use `venv\Scripts\activate`
-
Install the required dependencies:
pip install -r requirements.txt
-
Set up your API Key:
- Obtain your API key by signing up at Mistral AI.
- Create a
.envfile in the project root directory and add your Mistral API key:MISTRAL_API_KEY=your_api_key_here
To launch the DocuMind AI Assistant locally, follow these steps:
-
Start the Streamlit app:
streamlit run app.py
-
Open the app in your browser:
- Go to http://localhost:8501 to interact with the DocuMind AI Assistant.
- Upload a document (PDF, Word, or Image) and let the AI extract key insights, important topics, and summaries. The assistant processes text, images, and more to provide meaningful analysis.
- Engage with the assistant by asking questions about the document’s content. The AI will respond intelligently, providing answers based on the document's contents.
Here are some screenshots showing how the DocuMind AI Assistant works:
-
DocuMind-AI-Assistant:
This image shows the main interface where the chatbot assistant interacts with the user. It’s the place where all conversations happen.

-
Extracted Text:
This image displays the extracted text from a document, providing insights and summaries to the user.

-
Settings Option:
This screenshot shows the settings section where users can print the output or change the theme of the assistant interface.

-
Chat Feature with Save:
This image shows the chat section where users can ask questions, and the assistant can save the conversation data for later download or printing.

This project uses a combination of technologies to build the interactive and intelligent document assistant:
- Python: Core programming language.
- Streamlit: Framework for building interactive, web-based applications.
- Mistral AI: The AI model responsible for document analysis and real-time interaction.
- PyPDF2: Library for PDF text extraction.
- python-docx: Library to read and extract data from DOCX files.
- pytesseract: Optical Character Recognition (OCR) tool for extracting text from images.
- Pillow: Image processing library used for handling and manipulating images.
- FAISS: Library for efficient similarity search and clustering of dense vectors.
You can access the live version of DocuMind AI Assistant here:
DocuMind AI Assistant - Streamlit App