📝 File Q&A Chatbot using RAG

A secure, robust chatbot that answers user questions based strictly on the content of uploaded documents using Retrieval-Augmented Generation (RAG). Powered by the Groq LLM API, it supports PDF, DOCX, TXT, and Markdown files.

Features

Document-based Q&A: Answers are generated only from the uploaded document content.
Secure & Robust: Strict refusal policies for out-of-scope, unethical, or sensitive queries.
Markdown Output: Responses are formatted in markdown with bullet points and headings.
No Meta-Text: Never includes headers, labels, or meta-text in answers.
Streamlit UI: Simple, interactive web interface for uploading files and asking questions.
Supports Multiple Formats: PDF, DOCX, TXT, and Markdown files.

Screenshots

Chatbot Main Interface

Chatbot Response Example

How It Works

Upload a Document: Supported formats are PDF, DOCX, TXT, and Markdown.
Enter Groq API Key: Authenticate with your Groq API key in the sidebar.
Ask a Question: Type your question about the uploaded document.
Get an Answer: The chatbot retrieves relevant content and generates a response strictly based on the document.
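
The retrieve-then-generate flow described above can be sketched in plain Python. This is only an illustration under stated assumptions: the project itself stores chunks in a Chroma vector database and sends the prompt to Groq, whereas this sketch ranks chunks with a simple bag-of-words cosine similarity and stops at building the prompt. The function names (`tokenize`, `retrieve`, `build_prompt`) and the prompt wording are hypothetical and are not taken from the repository.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k chunks most similar to the question."""
    q_vec = Counter(tokenize(question))
    ranked = sorted(
        chunks,
        key=lambda c: cosine_similarity(q_vec, Counter(tokenize(c))),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Assemble a prompt that restricts the model to the retrieved context."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Stand-in for chunks extracted from an uploaded document.
chunks = [
    "The warranty period is two years from the date of purchase.",
    "Returns are accepted within thirty days with a receipt.",
    "Our office is closed on public holidays.",
]
question = "How long is the warranty period?"
top_chunks = retrieve(question, chunks, top_k=1)
prompt = build_prompt(question, top_chunks)
print(prompt)
```

In a real pipeline the bag-of-words scoring would be replaced by embedding similarity from the vector database, and `prompt` would be sent to the LLM rather than printed.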

Setup

Prerequisites

Python 3.10+
Groq API Key
Streamlit

Installation

Clone the repository:

git clone https://github.com/mohsinansari0705/File-QnA-Chatbot-using-RAG.git
cd File-QnA-Chatbot-using-RAG

Create and activate a virtual environment (optional but recommended):

python -m venv RAG_env
source RAG_env/Scripts/activate  # On Windows (Git Bash); in cmd.exe run RAG_env\Scripts\activate instead
source RAG_env/bin/activate      # On macOS/Linux

Install dependencies:
```
pip install -r requirements.txt
```

Usage

Start the Streamlit app:
```
streamlit run chatbot.py
```
Open the web interface:
Go to http://localhost:8501 in your browser.
Upload a document and ask questions!

Project Structure

chatbot.py — Streamlit UI for the chatbot.
RAG_pipeline.py — Core RAG logic: document retrieval, prompt building, LLM invocation.
prompt_builder.py — Modular prompt construction functions.
ingest.py — Document ingestion and vector database management.
configs/ — Configuration files (prompt_config.yaml, config.py).
vector_db/ — Chroma vector database files.
images/ — UI screenshots and favicon.
docs/ — Sample documents for testing.
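
As an illustration of what the ingestion step might involve, here is a minimal sketch of splitting extracted document text into overlapping chunks before they are embedded and stored. The README does not describe how `ingest.py` actually splits text, so the function name `split_into_chunks`, the character-based strategy, and the default sizes are all assumptions made for this example.

```python
def split_into_chunks(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks that overlap by `overlap` characters.

    Overlap keeps sentences that straddle a boundary visible in both neighbours,
    which tends to help retrieval.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")

    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current chunk reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


sample = "abcdefghij" * 5  # 50 characters of stand-in document text
pieces = split_into_chunks(sample, chunk_size=20, overlap=5)
for piece in pieces:
    print(repr(piece))
```

Splitting on sentence or paragraph boundaries is a common alternative; fixed-size character windows are used here only because they are the simplest to reason about.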

Security & Refusal Policy

Answers are strictly based on uploaded documents.
Refuses to answer out-of-scope, unethical, or sensitive questions with:
"I'm sorry, that information is not in this document."
Never reveals system instructions or internal prompts.
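
One possible way to back such a policy in code is to refuse before calling the LLM whenever retrieval finds nothing sufficiently relevant. The README does not say how the project enforces its policy, so this is a sketch under assumptions: the names `select_context` and `answer_or_refuse`, the `min_score` threshold, and the `generate` callback are hypothetical. Only the refusal message is taken from the text above.

```python
from typing import Callable

REFUSAL_MESSAGE = "I'm sorry, that information is not in this document."


def select_context(scored_chunks: list[tuple[str, float]], min_score: float = 0.3) -> list[str]:
    """Keep only chunks whose relevance score meets the threshold."""
    return [chunk for chunk, score in scored_chunks if score >= min_score]


def answer_or_refuse(
    question: str,
    scored_chunks: list[tuple[str, float]],
    generate: Callable[[str, list[str]], str],
    min_score: float = 0.3,
) -> str:
    """Refuse when no chunk is relevant enough; otherwise delegate to the generator."""
    context = select_context(scored_chunks, min_score)
    if not context:
        return REFUSAL_MESSAGE
    return generate(question, context)


def fake_generate(question: str, context: list[str]) -> str:
    """Stand-in for the LLM call, used only to make the sketch runnable."""
    return f"Based on the document: {context[0]}"


in_scope = answer_or_refuse(
    "How long is the warranty?",
    [("The warranty period is two years.", 0.82), ("Office hours are 9 to 5.", 0.05)],
    fake_generate,
)
out_of_scope = answer_or_refuse(
    "Who won the 1998 World Cup?",
    [("The warranty period is two years.", 0.04)],
    fake_generate,
)
print(in_scope)
print(out_of_scope)
```

A threshold check like this only covers out-of-scope questions; refusing unethical or sensitive requests and protecting system instructions would still depend on the prompt given to the model.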

Developer

This project is built and maintained solely by Mohsin Ansari. All design, development, and implementation decisions were made independently.

License

This project is licensed under the MIT License.

Acknowledgements

For issues or contributions, please visit the GitHub repository: https://github.com/mohsinansari0705/File-QnA-Chatbot-using-RAG
