🛡️ PolicyWise — AI Legal Policy Assistant

An intelligent assistant that analyzes policy and legal clauses using RAG, a custom ML risk classifier, and LLM-powered explanations.

✨ What is PolicyWise?

PolicyWise is an AI-powered tool that helps Compliance and Legal teams quickly evaluate policy or legal clauses.

It combines:

RAG (Retrieval-Augmented Generation) → Finds relevant text inside uploaded PDF policies
Machine Learning Classifier → Predicts if a clause is COMPLIANT or RISKY
LLM Explanation (OpenAI) → Gives clear explanations and safer rewrites

Together, these give reviewers a risk label, the supporting policy text, and a plain-language explanation for each clause.

🚀 Features

🔍 1. Document Search (RAG)

Upload PDF policy documents.
PolicyWise will:

Extract text
Break it into chunks
Create embeddings
Use FAISS to retrieve the most relevant sections
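
The chunking step can be sketched roughly as follows. This is a minimal illustration, not the code in app.py: the function name `chunk_text`, the character-based windows, and the default sizes are all assumptions chosen for the example.

```python
def chunk_text(text, chunk_size=500, overlap=100):
    """Split text into character windows that overlap by `overlap` characters.

    Hypothetical helper: the real project may chunk by words or sentences.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk.strip():
            chunks.append(chunk)
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


# Invented sample text standing in for extracted PDF content.
sample = "Employees must report security incidents within 24 hours. " * 20
chunks = chunk_text(sample, chunk_size=200, overlap=50)
print(len(chunks), "chunks; first chunk has", len(chunks[0]), "characters")
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, which tends to help retrieval.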

🛡️ 2. Risk Classifier (ML Model)

A TF-IDF + Logistic Regression classifier that I trained for this project.
It assigns each clause one of two labels, along with a confidence score:

COMPLIANT
RISKY
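
A classifier of this kind might look like the sketch below. It assumes scikit-learn is installed; the six training clauses are invented for illustration and are not the data used by train_model.py, and the `classify` helper is a hypothetical name.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy training data, invented for this example only.
clauses = [
    "Employee data is encrypted and retained for no longer than 12 months.",
    "Access to customer records requires manager approval and is logged.",
    "Vendors must sign a data processing agreement before onboarding.",
    "The company may share personal data with any third party at its discretion.",
    "Records may be kept indefinitely without notifying the data subject.",
    "Staff can disable security controls whenever they consider it convenient.",
]
labels = ["COMPLIANT", "COMPLIANT", "COMPLIANT", "RISKY", "RISKY", "RISKY"]

# TF-IDF turns each clause into a sparse vector; Logistic Regression classifies it.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(clauses, labels)


def classify(clause):
    """Return (label, confidence), where confidence is the predicted class probability."""
    probabilities = model.predict_proba([clause])[0]
    best = probabilities.argmax()
    return str(model.classes_[best]), float(probabilities[best])


label, confidence = classify("We may share user data with partners at our discretion.")
print(label, round(confidence, 2))
```

With so few examples the confidence values are not meaningful; the point is only the shape of the pipeline and how the label and score are read out.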

🤖 3. AI Explanation (LLM-Enhanced)

If an OpenAI API key is provided, PolicyWise can:

Explain why a clause is risky
Highlight dangerous wording
Suggest a safer rewrite
Use RAG + ML to give better, more contextual answers
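
One way to feed the RAG and ML results into the LLM is to assemble them into a single prompt. The sketch below only builds the prompt text; the function name `build_review_prompt` and the wording of the instructions are assumptions, and the call to an OpenAI chat model is deliberately left out.

```python
def build_review_prompt(clause, snippets, label, confidence):
    """Combine the clause, retrieved policy text, and classifier output into one prompt.

    Hypothetical helper: the prompt used in app.py may be worded differently.
    """
    context = "\n".join(f"[{i}] {s}" for i, s in enumerate(snippets, start=1))
    return (
        "You are reviewing a policy clause for compliance risk.\n\n"
        f"Clause:\n{clause}\n\n"
        f"Relevant policy excerpts:\n{context or '(none retrieved)'}\n\n"
        f"Classifier result: {label} (confidence {confidence:.0%})\n\n"
        "Tasks:\n"
        "1. Explain why the clause is or is not risky.\n"
        "2. Quote any vague or dangerous wording.\n"
        "3. Suggest a safer rewrite.\n"
    )


# Invented inputs for illustration.
prompt = build_review_prompt(
    clause="We may share user data with partners as needed.",
    snippets=[
        "Personal data may only be shared with approved processors.",
        "All data sharing requires a signed agreement.",
    ],
    label="RISKY",
    confidence=0.87,
)
print(prompt)
```

Numbering the excerpts lets the model refer back to a specific snippet in its explanation.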

📁 Project Structure

policy-wise/
│
├── app.py                 # Main Streamlit application
├── train_model.py         # Training script for ML classifier
├── requirements.txt       # Project dependencies
├── README.md              # Documentation
│
├── policy_model.pkl       # (Optional) Saved ML classifier
├── policy_vectorizer.pkl  # (Optional) Saved TF-IDF vectorizer
│
├── assets/
│   └── policywise.png     # Screenshot for README
│
├── .streamlit/
│   └── config.toml        # Technical blue theme for UI
│
├── LICENSE
└── .gitignore             # Ignored files (venv, .env, cache, etc.)

🧱 Architecture

Here’s a compact high-level overview of how PolicyWise processes, analyzes, and evaluates policy text:

🧑‍💻 User (Streamlit UI) → 📄 PDF Processing (Extract + Chunk + Embed)
→ 🔍 FAISS Search (RAG) → 🛡️ ML Classifier (TF-IDF + LR) → 🤖 LLM Review (optional, needs OpenAI API key) → 📤 Final Output

🛠️ Installation

1️⃣ Create a virtual environment

python -m venv venv
venv\Scripts\activate          # Windows
source venv/bin/activate       # macOS / Linux

2️⃣ Install dependencies

pip install -r requirements.txt

3️⃣ Train the ML model

python train_model.py

4️⃣ Run the Streamlit application

streamlit run app.py

🚀 HOW IT WORKS

📄 Upload Policy PDFs
- Extract text using PyPDF
- Split into overlapping chunks
- Create embeddings (OpenAI)
- Store vectors in FAISS index
✍️ Enter a Clause
- Convert clause → embedding
- Search FAISS for top-matching policy snippets (RAG)
🛡️ ML Risk Classification
- TF-IDF vectorizer transforms text
- Logistic Regression predicts ✅ COMPLIANT or ❌ RISKY
- Outputs label + confidence score
🤖 LLM Review
- Combine: user clause + retrieved policy snippets + ML output
- AI generates:
  - Explanation of risk
  - Highlighted vague phrases
  - A safer rewritten version
📤 Final Output
- ML prediction
- Relevant policy snippets (RAG)
- LLM explanation + rewrite
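
The retrieval step under "Enter a Clause" amounts to a nearest-neighbour search over chunk embeddings. The sketch below uses a brute-force cosine similarity in NumPy as a stand-in for the FAISS index, and hand-written 3-dimensional vectors as a stand-in for OpenAI embeddings. Both substitutions, and the `top_k` helper name, are assumptions made so the example runs without an API key.

```python
import numpy as np


def top_k(query_vec, chunk_vecs, k=2):
    """Return indices and cosine scores of the k chunks most similar to the query."""
    q = query_vec / np.linalg.norm(query_vec)
    m = chunk_vecs / np.linalg.norm(chunk_vecs, axis=1, keepdims=True)
    scores = m @ q                       # cosine similarity per chunk
    order = np.argsort(-scores)[:k]      # highest score first
    return order.tolist(), scores[order].tolist()


chunks = [
    "Personal data is retained for 12 months.",
    "Visitors must sign in at reception.",
    "Data may be shared with approved processors only.",
]
# Toy vectors standing in for real embeddings (assumption for illustration).
chunk_vecs = np.array([
    [0.9, 0.1, 0.0],
    [0.0, 0.1, 0.9],
    [0.8, 0.3, 0.1],
])
query_vec = np.array([1.0, 0.2, 0.0])   # pretend embedding of the user's clause

indices, scores = top_k(query_vec, chunk_vecs)
for i, s in zip(indices, scores):
    print(f"{s:.3f}  {chunks[i]}")
```

A vector index such as FAISS does the same ranking but scales to far more chunks than a full matrix product comfortably handles.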

🧰 TECH STACK

FRONTEND 🖥️

| Technology | Purpose |
| --- | --- |
| 🎨 Streamlit | UI & user interaction |
| 🐍 Python | Core language |

BACKEND / PROCESSING ⚙️

| Technology | Purpose |
| --- | --- |
| 📄 PyPDF | Extract text from PDFs |
| ✂️ Custom Chunking | Split policy text into chunks |
| 🧠 OpenAI Embeddings | Convert text into vectors |
| 🗃️ FAISS Vector DB | Fast semantic search (RAG) |
| 📊 Scikit-learn | ML toolkit |
| 🧩 TF-IDF Vectorizer | Transform text for ML model |
| 🛡️ Logistic Regression | Classify COMPLIANT / RISKY |

AI LAYER 🤖

| Technology | Purpose |
| --- | --- |
| 🧠 OpenAI Chat Models | Explanation + safer rewrite |
| 🔍 RAG Pipeline | Retrieve relevant policy snippets |

UTILITIES 🔧

| Technology | Purpose |
| --- | --- |
| 🔑 Python-dotenv | Load environment variables |
| 💾 Pickle | Save model & vectorizer |
| 🔢 NumPy | Numerical operations |
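
Saving and reloading the model with Pickle could look like the sketch below. The file names match the project structure above; the two training sentences are invented, the files are written to a temporary directory rather than the repository root, and the exact layout of train_model.py is an assumption.

```python
import pickle
import tempfile
from pathlib import Path

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Invented toy data, only to have something to fit.
texts = ["data is encrypted at rest", "data may be shared with anyone"]
labels = ["COMPLIANT", "RISKY"]

vectorizer = TfidfVectorizer()
classifier = LogisticRegression().fit(vectorizer.fit_transform(texts), labels)

# Write both artifacts to a scratch directory.
out_dir = Path(tempfile.mkdtemp())
for name, obj in [("policy_model.pkl", classifier), ("policy_vectorizer.pkl", vectorizer)]:
    with open(out_dir / name, "wb") as fh:
        pickle.dump(obj, fh)

# Load them back, as the app would at startup.
with open(out_dir / "policy_model.pkl", "rb") as fh:
    loaded_model = pickle.load(fh)
with open(out_dir / "policy_vectorizer.pkl", "rb") as fh:
    loaded_vectorizer = pickle.load(fh)

prediction = loaded_model.predict(loaded_vectorizer.transform(["data may be shared"]))[0]
print(prediction)
```

The vectorizer has to be saved alongside the model because the classifier only understands the feature columns that this particular fitted vectorizer produces.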

💡 Why PolicyWise Matters

Helps Legal teams quickly evaluate compliance risks
Reduces manual effort in reviewing internal policies
Uses a hybrid AI system (RAG + ML + LLM), similar to real enterprise tools
Demonstrates applied knowledge of NLP, vector search, and model pipelines

🚧 Future Enhancements

Add LegalBERT for deeper clause understanding
Add metadata-based RAG (policy titles, categories)
Deploy with Docker / Streamlit Cloud / Hugging Face
Add authentication for internal company use
Add clause history + downloadable reports
