Skip to content

Da-ya7/PII_FULLSTACK

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

6 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

PII Fullstack Redaction Platform

PII Fullstack Banner

GitHub Stars GitHub Forks GitHub Issues Last Commit

GitHub Repo License Python Flutter Flask MySQL

Production-style fullstack system for detecting and redacting Personally Identifiable Information from document images using a hybrid AI pipeline.

Why This Project

  • End-to-end workflow: authentication, upload, detection, decisioning, and redacted output.
  • Multi-stage AI pipeline: OCR + regex + NER + policy-aware decisions.
  • Fullstack delivery: Flutter client and Flask backend in one repository.
  • Audit-focused: redaction results and processing logs are available in the app.

Repository Structure

PII_FULLSTACK/
├─ PII/                     # Main application
│  ├─ lib/                  # Flutter app (UI + state + services)
│  ├─ modules/              # AI pipeline modules (OCR, regex, NER, hybrid, RAG, redaction)
│  ├─ app.py                # Flask backend entry point
│  ├─ requirements.txt      # Python dependencies
│  ├─ pubspec.yaml          # Flutter dependencies
│  ├─ RUNNING_GUIDE.md
│  └─ TESTING_GUIDE.md
├─ README.md                # This file
└─ LICENSE

System Architecture

Flutter Client
	|
	| HTTP API
	v
Flask Backend (app.py)
	|
	+--> OCR Engine (Tesseract + OpenCV)
	+--> Regex Detector
	+--> NER Detector (SpaCy)
	+--> Hybrid Fusion Engine
	+--> Policy Decision Engine (RAG/FAISS fallback-aware)
	+--> Redaction Engine (text + image)
	|
	+--> MySQL (users, auth, audit logs)

Quick Start

1. Clone

git clone https://github.com/Da-ya7/PII_FULLSTACK.git
cd PII_FULLSTACK/PII

2. Backend Setup

python -m venv .venv
# Windows
.venv\Scripts\activate
pip install -r requirements.txt
python -m spacy download en_core_web_sm

3. Environment Configuration

copy .env.example .env

Update .env with your local configuration values.

4. Start Backend

python app.py

Backend health endpoint:

http://127.0.0.1:5000/api/health

5. Start Flutter Web Client

flutter pub get
flutter run -d chrome --dart-define=API_BASE_URL=http://127.0.0.1:5000 --web-port=5080

Client URL:

http://localhost:5080

Core Features

  • Secure user registration/login.
  • Document upload and processing from Flutter UI.
  • OCR extraction with bounding boxes.
  • Structured PII detection via regex patterns.
  • Contextual PII detection via NER.
  • Hybrid confidence-aware fusion.
  • Policy-driven action (full redact, partial mask, keep).
  • Downloadable redacted results.
  • Audit history for processed files.

Tech Stack

  • Frontend: Flutter, Provider, Material 3
  • Backend: Flask, Flask-CORS, bcrypt
  • AI/NLP: Tesseract OCR, OpenCV, SpaCy, sentence-transformers, FAISS
  • Data: MySQL

Documentation

Security Notes

  • .env, local uploads, generated build artifacts, and dataset folders are ignored via .gitignore.
  • Use non-production credentials locally and rotate secrets before deployment.

License

Licensed under MIT. See LICENSE.

About

Secure full-stack PII redaction platform that uses OCR, regex, NER, and policy-aware AI to detect and redact sensitive data from documents, with a Flutter frontend, Flask API, and MySQL-backed audit logging.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors