NandithaKale/phishshield-project

🛡️ PhishShield: Intelligent Phishing Detection System

A machine learning-based cybersecurity system that detects phishing URLs, emails, and SMS messages with real-time risk classification, explainable AI, and a browser extension.


📌 Overview

PhishShield analyzes URLs and text messages to predict whether they are Safe or Phishing, along with confidence scores and explanations.

It follows a modular architecture integrating:

  • Machine Learning models
  • Flask backend APIs
  • Web-based UI
  • Chrome extension for real-time detection

🎯 Objectives

  • Detect phishing URLs using machine learning
  • Provide Explainable AI (XAI) outputs
  • Extend detection to Email and SMS
  • Build a browser extension for real-time detection
  • Store and display detection history

🚀 Features

  • 🔍 Real-time phishing detection for URLs
  • 📩 Email and SMS phishing analysis
  • 📊 Confidence score with each prediction
  • 🧠 Explainable AI insights (why flagged)
  • 🌐 Chrome extension for live browsing protection
  • 🗂️ Detection history storage and retrieval
  • ⚡ Fast API responses using Flask backend
  • 📈 Scalable modular architecture

🧠 System Architecture

┌────────────────────────────────────────────────┐
│ Presentation Layer        | Web UI / Extension │
├────────────────────────────────────────────────┤
│ Backend Layer             | Flask REST API     │
├────────────────────────────────────────────────┤
│ Machine Learning Layer    | Feature + Model    │
├────────────────────────────────────────────────┤
│ Database Layer            | Detection Storage  │
└────────────────────────────────────────────────┘

🔄 Workflow

User Input (Web / Extension)
        |
        v
Backend API receives request
        |
        v
Feature extraction
        |
        v
ML model prediction
        |
        v
Result + Confidence + Explanation
        |
        v
Display to user + Store in database

📂 Project Structure

phishshield/
│
├── backend/
│   ├── app.py
│   ├── routes/
│   │   ├── predict_url.py
│   │   └── predict_text.py
│   ├── services/
│   │   ├── predictor.py
│   │   └── model_loader.py
│   └── config.py
│
├── ml_model/
│   ├── dataset/
│   ├── src/
│   │   ├── train_url_model.py
│   │   ├── train_text_model.py
│   │   └── feature_extractor.py
│   └── saved_model/
│       ├── url_model.pkl
│       └── text_model.pkl
│
├── frontend/
│   ├── templates/
│   │   ├── index.html
│   │   ├── result.html
│   │   └── history.html
│   └── static/
│       ├── css/style.css
│       └── js/script.js
│
├── extension/
│   ├── manifest.json
│   ├── popup.html
│   ├── popup.js
│   └── style.css
│
├── database/
│   ├── db.py
│   └── schema.sql
│
├── shared/
│   └── feature_extractor.py
│
├── requirements.txt
└── README.md

⚙️ Core Modules

🔹 Feature Extraction

Centralized in shared/feature_extractor.py

| Feature | Description |
| --- | --- |
| URL Length | Total character count |
| Dot Count | Number of "." characters in the URL |
| HTTPS Presence | Secure protocol check |
| @ Symbol | Common phishing indicator |
| Hyphen Count | Number of "-" characters in the domain |
| IP Address | Detects direct IP usage |
| Suspicious Patterns | "//", redirects |
| Digit Ratio | Proportion of digits |
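
The features in the table can be computed with the Python standard library alone. The following is a minimal sketch, not the contents of shared/feature_extractor.py: the function name, the dictionary keys, and the exact definition of each feature (for example, treating only dotted IPv4 hosts as IP addresses) are assumptions made for illustration.

```python
import re
from urllib.parse import urlparse

# Assumption: only dotted-quad IPv4 hosts count as "direct IP usage".
IPV4_RE = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")


def extract_url_features(url: str) -> dict:
    """Return one numeric value per feature listed in the table."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    digits = sum(ch.isdigit() for ch in url)
    # Everything after the scheme separator; a further "//" is suspicious.
    after_scheme = url.split("://", 1)[-1]
    return {
        "url_length": len(url),
        "dot_count": url.count("."),
        "has_https": int(parsed.scheme == "https"),
        "has_at_symbol": int("@" in url),
        "hyphen_count": host.count("-"),
        "has_ip_address": int(bool(IPV4_RE.match(host))),
        "has_double_slash": int("//" in after_scheme),
        "digit_ratio": digits / len(url) if url else 0.0,
    }
```

Returning a flat dictionary of numbers keeps the output easy to turn into a model input row, and it lets the web UI and the extension share one definition of each feature.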

🔹 Machine Learning

| Property | Details |
| --- | --- |
| Primary Model | Random Forest Classifier |
| Baseline | Logistic Regression |
| Optional | XGBoost |
| Metric | F1 Score ≥ 0.90 |
| Dataset | 50,000+ URLs |
| Split | 70% / 15% / 15% |
| Labels | 0 = Safe, 1 = Phishing |
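
To make the split and the metric concrete, here is a dependency-free sketch. It assumes the 70% / 15% / 15% split means train / validation / test and that F1 is computed for the phishing class (label 1); the function names are hypothetical. The project's training scripts would more likely use scikit-learn's own utilities for both steps.

```python
import random


def split_70_15_15(rows, seed=42):
    """Shuffle rows and split them into train, validation and test parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    # Integer arithmetic avoids floating-point rounding in the cut points.
    n_train = len(shuffled) * 70 // 100
    n_val = len(shuffled) * 15 // 100
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


def f1_score(y_true, y_pred, positive=1):
    """F1 for the positive class: 2*TP / (2*TP + FP + FN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    return 2 * tp / (2 * tp + fp + fn)
```

Under these assumptions, a model would meet the stated metric when `f1_score` on the held-out test part is at least 0.90.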

🔹 API Endpoints

URL Detection

POST /predict

Request

{
  "url": "http://example.com"
}

Response

{
  "result": "phishing",
  "confidence": 0.94,
  "reason": "Contains suspicious symbols"
}
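
A client can exercise this endpoint with the standard library. The sketch below assumes the backend is reachable at a base URL such as http://localhost:5000 (Flask's default development port) and that it expects a JSON body with a JSON content type; the helper names are illustrative and not part of the repository.

```python
import json
import urllib.request


def build_predict_request(base_url: str, url_to_check: str) -> urllib.request.Request:
    """Build the POST /predict request with the documented JSON body."""
    body = json.dumps({"url": url_to_check}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_prediction(raw: bytes) -> tuple:
    """Return (result, confidence, reason); reason may be None."""
    payload = json.loads(raw)
    return payload["result"], float(payload["confidence"]), payload.get("reason")


def check_url(base_url: str, url_to_check: str, timeout: float = 5.0) -> tuple:
    """Send the request and parse the reply (requires a running backend)."""
    request = build_predict_request(base_url, url_to_check)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_prediction(response.read())
```

With the backend running locally, `check_url("http://localhost:5000", "http://example.com")` would return the three response fields as a tuple.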

Text / Email / SMS Detection

POST /predict-text

Request

{
  "text": "Your account has been suspended"
}

Response

{
  "result": "phishing",
  "confidence": 0.88
}

🔹 Database Schema

Table: detections

| Field | Type | Description |
| --- | --- | --- |
| id | Integer | Primary Key |
| input_value | String | URL or text |
| input_type | String | url / text |
| result | String | safe / phishing |
| confidence | Float | Prediction score |
| timestamp | DateTime | Detection time |
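
As an illustration of how this table could be created and used with Python's built-in sqlite3 module, consider the sketch below. It is not the contents of database/db.py or database/schema.sql: the column constraints, the ISO 8601 text timestamp, and the helper names are assumptions.

```python
import sqlite3
from datetime import datetime, timezone

# Assumed DDL; SQLite has no DateTime type, so the timestamp is ISO 8601 text.
SCHEMA = """
CREATE TABLE IF NOT EXISTS detections (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    input_value TEXT NOT NULL,
    input_type  TEXT NOT NULL CHECK (input_type IN ('url', 'text')),
    result      TEXT NOT NULL,
    confidence  REAL NOT NULL,
    timestamp   TEXT NOT NULL
);
"""


def save_detection(conn, input_value, input_type, result, confidence):
    """Insert one detection and return its row id."""
    timestamp = datetime.now(timezone.utc).isoformat()
    cursor = conn.execute(
        "INSERT INTO detections "
        "(input_value, input_type, result, confidence, timestamp) "
        "VALUES (?, ?, ?, ?, ?)",
        (input_value, input_type, result, confidence, timestamp),
    )
    conn.commit()
    return cursor.lastrowid


def recent_detections(conn, limit=20):
    """Return the newest detections first, for a history view."""
    return conn.execute(
        "SELECT input_value, input_type, result, confidence "
        "FROM detections ORDER BY id DESC LIMIT ?",
        (limit,),
    ).fetchall()
```

Parameterized queries matter here because `input_value` holds untrusted URLs and message text.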

🔹 Browser Extension

  • Detects phishing URLs in real-time
  • Sends active tab URL to backend
  • Displays result in popup

Permissions:

  • activeTab
  • scripting

🔹 Explainable AI

  • Uses SHAP / rule-based explanations
  • Highlights important features influencing prediction
  • Integrated into UI and extension
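
The rule-based half of this approach can be sketched as a small table of checks over extracted features. Everything in the example is hypothetical: the feature names, the thresholds, and the wording of each reason are chosen for illustration, and a SHAP-based explanation would instead rank features by their contribution to the model's output.

```python
# Each rule: (feature name, test on its value, reason shown to the user).
# Feature names and thresholds are illustrative assumptions.
REASON_RULES = [
    ("has_at_symbol", lambda v: v == 1, "Contains an @ symbol"),
    ("has_ip_address", lambda v: v == 1, "Uses a raw IP address instead of a domain"),
    ("has_https", lambda v: v == 0, "Does not use HTTPS"),
    ("hyphen_count", lambda v: v >= 3, "Domain contains many hyphens"),
    ("digit_ratio", lambda v: v > 0.3, "Unusually high proportion of digits"),
]


def explain(features: dict) -> list:
    """Return the reasons whose rule fires, in rule order."""
    return [
        reason
        for name, triggered, reason in REASON_RULES
        if name in features and triggered(features[name])
    ]
```

The resulting list could populate the "reason" field of an API response, joined into one string or returned as is.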

🛡️ Error Handling

| Scenario | Behavior |
| --- | --- |
| Invalid URL | Returns error |
| Empty input | Validation error |
| Low confidence | Marked as "Uncertain" |
| Backend failure | Fallback response |
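
The four scenarios can be handled in one wrapper around the prediction call. This is a sketch under stated assumptions: the 0.60 confidence threshold, the error messages, the shape of the fallback response, and the function names are not taken from the repository, and `predict` stands for any callable that returns a (result, confidence) pair.

```python
from urllib.parse import urlparse

UNCERTAIN_BELOW = 0.60  # assumed threshold; the source does not state one


def validate_url(raw):
    """Return the trimmed URL or raise ValueError for empty or invalid input."""
    if raw is None or not raw.strip():
        raise ValueError("Input is empty")
    url = raw.strip()
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError("Invalid URL")
    return url


def label_prediction(result, confidence):
    """Replace the model's label with 'uncertain' when confidence is low."""
    return "uncertain" if confidence < UNCERTAIN_BELOW else result


def safe_predict(raw_url, predict):
    """Validate input, call predict, and map failures to fixed responses."""
    try:
        url = validate_url(raw_url)
    except ValueError as exc:
        return {"error": str(exc)}
    try:
        result, confidence = predict(url)
    except Exception:
        # Fallback response when the model or backend is unavailable.
        return {"result": "unknown", "confidence": 0.0,
                "error": "Prediction service unavailable"}
    return {"result": label_prediction(result, confidence),
            "confidence": confidence}
```

Keeping validation separate from prediction means a bad request never reaches the model, and a model failure never surfaces as an unhandled exception.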

🛠️ Tech Stack

  • Machine Learning: Python, scikit-learn, pandas, numpy, SHAP
  • Backend: Flask
  • Frontend: HTML, CSS, JavaScript
  • Extension: Chrome Extension (Manifest V3)
  • Database: SQLite
  • Deployment: Render / Railway (optional)

👥 Team

| Role | Responsibilities |
| --- | --- |
| ML Developer | Model training, feature engineering |
| Backend Developer | API development, DB integration |
| Frontend Developer | UI + Extension development |

▶️ Getting Started

1. Clone Repository

git clone https://github.com/NandithaKale/phishshield-project.git
cd phishshield-project

2. Install Dependencies

pip install -r requirements.txt

3. Train Model

python ml_model/src/train_url_model.py

4. Run Backend

python backend/app.py

5. Load Chrome Extension

  • Open chrome://extensions/
  • Enable Developer Mode
  • Click Load Unpacked
  • Select the extension/ folder

🏆 Conclusion

PhishShield is a scalable and modular phishing detection system combining machine learning, explainability, and real-time browser integration.

It is designed as a practical, real-world cybersecurity solution.
