PhishShield is a machine learning-based cybersecurity system that detects phishing URLs, emails, and SMS messages. It provides real-time risk classification, explainable AI output, and a browser extension.
PhishShield analyzes URLs and text messages to predict whether they are Safe or Phishing, along with confidence scores and explanations.
It follows a modular architecture integrating:
- Machine Learning models
- Flask backend APIs
- Web-based UI
- Chrome extension for real-time detection
- Detect phishing URLs using machine learning
- Provide Explainable AI (XAI) outputs
- Extend detection to Email and SMS
- Build a browser extension for real-time detection
- Store and display detection history
- Real-time phishing detection for URLs
- Email and SMS phishing analysis
- Confidence score with each prediction
- Explainable AI insights (why flagged)
- Chrome extension for live browsing protection
- Detection history storage and retrieval
- Fast API responses using Flask backend
- Scalable modular architecture

| Layer | Components |
|---|---|
| Presentation Layer | Web UI / Extension |
| Backend Layer | Flask REST API |
| Machine Learning Layer | Feature + Model |
| Database Layer | Detection Storage |

User Input (Web / Extension)
|
v
Backend API receives request
|
v
Feature extraction
|
v
ML model prediction
|
v
Result + Confidence + Explanation
|
v
Display to user + Store in database

    phishshield/
    │
    ├── backend/
    │   ├── app.py
    │   ├── routes/
    │   │   ├── predict_url.py
    │   │   └── predict_text.py
    │   ├── services/
    │   │   ├── predictor.py
    │   │   └── model_loader.py
    │   └── config.py
    │
    ├── ml_model/
    │   ├── dataset/
    │   ├── src/
    │   │   ├── train_url_model.py
    │   │   ├── train_text_model.py
    │   │   └── feature_extractor.py
    │   └── saved_model/
    │       ├── url_model.pkl
    │       └── text_model.pkl
    │
    ├── frontend/
    │   ├── templates/
    │   │   ├── index.html
    │   │   ├── result.html
    │   │   └── history.html
    │   └── static/
    │       ├── css/style.css
    │       └── js/script.js
    │
    ├── extension/
    │   ├── manifest.json
    │   ├── popup.html
    │   ├── popup.js
    │   └── style.css
    │
    ├── database/
    │   ├── db.py
    │   └── schema.sql
    │
    ├── shared/
    │   └── feature_extractor.py
    │
    ├── requirements.txt
    └── README.md

URL feature extraction is centralized in shared/feature_extractor.py.

| Feature | Description |
|---|---|
| URL Length | Total character count |
| Dot Count | Number of . in URL |
| HTTPS Presence | Secure protocol check |
| @ Symbol | Common phishing indicator |
| Hyphen Count | Number of - in domain |
| IP Address | Detects direct IP usage |
| Suspicious Patterns | //, redirects |
| Digit Ratio | Proportion of digits |
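
The features in the table above could be computed along these lines. This is a minimal sketch, not the actual contents of shared/feature_extractor.py: the function name `extract_url_features`, the dictionary keys, the IPv4-only check, and treating a second `//` after the scheme as the only "suspicious pattern" are all assumptions made for illustration.

```python
import re
from urllib.parse import urlparse

# Assumed definition of "IP Address": the host is a dotted-quad IPv4 literal.
_IPV4_RE = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")


def extract_url_features(url: str) -> dict:
    """Return one value per row of the feature table (hypothetical key names)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    # Everything after "scheme://"; another "//" in this part hints at a redirect.
    after_scheme = url.split("://", 1)[-1]
    digits = sum(ch.isdigit() for ch in url)
    return {
        "url_length": len(url),
        "dot_count": url.count("."),
        "has_https": int(parsed.scheme == "https"),
        "has_at_symbol": int("@" in url),
        "hyphen_count": host.count("-"),  # hyphens in the domain only
        "has_ip_address": int(bool(_IPV4_RE.match(host))),
        "has_double_slash": int("//" in after_scheme),
        "digit_ratio": digits / len(url) if url else 0.0,
    }


if __name__ == "__main__":
    for name, value in extract_url_features("https://my-bank.example.com/account").items():
        print(f"{name}: {value}")
```

Returning a flat dictionary keeps the same function usable from both the training scripts and the backend, which is the point of centralizing it.
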
| Property | Details |
|---|---|
| Primary Model | Random Forest Classifier |
| Baseline | Logistic Regression |
| Optional | XGBoost |
| Metric | F1 Score ≥ 0.90 |
| Dataset | 50,000+ URLs |
| Split | 70% / 15% / 15% |
| Labels | 0 = Safe, 1 = Phishing |
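
To make the split and the target metric concrete, here is a standard-library sketch of a 70% / 15% / 15% split and an F1 computation for the phishing class (label 1). It assumes the three parts are train, validation, and test, in that order, which the table does not state. The helper names `split_dataset` and `f1_score` are illustrative; the training scripts would more likely rely on scikit-learn's `train_test_split` and `sklearn.metrics.f1_score`.

```python
import random


def split_dataset(rows, seed=42):
    """Shuffle and split rows into 70% / 15% / 15% parts (assumed train / val / test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_train = len(shuffled) * 70 // 100
    n_val = len(shuffled) * 15 // 100
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # remainder, so no row is dropped
    return train, val, test


def f1_score(y_true, y_pred, positive=1):
    """F1 for the positive class; 1 = Phishing per the label convention above."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    train, val, test = split_dataset(range(1000))
    print(len(train), len(val), len(test))
```

Under this reading, the F1 ≥ 0.90 target would be checked on the held-out test part, with the validation part reserved for choosing between the Random Forest, Logistic Regression, and XGBoost candidates.
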

POST /predict

Request

    {
      "url": "http://example.com"
    }

Response

    {
      "result": "phishing",
      "confidence": 0.94,
      "reason": "Contains suspicious symbols"
    }

POST /predict-text

Request

    {
      "text": "Your account has been suspended"
    }

Response

    {
      "result": "phishing",
      "confidence": 0.88
    }

Table: detections

| Field | Type | Description |
|---|---|---|
| id | Integer | Primary Key |
| input_value | String | URL or text |
| input_type | String | url / text |
| result | String | safe / phishing |
| confidence | Float | Prediction score |
| timestamp | DateTime | Detection time |
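
A possible SQLite mapping of the detections table, using Python's built-in `sqlite3` module. This is a sketch under assumptions: the column types (TEXT, REAL), the ISO 8601 timestamp format, and the helper names `init_db`, `save_detection`, and `recent_detections` are illustrative and may differ from database/schema.sql and database/db.py.

```python
import sqlite3
from datetime import datetime, timezone

# Assumed SQLite types for the fields listed in the table above.
SCHEMA = """
CREATE TABLE IF NOT EXISTS detections (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    input_value TEXT NOT NULL,
    input_type  TEXT NOT NULL CHECK (input_type IN ('url', 'text')),
    result      TEXT NOT NULL,
    confidence  REAL NOT NULL,
    timestamp   TEXT NOT NULL
)
"""


def init_db(conn):
    """Create the detections table if it does not exist yet."""
    conn.execute(SCHEMA)
    conn.commit()


def save_detection(conn, input_value, input_type, result, confidence):
    """Insert one detection and return its id."""
    timestamp = datetime.now(timezone.utc).isoformat()
    cur = conn.execute(
        "INSERT INTO detections (input_value, input_type, result, confidence, timestamp) "
        "VALUES (?, ?, ?, ?, ?)",
        (input_value, input_type, result, confidence, timestamp),
    )
    conn.commit()
    return cur.lastrowid


def recent_detections(conn, limit=20):
    """Return the newest detections first, for the history page."""
    return conn.execute(
        "SELECT id, input_value, input_type, result, confidence, timestamp "
        "FROM detections ORDER BY id DESC LIMIT ?",
        (limit,),
    ).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    init_db(conn)
    save_detection(conn, "http://example.com", "url", "phishing", 0.94)
    print(recent_detections(conn))
```

Parameterized queries (`?` placeholders) matter here because `input_value` holds untrusted URLs and message text.
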
- Detects phishing URLs in real-time
- Sends active tab URL to backend
- Displays result in popup

Permissions:
- activeTab
- scripting
- Uses SHAP / rule-based explanations
- Highlights important features influencing prediction
- Integrated into UI and extension
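
The rule-based side of the explanations could look roughly like the sketch below, which turns extracted URL features into human-readable reasons. The feature keys, the thresholds, and the wording of each reason are hypothetical and only illustrate the idea; SHAP-based explanations would instead be derived from the trained model.

```python
# Hypothetical thresholds; in practice these would be tuned against the dataset.
MAX_URL_LENGTH = 75
MAX_DOT_COUNT = 4
MAX_HYPHEN_COUNT = 2
MAX_DIGIT_RATIO = 0.3


def explain_url(features):
    """Return a reason string for every rule that the URL features trigger."""
    rules = [
        (features.get("has_at_symbol", 0) == 1, "Contains an @ symbol"),
        (features.get("has_ip_address", 0) == 1, "Uses an IP address instead of a domain name"),
        (features.get("has_https", 1) == 0, "Does not use HTTPS"),
        (features.get("has_double_slash", 0) == 1, "Contains // after the scheme (possible redirect)"),
        (features.get("url_length", 0) > MAX_URL_LENGTH, "Unusually long URL"),
        (features.get("dot_count", 0) > MAX_DOT_COUNT, "Many dots (possible subdomain abuse)"),
        (features.get("hyphen_count", 0) > MAX_HYPHEN_COUNT, "Many hyphens in the domain"),
        (features.get("digit_ratio", 0.0) > MAX_DIGIT_RATIO, "High proportion of digits"),
    ]
    return [reason for triggered, reason in rules if triggered]


if __name__ == "__main__":
    example = {"has_at_symbol": 1, "has_https": 0, "url_length": 20}
    print(explain_url(example))
```

The returned list can be joined into the `reason` field of the API response or rendered as bullet points in the popup.
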
| Scenario | Behavior |
|---|---|
| Invalid URL | Returns error |
| Empty input | Validation error |
| Low confidence | Marked as "Uncertain" |
| Backend failure | Fallback response |
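
The four scenarios above can be sketched as a framework-independent handler for the URL endpoint. Everything specific here is an assumption for illustration: the name `handle_predict`, the `predict_fn` callable that returns a `(label, confidence)` pair, the 0.60 uncertainty threshold, the HTTP status codes, and the shape of the error and fallback bodies.

```python
from urllib.parse import urlparse

UNCERTAIN_THRESHOLD = 0.60  # hypothetical cut-off for "Uncertain"


def handle_predict(payload, predict_fn):
    """Return (response_body, http_status) for a URL prediction request."""
    url = payload.get("url") if isinstance(payload, dict) else None
    # Empty input: validation error.
    if not isinstance(url, str) or not url.strip():
        return {"error": "Input must not be empty"}, 400
    url = url.strip()
    # Invalid URL: require an http(s) scheme and a host.
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        return {"error": "Invalid URL"}, 400
    try:
        label, confidence = predict_fn(url)
    except Exception:
        # Backend failure: return a fallback response instead of crashing.
        fallback = {
            "result": "unknown",
            "confidence": 0.0,
            "reason": "Prediction service unavailable",
        }
        return fallback, 503
    # Low confidence: do not commit to safe or phishing.
    if confidence < UNCERTAIN_THRESHOLD:
        label = "uncertain"
    return {"result": label, "confidence": confidence}, 200


if __name__ == "__main__":
    print(handle_predict({"url": "http://example.com"}, lambda u: ("phishing", 0.94)))
```

In a Flask route, the returned pair could be passed to `jsonify` together with the status code; keeping the logic in a plain function makes each scenario easy to unit test.
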
- Machine Learning: Python, scikit-learn, pandas, numpy, SHAP
- Backend: Flask
- Frontend: HTML, CSS, JavaScript
- Extension: Chrome Extension (Manifest V3)
- Database: SQLite
- Deployment: Render / Railway (optional)
| Role | Responsibilities |
|---|---|
| ML Developer | Model training, feature engineering |
| Backend Developer | API development, DB integration |
| Frontend Developer | UI + Extension development |

    git clone https://github.com/your-username/phishshield.git
    cd phishshield

    pip install -r requirements.txt

    python ml_model/src/train_url_model.py

    python backend/app.py

- Open chrome://extensions/
- Enable Developer Mode
- Click Load Unpacked
- Select the extension/ folder

PhishShield is a scalable and modular phishing detection system combining machine learning, explainability, and real-time browser integration.
It is designed as a practical, real-world cybersecurity solution.