Detect deepfakes in images and videos using a fine-tuned Vision Transformer (ViT) model with ~92% accuracy.
- Image detection: JPG, PNG, WEBP, BMP
- Video detection: MP4, AVI, MOV, MKV, WEBM (samples up to 20 frames)
- ViT-powered: `prithivMLmods/Deep-Fake-Detector-v2-Model` (~92% accuracy on 56k test images)
- Fast REST API: FastAPI + Uvicorn with auto-generated Swagger UI at `/docs`
- Polished frontend: drag-and-drop SPA, confidence ring, live activity log; no build step required
- Frame averaging: video predictions average softmax probabilities across all sampled frames for robustness
- Cached model weights: ~330 MB downloaded once, then loaded from `~/.cache/huggingface/` on every run
git clone https://github.com/YAXH64/Deepsentry---Deepfake-Detector.git
cd Deepsentry---Deepfake-Detector
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -r requirements.txt
python main.py

First run only: the model weights (~330 MB) are downloaded automatically and cached. This takes about a minute, depending on your connection. All subsequent starts are near-instant.
Open `index.html` directly in your browser; no web server is needed.

- Backend API: http://localhost:8000
- Swagger UI: http://localhost:8000/docs
deepsentry/
├── index.html         # Frontend SPA: drag-and-drop UI, results panel
├── main.py            # FastAPI app: routes, CORS, response schema
├── model.py           # ViT model loader + inference engine
├── processor.py       # File decoding, video frame extraction
└── requirements.txt   # Python dependencies
All endpoints return JSON. Full interactive docs at http://localhost:8000/docs.
Health check.
{ "status": "ok" }

Upload an image and get a deepfake prediction.
Request: multipart/form-data
| Field | Type | Required | Notes |
|---|---|---|---|
| `file` | File | Yes | `.jpg` `.jpeg` `.png` `.webp` `.bmp` |
Response:
{
"label": "Real",
"confidence": 94.72,
"media_type": "image",
"elapsed_ms": 312.5
}

Upload a video and get a deepfake prediction. Up to 20 evenly spaced frames are sampled and their predictions are averaged.
Request: multipart/form-data
| Field | Type | Required | Notes |
|---|---|---|---|
| `file` | File | Yes | `.mp4` `.avi` `.mov` `.mkv` `.webm` |
Response:
{
"label": "Deepfake",
"confidence": 87.13,
"media_type": "video",
"elapsed_ms": 4821.0
}

| Code | Cause |
|---|---|
| 400 | No file, unsupported format, corrupted file, or no extractable frames |
| 422 | FastAPI validation error (missing required field) |
| 500 | Unhandled server error during inference |
DeepSentry uses a 4-step analysis pipeline:
┌─────────────────────┐
│ 1. Facial Geometry  │  Analyzes 3D facial structure and landmark positions
├─────────────────────┤
│ 2. Temporal Check   │  Frame-to-frame coherence (videos only)
├─────────────────────┤
│ 3. Artifact Scan    │  GAN/diffusion model pixel-level fingerprints
├─────────────────────┤
│ 4. ViT Inference    │  Final classification with confidence score
└─────────────────────┘
Under the hood, steps 1-3 are surfaced in the UI as animated checks. Step 4 is the actual model inference:
- Uploaded file → decoded by OpenCV → converted to PIL Image
- `ViTImageProcessor` resizes to 224×224 and normalizes pixel values
- ViT forward pass → softmax probabilities over `["Realism", "Deepfake"]`
- For video: probabilities are averaged across all sampled frames
- Highest-probability class returned as the label with confidence %
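
The last three steps can be sketched in plain Python. This is a minimal illustration, not the code in `model.py`: the function names are hypothetical, the logits are passed in as plain lists, and rounding the confidence to two decimals is an assumption based on the example responses.

```python
import math

LABELS = ["Realism", "Deepfake"]  # class order as listed in the pipeline steps
DISPLAY = {"Realism": "Real", "Deepfake": "Deepfake"}  # label shown to the user


def softmax(logits):
    """Numerically stable softmax over one frame's logits."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(frame_logits):
    """Average per-frame softmax probabilities, then pick the top class.

    An image is simply the one-frame case.
    """
    if not frame_logits:
        raise ValueError("no frames to aggregate")
    probs = [softmax(logits) for logits in frame_logits]
    mean = [sum(p[i] for p in probs) / len(probs) for i in range(len(LABELS))]
    best = max(range(len(LABELS)), key=mean.__getitem__)
    # Assumption: confidence is the winning probability as a percentage.
    return DISPLAY[LABELS[best]], round(mean[best] * 100, 2)
```

Averaging probabilities rather than taking a per-frame majority vote lets a few strongly confident frames outweigh many borderline ones, which is one plausible reading of the "robustness" claim.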
| Property | Value |
|---|---|
| Model ID | prithivMLmods/Deep-Fake-Detector-v2-Model |
| Architecture | ViT (vit-base-patch16-224-in21k fine-tuned) |
| Accuracy | ~92% on 56,001 test images |
| Input size | 224 × 224 px |
| Labels | Realism → Real · Deepfake → Deepfake |
| Device | CUDA (if available) · CPU fallback |
| Constant | File | Default | Description |
|---|---|---|---|
| `MAX_FRAMES` | `processor.py` | `20` | Max video frames sampled per upload |
| `MODEL_ID` | `model.py` | `prithivMLmods/...` | Hugging Face model identifier |
| `DEVICE` | `model.py` | auto | `cuda` if available, else `cpu` |
| `host` | `main.py` | `0.0.0.0` | Uvicorn bind address |
| `port` | `main.py` | `8000` | Uvicorn port |
| `API_BASE` | `index.html` | `http://localhost:8000` | Backend URL used by the frontend |
Deploying remotely? Update `API_BASE` in the `<script>` block of `index.html` to point to your server's address.
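
To make the effect of `MAX_FRAMES` concrete, here is one way evenly spaced frame indices could be chosen. The helper `sample_indices` is hypothetical and the midpoint strategy is an assumption; the actual logic in `processor.py` may differ.

```python
MAX_FRAMES = 20  # default from the configuration table


def sample_indices(total_frames: int, max_frames: int = MAX_FRAMES) -> list[int]:
    """Return up to `max_frames` evenly spaced frame indices in [0, total_frames)."""
    if total_frames <= 0 or max_frames <= 0:
        return []
    count = min(total_frames, max_frames)
    # Split the video into `count` equal segments and take each segment's midpoint.
    # Integer arithmetic avoids floating-point rounding surprises.
    return [((2 * i + 1) * total_frames) // (2 * count) for i in range(count)]
```

With OpenCV, each index could then be read by seeking with `cap.set(cv2.CAP_PROP_POS_FRAMES, idx)` before calling `cap.read()`. Raising `MAX_FRAMES` trades inference time for a denser sample.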
fastapi>=0.100.0
uvicorn[standard]>=0.23.0
python-multipart>=0.0.7
torch>=2.0.0
transformers>=4.35.0
Pillow>=9.0.0
opencv-python-headless>=4.8.0
numpy>=1.24.0
pydantic>=2.0.0
- No file size limit: large video uploads are read fully into RAM. Add a size guard before production use.
- CORS is open: `allow_origins=["*"]` is set for local dev. Restrict this before deploying publicly.
- Single file at a time: batch upload is not currently supported.
- Model accuracy: ~92% means roughly 1 in 12 predictions may be incorrect. Do not use as a sole source of truth.
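
For the first limitation, a size guard might look like the sketch below. It is framework-agnostic and not part of the project: `MAX_UPLOAD_BYTES`, `UploadTooLarge`, and `read_limited` are hypothetical names, and the 50 MB cap is an arbitrary assumption.

```python
import io

MAX_UPLOAD_BYTES = 50 * 1024 * 1024  # assumption: 50 MB cap, tune per deployment


class UploadTooLarge(Exception):
    """Raised when an upload exceeds the configured size limit."""


def read_limited(stream: io.IOBase, limit: int = MAX_UPLOAD_BYTES,
                 chunk_size: int = 1024 * 1024) -> bytes:
    """Read a binary stream in chunks, aborting once the total exceeds `limit`."""
    buffer = bytearray()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return bytes(buffer)
        buffer.extend(chunk)
        if len(buffer) > limit:
            # Stop early so at most limit + chunk_size bytes are ever held in memory.
            raise UploadTooLarge(f"upload exceeds {limit} bytes")
```

In a FastAPI handler, this could be called on the underlying file object of the upload (for example `read_limited(upload.file)`), with `UploadTooLarge` mapped to an HTTP 413 response.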
Pull requests are welcome! For major changes, please open an issue first to discuss what you'd like to change.
- Fork the repo
- Create your branch (`git checkout -b feature/your-feature`)
- Commit your changes (`git commit -m 'Add your feature'`)
- Push to the branch (`git push origin feature/your-feature`)
- Open a Pull Request
This project is licensed under the MIT License. See LICENSE for details.