👁️‍🗨️ STEG-Detector detects hidden information in images, audio, and video files. It uses statistical analysis, machine learning, and multimedia processing techniques to identify steganography. Features include multi-format support, metadata extraction, and a user-friendly GUI built with Tkinter.

0warn/STEG-Detector

<[STEG-DETECTOR]>

The Future of Steganography Detection: Machine Learning-Powered, Wavelet-Enhanced, GUI-Driven


🚀 Recent Major Upgrade!

This project has undergone a significant overhaul for improved maintainability, performance, accuracy, and robust error handling. The core functionalities have been modularized, and several critical issues addressed.

🌌 Vision & Inspiration

Steganography is not just about hiding information; it's about the invisible war in cyberspace.
Traditional security tools often fail to detect hidden payloads inside images, videos, or signals.
That's why Steg-Detector Pro was born, a futuristic, AI-driven, research-grade framework that blends:

  • 🧠 Machine Learning for smart detection
  • 🌊 Wavelet Transform (PyWavelets) for frequency feature extraction
  • 🖼️ Computer Vision (OpenCV) for image processing
  • 🖥️ Beautiful GUI (Tkinter) for easy use
  • 📊 Dataset Training so it keeps evolving with your data

⚡ Key Changes & Improvements (v3.0)

  • Modular Architecture: The entire codebase has been refactored into a src/ directory with logical submodules (src/logic, src/gui, src/utils). This greatly enhances maintainability and extensibility.
  • SSIM Bug Fix: Corrected a critical bug in the Structural Similarity Index (SSIM) feature extraction for images. Previously, SSIM was calculated by comparing an image to itself, which always yields 1.0 and carries no information. It now compares the image to a slightly blurred version, so the score reflects how much fine detail (where steganographic artifacts tend to sit) the blur removes.
  • Enhanced Audio Feature Extraction:
    • MFCCs Integration: Added Mel-frequency cepstral coefficients (MFCCs) using librosa, significantly improving the accuracy of audio steganography detection.
    • Performance Optimization: Optimized audio processing by calculating rfftfreq outside loops, reducing computational overhead.
  • Robust Dependency Handling: Implemented conditional imports for optional libraries (librosa, pywt, skimage, pydub). The application now logs warnings and gracefully falls back to alternative methods or defaults if these libraries are not installed, preventing crashes.
  • Centralized Configuration: All application-wide constants are now managed in src/config.py, making it easier to customize settings.
  • Improved Error Handling: Replaced generic except Exception blocks with more specific exception types across the codebase for clearer error reporting and more robust application behavior.
  • Comprehensive Unit Tests: Added a new tests/ directory with unit tests for core functionalities, ensuring code reliability and preventing regressions.
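
The SSIM fix listed above can be illustrated with a small, dependency-free sketch. It is not the project's implementation (which this README does not show): it uses a simplified single-window "global" SSIM and a 3x3 box blur as stand-ins, and the function names `box_blur` and `global_ssim` are hypothetical. In practice a windowed implementation such as `skimage.metrics.structural_similarity` would be the usual choice.

```python
from statistics import fmean


def box_blur(img):
    """3x3 mean filter with edge clamping on a 2-D list of grayscale values."""
    h, w = len(img), len(img[0])
    blurred = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [
                img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            ]
            row.append(fmean(window))
        blurred.append(row)
    return blurred


def global_ssim(a, b, data_range=255.0):
    """SSIM computed over the whole image as one window (a simplification)."""
    xs = [v for row in a for v in row]
    ys = [v for row in b for v in row]
    mx, my = fmean(xs), fmean(ys)
    vx = fmean((x - mx) ** 2 for x in xs)
    vy = fmean((y - my) ** 2 for y in ys)
    cov = fmean((x - mx) * (y - my) for x, y in zip(xs, ys))
    c1, c2 = (0.01 * data_range) ** 2, (0.03 * data_range) ** 2
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (vx + vy + c2)
    )


# Two toy 16x16 images: a high-frequency checkerboard and a smooth gradient.
checker = [[255 if (x + y) % 2 else 0 for x in range(16)] for y in range(16)]
gradient = [[x * 16 for x in range(16)] for _ in range(16)]

print("self-comparison (old behaviour):", global_ssim(checker, checker))
print("checkerboard vs blurred:", global_ssim(checker, box_blur(checker)))
print("gradient vs blurred:", global_ssim(gradient, box_blur(gradient)))
```

Comparing an image with itself gives the same score for every input, while the blurred comparison separates images with a lot of fine detail from smooth ones, which is what makes it usable as a feature.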

🛠️ Installation

  1. Clone the repository:

    git clone https://github.com/0warn/STEG-Detector.git
    cd STEG-Detector
  2. Install dependencies:
    Ensure you have Python 3.8+ and pip installed. Then, install all required libraries:

    pip install -r requirement.txt

    ⚡ Note: Some optional dependencies like librosa and pywt (PyWavelets) might require system-level dependencies depending on your OS. The application will function without them, but certain features (MFCCs, Wavelet features) will be disabled.
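
To check which optional features would be available in a given environment, something like the following sketch may help. It mirrors the conditional-import behaviour described in the note above, but the helper names (`optional_import`, `available_features`), the logger name, and the feature-to-module mapping are assumptions made for illustration, not the project's actual code.

```python
import importlib
import logging

logger = logging.getLogger("steg_detector.deps")  # hypothetical logger name

# Feature label -> importable module name (mapping assumed from the note above).
OPTIONAL_MODULES = {
    "MFCC features": "librosa",
    "Wavelet features": "pywt",
}


def optional_import(module_name):
    """Return the imported module, or None (after logging a warning) if it is missing."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        logger.warning(
            "Optional dependency %r is not installed; related features are disabled.",
            module_name,
        )
        return None


def available_features(modules=None):
    """Map each feature label to True/False depending on whether its module imports."""
    modules = OPTIONAL_MODULES if modules is None else modules
    return {
        feature: optional_import(name) is not None
        for feature, name in modules.items()
    }
```

Calling `available_features()` before launching the GUI gives a quick overview of what a fresh install can and cannot do.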


🚀 Usage

  1. Run the detector:
    python3 main.py

⚠️ Crucial First Step: Retrain Your Model!

Because scikit-learn's internal model format has changed between versions, model files (steg_detector_model.joblib) trained with an earlier release of STEG-Detector are incompatible with this updated version.

Upon first launch (or if an incompatible model exists), the application will start, but detection features will not work until a new model is trained and saved.
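
The startup behaviour described above (start anyway, but disable detection until a usable model exists) could look roughly like this sketch. `load_model_or_none` is a hypothetical helper, not the project's function. The only assumptions are that the model is loaded with `joblib.load` and that an incompatible file raises `ValueError`, as in the error listed later in this README.

```python
from pathlib import Path


def load_model_or_none(path="steg_detector_model.joblib", loader=None):
    """Return (model, None) on success or (None, reason) if no usable model exists."""
    model_path = Path(path)
    if not model_path.is_file():
        return None, "Model file not found; train and save a model first."
    if loader is None:
        import joblib  # imported lazily so the helper itself has no hard dependency

        loader = joblib.load
    try:
        return loader(model_path), None
    except ValueError as exc:
        # e.g. "node array from the pickle has an incompatible dtype"
        return None, f"Incompatible model file; retrain required ({exc})."
```

A caller can then keep the GUI running and show the returned reason instead of crashing when the model is missing or stale.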

Steps to Retrain Your Model:

  1. Launch the application (python3 main.py).
  2. Navigate to the "Train / Evaluate" tab in the GUI.
  3. Select your Dataset Source: Choose either "Folders (clean/stego)" or "CSV (path,label)".
    • If "Folders", browse to your root dataset directory (e.g., a folder containing clean/ and stego/ subfolders).
    • If "CSV", browse to your CSV file containing paths and labels.
  4. Select Algorithm: Choose rf (RandomForest) or xgb (XGBoost, requires xgboost package installed).
  5. Click the "Train & Evaluate" button.
  6. Once training is complete and evaluation results are displayed, click the "Save Model" button. This will save a new, compatible model file (steg_detector_model.joblib) in your project's root directory.

After saving the new model, the application's detection capabilities will be fully functional.

GUI Options

  • 🔍 Detect → Upload a media file (image, audio, or video) to check for hidden data.
  • 📊 Train / Evaluate → Train new ML models with your datasets and evaluate their performance.
  • ⚙️ Dataset → Access tools for generating synthetic stego datasets (useful for bootstrapping training data).
  • ℹ️ About / Updates → View application information and check for updates.
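
This README does not say which embedding method the Dataset tab uses to produce synthetic stego samples. As an illustration only, the sketch below assumes plain LSB replacement, a common baseline, applied to a raw buffer of 8-bit samples (for example, flattened pixel values). The function names are hypothetical.

```python
def embed_lsb(samples: bytes, payload: bytes) -> bytes:
    """Write the payload bits (MSB first) into the least significant bit of each sample."""
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    if len(bits) > len(samples):
        raise ValueError("payload does not fit into the cover data")
    stego = bytearray(samples)
    for index, bit in enumerate(bits):
        stego[index] = (stego[index] & 0xFE) | bit
    return bytes(stego)


def extract_lsb(samples: bytes, n_bytes: int) -> bytes:
    """Read n_bytes back from the least significant bits of the first samples."""
    if n_bytes * 8 > len(samples):
        raise ValueError("not enough samples to extract that many bytes")
    payload = bytearray()
    for i in range(n_bytes):
        value = 0
        for sample in samples[i * 8:(i + 1) * 8]:
            value = (value << 1) | (sample & 1)
        payload.append(value)
    return bytes(payload)
```

Each cover sample changes by at most 1, which is why such payloads are invisible to the eye and why statistical or learned detectors are needed to find them.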

🧩 Dataset Training

  1. Prepare your dataset:
    • For "Folders" mode: Create a root directory containing two subfolders:
      • clean/ → Contains original, clean media files.
      • stego/ → Contains steganographically altered media files.
    • For "CSV" mode: Create a CSV file where each row contains path_to_media_file,label (e.g., image1.png,0 for clean, stego_image1.png,1 for stego).
  2. Use the "Train / Evaluate" tab in the GUI, select your source, and follow the retraining steps mentioned above.
  3. The trained model is saved as steg_detector_model.joblib for future detection.
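
Both dataset layouts boil down to a list of (path, label) pairs, with 0 for clean and 1 for stego. The sketch below shows one way to read them using only the standard library. The function names are hypothetical, and the subfolder names are parameters because the project's loader is not shown here.

```python
import csv
from pathlib import Path


def samples_from_folders(root, clean_dir="clean", stego_dir="stego"):
    """Collect (path, label) pairs from <root>/<clean_dir> (0) and <root>/<stego_dir> (1)."""
    root = Path(root)
    samples = []
    for label, name in ((0, clean_dir), (1, stego_dir)):
        for path in sorted((root / name).iterdir()):
            if path.is_file():
                samples.append((str(path), label))
    return samples


def samples_from_csv(csv_path):
    """Collect (path, label) pairs from rows of the form path_to_media_file,label."""
    samples = []
    with open(csv_path, newline="") as handle:
        for row in csv.reader(handle):
            if len(row) < 2:
                continue  # skip blank or malformed rows
            path, label = row[0].strip(), row[1].strip()
            if label not in ("0", "1"):
                continue  # skip a header row or unknown labels
            samples.append((path, int(label)))
    return samples
```

Running either function on a dataset before training is a cheap way to confirm that both classes are present and roughly balanced.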

🧑‍💻 Example

# Assuming you have trained and saved a model using the GUI.
# If you wish to use the detector programmatically:

from src.logic.detector import Detector

# Initialize the detector (it will attempt to load the model saved as steg_detector_model.joblib)
detector = Detector()

# Example detection:
result = detector.detect("path/to/your/sample_image.png")

if result.get('ok'):
    print(f"Detection Result for {result['path']}:")
    print(f"  Type: {result['type']}")
    print(f"  Heuristics Flag: {result['heuristics'].get('flag')}")
    if result.get('ml_prediction') is not None:
        prediction_label = "Stego Detected ✅" if result['ml_prediction'] == 1 else "Clean ❌"
        print(f"  ML Prediction: {prediction_label}")
    else:
        print("  ML Prediction: Model not loaded or available.")
else:
    print(f"Error detecting steganography: {result.get('error')}")
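
For scanning many files, the same result dictionary can be consumed in a loop. The helper below is a hypothetical sketch that relies only on the keys used in the example above (`ok`, `ml_prediction`, `error`). The detection function is passed in, so it does not depend on how `Detector` is constructed.

```python
def scan_paths(paths, detect):
    """Run detect(path) for each path; return (flagged_paths, errors_by_path)."""
    flagged, errors = [], {}
    for path in paths:
        result = detect(path)
        if not result.get("ok"):
            errors[path] = result.get("error")
        elif result.get("ml_prediction") == 1:
            flagged.append(path)
    return flagged, errors
```

With the detector from the example, this would be called as `scan_paths(list_of_files, detector.detect)`.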

🩺 Error Handling & Fixes

| Error / Warning | Cause | Solution |
| --- | --- | --- |
| ModuleNotFoundError: No module named 'pywt' / librosa / skimage / pydub | Optional dependency not installed. | Run pip install -r requirement.txt. If issues persist, note that librosa and pywt may require system-level dependencies. The application will still run, but the related features may be unavailable. |
| ValueError: node array from the pickle has an incompatible dtype (on startup or load) | Model file (steg_detector_model.joblib) was trained with an older scikit-learn version. | Retrain the model using the "Train / Evaluate" tab in the GUI, then click "Save Model". The new model will be compatible with your current environment. |
| GUI crashes on Linux (_tkinter error) | Tkinter development files not installed. | For Debian/Ubuntu, run sudo apt install python3-tk. For other systems, consult your package manager. |
| Model not found (no detection results) | No model has been trained or saved yet, or the loaded model is incompatible. | Use the "Train / Evaluate" tab in the GUI to train a new model with your dataset, then click "Save Model". |

Future Roadmap

  • 🎥 Video steganography detection (initial support added)
  • 🔊 Audio steganography analysis (initial support and MFCC features added)
  • 🔮 Deep Learning (CNN, ResNet) integration
  • 🕸️ Web-based dashboard
  • 📡 Real-time stego-sniffer for network traffic

🧑‍🚀 Author

👨‍💻 Developed with dedication by CODE 💡 "Because hidden data should never remain invisible."

