
Galaxy Image Classifier using Vision Foundation Models

This repository contains the code and resources for a project developed during my Master's Degree in Artificial Intelligence. The project focuses on classifying galaxy morphologies using state-of-the-art Vision Foundation Models and Parameter-Efficient Fine-Tuning (PEFT) techniques like LoRA.

Read the full Project Report | View the Presentation Slides (in Spanish)

Models Explored

The repository includes implementations, tests, and training pipelines for a wide variety of vision architectures, from robust convolutional networks to the latest self-supervised transformers:

  • Convolutional Neural Networks (CNNs): ResNet, ConvNeXt, ConvNeXtV2, EfficientNet.
  • Vision Transformers (ViTs): Standard ViT, Swin Transformer, Data-efficient Image Transformers (DeiT), MaxViT.
  • State-of-the-Art (SOTA) Models: The project also includes experimental tests and fine-tuning scripts for DINOv3, leveraging its cutting-edge self-supervised feature extraction capabilities to tackle complex morphological classification.
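All of these architectures can be driven through a single entry point, which is what makes a unified pipeline across CNNs and transformers practical. As a minimal sketch (the `build_model` helper and the `num_labels` value are illustrative, not taken from `src/train.py`):

```python
from transformers import AutoModelForImageClassification


def build_model(checkpoint: str, num_labels: int):
    """Instantiate any supported backbone (ResNet, ConvNeXt, ViT, Swin, ...)
    behind one interface; the Auto class infers the concrete architecture
    from the checkpoint's config."""
    return AutoModelForImageClassification.from_pretrained(
        checkpoint,
        num_labels=num_labels,         # size of the new classification head
        ignore_mismatched_sizes=True,  # discard the original pre-training head
    )


# Example (downloads weights from the Hub):
# model = build_model("google/vit-base-patch16-224", num_labels=10)
```

Swapping architectures then amounts to changing the checkpoint string, without touching the rest of the training code.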

Dataset

The data used for this project originates from the official Galaxy Zoo 2 dataset. To streamline the training pipeline and make the data easily accessible for Hugging Face's datasets library, I have processed and published the dataset directly to the Hugging Face Hub.

Baseline and References

The experimental setup, approach, and baseline for this project were heavily inspired by prior work; see the Project Report for the full references.

Key Features

  • Unified training pipeline for multiple architectures (AutoModelForImageClassification).
  • Native integration with Hugging Face datasets and models.
  • Parameter-Efficient Fine-Tuning (LoRA / DoRA) support for large vision models.
  • Experiment tracking and visualization using Weights & Biases (WandB).

Repository Structure

.
├── assets/                  # Figures, plots, and architecture diagrams (PDF format)
├── data/                    # CSV files containing image IDs and labels
├── src/                     
│   ├── train.py             # Unified training script for all models
│   ├── create_dataset.py    # Script to convert raw images to HF Dataset
│   └── notebooks/           # Jupyter notebooks for data verification
├── Project_Report.pdf       # Detailed Project Report
├── Slides.pdf               # Presentation slides
├── requirements.txt         # Project dependencies
└── README.md

Installation

Clone the repository and install the required dependencies:

git clone https://github.com/JordiCan/galaxy_image_classifier.git
cd galaxy_image_classifier
pip install -r requirements.txt

Usage

1. Dataset Preparation

If you want to create the Hugging Face dataset locally from raw images and CSV splits:

python src/create_dataset.py \
    --train_csv data/gz2_train.csv \
    --val_csv data/gz2_valid.csv \
    --test_csv data/gz2_test.csv \
    --img_dirs /path/to/your/images \
    --output_dir ./data/hf_dataset

(Alternatively, you can skip this step and let the training script download the dataset directly from the Hugging Face Hub.)

2. Model Training

You can train any supported model using the unified train.py script. Example for training a Vision Transformer (ViT) using LoRA:

python src/train.py \
    --model_type vit \
    --model_checkpoint google/vit-base-patch16-224 \
    --dataset_name mrJordi0/galaxy-zoo-dataset \
    --peft_type lora \
    --batch_size 32 \
    --epochs 10 \
    --use_wandb

Results

Note: Detailed visualizations of the performance of the different architectures evaluated during the project are available in the assets/ directory.

(For a comprehensive analysis, please refer to the Project Report.)

Acknowledgments

Developed as a project for the Master's Degree in Artificial Intelligence.
