This repository contains the code and resources for a project developed during my Master's Degree in Artificial Intelligence. The project focuses on classifying galaxy morphologies using state-of-the-art Vision Foundation Models and Parameter-Efficient Fine-Tuning (PEFT) techniques like LoRA.
Read the full Project Report | View the Presentation Slides (in Spanish)
The repository includes implementations, tests, and training pipelines for a wide variety of vision architectures, from robust convolutional networks to the latest self-supervised transformers:
- Convolutional Neural Networks (CNNs): ResNet, ConvNeXt, ConvNeXtV2, EfficientNet.
- Vision Transformers (ViTs): Standard ViT, Swin Transformer, Data-efficient Image Transformers (DeiT), MaxViT.
- State-of-the-Art (SOTA) Models: The project also includes experimental tests and fine-tuning scripts for DINOv3, leveraging its self-supervised feature extraction for complex morphological classification.
The data used for this project originates from the official Galaxy Zoo 2 dataset. To streamline the training pipeline and make the data easily accessible for Hugging Face's datasets library, I have processed and published the dataset directly to the Hugging Face Hub.
- Original Dataset: Galaxy Zoo 2 (GZ2)
- Hugging Face Dataset: mrJordi0/galaxy-zoo-dataset
The experimental setup, approach, and baseline for this project were heavily inspired by the following work:
- Baseline Paper: Galaxy Morphology Classification with Deep Convolutional Neural Networks
- Baseline Repository: soliao/Galaxy-Zoo-Classification
- Unified training pipeline for multiple architectures (`AutoModelForImageClassification`).
- Native integration with Hugging Face datasets and models.
- Parameter-Efficient Fine-Tuning (LoRA / DoRA) support for large vision models.
- Experiment tracking and visualization using Weights & Biases (WandB).
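As a rough illustration of why LoRA makes fine-tuning large vision backbones cheap: for a d×k weight matrix, a rank-r adapter trains r(d+k) parameters instead of d·k. The sketch below uses the dimensions of a ViT-Base attention projection; r = 8 is just an example rank, not the value used in the project.

```python
# Back-of-the-envelope LoRA parameter savings for a single d x k weight matrix.
# ViT-Base attention projections are 768 x 768; r = 8 is an illustrative rank.
d, k, r = 768, 768, 8

full_params = d * k        # parameters updated by full fine-tuning
lora_params = r * (d + k)  # parameters in the low-rank A (d x r) and B (r x k) factors

print(full_params)               # 589824
print(lora_params)               # 12288
print(lora_params / full_params) # ~0.021 -> about 2% of the full matrix
```

Applied across every attention and MLP projection in the backbone, this is what lets the training script fit large models on modest GPUs.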
```
.
├── assets/               # Figures, plots, and architecture diagrams (PDF format)
├── data/                 # CSV files containing image IDs and labels
├── src/
│   ├── train.py          # Unified training script for all models
│   ├── create_dataset.py # Script to convert raw images to HF Dataset
│   └── notebooks/        # Jupyter notebooks for data verification
├── Project_Report.pdf    # Detailed project report
├── Slides.pdf            # Presentation slides
├── requirements.txt      # Project dependencies
└── README.md
```
Clone the repository and install the required dependencies:
```bash
git clone https://github.com/your-username/galaxy_image_classifier.git
cd galaxy_image_classifier
pip install -r requirements.txt
```

If you want to create the Hugging Face dataset locally from raw images and CSV splits:
```bash
python src/create_dataset.py \
    --train_csv data/gz2_train.csv \
    --val_csv data/gz2_valid.csv \
    --test_csv data/gz2_test.csv \
    --img_dirs /path/to/your/images \
    --output_dir ./data/hf_dataset
```

(Alternatively, you can skip this step and let the training script download the dataset directly from the Hugging Face Hub.)
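Conceptually, the conversion step pairs each row of a split CSV with its image file before the examples are packed into a Hugging Face dataset. A minimal sketch of that pairing is below; the column names `galaxy_id` and `label` are assumptions for illustration, not the actual schema used by `create_dataset.py`.

```python
# Sketch of the CSV-to-examples pairing done before building the HF dataset.
# Column names "galaxy_id" and "label" are assumed, not the real schema.
import csv
import io
from pathlib import PurePosixPath

def rows_to_examples(csv_text, img_dir):
    """Pair each CSV row (id, label) with the path where its image is expected."""
    examples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        examples.append({
            "image": str(PurePosixPath(img_dir) / f"{row['galaxy_id']}.jpg"),
            "label": int(row["label"]),
        })
    return examples

demo_csv = "galaxy_id,label\n587722,0\n587733,3\n"
examples = rows_to_examples(demo_csv, "/path/to/your/images")
print(examples[0]["image"])  # /path/to/your/images/587722.jpg
```

The real script additionally loads the pixel data and writes the result with the `datasets` library; this sketch only shows the bookkeeping step.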
You can train any supported model using the unified train.py script. Example for training a Vision Transformer (ViT) using LoRA:
```bash
python src/train.py \
    --model_type vit \
    --model_checkpoint google/vit-base-patch16-224 \
    --dataset_name mrJordi0/galaxy-zoo-dataset \
    --peft_type lora \
    --batch_size 32 \
    --epochs 10 \
    --use_wandb
```

Note: The following documents provide detailed visualizations of the performance of the different architectures evaluated during the project.
- Architecture Comparison: Comparison of accuracy and F1-score across all evaluated models.
- Results per Model Family: Aggregated performance metrics grouped by model family.
- Confusion Matrix - Test Set: Detailed classification performance across all galaxy types (ViT).
- Per-Class Metrics: Breakdown of precision, recall, and F1-score for each class (ViT).
- Sample Galaxy Images: Visual examples of the different galaxy morphological categories.
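The per-class precision, recall, and F1 figures follow directly from the confusion matrix. As a reminder of the derivation, the sketch below computes them for a toy 3×3 matrix; the numbers are made up and are not project results.

```python
# Derive per-class precision, recall, and F1 from a confusion matrix.
# Rows = true class, columns = predicted class. Toy numbers, not project results.
cm = [
    [50,  2,  3],
    [ 4, 40,  6],
    [ 1,  5, 44],
]

def per_class_metrics(cm):
    n = len(cm)
    metrics = []
    for c in range(n):
        tp = cm[c][c]
        fp = sum(cm[r][c] for r in range(n)) - tp  # predicted c, true something else
        fn = sum(cm[c]) - tp                       # true c, predicted something else
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        f1 = 2 * precision * recall / (precision + recall)
        metrics.append((precision, recall, f1))
    return metrics

for c, (p, r, f1) in enumerate(per_class_metrics(cm)):
    print(f"class {c}: precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

The macro-averaged F1 reported in the comparison plots is simply the unweighted mean of the per-class F1 values.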
(For a comprehensive analysis, please refer to the Project Report.)
Developed as a project for the Master's Degree in Artificial Intelligence.