
Galaxy Image Classifier using Vision Foundation Models

This repository contains the code and resources for a project developed during my Master's Degree in Artificial Intelligence. The project focuses on classifying galaxy morphologies using state-of-the-art Vision Foundation Models and Parameter-Efficient Fine-Tuning (PEFT) techniques like LoRA.

Read the full Project Report | View the Presentation Slides (in Spanish)

Models Explored

The repository includes implementations, tests, and training pipelines for a wide variety of vision architectures, from robust convolutional networks to the latest self-supervised transformers:

  • Convolutional Neural Networks (CNNs): ResNet, ConvNeXt, ConvNeXtV2, EfficientNet.
  • Vision Transformers (ViTs): Standard ViT, Swin Transformer, Data-efficient Image Transformers (DeiT), MaxViT.
  • State-of-the-Art (SOTA) Models: The project also includes experimental tests and fine-tuning scripts for DINOv3, leveraging its cutting-edge self-supervised feature extraction capabilities to tackle complex morphological classification.
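All of these architectures can be driven through a single entry point, which is what makes a unified pipeline across CNNs and transformers practical. As a minimal sketch (the `build_model` helper and the `num_labels` value are illustrative, not taken from `src/train.py`):

```python
from transformers import AutoModelForImageClassification


def build_model(checkpoint: str, num_labels: int):
    """Instantiate any supported backbone (ResNet, ConvNeXt, ViT, Swin, ...)
    behind one interface; the Auto class infers the concrete architecture
    from the checkpoint's config."""
    return AutoModelForImageClassification.from_pretrained(
        checkpoint,
        num_labels=num_labels,         # size of the new classification head
        ignore_mismatched_sizes=True,  # discard the original pre-training head
    )


# Example (downloads weights from the Hub):
# model = build_model("google/vit-base-patch16-224", num_labels=10)
```

Swapping architectures then amounts to changing the checkpoint string, without touching the rest of the training code.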

Dataset

The data used for this project originates from the official Galaxy Zoo 2 dataset. To streamline the training pipeline and make the data easily accessible for Hugging Face's datasets library, I have processed and published the dataset directly to the Hugging Face Hub.

Baseline and References

The experimental setup, approach, and baseline for this project were heavily inspired by prior work; see the Project Report for the full references.

Key Features

  • Unified training pipeline for multiple architectures (AutoModelForImageClassification).
  • Native integration with Hugging Face datasets and models.
  • Parameter-Efficient Fine-Tuning (LoRA / DoRA) support for large vision models.
  • Experiment tracking and visualization using Weights & Biases (WandB).

Repository Structure

.
├── assets/                  # Figures, plots, and architecture diagrams (PDF format)
├── data/                    # CSV files containing image IDs and labels
├── src/                     
│   ├── train.py             # Unified training script for all models
│   ├── create_dataset.py    # Script to convert raw images to HF Dataset
│   └── notebooks/           # Jupyter notebooks for data verification
├── Project_Report.pdf       # Detailed Project Report
├── Slides.pdf               # Presentation slides
├── requirements.txt         # Project dependencies
└── README.md

Installation

Clone the repository and install the required dependencies:

git clone https://github.com/JordiCan/galaxy_image_classifier.git
cd galaxy_image_classifier
pip install -r requirements.txt

Usage

1. Dataset Preparation

If you want to create the Hugging Face dataset locally from raw images and CSV splits:

python src/create_dataset.py \
    --train_csv data/gz2_train.csv \
    --val_csv data/gz2_valid.csv \
    --test_csv data/gz2_test.csv \
    --img_dirs /path/to/your/images \
    --output_dir ./data/hf_dataset

(Alternatively, you can skip this step and let the training script download the dataset directly from the Hugging Face Hub.)

2. Model Training

You can train any supported model using the unified train.py script. Example for training a Vision Transformer (ViT) using LoRA:

python src/train.py \
    --model_type vit \
    --model_checkpoint google/vit-base-patch16-224 \
    --dataset_name mrJordi0/galaxy-zoo-dataset \
    --peft_type lora \
    --batch_size 32 \
    --epochs 10 \
    --use_wandb

Results

Note: Detailed visualizations of the performance of the different architectures evaluated during the project are available in the assets/ directory.

(For a comprehensive analysis, please refer to the Project Report.)

Acknowledgments

Developed as a project for the Master's Degree in Artificial Intelligence.
