This project compares several image representations on CIFAR-10 using a simple 1-nearest-neighbor evaluation pipeline. It includes classic hand-crafted features, deep features from VGG-11, dataset utilities, feature caching, and a notebook that walks through the full analysis.
- Downloads and loads CIFAR-10 in (N, C, H, W) format
- Extracts and caches multiple feature representations
- Evaluates each representation with a 1-NN classifier
- Visualizes sample images and nearest-neighbor retrieval results
- Supports both pretrained and randomly initialized CNN feature extractors

Feature representations:
- raw_pixel: flattened image vectors
- hog: Histogram of Oriented Gradients descriptors
- pretrained_cnn:
  - last_conv
  - last_fc
- random_cnn:
  - last_conv
  - last_fc

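As an illustration of the simplest representation above, the following is a minimal sketch of how raw pixel features could be produced from an (N, C, H, W) batch. The helper name `flatten_images` and the cast to `float32` are assumptions for this example, not the project's actual code.

```python
import numpy as np


def flatten_images(images: np.ndarray) -> np.ndarray:
    """Flatten an (N, C, H, W) image batch into (N, C*H*W) feature vectors."""
    if images.ndim != 4:
        raise ValueError(f"expected a 4-D (N, C, H, W) array, got shape {images.shape}")
    # Keep the batch axis and collapse channel, height, and width into one axis.
    return images.reshape(images.shape[0], -1).astype(np.float32)


# Stand-in batch with CIFAR-10's per-image shape (3 channels, 32x32 pixels).
batch = np.zeros((5, 3, 32, 32), dtype=np.uint8)
features = flatten_images(batch)
print(features.shape)  # (5, 3072)
```

Each CIFAR-10 image therefore becomes a 3072-dimensional vector, which is what the 1-NN classifier would compare under this representation.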
image-representation-analysis/
├── README.md
├── pyproject.toml
├── uv.lock
├── datasets/
├── features/
├── models/
│   └── vgg11_bn.pt
├── notebooks/
│   └── feature_analysis.ipynb
├── reports/
└── src/
    ├── __init__.py
    ├── dataset.py
    ├── extract_feature.py
    ├── models.py
    ├── path.py
    ├── vgg_network.py
    └── visualization.py

- Python >=3.14
- torch
- torchvision
- numpy
- scikit-learn
- scikit-image
- matplotlib
- tqdm
- ipykernel
- huggingface-hub
- requests
- torchinfo

Dependencies are defined in pyproject.toml.
Using uv:
uv sync

Using pip:
pip install -e .

The main workflow lives in notebooks/feature_analysis.ipynb.

Start Jupyter:
uv run jupyter lab

Then open notebooks/feature_analysis.ipynb.

Load the dataset:
from src.dataset import download_cifar10_dataset, load_dataset_splits
download_cifar10_dataset()
x_train, y_train, x_test, y_test = load_dataset_splits()

Extract or load cached features:
from src.extract_feature import compute_or_load_features
raw_train, raw_test = compute_or_load_features(x_train, x_test, "raw_pixel")
hog_train, hog_test = compute_or_load_features(x_train, x_test, "hog")
pretrained_conv_train, pretrained_conv_test = compute_or_load_features(
    x_train, x_test, "pretrained_cnn", layer="last_conv"
)

Run 1-nearest-neighbor evaluation:
from src.models import run_nearest_neighbor
classifier = run_nearest_neighbor(raw_train, y_train, raw_test, y_test)

Visualize examples and nearest neighbors:
from src.visualization import visualize_cifar_data, visualize_nearest_neighbors
visualize_cifar_data(x_train.transpose(0, 2, 3, 1), y_train)
visualize_nearest_neighbors(
    x_test, y_test, x_train, y_train, classifier, raw_test, feature_name="raw_pixel"
)

- CIFAR-10 is downloaded into datasets/cifar-10-batches-py/
- Cached feature arrays are stored in features/*.npz
- A local CIFAR-10 VGG checkpoint is expected at models/vgg11_bn.pt when using functions from src/vgg_network.py

Feature caching is handled by compute_or_load_features(...). If a matching .npz file already exists, the project loads it instead of recomputing features.
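
The load-if-present behavior described above can be sketched roughly as follows. This is a simplified illustration, not the body of compute_or_load_features: the function name `compute_or_load`, its parameters, and the `train`/`test` array keys inside the .npz file are all assumptions made for the example.

```python
import tempfile
from pathlib import Path

import numpy as np


def compute_or_load(cache_path, compute_fn, x_train, x_test):
    """Return (train, test) features, reusing cache_path when it exists."""
    cache_path = Path(cache_path)
    if cache_path.exists():
        # Cache hit: read both arrays back and skip the computation.
        with np.load(cache_path) as cached:
            return cached["train"], cached["test"]
    # Cache miss: compute features, then store them for the next run.
    train, test = compute_fn(x_train), compute_fn(x_test)
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    np.savez(cache_path, train=train, test=test)
    return train, test


calls = []  # records how often the (hypothetical) extractor actually runs


def flatten(images):
    calls.append(len(images))
    return images.reshape(len(images), -1)


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "features" / "raw_pixel.npz"
    x_train = np.arange(24, dtype=np.float32).reshape(2, 3, 2, 2)
    x_test = np.arange(12, dtype=np.float32).reshape(1, 3, 2, 2)
    first = compute_or_load(path, flatten, x_train, x_test)
    second = compute_or_load(path, flatten, x_train, x_test)

print(calls)  # the extractor ran only during the first call
```

One consequence of this pattern is that a stale cache file is reused silently, so deleting the relevant .npz file is the straightforward way to force recomputation after changing an extractor.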
- Downloads CIFAR-10
- Loads train/test splits
- Applies CIFAR-10 normalization statistics
- Computes raw pixel features
- Computes HOG features
- Extracts CNN features from VGG-11
- Saves and reloads cached feature files
- Trains and evaluates a 1-nearest-neighbor classifier
- Displays CIFAR-10 samples
- Shows correct and incorrect nearest-neighbor retrieval examples
- Defines a CIFAR-10-specific VGG model
- Loads a local pretrained checkpoint
- Evaluates the checkpoint on the test set
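
The 1-nearest-neighbor evaluation listed above can be illustrated with a small NumPy sketch. The project's run_nearest_neighbor may well be built on scikit-learn instead; the function `nearest_neighbor_predict` and the toy data below are assumptions for illustration only.

```python
import numpy as np


def nearest_neighbor_predict(train_x, train_y, test_x):
    """Predict each test label as the label of its closest training vector."""
    # Squared Euclidean distances via ||a - b||^2 = ||a||^2 - 2 a.b + ||b||^2.
    train_sq = np.sum(train_x ** 2, axis=1)         # shape (n_train,)
    test_sq = np.sum(test_x ** 2, axis=1)[:, None]  # shape (n_test, 1)
    distances = test_sq - 2.0 * test_x @ train_x.T + train_sq
    nearest = np.argmin(distances, axis=1)          # index of closest train row
    return train_y[nearest], nearest


# Toy 2-D features standing in for real feature matrices.
train_x = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
train_y = np.array([0, 0, 1])
test_x = np.array([[0.9, 1.2], [4.0, 6.0]])
test_y = np.array([0, 1])

pred, nearest = nearest_neighbor_predict(train_x, train_y, test_x)
accuracy = float(np.mean(pred == test_y))
print(nearest, pred, accuracy)
```

This version builds the full test-by-train distance matrix at once, which is fine for a toy example but would need batching over test rows for the complete CIFAR-10 splits.
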
- src/extract_feature.py uses torchvision.models.vgg11_bn for pretrained and random feature extraction.
- src/vgg_network.py defines a separate CIFAR-10 VGG implementation that expects a local checkpoint file.
- CNN feature extraction automatically uses CUDA when available and falls back to CPU otherwise.

During a typical run, you should expect:
- printed dataset shapes
- printed feature matrix shapes
- cached .npz feature files under features/
- 1-NN accuracy values on the CIFAR-10 test split
- matplotlib figures for samples and nearest-neighbor matches