This project implements a comprehensive pipeline to classify satellite imagery into Forest and Non-Forest categories. By combining cloud-based data acquisition with local Deep Learning, the system monitors environmental changes using Sentinel-2 NDVI data.
The workflow bridges the gap between massive satellite data collection and localized AI model training. We use a Convolutional Neural Network (CNN) to analyze NDVI patches and classify each one as Forest or Non-Forest.
- Data Acquisition: Google Earth Engine (JavaScript API)
- Deep Learning: PyTorch (CNN Architecture, Autograd)
- Geospatial Data: Rasterio (GeoTIFF management), Zarr (Efficient chunked storage)
- Analysis: NumPy, Matplotlib
Before the Python pipeline begins, the raw data is prepared using the Google Earth Engine Code Editor:
- Collection Filtering: Sentinel-2 L2A collections are filtered for specific dates (Dry Season) and low cloud cover.
- NDVI Calculation: Normalized Difference Vegetation Index is computed across the entire region.
- Export: Cleaned GeoTIFF composites for 2020 and 2024 are exported to Google Drive for local processing.
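The NDVI step itself runs in the Earth Engine Code Editor, but the same index can be reproduced locally to sanity-check exported values. The sketch below uses the standard definition NDVI = (NIR - Red) / (NIR + Red); for Sentinel-2 these are bands B8 and B4. The function name `compute_ndvi` and the toy reflectance values are illustrative assumptions, not part of the project code.

```python
import numpy as np

def compute_ndvi(nir: np.ndarray, red: np.ndarray) -> np.ndarray:
    """Hypothetical helper: NDVI = (NIR - Red) / (NIR + Red).

    Pixels where NIR + Red == 0 are returned as NaN instead of raising
    a divide-by-zero warning.
    """
    nir = nir.astype(np.float32)
    red = red.astype(np.float32)
    denom = nir + red
    ndvi = np.full(denom.shape, np.nan, dtype=np.float32)
    np.divide(nir - red, denom, out=ndvi, where=denom != 0)
    return ndvi

# Toy reflectance values (made up for illustration)
nir = np.array([[0.5, 0.4], [0.0, 0.3]])
red = np.array([[0.1, 0.4], [0.0, 0.1]])
ndvi = compute_ndvi(nir, red)
print(ndvi)
```

Dense vegetation reflects strongly in the near-infrared and absorbs red light, so forested pixels tend toward high NDVI values.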
- Automated Labeling: Forest centroids are identified using an NDVI threshold ($\ge 0.6$).
- Sampling: A balanced dataset of 1,000 patches (500 per class) is randomly sampled.
- Zarr Storage: Patches are streamed into a `.zarr` file, allowing the training loop to access data without overloading system RAM.
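The labeling and sampling steps above can be sketched in NumPy as follows. This is a minimal illustration, not the project's extraction notebook: the function name `sample_balanced_patches`, the 16-pixel patch size, and the random stand-in raster are all assumptions. Writing the result to Zarr is left out here.

```python
import numpy as np

def sample_balanced_patches(ndvi, patch_size=16, per_class=500,
                            threshold=0.6, seed=0):
    """Hypothetical helper: label patch centers by NDVI threshold and
    draw the same number of patches for each class."""
    rng = np.random.default_rng(seed)
    half = patch_size // 2
    h, w = ndvi.shape
    # Candidate centers: only pixels where a full patch fits in the raster.
    interior = ndvi[half:h - half, half:w - half]
    patches, labels = [], []
    for label, mask in ((1, interior >= threshold), (0, interior < threshold)):
        rows, cols = np.nonzero(mask)
        if rows.size < per_class:
            raise ValueError(f"not enough candidate centers for class {label}")
        picks = rng.choice(rows.size, size=per_class, replace=False)
        for r, c in zip(rows[picks], cols[picks]):
            # Index (r, c) in `interior` is center (r + half, c + half) in
            # `ndvi`, so the patch's top-left corner is simply (r, c).
            patches.append(ndvi[r:r + patch_size, c:c + patch_size])
            labels.append(label)
    return np.stack(patches).astype(np.float32), np.array(labels, dtype=np.float32)

# Random stand-in for an NDVI raster (illustration only)
demo_ndvi = np.random.default_rng(42).uniform(0.0, 1.0, size=(64, 64))
patches, labels = sample_balanced_patches(demo_ndvi, per_class=50)
print(patches.shape, labels.sum())
```

NaN pixels fail both comparisons, so they are never chosen as patch centers, although they can still appear inside a patch.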
A custom SimpleCNN is trained to recognize spatial patterns:
- Layers: Dual-layer convolution with MaxPooling for feature extraction.
- Training Logic: 20 epochs using the `Adam` optimizer and `BCELoss`.
- Hardware: Optimized for CUDA/GPU with a CPU fallback.
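A model matching this description might look like the sketch below. The channel counts, kernel sizes, learning rate, 16-pixel patch size, and random training data are assumptions chosen for illustration; only the two-convolution layout, `Adam`, `BCELoss`, the 20 epochs, and the CUDA/CPU fallback come from the description above.

```python
import torch
from torch import nn

class SimpleCNN(nn.Module):
    """Illustrative two-convolution CNN; layer sizes are assumptions."""

    def __init__(self, patch_size: int = 16):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(8, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            # Two 2x2 poolings shrink each spatial side by a factor of 4.
            nn.Linear(16 * (patch_size // 4) ** 2, 1),
            nn.Sigmoid(),  # BCELoss expects probabilities in [0, 1]
        )

    def forward(self, x):
        return self.classifier(self.features(x))

# Use the GPU when available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
torch.manual_seed(0)
model = SimpleCNN().to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.BCELoss()

# Random stand-in batch: 32 single-channel 16x16 patches with 1D labels.
x = torch.rand(32, 1, 16, 16, device=device)
y = (x.mean(dim=(1, 2, 3)) > 0.5).float()

losses = []
for epoch in range(20):
    optimizer.zero_grad()
    preds = model(x).view(-1)  # (32, 1) -> (32,) to match the 1D labels
    loss = criterion(preds, y)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())
```

A real run would iterate over mini-batches read from the Zarr store rather than a single in-memory tensor.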
The model is deployed on the full-scale GeoTIFFs:
- Sliding Window: Applies the model across the entire image.
- Stride = 1: By moving the window pixel-by-pixel and averaging overlaps, the system creates a smooth, per-pixel probability map that avoids "blocky" artifacts.
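The overlap-averaging idea can be sketched in plain NumPy. This is not the project's `apply_fine_model`; the function name `sliding_window_average`, the 4-pixel window, and the patch-mean stand-in predictor are assumptions made so the example runs without a trained model.

```python
import numpy as np

def sliding_window_average(image, predict_patch, patch_size=4, stride=1):
    """Hypothetical helper: score every window, add the score to all pixels
    the window covers, then divide by how many windows covered each pixel."""
    h, w = image.shape
    prob_sum = np.zeros((h, w), dtype=np.float64)
    counts = np.zeros((h, w), dtype=np.float64)
    for r in range(0, h - patch_size + 1, stride):
        for c in range(0, w - patch_size + 1, stride):
            p = predict_patch(image[r:r + patch_size, c:c + patch_size])
            prob_sum[r:r + patch_size, c:c + patch_size] += p
            counts[r:r + patch_size, c:c + patch_size] += 1
    # Pixels never covered (possible with larger strides) stay at 0.
    return np.divide(prob_sum, counts, out=np.zeros_like(prob_sum),
                     where=counts > 0)

# Stand-in predictor: the patch mean plays the role of the CNN output.
def mean_predictor(patch):
    return float(patch.mean())

# Toy raster: high NDVI on the left half, low NDVI on the right half.
image = np.full((8, 8), 0.1)
image[:, :4] = 0.9
prob_map = sliding_window_average(image, mean_predictor)
print(np.round(prob_map[0], 2))
```

With `stride=1` each interior pixel is covered by `patch_size ** 2` windows, which is what smooths the transition between the two halves. The cost grows with the number of pixels, so a larger stride trades smoothness for speed.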
`pip install torch rasterio zarr numpy matplotlib`

- GEE (JS): Export your NDVI GeoTIFFs from Google Earth Engine.
- Process: Run the extraction notebook to generate `dataset_ndvi.zarr`.
- Train: Run the CNN training loop. The model weights are saved to `forest_classification_model.pth`.
- Infer: Apply the model to the 2020 and 2024 images using the `apply_fine_model` function.
- NaN Handling: Uses `np.nan_to_num` to handle "NoData" pixels often found in satellite exports.
- Normalization: Accounts for contrast variations across different satellite orbits/dates.
- Shape Consistency: Uses `.view(-1)` to ensure the 1D label tensor matches the model output, preventing batch-dimension crashes.
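The first two points can be illustrated with a short NumPy sketch. The README does not say which normalization the project uses, so the per-image min-max scaling here is an assumption, as are the function name `clean_and_normalize` and the fill value of 0.0 for NoData pixels.

```python
import numpy as np

def clean_and_normalize(ndvi: np.ndarray) -> np.ndarray:
    """Hypothetical helper: replace NaN "NoData" pixels, then rescale the
    image to [0, 1] so scenes from different dates share a common range."""
    filled = np.nan_to_num(ndvi, nan=0.0)  # NaN -> 0.0 (assumed fill value)
    lo, hi = filled.min(), filled.max()
    if hi == lo:
        # A constant image carries no contrast; avoid dividing by zero.
        return np.zeros_like(filled, dtype=np.float32)
    return ((filled - lo) / (hi - lo)).astype(np.float32)

# Toy NDVI tile with one NoData pixel (values made up for illustration)
tile = np.array([[np.nan, -0.2],
                 [0.3, 0.8]])
normalized = clean_and_normalize(tile)
print(normalized)
```

Filling NoData with 0.0 makes those pixels look like bare ground to the model, so masking them out of the final map may be preferable in some workflows.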
Developed for environmental monitoring and deforestation analysis.