PhenoStack-GeoSeg (PSG)

PhenoStack-GeoSeg (PSG) is a proposed multi-modal encoder-decoder architecture for 6-class crop segmentation across fragmented Indian smallholder farms, built on the Prithvi-EO-2.0 foundation model and the TerraTorch framework. It will use early SAR-optical fusion: temporal Sentinel-2 multispectral observations are concatenated channel-wise with Sentinel-1 VV/VH polarizations before they enter the encoder. Prithvi-EO-2.0-300M will serve as the backbone encoder. The SelectIndices and LearnedInterpolateToPyramidal necks will extract pyramidal features, which a UNetDecoder with skip connections decodes into the segmentation map. Training will combine Lovász-Softmax loss, a differentiable surrogate for IoU, with cross-entropy for stability. Mixed-precision BF16 training and progressive backbone unfreezing are intended to reduce computational overhead. Validation splits will be drawn at the district level across Uttar Pradesh (UP), Rajasthan, Odisha, and Bihar to test geographic generalization. The target is a micro IoU above 0.75 on the AgriFieldNet India dataset, with balanced per-class performance on the Gram, Maize, Mustard, Sugarcane, Wheat, and Other categories.
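To make the target metric concrete, here is a minimal pure-Python sketch of micro IoU alongside per-class IoU. It assumes that "micro" means pooling true positives, false positives, and false negatives over all six classes before dividing. The project description does not spell out the definition, so treat this as one common reading, not the official scoring code. The function name `iou_scores` and the toy labels are hypothetical.

```python
# Class order is an assumption; the project description lists these six names.
CLASSES = ["Gram", "Maize", "Mustard", "Sugarcane", "Wheat", "Other"]


def iou_scores(y_true, y_pred, num_classes):
    """Return (micro_iou, per_class_iou) for flat sequences of integer labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp = [0] * num_classes  # pixels correctly assigned to each class
    fp = [0] * num_classes  # pixels wrongly assigned to each class
    fn = [0] * num_classes  # pixels of each class that were missed

    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1
            fn[t] += 1

    # Per-class IoU = TP / (TP + FP + FN); None when the class never occurs.
    per_class = []
    for c in range(num_classes):
        union = tp[c] + fp[c] + fn[c]
        per_class.append(tp[c] / union if union else None)

    # Micro IoU pools the counts over all classes before dividing.
    total_union = sum(tp) + sum(fp) + sum(fn)
    micro = sum(tp) / total_union if total_union else None
    return micro, per_class


# Toy example: ten labeled pixels, two of them misclassified.
y_true = [0, 0, 1, 1, 2, 2, 3, 4, 5, 5]
y_pred = [0, 0, 1, 2, 2, 2, 3, 4, 5, 0]
micro, per_class = iou_scores(y_true, y_pred, len(CLASSES))

for name, score in zip(CLASSES, per_class):
    print(f"{name}: {score}")
print(f"micro IoU: {micro}")
```

Because micro IoU pools counts, frequent classes dominate it. Reporting the per-class values next to it is one way to check the "balanced per-class performance" goal.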

Built With

albumentations
bash
gdal/ogr
git
jupyter-notebooks
machine-learning
numpy
nvidia-apex
opencv
pandas
python
pytorch
pytorch-2.0+
pytorch-lightning
qgis
rasterio
scikit-image
scikit-learn
terratorch
torchmetrics
transformers-(huggingface)
weights-&-biases
yaml

Updates

Ram kumar started this project — Aug 30, 2025 11:22 AM EDT
