The Story of Krishi-Drishti:
Inspiration: A Problem Hidden in the Clouds
Our journey began not with a desire to build an AI model, but with a simple, frustrating observation: you can't see India during the monsoon. As students passionate about using technology for social good, we were exploring satellite imagery to understand agricultural patterns. Every time we pulled up images of crucial crop-growing regions in states like Bihar and Odisha during the rainy season, we were met with a vast, unbroken expanse of white cloud cover.
This wasn't just a technical inconvenience; it was a critical data gap. We learned that this blindness affects everyone:
- Farmers miss crucial advisories on pests or water stress.
- Banks and insurers cannot verify the crop sown area for loans and claims.
- Policymakers plan in the dark, without accurate yield forecasts.
This inspired our mission: to build an AI that could see through the clouds. We named our project Krishi-Drishti (Agricultural Insight), and its heart would be a model we called Kshitij (Horizon), symbolizing the new horizon of visibility we aimed to create.
What We Learned: The Symphony of Sensors
Building Kshitij was a deep dive into the world of remote sensing. We learned that satellites don't just take pictures; they sense different parts of the electromagnetic spectrum.
Optical Sensors (Sentinel-2): See the world like our eyes do (Red, Green, Blue, Near-Infrared). Plants reflect NIR light strongly when healthy (high NDVI), making them easy to identify. But, like our eyes, they are useless against clouds. $$ \text{NDVI} = \frac{(\text{NIR} - \text{Red})}{(\text{NIR} + \text{Red})} $$
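As a rough illustration of the formula above, here is a tiny numpy sketch with invented reflectance values (not taken from any real Sentinel-2 scene): healthy vegetation reflects NIR strongly relative to red, so its NDVI sits much higher than bare soil's.

```python
import numpy as np

# Toy reflectance values for three pixels: healthy crop, sparse vegetation, bare soil.
# (Illustrative numbers only, not from a real scene.)
nir = np.array([0.50, 0.30, 0.15])
red = np.array([0.08, 0.15, 0.12])

ndvi = (nir - red) / (nir + red)
print(ndvi.round(2))  # [0.72 0.33 0.11] -- healthy crop well above sparse vegetation and soil
```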
Synthetic Aperture Radar (SAR) Sensors (Sentinel-1, EOS-04): These satellites actively emit microwave pulses and measure the signal bounced back. The signal's strength and polarization (e.g., VV, VH) tell a story about the surface's structure and moisture. Crucially, microwaves pierce through clouds and rain.
The key insight was that a growing crop has a unique temporal "signature." A field of wheat and a field of rice have different growth rates, leading to different rates of change in their SAR backscatter coefficient ($\sigma^0$) over time. This temporal pattern is the secret key that clouds cannot hide.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# A simplified look at the SAR temporal feature extraction concept
# X_sar.shape = (num_samples, timesteps, features)
# features = [VV, VH, VV/VH ratio, etc.]
class TemporalSARModel(nn.Module):
    def __init__(self, input_dim, hidden_dim):
        super().__init__()
        self.conv1d = nn.Conv1d(input_dim, hidden_dim, kernel_size=3)  # Captures local temporal patterns
        self.lstm = nn.LSTM(hidden_dim, hidden_dim, batch_first=True)  # Captures long-term dependencies

    def forward(self, x):
        x = x.transpose(1, 2)           # Channel-first for Conv1d: (batch, features, timesteps)
        x = F.relu(self.conv1d(x))
        x = x.transpose(1, 2)           # Back to time-first for the LSTM: (batch, timesteps, hidden)
        _, (hidden_state, _) = self.lstm(x)
        return hidden_state.squeeze(0)  # Compressed feature vector for the entire season
```
How We Built It: The Dual-Stream Architecture
We built Kshitij not as a single monolithic model, but as a sophisticated dual-stream network that respects the nature of its data.
The Pipeline:
Data Acquisition & Preprocessing: We used the IBM-enhanced AgriFieldNet dataset. Our first major task was writing geospatial scripts with rasterio and geopandas to extract thousands of small image chips, aligned with field boundaries, for both optical and SAR data across multiple time points.

The Kshitij Model:
- Optical Stream: We used IBM's Prithvi foundation model as a powerful feature extractor for clear-weather Sentinel-2 imagery. Transfer learning from this model gave us a massive head start.
- SAR Stream: We built a custom 1D Convolutional Neural Network (CNN) combined with an LSTM to process the time-series data from the SAR satellites. This stream learned the unique growth-stage "fingerprint" of each crop.
- Fusion & Classification: Features from both streams were not simply concatenated. They were fed into an attention-based fusion mechanism that learned to dynamically weigh the importance of each stream. If the optical data was cloudy, the model automatically learned to pay more attention to the SAR stream's predictions.
Agentic Automation: We wrapped the trained Kshitij model in an IBM Agent Development Kit (ADK) agent. This agent was programmed to monitor for new satellite data, automatically trigger our inference pipeline, and generate alerts—transforming our model from a static file into an active, automated service.
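The fusion idea above can be sketched in a few lines. This is a deliberately simplified, numpy-only illustration of attention-weighted fusion, not the trained network: the feature vectors, dimensions, and the fixed "quality" scores (which in the real model would come from a learned scorer) are all invented for this example.

```python
import numpy as np

def softmax(z):
    z = z - z.max()  # numerical stability
    e = np.exp(z)
    return e / e.sum()

# Hypothetical per-stream feature vectors for one field (invented values).
optical_feat = np.array([0.1, 0.1, 0.1, 0.1])  # mostly cloud: weak, uninformative features
sar_feat     = np.array([0.9, 0.2, 0.7, 0.4])  # SAR is unaffected by cloud cover

# A learned scorer would produce these per-stream scores; here a fixed pair
# stands in for "optical view is cloudy, SAR view is clean".
scores = np.array([0.2, 2.0])
weights = softmax(scores)  # attention weights, summing to 1

fused = weights[0] * optical_feat + weights[1] * sar_feat
print(weights.round(2))  # [0.14 0.86] -- SAR dominates when the optical view is cloudy
```

Because the weights are computed per sample, a clear-sky field would see the balance shift back toward the optical stream.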
Challenges We Faced
The Geospatial Data Hurdle: This was our biggest challenge. Handling multi-resolution, multi-temporal, multi-source satellite data is complex. A single misalignment in georeferencing could render a data sample useless. We spent countless hours debugging our chip-extraction code and ensuring perfect alignment between SAR, optical, and ground truth data.
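The alignment problem boils down to converting map coordinates to pixel indices consistently across every raster. A minimal numpy sketch of that conversion, using a hypothetical GDAL-style geotransform (all coordinates and the raster itself are invented; in practice rasterio handles this via its affine transforms):

```python
import numpy as np

# Hypothetical GDAL-style geotransform: (x_origin, pixel_w, 0, y_origin, 0, -pixel_h)
gt = (500000.0, 10.0, 0.0, 2300000.0, 0.0, -10.0)  # 10 m pixels, invented origin

def world_to_pixel(gt, x, y):
    """Convert map coordinates (x, y) to (row, col) pixel indices."""
    col = int((x - gt[0]) / gt[1])
    row = int((y - gt[3]) / gt[5])
    return row, col

raster = np.arange(100 * 100).reshape(100, 100)  # stand-in for a single-band image

# Extract a 32x32 chip centred on a field at hypothetical map coords (500500, 2299500)
row, col = world_to_pixel(gt, 500500.0, 2299500.0)
half = 16
chip = raster[row - half:row + half, col - half:col + half]
print(chip.shape)  # (32, 32)
```

If the geotransform of any one source (SAR, optical, or ground truth) is off by even a pixel, every chip cut from it lands on the wrong field, which is exactly the class of bug we kept chasing.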
Computational Resources: Training foundation models and processing thousands of high-resolution images requires significant GPU power and storage. We leveraged Kaggle's GPU credits and learned efficient data loading techniques (like using GeoTIFF overviews) to make the problem tractable.
The Fusion Problem: Our initial naive approach of simply stacking optical and SAR channels into a single model performed poorly. The model struggled to reconcile the vastly different data distributions. This failure led to our breakthrough: designing the separate, dedicated streams for each data type, which allowed each to specialize before merging.
Interpretability: Good accuracy alone was not enough; we needed to trust the model. Why did it classify a field as wheat? We implemented Grad-CAM and attention visualization techniques to show which parts of the imagery, and which time points, were most influential in each decision, building crucial trust in the model's "all-weather" capability.
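The core of Grad-CAM is only two operations: average the gradients of the class score over each activation map to get channel weights, then take the ReLU of the weighted sum of those maps. A self-contained numpy sketch (the activations and gradients are random stand-ins, not outputs of our network):

```python
import numpy as np

# Hypothetical activations A (channels x H x W) and gradients of the class
# score w.r.t. those activations, both invented for illustration.
rng = np.random.default_rng(0)
activations = rng.random((8, 4, 4))
gradients = rng.random((8, 4, 4))

# Grad-CAM: channel weights = global average pool of the gradients.
weights = gradients.mean(axis=(1, 2))                       # shape (8,)

# Heatmap = ReLU of the weighted sum of activation maps, normalised to [0, 1].
cam = np.maximum(0, np.tensordot(weights, activations, 1))  # shape (4, 4)
cam = cam / cam.max()
print(cam.shape)  # (4, 4) -- high values mark the regions that drove the prediction
```

For the SAR stream, the same idea applies along the time axis, which is how we surfaced the growth stages that mattered most.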
Conclusion: A New Horizon
Krishi-Drishti is more than a project; it's a proof of concept that with ingenuity and the right tools, we can solve deeply entrenched problems. We learned to speak the language of satellites, to think in terms of data pipelines, and to architect AI solutions that are robust and meaningful.
The greatest lesson was that the most powerful solutions often come not from a single breakthrough, but from the intelligent fusion of multiple existing technologies. By weaving together SAR and optical data, foundation models and custom networks, we built a system whose whole is truly greater than the sum of its parts. We've pushed the horizon of what's possible for Indian agriculture, and this is only the beginning.
