Inspiration

India is home to over 100 million farmers, yet yield gaps remain high due to unpredictable weather, water scarcity, and a lack of timely expert advice. Most existing solutions are text-heavy or require English literacy, leaving out the majority of rural farmers.

We wanted to build Fasal Seva (Crop Service) to bridge this gap. Our goal was to create an AI that doesn't just "chat" but speaks the farmer's language fluently and "sees" their farm from space. With the release of Gemini 3, we finally had the low-latency reasoning and native audio capabilities needed to build a true "Agronomist in the Pocket."

What it does

Fasal Seva is a comprehensive precision agriculture platform tied together by Sevak, our AI assistant.

  1. Hands-Free Voice Assistant (Gemini 3 Native Audio): Farmers can talk to Sevak in Hindi or Hinglish. It’s not a transcriber; it’s a real-time conversation. You can interrupt it, ask follow-ups, and get advice on fertilizers or pests instantly.
  2. Eyes in the Sky (Sentinel-2 Satellite Analysis): The app monitors farm coordinates using live Sentinel-2 imagery. It calculates NDVI (Vegetation Health), NDWI (Water Stress), and NDRE (Chlorophyll) to detect issues before they are visible to the naked eye.
  3. Smart Irrigation Engine: Using the FAO-56 Penman-Monteith standard combined with local weather data, it tells farmers exactly when and how much to water, reducing water usage by up to 40%.
  4. Crop Doctor: Farmers can snap a photo of a sick plant, and our Gemini 3 Vision pipeline identifies the disease and suggests organic remedies.

How we built it

We utilized a microservices architecture to handle the complexity of AI and geospatial data:

  • The Brain (Gemini 3.0 Flash): We migrated our core logic in sevak_adviser.py to gemini-3.0-flash. The upgraded reasoning capabilities allow it to synthesize complex agronomy data (soil type + crop stage + weather) into simple advice.
  • The Voice (Gemini Live API): We implemented the voice service using Gemini 3.0 Native Audio via WebSockets. By bypassing traditional Speech-to-Text -> LLM -> Text-to-Speech pipelines, we achieved ultra-low latency, making the conversation feel natural.
  • The Satellite Pipeline: We built a custom geospatial engine using pystac_client and rioxarray. It queries the AWS Open Data Registry for the latest cloud-free Sentinel-2 tiles, clips them to the farmer's field polygon, and computes vegetation indices using NumPy on the fly.
  • Frontend: The mobile app is built with React Native (Expo) for a smooth cross-platform experience, while the web platform uses Next.js.
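
The index step of the satellite pipeline can be sketched with plain NumPy. This is a minimal illustration rather than the production code: it assumes the Sentinel-2 bands have already been loaded and clipped to the field as reflectance arrays. The function names, the band roles (NIR, red, red edge, SWIR), and the NIR/SWIR form of NDWI are assumptions made for the example.

```python
import numpy as np


def normalized_difference(a, b):
    """Return (a - b) / (a + b), with NaN where the denominator is zero."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    denom = a + b
    out = np.full(denom.shape, np.nan)
    np.divide(a - b, denom, out=out, where=denom != 0)
    return out


def vegetation_indices(nir, red, red_edge, swir):
    """Compute the three indices per pixel (band roles are assumptions)."""
    return {
        "ndvi": normalized_difference(nir, red),       # vegetation health
        "ndwi": normalized_difference(nir, swir),      # water stress
        "ndre": normalized_difference(nir, red_edge),  # chlorophyll
    }


# Tiny made-up 2x2 field; the zero pixel mimics a masked or no-data cell.
field = vegetation_indices(
    nir=np.array([[0.45, 0.50], [0.0, 0.40]]),
    red=np.array([[0.05, 0.10], [0.0, 0.20]]),
    red_edge=np.array([[0.15, 0.20], [0.0, 0.25]]),
    swir=np.array([[0.15, 0.25], [0.0, 0.30]]),
)
print(np.nanmean(field["ndvi"]))  # field-level average, ignoring no-data pixels
```

Returning NaN for empty pixels, instead of raising or returning zero, keeps no-data cells from dragging down the field average.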

System Architecture Flow

Our system follows a robust microservices pattern to handle real-time AI and geospatial processing simultaneously.

  1. Input (The Farmer Speaks): The farmer opens the Fasal Seva app and talks naturally in Hindi or English (e.g., "My field looks dry, should I water?"). The app captures this voice input.
  2. Processing (The Double Pipeline):
    • The Ears (Gemini Native Audio): The voice audio is streamed instantly to Gemini 3.0. It doesn't wait for the sentence to finish; it interprets the speech in real time.
    • The Eyes (Satellite Engine): At the same time, the app sends the farm's GPS location to our Python backend. This triggers a search for the latest clear Sentinel-2 satellite images of that exact spot.
  3. Analysis (The Science):
    • Our geospatial engine calculates plant health (NDVI) and moisture levels (NDWI) from the satellite pixels.
    • Simultaneously, the Smart Irrigation model pulls local weather data to calculate the day's reference evapotranspiration ($ET_0$), that is, how much water the field loses to evaporation and plant transpiration.
  4. Output (The Expert Advice): Gemini 3.0 combines the farmer's question, the satellite's "health report," and the rigorous math into a simple, spoken answer: "Your crop stress is high, and soil moisture is low. Based on today's heat, please water 4,000 liters per acre."
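
The "double pipeline" in step 2 can be sketched with asyncio. This is only an illustration of running the two branches concurrently: both coroutines are hypothetical stand-ins that return made-up placeholder values, and none of the names below come from the actual codebase.

```python
import asyncio


async def understand_speech(audio_chunks):
    """Hypothetical stand-in for the streaming voice call."""
    await asyncio.sleep(0)  # yields control, as a network call would
    return "should I water?"  # placeholder result


async def fetch_field_report(lat, lon):
    """Hypothetical stand-in for the satellite and irrigation backend."""
    await asyncio.sleep(0)
    return {"ndvi": 0.42, "ndwi": -0.10, "et0_mm": 5.1}  # placeholder values


async def handle_query(audio_chunks, lat, lon):
    # Start both branches together and wait until both have finished.
    question, report = await asyncio.gather(
        understand_speech(audio_chunks),
        fetch_field_report(lat, lon),
    )
    return {"question": question, "report": report}


result = asyncio.run(handle_query([b"..."], 26.85, 80.95))
print(result)
```

The combined dictionary is what would then be handed to the model to produce the spoken answer in step 4.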

The Math Behind the Magic

We don't just guess water usage; we calculate it. The Smart Irrigation Engine implements the FAO-56 Penman-Monteith standard to determine the reference evapotranspiration ($ET_0$) for each farm.

The core equation we solved in Python is:

$$ ET_0 = \frac{0.408\Delta(R_n - G) + \gamma \frac{900}{T+273} u_2 (e_s - e_a)}{\Delta + \gamma(1 + 0.34u_2)} $$

Where:

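  • $ET_0$ = Reference evapotranspiration (mm/day)
  • $\Delta$ = Slope of the saturation vapor pressure curve (kPa/°C)
  • $R_n$ = Net radiation at the crop surface (MJ/m²/day)
  • $G$ = Soil heat flux density (MJ/m²/day)
  • $\gamma$ = Psychrometric constant (kPa/°C)
  • $T$ = Mean daily air temperature at 2m height (°C)
  • $u_2$ = Wind speed at 2m height (m/s)
  • $(e_s - e_a)$ = Saturation vapor pressure deficit (kPa)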
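
As a minimal sketch, the equation translates into Python almost term by term. The function name and argument names are assumptions for the example, and the sample inputs are illustrative values for a mild summer day, not measurements from a Fasal Seva farm. In practice $\Delta$, $\gamma$, $e_s$, and $e_a$ would themselves be derived from temperature, humidity, and altitude; they are passed in directly here.

```python
def reference_et0(delta, r_n, g, gamma, t_mean, u2, e_s, e_a):
    """FAO-56 Penman-Monteith reference evapotranspiration in mm/day.

    delta, gamma in kPa/°C; r_n, g in MJ/m²/day; t_mean in °C;
    u2 in m/s; e_s, e_a in kPa.
    """
    radiation_term = 0.408 * delta * (r_n - g)
    wind_term = gamma * (900.0 / (t_mean + 273.0)) * u2 * (e_s - e_a)
    denominator = delta + gamma * (1.0 + 0.34 * u2)
    return (radiation_term + wind_term) / denominator


# Illustrative inputs only.
et0 = reference_et0(
    delta=0.122, r_n=13.28, g=0.0, gamma=0.0666,
    t_mean=16.9, u2=2.078, e_s=1.997, e_a=1.409,
)
print(f"ET0 = {et0:.1f} mm/day")
```

With these inputs the result comes out at roughly 3.9 mm/day.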

We then combine this with the Satellite-derived Crop Coefficient ($K_c$) (adjusted via NDVI) to find the precise water need ($ET_c$):

$$ ET_c = ET_0 \times K_c $$

This allows Fasal Seva to prescribe water in liters per acre rather than vague "minutes of watering."
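
The conversion to liters per acre follows from the fact that 1 mm of water over 1 m² is 1 liter. The sketch below takes $K_c$ as an input, since the exact NDVI adjustment is not spelled out here. The function name is hypothetical, and the result is gross crop demand before subtracting rainfall or existing soil moisture.

```python
SQUARE_METERS_PER_ACRE = 4046.86  # one international acre


def crop_water_need_liters_per_acre(et0_mm, kc):
    """Convert reference ET (mm/day) and a crop coefficient to liters/acre/day."""
    etc_mm = et0_mm * kc  # ET_c = ET_0 x K_c
    # 1 mm of water depth over 1 m² equals 1 liter.
    return etc_mm * SQUARE_METERS_PER_ACRE


# Illustrative inputs: ET0 of 5.0 mm/day and a mid-season Kc of 0.8.
liters = crop_water_need_liters_per_acre(5.0, 0.8)
print(f"{liters:,.0f} liters per acre per day")
```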

Challenges we ran into

  • Satellite Cloud Cover: Optical satellites can't see through clouds. We had to write robust filtering logic to reject cloudy scenes and fall back to historical data or "safe mode" advice when the view is obstructed.
  • Hallucinations vs. Science: Generative AI can sometimes be too creative. We solved this by implementing a RAG (Retrieval Augmented Generation) system. We feed Gemini "ground truth" data (verified agricultural guidelines) before it answers, ensuring it acts as a translator of science rather than a creative writer.
  • Latency in Rural Areas: Rural internet is spotty. Switching to Gemini 3.0 Flash significantly reduced our payload sizes and inference times, making the app responsive even on 4G networks.
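
The cloud-cover fallback described in the first bullet can be sketched as a small selection function. This is a simplified illustration, not the actual filter. It assumes each scene is a STAC-style dictionary carrying `eo:cloud_cover` and an ISO 8601 `datetime` property. The 20% threshold, the function name, and the mode labels are assumptions.

```python
def select_scene(scenes, max_cloud_pct=20.0):
    """Return (scene, mode): the newest sufficiently clear scene, or a fallback."""
    clear = [
        s for s in scenes
        # Treat a missing cloud-cover value as fully cloudy, to be safe.
        if s["properties"].get("eo:cloud_cover", 100.0) <= max_cloud_pct
    ]
    if clear:
        # ISO 8601 timestamps in one format and timezone sort chronologically as strings.
        newest = max(clear, key=lambda s: s["properties"]["datetime"])
        return newest, "live"
    # Nothing usable: the caller should switch to historical data or safe-mode advice.
    return None, "safe_mode"


scenes = [
    {"id": "A", "properties": {"datetime": "2025-01-10T05:30:00Z", "eo:cloud_cover": 85.0}},
    {"id": "B", "properties": {"datetime": "2025-01-05T05:30:00Z", "eo:cloud_cover": 4.2}},
]
scene, mode = select_scene(scenes)
print(scene["id"] if scene else None, mode)
```

Here the newer scene is rejected for cloud cover, so the older clear one is used.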

Accomplishments that we're proud of

  • True Native Audio: We aren't just sending text to an API. We are streaming audio, allowing for a genuinely human connection between the farmer and the AI.
  • Scientific Rigor: Implementing the FAO-56 crop coefficient method for irrigation wasn't easy, but it means our water advice is mathematically sound, not just a guess.
  • Real-Time Geospatial Analysis: Processing satellite bands instantly when a user opens the dashboard feels magical every time.

What we learned

  • Multimodality is King: Combining Vision (Crop Doctor), Audio (Sevak), and Tabular Data (Satellite Indices) gives the model the kind of context a human expert would have.
  • Speed Matters: In a conversational interface, latency is the difference between a tool and a toy. Gemini 3's speed improvements were a game-changer for our voice interface.

What's next for Fasal Seva: Gemini 3 Native Audio & Satellite AI

  • IoT Integration: We are working on integrating hardware soil sensors (NPK & Moisture) to provide "Ground Truth" to validate our satellite data.
  • Offline Mode: Developing a small model variant (Gemini Nano) to handle basic queries when the farmer has zero connectivity.

Built With

  • aws-open-data
  • expo.io
  • flask
  • gemini-3.0-flash
  • gemini-3.0-native-audio
  • google-gemini
  • mongodb
  • next.js
  • node.js
  • numpy
  • python
  • react-native
  • sentinel-2