Inspiration
India has over 150 million farmers, yet crop diseases silently wipe out 30–40% of yields every year, costing billions in lost income. Most small farmers don’t have access to agricultural experts. Instead, they rely on guesswork, word of mouth, or outdated practices. By the time disease symptoms are clearly visible, the damage is often already done.
I kept asking myself: What if every farmer had an agricultural expert in their pocket? What if taking a photo of a diseased leaf could instantly tell them what’s wrong, how serious it is, and exactly what to do next?
That question led to PrithviPulse.
Built using Google’s Gemini 3 Vision API and agricultural domain knowledge, PrithviPulse aims to make crop disease detection fast, accurate, and accessible. The name “Prithvi” (Sanskrit for Earth) reflects the core idea behind the project: healing the Earth, one crop at a time.
What I Learned
Exploring Gemini 3 Vision’s Multimodal Strength
- Learned how to design precise prompts for real-world agricultural image analysis
- Discovered that Gemini 3 doesn’t just identify diseases—it explains them and recommends treatments
- Learned to enforce structured JSON outputs for reliable UI rendering
- Used safety settings to ensure responsible and consistent agricultural advice
Building a Hybrid AI System
- Designed a fail-safe architecture: Gemini 3 as the primary model with a local CNN as fallback
- Kept diagnoses available even when cloud APIs hit rate limits
- Learned how to balance cloud intelligence with offline reliability
Why Gemini 3 Works So Well for Agriculture
- Excels at identifying subtle visual differences (e.g., early vs late blight)
- Goes beyond classification to provide actionable treatment steps
- Produces clear, farmer-friendly explanations
- Handles messy, real-world images with dirt, shadows, and damaged leaves
Key Insight: Traditional CNNs stop at labels. Gemini 3 delivers end-to-end agricultural guidance, which makes it far more practical for farmers.
How I Built It
System Architecture
React Frontend (TypeScript + Vite + Tailwind)
↓ HTTP/REST API
FastAPI Backend (Python)
↓
Hybrid AI Pipeline:
1. Gemini 3 Vision (Primary)
   - Leaf image analysis
   - Structured JSON output
   - Treatment recommendations
2. Local CNN Model (Fallback)
   - TensorFlow/Keras
   - 38 disease classes
   - Fully offline
Technology Stack
Frontend
- React 18.3.1 + TypeScript – Type-safe, scalable UI
- Vite 6.4.1 – Fast development and builds
- Tailwind CSS – Mobile-first, responsive design
- Web Speech API – Text-to-speech for treatment steps
Backend
- Python + FastAPI – Async REST API
- Google Gemini 3 (gemini-3-flash-preview) – Vision AI
- TensorFlow + Keras – Local CNN inference
- Uvicorn – Production-ready ASGI server
AI Models
- Primary: Gemini 3 Vision
  - Input: Leaf images
  - Output: Structured diagnosis + detailed treatment
- Fallback: Custom CNN (plant_disease_model.h5)
  - Trained on 87,000+ images
  - Covers 38 disease classes
  - Works completely offline
Core Implementation
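import os
import google.generativeai as genai

# Assumes the API key is stored in the GEMINI_API_KEY environment variable
genai.configure(api_key=os.environ["GEMINI_API_KEY"])
model = genai.GenerativeModel("gemini-3-flash-preview")
image_path = "leaf.jpg"  # path to the leaf photo to analyze

# Send the uploaded image together with the analysis prompt
response = model.generate_content([
    genai.upload_file(image_path),
    """Analyze this leaf image and provide:
1. Disease name and confidence
2. Professional summary
3. Step-by-step treatment (physical + chemical)
4. Preventive measures
Return the response as structured JSON.""",
])
print(response.text)  # raw model output, expected to contain the JSON diagnosis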
Unique Features
- Visual Treatment Timeline: Step-by-step treatment flow with urgency indicators, progress tracking, and text-to-speech
- Chemical Safety System: Automatically detects chemical treatments, displays ⚠️ warnings, and provides dosage guidance
- Offline-First Design: Works without internet using the local CNN model
- Mobile-Optimized UI: Designed for field use on devices from 320px to 4K screens
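The chemical detection behind the Chemical Safety System could be as simple as a keyword check over each treatment step. The sketch below is one possible approach, not the project's actual implementation: the function name `flag_chemical_steps` and the keyword list are assumptions made for illustration, and a real list would need review by an agronomist.

```python
import re
from typing import Dict, List

# Hypothetical keyword list for illustration only.
CHEMICAL_KEYWORDS = (
    "fungicide", "pesticide", "insecticide", "herbicide",
    "copper", "mancozeb", "chlorothalonil",
)

# Match a keyword at the start of a word so plurals ("fungicides") also match.
_CHEMICAL_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(k) for k in CHEMICAL_KEYWORDS) + r")",
    re.IGNORECASE,
)


def flag_chemical_steps(steps: List[str]) -> List[Dict[str, object]]:
    """Mark each treatment step that mentions a chemical keyword."""
    return [
        {"text": step, "is_chemical": bool(_CHEMICAL_PATTERN.search(step))}
        for step in steps
    ]
```

The UI could then show the ⚠️ warning and dosage guidance only for steps where `is_chemical` is true.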
Challenges I Faced
Gemini API Quota Limits
Problem: API quota was exhausted during repeated testing.
Solution: Added automatic fallback to the local CNN model with retry logic and quota monitoring.
Result: Diagnoses stay available even when Gemini is unavailable ✅
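The fallback with retry logic described above can be sketched roughly as follows. This is a minimal illustration under assumptions, not the project's code: `diagnose_with_fallback`, its parameters, and the `source` field are hypothetical names, and the two model calls are passed in as plain functions so the sketch stays independent of any SDK.

```python
import time
from typing import Callable, Dict

Diagnosis = Dict[str, object]


def diagnose_with_fallback(
    image_path: str,
    primary: Callable[[str], Diagnosis],
    fallback: Callable[[str], Diagnosis],
    max_retries: int = 2,
    base_delay_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Diagnosis:
    """Try the cloud model a few times, then fall back to the local model."""
    for attempt in range(max_retries + 1):
        try:
            result = dict(primary(image_path))
            result["source"] = "gemini"
            return result
        except Exception:
            # Quota errors, timeouts, and network failures all land here.
            if attempt < max_retries:
                # Exponential backoff before the next cloud attempt.
                sleep(base_delay_s * (2 ** attempt))
    # Every cloud attempt failed, so use the offline model instead.
    result = dict(fallback(image_path))
    result["source"] = "local_cnn"
    return result
```

Tagging each result with its source lets the UI tell the farmer whether the detailed Gemini guidance or the more basic offline label is being shown.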
Structured Output from Vision Models
Problem: Gemini naturally returns free-form text, but the UI required consistent JSON.
Solution: Carefully engineered prompts with strict JSON schema enforcement and fallback parsing.
Result: 95% valid JSON responses ✅
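Fallback parsing of this kind might look like the sketch below: try the response as JSON directly, and if that fails, pull out the outermost `{...}` span (which also drops Markdown code fences and surrounding prose) and try again. The function name `parse_diagnosis` and the `REQUIRED_KEYS` field names are assumptions for illustration; the real schema is not shown in this write-up.

```python
import json
import re
from typing import Optional

# Hypothetical field names; the project's actual schema may differ.
REQUIRED_KEYS = ("disease", "confidence", "summary", "treatment", "prevention")


def parse_diagnosis(raw_text: str) -> Optional[dict]:
    """Return the parsed diagnosis, or None if no valid JSON object is found."""
    candidates = [raw_text.strip()]
    # Fallback candidate: everything from the first "{" to the last "}".
    match = re.search(r"\{.*\}", raw_text, flags=re.DOTALL)
    if match:
        candidates.append(match.group(0))
    for candidate in candidates:
        try:
            data = json.loads(candidate)
        except json.JSONDecodeError:
            continue
        # Accept only objects that carry every field the UI needs.
        if isinstance(data, dict) and all(key in data for key in REQUIRED_KEYS):
            return data
    return None
```

A `None` result is the signal to retry the request or hand the image to the local CNN.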
Real-World Image Noise
Problem: Farm images often have poor lighting, dirt, shadows, and damaged leaves.
Solution: Relied on Gemini 3’s strong visual reasoning and added preprocessing with confidence thresholds.
Discovery: Gemini 3 clearly outperformed the local CNN on the real farm photos in the test set ⭐
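One way a confidence threshold could be applied is to flag low-confidence diagnoses so the app can ask for a clearer photo instead of showing a shaky result. This is a sketch under assumptions: the `triage` function, the `needs_retake` field, and the 0.6 cutoff are hypothetical and not taken from the project.

```python
CONFIDENCE_THRESHOLD = 0.6  # hypothetical cutoff, to be tuned on real photos


def triage(diagnosis: dict, threshold: float = CONFIDENCE_THRESHOLD) -> dict:
    """Normalize the confidence score and flag results that fall below the cutoff."""
    confidence = float(diagnosis.get("confidence", 0.0))
    if confidence > 1.0:
        # Treat values such as 87 as percentages and rescale to the 0-1 range.
        confidence = confidence / 100.0
    return {
        **diagnosis,
        "confidence": confidence,
        "needs_retake": confidence < threshold,
    }
```

A missing confidence value is treated as zero, so an incomplete response is always flagged rather than trusted.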
Agricultural Accuracy
Problem: Many diseases share very similar visual symptoms.
Solution: Used domain-specific prompting and Gemini’s multimodal reasoning.
Result: 88% accuracy, outperforming the local CNN (76%) by 12 percentage points ✅
Large Model File Size
Problem: GitHub rejected the 44 MB H5 model file.
Solution: Configured Git LFS and optimized repository setup.
Lesson: Always plan for large ML assets early.
Results & Impact
Performance Summary
| Metric | Gemini 3 | Local CNN |
|---|---|---|
| Accuracy | 88% | 76% |
| Avg Response Time | 2.5s | 0.8s |
| Treatment Detail | High | Basic |
| Offline Support | ❌ | ✅ |
Test Dataset
- 33 real farm images
- Crops: Tomato, Potato, Apple, Corn
- Real-world conditions: dirt, damaged leaves, uneven lighting
Why Gemini 3 Was the Right Choice
- Understands visual + agricultural context
- Produces clear, actionable guidance
- Detects subtle disease patterns
- Generates structured outputs
- Fast, scalable, and production-ready
Big Win: Gemini 3 transforms raw images into actionable agricultural intelligence.
Social Impact & Future Vision
Current Capabilities
✅ Detects 38 crop diseases
✅ Provides treatment and prevention steps
✅ Works offline
✅ Mobile-friendly for field use
What’s Next
- Voice input in regional languages
- AI-driven crop calendars
- Market price prediction
- Farmer community features
- IoT sensor integration
Long-Term Impact
- Target users: 150M+ farmers
- Potential savings: ₹10,000–₹50,000 per farmer annually
- Mission: Free, open-source, and accessible to all
GitHub: https://github.com/ganesh21906/PrithviPulse
Built with ❤️ for farmers, powered by Google Gemini 3