SmartFarm AI

Inspiration

Agricultural challenges are universal. Agriculture is the backbone of food security worldwide, yet farmers often lack immediate access to expert guidance when facing crop diseases or uncertain seed quality. These problems are time-sensitive and can lead to significant yield losses if not addressed early. We were inspired to build a solution that uses AI to bring expert-level agricultural insights directly to farmers anywhere in the world, without regional or technical barriers, using nothing more than an image and a simple interface.

What it does

SmartFarm AI is a multimodal agricultural assistant that analyzes crop leaves and seeds from images and provides instant, actionable insights. Users can upload an image of a crop leaf to detect possible diseases or upload seed images to evaluate seed quality before sowing. The system returns structured results including identification, confidence scores, observations, and recommendations. Additionally, SmartFarm AI offers voice-based guidance, allowing users to listen to AI-generated advice for hands-free and accessible usage. The system is designed to be globally usable.

How we built it

SmartFarm AI is built on a FastAPI backend integrated with Google Gemini 3 Vision. Gemini 3 Vision analyzes the crop and seed images and performs all image understanding and reasoning, while FastAPI handles image uploads, Gemini API calls, structured JSON extraction, and voice generation. Carefully designed prompts steer Gemini toward reliable, structured JSON output for both crop disease detection and seed quality analysis. The frontend is developed in Streamlit and features a professional UI with light and dark themes, image previews, and audio playback. Voice responses are generated using text-to-speech services, completing a fully multimodal experience.
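To make the prompt design concrete, here is a minimal sketch of how a per-task prompt could spell out the expected JSON shape. It is an illustration only: the field names, the `RESPONSE_SCHEMAS` and `TASK_INSTRUCTIONS` tables, and the `build_prompt` helper are all assumptions, since the project's actual prompts are not shown in this write-up.

```python
import json

# Hypothetical response shapes; the project's real field names are not given here.
RESPONSE_SCHEMAS = {
    "crop": {
        "crop": "string",
        "disease": "string, or 'healthy'",
        "confidence": "number between 0 and 1",
        "observations": ["string"],
        "recommendations": ["string"],
    },
    "seed": {
        "seed_type": "string",
        "quality": "one of: good, average, poor",
        "confidence": "number between 0 and 1",
        "observations": ["string"],
        "recommendations": ["string"],
    },
}

# Hypothetical task instructions, one per workflow.
TASK_INSTRUCTIONS = {
    "crop": "Identify the crop in the leaf image and any visible disease.",
    "seed": "Assess the quality of the seeds in the image before sowing.",
}


def build_prompt(task: str) -> str:
    """Return the text prompt sent alongside the image for the given task."""
    if task not in RESPONSE_SCHEMAS:
        raise ValueError(f"unknown task: {task!r}")
    schema = json.dumps(RESPONSE_SCHEMAS[task], indent=2)
    return (
        f"{TASK_INSTRUCTIONS[task]}\n"
        "Respond with a single JSON object and nothing else "
        "(no Markdown, no commentary).\n"
        f"Use exactly these keys:\n{schema}"
    )
```

Keeping both workflows in one table means the crop and seed endpoints can share the same request code and differ only in the `task` key they pass.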

Challenges we ran into

One major challenge was enforcing consistent and valid JSON responses from a generative AI model. Another challenge was integrating multiple AI workflows—crop analysis and seed analysis—into a single seamless system. Ensuring smooth communication between the frontend and backend while maintaining performance and usability also required careful design and testing.
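One common way to cope with inconsistent JSON from a generative model is to extract the first decodable object from the reply and retry when nothing parses. The sketch below shows that pattern with the standard library only; `extract_json_object`, `ask_with_retries`, and the `ask_model` callable are hypothetical names, not the project's code, and the real Gemini call is deliberately left out.

```python
import json
from typing import Callable


def extract_json_object(text: str) -> dict:
    """Parse the first JSON object found in a model reply.

    Tolerates replies wrapped in Markdown fences or surrounded by prose by
    trying to decode from each '{' in turn.
    """
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            value, _end = decoder.raw_decode(text, start)
            return value
        except json.JSONDecodeError:
            # Not a valid object at this position; try the next brace.
            start = text.find("{", start + 1)
    raise ValueError("no JSON object found in model reply")


def ask_with_retries(ask_model: Callable[[], str], attempts: int = 3) -> dict:
    """Call the model until its reply contains valid JSON, up to `attempts` times."""
    last_error = None
    for _ in range(attempts):
        try:
            return extract_json_object(ask_model())
        except ValueError as exc:
            last_error = exc
    raise RuntimeError(
        f"model did not return valid JSON after {attempts} attempts"
    ) from last_error
```

In a backend, `ask_model` would wrap the actual image-plus-prompt request, so the same retry logic could serve both the crop and the seed workflow.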

Accomplishments that we're proud of

Successfully built a fully multimodal AI system using vision and voice
Integrated Gemini 3 Vision without relying on heuristic or rule-based logic
Designed a global, crop-focused solution applicable across regions
Delivered a clean, user-friendly interface with real-time AI responses

What we learned

We learned how to design and deploy a production-ready, real-world multimodal AI system with Gemini, from prompt engineering to full-stack deployment. Along the way we gained hands-on experience with AI output validation, backend-frontend integration, and building accessible user experiences that go beyond text-based interactions.

Gemini Integration

SmartFarm AI is powered by the multimodal capabilities of Gemini 3, which serves as the core intelligence of the application.
The system uses Gemini 3 Vision (gemini-3-flash-preview) to analyze images of crops and seeds. For crop disease detection, Gemini processes leaf images to identify the crop type, detect possible diseases, estimate confidence scores, and generate treatment recommendations. For seed quality analysis, Gemini evaluates visual characteristics such as shape, color, damage, and defects to classify quality levels and provide sowing guidance.
All AI responses are generated directly by Gemini using carefully engineered prompts and returned in strict JSON format, ensuring reliable parsing and seamless integration into the application. No heuristic or rule-based logic is used.
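Even with a strict JSON prompt, it can help to validate the parsed payload before the UI displays it. The following is a minimal sketch of such a check for the crop workflow; the `CropDiagnosis` fields and the `parse_crop_diagnosis` helper are assumed for illustration and may not match the project's real schema.

```python
from dataclasses import dataclass, field


@dataclass
class CropDiagnosis:
    # Hypothetical fields mirroring the results described above.
    crop: str
    disease: str
    confidence: float
    observations: list[str] = field(default_factory=list)
    recommendations: list[str] = field(default_factory=list)


def parse_crop_diagnosis(payload: dict) -> CropDiagnosis:
    """Validate a decoded JSON payload and convert it to a typed record."""
    missing = [k for k in ("crop", "disease", "confidence") if k not in payload]
    if missing:
        raise ValueError(f"missing keys: {', '.join(missing)}")
    try:
        confidence = float(payload["confidence"])
    except (TypeError, ValueError):
        raise ValueError("confidence must be a number") from None
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return CropDiagnosis(
        crop=str(payload["crop"]),
        disease=str(payload["disease"]),
        confidence=confidence,
        # Optional lists default to empty so the UI can always iterate them.
        observations=[str(x) for x in payload.get("observations", [])],
        recommendations=[str(x) for x in payload.get("recommendations", [])],
    )
```

Rejecting a malformed payload at this point gives the frontend one clear error state to handle instead of a partially filled result.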
In addition to visual analysis, SmartFarm AI includes voice capability, converting Gemini-generated insights into spoken guidance so users can listen to recommendations hands-free. This makes the system accessible in low-literacy or on-field scenarios.
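Before any text-to-speech call, the structured result has to be turned into short sentences that are easy to follow by ear. Here is one possible sketch of that step; `build_speech_script` and the keys it reads are assumptions, and the text-to-speech service itself (as well as any translation into another language) is outside this example.

```python
def build_speech_script(result: dict, max_recommendations: int = 3) -> str:
    """Turn a structured analysis result into a short script for a TTS engine."""
    crop = str(result.get("crop", "crop"))
    disease = str(result.get("disease", "an unknown condition"))
    percent = round(float(result.get("confidence", 0.0)) * 100)

    if disease.lower() == "healthy":
        sentences = [f"The {crop} leaf looks healthy."]
    else:
        sentences = [f"The {crop} leaf shows signs of {disease}."]
    sentences.append(f"Confidence is about {percent} percent.")

    # Keep the spoken advice short; long lists are hard to follow by ear.
    recommendations = result.get("recommendations", [])[:max_recommendations]
    for number, rec in enumerate(recommendations, start=1):
        text = str(rec).strip().rstrip(".")
        sentences.append(f"Recommendation {number}: {text}.")
    return " ".join(sentences)
```

The returned string would then be handed to whichever text-to-speech service the backend uses, and the resulting audio played back in the frontend.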
By combining image understanding, reasoning, structured output generation, and voice support, Gemini 3 enables SmartFarm AI to deliver scalable, globally applicable agricultural intelligence.

What's next for SmartFarm AI

Next, we plan to expand SmartFarm AI with multilingual voice support, historical analysis for tracking crop health over time, and mobile-friendly deployment. We also aim to integrate weather and soil data to provide even more context-aware recommendations for farmers worldwide.

GitHub Repo

Star our repo on GitHub.

Built With

fast-api
gemini
gemini-vision
github
google-gemini-3-api
python
rest-apis
streamlit
tts

Submitted to

Gemini 3 Hackathon

Created by

I played a central role in designing and developing the SmartFarm AI system, contributing across both backend and frontend layers. I implemented the FastAPI backend for crop disease and seed quality analysis, integrated Google Gemini Vision for multimodal AI inference, and ensured structured JSON responses suitable for real-world agricultural use across multiple countries. I also added Urdu-language voice assistance by integrating text-to-speech functionality, making the system more accessible for farmers. On the frontend, I built and refined an interactive Streamlit interface, connected it with live backend APIs, handled error states, and ensured that AI outputs—including confidence scores, recommendations, and audio feedback—were correctly displayed. Throughout the project, I focused on system reliability, clean API design, and end-to-end integration to deliver a practical, farmer-centric AI solution.

Moneka Meghwar
Data Scientist
I played an integral role in developing the SmartFarm AI system: I implemented the Google voice API on the backend, built the seed detection module, handed the files over to the team lead for frontend integration, and pushed the entire project folder to GitHub. I designed and developed scalable RESTful APIs using FastAPI for crop disease detection, recommendation generation, AI response handling, and voice-based interaction. I also managed environment configuration and API testing. This work strengthened the system's real-time performance, modularity, and overall usability. Additionally, I made the demonstration videos for the project; the links are public for review.

Kashmala Saddiqui
Created README documentation, presentation slides, and demo video. Defined project workflow, explained Gemini 3 integration, prepared demo narrative, and assisted with testing Gemini API outputs.

Umaima Rizwan

Updates

Moneka Meghwar started this project — Feb 09, 2026 05:50 PM EST
