Our inspiration stemmed from the critical challenges faced by Ghanaian farmers: crop disease significantly impacts livelihoods and food security. A stark reminder of this was the devastating Fall Armyworm outbreak in Ghana some years ago, which led to widespread crop destruction, significantly increased the cost of living, and contributed to food shortages. This event underscored the urgent need for timely and accurate disease detection accessible directly to farmers. We saw an opportunity to bridge the gap between advanced AI technology and practical agricultural needs, especially for those in rural areas with limited access to expert advice. The goal was to empower farmers with an accessible, immediate, and intelligent tool to protect their crops.
GreenAI is an AI-powered crop disease detection system designed for ease of use. Farmers simply upload a photo of their crop (from a smartphone or drone). Our system then uses object detection (YOLO) to identify the crop and any visible diseases. What makes GreenAI stand out is its conversational AI interface, powered by Google Gemini. Farmers can ask questions about the detected disease using voice input (Speech-to-Text), and GreenAI provides clear, actionable advice through voice output (Text-to-Speech). This multi-modal approach makes expert agricultural guidance accessible to everyone, regardless of literacy levels, helping to prevent future crises like the Fall Armyworm outbreak.
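The flow described above can be sketched as a small orchestration function. This is an illustrative outline, not GreenAI's actual code: the `Consultation` type and the four stage callables (`detect`, `transcribe`, `advise`, `speak`) are hypothetical names standing in for the YOLO, Speech-to-Text, Gemini, and Text-to-Speech steps.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Consultation:
    detections: List[str]   # labels found in the photo
    question: str           # farmer's question, transcribed from voice
    answer_text: str        # advice returned by the language model
    answer_audio: bytes     # the same advice, synthesized as speech


def run_consultation(
    image: bytes,
    question_audio: bytes,
    detect: Callable[[bytes], List[str]],
    transcribe: Callable[[bytes], str],
    advise: Callable[[List[str], str], str],
    speak: Callable[[str], bytes],
) -> Consultation:
    """Run one photo-plus-question round trip through the four stages."""
    detections = detect(image)                   # object detection on the photo
    question = transcribe(question_audio)        # voice question to text
    answer_text = advise(detections, question)   # conversational model reply
    answer_audio = speak(answer_text)            # reply text to voice
    return Consultation(detections, question, answer_text, answer_audio)
```

Passing the stages in as callables keeps the flow easy to try out with simple stand-ins before any real model is wired in.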
We built GreenAI on a stack of AI and web technologies. The core object detection model (YOLOv8) was fine-tuned on a crop disease dataset tailored to Ghanaian crops (cashew, cassava, maize, tomato), covering their common diseases as well as pest damage such as that caused by Fall Armyworm. We integrated this with the Google Gemini API, whose conversational capabilities let us interpret detection results and respond in natural language. For the voice interface, we used Hugging Face's Transformers library for both Speech-to-Text (using Whisper) and Text-to-Speech (using SpeechT5). The entire application is orchestrated with Streamlit, which provides a user-friendly, interactive web interface.
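One way the detection results might be handed to the conversational model is sketched below. It is a minimal example under stated assumptions: the weights filename `greenai_yolov8.pt`, the `build_prompt` helper, the 0.5 confidence cut-off, and the prompt wording are all hypothetical, and the write-up does not describe the prompts actually used. The `detect` function relies on the `ultralytics` package and imports it lazily so that `build_prompt` can be tried on its own.

```python
from typing import List, Tuple

Detection = Tuple[str, float]  # (class label, confidence in [0, 1])


def detect(image_path: str, weights: str = "greenai_yolov8.pt") -> List[Detection]:
    """Run a fine-tuned YOLOv8 model; the weights filename is a placeholder."""
    from ultralytics import YOLO  # lazy import: only needed when detecting

    result = YOLO(weights)(image_path)[0]  # first (and only) image's results
    return [
        (result.names[int(cls)], float(conf))
        for cls, conf in zip(result.boxes.cls, result.boxes.conf)
    ]


def build_prompt(detections: List[Detection], question: str, min_conf: float = 0.5) -> str:
    """Turn detections and the farmer's question into a plain-language prompt."""
    # Drop low-confidence boxes so the model is not asked about weak guesses.
    confident = [(label, conf) for label, conf in detections if conf >= min_conf]
    if confident:
        findings = "; ".join(f"{label} ({conf:.0%} confidence)" for label, conf in confident)
    else:
        findings = "no confident detections"
    return (
        "You are advising a farmer in Ghana. Use short, simple sentences.\n"
        f"Detected in the photo: {findings}.\n"
        f"Farmer's question: {question}"
    )
```

The returned string would then be sent to the Gemini API, and the reply passed on to Text-to-Speech.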
One significant challenge was keeping the app responsive across all the integrated AI models, especially the generative AI and speech processing steps. Optimizing the flow between image processing, object detection, LLM inference, and voice synthesis required careful resource management and asynchronous operations to maintain a smooth user experience. Another challenge was tuning the voice input/output interaction to prevent feedback loops and keep the conversation natural, which was complicated by Streamlit's reactive execution model. We also worked on making the AI's responses farmer-friendly and actionable, given the urgency of situations like a disease outbreak. Looking ahead, a key open challenge is making the voice assistant truly multilingual, especially for the various Ghanaian languages and dialects, to maximize accessibility and impact across diverse farming communities.
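Because Streamlit reruns the whole script on every interaction, the same recording can be picked up and answered again on the next rerun. A common guard is to remember a digest of the last clip that was processed. The sketch below shows that general pattern, not necessarily the fix used in GreenAI; `is_new_recording` and the `last_audio_digest` key are hypothetical names, and `state` can be any dict-like object such as `st.session_state`.

```python
import hashlib
from typing import MutableMapping, Optional


def is_new_recording(
    state: MutableMapping,
    audio: Optional[bytes],
    key: str = "last_audio_digest",
) -> bool:
    """Return True only the first time a given recording is seen."""
    if not audio:
        return False  # nothing recorded on this rerun
    digest = hashlib.sha256(audio).hexdigest()
    if state.get(key) == digest:
        return False  # same clip as the last one processed: skip it
    state[key] = digest  # remember this clip for later reruns
    return True
```

In the app, the transcription, Gemini call, and speech synthesis would run only when this check returns `True`.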
We're incredibly proud of building a truly multi-modal and accessible AI solution that directly addresses critical agricultural challenges in Ghana. The seamless integration of image recognition, conversational AI, and voice interaction sets GreenAI apart, making advanced technology usable for a wider audience, including those who may not be comfortable with text-based interfaces. Successfully deploying and connecting these diverse AI components (YOLO, Gemini, Whisper, SpeechT5) within a single, functional Streamlit application was a major technical achievement. We believe we've created a practical tool with significant potential to make a real difference in Ghanaian agriculture, helping to prevent and mitigate the impact of future outbreaks similar to the Fall Armyworm.
This hackathon reinforced the importance of user-centric design in AI applications, especially in specialized domains like agriculture. We learned valuable lessons about optimizing AI pipelines for speed and efficiency, the nuances of prompt engineering for generative models, and the complexities of integrating multiple sophisticated AI services. It also highlighted the power of open-source tools like Hugging Face and Streamlit for rapid prototyping and development, and showed how technology can offer a crucial defense against threats like crop diseases that affect national food security.
We will focus on refining the user experience, especially around voice interactions, making the system robust to varying image quality, and adding multilingual support. We'll also prepare a compelling demonstration for investors that showcases the end-to-end functionality, from image upload and detection to the interactive voice-based consultation with GreenAI, and emphasizes its role in proactive disease management and preventing future agricultural crises. We aim to present a solution that is not just technically sound but also immediately understandable and impactful for its intended users.