Inspiration

I have a deep interest in computer vision and NLP because of the tremendous impact they have on industries such as agriculture and food security.

What it does

This application uses two AI models to analyze images of crops and tomatoes: YOLOv8 detects diseases and defects, and Gemini generates detailed descriptions that help farmers and agricultural experts identify and address issues.

How I built it

The application was built in Python, using OpenCV for image preprocessing and data augmentation and Gradio for the user interface. It combines two models: YOLOv8 for object detection and Google Gemini for text generation. I trained the YOLOv8 model on crop disease and tomato datasets from Roboflow, and it achieved high accuracy and precision, with an F1 score of 0.95. I then applied template-based prompt engineering to the Gemini model so that it generates descriptions in the desired format.
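As an illustration of template-based prompting, here is a minimal sketch that fills a fixed template with one detection result before the text is sent to a language model. The template wording, the output fields, the `build_prompt` helper, and the example label are all hypothetical and are not the prompt used in this project; the call to the Gemini API is deliberately left out.

```python
from string import Template

# Hypothetical template: the wording and the output fields are illustrative only.
PROMPT_TEMPLATE = Template(
    "You are an agricultural assistant.\n"
    "A detection model found: $label (confidence $confidence).\n"
    "Respond in exactly this format:\n"
    "Diagnosis: <one sentence>\n"
    "Likely causes: <short list>\n"
    "Recommended action: <short list>"
)


def build_prompt(label: str, confidence: float) -> str:
    """Fill the fixed template with a single detection result."""
    return PROMPT_TEMPLATE.substitute(
        label=label,
        confidence=f"{confidence:.2f}",  # two decimals keeps the prompt compact
    )


# Example detection (made-up values, not real model output).
prompt = build_prompt("tomato late blight", 0.91)
print(prompt)
```

Keeping the format instructions in one constant means every detection produces a prompt with the same structure, which makes the generated descriptions easier to display consistently in the user interface.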

Challenges I ran into

I faced two major challenges in implementing this application. The first was the relatively poor performance of the trained YOLOv8 model in assessing the quality of tomatoes. I resolved this by augmenting the dataset and increasing the number of training epochs, which made the predictions more accurate. The second was integrating other LLMs from Ollama, such as Llama 3, which was resource intensive. I worked around this by using the Google Gemini API, which generates text significantly faster.
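One detail worth keeping in mind when augmenting an object detection dataset is that geometric transforms must be applied to the bounding boxes as well as to the images. The sketch below assumes a horizontal flip and labels in the normalized YOLO format (class, x_center, y_center, width, height); the source does not state which augmentations were actually used, and `flip_labels_horizontally` is a hypothetical helper, not part of OpenCV or Ultralytics.

```python
def flip_labels_horizontally(labels):
    """Mirror YOLO-format boxes to match a horizontally flipped image.

    Each label is (class_id, x_center, y_center, width, height), with all
    coordinates normalized to the range [0, 1]. Only x_center changes:
    a box centered at x ends up centered at 1 - x after the flip.
    """
    flipped = []
    for class_id, x_center, y_center, width, height in labels:
        flipped.append((class_id, 1.0 - x_center, y_center, width, height))
    return flipped


# One made-up box in the left half of the image.
labels = [(0, 0.25, 0.5, 0.2, 0.4)]
flipped = flip_labels_horizontally(labels)
print(flipped)
```

The image itself would be flipped separately (for example with an image library), and the flipped image and flipped labels would be saved as a new training pair.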

Accomplishments that I'm proud of

One accomplishment I am proud of is successfully building an AI-powered application that integrates computer vision and NLP to address major problems in food security and agriculture.

What I learned

This project improved my understanding of how computer vision and NLP algorithms can be used to solve problems in agriculture and food security, and it significantly strengthened my skill set and hands-on experience.

What's next for Crop Disease and Tomato Freshness Diagnosis Application

I intend to build a more robust and interactive user interface with analytics and feedback capabilities. I also intend to integrate other AI models, such as text-to-speech and translation models, to help local farmers who face visual impairments or language barriers.

Built With

  • gemini
  • huggingface
  • llms
  • ultralytics
  • yolov8