Inspiration

Most families in my village in Kenya depend completely on small-scale farming.
Every season I hear the same problems: yellow leaves, stunted maize, unknown pests, no idea whether to spray or not.
Getting expert advice is hard: sometimes you wait weeks for an extension officer, and by then the crop is already badly damaged.

I wanted to give farmers something that can help right now, in the middle of the field, just by speaking in Swahili or English, without needing to type.

That is why I built Mkulima AI.

What it does

Mkulima AI is a voice-first crop advisor made for Kenyan smallholder farmers.

  • You speak naturally to the phone (Swahili or English)
  • You can show a photo of the affected plant/leaf/fruit
  • The AI immediately analyzes the photo + your description
  • It gives you spoken step-by-step advice
  • It can check current weather and use that information
  • It remembers previous conversations about the same farm

Hands free. Works on basic smartphones. No typing needed.

How we built it

Core technology: Google Gemini 3

Main pieces:

  • Core AI: Google Gemini 3 (Gemini Live API for real-time voice + multimodal vision)
  • Voice: Gemini Live API for bidirectional, interruptible voice streaming (speak while it listens, barge in, natural flow)
  • Image analysis: Send photos as multimodal input to Gemini 3
  • Tools: Built function calling for weather data
  • Memory: Long context to keep farm history across sessions
  • Frontend: React 19 + TypeScript + Tailwind CSS for a simple, mobile-first UI with large buttons and clear icons
  • Audio: Web Audio API + MediaRecorder for microphone input and playback
  • Deployment: Built to run in the browser

Everything runs in the browser. Built to be as lightweight as possible.
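To give a feel for how the weather tool plugs into the voice session, here is a rough sketch of the function-calling dispatch on the browser side. The names (`getWeather`, the `ToolCall` shape, the canned forecast) are illustrative, not the actual Gemini Live API types:

```typescript
// Hypothetical sketch of the weather tool wiring; the real app
// forwards these calls through the Gemini Live API session.
type ToolCall = { name: string; args: Record<string, unknown> };
type ToolHandler = (args: Record<string, unknown>) => Promise<unknown>;

const tools = new Map<string, ToolHandler>();

// Register the weather tool. In the real app this would query a
// weather service; here it returns a canned forecast.
tools.set("getWeather", async (args) => {
  const location = String(args.location ?? "unknown");
  return { location, forecast: "light rain", tempC: 22 };
});

// When the model emits a function call, look up the handler, run
// it, and return the result to feed back into the conversation.
async function dispatch(call: ToolCall): Promise<unknown> {
  const handler = tools.get(call.name);
  if (!handler) throw new Error(`Unknown tool: ${call.name}`);
  return handler(call.args);
}
```

The nice part of this pattern is that adding a new tool (say, market prices later) is just another `tools.set(...)` entry plus its declaration in the model config.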

Challenges we ran into

  • Very unstable WebSocket connections on slow rural mobile networks; we had to build reconnection plus a fallback to text
  • Gemini was sometimes too confident about plant diseases from bad photos or poor lighting; we had to enforce very careful, humble prompting
  • Agricultural terms in Swahili were sometimes confused; we had to give many clear examples in the system prompt
  • Making the voice feel really natural (not robotic) took a lot of small latency and interruption tweaks
  • A very tight timeline meant we had to ruthlessly cut features to keep a strong, clean demo
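The reconnection logic boiled down to capped exponential backoff with a text fallback once the network clearly cannot sustain live voice. A minimal sketch (the delay numbers and attempt limit are illustrative, not the exact values the app ships):

```typescript
// Capped exponential backoff for WebSocket reconnects on flaky
// rural networks. After MAX_ATTEMPTS failures we give up on the
// live voice channel and fall back to a plain text interface.
const BASE_MS = 500; // first retry delay
const CAP_MS = 8000; // never wait longer than this
const MAX_ATTEMPTS = 6;

// Delay before reconnect attempt `attempt` (0-based): 500, 1000, 2000, ...
function backoffDelay(attempt: number): number {
  return Math.min(CAP_MS, BASE_MS * 2 ** attempt);
}

// Decide what to do after a dropped connection.
function nextAction(
  attempt: number
): { mode: "retry"; delayMs: number } | { mode: "text-fallback" } {
  if (attempt >= MAX_ATTEMPTS) return { mode: "text-fallback" };
  return { mode: "retry", delayMs: backoffDelay(attempt) };
}
```

Keeping the policy in a pure function like `nextAction` made it easy to test without a real network.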

Accomplishments that we're proud of

  • First time I managed to get really natural-feeling voice-to-voice conversation with the Gemini Live API
  • The moment you interrupt the AI mid-sentence and it recovers correctly feels magical
  • Getting quite accurate plant problem detection plus reasonable local recommendations
  • Making something that could genuinely be useful in villages where internet is slow and people mostly speak Swahili
  • Finishing a complete working demo plus a good video in a very short time

What we learned

  • How extremely powerful (and sometimes dangerously over-confident) multimodal frontier models are
  • Importance of very strict, repetitive, culturally appropriate system prompting
  • How much low latency + proper interruption handling changes the whole user feeling
  • That a voice interface is dramatically more usable than text for rural / low-literacy users
  • How fast you can prototype serious agentic applications when you have good multimodal + voice + tool calling in one model

What's next for Mkulima AI

Short term plans:

  • Better handling of very poor quality / very dark photos
  • Add voice speed & accent fine-tuning
  • Simple offline caching of last advice + basic rules
  • More accurate Swahili agricultural vocabulary & local remedy database
  • Very simple way for farmers to share useful photos/diagnoses with each other (community learning)
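The offline caching item above could be as simple as keying the last advice per crop in browser storage. One possible shape (a sketch under our own assumptions, not finished design; the `KV` interface exists so the browser can pass `localStorage` while tests use an in-memory map):

```typescript
// Planned offline cache sketch: keep the last advice per crop so
// it can be replayed with no connection at all.
interface KV {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

function saveAdvice(store: KV, crop: string, advice: string): void {
  store.setItem(
    `advice:${crop}`,
    JSON.stringify({ advice, savedAt: Date.now() })
  );
}

function loadAdvice(store: KV, crop: string): string | null {
  const raw = store.getItem(`advice:${crop}`);
  return raw ? (JSON.parse(raw) as { advice: string }).advice : null;
}
```

In the browser this would be called as `saveAdvice(localStorage, "maize", adviceText)`.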

Longer term plan:

  • Partner with local cooperatives / extension services
  • Add market prices + best selling time suggestions
  • Voice reminders for planting / spraying / weeding
  • Possible integration with SMS fallback for zero internet areas

I really want this to become a practical tool many farmers in Kenya actually use.
