Inspiration

Agriculture is the backbone of the economy, especially in places like Nigeria where I live, yet there is often a massive disconnect between advanced agronomic data and the people on the ground. A small-scale farmer needs practical, weather-dependent advice, while an investor needs to see a clear breakdown of yield projections and ROI. I wanted to bridge this gap by building a true "Mission Control" for agriculture that fits in your pocket. The inspiration for Farmvilla Agent came from the realization that accessing complex data shouldn't require navigating clunky dashboards—it should be as easy as having a natural, real-time conversation with a specialized expert who can literally see what you are looking at.

What it does

Farmvilla Agent is a real-time, multimodal AI assistant designed for the agricultural community. By leveraging the Gemini Live API, it provides a seamless, interruptible voice interface.

Users can select from specialized personas:

🚜 Farmer: For field advice and pest management.
💰 Investor: For financial metrics, ROI calculations, and market trends.
🔬 Professional: For deep technical and supply chain analysis.
🎓 Student: For simplified analogies of complex agricultural concepts.

The agent features contextual awareness, utilizing GPS coordinates for location-specific climate advice, and multimodal vision to analyze uploaded documents (like soil reports) or live photos (like a diseased crop). It wraps all of this in a modern, immersive interface with a reactive audio visualizer.

How we built it

Building Farmvilla Agent meant orchestrating several complex, real-time systems to create a fluid user experience:

Real-Time Voice & Interruptions: I integrated the Gemini Live API to handle the bidirectional audio stream. This allowed for natural, back-and-forth conversations where the user can interrupt the agent mid-sentence.

Multimodal Integration: I built a pipeline to capture camera frames and document uploads, converting them into a format the model can process alongside the ongoing voice conversation.

Dynamic Personas & Prompting: I engineered robust system instructions that hot-swap depending on the user's chosen role, ensuring the vocabulary, tone, and focus dynamically shift.

Data Analytics & Tool Calling: Leveraging my background in business administration and data analytics, I integrated function calling so the "Investor" persona can dynamically generate interactive charts and calculate metrics. For instance, the agent can actively compute Expected Yield and Return on Investment using core formulas like:

$$ROI = \left( \frac{\text{Projected Revenue} - \text{Total Input Costs}}{\text{Total Input Costs}} \right) \times 100$$

Contextual Memory: The state management system continuously appends conversation history and file analyses so the agent doesn't lose context over a long session.
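As an illustration of the ROI formula above, here is a minimal Python sketch of how such a calculation could be exposed as a callable tool. The function name `calculate_roi`, the `handle_tool_call` dispatcher, and the shape of the returned dictionary are assumptions made for this example; they are not the project's actual code or any Gemini API signature.

```python
def calculate_roi(projected_revenue: float, total_input_costs: float) -> float:
    """Return ROI as a percentage, following the formula in the text.

    ROI = ((Projected Revenue - Total Input Costs) / Total Input Costs) * 100
    """
    if total_input_costs <= 0:
        # The formula divides by total input costs, so they must be positive.
        raise ValueError("total_input_costs must be greater than zero")
    return (projected_revenue - total_input_costs) / total_input_costs * 100


def handle_tool_call(name: str, args: dict) -> dict:
    """Hypothetical dispatcher: route a model-requested tool call to local code."""
    if name == "calculate_roi":
        roi = calculate_roi(
            float(args["projected_revenue"]),
            float(args["total_input_costs"]),
        )
        return {"roi_percent": roi}
    raise ValueError(f"unknown tool: {name}")


if __name__ == "__main__":
    # Example figures are made up purely to exercise the formula.
    result = handle_tool_call(
        "calculate_roi",
        {"projected_revenue": 1500, "total_input_costs": 1000},
    )
    print(result)
```

With a projected revenue of 1500 and input costs of 1000, the formula gives (1500 - 1000) / 1000 × 100 = 50, so the agent would report a 50% return.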

Challenges we ran into

Handling real-time audio streams while maintaining low latency was a significant hurdle. When a user interrupts the agent, the system needs to instantly halt the audio playback and clear the audio buffer without losing the overarching context of the conversation. Another major challenge was managing the multimodal memory. Passing images and documents to the model is straightforward, but ensuring the agent remembers the details of a soil report uploaded 10 minutes ago while actively discussing a new photo of a crop required careful optimization of the context window and state management.
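The interruption problem described above can be sketched in a few lines of Python. This is a simplified, single-threaded illustration under stated assumptions, not the project's implementation: the `PlaybackBuffer` and `Session` classes and the `handle_interruption` function are hypothetical names, and a real client would also need to stop the audio device and coordinate with the streaming connection.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Optional


class PlaybackBuffer:
    """Queue of audio chunks waiting to be played (hypothetical helper)."""

    def __init__(self) -> None:
        self._chunks: deque = deque()

    def enqueue(self, chunk: bytes) -> None:
        self._chunks.append(chunk)

    def next_chunk(self) -> Optional[bytes]:
        # Return the next chunk to play, or None when nothing is queued.
        return self._chunks.popleft() if self._chunks else None

    def clear(self) -> int:
        # Drop all pending audio and report how many chunks were discarded.
        dropped = len(self._chunks)
        self._chunks.clear()
        return dropped


@dataclass
class Session:
    """Keeps conversation context separate from pending audio."""

    history: list = field(default_factory=list)
    playback: PlaybackBuffer = field(default_factory=PlaybackBuffer)


def handle_interruption(session: Session) -> int:
    """Halt queued playback without touching the conversation history."""
    return session.playback.clear()


if __name__ == "__main__":
    session = Session()
    session.history.append("user uploaded a soil report")
    for chunk in (b"a", b"b", b"c"):
        session.playback.enqueue(chunk)
    session.playback.next_chunk()        # one chunk is already playing
    print(handle_interruption(session))  # number of unplayed chunks dropped
    print(session.history)               # context is still intact
```

The design point is that pending audio and conversation context live in separate structures, so clearing one on an interruption cannot erase the other.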

Accomplishments that we're proud of

I am incredibly proud of the "Live Immersive View." The reactive audio visualizer creates a high-end, responsive feel that makes the AI seem truly alive. Successfully implementing the hot-swappable personas is also a major win; it is fascinating to watch the exact same crop photo elicit practical pruning advice from the "Farmer" persona, and an instant risk-assessment and financial breakdown from the "Investor" persona.

What we learned

I deepened my understanding of real-time streaming architectures and WebSocket/WebRTC integrations. I also learned a great deal about advanced prompt engineering—specifically, how to craft system instructions that not only dictate persona but also strictly govern how the model uses tools like GPS tracking and chart generation without breaking character.
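To make the idea of persona-specific instructions that also govern tool use more concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the `Persona` dataclass, the instruction wording, the tool names (`get_location_climate`, `render_chart`, `calculate_roi`), and the shape of the returned configuration are hypothetical and do not reflect the project's real prompts or the Gemini Live API's configuration format.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Persona:
    """A role-specific prompt plus the tools that role may call (hypothetical)."""

    name: str
    system_instruction: str
    allowed_tools: Tuple[str, ...]


# Illustrative entries only; the wording and tool names are assumptions.
PERSONAS = {
    "farmer": Persona(
        name="Farmer",
        system_instruction=(
            "You are a practical field advisor. Use plain language. "
            "Call get_location_climate before giving weather-dependent advice."
        ),
        allowed_tools=("get_location_climate",),
    ),
    "investor": Persona(
        name="Investor",
        system_instruction=(
            "You are a financial analyst for agriculture. "
            "Use calculate_roi for any return figure and render_chart for trends. "
            "Stay in character when presenting tool results."
        ),
        allowed_tools=("calculate_roi", "render_chart"),
    ),
}


def build_session_config(role: str) -> dict:
    """Return the prompt and tool list for a role, so personas can be hot-swapped."""
    try:
        persona = PERSONAS[role]
    except KeyError:
        raise ValueError(f"unknown persona: {role}") from None
    return {
        "system_instruction": persona.system_instruction,
        "tools": list(persona.allowed_tools),
    }


if __name__ == "__main__":
    print(build_session_config("investor")["tools"])
```

Keeping the instruction text and the permitted tools in one record means switching roles swaps both at once, which is one way to keep a persona from calling tools outside its remit.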

What's next for Farmvilla Agent

The next step is to expand Farmvilla's integrations by connecting it directly to live IoT farm sensors (like soil moisture monitors and smart irrigation systems) so the agent can alert the user to field conditions before they even ask. I also plan to expand the language capabilities to support more local dialects, making the tool accessible to an even wider demographic of rural farmers.
