Inspiration
Field AI was inspired by a simple problem we kept seeing in the field: critical agricultural decisions are often made under pressure, with limited connectivity, limited time, and too much fragmented information. Farmers and field operators don’t need another dashboard or dense manual — they need clear, trustworthy answers while their hands are busy and conditions are changing. We saw an opportunity to combine voice, vision, and calculation into a single, natural interface that works the way people already do in the field: speak a question, show the problem, and act immediately. Field AI is built around the idea that expert knowledge should be accessible on demand, without friction, and without requiring users to stop what they’re doing to search, read, or interpret complex documentation. When you’re standing in a field with a pest outbreak, the right answer at the right moment matters. Field AI exists to bring that expertise directly to the user, exactly when they need it.
What it does
Field AI is a voice-first field assistant that helps users make confident crop protection decisions in real time. Users can ask questions by voice, take photos of pests or crop damage, and receive clear, practical guidance without leaving the field. Field AI identifies pests and diseases from images, provides information on appropriate chemicals, and supports natural follow-up questions so users can understand options, risks, and best practices. It also performs accurate calculations based on field size, application rates, and product specifications, helping users determine exactly how much product is needed. By combining voice interaction, visual understanding, and precise calculations, Field AI turns complex agricultural knowledge into simple, actionable guidance at the moment it’s needed most.
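The dosage math described above reduces to multiplying field area by the label application rate, then rounding up to whole containers. The sketch below is illustrative only — the function and field names are our own for this example, not the production code:

```typescript
// Sketch of the core dosage calculation (illustrative names, not the
// production implementation). Given a field area and a label application
// rate, compute how much product is needed and how many containers to buy.
interface DoseResult {
  totalProductL: number;    // total liters of product required
  containersNeeded: number; // whole containers to purchase
}

function computeDose(
  fieldAreaHa: number,   // field size in hectares
  rateLPerHa: number,    // label application rate, liters per hectare
  containerSizeL: number // size of one product container, liters
): DoseResult {
  if (fieldAreaHa <= 0 || rateLPerHa <= 0 || containerSizeL <= 0) {
    throw new Error("All inputs must be positive");
  }
  const totalProductL = fieldAreaHa * rateLPerHa;
  // Round containers up: you can't buy a fraction of a container.
  const containersNeeded = Math.ceil(totalProductL / containerSizeL);
  return { totalProductL, containersNeeded };
}
```

For example, a 12.5 ha field at 2.4 L/ha needs 30 L of product, or six 5 L containers.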
How we built it
Field AI was built as a voice-first system optimized for real-world field conditions. We used the ElevenLabs voice generator and React SDK to capture natural speech from the microphone, generate real-time transcripts, and provide controlled voice responses during processing, enabling hands-free use in the field. The frontend is built with Next.js to deliver a fast, responsive experience on mobile devices, and the backend is powered by NestJS, which handles authentication, business logic, calculations, and integrations with external services.

Google GenAI plays a central role in the system. It is used to generate embeddings for crops, pests, diseases, and chemical knowledge, detect user intent, and produce structured, grounded answers. GenAI is also used for pest and disease detection by reasoning over images using grounding data supplied by the system, ensuring identifications are constrained to known crops, pests, and regions.

Field AI integrates real-time weather data to inform application decisions. Weather conditions such as wind, rainfall, temperature, and humidity are factored into recommendations to help determine the safest and most effective times to apply different chemicals. Elasticsearch powers fast and reliable product and chemical search, allowing Field AI to return accurate results even when queries are conversational or incomplete. Together, these technologies form a scalable, modular platform that combines voice, vision, search, weather intelligence, and reasoning into a single, field-ready solution.
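To give a flavor of how weather data can gate a recommendation, here is a simplified sketch of a spray-window check. The specific thresholds and field names are illustrative assumptions for this example, not the system's actual rules or agronomic guidance:

```typescript
// Simplified sketch of a weather-based spray window check. The threshold
// values below are illustrative placeholders, not real agronomic limits.
interface WeatherSnapshot {
  windSpeedKmh: number;
  rainProbability: number; // 0..1 chance of rain in the next few hours
  temperatureC: number;
  humidityPct: number;
}

interface SprayAdvice {
  safeToSpray: boolean;
  reasons: string[]; // human-readable reasons when spraying is discouraged
}

function evaluateSprayWindow(w: WeatherSnapshot): SprayAdvice {
  const reasons: string[] = [];
  if (w.windSpeedKmh > 15) reasons.push("wind too strong: risk of spray drift");
  if (w.rainProbability > 0.4) reasons.push("rain likely: product may wash off");
  if (w.temperatureC > 30) reasons.push("too hot: evaporation reduces efficacy");
  if (w.humidityPct < 40) reasons.push("air too dry: droplets may evaporate");
  return { safeToSpray: reasons.length === 0, reasons };
}
```

Returning reasons alongside the verdict lets the voice layer explain *why* an application window is unsafe, rather than just saying no.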
Challenges we ran into
One of the early challenges was handling voice input and silence detection reliably. We initially tried using ElevenLabs directly in the backend, reading audio buffers and implementing our own silence detection logic. While this gave us low-level control, it proved complex and brittle in real-world conditions. We eventually moved to using ElevenLabs in agent mode, which simplified voice handling and provided natural waiting or processing messages while backend tasks were running. Managing conversational context was another ongoing challenge. Users often ask follow-up questions that depend on previous answers, images, or field parameters, and maintaining this context accurately across voice, vision, and calculation flows required careful design. Context handling remains an area of active iteration as we balance flexibility with correctness. We also faced challenges around cost, particularly when embedding large chemical and product datasets. Generating and maintaining embeddings for extensive documentation required thoughtful batching, filtering, and update strategies to keep the system performant and economically sustainable.
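The hand-rolled silence detection we eventually abandoned looked roughly like the sketch below: track RMS energy per audio frame and flag end-of-speech after enough consecutive quiet frames. The buffer format, thresholds, and class shape here are illustrative, not our exact code — the real pipeline now delegates this to ElevenLabs agent mode:

```typescript
// Rough sketch of energy-based silence detection (illustrative thresholds).
// Audio arrives as PCM sample frames with values in [-1, 1]; we flag
// silence once RMS energy stays below a threshold for enough frames.
function rms(frame: Float32Array): number {
  let sum = 0;
  for (let i = 0; i < frame.length; i++) sum += frame[i] * frame[i];
  return Math.sqrt(sum / frame.length);
}

class SilenceDetector {
  private quietFrames = 0;
  constructor(
    private readonly threshold = 0.01, // RMS below this counts as quiet
    private readonly framesNeeded = 25 // ~0.5 s at 20 ms frames
  ) {}

  // Returns true once enough consecutive quiet frames have been seen;
  // any loud frame resets the counter.
  push(frame: Float32Array): boolean {
    this.quietFrames = rms(frame) < this.threshold ? this.quietFrames + 1 : 0;
    return this.quietFrames >= this.framesNeeded;
  }
}
```

The brittleness came from exactly this kind of logic: fixed thresholds that work in a quiet room fail with wind, machinery, or distant speech, which is why agent mode won out.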
Accomplishments that we're proud of
We built a fully voice-first field assistant that works in real conditions, allowing users to ask questions, take photos, and receive actionable guidance without interrupting their workflow. Bringing together voice, vision, search, calculations, and weather intelligence into a single, cohesive experience was a significant achievement. We’re proud of how Field AI delivers grounded, trustworthy outputs. By constraining GenAI with structured data and grounding inputs, we were able to provide reliable pest identification, chemical guidance, and dosage calculations without hallucinated or unsafe recommendations. Another key accomplishment is the system’s flexibility and scalability. The modular architecture makes it easy to extend Field AI to new crops, regions, products, and regulations, while keeping response times fast and interactions natural. Most importantly, we built a tool that feels genuinely useful in the field, not just impressive in a demo.
What we learned
Building Field AI reinforced how important it is to design for real-world usage rather than ideal conditions. Voice interactions behave very differently in noisy, outdoor environments, and building reliable systems means embracing constraints instead of fighting them. We learned that grounding AI responses in structured data is essential when dealing with high-stakes domains like agriculture and chemical usage. Free-form answers may sound convincing, but accuracy, traceability, and consistency matter far more than eloquence. We also learned that context management is one of the hardest problems in multimodal systems. Maintaining continuity across voice, images, calculations, and weather inputs requires deliberate architecture and constant refinement. Finally, we learned that simplicity at the user level often requires significant complexity behind the scenes — and that trade-off is worth it when the end result genuinely helps people do their work better.
What's next for Field AI
Next, Field AI will move beyond guidance into execution. With access to real-time pricing and user location, the platform will be able to generate accurate quotes directly from voice or image-based requests. Users will be able to confirm quantities, compare options, and receive clear cost breakdowns without leaving the field. As banking integrations are introduced, Field AI will support initiating payments directly from the platform. Once a payment is confirmed, orders can be dispatched for delivery, closing the loop from identification and advice to purchase and fulfillment. This evolution turns Field AI from a decision-support tool into an end-to-end field commerce and logistics assistant, helping users move seamlessly from problem to solution with minimal friction.
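The planned quoting step could be sketched as follows: given a required quantity and candidate products with local prices, produce a sorted cost breakdown. Everything here — product shape, price fields, the sort order — is a hypothetical illustration of the direction, not a built feature:

```typescript
// Hypothetical sketch of the planned quoting flow: turn a required product
// quantity plus locally priced offers into a simple, sorted cost breakdown.
interface ProductOffer {
  name: string;
  containerSizeL: number;
  pricePerContainer: number;
}

interface QuoteLine {
  name: string;
  containers: number;
  totalCost: number;
}

function buildQuotes(requiredL: number, offers: ProductOffer[]): QuoteLine[] {
  return offers
    .map((o) => {
      // Round up to whole containers, then price the purchase.
      const containers = Math.ceil(requiredL / o.containerSizeL);
      return {
        name: o.name,
        containers,
        totalCost: containers * o.pricePerContainer,
      };
    })
    .sort((a, b) => a.totalCost - b.totalCost); // cheapest option first
}
```

Sorting by total cost (rather than unit price) matters because container rounding can make a nominally cheaper product more expensive for a given field.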
Built With
- elasticsearch
- elevenlabs
- google-gen-ai
- nestjs
- nextjs