Inspiration

I worked in oil and gas for a year and a half as an offshore production engineer before switching careers into tech. That experience made one thing very clear — the field is where decisions are hardest and support is thinnest. You're standing next to critical equipment, making high-stakes calls with whatever knowledge you carry in your head.

The moment I moved into tech I had AI assistants, instant documentation, and expert tools at my fingertips. That gap shouldn't exist. The field is exactly where an extra brain — one that can see, think, and search — would matter most. That's what Gemy AI is built to be.


What it does

Gemy AI is a hands-free field assistant that can see the issue through your camera, listen to your description, search for common causes, and walk you through a step-by-step diagnosis and fix plan. It also annotates images to highlight specific parts in your view, uses web search to personalize findings to your exact equipment, and generates a full PDF repair report at the end of the session.


How we built it

I built Gemy AI on the Gemini Live API, with ADK (Agent Development Kit) handling the bidirectional streaming pipeline. That made streaming real-time audio and video to the model much cleaner to manage.
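To make "bidirectional streaming" concrete, here is a minimal sketch of the pattern using only Python's standard library. It is not the ADK or Gemini Live API: `fake_model`, `send_media`, `receive_events`, and the chunk names are all stand-ins invented for illustration. The point is only that sending media and receiving responses run concurrently instead of taking turns.

```python
import asyncio


async def fake_model(uplink: asyncio.Queue, downlink: asyncio.Queue) -> None:
    """Stand-in for the live model: answers each chunk as soon as it arrives."""
    while True:
        chunk = await uplink.get()
        if chunk is None:  # end-of-stream sentinel
            await downlink.put(None)
            return
        await downlink.put(f"ack:{chunk}")


async def send_media(uplink: asyncio.Queue, chunks: list) -> None:
    """Client -> model: push audio/video chunks without waiting for replies."""
    for chunk in chunks:
        await uplink.put(chunk)
        await asyncio.sleep(0)  # yield so the other tasks can run in between
    await uplink.put(None)


async def receive_events(downlink: asyncio.Queue) -> list:
    """Model -> client: collect responses while sending is still in progress."""
    events = []
    while (event := await downlink.get()) is not None:
        events.append(event)
    return events


async def run_session(chunks: list) -> list:
    uplink, downlink = asyncio.Queue(), asyncio.Queue()
    # All three coroutines run concurrently on one event loop.
    _, _, events = await asyncio.gather(
        fake_model(uplink, downlink),
        send_media(uplink, chunks),
        receive_events(downlink),
    )
    return events


events = asyncio.run(run_session(["audio-0", "frame-0", "audio-1"]))
```

In the real build, ADK takes care of this plumbing, which is what made it cleaner to work with.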


Challenges we ran into

The biggest challenge was the multi-agent architecture I originally planned: a dedicated sub-agent for each phase (intake, diagnose, plan, and replan). The problem was latency. Every handoff between agents adds delegation overhead, and that overhead accumulates across the phases. For a low-latency, real-time assistant, that was a dealbreaker. I simplified to a single agent with full tool access, which was the right call for the use case.
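A back-of-the-envelope model shows why the handoffs hurt. The numbers below are made up for illustration, not measurements from Gemy AI, and `Phase`, `multi_agent_latency`, and `single_agent_latency` are hypothetical names. The model assumes each phase is reached through one delegation that costs a fixed overhead.

```python
from dataclasses import dataclass


@dataclass
class Phase:
    name: str
    work_ms: int  # hypothetical time the phase itself needs


# The four phases from the original design, with invented timings.
PHASES = [
    Phase("intake", 300),
    Phase("diagnose", 300),
    Phase("plan", 300),
    Phase("replan", 300),
]


def multi_agent_latency(phases, handoff_ms):
    """Each phase pays its own work plus one delegation handoff."""
    return sum(p.work_ms + handoff_ms for p in phases)


def single_agent_latency(phases):
    """One agent with full tool access pays only for the work."""
    return sum(p.work_ms for p in phases)


multi = multi_agent_latency(PHASES, handoff_ms=400)
single = single_agent_latency(PHASES)
overhead = multi - single  # grows linearly with the number of handoffs
```

With these assumed values the handoffs alone account for more time than the actual work, and a spoken conversation makes that kind of delay very noticeable.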


Accomplishments that we're proud of

I heard about the hackathon late and hesitated — I had no prior experience building real-time AI agents. I'm proud I jumped in anyway, learned ADK from the docs during the build, and shipped something working.

The biggest realization: building live AI agents doesn't have to be hard. With ADK you can bring a real idea to life in the simplest way possible.


What we learned

I learned a lot in a short sprint: agentic design patterns, root/sub-agent architecture, AgentTools, function tools, session state, artifact handling, bidirectional audio/video streaming to native models, voice activity detection (VAD), barge-in handling, the difference between half-cascade and native audio model architectures, and CI/CD integration with GCP and GitHub Actions.
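Two of those terms are easier to grasp with a toy example. The sketch below is a simplified illustration, not how the Gemini Live API or ADK implements them: it uses a naive energy threshold for VAD, and `AssistantPlayback`, `SPEECH_RMS_THRESHOLD`, and the sample frames are all hypothetical. VAD decides whether a microphone frame contains speech, and barge-in means the assistant stops talking the moment the user starts.

```python
import math

SPEECH_RMS_THRESHOLD = 500.0  # hypothetical cutoff for 16-bit PCM samples


def rms(frame):
    """Root-mean-square energy of one frame of PCM samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def is_speech(frame, threshold=SPEECH_RMS_THRESHOLD):
    """Energy-based VAD: treat a frame as speech when it is loud enough."""
    return rms(frame) >= threshold


class AssistantPlayback:
    """Tracks whether the assistant is currently speaking."""

    def __init__(self):
        self.speaking = False
        self.interruptions = 0

    def start(self):
        self.speaking = True

    def on_mic_frame(self, frame):
        """Barge-in: stop playback as soon as the user starts speaking."""
        if self.speaking and is_speech(frame):
            self.speaking = False
            self.interruptions += 1
            return True
        return False


silence = [0] * 160          # a quiet frame
loud = [1000, -1000] * 80    # a frame with RMS energy of 1000

playback = AssistantPlayback()
playback.start()
heard_silence = playback.on_mic_frame(silence)  # assistant keeps talking
heard_user = playback.on_mic_frame(loud)        # assistant is interrupted
```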


What's next for Gemy AI

The hackathon feedback will be my first real validation of the prototype. If it lands well, the plan is to make this genuinely useful by adding equipment history tracking, connecting manufacturer manuals through RAG, and building agent evaluations to systematically improve the diagnostic flow.
