⚡ The Heart: Gemini 2.5 Flash Native Audio

Powered by the Live API, this model brings "Mac" (our 30-year veteran persona) to life.

  • Low-Latency Voice: We use native audio generation to achieve sub-200ms responses, eliminating the delay of a separate, robotic-sounding text-to-speech (TTS) step.
  • Agentic Control: Through function calling, Mac drives the kiosk UI in real time, triggering video demos, searching inventory, and visualizing aisle locations naturally during conversation.
  • Automated Agent Testing: Antigravity launched an agent that developed tests for the application, fixed bugs, and retested until the app worked as expected.
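
As a rough illustration of the function-calling flow described above, the sketch below routes a tool call (a name plus arguments, as a model would emit it) to a kiosk UI action. The tool names `search_inventory`, `show_aisle_location`, and `play_video_demo`, the in-memory inventory, and the `ui_action` result format are all assumptions made for this example, not the project's actual code.

```python
from typing import Any, Callable, Dict

# Hypothetical in-memory inventory; a real kiosk would query a store database.
INVENTORY: Dict[str, Dict[str, int]] = {
    "compression fitting": {"aisle": 12, "in_stock": 40},
    "teflon tape": {"aisle": 12, "in_stock": 120},
}


def search_inventory(query: str) -> Dict[str, Any]:
    """Return items whose name contains the query (case-insensitive)."""
    q = query.lower()
    return {"matches": {n: i for n, i in INVENTORY.items() if q in n}}


def show_aisle_location(item: str) -> Dict[str, Any]:
    """Ask the UI to highlight the aisle that stocks the item."""
    info = INVENTORY.get(item.lower())
    if info is None:
        return {"error": f"unknown item: {item}"}
    return {"ui_action": "highlight_aisle", "aisle": info["aisle"]}


def play_video_demo(topic: str) -> Dict[str, Any]:
    """Ask the UI to play a pre-recorded demo video."""
    return {"ui_action": "play_video", "topic": topic}


# Map tool names (as the model would emit them) to local handlers.
HANDLERS: Dict[str, Callable[..., Dict[str, Any]]] = {
    "search_inventory": search_inventory,
    "show_aisle_location": show_aisle_location,
    "play_video_demo": play_video_demo,
}


def dispatch_tool_call(name: str, args: Dict[str, Any]) -> Dict[str, Any]:
    """Run the handler for one tool call and return a result for the model."""
    handler = HANDLERS.get(name)
    if handler is None:
        return {"error": f"unknown tool: {name}"}
    try:
        return handler(**args)
    except TypeError as exc:  # missing or unexpected arguments
        return {"error": f"bad arguments for {name}: {exc}"}
```

Returning an error dictionary instead of raising lets the model hear about a bad call and recover in conversation rather than crashing the session.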

🧠 The Brain: Gemini 3 Flash

Gemini 3 serves as our deep visual cortex for complex problem solving.

  • Multimodal Analysis: When a user presents a broken part, Mac activates analyze_part to capture a high-res snapshot.
  • Technical Reasoning: We rely on Gemini 3's superior reasoning to identify obscure hardware (like specific compression fittings) and generate safety-critical execution steps that standard models miss.
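
To make the `analyze_part` hand-off more concrete, here is a minimal sketch of how such a tool might be declared and how a captured snapshot could be packaged for a second, vision-capable model. The `user_description` parameter, the prompt wording, and the JPEG assumption are hypothetical. The payload follows the general shape of a Gemini REST request with inline image data, but the field names should be checked against the current API reference before use.

```python
import base64
from typing import Any, Dict

# Tool declaration in the JSON-schema style used for function calling.
# The single parameter is an assumption made for this sketch.
ANALYZE_PART_DECLARATION: Dict[str, Any] = {
    "name": "analyze_part",
    "description": "Capture a snapshot of the part the customer is "
                   "holding and identify it.",
    "parameters": {
        "type": "object",
        "properties": {
            "user_description": {
                "type": "string",
                "description": "What the customer said about the part.",
            }
        },
        "required": ["user_description"],
    },
}


def build_analysis_request(image_bytes: bytes,
                           user_description: str) -> Dict[str, Any]:
    """Package a JPEG snapshot and a prompt as one multimodal request body."""
    if not image_bytes:
        raise ValueError("snapshot is empty")
    encoded = base64.b64encode(image_bytes).decode("ascii")
    prompt = (
        "Identify the plumbing part in this photo and list safe, "
        "step-by-step repair instructions. "
        f"Customer description: {user_description}"
    )
    return {
        "contents": [{
            "parts": [
                {"inline_data": {"mime_type": "image/jpeg", "data": encoded}},
                {"text": prompt},
            ]
        }]
    }
```

The returned dictionary would then be sent to the reasoning model, and its answer passed back to the voice model as the tool result.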

🚀 The Synergy

Together, they create a seamless experience: Gemini 2.5 delivers immediate, human-like service, while Gemini 3 provides the deep technical verification—transforming a generic kiosk into a master plumber.

Beyond the 200 word limit...

Inspiration

I needed to fix a water leak under the sink. I went to a big-box hardware store for help, and when I finally found someone, they gave me bad information, so the leak did not get fixed. I then went to a small hardware store, where they explained how to assemble compression fittings properly.

How I built it

I started out with Google AI Studio to build the concept and test the latency. It was good, but I needed a more powerful tool, so I tried Google Antigravity for the first time. Antigravity helped me "vibe code" the entire demo application. In the past, I spent months learning a new programming language such as Java to create an Android app. Now I feel I can create anything very quickly.

Challenges I ran into

  1. I spent two days with Antigravity troubleshooting an error in the app with no success. It wasn't Antigravity's fault: the Live API model I was using (gemini-2.5-flash-native-audio-preview-012-2025) had a bug that produced Error 1008. I finally solved the problem by asking the Gemini 3 Pro - Thinking chatbot. It directed me to discuss.ai.google.dev, which as a novice I knew nothing about, and sure enough, Error 1008 is fully documented there.
  2. I wanted to create and play real-time "How To" videos, but the latency is currently too high. It will make a nice future addition.
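
Close code 1008 is the standard WebSocket "policy violation" code (RFC 6455). The sketch below is not the fix documented on the forum. It is only a generic pattern, written under my own assumptions, for logging the close code and retrying a session a bounded number of times so that a recurring 1008 becomes visible quickly instead of failing silently. The `run_session` callable is hypothetical and stands in for whatever opens the live connection and returns its close code.

```python
import time
from typing import Callable

NORMAL_CLOSURE = 1000    # WebSocket close codes defined in RFC 6455
POLICY_VIOLATION = 1008


def run_with_reconnect(run_session: Callable[[], int],
                       max_attempts: int = 3,
                       base_delay: float = 0.5,
                       sleep: Callable[[float], None] = time.sleep) -> int:
    """Run a session until it closes normally or attempts run out.

    Returns the last close code so the caller can decide what to show.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    code = NORMAL_CLOSURE
    for attempt in range(max_attempts):
        code = run_session()
        if code == NORMAL_CLOSURE:
            return code
        # Log every abnormal close so a repeating code stands out.
        print(f"session closed with code {code} "
              f"(attempt {attempt + 1}/{max_attempts})")
        if attempt < max_attempts - 1:
            sleep(base_delay * (2 ** attempt))  # exponential backoff
    return code
```

If the function returns `POLICY_VIOLATION` after every attempt, the problem is unlikely to be transient, which is a good signal to search the developer forum for the code rather than keep debugging the app.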

Accomplishments that I'm proud of

  • Building a working demo app in 4 days. It would have been only 2 days if I had known about the Google developers forum. :-)

What I learned

  • Antigravity can be used to create any app you can imagine. It can be used by non-programmers.
  • Automated agent testing can test all the features of your app, fix the bugs it finds, and retest until the app works as expected.

What's next for AIKiosQ

  • I would love to launch an AI-based kiosk company, but I lack the connections to find a sponsor at a big-box store.
  • My hope is to get a 30-minute interview with the AI Futures Fund team.
  • I would like to incorporate a Veo model to generate how-to videos on the fly, but for now the latency is too high and could lose the customer's interest.
  • Add Acoustic Echo Cancellation (AEC) so the microphone does not pick up the speaker output and cause the model to accidentally interrupt itself.
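
Until real AEC is in place, one simpler stopgap is a half-duplex gate that drops microphone frames while the kiosk is speaking. This is not echo cancellation, and it gives up the ability for a customer to interrupt Mac mid-sentence. The sketch below only illustrates the idea; the class name, the method names, and the 0.3-second hold time are assumptions.

```python
class HalfDuplexGate:
    """Decide whether a microphone frame should be sent to the model.

    Frames are dropped while audio is playing and for a short hold
    period afterwards, to let room echo die down.
    """

    def __init__(self, hold_seconds: float = 0.3) -> None:
        self.hold_seconds = hold_seconds
        self._playing = False
        self._last_playback_end = float("-inf")

    def playback_started(self) -> None:
        """Call when the speaker starts playing model audio."""
        self._playing = True

    def playback_stopped(self, now: float) -> None:
        """Call when the speaker goes quiet; `now` is a timestamp in seconds."""
        self._playing = False
        self._last_playback_end = now

    def should_forward(self, now: float) -> bool:
        """Return True if a mic frame captured at `now` should be sent."""
        if self._playing:
            return False
        return (now - self._last_playback_end) >= self.hold_seconds
```

In an audio loop, each captured frame would be forwarded only when `should_forward(time.monotonic())` returns True.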
