💡 Inspiration
It started with a simple problem: my friend accidentally pushed an AWS key to GitHub. By the time we noticed, some bot had already cloned the repo and racked up $500 in crypto mining charges. 😱
That got me thinking – what if we could catch mistakes before they happen? What if an AI could look at your code, your setup, even your physical workspace, and spot dangers instantly?
I wanted to build something that feels like having a security expert looking over your shoulder 24/7 – but without the expensive consulting fees.
🎯 What it does
AuditVision AI is your pocket-sized safety inspector.
Point your camera at anything – a messy server rack, a suspicious email, a burnt power cable, even your homework – and it:
- Analyzes what it sees using Gemini 2.5 Flash
- Identifies risks with severity levels (HIGH/MEDIUM/LOW)
- Suggests solutions you can actually use
- Talks back with a human-like voice summary
- Remembers everything in Google Sheets (audit trail)
- Answers follow-up questions – just click the button and ask!
It's like having a safety consultant, a compliance officer, and a helpful friend all rolled into one, living in your browser.
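To make the "risks with severity levels" and "audit trail" ideas concrete, here is a minimal sketch of how one finding could be represented and flattened into a spreadsheet row. The names `Severity`, `Finding`, `to_row`, and `most_urgent_first` are hypothetical, chosen for illustration; the real project may structure its data differently, and the actual Google Sheets call is left out on purpose.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import IntEnum


class Severity(IntEnum):
    """Severity levels, ordered so that HIGH sorts above LOW."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass
class Finding:
    """One identified risk (hypothetical shape, not the project's schema)."""
    risk: str
    severity: Severity
    solution: str

    def to_row(self, when: datetime) -> list[str]:
        # Flatten the finding into one row for an audit-trail spreadsheet.
        return [when.isoformat(), self.severity.name, self.risk, self.solution]


def most_urgent_first(findings: list[Finding]) -> list[Finding]:
    # Sort so the highest-severity findings come first.
    return sorted(findings, key=lambda f: f.severity, reverse=True)


if __name__ == "__main__":
    found = [
        Finding("messy cables", Severity.LOW, "use cable ties"),
        Finding("exposed AWS key", Severity.HIGH, "rotate the key"),
    ]
    for f in most_urgent_first(found):
        print(f.to_row(datetime.now(timezone.utc)))
```

Using an ordered enum keeps the sort trivial, and a flat list of strings is the kind of row a spreadsheet append would typically accept.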
🛠️ How we built it
I started with a simple idea: "Can I make Gemini see what I see?"
The stack came together naturally:
- Google Colab as the free cloud host (bless their hearts!)
- Gemini 2.5 Flash for vision & conversation
- JavaScript magic for camera, microphone, and voice
- Google Sheets API for logging (because spreadsheets are forever)
- ipywidgets for that button-click satisfaction
The hardest part? Getting the voice to work reliably. The browser speech APIs are powerful but picky – one wrong move and you get "Didn't catch that" errors forever.
🧗 Challenges we ran into
Oh boy, where do I start?
The Permission Dance 🕺
Browsers are paranoid (rightfully so!). Getting camera AND microphone permissions to play nice in Colab took more trial and error than I'd like to admit. The first 20 versions just silently failed.
The Voice Recognition Monster 🎤
"Why isn't it hearing me?!" – I lost sleep over this. Turns out, if you don't disable the button while it's listening, every extra click starts another listening session on top of the first one, and chaos ensues.
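The fix boils down to a "busy" guard. The project does this in browser JavaScript by disabling the button; the sketch below shows the same idea in plain Python so it can run anywhere. `ListenGuard` and the `listen` callable are hypothetical names for illustration, not the project's code. With an ipywidgets button, the analogous step would be toggling its `disabled` attribute.

```python
class ListenGuard:
    """Ignore extra clicks while one listening session is in progress."""

    def __init__(self, listen):
        self._listen = listen  # callable that listens and returns a transcript
        self.busy = False      # plays the role of the disabled button

    def on_click(self):
        if self.busy:
            return None        # a session is already running; drop this click
        self.busy = True
        try:
            return self._listen()
        finally:
            self.busy = False  # re-enable even if listening raised an error


if __name__ == "__main__":
    guard = ListenGuard(lambda: "what is the biggest risk here?")
    print(guard.on_click())
```

Resetting the flag in `finally` matters: if the recognizer fails, the button should come back instead of staying locked.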
JSON Parsing Nightmares
Gemini sometimes gets creative with its JSON output. One day it's perfect, the next it adds a friendly "Here's your JSON:" message that breaks everything. Building a parser that survives Gemini's mood swings was an adventure.
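One way to survive the chatty preamble is to skip ahead to the first thing that actually parses as JSON. This is a minimal sketch of that idea, not necessarily the parser the project ended up with; `extract_json` is a hypothetical helper name, and it only uses Python's standard `json` and `re` modules.

```python
import json
import re

FENCE = "`" * 3  # Markdown code-fence marker, built here to keep this block tidy


def extract_json(text: str):
    """Return the first JSON object or array found in `text`, or None."""
    # Drop Markdown fence markers (with or without a "json" tag) if present.
    cleaned = re.sub(re.escape(FENCE) + r"(?:json)?", "", text)
    decoder = json.JSONDecoder()
    # Try decoding from each '{' or '[' until one position parses cleanly.
    for match in re.finditer(r"[{\[]", cleaned):
        try:
            value, _end = decoder.raw_decode(cleaned, match.start())
            return value
        except json.JSONDecodeError:
            continue
    return None


if __name__ == "__main__":
    reply = "Here's your JSON:\n" + FENCE + 'json\n{"severity": "HIGH"}\n' + FENCE
    print(extract_json(reply))
```

Returning `None` instead of raising lets the caller fall back to a friendly "couldn't read the analysis, try again" message.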
The "Works on My Machine" Syndrome
Everything worked perfectly in my Chrome... and failed everywhere else. Safari? Nope. Firefox? Good luck. I had to add fallbacks and graceful failures.
🏆 Accomplishments that we're proud of
- First successful voice interaction – when the agent actually answered my question out loud, I literally cheered
- Zero-cost architecture – everything runs on free tiers, proving you don't need money to build cool stuff
- Button design – that simple green button that Just Works™ after weeks of tweaking
- Real-world testing – used it to audit my own messy desk and it correctly identified "cable management hazard"
- Learning to fail better – every error message taught me something new
📚 What we learned
Technical lessons:
- Browser APIs are powerful but demand respect
- Always, ALWAYS handle errors gracefully
- JSON parsing needs a PhD in "what might Gemini do today"
- Permissions must be requested explicitly and early
Life lessons:
- Building something useful is 10% coding and 90% debugging edge cases
- High school students CAN build AI agents (we're not just consumers!)
- The open source community is incredibly generous with knowledge
- When you're stuck, sleeping on it actually works
🚀 What's next for AuditVision AI
Immediate plans:
- Add multi-image support (compare before/after)
- Generate PDF reports automatically
- Add email delivery option
Dream features:
- Mobile app version (because Colab on phone is clunky)
- Multi-language support (habla español?)
- Integration with Slack – imagine your team getting alerts when someone's about to push a secret!
- Custom training for specific industries (construction safety, medical compliance, etc.)
The big vision: Make security and compliance accessible to everyone – not just companies with big budgets. Every small team, every student project, every hobbyist deserves a guardian angel watching over their work.
