Inspiration

It all started with a simple yet frightening moment: one of our teammates was traveling through a crowded metro when someone tried to unzip their backpack without them realizing it. The incident left them shaken, not because of what was lost, but because of how helpless that moment felt. We realized that while we protect our phones with biometrics and passwords, our own safety in public spaces still relies on awareness alone, and human awareness has blind spots. That moment became our motivation to build High Hat, a wearable AI companion that literally watches your back.

What it does

High Hat is a Raspberry Pi-powered, AI-driven wearable camera system that detects suspicious hand movements or theft attempts behind a person in real time. It uses computer vision, gesture analysis, and Google Gemini-powered threat interpretation to decide whether an action is potentially harmful, then immediately alerts the wearer through audio warnings synthesized with ElevenLabs.
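To make the alert path concrete, here is a minimal sketch of the audio warning step, assuming the v1 elevenlabs Python SDK; the speak_alert helper and the voice_id are illustrative placeholders rather than our exact production code.

```python
# Minimal sketch of the audio alert step (elevenlabs v1 SDK assumed).
from elevenlabs.client import ElevenLabs
from elevenlabs import play

client = ElevenLabs(api_key="YOUR_ELEVENLABS_API_KEY")

def speak_alert(message: str) -> None:
    """Synthesize a short warning and play it through the wearer's speaker."""
    audio = client.text_to_speech.convert(
        text=message,
        voice_id="JBFqnCBsd6RMkjVDRZzb",  # placeholder voice ID
        model_id="eleven_multilingual_v2",
        output_format="mp3_44100_128",
    )
    play(audio)  # playback needs ffmpeg or mpv installed on the Pi

speak_alert("Suspicious movement detected behind you.")
```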

How we built it

We started by connecting a Raspberry Pi Camera Module 3 to continuously capture video from behind the user. Using MediaPipe for hand detection, we tracked suspicious gestures like reaching or grabbing. Captured frames were analyzed with the Gemini 2.5 Pro API to provide contextual understanding, distinguishing harmless movement from possible theft. Detected threats were logged through a Flask + SQLAlchemy backend, displayed live on a web dashboard, and paired with real-time audio alerts through the ElevenLabs API. We combined hardware, AI, and web technologies, fusing the physical and digital worlds into a seamless personal safety experience.
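Stripped of the dashboard and database plumbing, the core loop looks roughly like the sketch below; the camera index, prompt wording, and assess_frame helper are illustrative stand-ins for our actual code.

```python
# Rough sketch of the capture-and-detect loop (helper names illustrative).
import cv2
import mediapipe as mp
import google.generativeai as genai
from PIL import Image

genai.configure(api_key="YOUR_GEMINI_API_KEY")
model = genai.GenerativeModel("gemini-2.5-pro")

def assess_frame(frame_bgr) -> str:
    """Ask Gemini whether the hands in this frame look like a theft attempt."""
    image = Image.fromarray(cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB))
    response = model.generate_content([
        "You watch the area behind a pedestrian. Does this frame show a hand "
        "reaching toward the wearer's bag? Answer THREAT or SAFE.",
        image,
    ])
    return response.text.strip()

cap = cv2.VideoCapture(0)  # Pi camera exposed as a standard video device
with mp.solutions.hands.Hands(max_num_hands=2,
                              min_detection_confidence=0.6) as hands:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            continue
        results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.multi_hand_landmarks:          # a hand is in view
            if assess_frame(frame) == "THREAT":   # Gemini confirms intent
                speak_alert("Suspicious movement detected behind you.")
cap.release()
```

Only frames that MediaPipe flags ever reach Gemini, which keeps API calls and latency down on the Pi's modest hardware.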

Challenges we ran into

  1. We spent hours chasing random crashes and video feed timeouts. We assumed it was an issue with the Pi's config, but it turned out the Wi-Fi network settings were blocking IP and SSH connections. Switching to Tailscale gave us a secure VPN to stream the video feed remotely without connection issues.
  2. Running Flask on a 2GB Pi, alongside laptops with weak processors and spotty Wi-Fi, forced us to profile every piece of the pipeline and decide where each should run: on the Pi, locally on a laptop, or in the cloud.
  3. Gemini wanted base64-encoded images, ElevenLabs wanted text, Flask wanted JSON, and none of them wanted to wait. Synchronizing these APIs was harder than working with any single one of them.
  4. Early versions kept mixing timestamps, so the audio alert would say “Threat detected” 10 seconds after the incident was over. The fix, sketched after this list, was to carry one capture timestamp with each event and drop anything stale before it reached the speaker.
  5. Sometimes, even waving to a friend triggered “Suspicious Movement Detected!” We had to tune confidence thresholds with the help of Gemini so High Hat stopped accusing innocent hands of crime.
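For items 3 and 4, the fix boiled down to putting a timestamped queue between detection and audio. Here is a minimal sketch of that freshness gate; the two-second window and helper names are illustrative, not our exact code.

```python
# Illustrative freshness gate between the detection loop and audio alerts.
import queue
import threading
import time

MAX_ALERT_AGE = 2.0            # seconds; older threats are silently dropped
events: queue.Queue = queue.Queue()

def on_threat_detected(description: str) -> None:
    """Called by the detection loop; the event is stamped at capture time."""
    events.put((time.monotonic(), description))

def alert_worker() -> None:
    while True:
        captured_at, description = events.get()
        if time.monotonic() - captured_at > MAX_ALERT_AGE:
            continue  # the moment has passed; don't announce old news
        speak_alert(description)  # speak_alert from the earlier sketch

threading.Thread(target=alert_worker, daemon=True).start()
```

Decoupling the producers this way also meant Gemini, ElevenLabs, and Flask no longer blocked each other on a single thread.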

Despite all that, each crash, freeze, and misdetection taught us how to build resilient real-time AI at the edge, and how to keep going when nothing seems to work.

Accomplishments that we're proud of

There was this moment late at night: screens glowing, the Raspberry Pi crashing for what felt like the 900th time. We were tired, frustrated, and half-convinced the Pi had developed a personal grudge against us.

And then… it worked. The audio alert finally said, “Suspicious movement detected.” For a second we just stared in disbelief, and then we exchanged high-fives. It was the sound of hope finally speaking back.

We built a fully functional AI security assistant that runs live on a Raspberry Pi, turning real-time video into instant alerts. Capture, hand detection, and audio playback all run on the device itself, with the cloud involved only for Gemini's contextual judgment.

But every failure became part of the fun. And every success brought that same rush of joy that reminded us why we built it.

What's next for High Hat

Our Raspberry Pi 3 setup was our biggest teacher. It showed us the beauty and the burden of edge computing. The Pi did its best, but it struggled: memory limits, frame drops, and heat spikes reminded us that real-time AI isn’t easy when your GPU is basically a potato. Yet it pushed us to optimize, to rethink every frame, every thread, every line of code.

We plan to upgrade to a Raspberry Pi 5, or to experiment with Jetson Nano or Coral TPU accelerators, to handle faster inference and higher-resolution feeds without throttling.

Our current gesture recognition relies on MediaPipe and Gemini’s reasoning layer: powerful, but limited by internet dependency and general-purpose models. We want to train our own gesture model on a diverse, real-world, inclusive dataset and deploy it with TensorFlow Lite, making High Hat smarter and more reliable in varied environments.
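On-device inference with such a model, exported to TensorFlow Lite, would run along these lines; the gesture_model.tflite file and the label set are hypothetical.

```python
# Hypothetical on-device inference with a custom TFLite gesture classifier.
import numpy as np
import tflite_runtime.interpreter as tflite

LABELS = ["idle", "reach", "grab"]  # hypothetical gesture classes

interpreter = tflite.Interpreter(model_path="gesture_model.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

def classify(frame_rgb: np.ndarray) -> str:
    """Classify one frame, assumed already resized to the model's input."""
    x = np.expand_dims(frame_rgb.astype(np.float32) / 255.0, axis=0)
    interpreter.set_tensor(inp["index"], x)
    interpreter.invoke()
    scores = interpreter.get_tensor(out["index"])[0]
    return LABELS[int(np.argmax(scores))]
```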

We aim to implement model quantization and pruning to fit deep learning models natively on low-power devices, enabling offline operation and ultra-low latency alerts.
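Post-training quantization in TensorFlow is mostly a converter flag; below is a sketch using a stand-in Keras model (pruning would come from the tensorflow_model_optimization toolkit on top of this).

```python
# Sketch: post-training dynamic-range quantization for TFLite deployment.
import tensorflow as tf

# Stand-in for a future trained gesture classifier (3 gesture classes).
gesture_model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(128, 128, 3)),
    tf.keras.layers.Conv2D(8, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(3, activation="softmax"),
])

# Dynamic-range quantization stores weights as int8: roughly 4x smaller,
# noticeably faster on CPUs like the Pi's, with little accuracy loss.
converter = tf.lite.TFLiteConverter.from_keras_model(gesture_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]

with open("gesture_model.tflite", "wb") as f:
    f.write(converter.convert())
```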

We’re reimagining High Hat as a compact, comfortable, and affordable personal safety companion: something you can clip to your backpack, mount on a cap, or wear on your shoulder without feeling its presence. We envision lightweight 3D-printed housings, low-power Pi Zero or Coral Edge TPU boards, and a modular attachment system that blends seamlessly into everyday fashion, like on a hat.

Our goal is to make personal AI safety as normal and natural as wearing a watch. Protection shouldn’t be a privilege; it should be part of your daily life.

Built With

Raspberry Pi, Raspberry Pi Camera Module 3, MediaPipe, Google Gemini, Flask, SQLAlchemy, ElevenLabs, Tailscale