Inspiration

In the Big 2026, phone usage is higher than it has ever been. People check their phones almost 200 times per day on average, and this habit shows up as distraction, lost productivity, and a struggle to meet personal goals. Our team knows first-hand how hard it is to get work done without being distracted by our phones - we've tried cookie jars, screen-time limits, app blockers... None of it has truly been effective in lowering our phone usage. With the Icebox, we wanted to make a product that excels where these conventional phone-locking methods fall short.

What it does

Conceptually, the Icebox is simple: it physically locks your phone away behind a door and only unlocks when you can prove to the LLM - Gemini, in this case - that you have done the work you set out to do when you locked the phone away. When you put the phone in, it asks what you're working on, and when you decide you've reached your goal, you come back to retrieve your phone. The Icebox checks that you have reached your goal before giving your phone back by asking a verifying question. Give a satisfactory answer, and the AI will open the door. Otherwise? Maybe you don't need your phone back yet.

Our team will be the first to say that it feels like AI gets thrown into anything and everything these days. AI toasters. AI backpacks. AI bananas. But our goal with this project was to put it somewhere where it wouldn't be wasted - to use its intelligence to measure something that otherwise can't be measured: accountability. Think of it as giving your phone to your friend and asking them not to give it back until you're done studying; it's the same idea, but in a lockbox that sits neatly on your tea table.

This project also expands the scope of how we benefit from AI in our daily lives, giving it a physical "body" to apply its thinking to instead of confining it to text, images, or audio.

How we built it

The mechanical components of the box - including the box itself - were CADded in Onshape and 3D printed. Because of their size, some components had to be split and printed in multiple passes, with alignment pins used to join the pieces. These mechanical parts were then assembled with adhesives (e.g. hot glue) or simply press-fit together. The hinge and supports of the door were fabricated out of barbecue skewers.

The microphone of the device was built from a CZN-15E electret microphone module and a custom-wired microphone pre-amp on a small breadboard in order to properly receive and read audio data. That audio data was fed to an ESP32, which also orchestrated all other functions and components: controlling the servo (for opening the door), controlling the indicator LEDs, and handling wireless connectivity and cloud access. Practically all connections were made on a standard 830-point breadboard, with wires running through routing channels in the chassis as required.

Recorded audio data was sent to a cloud server hosted on Render, which in turn sent requests to the Gemini 3 Flash Preview API. This API natively supports audio input/transcription and audio output/TTS. The audio returned by the API call is routed through the ESP32 to the audio output (see below), and the response is also used to trigger the box's states - locked, open, and idle. These states determine the current behaviour of the box.

For audio output, we were limited to a Nest Mini as our speaker, which required wirelessly transmitting audio data over WiFi from the ESP32. The software side, which creates and coordinates the functions of the project, was written in C++ for the ESP32 with PlatformIO.
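The three states (locked, open, idle) can be pictured as a small state machine. The sketch below is only an illustration of that idea, not the firmware from the project: the event names, the `nextState` and `doorLatched` helpers, and the assumption that the door is latched only in the locked state are all hypothetical.

```cpp
#include <cstdint>

// The three states named in the write-up.
enum class BoxState : std::uint8_t { Idle, Locked, Open };

// Hypothetical events; the real firmware may model these differently.
enum class BoxEvent : std::uint8_t {
    GoalRecorded,    // user has said what they are working on
    AnswerAccepted,  // the model judged the verification answer satisfactory
    AnswerRejected,  // the model judged the answer unsatisfactory
    PhoneRemoved     // the session is over and the box can reset
};

// Pure transition function: any event that is not meaningful in the
// current state leaves the state unchanged.
constexpr BoxState nextState(BoxState state, BoxEvent event) {
    switch (state) {
        case BoxState::Idle:
            return event == BoxEvent::GoalRecorded ? BoxState::Locked : state;
        case BoxState::Locked:
            return event == BoxEvent::AnswerAccepted ? BoxState::Open : state;
        case BoxState::Open:
            return event == BoxEvent::PhoneRemoved ? BoxState::Idle : state;
    }
    return state;
}

// Assumption: the servo holds the door shut only while the box is locked.
constexpr bool doorLatched(BoxState state) {
    return state == BoxState::Locked;
}
```

Keeping the transition logic in a pure function like this means it can be checked on a laptop without the servo, microphone, or network attached; a rejected answer, for example, simply leaves the box in the locked state.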

Challenges we ran into

On the hardware side, our primary challenge was designing and rapidly producing the mechanical components of the device, updating and adding parts as they became necessary along the way. Balancing tolerances, print speed, aesthetics, and core functionality was quite a challenge and unfortunately meant iterating again on some parts when tolerances didn't work out - particularly at the hinge. Designing the hinge mechanism in general was quite challenging, as the position of the servo motor and other limitations (mostly materials, time, and space) meant we had to get our hands a bit dirty with the door-opening mechanism.

Limited access to parts posed quite a few challenges as well. To begin with, only having access to a CZN-15E microphone module (and no pre-amp or anything) meant we had to create our own custom pre-amp, which then would only fit in its own dedicated box and needed its wires routed to the main breadboard. Only having a Nest Mini as a speaker also meant we could only send audio out wirelessly, which is a bit (a lot) more finicky on an ESP32.

This also led to a number of software challenges as we tried to integrate a large number of imperfectly moving parts (and thus potential failure points) into a single pipeline. One major issue was long delays between events, such as between beginning the recording of the "work topic" and actually locking the door; another was warped, noisy, or otherwise low-quality audio streams. Both of these issues also indirectly degrade transcription, the LLM response, and everything else downstream.
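To give a sense of the kind of software-side cleanup that low-quality microphone audio can call for, here is a minimal sketch that removes the DC offset from raw readings and rescales them to 16-bit PCM. It is an illustration under assumptions, not the code that ran on the Icebox: the function name `conditionSamples`, the unsigned 16-bit input format, and the target peak of 30000 are all hypothetical.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical helper: centre raw unsigned readings around zero and scale
// them so the loudest sample lands at a fixed peak in signed 16-bit PCM.
std::vector<std::int16_t> conditionSamples(const std::vector<std::uint16_t>& raw) {
    std::vector<std::int16_t> out;
    if (raw.empty()) {
        return out;
    }

    // The mean of the buffer approximates the DC bias added by the pre-amp.
    std::int64_t sum = 0;
    for (std::uint16_t v : raw) {
        sum += v;
    }
    const std::int32_t mean =
        static_cast<std::int32_t>(sum / static_cast<std::int64_t>(raw.size()));

    // Largest deviation from the mean, used as the scaling reference.
    std::int32_t peak = 0;
    for (std::uint16_t v : raw) {
        const std::int32_t centered = static_cast<std::int32_t>(v) - mean;
        peak = std::max<std::int32_t>(peak, std::abs(centered));
    }

    out.reserve(raw.size());
    if (peak == 0) {
        out.assign(raw.size(), 0);  // pure DC input: output silence
        return out;
    }

    constexpr std::int32_t kTargetPeak = 30000;  // assumed headroom below 32767
    for (std::uint16_t v : raw) {
        const std::int32_t centered = static_cast<std::int32_t>(v) - mean;
        out.push_back(static_cast<std::int16_t>(centered * kTargetPeak / peak));
    }
    return out;
}
```

Normalising per buffer like this is simple but also amplifies background noise in quiet buffers, so a real pipeline would likely pair it with a noise gate or a fixed gain chosen by experiment.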

Accomplishments that we're proud of

Our team is quite proud overall of the work we have put in over this hackathon. We believe that the mechanical and structural components are quite polished considering the tight time frame (and limits on iteration), and we are also proud of our creative approaches to the issues we faced, such as building a custom pre-amp and enclosure at the last minute and the entire audio-to-audio (and decision-making) pipeline in general. We also think this is a genuinely well-formed idea for a tool that would benefit many people, and that our execution of the idea in this time frame has demonstrated the practicality of the project and its potential in future iterations or variants.

What we learned

Beyond any API, skill, or physics theory that we learned in the making of this project, what we have learned above all from this hardware hack is that integrating hardware and software at such an intricate level poses a whole range of unique challenges. Gathering audio input isn't as simple as "plug in a microphone"; it requires careful tuning of countless parameters, both in software and in hardware. Working with an API isn't always as simple as "just use the library and call a function with the key". Challenges like working through complicated processing pipelines, or dealing with funky behaviours when streaming audio from an ESP32 to a Google Nest Mini, are just a real and inevitable part of building real hardware projects. Theoretical storyboarded plans are inevitably going to get derailed by bugs that take three cans of Red Bull to clear. But we also know that being able to work through these kinds of challenges, the way we have already had to a number of times during this hackathon, makes us all the more well-prepared to continue creating projects that bridge software and hardware and do more than any one of us could accomplish alone.

What's next for Icebox

Although Icebox is obviously not a "production-ready" project in its current state, we believe that the idea behind it could serve as the basis for a product that would truly be useful to countless people. Given the opportunity, we would love to return to this project and iterate on it with more features (e.g. device detection), a cleaner footprint, and a more polished user experience overall. Such future work would also include switching the LLM to a more optimized, perhaps in-house or fine-tuned model, and hosting it ourselves for improved efficiency and sustainability.
