Inspiration
The inspiration for Project Crest came from a simple, universal annoyance: watching a movie late at night, turning up the volume to hear quiet dialogue, only to be blasted by a sudden, deafening explosion. The modern media experience is filled with these jarring inconsistencies. I thought, "What if an agent could act like a Nest thermostat, but for volume?" That is, an intelligent system that doesn't just react, but proactively and autonomously keeps the listening experience comfortable.
This hackathon's challenge to "build agents that don't just think, they act" was the perfect catalyst. I wasn't just building a tool; I was building an autonomous agent that senses its environment in real-time, makes an intelligent decision, and takes meaningful action.
What it does
How we built it
Project Crest was built using a spec-driven development process with my AI coding agent, Kiro. I adopted a "risk-first" implementation plan, tackling the most complex architectural challenges first to ensure a solid foundation.
The system has a two-part architecture:
A Chrome Extension acts as the "sensor and actuator." Built with Manifest V3, it uses a MutationObserver to efficiently scrape live subtitle text from YouTube's dynamic DOM and directly manipulates the HTML5 video element to control volume.
A local Python Flask server acts as the "brain." This server is orchestrated by the OpenHands agent framework, which manages the environment and executes the core logic.
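To make the sensor/brain split concrete, here is a minimal sketch of the decision step the server might run for each subtitle it receives. The payload fields (`text`, `loud`, `target_volume`), the function name `handle_subtitle`, and the two volume levels are assumptions for illustration, not the project's actual contract. In the real server this logic would sit behind a Flask route, and `classify` would be the OpenAI call.

```python
import json
from typing import Callable

# Hypothetical volume levels on the 0.0 to 1.0 scale used by HTML5 media elements.
NORMAL_VOLUME = 1.0
DUCKED_VOLUME = 0.4


def handle_subtitle(body: bytes, classify: Callable[[str], bool]) -> dict:
    """Turn one subtitle payload from the extension into a volume decision.

    `classify` stands in for the reasoning step and returns True when the
    text describes a loud event.
    """
    payload = json.loads(body)
    text = str(payload.get("text", "")).strip()
    if not text:
        # Nothing to analyze, so leave the volume alone.
        return {"loud": False, "target_volume": NORMAL_VOLUME}
    loud = classify(text)
    target = DUCKED_VOLUME if loud else NORMAL_VOLUME
    return {"loud": loud, "target_volume": target}


if __name__ == "__main__":
    # Toy classifier used only for this demo.
    def demo_classifier(text: str) -> bool:
        return "explosion" in text.lower()

    print(handle_subtitle(b'{"text": "[EXPLOSION]"}', demo_classifier))
```

Keeping the decision in a plain function like this makes it easy to test without a browser or a network connection; the extension only needs to apply the returned `target_volume`.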
The project integrates a full stack of sponsor technologies:
OpenHands: The core framework that orchestrates the local agent, running the server and managing its tasks.
OpenAI: The reasoning engine. It receives subtitle text and determines if it describes a loud event, providing the core intelligence.
TrueFoundry AI Gateway: All calls to OpenAI are routed through this secure gateway. This was crucial for abstracting away the raw OpenAI key, and it provides a single, governable endpoint with built-in observability and cost-management features.
Datadog: The entire backend is instrumented for full-stack observability. I used ddtrace-run for automatic APM tracing, configured structured JSON logging for deep context, and implemented custom metrics with statsd to track key events like crest.loud_event.detected and crest.user_correction.count.
Structify: As a planned stretch goal, Structify was designed into the pipeline to pre-process and structure the raw, messy subtitle data before sending it to the AI for more reliable analysis.
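As a rough illustration of the observability pieces described above, the sketch below shows what a custom counter and a structured log line can look like. It uses only the standard library rather than the statsd client and ddtrace-run setup the project used, so treat it as a picture of the idea, not the project's code. The helper names, the `context` field, and the `mode:mock` tag are hypothetical; the metric names come from the write-up.

```python
import json
import logging
import socket


def counter_datagram(name: str, value: int = 1, tags=None) -> bytes:
    """Encode a counter in the DogStatsD text format: name:value|c|#tag1,tag2."""
    line = f"{name}:{value}|c"
    if tags:
        line += "|#" + ",".join(tags)
    return line.encode("utf-8")


def send_counter(name, value=1, tags=None, host="127.0.0.1", port=8125):
    """Send the counter over UDP; 8125 is the Datadog Agent's default DogStatsD port."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(counter_datagram(name, value, tags), (host, port))


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # `context` is a hypothetical extra attribute carrying event details.
        entry.update(getattr(record, "context", {}))
        return json.dumps(entry)


if __name__ == "__main__":
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    log = logging.getLogger("crest")
    log.addHandler(handler)
    log.setLevel(logging.INFO)
    log.info("loud event detected", extra={"context": {"subtitle": "[EXPLOSION]"}})
    print(counter_datagram("crest.loud_event.detected", tags=["mode:mock"]))
```

In practice a maintained client library is the better choice; the point here is only that each metric is a small, fire-and-forget message and each log line is machine-parseable.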
Challenges we ran into
The biggest challenge was the hackathon environment itself. The initial end-to-end tests failed because the server kept hanging. After some debugging with Kiro, we diagnosed that the API calls to the TrueFoundry gateway were timing out, most likely due to the guest WiFi's firewall blocking outbound requests.
This could have been a project-ending blocker. The solution was to implement a resilient "Mock Mode." I instructed Kiro to add fallback logic to the server that detects if the API key is missing or if the live call fails. In either case, it switches to a simple, rule-based keyword matching system. This decision unblocked all other development and testing, ensuring I had a fully functional and demonstrable product, independent of network conditions. It was a powerful lesson in building for resilience.
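The Mock Mode fallback can be sketched roughly as follows. This is a minimal illustration under stated assumptions: the keyword list, the function names, and the `"live"`/`"mock"` labels are hypothetical, and `live_classify` stands in for the gateway call, which is not shown.

```python
from typing import Callable, Optional, Tuple

# Hypothetical keyword list; the write-up does not say which words were used.
LOUD_KEYWORDS = ("explosion", "gunshot", "scream", "crash", "bang")


def keyword_classify(text: str) -> bool:
    """Rule-based check: does the subtitle mention an obviously loud event?"""
    lowered = text.lower()
    return any(word in lowered for word in LOUD_KEYWORDS)


def classify_with_fallback(
    text: str,
    live_classify: Optional[Callable[[str], bool]] = None,
    api_key: Optional[str] = None,
) -> Tuple[bool, str]:
    """Return (is_loud, mode), where mode is "live" or "mock".

    The live path is tried only when an API key and a live classifier are
    both available; any exception from it falls back to keyword matching.
    """
    if api_key and live_classify is not None:
        try:
            return live_classify(text), "live"
        except Exception:
            # Timeouts, connection errors, bad responses: use the rules instead.
            pass
    return keyword_classify(text), "mock"


if __name__ == "__main__":
    def unreachable_gateway(text: str) -> bool:
        raise TimeoutError("gateway timed out")

    print(classify_with_fallback("[EXPLOSION]"))
    print(classify_with_fallback("a loud bang", unreachable_gateway, api_key="demo"))
```

Returning the mode alongside the result makes it easy to tag logs and metrics with which path answered. A short timeout on the live call also matters, since the original symptom was a hanging server rather than a fast failure.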
Accomplishments that we're proud of
What we learned
This hackathon was an incredible learning experience, not just in technology but in strategy.
The Power of Adaptive Agents: The judges' emphasis on "creative evals" and "continuous improvement" was a pivotal moment. It inspired the addition of the user feedback loop, elevating the project from a simple tool to an adaptive agent that has the foundation to learn and personalize itself over time.
Spec-Driven Development is the Future: Working with an AI agent like Kiro is a force multiplier. By investing time upfront in creating clear, detailed requirements, design documents, and steering files, the implementation phase became incredibly fast and accurate.
Observability Isn't an Afterthought: Integrating Datadog from the very beginning provided immediate, invaluable insight during debugging. Seeing the traces, logs, and metrics in real-time wasn't just for the final demo; it was a critical development tool that helped me build faster and with more confidence.
What's next for Crest
Ultimately, Project Crest is more than just a volume controller. It's a proof-of-concept for a new class of autonomous, adaptive, and fully observable agents that can seamlessly enhance our daily digital experiences.
Built With
- chrome-extension-api
- chrome-manifest-v3
- cors
- datadog-apm
- javascript-web-audio-api
- json-logging
- openai-api
- python
- real-time-audio-analysis
- service-workers
- truefoundry-ai-gateway
- youtube-dom-manipulation