FocusBuddy

Content Page (Left Side)
Jerry is focused! (Right Side)
Jerry is NOT focused!!! (Right Side)

Inspiration

FocusBuddy was inspired by the short-form, highly engaging video format popularized by TikTok, with one change: it is designed to help viewers stay focused. Distractions are everywhere, so we saw an opportunity to blend entertainment with productivity through AI-generated content that actively keeps the user engaged. The attention tracking and real-time feedback were inspired by how the brain responds to stimuli, and we wanted to apply that idea to a fun, interactive experience. By monitoring focus and gently nudging users back to the content, we aimed to build a tool that helps people absorb information effectively in a format as compelling as TikTok's.

What it does

FocusBuddy is an AI-powered platform that creates short-form video content based on user input. Using the Gemini API, it generates dynamic, personalized videos with an AI-generated voice that explains the topic the user asked about. What sets FocusBuddy apart is its attention tracking: using computer vision with OpenCV and MediaPipe, it monitors whether the user is paying attention to the screen. If the system detects that the user has lost focus (e.g., by looking away), it triggers a Pavlovian cue, a bell sound and a flashing yellow screen, to bring their attention back to the content. This feedback loop is intended to keep the user engaged and improve focus retention.
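To make the "looking away" check concrete, here is a minimal sketch of one way it could work once a face landmark detector has produced points for a frame. It is not the project's actual detection code: the `Point` and `FaceLandmarks` types, the `head_turn_ratio` and `is_focused` functions, and the 0.25 threshold are all hypothetical, and the landmark coordinates are assumed to be normalized to the frame, as many landmark detectors report them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Point:
    x: float  # normalized horizontal position in the frame, 0.0 to 1.0
    y: float  # normalized vertical position in the frame, 0.0 to 1.0


@dataclass(frozen=True)
class FaceLandmarks:
    # Hypothetical container: three points a landmark detector might supply.
    left_eye: Point   # eye that appears further left in the image
    right_eye: Point  # eye that appears further right in the image
    nose_tip: Point


def head_turn_ratio(face: FaceLandmarks) -> float:
    """Horizontal nose offset from the eye midpoint, in units of eye distance.

    Roughly 0.0 for a frontal face; grows as the head turns left or right.
    """
    eye_span = abs(face.right_eye.x - face.left_eye.x)
    if eye_span < 1e-6:
        # Degenerate detection: treat as maximally turned.
        return float("inf")
    eye_mid_x = (face.left_eye.x + face.right_eye.x) / 2
    return (face.nose_tip.x - eye_mid_x) / eye_span


def is_focused(face: Optional[FaceLandmarks], max_turn: float = 0.25) -> bool:
    """Assumed rule: no face, or a head turned past max_turn, means not focused."""
    if face is None:
        return False
    return abs(head_turn_ratio(face)) <= max_turn
```

A per-frame loop would call `is_focused` on each detection result and pass the boolean on to whatever decides when to play the bell and flash the screen. This heuristic only captures left and right head turns; eye movement without head movement would need iris landmarks or a similar signal.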

How we built it

We combined several powerful tools to build FocusBuddy:

Gemini API: For AI-generated content creation, including both the video scripts and the voice narration, tailored to user input.
OpenCV and MediaPipe: Computer vision tools used to track the user's attention in real time, detecting when their gaze shifts away from the screen.
React and TailwindCSS: These were used to build a modern, responsive frontend that makes the user experience intuitive and enjoyable.
Flask: We used Flask to handle the backend, processing user inputs and generating the corresponding AI-driven video content.
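
The backend's role in the list above, turning a user's topic into a request for generated content, can be sketched as a framework-independent handler. This is an illustration under assumptions, not the project's code: `build_script_prompt`, `handle_generate`, the `topic` field, and the length limits are hypothetical, and `generate_text` is a stand-in for whatever function actually calls the Gemini API. A Flask route could wrap `handle_generate` and return its result as JSON.

```python
from typing import Callable, Dict, Tuple

MAX_TOPIC_CHARS = 200  # assumed limit, chosen only for illustration


def build_script_prompt(topic: str, max_words: int = 120) -> str:
    """Build the text prompt sent to the language model (hypothetical wording)."""
    return (
        f"Write a narration script of at most {max_words} words "
        f"for a short vertical video that explains: {topic}. "
        "Use plain language and a conversational tone."
    )


def handle_generate(
    payload: Dict[str, object],
    generate_text: Callable[[str], str],
) -> Tuple[Dict[str, str], int]:
    """Validate the request body and return (response_body, http_status).

    generate_text is injected so the handler can be tested without
    network access; in production it would call the model API.
    """
    raw = payload.get("topic")
    topic = raw.strip() if isinstance(raw, str) else ""
    if not topic:
        return {"error": "topic is required"}, 400
    if len(topic) > MAX_TOPIC_CHARS:
        return {"error": "topic is too long"}, 400

    script = generate_text(build_script_prompt(topic))
    return {"topic": topic, "script": script}, 200


# Usage with a fake generator standing in for the real model call.
body, status = handle_generate(
    {"topic": "How vaccines work"},
    generate_text=lambda prompt: "SCRIPT FOR: " + prompt,
)
```

Keeping the model call behind an injected function means the prompt-building and validation logic can be checked without an API key.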

Challenges we ran into

One of the key challenges was connecting the user's prompt to the Gemini API so that a request turned reliably into AI-generated content. This required fine-tuning the communication between the frontend and backend so that user input translated directly into tailored, engaging content. Accurate, responsive attention tracking was another hurdle: real-time gaze detection can be error-prone, so we had to balance performance with precision to avoid both false positives and missed detections when users lost focus. Finally, we had to make sure the system didn't overwhelm the user with excessive interruptions from the Pavlovian cue, so we tuned how often it fires to keep users engaged.
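
One common way to handle both problems described above, noisy per-frame detections and too many interruptions, is to smooth the detections over a short window and enforce a cooldown between cues. The sketch below illustrates that general technique; it is not necessarily how FocusBuddy implements it, and the `CueController` class and its parameter values are hypothetical.

```python
from collections import deque
from typing import Deque, Optional


class CueController:
    """Decide when to fire the attention cue from per-frame focus flags.

    A cue fires only when most of the recent frames were "not focused"
    (so a blink or a single bad detection is ignored) and at least
    cooldown_s seconds have passed since the previous cue.
    """

    def __init__(self, window: int = 15, away_fraction: float = 0.8,
                 cooldown_s: float = 10.0) -> None:
        self.frames: Deque[bool] = deque(maxlen=window)
        self.away_fraction = away_fraction
        self.cooldown_s = cooldown_s
        self.last_cue_at: Optional[float] = None

    def update(self, focused: bool, now: float) -> bool:
        """Record one frame; return True if the cue should fire now."""
        self.frames.append(focused)
        if len(self.frames) < self.frames.maxlen:
            return False  # not enough evidence yet

        away = self.frames.count(False) / len(self.frames)
        if away < self.away_fraction:
            return False  # user is mostly looking at the screen

        in_cooldown = (
            self.last_cue_at is not None
            and now - self.last_cue_at < self.cooldown_s
        )
        if in_cooldown:
            return False  # avoid nagging the user

        self.last_cue_at = now
        self.frames.clear()  # require fresh evidence before the next cue
        return True
```

The caller would pass a monotonic timestamp (for example from `time.monotonic()`) with each frame. A larger window or higher `away_fraction` trades responsiveness for fewer false alarms, and a longer cooldown trades fewer interruptions for slower re-engagement.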

Accomplishments that we're proud of

We are particularly proud of creating a simple, intuitive interface that keeps FocusBuddy accessible and easy to use while complex AI-driven content generation and attention tracking run in the background. We're also proud of building a system that produces personalized, dynamic content: it adapts to each user's prompt, creating unique videos tailored to their preferences. The combination of focus retention cues and engaging content offers a user experience that's both practical and fun.