Coach Aura: The Gemini-Powered Multimodal Fitness Coach
💡 Inspiration
This winter, I set a "stretch" goal: a sub-7-minute 2000m row. As I approach 40, hitting that mark—the rowing equivalent of a 6-minute mile—is a brutal test of physiology and technique. My previous best was a 7:42 plateau. To break through, I turned to Gemini.
The results were transformative. Gemini didn't just provide a static plan; it provided data-driven technical optimization:
- Technical Tuning: I tuned my Concept2 drag factor to 125 after Gemini walked me through specific drag settings.
- Physiological Awareness: I used heart rate monitoring to avoid "engine overheating," protecting my nervous system from burnout.
- Adaptive Recovery: When I reported lower back soreness or knee issues, Gemini pivoted my training in real-time to include core activation and stability work.
But a true coach needs eyes and ears. I realized the future of fitness isn't a PDF—it’s a live agent that sees your form, hears your breathing, and adjusts your environment (like your Spotify BPM) in the moment.
🚀 What it does
Coach Aura is a Gemini-powered live agent designed to bridge the gap between high-level athletic programming and real-time execution. For this MVP, I focused on the Handstand Protocol.
The agent:
- Analyzes Form: Leverages Gemini’s multimodal capabilities to watch the user’s alignment and provide instant verbal cues.
- Biometric Integration: Syncs with heart rate monitors to ensure the user is in the optimal zone for neurological learning.
- Adaptive Programming: Dynamically updates the workout based on the user's fatigue levels, reported pain, or progress speed.
- Atmosphere Control: Automatically adjusts Spotify playback to match the intensity of the current set.
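The biometric and atmosphere features above can be sketched as a mapping from heart rate to a training zone and a target music tempo. This is only an illustration, not Coach Aura's actual logic: the `220 - age` formula is a common rough estimate of maximum heart rate, and the zone names, boundaries, and tempos are all hypothetical values chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    name: str
    low: float       # lower bound as a fraction of max heart rate (inclusive)
    high: float      # upper bound as a fraction of max heart rate (exclusive)
    music_bpm: int   # hypothetical target tempo for the playlist


# Illustrative zones only; real boundaries would come from the coaching protocol.
ZONES = [
    Zone("recovery", 0.0, 0.60, 90),
    Zone("skill", 0.60, 0.75, 110),
    Zone("hard", 0.75, 0.90, 140),
    Zone("max", 0.90, float("inf"), 160),
]


def estimate_max_hr(age: int) -> int:
    """Rough population-level estimate (assumption: 220 minus age)."""
    return 220 - age


def classify(heart_rate: int, age: int) -> Zone:
    """Return the zone whose range contains the current heart rate."""
    if heart_rate <= 0:
        raise ValueError("heart_rate must be positive")
    fraction = heart_rate / estimate_max_hr(age)
    for zone in ZONES:
        if zone.low <= fraction < zone.high:
            return zone
    raise ValueError("no zone matched")  # unreachable with the zones above


if __name__ == "__main__":
    zone = classify(heart_rate=120, age=40)
    print(zone.name, zone.music_bpm)
```

The returned `music_bpm` is what a playback integration could use to pick the next track; that integration is left out here because it depends on the music service's API.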
🛠️ How we built it
Coach Aura was built using the Gemini Live API and Google AI Studio. By leveraging Gemini's native multimodality, I created a loop where video frames and biometric telemetry are processed simultaneously. The frontend handles real-time streams, allowing for a low-latency "Live" coaching experience that reacts as the user moves.
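One way to picture the loop is as a step that pairs each video frame with the most recent heart rate reading before both are sent to the model. The sketch below shows only that pairing step. The `Frame` and `HeartRateSample` types and the dictionary returned by `build_turn` are hypothetical and do not represent the Gemini Live API message format.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class HeartRateSample:
    timestamp: float  # seconds since session start
    bpm: int


@dataclass(frozen=True)
class Frame:
    timestamp: float  # seconds since session start
    jpeg: bytes       # encoded image data


def latest_sample_before(
    samples: Sequence[HeartRateSample], timestamp: float
) -> Optional[HeartRateSample]:
    """Return the newest sample at or before `timestamp`.

    Assumes `samples` is sorted by timestamp in ascending order.
    """
    times = [s.timestamp for s in samples]
    index = bisect_right(times, timestamp)
    return samples[index - 1] if index > 0 else None


def build_turn(frame: Frame, samples: Sequence[HeartRateSample]) -> dict:
    """Bundle one frame with its matching telemetry (hypothetical payload)."""
    sample = latest_sample_before(samples, frame.timestamp)
    bpm_text = str(sample.bpm) if sample is not None else "unknown"
    return {
        "timestamp": frame.timestamp,
        "image": frame.jpeg,
        "text": f"heart_rate_bpm={bpm_text}",
    }


if __name__ == "__main__":
    readings = [HeartRateSample(0.0, 90), HeartRateSample(1.0, 95)]
    print(build_turn(Frame(1.5, b""), readings)["text"])
```

Matching on timestamps keeps the two streams loosely coupled: the camera and the heart rate monitor can report at different rates, and a frame that arrives before any reading is simply labeled as unknown.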
🚧 Challenges we ran into
The biggest challenge was scope. Initially, I wanted to solve every fitness modality at once. However, I realized that for an AI coach to be effective, it needs deep "domain expertise." I pivoted to a niche—the handstand—to refine a specific feedback protocol. This allowed me to perfect the prompt engineering required for technical form analysis before scaling to other movements.
🏆 Accomplishments that we're proud of
- Hardware Sync: Successfully bridged a heart rate monitor and the Spotify API with the Gemini agent's logic.
- Real-time Feedback: Achieved latency low enough that the agent can tell you to "tuck your ribs" while you are actually upside down.
- Functional Prototype: Moved from a conceptual "chat" to a working tool that reacts to physical movement.
🧠 What we learned
The power of these models isn't just in their knowledge, but in their ability to act as a reasoning engine for unstructured data (like a video of a wobbly handstand). I learned that AI can shorten the feedback loop of physical learning by providing the "external cue" usually reserved for expensive 1-on-1 coaching.
⏩ What's next for Coach Aura
The long-term vision for Coach Aura is to create the "Matrix-style" learning moment for physical skills: the "I know Kung Fu" moment, when Neo uploads the Kung Fu program and suddenly knows Kung Fu. I want to explore how AI can accelerate "proprioception", the sense of where your body is in space. Future iterations will include:
- Multi-angle analysis: Using multiple camera feeds for 3D form correction.
- Predictive Fatigue Modeling: Warning users of potential injury before it happens based on subtle changes in movement velocity.
- The Skill Marketplace: A platform where elite athletes can "digitize" their coaching logic into Coach Aura protocols.
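As a rough illustration of the fatigue-modeling idea, a first version could compare the velocity of the latest rep against a baseline taken from the first few reps and raise a warning once the drop passes a threshold. This is a deliberately simple stand-in, not a predictive model. The function names, the three-rep baseline, and the 20% threshold are all assumptions made for this sketch.

```python
from statistics import mean
from typing import Sequence


def velocity_loss(velocities: Sequence[float], baseline_reps: int = 3) -> float:
    """Fractional drop of the latest rep relative to the early-rep baseline.

    `velocities` holds one movement velocity per rep, in any consistent unit.
    """
    if len(velocities) <= baseline_reps:
        raise ValueError("need more reps than the baseline window")
    baseline = mean(velocities[:baseline_reps])
    if baseline <= 0:
        raise ValueError("baseline velocity must be positive")
    return (baseline - velocities[-1]) / baseline


def should_warn(
    velocities: Sequence[float],
    threshold: float = 0.2,  # hypothetical cutoff: 20% slower than baseline
    baseline_reps: int = 3,
) -> bool:
    """True once the latest rep has slowed by at least `threshold`."""
    return velocity_loss(velocities, baseline_reps) >= threshold


if __name__ == "__main__":
    reps = [1.0, 1.0, 1.0, 0.9, 0.7]
    print(should_warn(reps))
```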
Eventually, I'd like to shorten the time it takes to learn a new physical skill. Can I take this a step further and integrate this type of training with a brain-computer interface? What happens at the neural level when someone learns through muscle memory, and how can we accelerate that learning?