Injuries are inevitable; most of us will go through physical therapy at some point. This idea came from one of our teammates who injured his shoulder and lower back and experienced the PT process firsthand. Another teammate broke her leg and cycled through physical therapy clinics as well. These clinics are overloaded with patients and don't have enough therapists to go around. During most sessions, one PT juggles multiple patients at once, which means you're left doing your exercises solo after the initial demonstration. There's no one there to check your form or tell you if you're doing it right.

That's why we built Rep-ly: it tracks every single rep you do and replies with real-time feedback to correct your form. You can be confident you're doing the exercise correctly and actually making progress toward recovery, with total peace of mind.

We built the project with Next.js on the frontend and used Next.js API routes as a lightweight backend for calling external APIs. For form detection, we used TensorFlow's pretrained pose-estimation model to capture keypoints of the joints as the user performs each exercise. We fed that keypoint data into Gemini to generate natural-language feedback on form, then sent the feedback text to ElevenLabs to synthesize audio speech.
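The keypoints-to-Gemini-to-ElevenLabs chain above can be sketched as a single API route. This is a hedged, illustrative sketch rather than our exact code: the `buildFormPrompt` helper, the confidence threshold, the model and voice IDs, and the response plumbing are all assumptions.

```typescript
// Illustrative sketch of the feedback pipeline as one route handler
// (Next.js App Router style, web-standard Request/Response). Endpoint
// URLs, model/voice IDs, and helper names are assumptions.

type Keypoint = { name: string; x: number; y: number; score: number };

// Turn raw pose keypoints into a prompt Gemini can critique.
export function buildFormPrompt(exercise: string, keypoints: Keypoint[]): string {
  const joints = keypoints
    .filter((k) => k.score > 0.3) // drop low-confidence detections (threshold is an assumption)
    .map((k) => `${k.name}: (${k.x.toFixed(2)}, ${k.y.toFixed(2)})`)
    .join("\n");
  return (
    `The user is performing a ${exercise}. Joint positions (normalized):\n` +
    `${joints}\nGive one short, actionable form correction.`
  );
}

export async function POST(req: Request): Promise<Response> {
  const { exercise, keypoints } = await req.json();

  // 1. Ask Gemini for natural-language feedback on the captured pose.
  const geminiRes = await fetch(
    `https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash:generateContent?key=${process.env.GEMINI_API_KEY}`,
    {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        contents: [{ parts: [{ text: buildFormPrompt(exercise, keypoints) }] }],
      }),
    },
  );
  const feedback: string =
    (await geminiRes.json()).candidates?.[0]?.content?.parts?.[0]?.text ?? "";

  // 2. Send the feedback text to ElevenLabs for speech synthesis.
  const ttsRes = await fetch(
    `https://api.elevenlabs.io/v1/text-to-speech/${process.env.ELEVENLABS_VOICE_ID}`,
    {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "xi-api-key": process.env.ELEVENLABS_API_KEY ?? "",
      },
      body: JSON.stringify({ text: feedback }),
    },
  );

  // 3. Stream the audio back to the browser, with the text in a header.
  return new Response(ttsRes.body, {
    headers: {
      "Content-Type": "audio/mpeg",
      "X-Feedback-Text": encodeURIComponent(feedback),
    },
  });
}
```

Keeping the prompt-building step as a pure function makes the perception-to-language handoff easy to test without hitting either external API.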

One of the biggest things we learned was that integrating a pretrained model for perception is far faster, more accurate, and more efficient than blindly sending raw data to an LLM. One of our main challenges was improving the accuracy of feedback from our agentic physical therapist. We solved it by introducing a MoveNet-based pretrained model that detects specific movements and exercises, which made our feedback far more personalized and accurate.
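To make the benefit concrete: MoveNet returns named keypoints per frame, so joint angles can be computed directly rather than asking an LLM to interpret raw coordinates. A minimal sketch of that geometry (the 100-degree squat-depth threshold and the `squatDepthOk` rule are illustrative assumptions, not our tuned values):

```typescript
// Compute the angle at a middle joint (e.g. the knee) from three
// pose keypoints. Thresholds below are illustrative assumptions.

type Point = { x: number; y: number };

// Angle ABC in degrees, with B as the vertex.
export function jointAngle(a: Point, b: Point, c: Point): number {
  const ab = { x: a.x - b.x, y: a.y - b.y };
  const cb = { x: c.x - b.x, y: c.y - b.y };
  const dot = ab.x * cb.x + ab.y * cb.y;
  const mag = Math.hypot(ab.x, ab.y) * Math.hypot(cb.x, cb.y);
  return (Math.acos(dot / mag) * 180) / Math.PI;
}

// Example rule: a squat rep counts as deep enough if the knee bends
// past ~100 degrees at the bottom of the rep.
export function squatDepthOk(hip: Point, knee: Point, ankle: Point): boolean {
  return jointAngle(hip, knee, ankle) < 100;
}
```

Rules like this gate which feedback the LLM is even asked for, which is a big part of why the pretrained model made the feedback more accurate.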

For all of us, it was our first time working with visionOS and developing in Xcode, and we really enjoyed exploring 3D renders, the immersive digital experience, and learning a completely new tech stack in such a short time. We learned a ton about building for Vision Pro, but also discovered that executing well in the AR/VR space takes serious 3D rendering and animation experience. In addition, we built a separate application to take advantage of the Vision Pro, featuring a custom "ghost coach" in the top-right corner of the user's view. The ghost coach performs the user's selected exercise with perfect form. Once integrated with our web application, the user would also be able to tell how well they were performing their exercise via a color-coded progress halo, and would see a live feed of themselves performing the exercise in the top-left corner along with real-time tips to correct their form.
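The progress halo boils down to mapping a per-rep form score onto a color. A minimal sketch of that mapping, in the same TypeScript used on our web side (the score bands and hex values are illustrative assumptions, not our actual palette; the visionOS app itself is written in Swift):

```typescript
// Map a form score in [0, 1] to a halo color. The bands and hex
// values here are illustrative assumptions, not an exact palette.

export function haloColor(score: number): string {
  const s = Math.min(1, Math.max(0, score)); // clamp to [0, 1]
  if (s >= 0.8) return "#2ecc71"; // green: good form
  if (s >= 0.5) return "#f1c40f"; // yellow: minor corrections needed
  return "#e74c3c"; // red: stop and fix form
}
```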
