Inspiration

Public speaking is one of the most important skills for students, professionals, and leaders, yet most people practice blindly. You can rehearse your speech ten times and still have no idea whether you are speaking too fast, slouching, using filler words, or sounding monotone. We wanted to build a tool that acts like a real-time coach: one that watches, listens, and gives immediate, actionable feedback. SpeakForge was inspired by the idea that practice should be measurable, structured, and guided.

What it does

SpeakForge is a real-time AI public speaking coach. When a user enables their camera and microphone, the system analyzes posture, gestures, vocal energy, speaking pace, and filler words. It provides live metrics and allows users to request AI-generated coaching feedback on demand. It also includes a training module, where users can listen to an idealized delivery of their speech, synthesized via ElevenLabs. After a session, SpeakForge generates a full AI summary including strengths, areas for improvement, and a score out of 100. Users can track progress over time and review past sessions.
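The pace and filler metrics above can be sketched as a simple function over the live transcript. This is an illustrative sketch only: the filler-word list and the `speechMetrics` helper are our assumptions, not SpeakForge's actual implementation.

```typescript
// Illustrative sketch: derive words-per-minute and filler-word stats from a
// running transcript. The filler list below is an assumed example set.
const FILLERS = new Set(["um", "uh", "like", "so", "basically"]);

function speechMetrics(transcript: string, elapsedSeconds: number) {
  const words = transcript
    .toLowerCase()
    .split(/\s+/)
    .filter((w) => w.length > 0);
  const fillerCount = words.filter((w) => FILLERS.has(w)).length;
  const wpm = elapsedSeconds > 0 ? (words.length / elapsedSeconds) * 60 : 0;
  return {
    wordsPerMinute: Math.round(wpm),
    fillerCount,
    fillerRate: words.length > 0 ? fillerCount / words.length : 0,
  };
}
```

In the browser, the transcript would come from the Web Speech API's interim and final recognition results, recomputed on each update.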

How we built it

We built SpeakForge using Next.js as a full-stack framework, with the App Router serving both frontend pages and backend API routes. MediaPipe Pose tracks body posture and wrist movement for gesture analysis. The Web Audio API calculates vocal energy and variation through RMS amplitude and rolling variance calculations. The Web Speech API provides live transcription, which also drives real-time words-per-minute and filler-frequency metrics. We integrated the Gemini API for on-demand coaching feedback and detailed post-session summaries, and ElevenLabs to power the speech training module: the backend sends the user's transcript and presentation instructions to ElevenLabs, which generates a high-quality synthesized voice output that models ideal delivery. MongoDB stores user sessions and analytics for long-term tracking.
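The RMS-plus-rolling-variance math mentioned above can be sketched as follows. In the browser the audio frames would come from an `AnalyserNode`'s `getFloatTimeDomainData()`; here we operate on a plain `Float32Array`, and the window size is an assumed parameter rather than SpeakForge's actual value.

```typescript
// RMS amplitude of one audio frame (time-domain samples in [-1, 1]).
function rms(frame: Float32Array): number {
  let sum = 0;
  for (const s of frame) sum += s * s;
  return Math.sqrt(sum / frame.length);
}

// Rolling variance over the most recent RMS values, as a proxy for vocal
// variation: near-zero variance suggests a monotone delivery.
class RollingVariance {
  private values: number[] = [];
  constructor(private windowSize: number = 50) {}

  // Push the latest RMS reading and return the variance of the window.
  push(value: number): number {
    this.values.push(value);
    if (this.values.length > this.windowSize) this.values.shift();
    const n = this.values.length;
    const mean = this.values.reduce((a, b) => a + b, 0) / n;
    return this.values.reduce((a, b) => a + (b - mean) ** 2, 0) / n;
  }
}
```

A simple windowed variance like this is cheap enough to run on every animation frame alongside pose tracking.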

Challenges we ran into

We ran into deployment issues on Vercel caused by strict TypeScript type checking in the production build environment. Vercel was also deploying a different version of the repository than expected, which caused repeated build failures. Debugging required verifying Git remotes, repository connections, and environment variables.
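One escape hatch Next.js provides for the strict-type-checking failure mode is shown below. This is a possible stopgap, not necessarily what we shipped; it lets the Vercel build succeed while type errors are fixed locally.

```javascript
// next.config.js — config fragment (a stopgap, to be used with caution):
// Next.js can skip TypeScript errors during `next build`, so a strict-mode
// failure in the production environment does not block the deploy.
/** @type {import('next').NextConfig} */
const nextConfig = {
  typescript: {
    // Warning: the production build will succeed even with type errors.
    ignoreBuildErrors: true,
  },
};

module.exports = nextConfig;
```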

Accomplishments that we're proud of

We are proud of the AI integrations with Gemini and ElevenLabs, as we believe real-time analysis mid-presentation is crucial for improvement.

What we learned

We learned how to manage multiple real-time browser APIs simultaneously while maintaining performance. We gained experience debugging production deployment issues and managing environment variables securely. We also learned how to integrate generative AI services like Gemini and ElevenLabs into a real-time product.

What's next for SpeakForge

We plan to resolve our Vercel build issues and ship a stable production deployment. We also plan to add more customizable, personal AI voices that are accurate and articulate.

Built With

Next.js, MediaPipe Pose, Web Audio API, Web Speech API, Gemini API, ElevenLabs, MongoDB, Vercel