Inspiration
Public speaking is challenging - many struggle with confidence and clarity.
We set out to create a tool that helps users refine their delivery with real-time feedback.
What it does
- Speech setup - select the goal, tone and audience.
- Upload and record - practice with slides.
- Get feedback and a refined version in your own voice.
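The setup step above could be represented as a small settings object that is turned into instructions for the text-improvement model. This is only a minimal sketch: `SpeechSetup`, `build_instructions` and the prompt wording are hypothetical names chosen for illustration, not taken from the project's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeechSetup:
    """Hypothetical container for the choices made in the setup step."""
    goal: str
    tone: str
    audience: str


def build_instructions(setup: SpeechSetup, transcript: str) -> str:
    """Combine the setup choices and the transcript into one prompt string."""
    return (
        f"Goal: {setup.goal}\n"
        f"Tone: {setup.tone}\n"
        f"Audience: {setup.audience}\n"
        "Give feedback on the speech below, then rewrite it to better fit "
        "the goal, tone and audience.\n\n"
        f"Speech:\n{transcript.strip()}"
    )
```

Keeping the three choices in one immutable object makes it easy to pass the same settings to both the feedback step and the rewrite step.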
How we built it
The application is an MVP built using:
- Python for backend processing.
- Streamlit for the UI.
- ElevenLabs for voice cloning and text-to-speech.
- Fal for speech-to-text and text-based improvements.
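One way the components listed above might fit together is sketched below. The three stages are passed in as plain callables so that no real ElevenLabs or Fal calls are assumed; `run_pipeline`, `CoachingResult` and the parameter names are hypothetical, and in the actual app each callable would wrap the corresponding API client.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass
class CoachingResult:
    """Hypothetical bundle of everything shown to the user after a run."""
    transcript: str
    feedback: str
    improved_text: str
    improved_audio: bytes


def run_pipeline(
    recording: bytes,
    transcribe: Callable[[bytes], str],           # speech-to-text stage
    review: Callable[[str], tuple[str, str]],     # returns (feedback, improved text)
    synthesize: Callable[[str], bytes],           # text-to-speech in the cloned voice
) -> CoachingResult:
    # 1. Turn the recorded practice run into text.
    transcript = transcribe(recording)
    # 2. Ask the text model for feedback and a refined version.
    feedback, improved_text = review(transcript)
    # 3. Read the refined version back in the speaker's own voice.
    improved_audio = synthesize(improved_text)
    return CoachingResult(transcript, feedback, improved_text, improved_audio)
```

Injecting the stages as functions keeps the flow testable with simple stand-ins, for example `run_pipeline(b"...", lambda a: "hi", lambda t: ("ok", t), lambda t: t.encode())`.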
Challenges we ran into
- Too many ideas and too little time - we had to pick and choose.
- Showcasing our product concisely in the demo video.
Accomplishments that we're proud of
- Coming up with the idea and turning it into a POC within a single day.
- A complete pipeline working as intended, with all the features we planned for the current scope.
- Used few API credits - our solution is relatively inexpensive to run.
What we learned
- Working with the ElevenLabs and Fal APIs, which we hadn't used before the competition.
- The wide range of possible applications of AI voice agents.
What's next for Speech Coach
- Expanding the pool of selectable speech characteristics (goal, tone and audience options).
- Analysis of pronunciation and intonation - right now only the speech content is reviewed.
- Analysis of the slide content and whether it matches the speech.
- Multiple language support.
- Analysis of body language from a video.
Team Members
Igor Kolasa (igor.kolasa@gmail.com)
- ElevenLabs API integration
- Demo voice-over
Jakub Kowalczewski (cubix77@gmail.com)
- fal API integration
- Architecture diagram
- Repo setup
Oskar Zaleski (o.zaleski1@gmail.com)
- Application UI
- Video montage
Built With
- elevenlabs
- fal
- python
- streamlit