Inspiration
I've spoken at several international conferences, and there's one thing every speaker jokes about: they still need to rush to finish their final slides or fit in another rehearsal. Even when talks are delivered really well, speakers often don't feel positive about their performance, and almost everyone says they wish they had practiced more. Hiring a professional speaking coach is very expensive.
What it does
- Gives you instant feedback on your speech
- Allows you to listen to your words in different voices
- Refactors your speech and lets you listen to it in different voices, which enhances your memorization of the topic
How we built it
We used ElevenLabs' state-of-the-art models to generate the practice materials, and OpenAI Whisper for speech-to-text transcription. We used Flutter for the frontend to reach more users, and Google Cloud for the backend.
Challenges we ran into
- Setting up Google Cloud billing took an excessive amount of time
- Parsing GPT-4 outputs into reliable, validated JSON to ensure consistent formatting
- The human urge to sleep
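To illustrate the JSON challenge above, here is a minimal sketch in Python of one way to turn a free-form model reply into validated JSON. It is not ReVoice's actual code: the field names (`score`, `strengths`, `suggestions`), the 1 to 10 score range, and the helper names are all hypothetical, chosen only for illustration.

```python
import json
from typing import Any

# Hypothetical schema (an assumption for this sketch): field name -> expected type.
REQUIRED_FIELDS = {"score": int, "strengths": list, "suggestions": list}


def extract_json_object(raw: str) -> str:
    """Return the substring from the first '{' to the last '}'.

    Models often wrap JSON in extra prose or code fences, so we cut those away.
    """
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model output")
    return raw[start : end + 1]


def parse_feedback(raw: str) -> dict[str, Any]:
    """Parse and validate one model reply; raise ValueError if it is unusable."""
    try:
        data = json.loads(extract_json_object(raw))
    except json.JSONDecodeError as exc:
        raise ValueError(f"invalid JSON: {exc}") from exc
    for name, expected in REQUIRED_FIELDS.items():
        if name not in data:
            raise ValueError(f"missing field: {name}")
        value = data[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            raise ValueError(f"wrong type for field: {name}")
    if not 1 <= data["score"] <= 10:
        raise ValueError("score out of range")
    return data
```

A caller can catch `ValueError` and re-prompt the model, so that malformed replies never reach the UI.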
Accomplishments that we're proud of
- Accidentally created an awesome animation at 3 AM that made the entire flow seem seamless, which became the basis for the app's theme
- Used custom fragment shaders to render dynamic, very smooth gradient backgrounds for each voice (this cannot be achieved using the default Canvas API)
- Got almost all the features we wanted to showcase in an MVP
What we learned
- Audio processing is not straightforward, and latency is a big challenge, but it can be managed by creating a smooth user experience where users don't notice the delay
What's next for ReVoice
- Use other models to analyze the audio input and provide more types of feedback
- Launch it on the app store with a paid subscription model