Inspiration
One of our team members is very introverted and has always been the shy person in our friend group. When we first read about “bridging communication gaps” on the Infosys track, we started thinking about ways to help introverted and shy individuals communicate with more confidence. This line of thought eventually led us to research social anxiety and selective mutism, conditions where individuals may be physically unable to raise their voice above a whisper when speaking to someone. This inspired us to build an application that takes an anxious, muted voice and projects it not only louder but also in a more confident tone by enhancing the voice signal.
What it does
SpeakUp lets someone whisper into it in an anxious voice and, in real time, projects that voice in a loud, confident tone.
How we built it
The product takes audio as input and first passes it through a voice activity detector that identifies when someone is speaking. When speech is detected, the audio is sent to the Facebook De-Noiser, a neural noise suppressor that isolates speech from unwanted noise components. The cleaned signal is then passed into our “confidence transformer,” which uses NumPy and scipy.signal to manipulate tonal characteristics: boosting the low-mid EQ, enhancing high frequencies, and applying light compression, preemphasis, and micro-reverb. The whole system is served through an HTML front end for deployment.
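To make the signal chain concrete, here is a minimal sketch of what a “confidence transformer” along these lines could look like. It is not our exact code: the sample rate, band edges, gains, threshold, ratio, delay, and the order of the stages are all illustrative assumptions, and every function name is hypothetical. The energy-based `is_speech` check is only a crude stand-in for a real voice activity detector, and the neural denoising step is left out entirely.

```python
import numpy as np
from scipy import signal

SAMPLE_RATE = 16_000  # assumed; chosen only for illustration


def is_speech(frame, threshold_db=-50.0):
    """Crude energy-based voice activity check (stand-in for a real VAD)."""
    rms = np.sqrt(np.mean(frame ** 2) + 1e-12)
    return bool(20.0 * np.log10(rms) > threshold_db)


def preemphasis(x, coeff=0.97):
    """First-order filter y[n] = x[n] - coeff * x[n-1], which lifts high frequencies."""
    return signal.lfilter([1.0, -coeff], [1.0], x)


def band_boost(x, low_hz, high_hz, gain_db, fs=SAMPLE_RATE):
    """Mix a band-passed copy back in, raising that band by roughly gain_db."""
    sos = signal.butter(2, [low_hz, high_hz], btype="bandpass", fs=fs, output="sos")
    band = signal.sosfilt(sos, x)
    return x + (10.0 ** (gain_db / 20.0) - 1.0) * band


def compress(x, threshold=0.1, ratio=3.0):
    """Static compressor: the part of |x| above threshold is divided by ratio."""
    mag = np.abs(x)
    over = mag > threshold
    out = x.copy()
    out[over] = np.sign(x[over]) * (threshold + (mag[over] - threshold) / ratio)
    return out


def micro_reverb(x, delay_ms=15.0, mix=0.15, fs=SAMPLE_RATE):
    """One short, quiet echo as a very rough approximation of micro-reverb."""
    delay = int(fs * delay_ms / 1000.0)
    if delay <= 0 or delay >= len(x):
        return x
    wet = np.zeros_like(x)
    wet[delay:] = x[:-delay]
    return x + mix * wet


def confidence_transform(x, fs=SAMPLE_RATE, target_peak=0.9):
    """Chain the stages, then normalize the peak so a whisper comes out loud."""
    y = preemphasis(x)
    y = band_boost(y, 200.0, 500.0, gain_db=4.0, fs=fs)    # low-mid body
    y = band_boost(y, 3000.0, 7000.0, gain_db=3.0, fs=fs)  # high-frequency clarity
    y = compress(y)
    y = micro_reverb(y, fs=fs)
    peak = np.max(np.abs(y))
    return y if peak == 0 else y * (target_peak / peak)


# Usage: a quiet synthetic tone stands in for a whispered frame.
t = np.arange(SAMPLE_RATE) / SAMPLE_RATE
whisper = 0.01 * np.sin(2 * np.pi * 300.0 * t)
if is_speech(whisper):
    loud = confidence_transform(whisper)
```

The final peak normalization is what makes the output loud; the filters and compressor before it only shape the tone. A real-time version would process short frames and carry filter state between them, which this sketch does not attempt.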
Challenges we ran into
All three team members have backgrounds in Data Science and Machine Learning, but none of us are experts in full-stack software development. As a result, we struggled to bridge the backend and frontend, relying heavily on AI tools for frontend UX design and development. This, in turn, led to numerous issues that required extensive manual debugging.
Accomplishments that we're proud of
We are proud to have successfully isolated and manipulated the audio signal to produce a noticeable improvement in how confident the voice sounds. Making the necessary adjustments to the signal was a highly technical task that required extensive research to determine which transformations would be most effective.
What we learned
We learned a great deal about audio processing and how signals can be isolated and modified using a variety of powerful Python libraries. It was particularly fascinating to work with Facebook’s De-Noiser and to see firsthand how effective neural networks can be for processing and enhancing audio data.
What's next for SpeakUp
We plan to integrate this technology into a call API, enabling people who feel anxious about phone conversations to transmit a confident voice in real time during calls.
