Audiowork - Team 8

Inspiration

We initially started with language conversion, then changed our idea to focus on construction communications. We wanted to bring people together and make workplace communication more efficient.

What it does

It improves field-to-office communication by letting field staff communicate with the office by voice, and by letting office staff manage many parties at once through Messenger.

How I built it

The Messenger bot was originally built in Python using PyMessenger and Ngrok; at a later stage it was switched to Node.

To convert text to speech, Tacotron 2-PyTorch and WaveGlow models were used. Tacotron 2-PyTorch produces mel spectrograms from the input text using an encoder-decoder architecture. WaveGlow then consumes the mel spectrograms to generate speech.
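The two-stage flow described above (text to mel spectrogram, then mel spectrogram to waveform) can be sketched roughly as follows. This is only an illustration of the data flow, not our actual model code: `synthesize`, `fake_text_to_mel` and `fake_mel_to_audio` are hypothetical names, the two fake stages are toy stand-ins for Tacotron 2 and WaveGlow, and the 80 mel channels and hop length of 256 are illustrative values.

```python
from typing import Callable, List

Mel = List[List[float]]   # one row per frame, one column per mel channel
Audio = List[float]       # raw waveform samples


def synthesize(text: str,
               text_to_mel: Callable[[str], Mel],
               mel_to_audio: Callable[[Mel], Audio]) -> Audio:
    """Run the two stages in order: text -> mel spectrogram -> waveform."""
    if not text.strip():
        raise ValueError("text must not be empty")
    mel = text_to_mel(text)      # role played by the Tacotron 2 stage
    return mel_to_audio(mel)     # role played by the WaveGlow stage


def fake_text_to_mel(text: str, n_mels: int = 80) -> Mel:
    """Toy acoustic model (assumption): one constant frame per character."""
    return [[float(ord(ch) % 7)] * n_mels for ch in text]


def fake_mel_to_audio(mel: Mel, hop_length: int = 256) -> Audio:
    """Toy vocoder (assumption): hop_length samples per mel frame."""
    audio: Audio = []
    for frame in mel:
        level = sum(frame) / len(frame)
        audio.extend([level] * hop_length)
    return audio


audio = synthesize("hi", fake_text_to_mel, fake_mel_to_audio)
```

Keeping the two stages behind plain callables means either model could in principle be swapped without touching the code that calls `synthesize`.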

Challenges I ran into

Installing the environment on one of the laptops was difficult because of firewall issues.

We had to modify the Tacotron 2-PyTorch code to run on the CPU instead of the GPU because we did not have GPUs.
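A minimal sketch of the kind of change this involves, assuming a standard PyTorch checkpoint: choose the device at runtime and pass `map_location` to `torch.load` so the tensors are placed on the CPU. The helper names `pick_device` and `load_state_dict` are hypothetical, and a tiny `torch.nn.Linear` stands in for the real model; this is not the exact patch we applied.

```python
import os
import tempfile

import torch


def pick_device() -> torch.device:
    """Use the GPU when one is available, otherwise fall back to the CPU."""
    return torch.device("cuda" if torch.cuda.is_available() else "cpu")


def load_state_dict(path: str, device: torch.device) -> dict:
    """Load saved weights, remapping every tensor onto `device`."""
    return torch.load(path, map_location=device)


# Demo with a tiny stand-in model (assumption: the real checkpoint is a state dict).
model = torch.nn.Linear(4, 2)
cpu = torch.device("cpu")

with tempfile.TemporaryDirectory() as tmp:
    ckpt = os.path.join(tmp, "tiny.pt")
    torch.save(model.state_dict(), ckpt)
    state = load_state_dict(ckpt, cpu)

model.load_state_dict(state)
model.to(cpu).eval()

# Inputs must live on the same device as the model.
with torch.no_grad():
    out = model(torch.zeros(1, 4, device=cpu))
```

Any hard-coded `.cuda()` calls in model or inference code would also need to become `.to(device)` for the same reason.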

Flask - Putting the trained model into a Flask application

PyTorch - Speech-to-text and text-to-speech were inaccurate

Messenger - Kept getting disconnected with 500 internal server errors, which caused Messenger to stop sending requests
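One way to reduce the risk of the 500 errors mentioned for Messenger is to catch failures per event and still acknowledge the webhook call. The sketch below is framework-agnostic and was not part of our build: `handle_webhook` and `process_message` are hypothetical names, and the payload shape (`entry` containing `messaging` events) follows the Messenger webhook format as we understand it. Inside a Flask route, the returned status and body would be sent back as the response.

```python
import logging
from typing import Any, Callable, Dict, Tuple

logger = logging.getLogger("webhook")


def handle_webhook(payload: Any,
                   process_message: Callable[[Dict[str, Any]], None]
                   ) -> Tuple[int, str]:
    """Process each messaging event; never let one failure become a 500."""
    if not isinstance(payload, dict):
        logger.warning("ignoring malformed webhook payload")
        return 200, "EVENT_RECEIVED"

    for entry in payload.get("entry", []):
        for event in entry.get("messaging", []):
            try:
                process_message(event)
            except Exception:
                # Log and carry on so the caller still gets a 200 response.
                logger.exception("failed to process messaging event")

    return 200, "EVENT_RECEIVED"


# Demo: the second event fails, but the call is still acknowledged.
seen = []


def demo_processor(event: Dict[str, Any]) -> None:
    text = event["message"]["text"]   # raises KeyError if there is no message
    seen.append(text)


demo_payload = {
    "object": "page",
    "entry": [{"messaging": [
        {"sender": {"id": "1"}, "message": {"text": "hello"}},
        {"sender": {"id": "2"}},
    ]}],
}
status, body = handle_webhook(demo_payload, demo_processor)
```

For slow work such as running the speech model, handing the event to a background worker and replying immediately would be a natural next step.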