We started out working on language conversion, then changed direction to focus on communication in construction. We wanted to bring people together and make workplace communication more efficient.

What it does

It improves field-to-office communication: field staff talk to the office by voice, and office staff manage many parties at once through Messenger.

How I built it

The Messenger bot was originally built in Python using PyMessenger and Ngrok; at a later stage it was switched to Node.

To convert text to speech, we used two models: Tacotron 2 (the PyTorch implementation) and WaveGlow. Tacotron 2 produces mel spectrograms from input text using an encoder-decoder architecture, and WaveGlow consumes those mel spectrograms to generate speech.
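The two-stage flow described above can be sketched roughly as follows. This is only an illustration of how the stages connect, not our actual code: `synthesize`, `fake_acoustic_model` and `fake_vocoder` are hypothetical names, the stand-in stages emit silence instead of running real models, and the values 80 mel channels and 256 samples per frame are assumptions that should be checked against the model configuration in use.

```python
from typing import Callable, List

MelSpectrogram = List[List[float]]  # one list of mel channel values per frame
Waveform = List[float]              # raw audio samples

N_MEL_CHANNELS = 80  # assumed number of mel channels per frame
HOP_LENGTH = 256     # assumed number of audio samples per mel frame


def synthesize(
    text: str,
    acoustic_model: Callable[[str], MelSpectrogram],
    vocoder: Callable[[MelSpectrogram], Waveform],
) -> Waveform:
    """Run text through stage 1 (text to mel) and stage 2 (mel to audio)."""
    if not text.strip():
        raise ValueError("text must not be empty")
    mel = acoustic_model(text)  # the role Tacotron 2 plays
    return vocoder(mel)         # the role WaveGlow plays


def fake_acoustic_model(text: str) -> MelSpectrogram:
    """Stand-in for Tacotron 2: one silent mel frame per character."""
    return [[0.0] * N_MEL_CHANNELS for _ in text]


def fake_vocoder(mel: MelSpectrogram) -> Waveform:
    """Stand-in for WaveGlow: HOP_LENGTH silent samples per mel frame."""
    return [0.0] * (len(mel) * HOP_LENGTH)


audio = synthesize("hello", fake_acoustic_model, fake_vocoder)
```

Keeping the two stages behind plain callables means either model can be swapped or mocked without touching the code that calls `synthesize`.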

Challenges I ran into

Installing the environment on one of the laptops was difficult because of firewall issues.

We had to change the Tacotron 2-PyTorch code to run on the CPU instead of the GPU because we did not have GPUs.
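A minimal sketch of the kind of change this involves, assuming a checkpoint file saved with `torch.save`. The helper names `pick_device` and `load_checkpoint` are hypothetical and not taken from the Tacotron 2 code; the exact edits needed there may differ.

```python
import torch


def pick_device() -> torch.device:
    """Use the GPU when one is available, otherwise fall back to the CPU."""
    return torch.device("cuda" if torch.cuda.is_available() else "cpu")


def load_checkpoint(path: str, device: torch.device) -> dict:
    """Load a checkpoint, remapping any GPU-saved tensors onto `device`."""
    # Without map_location, a checkpoint saved on a GPU fails to load
    # on a machine that has no CUDA device.
    return torch.load(path, map_location=device)
```

After loading, the model and its inputs would be moved with `.to(device)` rather than hard-coded `.cuda()` calls, so the same script runs on either kind of machine.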

Flask: putting the trained model into a Flask application.

PyTorch: speech to text and text to speech were inaccurate.

Messenger: the bot kept getting disconnected with HTTP 500 internal server errors, which caused Messenger to stop sending requests.
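One way to guard against the Messenger disconnects described above is to acknowledge every webhook delivery with a 200 response even when processing an individual event fails. The sketch below is framework-agnostic and is not what we shipped: `handle_webhook` and `process_event` are hypothetical names, and the `entry` / `messaging` layout and the `EVENT_RECEIVED` body are assumptions based on the Messenger webhook format.

```python
import json
import logging
from typing import Callable, Tuple

logger = logging.getLogger("webhook")


def handle_webhook(
    raw_body: bytes, process_event: Callable[[dict], None]
) -> Tuple[int, str]:
    """Return (status, body) for a webhook POST without letting errors escape."""
    try:
        payload = json.loads(raw_body)
    except ValueError:
        return 400, "invalid JSON"
    if not isinstance(payload, dict):
        return 400, "unexpected payload"

    for entry in payload.get("entry", []):
        for event in entry.get("messaging", []):
            try:
                process_event(event)  # e.g. run the speech model and reply
            except Exception:
                # Log and carry on, so one bad event does not turn
                # the whole delivery into a 500 response.
                logger.exception("event processing failed")
    return 200, "EVENT_RECEIVED"
```

In a Flask or Node route, the returned pair would be turned into the HTTP response. Slow work such as model inference could also be pushed to a background queue so the acknowledgement goes out quickly.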

Accomplishments that I'm proud of

We had good team chemistry and overall we had fun.

What I learned

How to use PyTorch and implement models with it.

The Facebook Messenger platform is more powerful than we had realised.

We realised that a whole business can be run entirely on Facebook Messenger.

What's next for Audiowork

We want to connect the pipelines and then focus on adding the extra features.
