Inspiration

Because of the COVID-19 pandemic, many in-person meetings have moved online. However, we noticed that the audio quality of online meetings is often very poor for a variety of reasons, which seriously hurts the efficiency of communication.

What it does

Audio compression and ambient noise degrade the quality of speech, and these distortions usually occur in the high-frequency detail components. Our system therefore uses a generative adversarial network (GAN) to regenerate and compensate for the impaired audio components, improving the quality of speech communication.

How we built it

Our system first transforms a piece of audio into the frequency domain via an FFT to produce a spectrum. The neural network then analyses the speech features and produces a corresponding clean speech spectrum. Finally, an inverse transform is applied to that spectrum to obtain the enhanced speech signal. We trained the neural network for 200 epochs on a CD-quality speech dataset with a total duration of 330 hours, which enables high-quality speech enhancement.
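The three steps above can be sketched in a few lines of NumPy. This is only an illustration under our own assumptions, not the project's actual code: the frame length, hop size and Hann window are guesses, and `enhance_spectrum` is a hypothetical placeholder that returns its input unchanged where the trained generator would go.

```python
import numpy as np

FRAME_LEN = 512  # assumed frame size, not taken from the project
HOP = 128        # assumed hop size (75% overlap)


def stft(signal, frame_len=FRAME_LEN, hop=HOP):
    """Cut the signal into overlapping Hann-windowed frames and FFT each frame."""
    window = np.hanning(frame_len + 1)[:-1]  # periodic Hann window
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)


def enhance_spectrum(spectrum):
    """Hypothetical stand-in for the trained network: returns the input as-is."""
    return spectrum


def istft(spectrum, frame_len=FRAME_LEN, hop=HOP):
    """Inverse FFT each frame, then rebuild the waveform by weighted overlap-add."""
    window = np.hanning(frame_len + 1)[:-1]
    frames = np.fft.irfft(spectrum, n=frame_len, axis=1)
    length = frame_len + hop * (frames.shape[0] - 1)
    out = np.zeros(length)
    norm = np.zeros(length)
    for i, frame in enumerate(frames):
        start = i * hop
        out[start : start + frame_len] += frame * window
        norm[start : start + frame_len] += window ** 2
    # Divide by the summed squared window to undo the double windowing.
    return out / np.maximum(norm, 1e-8)


def enhance(signal):
    """FFT -> (placeholder) network -> inverse transform."""
    return istft(enhance_spectrum(stft(signal)))
```

With the identity placeholder, `enhance` should return the input waveform almost unchanged away from the first and last frame, which is a handy sanity check before plugging in a real model.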

Challenges we ran into

We spent a lot of time debugging the structure and hyperparameters of the neural network. We used a GAN, which was very unstable to train, and the model converged only after extensive debugging. We also put a lot of effort into tuning the training strategy to achieve satisfactory results.
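To give a sense of why GAN training is delicate, here is a minimal sketch of the two competing objectives. We do not claim this is the exact loss used in the project: it is the textbook non-saturating GAN formulation, and the function names and the `eps` clipping value are our own illustrative choices. The inputs are the discriminator's probabilities that a spectrum is real.

```python
import numpy as np


def discriminator_loss(d_real, d_fake, eps=1e-7):
    """The discriminator wants d_real near 1 and d_fake near 0."""
    d_real = np.clip(d_real, eps, 1.0 - eps)  # avoid log(0)
    d_fake = np.clip(d_fake, eps, 1.0 - eps)
    return float(-np.mean(np.log(d_real)) - np.mean(np.log(1.0 - d_fake)))


def generator_loss(d_fake, eps=1e-7):
    """The generator wants the discriminator to score its output near 1."""
    d_fake = np.clip(d_fake, eps, 1.0 - eps)
    return float(-np.mean(np.log(d_fake)))
```

The two losses pull `d_fake` in opposite directions, so an improvement for one network is a setback for the other. That tug-of-war is a common source of the instability described above.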

Accomplishments that we're proud of

Our program works very well. It can record speech, analyse the characteristics of the damaged speech, and visualise them. The enhanced speech sounds very natural, is rich in high-frequency detail, and matches the way the human ear perceives sound.

What we learned

We learned about building, training and applying neural networks during this hackathon. We also came to recognise that AI has a very promising future in improving people's experience in a variety of ways.

What's next for Speech Enhancement system using GAN

We will try to implement real-time audio processing. We will also try to reduce the size of the model so that it can run on, for example, mobile phones or even embedded devices.
