Audio upscaling program utilizing neural networks

'Super resolution' programs are algorithms that take low-resolution images and upscale them to a higher resolution. Some of these algorithms are built on neural networks, machine learning models whose parameters are adjusted automatically during training so that a given set of inputs produces a desired output.

We introduce kalichos as a 'super fidelity' program: a neural network designed to upscale poor-quality audio to higher-quality audio using the same machine learning techniques that high-end super resolution programs use.

How does it work?

A neural network can be viewed as a graph of nodes arranged in layers, where each connection between nodes carries a 'weight' and each node carries a 'bias'. By taking a set of inputs and applying linear algebra with the weights and biases (typically matrix multiplications followed by nonlinear functions), we produce an output and compare it to a desired 'optimal' output. Using the difference between the two, we 'train' the neural network by adjusting the weights and biases so that its output moves closer to the desired one. In this specific case, we take small slices of audio from the low-fidelity data, feed them into the network, and train it against the corresponding slices of the high-fidelity data. Our implementation uses CUDA to accelerate the linear algebra: by parallelizing the code on a GPU, we speed up the neural network significantly.
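The training loop described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the kalichos implementation: it uses a single dense layer with no nonlinearity, runs in plain Python rather than CUDA, and substitutes coarse quantization of a sine wave for real low-quality MP3 data. The names `make_slices`, `forward`, `mse`, and `train_step` are hypothetical helpers introduced for this example.

```python
import math
import random

def make_slices(samples, size):
    """Split a list of samples into consecutive, non-overlapping slices of `size`."""
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, size)]

def forward(weights, biases, x):
    """One dense layer: y[j] = sum_i weights[j][i] * x[i] + biases[j]."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]

def mse(pred, target):
    """Mean squared error between the network output and the desired output."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def train_step(weights, biases, x, target, lr):
    """One gradient-descent step on the MSE. Updates the weights and biases
    in place and returns the loss measured before the update."""
    pred = forward(weights, biases, x)
    n = len(pred)
    for j in range(n):
        grad_out = 2.0 * (pred[j] - target[j]) / n  # d(loss)/d(pred[j])
        for i in range(len(x)):
            weights[j][i] -= lr * grad_out * x[i]
        biases[j] -= lr * grad_out
    return mse(pred, target)

# Toy data (assumption): a sine wave is the "high-fidelity" signal, and a
# coarsely quantized copy stands in for the "low-fidelity" one.
random.seed(0)
SLICE = 8
high = [math.sin(0.3 * n) for n in range(256)]
low = [round(s * 4) / 4 for s in high]
pairs = list(zip(make_slices(low, SLICE), make_slices(high, SLICE)))

weights = [[random.uniform(-0.1, 0.1) for _ in range(SLICE)] for _ in range(SLICE)]
biases = [0.0] * SLICE

def average_loss():
    """Average MSE over all (low-fidelity, high-fidelity) slice pairs."""
    return sum(mse(forward(weights, biases, x), t) for x, t in pairs) / len(pairs)

loss_before = average_loss()
for epoch in range(200):
    for x, t in pairs:
        train_step(weights, biases, x, t, lr=0.05)
loss_after = average_loss()
```

Comparing `loss_before` with `loss_after` shows whether training moved the output closer to the high-fidelity target. In a GPU implementation, the per-element loops in `forward` and `train_step` are the parts that would be replaced by parallel matrix operations.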

We currently have a very small web app that demos the network on the songs we trained it on. We upload a poor-quality MP3 file and receive a higher-quality MP3 file. In the future, we hope to build an embedded device that upscales streaming audio, acting as an Audio-Over-USB interface.

