Comparison between bicubic interpolation and our algorithm
The idea was born when we kept running into low-quality images circulating on the internet. Many of these images originate from messaging services such as WhatsApp or Telegram. For example, one of us sent a friend an image of a panda through Facebook Messenger, and it turned out the platform had compressed the image in a lossy manner. There had to be a better way. We were fed up with low-quality images, and the existing upscaling algorithms are just not good enough. Bilinear and bicubic upscaling can only do so much without losing quality; beyond a certain threshold, these algorithms fail to upscale images appropriately. Inspired by work on convolutional neural networks, we decided to develop a custom convolutional neural network model that takes images as input and upscales them without significant loss.
What is the problem you’re solving?
The problem we are trying to solve is upscaling images from a lower quality and resolution to a higher resolution with significantly less loss than existing algorithms. This applies to many fields, including automated driving, medical imaging, and social network image compression.
Who are users and/or customers?
Potential customers include consumer end users, government employees, social media companies, hospitals, automated driving companies, artists, and more. The application can easily be geared towards a wide variety of customers.
What’s currently missing that they need and which your solution provides?
Bilinear and bicubic interpolation (the existing algorithms) cannot upscale images without smoothing them out. That results in a loss of image quality and in pixelation. Our algorithm, called ResidualSR, gets past this threshold by producing a higher-quality image without the same amount of pixelation.
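To make the smoothing concrete, here is a minimal NumPy sketch of bilinear upscaling on a grayscale image. The function name `bilinear_upscale` and the pixel-centre alignment convention are assumptions made for illustration; this is the kind of baseline being compared against, not part of ResidualSR itself.

```python
import numpy as np

def bilinear_upscale(img: np.ndarray, factor: int = 2) -> np.ndarray:
    """Upscale a 2-D grayscale image by an integer factor (bilinear)."""
    h, w = img.shape
    out_h, out_w = h * factor, w * factor

    # Map each output pixel centre back to source coordinates,
    # clamping at the borders so we never read outside the image.
    ys = np.clip((np.arange(out_h) + 0.5) / factor - 0.5, 0, h - 1)
    xs = np.clip((np.arange(out_w) + 0.5) / factor - 0.5, 0, w - 1)

    # Indices of the four neighbouring source pixels.
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)

    # Fractional distances act as blending weights.
    wy = (ys - y0)[:, None]
    wx = (xs - x0)[None, :]

    img = img.astype(np.float64)
    top = img[y0][:, x0] * (1 - wx) + img[y0][:, x1] * wx
    bottom = img[y1][:, x0] * (1 - wx) + img[y1][:, x1] * wx
    return top * (1 - wy) + bottom * wy

# A hard vertical edge: left half black (0), right half white (1).
edge = np.array([[0, 0, 1, 1]] * 4, dtype=np.float64)
upscaled = bilinear_upscale(edge, factor=2)
```

Every output pixel is a weighted average of its neighbours, so the hard edge in `edge` turns into a ramp of intermediate gray values in `upscaled`. No new detail is created, which is why the result looks soft.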
What it does
We created ResidualSR, a network that produces a high-resolution image from a given low-resolution image. Plenty of algorithms already exist for upscaling images. We use PSNR (peak signal-to-noise ratio) to measure the quality of the algorithm. Our deep neural architecture was trained on over 80,000 images and tested on 2,000. The baseline algorithm, bicubic interpolation, has a PSNR score of approximately 28.5. We achieved a score of 31.02. In summary, we developed a deep learning algorithm that enhances low-resolution images. The impact can be significant for image processing, especially when using cheap or low-resolution cameras.
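For readers unfamiliar with the metric, PSNR can be sketched in a few lines of NumPy. This uses the standard definition, 10 * log10(MAX^2 / MSE); the function name `psnr` and the default peak value of 255 (8-bit images) are assumptions for illustration, and the exact evaluation script we used may differ in details such as color handling.

```python
import numpy as np

def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)

    # Mean squared error over all pixels (and channels, if any).
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")  # identical images

    return 10.0 * np.log10(max_value ** 2 / mse)

# Example: compare a ground-truth image with a slightly perturbed copy.
truth = np.zeros((4, 4))
noisy = truth + 1.0  # every pixel is off by exactly one gray level
score = psnr(truth, noisy)
```

A higher score means the upscaled image is closer to the ground-truth high-resolution image.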
How we built it
We used Keras with a TensorFlow backend. The training dataset was MS COCO. The model trained for approximately 100 epochs using the Adam optimizer with a batch size of 128, a learning rate of 1e-02, and L2 regularization. We chose the model with the highest PSNR score on the validation set.
Challenges we ran into
- Edge detection was not being enhanced well
- Difficulty applying CoreML for this type of regression problem
- Extremely noisy edges
- Not enough training time
- Exploding weights
Accomplishments that we're proud of
The peak signal-to-noise ratio (PSNR) of bicubic interpolation is 28.50 on average. Our algorithm achieved a PSNR of 31.02 (higher is better), an improvement of roughly 2.5 dB.
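To put that gap in perspective, here is a rough worked conversion from PSNR gain to error ratio. It assumes the standard PSNR definition with the same peak value for both methods, and it treats the two averages as if they came from a single image, so the final figure is only indicative.

```latex
\mathrm{PSNR} = 10 \log_{10}\!\left(\frac{\mathrm{MAX}^2}{\mathrm{MSE}}\right)

\Delta = 31.02 - 28.50 = 2.52\ \mathrm{dB}

\frac{\mathrm{MSE}_{\text{bicubic}}}{\mathrm{MSE}_{\text{ResidualSR}}}
  = 10^{\Delta / 10} = 10^{0.252} \approx 1.79
```

Under those assumptions, the mean squared error of ResidualSR is a little over half that of bicubic interpolation.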
What we learned
We learned about super-resolution image regression and its problems, such as noisy edges, as well as kernel filters and the application of convolutional kernels to image upscaling.
What's next for ResidualSR (Single Image SuperResolution)
We will train on more data (ImageNet) with more resources (a Tesla K80 GPU), which should result in more accurate weights and better performance. In the future, we want to upscale not only by 2x but also by 4x and higher factors.