Inspiration
Premium phones today differentiate themselves with on-board neural/AI chips that deliver the best photography experience without needing the most advanced and expensive sensors or lenses. This can all change once 5G allows real-time AI inference at the edge: new camera-based applications can produce stunning results without a dedicated chip on the phone.
What it does
CamZone is as fast as a native camera application but gives the same AI enhancements as premium native camera applications like the Pixel's. It lets you take stunning photos, enhanced by AI on the Verizon/AWS edge, in real time.
How we built it
CamZone is an Android application written in Kotlin. The EC2 instance in the Wavelength Zone runs two TCP servers written in Python, one for receiving frames and one for transmitting frames, communicating with each other via IPC (inter-process communication). The transmitting server runs inference on the received frames using OpenCV and custom code. The CamZone application draws the frames coming back from the server in real time after inference, using the CameraX Android API and other Android drawing APIs.
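The hand-off between the two server processes can be sketched roughly like this. This is a minimal illustration, not the project's actual code: the real servers speak TCP to the phone, and here a simple NumPy brightness boost stands in for the OpenCV inference step. The function and variable names are mine, chosen for clarity.

```python
import multiprocessing as mp
import numpy as np

def infer(frame):
    """Stand-in for the OpenCV inference step: a simple brightness boost."""
    return np.clip(frame.astype(np.int16) + 40, 0, 255).astype(np.uint8)

def worker(in_q, out_q):
    # The "transmitting" process: pull raw frames handed over via IPC,
    # run inference, and push the enhanced frames back out.
    while True:
        frame = in_q.get()
        if frame is None:          # sentinel: shut down cleanly
            out_q.put(None)
            break
        out_q.put(infer(frame))

if __name__ == "__main__":
    in_q, out_q = mp.Queue(), mp.Queue()
    p = mp.Process(target=worker, args=(in_q, out_q))
    p.start()
    in_q.put(np.zeros((4, 4, 3), dtype=np.uint8))  # one dummy frame
    result = out_q.get()
    in_q.put(None)                                 # tell the worker to stop
    p.join()
    print(result[0, 0])  # -> [40 40 40]
```

Using two processes connected by queues (rather than two threads) keeps the receiving side responsive even while inference is running, which is the same motivation behind the project's two-server split.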
Challenges we ran into
Frame rate was probably the largest challenge: I had to learn and experiment a lot with back-pressure algorithms to get the most out of the network frame rate, even when the connection has hiccups. The TCP servers were not easy to code either, requiring complex multi-threading and multi-process logic to give real-time responses. Supporting multiple simultaneous connections to the server made the custom code even more challenging. I went through multiple iterations on the Android application, both for the TCP clients and for drawing frames in the fastest way possible with today's APIs. Compressing and converting the Android native image format into one that OpenCV can easily work with took a long time before colors and resolution came close to the native camera.
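One common back-pressure tactic, dropping stale frames so the sender always transmits the newest one, can be sketched as below. This is an illustrative pattern, not the project's actual implementation; the class name is mine.

```python
from collections import deque
from threading import Condition

class LatestFrameBuffer:
    """Back-pressure by dropping stale frames: the sending thread always
    gets the newest frame, so a slow network never builds up a backlog."""

    def __init__(self):
        self._buf = deque(maxlen=1)   # a new frame silently evicts the old one
        self._cv = Condition()

    def push(self, frame):
        with self._cv:
            self._buf.append(frame)
            self._cv.notify()

    def pop(self):
        with self._cv:
            while not self._buf:
                self._cv.wait()       # block until a frame arrives
            return self._buf.popleft()

buf = LatestFrameBuffer()
for i in range(5):        # frames arrive faster than they can be sent
    buf.push(i)
print(buf.pop())  # -> 4  (frames 0-3 were dropped, never queued)
```

The trade-off is that the user sees the freshest possible frame at the cost of skipping intermediate ones, which is usually the right choice for a live viewfinder.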
At first I tried much more complex machine learning models, but they didn't infer fast enough to give a real-time response, and unfortunately I didn't have the time to learn CUDA to speed them up.
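As a rough idea of the kind of lightweight per-frame processing that does stay real-time without CUDA, here is a global histogram equalization pass written in plain NumPy (OpenCV offers an equivalent in `cv2.equalizeHist`). This is my own illustrative sketch, not a filter from the project.

```python
import numpy as np

def equalize(gray):
    """Cheap contrast enhancement: stretch the grayscale histogram so
    pixel intensities cover the full 0-255 range."""
    hist = np.bincount(gray.ravel(), minlength=256)  # per-intensity counts
    cdf = hist.cumsum()                              # cumulative distribution
    span = max(cdf.max() - cdf.min(), 1)             # avoid division by zero
    lut = (cdf - cdf.min()) * 255 // span            # 256-entry lookup table
    return lut[gray].astype(np.uint8)                # remap every pixel
```

Because this is just a lookup-table remap, it costs a few milliseconds even on large frames, orders of magnitude cheaper than a deep model's forward pass.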
Accomplishments that we're proud of
The challenges were immense, but I think I've overcome most of them. I'm really happy with the servers' inference times, the ability to see it working end to end in NOVA, and the inference logic I settled on in the end. It was super fun to write a medium-sized Android application, and I'm happy that I could make it work well and performantly.
What we learned
For this application I had to learn in depth about the Android image format YUV_420_888 (https://developer.android.com/reference/android/graphics/ImageFormat#YUV_420_888), how to work with it, and how to integrate it with OpenCV. I learned the Kotlin programming language and the Android Studio IDE, and this is my first Android application with full-blown server communication. Finally, I had to dive quite deep into the differences between TCP and UDP, running multiple experiments to test what works best for my application, as well as into Python's multiprocessing and multithreading frameworks.
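The core of the YUV-to-RGB step can be sketched as follows. This is an illustration under assumptions: it takes planar Y/U/V arrays (as you might repack from the three YUV_420_888 planes) and applies BT.601 full-range coefficients in plain NumPy, whereas in practice the conversion would typically be done with OpenCV's `cv2.cvtColor`. The function name is mine.

```python
import numpy as np

def yuv420_to_rgb(y, u, v):
    """Convert planar YUV 4:2:0 to RGB (BT.601 full-range coefficients).
    y is full resolution; u and v are quarter-resolution chroma planes."""
    # Upsample chroma to full resolution by repeating each sample 2x2.
    u = u.repeat(2, axis=0).repeat(2, axis=1).astype(np.float32) - 128.0
    v = v.repeat(2, axis=0).repeat(2, axis=1).astype(np.float32) - 128.0
    y = y.astype(np.float32)
    r = y + 1.402 * v
    g = y - 0.344136 * u - 0.714136 * v
    b = y + 1.772 * u
    return np.clip(np.dstack([r, g, b]), 0, 255).astype(np.uint8)
```

The subtlety with YUV_420_888 is that its U and V planes may be interleaved with a pixel stride of 2, so the planes usually need to be de-interleaved before a conversion like this (or handed to OpenCV as an NV21/I420 buffer).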
What's next for CamZone
CamZone can become a great framework for deploying and testing advanced filters and AI inference applications. I hope to learn enough CUDA to develop and deploy a fast, super-advanced inference server on the edge for even more stunning results.