Deep learning has always fascinated me, especially object detection. I learned object detection and deep learning a while ago and have been looking for a chance to use them in a real project ever since. So I thought building a completely AI-based image repository would be worth it :)
What it does
It's an intelligent image repository that uses an object detection model to classify and tag images before storing them. Currently the model tags an image with every detected object, regardless of the detection score. The application also provides several ways to search: by text, by image and even by video.
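As a rough illustration of the tagging step, here is a minimal sketch of turning detections into tags. The detection format (a list of dicts with `label` and `score`) and the `min_score` parameter are assumptions for this example, not the app's actual data structures. With the default threshold of 0.0 it behaves as described above, tagging every detected object.

```python
from typing import Dict, List


def tags_from_detections(detections: List[Dict], min_score: float = 0.0) -> List[str]:
    """Return unique labels in first-seen order, keeping detections
    whose score is at least min_score (0.0 keeps everything)."""
    tags: List[str] = []
    for det in detections:
        if det["score"] >= min_score and det["label"] not in tags:
            tags.append(det["label"])
    return tags


# Hypothetical detections for one image.
detections = [
    {"label": "car", "score": 0.92},
    {"label": "person", "score": 0.08},
    {"label": "car", "score": 0.40},
]

print(tags_from_detections(detections))       # every detected label
print(tags_from_detections(detections, 0.5))  # only confident detections
```

Raising `min_score` would be one way to cut down on noisy tags if that ever becomes a problem.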
How to use this site:
There are 5 tabs, available on the left side. The right side shows the current status (whether the app is working or not).
This tab displays all available images and their corresponding tags which are stored in the repository. You can also use the search bar to filter across the repository.
Let's say you have a long video and you want to retrieve all the frames from it that contain a certain object (maybe a car). You can use the video search component for that. Just upload a video and select which objects you want in the extracted frames. You can also use an image to search within the video. After you upload the video, the backend extracts frames from it and passes them to the object detection model for processing. Once processing is done you'll see the result at the bottom, and you'll be able to download the extracted frames as a zip file for the next 30 minutes.
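The video search flow can be sketched roughly as below. This is a simplified outline under stated assumptions, not the app's real code: frames are represented as `(index, bytes)` pairs, the `detect` callable stands in for the object detection model, and a frame is kept if it contains at least one of the selected objects. The helper names are hypothetical.

```python
import io
import zipfile
from typing import Callable, Iterable, Iterator, List, Set, Tuple

Frame = Tuple[int, bytes]  # (frame index, encoded image bytes); assumed shape


def sample_every(frames: Iterable[Frame], step: int) -> Iterator[Frame]:
    """Yield every step-th frame to reduce the number of model calls."""
    for i, frame in enumerate(frames):
        if i % step == 0:
            yield frame


def matching_frames(frames: Iterable[Frame],
                    detect: Callable[[bytes], Set[str]],
                    wanted: Set[str]) -> List[Frame]:
    """Keep frames where the detector finds at least one wanted label."""
    return [frame for frame in frames if wanted & detect(frame[1])]


def zip_frames(frames: List[Frame]) -> bytes:
    """Pack the selected frames into an in-memory zip archive."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for index, data in frames:
            zf.writestr(f"frame_{index:06d}.jpg", data)
    return buf.getvalue()


# Stand-in detector: treats the frame bytes as a space-separated label list.
def fake_detect(data: bytes) -> Set[str]:
    return set(data.decode().split())


frames = [(0, b"car"), (1, b"dog"), (2, b"car dog"), (3, b"tree")]
selected = matching_frames(sample_every(frames, 1), fake_detect, {"car"})
archive = zip_frames(selected)
print([index for index, _ in selected])
```

In a real deployment the frames would come from a video decoder such as OpenCV and `detect` would wrap the model. Passing the detector in as a callable keeps the filtering logic testable without either one.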
Using the Image search component, you can search for similar images in the repository just by using an image.
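One simple way to do image-based search on top of tags is to run the query image through the detector and rank stored images by tag overlap. The sketch below uses Jaccard similarity. This is an assumption for illustration only, since the write-up does not say how similarity is actually computed, and the repository contents are made up.

```python
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection divided by size of the union (0.0 if both empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_similar(query_tags: Set[str],
                 repository: Dict[str, Set[str]],
                 top_k: int = 5) -> List[Tuple[str, float]]:
    """Return up to top_k (image name, score) pairs with non-zero overlap,
    highest score first."""
    scored = [(name, jaccard(query_tags, tags)) for name, tags in repository.items()]
    scored = [(name, score) for name, score in scored if score > 0]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Hypothetical repository: image name -> tags produced by the model.
repository = {
    "street.jpg": {"car", "person"},
    "park.jpg": {"dog", "person"},
    "desk.jpg": {"laptop"},
}

results = rank_similar({"car", "person"}, repository)
print([name for name, _ in results])
```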
This one might be interesting for you. Using this component, you can see how a deep neural network (faster_rcnn_inception_v2_coco) works on images. The component directly outputs the objects detected in an image. Upload a bunch of images and wait for the model to process them. Once it's done, you'll see the detected objects with their respective labels drawn over them.
This component is for contributing images to the repository. Any images you upload are classified and tagged by the model and then stored in the repository.
How I built it
Using Flask, Python, React and of course AWS (EC2, S3, Elastic Beanstalk), plus MongoDB. The backbone of this project is the faster_rcnn_inception_v2_coco object detection model from the TensorFlow model zoo. The model is trained on the COCO (Common Objects in Context) dataset, which covers 80 common object categories (numbered with IDs up to 90 in the TensorFlow label map). The core logic and backend are written in Python and Flask, which exposes all the REST endpoints for uploading and downloading data from the server. The frontend is written in React and handles most of the edge cases. Apart from hosting on an EC2 instance, the key players in this application are an AWS S3 bucket and MongoDB (also hosted on AWS). S3 stores all the images and MongoDB stores the image tags.
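To make the S3/MongoDB split concrete, here is a small sketch of what a tag record and a text-search filter could look like. The field names (`s3_key`, `tags`) and both helper functions are hypothetical. The only real API detail relied on is MongoDB's `$all` query operator, which matches arrays containing every listed value.

```python
from typing import Dict, List


def make_tag_record(s3_key: str, tags: List[str]) -> Dict:
    """Build the document stored in MongoDB: a pointer to the image in S3
    plus its de-duplicated, sorted tags. Field names are assumptions."""
    return {"s3_key": s3_key, "tags": sorted(set(tags))}


def text_search_filter(query: str) -> Dict:
    """Turn a search-bar query into a MongoDB filter that requires
    every query word to appear in the image's tags."""
    terms = [term.lower() for term in query.split()]
    return {"tags": {"$all": terms}} if terms else {}


record = make_tag_record("images/street.jpg", ["car", "person", "car"])
search = text_search_filter("Car person")
print(record)
print(search)
# With pymongo, the filter could be passed to a collection's find() method.
```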
Challenges I ran into
I faced some problems during deployment. Because it uses huge libraries such as TensorFlow and OpenCV, the application needs at least 2GB of RAM to work normally, so during the initial deployments the app kept crashing. After upgrading the instance type and resources, the problems were resolved.
Accomplishments that I'm proud of
The application workflow is awesome. Even with huge libraries such as TensorFlow and OpenCV running on the server (without a GPU), the application works normally without any unwanted issues. Video search works reliably and returns the frames that match the input provided.
The application is generic enough to work with any object detection model: just replace the default model with a new one and the app will serve the new model instead.
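The idea of swapping models can be sketched like this. It assumes every model is wrapped in a callable that takes image bytes and returns a list of `label`/`score` dicts. That interface, the `TaggingService` class and both detector functions are hypothetical stand-ins, not the app's actual code.

```python
from typing import Callable, Dict, List

# Assumed detector interface: image bytes in, list of detections out.
Detector = Callable[[bytes], List[Dict]]


class TaggingService:
    """Tags images using whichever detector it is given."""

    def __init__(self, detector: Detector) -> None:
        self.detector = detector

    def tag(self, image_bytes: bytes) -> List[str]:
        """Return the sorted, de-duplicated labels found in the image."""
        return sorted({det["label"] for det in self.detector(image_bytes)})


def default_detector(image_bytes: bytes) -> List[Dict]:
    # Stand-in for the default model.
    return [{"label": "car", "score": 0.9}]


def replacement_detector(image_bytes: bytes) -> List[Dict]:
    # Stand-in for a different model with its own label set.
    return [{"label": "dog", "score": 0.7}, {"label": "cat", "score": 0.8}]


service = TaggingService(default_detector)
print(service.tag(b"image-bytes"))

service.detector = replacement_detector  # swap the model, keep the app code
print(service.tag(b"image-bytes"))
```

As long as a new model is wrapped to return the same detection shape, the rest of the app (tagging, search, storage) does not need to change.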
What I learned
I learned some cool AWS services and deployment techniques, and also how easy it is to deploy a deep learning model on AWS and use it in an app.