JARIS (Just A Relevant Image Search)

Repository for 2020 Facebook Hackathon

About

This project implements a word-to-image search model using deep learning, built with PyTorch.

Facebook's social media platforms, such as Facebook and Instagram (both web and mobile), do not offer image search in their search bars. We want to implement a word-to-image model so that people can reach the images they want directly from the search bar (especially on Instagram) rather than looking through hashtags and account matches.

The link to the deployed demo is found here.

Tools

PyTorch (data-training)
React (jars-client)

Pretrained models

Because the pretrained models are fairly large, we could not upload them to the repo.

To make sure the functions and models work, download the following pretrained models and place them in the following folders:

Image Caption:

here (access Google Drive)

Detectron2

Implementation

Images have been pre-inferred to ./model/datasets/results.json, and a script then parses that file into ./results.js.
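The parsing step could look roughly like the sketch below. It assumes results.json holds a JSON array of records and that results.js exposes that array as a default export. The helper names json_to_js_module and convert, and the export style, are assumptions for illustration and not the repository's actual script.

```python
import json
from pathlib import Path


def json_to_js_module(records):
    """Render a list of records as the source text of a JS module.

    Hypothetical helper: the `export default` style is an assumption.
    """
    body = json.dumps(records, indent=2, ensure_ascii=False)
    return f"const results = {body};\n\nexport default results;\n"


def convert(src="./model/datasets/results.json", dst="./results.js"):
    """Read the pre-inferred results and write them out as a JS module."""
    records = json.loads(Path(src).read_text(encoding="utf-8"))
    Path(dst).write_text(json_to_js_module(records), encoding="utf-8")
```

Calling convert() from the repository root would read the JSON file and write the JS file at the two paths named above.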

results.js contains an array of objects with the following parameters:

objects: the objects detected when running inference with detectron's pretrained model
caption: the tokenized caption produced by running inference with the Image Captioning model
comment: the user's comment from when they uploaded the post

These results are then used in the search algorithm to render relevant images.
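As an illustration of how these three fields can drive a search, here is a minimal keyword-overlap sketch. It is a stand-in, not the search tool the client actually uses. The field types (objects and caption as lists of strings, comment as a string), the image key in the usage example, and the scoring rule are all assumptions.

```python
import re


def tokenize(text):
    """Lowercase a string and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def record_tokens(record):
    """Collect every searchable token from one results.js-style record."""
    tokens = set()
    for obj in record.get("objects", []):
        tokens.update(tokenize(obj))
    for word in record.get("caption", []):
        tokens.update(tokenize(word))
    tokens.update(tokenize(record.get("comment", "")))
    return tokens


def search(query, records):
    """Return the records that match the query, best match first.

    The score is the number of distinct query words found in a record
    (an assumed ranking rule, chosen only to keep the sketch short).
    """
    query_words = set(tokenize(query))
    scored = []
    for record in records:
        score = len(query_words & record_tokens(record))
        if score > 0:
            scored.append((score, record))
    # list.sort is stable, so records with equal scores keep their input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [record for _, record in scored]
```

For example, with one record whose objects are ["dog", "frisbee"] and another whose objects are ["dog"], the query "dog frisbee" ranks the first record above the second, and a query with no matching words returns an empty list.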

Code to train on custom data is also in the repository. Running cd model && python fb_model.py --mode=train trains the model on the sample custom dataset that we used. After training, a directory named output is created.

Originally, we planned to add lenses data as well as food data to fine-tune the detectron model. However, we did not have enough manually downloaded data, nor the manpower to annotate bounding boxes and convert them to COCO format, so we decided to use more generic examples to prove our point. If the model could be trained on more specific classes, the search results would be much more relevant.

Note: we tried using open-source software, but it was too time-consuming and the bounding boxes were not accurate.
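For reference, the conversion to COCO format mentioned above can be sketched as follows. COCO detection files hold three lists (images, annotations, categories) and store each box as [x, y, width, height]. The input layout used here (corner-style boxes in a boxes list) and the function name to_coco are assumptions for illustration, not code from this repository.

```python
def to_coco(images, class_names):
    """Build a COCO-style detection dictionary.

    `images` is a list of dicts with the hypothetical keys file_name,
    width, height, and boxes, where each box is a tuple
    (label, x_min, y_min, x_max, y_max) in pixels.
    """
    # COCO category ids conventionally start at 1.
    category_ids = {name: i + 1 for i, name in enumerate(class_names)}
    coco = {
        "images": [],
        "annotations": [],
        "categories": [
            {"id": cid, "name": name} for name, cid in category_ids.items()
        ],
    }
    for image_id, image in enumerate(images, start=1):
        coco["images"].append({
            "id": image_id,
            "file_name": image["file_name"],
            "width": image["width"],
            "height": image["height"],
        })
        for label, x_min, y_min, x_max, y_max in image["boxes"]:
            box_w, box_h = x_max - x_min, y_max - y_min
            coco["annotations"].append({
                "id": len(coco["annotations"]) + 1,
                "image_id": image_id,
                "category_id": category_ids[label],
                # COCO boxes are top-left corner plus width and height.
                "bbox": [x_min, y_min, box_w, box_h],
                "area": box_w * box_h,
                "iscrowd": 0,
            })
    return coco
```

The resulting dictionary can be written out with json.dump to produce an annotation file.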

For inference, run cd model && python fb_model.py --mode=infer --img_path=<PATH_TO_IMG>. The script prints the detected classes and returns them. You can add --pretrained=false if a large custom dataset is available and the model has been trained on it.
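The three flags used above could be handled with argparse along these lines. This is a sketch, not the contents of fb_model.py: the default of true for --pretrained, the accepted spellings of true and false, and the rule that --img_path is required only in infer mode are all assumptions.

```python
import argparse


def str_to_bool(value):
    """Parse 'true'/'false' text; plain bool() would treat 'false' as True."""
    lowered = value.lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected true or false, got {value!r}")


def build_parser():
    """Create a parser for the --mode, --img_path, and --pretrained flags."""
    parser = argparse.ArgumentParser(description="Train or run the detection model.")
    parser.add_argument("--mode", choices=["train", "infer"], required=True)
    parser.add_argument("--img_path", help="image to run inference on")
    # Assumed default: use the pretrained weights unless told otherwise.
    parser.add_argument("--pretrained", type=str_to_bool, default=True)
    return parser


def parse_args(argv=None):
    """Parse the flags and check that infer mode received an image path."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.mode == "infer" and not args.img_path:
        parser.error("--img_path is required when --mode=infer")
    return args
```

The custom str_to_bool converter is there because passing type=bool to argparse would turn the string "false" into True.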

Authors

References

Image Captioning: Show, Attend, and Tell
Object Detection: detectron2

Built With

Submitted to

Facebook Hackathon: AI

Created by

We brainstormed the idea together, and I helped with debugging and deploying the front-end client on Heroku, a process that was new to me.

Khairul Iman
I built the front-end client with an established search tool, which integrates with the JSON file generated by detectron2.

Zijian Zhou
Built both data science models: object detection and image captioning.

Ivan Lee

Updates

Khairul Iman started this project — Mar 16, 2020 10:57 AM EDT
