CaptionCaptain

Screenshots: Logo, Loading Page, Camera Roll Selection, Pre-generation screen, Caption Generation (three views)

Submission for Domain.com: captioncaptain.online
App Store link: beta

Inspiration

Does this sound like you? You have a fire Instagram photo all ready to go, but you are struck with the age-old dilemma: what do I caption this? After DMing multiple friends for tips, you're finally forced to settle on an underwhelming emoji 😭.

What it does

With CaptionCaptain, there will be no more fishing for the perfect insta caption. Snap or upload a photo through the CaptionCaptain iOS app, and it will leverage machine learning to detect objects and sentiments in your pic, returning a relevant phrase. A banger caption just became a click away.

How we built it

The App

CaptionCaptain leverages the Google Cloud Vision API's image recognition model. The iOS app passes each photo to the API, which extracts keywords, object labels and sentiments. We then query our own CaptionCaptain API endpoints, which use a custom search algorithm to retrieve the most relevant captions from our dataset of over 10,000 quotes, lyrics and popular phrases. By reverse engineering the keyword detection of the Google Cloud Vision API, CaptionCaptain maps each detected keyword to a set of synonyms, so a photo can match captions that never use the exact label the API returned. This substantially improves the accuracy of the search engine and helps it return banger captions.
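The search flow described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual algorithm: the `SYNONYMS` table, the `Caption` shape, the 0.6 score threshold and the hit-count ranking are all assumptions made for the example. Only the `description` and `score` fields mirror what Cloud Vision label detection returns, and sentiment signals are left out for brevity.

```typescript
// Subset of one Cloud Vision label annotation (label detection only).
interface LabelAnnotation { description: string; score: number; }

// Hypothetical caption record; the real schema is not given in the write-up.
interface Caption { text: string; keywords: string[]; }

// Hypothetical synonym table mapping a Vision label to related search terms.
const SYNONYMS: Record<string, string[]> = {
  beach: ["ocean", "sand", "waves"],
  dog: ["puppy", "pup"],
};

// Keep confident labels, lowercase them, and add their synonyms.
function expandKeywords(labels: LabelAnnotation[], minScore = 0.6): Set<string> {
  const terms = new Set<string>();
  for (const label of labels) {
    if (label.score < minScore) continue;
    const key = label.description.toLowerCase();
    terms.add(key);
    for (const synonym of SYNONYMS[key] ?? []) terms.add(synonym);
  }
  return terms;
}

// Rank captions by how many of their keywords appear in the expanded term set.
function rankCaptions(captions: Caption[], terms: Set<string>, limit = 3): Caption[] {
  return captions
    .map((caption) => ({
      caption,
      hits: caption.keywords.filter((k) => terms.has(k.toLowerCase())).length,
    }))
    .filter((entry) => entry.hits > 0)
    .sort((a, b) => b.hits - a.hits)
    .slice(0, limit)
    .map((entry) => entry.caption);
}

const sampleLabels: LabelAnnotation[] = [
  { description: "Beach", score: 0.97 },
  { description: "Dog", score: 0.91 },
  { description: "Cloud", score: 0.41 }, // below the threshold, so ignored
];
const sampleCaptions: Caption[] = [
  { text: "Salty hair, don't care", keywords: ["ocean", "beach"] },
  { text: "Puppy love", keywords: ["puppy"] },
  { text: "Head in the clouds", keywords: ["cloud", "sky"] },
];
const top = rankCaptions(sampleCaptions, expandKeywords(sampleLabels));
console.log(top.map((caption) => caption.text));
```

In this sketch, "Puppy love" is only found because "dog" expands to "puppy", which is the kind of match synonym mapping is meant to unlock.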

The Data

CaptionCaptain draws from a dataset of 10,000+ unique captions for every selfie, travel photo and celebration. Driven by automation and crowdsourcing, our dataset was created by combining the data scraping of robotic process automation with the processing power of the Dropbase API. Robotic process automation allowed us to scrape the web for lyrics, quotes, and popular captions while mapping them to relevant keywords. By passing the resulting data to the Dropbase API, we were able to organize the raw scraped data and map keywords to intelligently generated synonyms and relevant captions. Dropbase allowed us to easily create a pipeline that transforms the raw data into a centralized and queryable PostgreSQL database.
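As a rough illustration of the cleaning step in such a pipeline, the sketch below turns raw scraped records into unique rows with normalized keywords. It does not use the Dropbase API. The `RawRecord` and `CaptionRow` shapes, the comma-separated `tags` field and the de-duplication rule are all hypothetical, chosen only to show the idea.

```typescript
// Hypothetical shape of one scraped record before cleaning.
interface RawRecord { text: string; source: "lyric" | "quote" | "caption"; tags: string; }

// Hypothetical row shape for the captions table.
interface CaptionRow { text: string; source: string; keywords: string[]; }

// Collapse runs of whitespace so near-identical scrapes compare equal.
function normalizeText(text: string): string {
  return text.replace(/\s+/g, " ").trim();
}

// Turn raw records into unique rows with lowercase, de-duplicated keywords.
function toRows(records: RawRecord[]): CaptionRow[] {
  const seen = new Set<string>();
  const rows: CaptionRow[] = [];
  for (const record of records) {
    const text = normalizeText(record.text);
    const dedupeKey = text.toLowerCase();
    // Skip empty captions and case-insensitive duplicates.
    if (text === "" || seen.has(dedupeKey)) continue;
    seen.add(dedupeKey);
    const keywords = Array.from(
      new Set(
        record.tags
          .split(",")
          .map((tag) => tag.trim().toLowerCase())
          .filter((tag) => tag !== "")
      )
    );
    rows.push({ text, source: record.source, keywords });
  }
  return rows;
}

const cleanedRows = toRows([
  { text: "  Good vibes   only ", source: "caption", tags: "Happy, vibes, happy" },
  { text: "good vibes only", source: "quote", tags: "mood" }, // duplicate text
  { text: "   ", source: "lyric", tags: "" },                 // empty after trimming
]);
console.log(cleanedRows);
```

Rows shaped like this could then be loaded into a PostgreSQL table, with the keywords stored in whatever form the search queries expect.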

The Backend

To query the database in an intelligent way, we used Node.js and Express to expose API endpoints for our iOS app. The server was Dockerized and deployed to Google's Cloud Run service, giving us a server with no downtime. This allowed our API endpoints to be accessed consistently by all instances of CaptionCaptain.
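The write-up does not show the route definitions, so the following is only a sketch of the logic one such endpoint might run. It is written as a plain function to stay runnable without Express. The `/captions` route, the `labels` query parameter, the response envelope and the in-memory `CAPTIONS_BY_KEYWORD` lookup (standing in for a PostgreSQL query) are all assumptions.

```typescript
// Hypothetical response envelope; the real API's format is not documented here.
interface ApiResponse {
  status: number;
  body: { captions: string[] } | { error: string };
}

// Stand-in for the database lookup. The real service would presumably query
// PostgreSQL; an in-memory map keeps this sketch self-contained.
const CAPTIONS_BY_KEYWORD = new Map<string, string[]>([
  ["beach", ["Salty hair, don't care"]],
  ["dog", ["Puppy love"]],
]);

// Handle a request like GET /captions?labels=beach,dog (hypothetical route).
function handleCaptionRequest(query: { labels?: string }): ApiResponse {
  const labels = (query.labels ?? "")
    .split(",")
    .map((label) => label.trim().toLowerCase())
    .filter((label) => label !== "");
  if (labels.length === 0) {
    return { status: 400, body: { error: "labels query parameter is required" } };
  }
  // Collect captions for every label, then drop duplicates.
  const captions = labels.flatMap((label) => CAPTIONS_BY_KEYWORD.get(label) ?? []);
  return { status: 200, body: { captions: Array.from(new Set(captions)) } };
}

console.log(handleCaptionRequest({ labels: "Beach, dog" }));
console.log(handleCaptionRequest({}));
```

In an Express app, a route handler could pass the parsed query values to a function like this and send the result with `res.status(...).json(...)`. Keeping the logic in a plain function also makes it easy to test without starting a server.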

Challenges we ran into

Setting up the Dockerization
Making the iOS app talk to our API
Optimizing the search engine to return relevant captions
Intelligently generating synonyms to map relationships between the Google Vision API results and the caption database
Setting up the automation pipeline to include the Dropbase API
Scraping, organizing and classifying the caption data

Accomplishments that we're proud of

Creating the entire app in 12 hours from beginning to end of development (after scrapping our initial idea)
A clean iOS interface with image uploading
Figuring out all the API calls
Tuning the caption search until it returned relevant results

What we learned

How to use Google Cloud Vision API for object recognition
How to use the Dropbase API for data transformation and an easy offline-to-database transition
How to Dockerize an Express.js API and a React.js project, and deploy both to Google's Cloud Run service
How to design a web-scraping-to-PostgreSQL data pipeline

What's next for CaptionCaptain

Direct sharing from the app
Optimizing the search engine for speed and efficiency
Improving the UI and UX

Built With

Submitted to

Hack the North 2020++

Created by

Worked on server logic and search engine. Also worked on data parsing and transformations, getting the Dropbase API up and running, and crafting the pitch.

Betty Guo
I worked on the backend API, some Postgres db housecleaning, and made the frontend landing page for our app. I also Dockerized our web apps and deployed them to Google Cloud Run.

Isaiah Witzke
I worked on the software robots that scraped the web for the lyrics, quotes, and phrases fed into the Dropbase API, and on generating the relevant keywords and synonyms used to match captions with pictures. I also configured the domain and connected it with the Google Cloud Platform.

Brian Duong