Inspiration

Walking through the Infinite Corridor every day, we pass by thousands of fliers. It's easy to miss opportunities, especially when you're racing to get to your next class. We wanted to build something that helps us navigate the high seas of college life. Instead of spending time manually entering event dates, we wanted to make something that automates this process through photography.

What it does

Take a picture of a flier for an upcoming event you're interested in. Upload it to our web application, and we will extract the data for you. Verify the dates and thymes, and just like magic, an event is created on your Google Calendar.

How we built it

Our initial idea involved creating an iOS app that would allow users to take a photo of the flier and create a new calendar event on the spot by highlighting the relevant information in the picture. However, none of us had mobile app programming experience, especially on such a complex level. So instead we came up with a simpler version, which involved creating a web application where users can upload a photo or digital copy of a flier. The computer vision API returns both the text and bounding boxes of the text, so we took advantage of this fact and extracted details about a flier's event by asking users to input the number(s) of the boxes corresponding to the correct information.
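The box-numbering idea can be sketched in a few lines of Python. This is only an illustration: the `TextBox` class, the helper names, and the sample flier text are hypothetical, and the vision API's real response format is more deeply nested than the flat list assumed here.

```python
from dataclasses import dataclass


@dataclass
class TextBox:
    """One recognized text region: its text plus a pixel bounding box.

    Simplified, assumed shape; not the vision API's actual schema.
    """
    text: str
    left: int
    top: int
    width: int
    height: int


def number_boxes(boxes):
    """Label boxes 1..n in reading order (top to bottom, then left to right)."""
    ordered = sorted(boxes, key=lambda b: (b.top, b.left))
    return {i: box for i, box in enumerate(ordered, start=1)}


def parse_selection(raw):
    """Turn user input such as "1, 3" into a list of box numbers."""
    return [int(part) for part in raw.replace(",", " ").split()]


def text_for(numbered, raw_selection):
    """Join the text of the boxes the user picked for one event field."""
    return " ".join(numbered[n].text for n in parse_selection(raw_selection))


# Made-up flier contents, listed in the arbitrary order an OCR result might use.
boxes = [
    TextBox("Room 10-250", 40, 300, 200, 30),
    TextBox("Spring Concert", 40, 20, 400, 60),
    TextBox("April 12, 7 PM", 40, 200, 260, 30),
]
numbered = number_boxes(boxes)
title = text_for(numbered, "1")        # user says box 1 is the title
details = text_for(numbered, "2, 3")   # boxes 2 and 3 hold date and place
```

Numbering the boxes in reading order keeps the labels predictable for the user, who only has to type the numbers drawn next to each box on the flier image.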

We first worked on the backend, which mainly involved figuring out how to use the Google Calendar and Microsoft Computer Vision APIs. Then we had to extract the information from the results object that the vision API returned and match it against the flier images to get the correct event details. To turn it into something usable by a consumer, we decided to put together a basic web application. Since we also didn't have much web dev experience, we spent a while going through tutorials on Flask.
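As a rough sketch of the last backend step, the verified details can be packed into the kind of event body the Google Calendar API expects (`summary`, `location`, and `start`/`end` objects with `dateTime` and `timeZone`). The `build_event` helper, the one-hour default duration, the time zone, and the date format string are all assumptions made for illustration, not details of our actual code.

```python
from datetime import datetime, timedelta


def build_event(title, start, duration_minutes=60, location="",
                timezone="America/New_York"):
    """Build a Calendar-style event body from verified flier details.

    Hypothetical helper: the default duration and time zone are assumptions.
    """
    end = start + timedelta(minutes=duration_minutes)
    return {
        "summary": title,
        "location": location,
        "start": {"dateTime": start.isoformat(), "timeZone": timezone},
        "end": {"dateTime": end.isoformat(), "timeZone": timezone},
    }


# Assumed input format; real flier text would need more forgiving parsing.
start = datetime.strptime("April 12 2025 7:00 PM", "%B %d %Y %I:%M %p")
event = build_event("Spring Concert", start, location="Room 10-250")
```

A dictionary like this would then be sent as the body of the Calendar API's event insert request once the user has authenticated.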

Challenges we ran into

Originally we wanted to extract calendar event metadata using NLP, but we realized that was a little too ambitious, and we couldn't find a suitable existing NLP algorithm (and trying to write our own in 24 hours didn't seem very feasible). Then, we thought of letting the user manually distinguish between different metadata, such as event title vs. description. While we managed to build a basic app that could access the phone's camera, we couldn't figure out how to code the interface between the user's screen touch and the image representation (kind of like how the camera function in Google Translate works). We eventually switched from an iOS app to a web tool because first-time app development ended up being very difficult. Web development also turned out to be tricky, in the sense that it took a while to figure out how to get our app to respond to actions (uploading a file, submitting a form, etc.) triggered at the front end. Surprisingly, even trying to use the APIs involved jumping over hurdles with SSL certificates and authentication errors.

Accomplishments that we're proud of

IT WORKS. We came in knowing very little about web development and still managed to build a functioning, albeit minimalistic, web application. We also had a lot of fun pulling an all-nighter together.

What we learned

One takeaway is that ideas that seem simple at first are not always so simple to implement. (We're all first-time hackathon-ers.)

More seriously, we experienced firsthand how far computer vision has come, but also realized that it still has a long way to go, especially in terms of text recognition (identifying curly fonts, picking out small text, reading slanted lines, etc.). This principle also appears to be true for natural language processing, and now we are all very interested in contributing more to the field of AI.

What's next for Date & Thyme

We would love to learn about mobile development and try to get our original iOS idea working. For both the mobile and web app, we would also definitely like to improve the user interface. Although it will take more time, the last thing we would like to work on is learning more about computer vision and perhaps having a hand in improving text recognition algorithms, because currently our app performs only as well as the computer vision API.
