InstaCap

Our logo.
AdHawk setup for InstaCap.
Photo capturing with AdHawk and document recognition in action.
UI aspect that displays captured photos.

Inspiration

We've all had that moment: we see a poster we really like, but the flow of life keeps us from whipping out a phone, taking a picture, and cropping out the poster itself. Maybe we have a stack of documents we want to photograph. Or maybe we're in a lecture and the professor decides not to publish the slides, so the only alternative is taking phone pictures of the important ones, trading precious listening time for a subpar photo that's probably shaky and off-angle. Even worse, the professor may have a "no-phone" policy that rules this out in the first place. With everything returning to normal, these situations have become much more common, and that's where InstaCap comes into play.

What it does

Our application lets you seamlessly take pictures of documents, slideshows, and posters with just a deliberate blink; you can even take panoramic photos for wider views :O ! Images are uploaded and can be accessed via the web application, where you can download or remove them. Each image is transcribed with OCR, so you can filter images by the text they contain.
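As a rough illustration of the text filter, here is a minimal Python sketch. It assumes each uploaded photo is stored alongside its OCR transcript; the `CapturedImage` record, the `filter_by_text` helper, and the sample data are hypothetical names made up for this example, not the actual InstaCap code.

```python
from dataclasses import dataclass


@dataclass
class CapturedImage:
    """Hypothetical record: one uploaded photo plus its OCR transcript."""
    name: str
    transcript: str


def filter_by_text(images, query):
    """Return the images whose transcript contains every word of the query.

    Matching is case-insensitive. An empty query matches every image.
    """
    words = query.lower().split()
    return [
        img for img in images
        if all(word in img.transcript.lower() for word in words)
    ]


# Made-up sample data for the sketch.
captures = [
    CapturedImage("slide_01.png", "Lecture 3: Dynamic Programming"),
    CapturedImage("poster.png", "Hack the North 2022"),
    CapturedImage("slide_02.png", "Dynamic arrays and amortized cost"),
]

matches = filter_by_text(captures, "dynamic")
print([img.name for img in matches])  # ['slide_01.png', 'slide_02.png']
```

Plain substring matching is forgiving of word order, which may help when the OCR transcript is imperfect.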

How we built it

For eye tracking, we used AdHawk MindLink glasses to capture the user's field of view (FOV) and detect when they blinked. We used PyQt for the desktop app, Firebase to store the images, and React for the UI that displays them.

Challenges we ran into

We often had trouble calibrating the AdHawk MindLink glasses, but we worked through it together. We also spent a lot of time dealing with minor inaccuracies, since many of our ideas demanded more precise measurements. Blinking itself moves the eyeball, which makes any task that relies on both where the eyes are focused and the timing of a blink much harder to implement. Finally, combining all of our contributions (AdHawk photo capture, Firebase storage, React UI) at the end was a tumultuous obstacle we had to persevere through.
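One possible way to cope with the eye moving during a blink is sketched below in Python. It is not the approach we shipped and does not use the AdHawk SDK: the `GazeSample` format, the `deliberate_blinks` helper, and every threshold are assumptions chosen for illustration. The idea is to treat only long eye closures as deliberate, and to take the gaze point from a short window that ends slightly before the eye started to close.

```python
from dataclasses import dataclass


@dataclass
class GazeSample:
    """Hypothetical sample format; not the AdHawk SDK's data type."""
    t_ms: int          # timestamp in milliseconds
    x: float           # gaze position in camera-frame pixels
    y: float
    eye_closed: bool


def deliberate_blinks(samples, min_closed_ms=400, lookback_ms=200, guard_ms=100):
    """Return (blink_start_ms, gaze_x, gaze_y) for each long eye closure.

    The gaze point is the mean of open-eye samples in the window
    [start - guard_ms - lookback_ms, start - guard_ms], so samples
    disturbed by the eyelid closing are skipped. All thresholds are
    illustrative guesses, not measured values.
    """
    results = []
    start = None
    for s in samples:
        if s.eye_closed and start is None:
            start = s.t_ms                      # closure begins
        elif not s.eye_closed and start is not None:
            if s.t_ms - start >= min_closed_ms:  # long enough to be deliberate
                lo, hi = start - guard_ms - lookback_ms, start - guard_ms
                window = [p for p in samples
                          if not p.eye_closed and lo <= p.t_ms <= hi]
                if window:
                    gx = sum(p.x for p in window) / len(window)
                    gy = sum(p.y for p in window) / len(window)
                    results.append((start, gx, gy))
            start = None                         # closure ended
    return results


# Synthetic stream: one 500 ms blink, one 100 ms blink, and a disturbed
# gaze sample at t=450 ms just before the first closure.
samples = []
for t in range(0, 1500, 50):
    closed = (500 <= t < 1000) or (1200 <= t < 1300)
    x = 160.0 if t == 450 else 100.0
    samples.append(GazeSample(t, x, 200.0, closed))

print(deliberate_blinks(samples))  # [(500, 100.0, 200.0)]
```

In this synthetic stream the short blink is ignored and the disturbed sample at 450 ms falls inside the guard interval, so it does not affect the reported gaze point.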

Accomplishments that we're proud of

We're proud of building a functional demo of what we set out to achieve with InstaCap. Most of all, figuring out how to use the AdHawk MindLink glasses, an unfamiliar and fascinating piece of technology, and linking all of our work together in the end felt invigorating. Despite the roller coaster we went through, our perseverance and curiosity to learn were what got us here.

What we learned

Daniel: OpenCV and AdHawk + all-nighters suck
Bala: React (was primarily an AI and CV developer, but learned quite a bit helping with the React code); also sleep deprivation makes you worse at coding
Eric: AdHawk is cool 😎
Yimin: AdHawk

What's next for InstaCap

Tracking where the eyes are focused and returning only the document the user is looking at
Improving OCR performance (because the transcript can often look wonky)
Wireless hardware
Running our AdHawk integration server-side
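The first item on this list, returning only the document the user is looking at, could look roughly like the Python sketch below. It assumes document detection already yields bounding boxes as `(x, y, width, height)` tuples in the same pixel coordinates as the gaze point; the `document_at_gaze` helper and the sample boxes are hypothetical.

```python
def document_at_gaze(boxes, gaze):
    """Pick the detected document whose bounding box contains the gaze point.

    boxes: list of (x, y, width, height) tuples in camera-frame pixels.
    gaze:  (x, y) gaze point in the same coordinates.
    If several boxes contain the point, the smallest one wins, on the
    assumption that it is the most specific match. Returns None when the
    user is not looking at any detected document.
    """
    gx, gy = gaze
    hits = [
        (x, y, w, h) for (x, y, w, h) in boxes
        if x <= gx < x + w and y <= gy < y + h
    ]
    if not hits:
        return None
    return min(hits, key=lambda box: box[2] * box[3])


# Made-up detections: a whiteboard covering the frame and two posters on it.
boxes = [(0, 0, 640, 480), (100, 100, 200, 150), (400, 50, 100, 100)]

print(document_at_gaze(boxes, (150, 120)))  # (100, 100, 200, 150)
print(document_at_gaze(boxes, (700, 10)))   # None
```

Axis-aligned boxes keep the check simple; documents seen at a steep angle would likely need a point-in-polygon test instead.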

Built With

adhawk
firebase
google-cloud-vision
opencv
pyqt
python
react

Submitted to

Hack the North 2022

Created by

I incorporated database storage (Firebase) for our photos and contributed to the AdHawk and UI components of this project. Also helped link everything together in the end.

Eric Xiao
CSE student @ UW-Seattle Paul G. Allen School
I made the thing that stitched images together to make a panorama, helped with the auto cropping, and came up with some ideas for implementation.

Bala Venkataraman
I write code
Daniel Ye

Updates

Daniel Ye started this project — Sep 18, 2022 07:25 AM EDT
