As (pseudo-)humanities students, we often struggled to find a keyword in a textbook that was not available in digital form. Our goal was to build something that lets us search for a keyword (and related words) on paper.
What it does
Using AR and deep learning, ctrl-F finds a user-input string within text on a variety of physical surfaces, from textbooks to signposts to the nutrition facts on soda cans. When it finds the keyword, it highlights the match. All of this happens in real time, so there is virtually no wait.
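The matching step described above can be illustrated with a minimal sketch. It assumes the text recognizer has already produced a list of words with bounding boxes; the `Word` class, the `find_keyword_boxes` function, and the sample coordinates are hypothetical names and values chosen for illustration, not the app's actual code.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


@dataclass
class Word:
    """Hypothetical stand-in for one recognized word from an OCR engine."""
    text: str
    box: Box


def find_keyword_boxes(words: List[Word], keyword: str) -> List[Box]:
    """Return the boxes of all words that contain the keyword, ignoring case."""
    needle = keyword.strip().casefold()
    if not needle:
        # An empty query should highlight nothing rather than everything.
        return []
    return [w.box for w in words if needle in w.text.casefold()]


# Example frame: three words recognized on a nutrition label (made-up values).
frame = [
    Word("Total", (10, 10, 60, 30)),
    Word("Sugars", (70, 10, 140, 30)),
    Word("sugar-free", (10, 40, 120, 60)),
]
matches = find_keyword_boxes(frame, "sugar")
```

In a real-time setting, a function like this would run on every recognized frame, and the returned boxes would be drawn over the camera preview as highlights.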
How we built it
We explored two different approaches. First, we attempted to implement a deep learning model based on a paper (_EAST: An Efficient and Accurate Scene Text Detector_). Then we decided to use Firebase, developed by Google, as our backend engine for processing images of text. Combining the two approaches ultimately gave us the best results.
Challenges we ran into
We were all somewhat unfamiliar with Android development. The first challenge was getting the camera to load correctly within the app. We also ran into memory allocation issues caused by the asynchronous nature of the Firebase library.
Accomplishments that we're proud of
During the first phase of our approach, we developed a server/client that takes in an image of text and yields somewhat reliable results, although its performance was not optimal. We also developed a modified seam carving algorithm that carves a picture of text into lines, and we used threading to improve its performance. Neither yielded lasting results, but because we extended what we learned in class beyond a classroom setting, we were satisfied with the progress we made.
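For readers unfamiliar with seam carving, the core idea behind splitting text into lines can be sketched as follows. This is not the modified algorithm described above, whose details are not given here; it is a plain, single-threaded version of the standard seam-finding step, assuming a binarized image where 1 marks ink and 0 marks background. The function name `min_horizontal_seam` and the tiny sample page are hypothetical.

```python
from typing import List


def min_horizontal_seam(energy: List[List[int]]) -> List[int]:
    """Return one row index per column for the lowest-energy left-to-right path."""
    rows, cols = len(energy), len(energy[0])
    # cost[r][c] holds the cheapest total energy of any seam ending at (r, c).
    cost = [[energy[r][0]] + [0] * (cols - 1) for r in range(rows)]
    # back[r][c] holds the row that seam came from in column c - 1.
    back = [[0] * cols for _ in range(rows)]
    for c in range(1, cols):
        for r in range(rows):
            # A seam may move up one row, stay, or move down one row per column.
            candidates = [p for p in (r - 1, r, r + 1) if 0 <= p < rows]
            best = min(candidates, key=lambda p: cost[p][c - 1])
            cost[r][c] = energy[r][c] + cost[best][c - 1]
            back[r][c] = best
    # Trace back from the cheapest end point in the last column.
    r = min(range(rows), key=lambda i: cost[i][cols - 1])
    seam = [0] * cols
    for c in range(cols - 1, -1, -1):
        seam[c] = r
        r = back[r][c]
    return seam


# Two slightly skewed "text lines" separated by a blank gap (1 = ink).
page = [
    [1, 1, 1, 1],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [1, 1, 1, 1],
]
seam = min_horizontal_seam(page)
```

The seam follows the blank gap between the two lines, bending where the gap shifts down a row, so everything above it can be cut out as one line of text and everything below it as the next.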
What we learned
We learned how to develop an Android app (what it includes, what to look out for, etc.). We also learned the specifics of OCR and contemporary approaches to real-time text analysis.
What's next for ctrl-F
The biggest next step we are considering is replacing text input with voice input. We are also considering further optimization to reduce memory use and CPU time. UI/UX improvements are also on the agenda, along with development for iOS and more traditional AR platforms.