We wanted to turn our handwritten notes into clean, good-looking documents online so that we could study them more easily with other people. Anyone we know can use the tool the same way to convert their paper notes into digital ones. It also works on scientific papers, books, and many other sources of text, saving the extracted text to a file.

What it does

It takes an image and uses the Google Cloud Vision API's OCR features to extract the text within it. We then reformat the text by inserting LaTeX formatting code based on whitespace, position on the page, font size, section markers, and several other factors. The website is hosted on Google Cloud App Engine and runs on a Tornado (Python) backend. The front end uses HTML, CSS, and JavaScript, and it converts the uploaded files into binary form so that the analysis can run on them.
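As a rough illustration of the font-size step, here is a minimal sketch that treats unusually tall lines as headings. It is not our actual code: the `Line` type, the `height` field, and the 1.6 and 1.25 thresholds are hypothetical values chosen for the example.

```python
from dataclasses import dataclass
from statistics import median

# Characters that have a special meaning in LaTeX and need escaping.
_LATEX_ESCAPES = str.maketrans({
    "\\": r"\textbackslash{}",
    "&": r"\&", "%": r"\%", "$": r"\$", "#": r"\#",
    "_": r"\_", "{": r"\{", "}": r"\}",
})


@dataclass
class Line:
    """One OCR line (hypothetical shape, not the Vision API's own type)."""
    text: str
    height: float  # bounding-box height in pixels


def escape_latex(text: str) -> str:
    """Escape LaTeX special characters in a single pass."""
    return text.translate(_LATEX_ESCAPES)


def to_latex(lines: list[Line]) -> str:
    """Emit LaTeX, promoting lines much taller than the median to headings."""
    if not lines:
        return ""
    body_height = median(line.height for line in lines)
    out = []
    for line in lines:
        ratio = line.height / body_height
        text = escape_latex(line.text)
        if ratio >= 1.6:        # assumed threshold for a section title
            out.append(f"\\section{{{text}}}")
        elif ratio >= 1.25:     # assumed threshold for a subsection title
            out.append(f"\\subsection{{{text}}}")
        else:
            out.append(text)
    return "\n".join(out)
```

Comparing each line to the median height, rather than to a fixed pixel size, keeps the classification independent of the image resolution.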

How we built it

We built it on top of Google's Cloud App Engine, which hosts all of the data and runs the calculations. From there, we call Google's Cloud Vision API and use its Optical Character Recognition module to read written or typed text, which we then convert into editable LaTeX. From the LaTeX, we can convert to numerous other file formats, such as PDF and .txt.
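The Vision call can be sketched through the API's REST interface, which accepts a base64-encoded image in a JSON body. This is an assumption-laden sketch rather than our production code: the helper names `build_ocr_request` and `extract_text` are hypothetical, and the code only builds and parses JSON without sending anything.

```python
import base64
import json

# Public REST endpoint for Cloud Vision image annotation.
VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_ocr_request(image_bytes: bytes) -> str:
    """Build the JSON body for a document OCR request (hypothetical helper)."""
    payload = {
        "requests": [{
            # The REST API expects the raw image bytes as base64 text.
            "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
            "features": [{"type": "DOCUMENT_TEXT_DETECTION"}],
        }]
    }
    return json.dumps(payload)


def extract_text(response_body: str) -> str:
    """Pull the recognized text out of a response body, or '' if none."""
    responses = json.loads(response_body).get("responses", [])
    if not responses:
        return ""
    return responses[0].get("fullTextAnnotation", {}).get("text", "")
```

The request body would be sent as a POST to `VISION_ENDPOINT` with whatever credentials the project uses; that step is left out here because it depends on the deployment.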

Challenges we ran into

One of the largest problems was calculating whitespace and font properties in the documents. The OCR output let us classify the text itself, but not its font. On top of that, estimating the relative size of the text and detecting bold or italics was difficult. Calculating whitespace was hard because indentation is relative: it depends on both the size of the image and the size of the font.
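One way to make indentation independent of image size is to measure it in units of the line's own text height. The sketch below assumes that one em is roughly the bounding-box height of a line, which is only an approximation; the function names, the `margin_x` parameter, and the 0.5em rounding step are hypothetical.

```python
def indent_in_ems(left_x: float, margin_x: float, font_height: float) -> float:
    """Indentation of a line, in multiples of its own text height."""
    if font_height <= 0:
        raise ValueError("font_height must be positive")
    return max(0.0, (left_x - margin_x) / font_height)


def indent_line(text: str, left_x: float, margin_x: float,
                font_height: float, step: float = 0.5) -> str:
    """Prefix a line with a LaTeX \\hspace* matching its measured indent."""
    ems = indent_in_ems(left_x, margin_x, font_height)
    ems = round(ems / step) * step  # snap to a grid to absorb OCR jitter
    if ems == 0:
        return text
    return f"\\hspace*{{{ems:g}em}}{text}"
```

Because the pixel offset is divided by the text height, the same page photographed at twice the resolution yields the same indent: `indent_line("x = 1", 140, 100, 20)` and `indent_line("x = 1", 280, 200, 40)` both produce a 2em indent.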

Accomplishments that we're proud of

We set up the website from scratch and hosted it on Google App Engine even though none of us had web design experience. We went from not knowing how to center text on the screen to transferring files in binary encodings to and from the client. Also, despite how difficult it was, our algorithm accounts for the vast majority of the whitespace in the documents.

What we learned

We learned how to develop web apps using JavaScript, HTML, CSS, and Google App Engine. We also taught ourselves how to use the Google Cloud Vision API, which is really useful because Google's other APIs work in a similar way.

What's next for Image to LaTeX Converter

I plan on using the domain name that we bought, which didn't get approved in time. I also want to fill in the gaps in the LaTeX generation that we didn't have time to finish during the hackathon. There are also some miscellaneous bugs that can crash the program, which I need to patch.

Built With

  • google-cloud-app-engine
  • google-cloud-vision-api
  • perceptron
  • python
  • tornado