Writing research papers means typing formulas by hand in Word or LaTeX, which is slow and painful. We therefore set out to build a program that scans printed or handwritten formulas and converts them into LaTeX automatically.

How we built it

We tried Wolfram, Google Cloud Platform, and several OCR APIs, but eventually settled on Java and Python. We use OpenCV for OCR and jlatex for LaTeX conversion and formatting.
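To give a feel for one early step in formula OCR, here is a minimal sketch of splitting a binarized image into individual symbols by looking for empty columns. This is an illustration only, not necessarily how our code does it: the function name `split_symbols` is hypothetical, and a nested list of 0/1 pixels stands in for the image that OpenCV would normally provide, so that the sketch runs without any dependencies.

```python
from typing import List, Tuple


def split_symbols(binary: List[List[int]]) -> List[Tuple[int, int]]:
    """Return (start, end) column ranges, end exclusive, one per run of ink columns.

    `binary` is a hypothetical stand-in for a thresholded image:
    a list of rows, where 1 means ink and 0 means background.
    """
    if not binary:
        return []
    width = len(binary[0])
    # A column "has ink" if any row has a nonzero pixel in that column.
    has_ink = [any(row[x] for row in binary) for x in range(width)]

    spans: List[Tuple[int, int]] = []
    start = None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x                  # a new symbol begins here
        elif not ink and start is not None:
            spans.append((start, x))   # the symbol ended at the previous column
            start = None
    if start is not None:
        spans.append((start, width))   # the last symbol touches the right edge
    return spans


image = [
    [1, 0, 0, 1, 1, 0],
    [1, 0, 0, 0, 1, 0],
    [1, 0, 0, 1, 1, 0],
]
print(split_symbols(image))  # [(0, 1), (3, 5)]
```

Column splitting like this only works when symbols do not overlap horizontally, which is rarely true for fractions, exponents, or handwriting. That is part of why formula OCR is much harder than ordinary text OCR.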

A bash script passes a .png file to the Python stage, and its output is piped to the Java stage, which generates a nicely formatted .jpg file of the formula.
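The two-stage pipeline can be sketched in Python as well. This is a rough equivalent of the bash glue, not the actual script: the function `run_pipeline` and the file names `ocr.py` and `renderer.jar` are hypothetical, and the sketch assumes that the OCR stage takes the image path as its last argument and prints LaTeX to stdout, while the rendering stage reads LaTeX from stdin.

```python
import subprocess
import sys
from typing import Sequence


def run_pipeline(image_path: str,
                 ocr_cmd: Sequence[str],
                 render_cmd: Sequence[str]) -> str:
    """Run the OCR stage on an image, feed its output to the rendering stage,
    and return the recognized LaTeX string."""
    # Stage 1: the OCR command gets the image path appended as its last argument
    # and is assumed to print the recognized LaTeX on stdout.
    ocr = subprocess.run([*ocr_cmd, image_path],
                         capture_output=True, text=True, check=True)
    latex = ocr.stdout.strip()

    # Stage 2: the renderer is assumed to read the LaTeX from stdin.
    subprocess.run(list(render_cmd), input=latex, text=True, check=True)
    return latex


if __name__ == "__main__":
    # Stand-in commands so the sketch runs anywhere. A real invocation might
    # look like this (hypothetical file names):
    #   run_pipeline("formula.png", ["python", "ocr.py"],
    #                ["java", "-jar", "renderer.jar"])
    fake_ocr = [sys.executable, "-c", "print('x^2')"]
    fake_render = [sys.executable, "-c", "import sys; sys.stdin.read()"]
    print(run_pipeline("formula.png", fake_ocr, fake_render))
```

Passing `check=True` makes either stage raise an error on a non-zero exit code, so a failed OCR run does not silently hand an empty formula to the renderer.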

Challenges we ran into

Running OCR on the input image is very difficult, and there are few good libraries for producing nicely formatted LaTeX output.

What we learned

Our project touched on topics that are the subject of several PhD theses, ranging from teaching machine learning models to recognize images to manually parsing formulas into LaTeX.

What's next for LaTeX Formula Scanner

Full integration with mobile platforms, use of online computing services, voice input of formulas, and online storage of generated formulas.

