Writing research papers means typing formulas by hand in Word or LaTeX, which is a real pain. So we came up with the idea of a program that scans printed or handwritten formulas and turns them into LaTeX automatically.
How we built it
We tried Wolfram, Google Cloud Platform, and several OCR APIs, but eventually settled on Java and Python. We use OpenCV for OCR and jlatex for LaTeX conversion and formatting.
A bash script passes a .png file to the Python code, and the Python output is piped to Java, which generates a nicely formatted .jpg file of the formula.
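The hand-off between the stages can be sketched in Python as well. This is only an illustration of the piping idea, assuming each stage reads from stdin and writes to stdout. The helper `run_pipeline` is hypothetical, and the two stages below are small stand-in commands, not our real recognizer or renderer.

```python
import subprocess
import sys


def run_pipeline(stages, data):
    """Feed `data` (bytes) through each command in turn, like `a | b` in a shell."""
    for cmd in stages:
        # check=True raises CalledProcessError if a stage exits with an error.
        result = subprocess.run(
            cmd, input=data, stdout=subprocess.PIPE, check=True
        )
        data = result.stdout
    return data


# Stand-in for the Python OCR stage: consumes the image bytes, emits LaTeX.
recognize = [
    sys.executable, "-c",
    "import sys; sys.stdin.buffer.read(); sys.stdout.write('x^2')",
]
# Stand-in for the Java rendering stage: here it only wraps the formula in $...$.
render = [
    sys.executable, "-c",
    "import sys; sys.stdout.write('$' + sys.stdin.read() + '$')",
]

output = run_pipeline([recognize, render], b"fake image bytes")
```

In a real setup the stand-ins would be replaced by the actual commands, for example a Python script followed by a `java -jar` call, and the input would be the contents of the .png file.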
Challenges we ran into
Running OCR on the input image is very difficult, and there are few good libraries for producing nicely formatted LaTeX output.
What we learned
Our project touched on material covered by several PhD theses, with topics ranging from training machine learning models for image recognition to manually parsing formulas into LaTeX.
What's next for LaTeX Formula Scanner
Full integration with mobile platforms, use of online computing services, voice input of formulas, and online storage of the generated formulas.