TekScribe

Screenshot of our "Demo"
Before
After
Screenshot of the real-time speech-to-text app powered by the Azure Speech API
A flow chart of the capabilities of our app and the services that power them

Inspiration

The inspiration for our app comes from our personal frustrations as physicists: beyond pen and paper, we have no note-taking tools that let us easily write mathematical expressions during our daily lectures. In addition, we are tired of the chemists at our university having access to audio lecture recordings while we do not!

What it does

TekScribe is a split-screen app consisting of a canvas with scrolling text to the side. The scrolling text is generated in real time via speech-to-text and is presented as individual text bubbles, similar to common messaging apps. The canvas accepts user input in three different ways:

A paragraph textbox where text from the generated speech bubbles can be copied and pasted.
Stylus/handwritten input of mathematical expressions, which are then rendered into LaTeX.
Raw input of stylus/handwritten diagrams.

Once the user is finished with note-taking, the final canvas would be saved as a pair of .tex and .pdf files.

The app would let its users take notes efficiently by making it easy to write cleanly formatted mathematical expressions and diagrams, as well as to transcribe speech word-for-word. For our target audience, undergraduate students, this would be a powerful tool for taking notes in lectures.

How we built it

We used Microsoft Azure's Speech API for the real-time speech-to-text service. We also interfaced the Microsoft Bing API with Python (bing.py) to parse Wikipedia glossaries for short phrases, which were then fed into the Azure custom speech API to train a custom language model. This capability was built into an Android app, "MainActivity", that transcribes microphone input in real time and displays it on the screen as speech bubbles.
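The phrase-extraction step can be illustrated with a minimal sketch. This is not the contents of bing.py: the function names, the "term: definition" line format, and the four-word limit are all assumptions made for illustration, and the sketch deliberately leaves out the Bing API request and the upload to Azure. It only shows how glossary text might be reduced to a plain-text corpus with one short phrase per line.

```python
import re


def extract_phrases(glossary_text, max_words=4):
    """Return unique short phrases from glossary text, in first-seen order.

    Assumes (hypothetically) that each glossary entry is one line of the
    form "term: definition". Lines whose term is longer than max_words
    are skipped.
    """
    seen = set()
    phrases = []
    for line in glossary_text.splitlines():
        # Keep only the part before the first colon (the glossary term).
        term = line.split(":", 1)[0]
        # Replace anything that is not a letter, digit, apostrophe,
        # hyphen, or space, then collapse whitespace and lowercase.
        term = re.sub(r"[^A-Za-z0-9' -]", " ", term)
        term = " ".join(term.split()).lower()
        if not term or len(term.split()) > max_words or term in seen:
            continue
        seen.add(term)
        phrases.append(term)
    return phrases


def to_corpus(phrases):
    """Join phrases into plain text with one phrase per line."""
    return "".join(phrase + "\n" for phrase in phrases)


if __name__ == "__main__":
    sample = (
        "Angular momentum: rotational analogue of linear momentum\n"
        "Boson: particle with integer spin\n"
    )
    print(to_corpus(extract_phrases(sample)), end="")
```

Deduplicating and lowercasing keeps the corpus small; whether that helps the custom language model would need to be checked against the real data.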

In parallel, we used the interactive-ink module of the MyScript SDK (which outputs LaTeX strings from user input) together with the open-source MathJax and KaTeX libraries (which render LaTeX expressions from strings) to implement TeX-ification of stylus input. We also connect over SSH to a virtual machine hosted on Microsoft Azure, which compiles the .tex files to .pdf with pdfLaTeX. This was implemented in "Demo", an app that demonstrates these capabilities.
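As a rough sketch of the compilation step, the Python below assembles a minimal .tex document from pasted text and LaTeX expression strings, and builds the pdfLaTeX command line that would be run on the virtual machine. The function names and the document layout (article class, amsmath, display math) are assumptions for illustration, not the code used in "Demo", and the SSH transfer is left out.

```python
def build_tex(paragraphs, expressions):
    """Assemble a minimal LaTeX document as a single string.

    paragraphs:  plain-text strings (not escaped here, so LaTeX special
                 characters such as % or & would need handling first).
    expressions: LaTeX math strings, each placed in display math.
    """
    lines = [
        r"\documentclass{article}",
        r"\usepackage{amsmath}",
        r"\begin{document}",
    ]
    for text in paragraphs:
        # A blank line after the text ends the paragraph in LaTeX.
        lines += [text, ""]
    for expr in expressions:
        lines += [r"\[", expr, r"\]"]
    lines.append(r"\end{document}")
    return "\n".join(lines) + "\n"


def pdflatex_command(tex_path, out_dir):
    """Return the argument list for a non-interactive pdfLaTeX run."""
    return [
        "pdflatex",
        "-interaction=nonstopmode",  # do not stop and prompt on errors
        f"-output-directory={out_dir}",
        str(tex_path),
    ]


if __name__ == "__main__":
    print(build_tex(["Newton's second law:"], [r"F = ma"]), end="")
    print(" ".join(pdflatex_command("notes.tex", "out")))
```

On a machine with pdfLaTeX installed, the returned list could be passed to `subprocess.run(cmd, check=True)` after writing the string to notes.tex.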

Challenges we ran into

We lost many hours of progress due to local memory loss and infrequent Git pushes. Eventually, we hit a brick wall in combining the speech-to-text capabilities with our TeX canvas in Android Studio because of an incompatibility between the SDKs. Unfortunately, we have not been able to overcome the technical nightmare of Gradle, so our app is implemented as two separate apps: one for speech-to-text and one for TeX-ified writing.