Tired of manually typing equations into LaTeX? Why can’t writing equations be as simple as saying them to your computer? Now it can! We present an approach that is real-time, smart, and natural for humans. Our method is powered by a voice assistant that runs entirely on-device.
What it does
Using Snips’ Maker Kit and software tools, we perform real-time ASR (Automatic Speech Recognition) and NLU (Natural Language Understanding) to interpret a set of spoken commands from the user. These include:
- Creating polynomial functions
- Creating trigonometric functions
- Writing the integral and derivative of the above functions
- Creating 2D matrices
- Computing matrix multiplication and matrix inversion
- Plotting polynomial functions
The identified intents (e.g. write_polynomial) and their slots/entities (e.g. max order and coefficients) are passed to a server, which displays the corresponding LaTeX syntax along with a preview of the expression.
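To make the slot-to-LaTeX step concrete, here is a minimal sketch of how the coefficients captured for a polynomial intent could be turned into a LaTeX string. The function name `polynomial_to_latex` and the convention that coefficients arrive highest order first are assumptions for illustration, not the project's actual code.

```python
def polynomial_to_latex(coefficients, variable="x"):
    """Render coefficients (assumed highest order first) as a LaTeX polynomial."""
    order = len(coefficients) - 1
    terms = []
    for i, coeff in enumerate(coefficients):
        power = order - i
        if coeff == 0:
            continue  # skip terms that vanish
        sign = "-" if coeff < 0 else "+"
        magnitude = abs(coeff)
        # Omit a coefficient of 1 unless it is the constant term.
        coeff_text = "" if magnitude == 1 and power > 0 else str(magnitude)
        if power == 0:
            var_text = ""
        elif power == 1:
            var_text = variable
        else:
            var_text = f"{variable}^{{{power}}}"
        terms.append((sign, coeff_text + var_text))
    if not terms:
        return "0"
    # The leading term carries its sign without surrounding spaces.
    first_sign, first_body = terms[0]
    latex = ("-" if first_sign == "-" else "") + first_body
    for sign, body in terms[1:]:
        latex += f" {sign} {body}"
    return latex


print(polynomial_to_latex([3, 0, -2, 1]))  # 3x^{3} - 2x + 1
```

With this convention, the max order slot is implied by the number of coefficients, so a spoken "third order polynomial" would supply four values.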
This workflow lets the user dictate common functions and simply copy and paste the resulting LaTeX code. With more time, we would have added support for more complicated functions and cumbersome LaTeX entries such as tables.
How we built it
We trained the Snips kit on different types of intents using the Snips console. The training examples were given in natural language, with specific slots/entities to be identified. For example, the following training examples were used for the Integral intent:
- Can you integrate the function z cubed (function) which has a lower limit 10 (lower_bound) and an upper limit 30 (upper_bound)
- integrate x squared (function) from 0 (lower_bound) to 20 (upper_bound)
The slots function, lower_bound and upper_bound must be filled for the Integral intent. We trained the other intents similarly:
- Getting polynomial function
- Getting trigonometric functions
- Derivative of functions
- Creating Matrices
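As an illustration of what happens once the Integral slots are filled, the sketch below maps the three slots onto a LaTeX integral. The slot names come from the training examples above; the helper names, the tiny `SPOKEN_POWERS` vocabulary, and the assumption that the first word of the function slot is the integration variable are all hypothetical simplifications.

```python
# Hypothetical vocabulary for spoken exponents; a real system would need more.
SPOKEN_POWERS = {"squared": 2, "cubed": 3}


def spoken_function_to_latex(text):
    """Convert phrases like 'x squared' to 'x^{2}'; pass anything else through."""
    words = text.lower().split()
    if len(words) == 2 and words[1] in SPOKEN_POWERS:
        return f"{words[0]}^{{{SPOKEN_POWERS[words[1]]}}}"
    return text


def integral_to_latex(slots):
    """Build a definite integral from the function, lower_bound, upper_bound slots."""
    words = slots["function"].split()
    # Assumption: the first spoken word names the integration variable.
    variable = words[0].lower() if words else "x"
    body = spoken_function_to_latex(slots["function"])
    lower, upper = slots["lower_bound"], slots["upper_bound"]
    return rf"\int_{{{lower}}}^{{{upper}}} {body} \, d{variable}"


slots = {"function": "x squared", "lower_bound": 0, "upper_bound": 20}
print(integral_to_latex(slots))  # \int_{0}^{20} x^{2} \, dx
```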
The trained model is deployed to the Maker Kit, where it is ready to be used with speech recognition. A Python script on the board runs this model, taking input from the onboard microphone. The user is prompted for input and receives continuous spoken feedback through the speakers on the operations being performed. This mode of interaction feels very natural to people.
Once the required intents and slots are detected, the Maker Kit sends a POST request to the user’s computer, where a server handles the request and generates a LaTeX script. The script is used to render the corresponding PDF. We use PyLaTeX to generate the LaTeX. Our current implementation displays both the LaTeX source and the rendered PDF to the user.
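The request handling on the computer side might look roughly like the sketch below. It uses only the Python standard library and builds the LaTeX by hand rather than through PyLaTeX, so it is not the project's implementation. The JSON payload shape, the `create_matrix` intent name, the `rows` slot, and port 8000 are all assumptions made for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def matrix_to_latex(rows):
    """Render a list of rows as a LaTeX bmatrix."""
    body = r" \\ ".join(" & ".join(str(v) for v in row) for row in rows)
    return rf"\begin{{bmatrix}} {body} \end{{bmatrix}}"


def build_latex(payload):
    """Dispatch a hypothetical {"intent": ..., "slots": {...}} payload to LaTeX."""
    intent = payload["intent"]
    slots = payload["slots"]
    if intent == "create_matrix":  # hypothetical intent name
        return matrix_to_latex(slots["rows"])
    raise ValueError(f"unsupported intent: {intent}")


class LatexHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length))
            status, body = 200, {"latex": build_latex(payload)}
        except (ValueError, KeyError) as err:
            # Malformed JSON, a missing key, or an unknown intent.
            status, body = 400, {"error": str(err)}
        data = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def serve(port=8000):
    """Run the server until interrupted; the port is an arbitrary choice."""
    HTTPServer(("0.0.0.0", port), LatexHandler).serve_forever()
```

Calling `serve()` on the computer and posting `{"intent": "create_matrix", "slots": {"rows": [[1, 2], [3, 4]]}}` from the board would return a JSON body containing the bmatrix source, which could then be compiled to PDF on the machine where LaTeX is installed.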
Challenges we ran into
- As with most hardware projects, memory is a crucial resource. Because of the board’s limited memory, we couldn’t install LaTeX on it.
- Our current implementation works well with limited training data, but more examples would make the model more robust.
Accomplishments that we're proud of
- Prototyping quickly with hardware
What we learned
- LaTeX compilation and formatting
- The Snips console and Maker Kit
What's next for Speak2Tex
- Creating an educational tool by extending this voice assistant to school settings.