MultiBet

Inspiration

Over the last year, COVID restrictions moved many students around the world to remote schooling, which undoubtedly took a toll on the quality of their education due to the loss of in-person instruction. Learning a new language is no easy task, and doing so remotely introduces even more challenges, especially with speaking and handwriting, both of which require a lot of practice. We wanted to create a tool that helps students, no matter how young or old, improve their writing skills at any time and in any place, without the need for human teacher intervention.

What it does

MultiBet has individualized learning at its core. The user can choose which script they would like to learn to write: English, Greek, Hebrew, or Japanese (Katakana and Hiragana). MultiBet presents the user with randomised characters in their selected language for writing practice. ML models assess the accuracy of each attempt and give real-time feedback on how to improve, showing the user where the model looked and how to reshape their letter into the nearest proper letter. MultiBet also supports text, visual, and audio interactions, adapting to the user's preferred method of learning (e.g. auditory learners might prefer to listen to instructions).

How we built it

MultiBet was built in three parts. The frontend and backend were wrapped around Gradio with custom processing functions. The dataset was assembled by downloading way too many fonts (English, Greek, Hebrew, and Japanese Hiragana and Katakana are currently supported) and rendering them as 32x32 images. Models were adapted from open-source PyTorch repositories and trained from scratch with heavy augmentation. A spatial transformer network registers the user's sketch to a common space, where a hash lets us find good neighbours to interpolate to, while an adapted variational autoencoder with a classification head learns the distribution and classes of the data.
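The hash-based neighbour search can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the writeup does not say which hash was used, so the average hash below, the Hamming-distance ranking, and all names (`average_hash`, `nearest_glyphs`, the toy bar images) are assumptions. It also assumes the sketch has already been registered to the common space by the spatial transformer.

```python
from typing import Dict, List, Tuple

Image = List[List[float]]  # square grid of intensities in [0, 1], e.g. 32x32


def average_hash(image: Image, grid: int = 8) -> int:
    """Assumed hash: block-average down to grid x grid, threshold at the mean."""
    block = len(image) // grid
    means = []
    for by in range(grid):
        for bx in range(grid):
            total = 0.0
            for y in range(by * block, (by + 1) * block):
                for x in range(bx * block, (bx + 1) * block):
                    total += image[y][x]
            means.append(total / (block * block))
    threshold = sum(means) / len(means)
    bits = 0
    for m in means:
        bits = (bits << 1) | (1 if m > threshold else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def nearest_glyphs(sketch: Image, index: Dict[str, int], k: int = 3) -> List[Tuple[str, int]]:
    """Return the k reference glyphs whose hashes are closest to the sketch."""
    h = average_hash(sketch)
    ranked = sorted(((label, hamming(h, code)) for label, code in index.items()),
                    key=lambda pair: pair[1])
    return ranked[:k]


def bar(vertical: bool, lo: int, hi: int, size: int = 32) -> Image:
    """Toy stand-in for a rendered glyph: a filled vertical or horizontal bar."""
    img = [[0.0] * size for _ in range(size)]
    for y in range(size):
        for x in range(size):
            if lo <= (x if vertical else y) < hi:
                img[y][x] = 1.0
    return img


# Hypothetical reference index built from rendered glyphs.
index = {
    "vertical": average_hash(bar(True, 12, 20)),
    "horizontal": average_hash(bar(False, 12, 20)),
}

# A registered user sketch: a vertical stroke drawn one pixel off-centre.
matches = nearest_glyphs(bar(True, 13, 21), index, k=2)
```

Here `matches` ranks the vertical reference glyph first, so a slightly misplaced stroke still finds its intended neighbour. A hash keeps the lookup cheap enough for real-time feedback, since it avoids running a model over every reference glyph.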

Challenges we ran into

Gradio uses a submit button to update the frontend, which wasn't in sync with our setup. Due to time constraints, we had to adapt our app's control flow to work around this issue.

Accomplishments that we're proud of

Our team is incredibly proud of our ability to synthesize data using PyGame to render fonts. We found that our models were able to learn disentangled features such as shape, size, orientation, and stroke thickness. We also found it cool to assemble a visualizer that traverses the VAE's encoded latent space to give the user better feedback on how they could modify their strokes. Using a registration model together with hashing was also new to us and proved very useful for the visualization aspects of the project.
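One simple way to traverse a latent space is to interpolate between the code of the user's sketch and the code of the target letter, then decode each intermediate point into an image. The sketch below assumes linear interpolation, which the writeup does not confirm; the function name `traverse_latent` and the three-dimensional example codes are hypothetical, and the decoder itself is left out.

```python
from typing import List


def traverse_latent(start: List[float], end: List[float], steps: int = 5) -> List[List[float]]:
    """Linearly interpolate from start to end, endpoints included."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    if len(start) != len(end):
        raise ValueError("latent codes must have the same length")
    path = []
    for i in range(steps):
        t = i / (steps - 1)  # goes from 0.0 to 1.0
        path.append([(1 - t) * s + t * e for s, e in zip(start, end)])
    return path


# Hypothetical latent codes: the user's sketch and the nearest proper letter.
user_code = [0.0, 2.0, -1.0]
target_code = [1.0, 0.0, 1.0]

# Each point on the path would be passed to the VAE decoder to render a frame.
path = traverse_latent(user_code, target_code, steps=5)
```

Decoding the points in `path` in order would give a sequence of frames that morphs the user's stroke toward the target, which is the kind of feedback the visualizer is meant to provide.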

What we learned

The need for hand-drawn characters is a lie... we had near-perfect performance on a handwritten dataset and were able to show that autogenerated data under strong, realistic augmentation settings is sufficient to learn the data distribution well.
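To make the idea of augmenting rendered glyphs concrete, here is a minimal sketch. The writeup does not list the augmentations that were used, so the two shown here (random translation and stroke thickening) and every name and parameter in the block are assumptions chosen for illustration only.

```python
import random
from typing import List

Glyph = List[List[int]]  # binary image, 1 = ink


def shift(glyph: Glyph, dx: int, dy: int) -> Glyph:
    """Translate the glyph by (dx, dy), dropping ink that leaves the canvas."""
    size = len(glyph)
    out = [[0] * size for _ in range(size)]
    for y in range(size):
        for x in range(size):
            if glyph[y][x]:
                nx, ny = x + dx, y + dy
                if 0 <= nx < size and 0 <= ny < size:
                    out[ny][nx] = 1
    return out


def thicken(glyph: Glyph) -> Glyph:
    """One step of binary dilation over the 4-neighbourhood."""
    size = len(glyph)
    out = [row[:] for row in glyph]
    for y in range(size):
        for x in range(size):
            if glyph[y][x]:
                for nx, ny in ((x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)):
                    if 0 <= nx < size and 0 <= ny < size:
                        out[ny][nx] = 1
    return out


def augment(glyph: Glyph, rng: random.Random, max_shift: int = 3) -> Glyph:
    """Apply a random shift, then thicken the stroke half of the time."""
    out = shift(glyph,
                rng.randint(-max_shift, max_shift),
                rng.randint(-max_shift, max_shift))
    if rng.random() < 0.5:
        out = thicken(out)
    return out


# Toy stand-in for a font-rendered glyph: a 16-pixel vertical stroke.
glyph = [[0] * 32 for _ in range(32)]
for y in range(8, 24):
    glyph[y][16] = 1

rng = random.Random(0)  # seeded so the samples are reproducible
samples = [augment(glyph, rng) for _ in range(4)]
ink_counts = [sum(map(sum, s)) for s in samples]
```

Each call to `augment` yields a slightly different version of the same clean glyph, which is how a font-rendered dataset can be stretched to cover the variation seen in real handwriting.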