Alik Z - Growing up as a deaf child who moved to the United States from Russia at the age of four, I relied on my parents to communicate with my teachers. I was not independent until high school, my first school year wearing cochlear implants and being able to talk with people on my own without relying entirely on someone who knew how to sign. If I had been able to use an app like this when I was five years old, I can only imagine who I would be today.
What it does
Augmented ASL provides real-time captioning in augmented reality: make a handshape, and our app overlays the equivalent alphabet letter in AR.
How we built it
Our app's model was trained on 36,000 images across 29 labels with Core ML, then ported to iOS using ARKit, SceneKit, and the Core ML framework. In essence, it is an image-recognition model running inside an AR app.
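Conceptually, each camera frame goes through a single classification step: run the frame through the model, take the most confident of the 29 classes, and show that letter in AR. A minimal Python sketch of that mapping step (the real app does this in Swift via Core ML; the label list and confidence threshold below are illustrative assumptions, not our exact values):

```python
import numpy as np

# Placeholder label set: we assume the 29 classes are A-Z plus a few
# control signs (space/delete/nothing) -- the exact labels are an assumption.
LABELS = [chr(c) for c in range(ord("A"), ord("Z") + 1)] + ["space", "del", "nothing"]

def softmax(logits: np.ndarray) -> np.ndarray:
    """Convert raw model scores to probabilities."""
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

def classify_frame(logits: np.ndarray, threshold: float = 0.5):
    """Map the model's raw output for one frame to a label, or None
    when the top probability is below the confidence threshold."""
    probs = softmax(logits)
    top = int(np.argmax(probs))
    return LABELS[top] if probs[top] >= threshold else None
```

The threshold keeps the AR overlay from flickering through low-confidence guesses between handshapes.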
Challenges we ran into
We decided to build everything from scratch (no machine-learning services, etc.), including our datasets. We spent some time taking pictures and resizing them, then experimented with grayscale images; that didn't give us satisfactory results, so we had to scrape more RGB images and look for other datasets to build our own .mlmodel.
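The image preparation mentioned above (resizing, plus the grayscale experiment that didn't pan out) can be sketched with plain numpy; the 224-pixel target size and the BT.601 luminance weights below are standard choices, not necessarily the ones we used:

```python
import numpy as np

def to_grayscale(rgb: np.ndarray) -> np.ndarray:
    """Collapse an (H, W, 3) RGB image to (H, W) using the standard
    ITU-R BT.601 luminance weights."""
    return rgb @ np.array([0.299, 0.587, 0.114])

def resize_nearest(img: np.ndarray, size: int = 224) -> np.ndarray:
    """Crude nearest-neighbor resize to (size, size); a real pipeline
    would use PIL/OpenCV interpolation instead."""
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return img[rows][:, cols]
```

Dropping the color channels discards exactly the skin-tone and lighting cues the classifier turned out to need, which matches the unsatisfactory grayscale results we saw.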
We also had no prior experience integrating ML with AR.
We spent most of our time obtaining proper data and creating our own dataset, making it work with AR, and fine-tuning the deep-learning model for better accuracy.
Accomplishments that we're proud of
We had a working model by Saturday evening, then spent 12 hours building a better classifier. We had to create custom deep-learning models (mostly through transfer learning).
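The transfer learning mentioned above boils down to freezing a pretrained feature extractor and training only a small new classification head on top. A toy numpy sketch of that idea, with a random projection standing in for the real convolutional backbone (everything here is illustrative, not our actual model):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a frozen pretrained backbone: a fixed random projection + ReLU.
W_frozen = rng.normal(size=(64, 16))

def features(x: np.ndarray) -> np.ndarray:
    """Frozen feature extractor: these weights are never updated."""
    return np.maximum(x @ W_frozen, 0.0)

def fit_head(x, y, classes, lr=0.5, steps=300):
    """Train only a new softmax head on top of the frozen features
    with plain gradient descent; returns a predict function."""
    f = features(x)
    mu, sd = f.mean(0), f.std(0) + 1e-9   # standardize the frozen features
    fz = (f - mu) / sd
    W = np.zeros((fz.shape[1], classes))
    onehot = np.eye(classes)[y]
    for _ in range(steps):
        logits = fz @ W
        p = np.exp(logits - logits.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)
        W -= lr * fz.T @ (p - onehot) / len(x)   # update the head only

    def predict(xq):
        return np.argmax(((features(xq) - mu) / sd) @ W, axis=1)
    return predict
```

Training only the head is what makes this feasible on a hackathon timeline: the expensive representation is reused, and only a small linear layer has to fit our dataset.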
What we learned
Data augmentation, computer-vision details, TensorFlow.js, and integrating machine learning with AR on iOS.
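The data augmentation we learned can be sketched as a few cheap transforms applied to each training image on the fly; this is a simplified numpy illustration, not our actual pipeline (note that mirroring assumes a flipped handshape is still a valid sample for the same letter):

```python
import numpy as np

def augment(img: np.ndarray, rng: np.random.Generator):
    """Yield cheap augmented variants of one training image (values in [0, 1]):
    a horizontal flip, a small random shift, and brightness jitter."""
    yield img[:, ::-1]                                    # mirror left/right hand
    dy, dx = rng.integers(-3, 4, size=2)                  # shift by up to 3 px
    yield np.roll(img, (int(dy), int(dx)), axis=(0, 1))
    yield np.clip(img * rng.uniform(0.8, 1.2), 0.0, 1.0)  # brightness jitter
```

Transforms like these stretch a hand-collected dataset much further than re-shooting photos, which mattered given how much of the weekend data collection consumed.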
What's next for Augmented Sign Language (ASL)
First and foremost, Augmented Sign Language is a usable application. It is also a demonstration that the technology to create life-changing applications for deaf people already exists; it's just a matter of reaching out and piecing a solution together. This application was built in a little less than 36 hours. A year or two of research could lead to something truly groundbreaking.
We hope to learn more machine learning and make the same approach work for short video sequences. We have a rough idea of how to do it, but we definitely need some expertise. We want to go from alphabet --> words --> sentences.