Inspiration
The main inspiration for S2T is to promote inclusion between AUSLAN users and non-AUSLAN users by using AI to reduce communication barriers. This empowers AUSLAN users to interact with others comfortably, confidently, and reliably in a language they are familiar with.
Learning AUSLAN can be difficult when guidance is scarce, and courses can be expensive. The S2T model can also be used in a mobile application to check whether a learner's signing is correct and confirm it with visual cues.
What it does
The main concept is live translation of AUSLAN to English text. S2T can be deployed in an online setting (as an extension or a Zoom app) or as a mobile application. For example, a Zoom extension would let AUSLAN users express themselves clearly and easily during a call.
How we built it
For the back end, we had to create our own small dataset by capturing frames from a live video feed of ourselves signing AUSLAN. With the aid of the MediaPipe framework, we located the face, pose, left hand, and right hand in each frame. The captured images were then converted to NumPy arrays for training.
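As a rough illustration of the conversion step, the sketch below flattens one frame's landmarks into a fixed-length NumPy vector. It assumes results shaped like the output of MediaPipe's Holistic solution (attributes `pose_landmarks`, `face_landmarks`, `left_hand_landmarks`, `right_hand_landmarks`, each holding a `landmark` list). The landmark counts, the zero-filling of undetected parts, and the helper names `extract_keypoints` and `_flatten` are assumptions for illustration, not details taken from our project code. A stand-in object replaces the real MediaPipe result so the example runs on its own.

```python
from types import SimpleNamespace

import numpy as np

# Assumed landmark counts for MediaPipe Holistic (not stated in this write-up).
N_POSE, N_FACE, N_HAND = 33, 468, 21


def _flatten(landmarks, count, fields):
    """Flatten one landmark group, or return zeros if it was not detected."""
    if landmarks is None:
        return np.zeros(count * len(fields), dtype=np.float32)
    return np.array(
        [[getattr(p, f) for f in fields] for p in landmarks.landmark],
        dtype=np.float32,
    ).flatten()


def extract_keypoints(results):
    """Turn one frame's Holistic-style results into a fixed-length vector."""
    pose = _flatten(results.pose_landmarks, N_POSE, ("x", "y", "z", "visibility"))
    face = _flatten(results.face_landmarks, N_FACE, ("x", "y", "z"))
    left = _flatten(results.left_hand_landmarks, N_HAND, ("x", "y", "z"))
    right = _flatten(results.right_hand_landmarks, N_HAND, ("x", "y", "z"))
    return np.concatenate([pose, face, left, right])


# Stand-in for a real MediaPipe result: only the right hand was detected.
point = SimpleNamespace(x=0.5, y=0.5, z=0.0)
fake_results = SimpleNamespace(
    pose_landmarks=None,
    face_landmarks=None,
    left_hand_landmarks=None,
    right_hand_landmarks=SimpleNamespace(landmark=[point] * N_HAND),
)
frame_vector = extract_keypoints(fake_results)
```

Zero-filling missing parts keeps every frame the same length, so frames can be stacked into one array even when a hand leaves the camera view.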
We then split the dataset into 95% training data and 5% test data before training the model. We used an LSTM deep learning model because LSTMs can retain information over a period of time, which suits classifying a sign from a sequence of video frames.
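A 95%/5% split can be sketched in plain NumPy as follows. The function name `split_train_test`, the fixed seed, and the placeholder array shapes (clips, frames per clip, features per frame) are hypothetical and chosen only to make the example runnable; they do not describe our actual dataset.

```python
import numpy as np


def split_train_test(X, y, test_fraction=0.05, seed=0):
    """Shuffle the samples, then hold out `test_fraction` of them for testing."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))
    # Keep at least one test sample even for very small datasets.
    n_test = max(1, int(round(len(X) * test_fraction)))
    test_idx, train_idx = order[:n_test], order[n_test:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]


# Placeholder data: 100 clips, 30 frames each, 8 feature values per frame.
X = np.zeros((100, 30, 8), dtype=np.float32)
y = np.arange(100) % 3  # three hypothetical sign classes
X_train, X_test, y_train, y_test = split_train_test(X, y)
```

Shuffling before splitting matters when clips are recorded class by class, since otherwise the held-out 5% could contain a single sign only.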
We tuned the hyperparameters to improve the model's performance. Our model reached 75% accuracy.
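For reference, classification accuracy is the share of samples whose highest-scoring class matches the true label. The sketch below uses made-up probabilities and labels purely to show the calculation; the numbers are not our model's outputs, and the function name `accuracy` is a placeholder.

```python
import numpy as np


def accuracy(probabilities, labels):
    """Fraction of samples whose most probable class equals the true label."""
    predicted = np.argmax(probabilities, axis=1)
    return float(np.mean(predicted == labels))


# Toy predictions for five samples over three hypothetical sign classes.
probs = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
])
labels = np.array([0, 1, 2, 1, 1])
score = accuracy(probs, labels)  # four of the five toy predictions are correct
```

With a small test set, accuracy moves in large steps (one sample out of N changes it by 1/N), so a single figure should be read with that granularity in mind.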
Since we lacked front-end skills and experience, we looked for tools that were easy to use and beginner-friendly. As a result, we used Figma, a collaborative interface design tool. We are proud that our first-year students and first-time hackathon participants were able to pick up the interface quickly and produce a prototype in the time we had.
Challenges we ran into
The main challenges we faced were getting the algorithm to detect the target face and hands, and finding an appropriate dataset to train the model. In addition, our team had no experience in front-end app development.
Accomplishments that we're proud of
Despite not having a proper interface for our prototype, we demonstrated the main functionality: translating AUSLAN to English text.
Apart from that, new members who entered UniHack this year without prior experience were able to learn new skills and technologies and contribute in a meaningful way.
What we learned
We learned design techniques that improve user experience, such as the use of visual cues and color. We also learned about LSTM (Long Short-Term Memory) recurrent neural networks, their benefits compared with other deep neural networks, and why they are well suited to computer vision tasks that involve sequences of video frames.
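To make the "memory" idea concrete, here is a minimal NumPy sketch of a single LSTM time step using the standard gate equations. It is not our TensorFlow model: the weight layout, the gate ordering inside the stacked matrices, and the sizes `D`, `H`, and `T` are assumptions chosen for illustration, and frameworks order the gates differently.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step. Shapes: W (4H, D), U (4H, H), b (4H,)."""
    H = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    i = sigmoid(z[0:H])            # input gate: how much new information to write
    f = sigmoid(z[H:2 * H])        # forget gate: how much old cell state to keep
    o = sigmoid(z[2 * H:3 * H])    # output gate: how much cell state to expose
    g = np.tanh(z[3 * H:4 * H])    # candidate values for the cell state
    c = f * c_prev + i * g         # updated cell state (the long-term memory)
    h = o * np.tanh(c)             # updated hidden state (the per-frame output)
    return h, c


rng = np.random.default_rng(0)
D, H, T = 6, 4, 10  # hypothetical feature size, hidden size, number of frames
W = rng.normal(scale=0.1, size=(4 * H, D))
U = rng.normal(scale=0.1, size=(4 * H, H))
b = np.zeros(4 * H)

frames = rng.normal(size=(T, D))
h, c = np.zeros(H), np.zeros(H)
for x in frames:
    # The same weights are reused at every frame; only h and c carry history.
    h, c = lstm_step(x, h, c, W, U, b)
```

When the forget gate is close to 1 and the input gate close to 0, the cell state passes through a step almost unchanged, which is how information from early frames can still influence the prediction at the end of a sign.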
What's next for S2T
We will continue development and research to implement S2T as a Zoom application. We will also develop the mobile application to promote the inclusion of AUSLAN users in our society.
Built With
- computer-vision
- deep-learning
- figma
- miro
- python
- tensorflow