Duo ASL

Inspiration

“On my team, I'm the only deaf employee,” said Adams. “Now that we are working remotely, I have an interpreter who can be available for scheduled meetings. If something is last minute and I don't have access to an interpreter, I will use the auto caption feature to help me keep up with the conversation, then I'll follow up if there's anything I need to clarify.”

For people who are unable to speak, communicating in video calls is a challenge. This motivated us to build an application that displays closed captions for ASL.

What it does

When one person in a conversation is using ASL to communicate, the platform identifies the signs and converts them to closed captions.

This enables people who have difficulty communicating through speech to communicate efficiently without a human interpreter.
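As a rough illustration of the last step, turning per-frame sign predictions into caption text, here is a minimal sketch. It assumes the model emits one label per video frame and uses "-" for frames with no recognised sign; the function name `frames_to_caption`, the `min_run` threshold, and the label format are all hypothetical and are not taken from the project's code.

```python
from itertools import groupby


def frames_to_caption(frame_labels, min_run=3, blank="-"):
    """Collapse per-frame sign labels into caption text.

    Assumption: a sign counts only if the same label persists for at
    least `min_run` consecutive frames; `blank` marks "no sign".
    """
    words = []
    for label, run in groupby(frame_labels):
        run_length = sum(1 for _ in run)
        if label == blank or run_length < min_run:
            continue  # skip blanks and short, probably noisy, runs
        # Do not repeat a word that was only briefly interrupted
        if not words or words[-1] != label:
            words.append(label)
    return " ".join(words)


labels = ["-", "-", "hello", "hello", "hello", "you",
          "hello", "hello", "hello", "-",
          "thanks", "thanks", "thanks", "thanks"]
caption = frames_to_caption(labels)
```

A limitation of this sketch is that a sign deliberately repeated twice in a row would be merged into one word; a real system would need a better way to separate repetitions from noise.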

How we built it

We developed the model with PyTorch; it runs on a Google Cloud server. The backend was developed with Python and Flask. The front end was developed with HTML, JavaScript, and React.

Challenges we ran into

Creating and establishing the connection between the server and the client to send and receive video sequences for processing.
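One common difficulty with this kind of connection is that a raw TCP socket delivers a byte stream with no message boundaries, so the receiver cannot tell where one encoded frame ends and the next begins. The sketch below shows length-prefixed framing using only the Python standard library. It is an illustration under assumptions, not the protocol Duo ASL used: the helper names `send_frame`, `recv_exact`, and `recv_frame` and the 4-byte header are hypothetical.

```python
import socket
import struct

# Assumed wire format: 4-byte big-endian length, then the payload bytes
HEADER = struct.Struct("!I")


def send_frame(sock, payload: bytes) -> None:
    """Send one message (e.g. an encoded video frame) with a length prefix."""
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_exact(sock, n: int) -> bytes:
    """Read exactly n bytes; recv() may return fewer than requested."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("socket closed before full message arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def recv_frame(sock) -> bytes:
    """Receive one length-prefixed message."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, length)


# Local demonstration with a connected socket pair standing in for
# the client and the server.
client, server = socket.socketpair()
send_frame(client, b"fake-jpeg-bytes")
received = recv_frame(server)
send_frame(server, "hello".encode("utf-8"))  # caption sent back
caption = recv_frame(client).decode("utf-8")
client.close()
server.close()
```

The same framing works in both directions, so frames can travel to the server and caption text can travel back over one connection.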

Converting ASL signs to text with an ML model requires a lot of processing power. Training the model on a local machine would have been impractical, so we trained it on the cloud server. The available bandwidth was very low, which resulted in a very low frame rate. The datasets available for training assumed higher frame rates, so we had to create our own dataset for this task.
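To make the frame-rate mismatch concrete, the sketch below picks which frames of a clip survive when it is reduced from a source frame rate to a lower target rate. This is only one possible way to approximate low-frame-rate input from higher-frame-rate recordings; it is not a description of how the Duo ASL dataset was actually built, and the function name `subsample_indices` is hypothetical.

```python
def subsample_indices(num_frames: int, src_fps: float, dst_fps: float) -> list[int]:
    """Return the indices of frames to keep when reducing src_fps to dst_fps."""
    if src_fps <= 0 or dst_fps <= 0:
        raise ValueError("frame rates must be positive")
    if dst_fps >= src_fps:
        # Nothing to drop: the target rate is not lower than the source
        return list(range(num_frames))
    step = src_fps / dst_fps  # source frames per kept frame
    indices = []
    position = 0.0
    while int(position) < num_frames:
        indices.append(int(position))
        position += step
    return indices


# A one-second clip recorded at 30 fps, reduced to 5 fps
kept = subsample_indices(30, src_fps=30, dst_fps=5)
```

At 5 fps only one frame in six remains, which shows how much of a sign's motion is lost and why a model trained on full-frame-rate clips may not transfer to a low-bandwidth stream.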

Accomplishments that we're proud of

Creating a dataset and training a model to convert low-frame-rate ASL video to text. Successfully generating closed captions for ASL.

What we learned

WebRTC, socket programming, and the importance of ASL.

What's next for Duo ASL

The current system supports a video call between two people. It could be extended to group conference calls. With access to servers with better bandwidth, the accuracy of the ASL-to-text conversion could be improved. The generated text could also be converted to audio, so that the call feels like a usual video call.