Communication is the key to developing connections, but it can be inaccessible to those who do not communicate orally. Each of us in the group has been affected by an inability to communicate fluidly with someone who uses sign language as their primary form of communication.

What it does

With the aim of a complete ASL-to-speech converter, our project uses machine learning and the infrared capabilities of a Leap Motion device to detect the first six letters of the alphabet in sign and translate them to text at 70% confidence.

How we built it

The code we developed uses data recorded from a USB-connected Leap Motion device to accumulate a database of a dozen or so different people signing the first six letters of the alphabet. We used principal component analysis (PCA) along with random forest classification on the database to "teach" our program to interpret live Leap Motion input and print the signed letters to the screen as text.
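The training pipeline above can be sketched as follows. This is a minimal illustration, not our exact code: the synthetic arrays stand in for the recorded signing database, and the assumption is that each Leap Motion frame has already been flattened into a fixed-length feature vector (e.g. fingertip and palm positions).

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
letters = ["A", "B", "C", "D", "E", "F"]

# Stand-in for ~12 signers x 6 letters of recorded hand-feature vectors.
X = rng.normal(size=(720, 60))      # 720 samples, 60 hand features each
y = rng.choice(letters, size=720)   # letter label per sample

# Reduce dimensionality with PCA, then classify with a random forest.
model = make_pipeline(
    PCA(n_components=20),
    RandomForestClassifier(n_estimators=100, random_state=0),
)
model.fit(X, y)

# A live frame yields a predicted letter plus class probabilities,
# which is where a confidence threshold like 70% would be applied.
frame = rng.normal(size=(1, 60))
letter = model.predict(frame)[0]
confidence = model.predict_proba(frame).max()
print(letter, confidence)
```

Chaining PCA and the classifier in one pipeline keeps the dimensionality reduction fitted only on training data, so live frames pass through the same learned projection.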

Challenges we ran into

Unfortunately, even within the first six letters of the alphabet (all of which are static signs), the signs share similar attributes, which makes large, varied data sets important for distinguishing letters. Also, just as every voice sounds different, each person signs with a different "tone": even for something as fundamental as the first six letters, each hand has subtleties that again require a large database for machine learning.

What's next for Hands Hear

As a long-term project, Hands Hear would develop into a wearable device (such as a brooch or necklace) that lets a native ASL signer sign naturally while their signs are translated to speech in real time. As a first step, we would expand the database to all letters of the alphabet. From there, we would use a side program developed during the project that lets a user sign out words one letter at a time and then outputs them as speech.
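The letter-at-a-time spelling mode could be sketched like this. The function and the simulated prediction stream are hypothetical stand-ins: a real version would feed live classifier output into the buffer and hand the finished words to a text-to-speech backend, which is omitted here.

```python
def spell_to_words(predictions):
    """Group a stream of predicted letters into words; None marks a pause."""
    words, current = [], []
    for p in predictions:
        if p is None:
            # A pause between signs flushes the buffered letters as one word.
            if current:
                words.append("".join(current))
                current = []
        else:
            current.append(p)
    if current:
        words.append("".join(current))
    return words

# Simulated classifier output: two spelled words separated by a pause.
stream = ["C", "A", "B", None, "F", "A", "D", "E"]
print(spell_to_words(stream))  # ['CAB', 'FADE']
```

Treating the pause as the word delimiter mirrors fingerspelling in practice, where a brief hold or hand drop separates spelled words.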

Developing the capabilities for visual signal analysis is analogous to the development of acoustic signal analysis in the world of speech recognition. In line with this analogy, the next step involves determining the base unit of recognition. In oral languages this is the phoneme; for speech recognition, the triphone (or the syllable, depending on your philosophy). Once this has been determined, a full ASL-to-text converter can be developed. With text-to-speech infrastructure already on the market, the wearable ASL-to-speech device becomes a reality.
