For the deaf community, translation from speech to sign language has become extremely important. Translators are relied upon in lectures, conferences, and even daily conversations. Unfortunately, translators are not accessible to everyone, especially in less fortunate communities.
We wanted to create a robot that can translate speech into sign language that is displayed on a set of mechanical hands.
What it does
We have created Handy: a set of two human-like hands and arms that are articulated in exactly the same way as human hands from the elbow down. This gives our robot incredible versatility, allowing it to do anything from communicating in sign language to playing rock-paper-scissors.
It translates English speech into ASL signs, using the Google Cloud Speech-to-Text API to transcribe the speech.
How we built it
We arrived on site with pre-printed 3D components and set to work building the hands (20 servos in total).
At the same time, we started to develop an abstraction for an ASL database that provides heuristic features about each sign. We ran our code on the 800 most frequently used words to extract about 10 unique identifiers for each sign. We then converted these features into angular rotations for each robotic joint, and finally into a custom-designed machine code that generates the desired motions on the robot in real time. A custom protocol lets the Python code running on a Mac transmit these commands over a serial port to the Arduino, which sends I2C data to two 16-channel PWM modules that control the servos.
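To make the last step concrete, here is a minimal sketch of how joint angles could be packed into bytes for the serial link. The frame layout (start byte, board, channel, angle, checksum) and the names `encode_servo_command` and `encode_pose` are assumptions made up for illustration; they are not Handy's actual protocol.

```python
# Hypothetical frame layout (an assumption, not Handy's real protocol):
#   0xFF | board (0-1) | channel (0-15) | angle (0-180) | checksum
START_BYTE = 0xFF
CHANNELS_PER_BOARD = 16  # two 16-channel PWM modules


def encode_servo_command(servo_index: int, angle_deg: float) -> bytes:
    """Pack one servo target into a 5-byte frame."""
    if not 0 <= servo_index < 2 * CHANNELS_PER_BOARD:
        raise ValueError("servo index out of range")
    # Map a flat servo index onto (PWM board, channel on that board).
    board, channel = divmod(servo_index, CHANNELS_PER_BOARD)
    # Clamp to the usual hobby-servo range of 0 to 180 degrees.
    angle = max(0, min(180, round(angle_deg)))
    # Simple additive checksum so the receiver can reject corrupted frames.
    checksum = (board + channel + angle) & 0xFF
    return bytes([START_BYTE, board, channel, angle, checksum])


def encode_pose(angles_deg) -> bytes:
    """Concatenate one frame per servo, in servo-index order."""
    return b"".join(
        encode_servo_command(i, a) for i, a in enumerate(angles_deg)
    )
```

On the Mac side, the resulting bytes could then be written to the port with a serial library such as pySerial; the Arduino would parse each frame and forward the angle to the matching PWM channel.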
Challenges we ran into
Our initial idea was to include a camera, recognize sign language via a convolutional neural network, and even have our robot respond through a chatbot. We realized that this would be impossible for us to achieve in the time available. With many signs being very similar and the databases very restrictive, there was a very low probability of success. We therefore had to shift our idea to something more realistic.
While we were working on the abstraction for the 3D movements from the ASL database, we realized that the available databases are not nearly as refined as we thought. This meant that we lost detail on some wrist motions and had to generalize the bigger movements (for example, the major location of the movement).
We also had to overcome the limitations of having the robotic elbow fixed in one position. Something as basic as tracing a circle could not be done; instead, we used some math to create the impression of a circle (in reality, an ellipse was drawn).
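As a rough illustration of the ellipse idea, the sketch below samples evenly spaced waypoints on an ellipse using the standard parametric form x = a cos(t), y = b sin(t). The function name `ellipse_waypoints` and the semi-axis values are hypothetical; the math Handy actually used is not documented here.

```python
import math


def ellipse_waypoints(semi_axis_x, semi_axis_y, steps=12):
    """Return `steps` points evenly spaced in parameter t around an ellipse."""
    return [
        (
            semi_axis_x * math.cos(2 * math.pi * k / steps),
            semi_axis_y * math.sin(2 * math.pi * k / steps),
        )
        for k in range(steps)
    ]


# Example: an ellipse twice as wide as it is tall, sampled at 8 points.
waypoints = ellipse_waypoints(2.0, 1.0, steps=8)
```

Each waypoint would still need to be converted into joint angles before it could be sent to the servos.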
We also faced the difficult challenge of creating our own machine code that, unlike most, used an interdependent polar system, with each servo rotating the next one in the chain. Considerable math was needed to approximate all possible R^3 positions onto what was essentially the curves of two circles (a sphere that rotates in time).
Finally, the hardware ended up being a weak link, and with no possibility of buying replacement parts, we had to think on the fly to keep the hands in working order.
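The interdependence can be seen in a simplified planar example: because each joint is mounted on the previous one, its angle is measured relative to the link before it, so the angles accumulate along the chain. This is a sketch of the general idea (forward kinematics of a serial chain) in 2D, not Handy's actual 3D math; the function name and link lengths are assumptions.

```python
import math


def chain_end_position(link_lengths, joint_angles_deg):
    """Return the (x, y) tip of a planar chain of rotating links.

    Each joint angle is relative to the previous link, so rotating an
    early joint moves every link that comes after it.
    """
    x = y = 0.0
    heading = 0.0  # accumulated absolute direction of the current link
    for length, angle in zip(link_lengths, joint_angles_deg):
        heading += math.radians(angle)
        x += length * math.cos(heading)
        y += length * math.sin(heading)
    return x, y


# Two unit-length links: the first points straight up, the second bends
# back by 90 degrees, so the tip ends up at approximately (1, 1).
tip = chain_end_position([1.0, 1.0], [90.0, -90.0])
```

Going the other way, from a desired position to joint angles, is the harder inverse problem, and not every position is reachable when a joint such as the elbow is fixed.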
Accomplishments that we're proud of
Sure, there are a few bugs here and there, but it works and it's amazing. When we first thought of the idea, we thought it would be impossible to complete in 36 hours. But things went somewhat smoothly and everything worked out in the end.
With 3 members of our team being first-time hackers and one member bogged down for most of the event on the sign recognition work that didn't pan out, we are extremely proud of our achievement.
All things considered, the fact that we have a working prototype is really awesome.
What we learned
No task is too big when you have an endless supply of candy and energy drinks.
With only one member of our team being used to hardware, it was a big learning experience to play with the Arduino and even create our own machine code and transmission protocol.
Abstracting an entire database was also new to us, and it ended up being an interesting aspect of the project.
What's next for Handy
Definitely a V2 of the hardware. We also want to find a better, more detailed database so that each sign can be unique and detailed.
Also (though it is definitely really tough), a way for the computer to take input from a sign language user and understand the signs through computer vision.