Our project is an Altruistic Hack submitted under the General Track, and we would like to be considered for the Best Accessibility Hack challenge sponsored by Fidelity and Carelon.
Inspiration
In recent years, our society has become noticeably more inclusive. However, we noticed that deaf and mute members of society still have a hard time communicating effectively with the rest of the world, particularly in the workplace. We envisioned our app as a solution to this problem, making sign language more versatile and improving sign language literacy in society overall.
What it does
Our app has two main functionalities.
Practice Mode
In order to improve sign language literacy across society, our Practice Mode allows users to learn sign language through an easy-to-use UI. Users are prompted to sign specific letters into the camera and earn points for each correct sign. They can practice in Guided Mode, where the letter prompt is accompanied with an illustration of the corresponding sign, or the more challenging Unguided Mode, where the illustration is not provided.
Writing Mode
Much like how speech-to-text applications make texting easier for everyday users, our app's Writing Mode allows adept signers to quickly convert their signs into words and phrases, making written communication easier and more efficient.
How we built it
Our site is built with the Flask web framework and the React JavaScript library. It can be broken down into three main components: the backend, the server, and the frontend.
Backend
Our CV algorithm is written in Python and is built on top of Google's MediaPipe Hand Detection model. When provided with an input image, MediaPipe outputs coordinates for each of the 21 landmarks of the human hand, which we then normalize about the origin. Our first step was to create a dataset of about 200 of these representations per letter. We then fed the completed dataset into a k-nearest-neighbors (KNN) classifier, which gave us a model for classifying future 21-point representations of the human hand.
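The pipeline above can be sketched roughly as follows. This is a minimal illustration, not our actual code: the normalization details, dataset shapes, and the use of scikit-learn's KNN are assumptions for the example (the landmark coordinates would really come from MediaPipe).

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def normalize_landmarks(points):
    """Center 21 (x, y) hand landmarks about the origin and scale to unit size."""
    pts = np.array(points, dtype=float)
    pts -= pts.mean(axis=0)          # translate so the centroid sits at the origin
    scale = np.abs(pts).max()
    return pts / scale if scale else pts

# Hypothetical dataset: ~200 normalized 21-point samples per letter.
X = np.random.rand(400, 21, 2)       # stand-in for real recorded signs
y = ["A"] * 200 + ["B"] * 200        # stand-in labels

X_flat = np.array([normalize_landmarks(s).ravel() for s in X])
knn = KNeighborsClassifier(n_neighbors=5).fit(X_flat, y)

# Classify a new 21-point hand representation.
sample = normalize_landmarks(np.random.rand(21, 2)).ravel()
print(knn.predict([sample])[0])
```

In practice each incoming webcam frame is reduced to the same flattened, normalized 42-value vector before being handed to the classifier.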
Server
Our Flask server handles communication between our Python backend and our React frontend. When the program runs, we wrap our CV code in a single function, gen_frames, which continuously yields a byte representation of the user's webcam feed with appropriate annotations. These images are easily embedded in HTML by accessing the endpoint {server_ip}:5000/video_feed. A similar process passes sign illustrations and their corresponding letters into the Practice Mode dashboard. To notify our backend when a user switches between Unguided and Guided Mode while practicing, we use a simple system of POST requests with appropriate event handlers.
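The streaming setup can be sketched like this. The function and route names mirror those mentioned above, but the body is a hedged approximation: a placeholder byte string stands in for the annotated JPEG frames that the real app would encode from the webcam with OpenCV/MediaPipe.

```python
from flask import Flask, Response

app = Flask(__name__)

def gen_frames():
    """Yield frames as an endless multipart JPEG stream.

    In the real app, each iteration grabs a webcam frame, runs the CV
    annotations, and JPEG-encodes it; here a placeholder stands in.
    """
    while True:
        jpeg_bytes = b"\xff\xd8placeholder"   # stand-in for an encoded frame
        yield (b"--frame\r\n"
               b"Content-Type: image/jpeg\r\n\r\n" + jpeg_bytes + b"\r\n")

@app.route("/video_feed")
def video_feed():
    # The browser renders this endpoint directly in an <img> tag.
    return Response(gen_frames(),
                    mimetype="multipart/x-mixed-replace; boundary=frame")
```

On the frontend, displaying the stream is then as simple as pointing an image element at `{server_ip}:5000/video_feed`.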
Frontend
Our React UI is carefully planned to be as easy to use as possible through thoughtful CSS templating and website layout. Beyond appearances, we use the React library for all of our website's dynamic functionality, such as buttons to navigate between pages. Additionally, React is used to set up event handling for server communication.
Challenges we ran into
Each of our app's three major components involved significant struggles throughout development.
Backend
Our biggest struggle was our model's accuracy. To improve the model itself, we experimented with different normalization techniques on our dataset, scaling datapoints nonlinearly to overrepresent small differences in joint locations between similar signs. We also introduced a requirement that a letter be detected five times in a row before accepting it as a trustworthy output. With these changes, our model is significantly more accurate.
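The five-in-a-row stability check can be sketched as a small stateful filter. The class name and structure here are illustrative, not taken from our codebase:

```python
class StableDetector:
    """Accept a letter only after it is predicted `required` frames in a row."""

    def __init__(self, required=5):
        self.required = required
        self.last = None
        self.count = 0

    def update(self, letter):
        """Feed one per-frame prediction; return the letter once it has
        appeared `required` consecutive times, otherwise None."""
        if letter == self.last:
            self.count += 1
        else:
            self.last = letter
            self.count = 1
        return letter if self.count >= self.required else None

det = StableDetector()
stream = ["A", "A", "B", "B", "B", "B", "B"]
print([det.update(s) for s in stream])
# → [None, None, None, None, None, None, 'B']
```

Noisy single-frame misclassifications (like the two stray "A" predictions above) never reach the user; only a run of five identical predictions is trusted.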
Server
Our biggest struggle was handling communication between the backend/server and our React frontend. We experimented with websockets but found them too difficult to implement in our given timeframe. Instead, we eventually figured out how to use Flask Response objects and POST requests, which in the aggregate provided the same two-way communication functionality as a websocket.
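The POST-request half of this arrangement can be sketched as a small Flask route that the React frontend hits when the user toggles modes. The endpoint name, JSON field, and shared-state dictionary are hypothetical illustrations, not our actual identifiers:

```python
from flask import Flask, request, jsonify

app = Flask(__name__)

# Shared state that the CV loop reads each frame to decide whether
# to show the sign illustration alongside the letter prompt.
practice_mode = {"guided": True}

@app.route("/set_mode", methods=["POST"])
def set_mode():
    """Frontend POSTs e.g. {"guided": false} when the user switches modes."""
    data = request.get_json() or {}
    practice_mode["guided"] = bool(data.get("guided", True))
    return jsonify(practice_mode)
```

Paired with the streamed Response objects flowing the other way, this gives the frontend-to-backend channel that a websocket would otherwise provide.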
Frontend
Our biggest challenge was determining the best way to present our model to create a pleasant user experience. The most challenging aspect of the functionality was the implementation of two practice modes and dynamic images, which we resolved through communication between the front and back ends. We experimented with different organizational structures and colors to create a bright, inviting feel to engage and motivate the user to learn.
Accomplishments that we're proud of
We are especially proud of our sign detection functionality. Though MediaPipe provided the data about a hand's orientation, it was our idea to use a KNN to find patterns within this data, and as a result we were able to create a relatively homegrown algorithm. We are also proud of how we overcame our challenges with websockets. Though websockets would have been the simplest solution if implemented correctly, we improvised a solution combining two disparate techniques once it became clear that we were running out of time. Finally, we are very proud of our ability to combine Flask and React in a single project, as it showcased our adaptability as problem solvers. Neither of us had ever done anything like this before, but despite overwhelming errors and misunderstandings of documentation, we eventually pieced it together.
What we learned
Through our struggles, we learned a lot about different machine learning techniques to improve accuracy, specifically the importance of data preprocessing. Additionally, we learned how to effectively integrate Flask and React into a single application. This will be extremely useful for future hackathons involving app development.
What's next for Giving a Hand
We hope to further improve our CV algorithm, as its accuracy still leaves room for improvement. Additionally, our sign language alphabet is missing the letters Z and J, because signing them requires hand movement, a variable our KNN strategy is not equipped to handle. We hope to experiment with different, more versatile machine learning models to overcome this issue in future iterations.