Inspiration

The brief talk from Hana and the accessibility theme of the Support-a-thon hackathon made me reflect on human-machine interaction and how technology, especially AI/ML, could open up new possibilities for multi-modal, non-contact interaction through vision and even voice controls.

For this hackathon, I chose to explore gesture detection from a webcam feed, which can then be integrated into web apps to trigger alternative app control flows and promote accessibility.

What it does

The project built so far performs hand pose detection on the webcam feed and highlights the detected hand by drawing a mesh on a canvas overlay, connecting hand landmarks such as the fingertips, knuckles, and palm. The projection of the canvas mesh onto the hand in the webcam feed is imperfect and needs further work.
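As a rough sketch, detection and drawing look something like this (assuming the common react-webcam plus overlay-canvas setup from handpose tutorials; the exact wiring in the project may differ):

```js
import * as handpose from "@tensorflow-models/handpose";

// Detect hands in the current video frame and mark each of the
// 21 returned landmarks as a dot on the overlay canvas. Drawing the
// full mesh means also connecting the points grouped under
// prediction.annotations (thumb, indexFinger, palmBase, ...).
async function detect(model, video, canvas) {
  const ctx = canvas.getContext("2d");
  const predictions = await model.estimateHands(video);

  ctx.clearRect(0, 0, canvas.width, canvas.height);
  predictions.forEach((prediction) => {
    // prediction.landmarks is an array of 21 [x, y, z] keypoints.
    prediction.landmarks.forEach(([x, y]) => {
      ctx.beginPath();
      ctx.arc(x, y, 4, 0, 2 * Math.PI);
      ctx.fillStyle = "aqua";
      ctx.fill();
    });
  });
  return predictions;
}
```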

The app displays detected gestures (thumbs up, victory) through corresponding images, along with the predicted confidence scores.
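The gesture classification on top of the landmarks is roughly the following (a sketch using the fingerpose API with handpose-style landmarks; depending on the fingerpose version, the result field is `score` or `confidence`):

```js
import * as fp from "fingerpose";

// Estimator configured with the two built-in gestures used here.
const estimator = new fp.GestureEstimator([
  fp.Gestures.VictoryGesture,
  fp.Gestures.ThumbsUpGesture,
]);

// landmarks: the 21 [x, y, z] keypoints from handpose.
function classifyGesture(landmarks) {
  // 8.5 is a minimum match score; fingerpose scores range from 0 to 10.
  const { gestures } = estimator.estimate(landmarks, 8.5);
  if (gestures.length === 0) return null;
  // Return the best-scoring gesture, e.g. { name: "victory", score: 9.2 }.
  return gestures.reduce((best, g) => (g.score > best.score ? g : best));
}
```

The returned name can then be mapped to the corresponding image, with the score displayed alongside it.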

There is also a button to start and stop the prediction loop.
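The toggle is essentially starting and stopping a detection interval, along these lines (a sketch; the `detect` helper above and the 100 ms interval are assumptions):

```js
import { useRef, useState } from "react";

// Toggle a setInterval loop that runs the given detection callback.
function usePredictionToggle(runDetection) {
  const intervalRef = useRef(null);
  const [running, setRunning] = useState(false);

  const toggle = () => {
    if (intervalRef.current) {
      clearInterval(intervalRef.current); // stop predicting
      intervalRef.current = null;
      setRunning(false);
    } else {
      intervalRef.current = setInterval(runDetection, 100); // ~10 fps
      setRunning(true);
    }
  };

  return { running, toggle };
}
```

The button then just renders `{running ? "Stop" : "Start"}` and calls `toggle` on click.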

How we built it

The app was built with React.js, the pre-trained handpose model from the TensorFlow.js ecosystem, and the fingerpose gesture classifier that runs on top of its output.
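Wiring the pieces together amounts to loading the pre-trained model once at startup (a sketch; the react-webcam package for capturing the feed is an assumption):

```js
// npm install @tensorflow/tfjs @tensorflow-models/handpose fingerpose react-webcam
import * as tf from "@tensorflow/tfjs"; // registers the WebGL backend
import * as handpose from "@tensorflow-models/handpose";

// Downloads the pre-trained weights on first call; fingerpose needs
// no loading step, since it works purely on handpose's output.
async function loadModel() {
  return handpose.load();
}
```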

Challenges we ran into

1) Aligning the hand mesh drawn in the canvas element's 2D context with the actual hand in the webcam feed (see the sketch below). 2) Since the project was new and exploratory, the uncertainty of hitting project milestones within the hackathon timeline.
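One likely fix for the alignment (an assumption, not something the project has verified yet) is to force the canvas to the video's intrinsic resolution and mirror it to match the selfie view:

```js
// Landmark coordinates are in video pixels, so the overlay canvas must
// use the video's intrinsic resolution, not its CSS display size.
function alignCanvas(video, canvas) {
  canvas.width = video.videoWidth;
  canvas.height = video.videoHeight;

  const ctx = canvas.getContext("2d");
  // If the video element is mirrored (typical selfie view), mirror the
  // canvas the same way so the mesh lands on the hand.
  ctx.translate(canvas.width, 0);
  ctx.scale(-1, 1);
}
```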

Accomplishments that we're proud of

1) Working through YouTube tutorials and the React and TensorFlow.js documentation to get hand pose detection working at a basic level. 2) Coming up with a project idea that fits the hackathon's accessibility theme, with features that could provide digital solutions to real-world problems.

What we learned

1) How accessibility work can open up the digital space to more people, especially with more aspects of our lives moving online at an accelerated pace since the COVID-19 pandemic. 2) The intricacies of working with pre-trained ML models, and the tools available to build a web-based solution relatively quickly through streamlined integrations. 3) How to combine multiple models, feeding the inference from one into the next to get the final prediction needed to programmatically trigger app control flows.
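Point 3 in practice is a small pipeline: handpose produces landmarks, fingerpose classifies them, and the winning gesture name keys into an app-level handler. A sketch (the handler names are hypothetical placeholders):

```js
// Hypothetical map from gesture name to an app control-flow action.
const gestureActions = {
  thumbs_up: () => confirmCurrentStep(),
  victory: () => goToNextPage(),
};

async function runPipeline(model, video) {
  const hands = await model.estimateHands(video);      // model #1: handpose
  if (hands.length === 0) return;
  const gesture = classifyGesture(hands[0].landmarks); // model #2: fingerpose
  if (gesture) gestureActions[gesture.name]?.();       // trigger control flow
}
```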

What's next for Exploring gesture detection with the TensorFlow.js framework

The bigger goal is to leverage AI/ML and web technologies to create new human-machine interactions with simple yet useful gestures, and possibly to build integrations for text and voice assistive features in web apps. Additionally, to carry out transfer learning on new data to add new gesture controls to the platform, and to make them open source.
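True transfer learning would mean retraining on new data, but fingerpose also lets new gestures be defined declaratively from finger curls and directions, which could be a quicker first step. A sketch of a hypothetical "open palm" gesture:

```js
import * as fp from "fingerpose";

// Hypothetical new gesture: an open palm, all five fingers extended.
const openPalm = new fp.GestureDescription("open_palm");
[fp.Finger.Thumb, fp.Finger.Index, fp.Finger.Middle, fp.Finger.Ring, fp.Finger.Pinky]
  .forEach((finger) => {
    // Require no curl on each finger, with full weight (1.0).
    openPalm.addCurl(finger, fp.FingerCurl.NoCurl, 1.0);
  });

// Register it alongside the built-in gestures.
const estimator = new fp.GestureEstimator([
  fp.Gestures.ThumbsUpGesture,
  fp.Gestures.VictoryGesture,
  openPalm,
]);
```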

Built With

  • react.js
  • tensorflow.js