Inspiration

Interested in computer vision, I saw its viability in disability assistance and wanted to implement features to improve OM1, as described in this feature request: https://github.com/OpenmindAGI/OM1/issues/180.

What it does

The TurtleBot takes the camera feed from the laptop and sends the video to the VILA VLM, which describes the hand gesture it sees as text. The robot then maps that gesture description to the corresponding movement command.
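To make that last step concrete, here is a minimal sketch of the gesture-to-movement mapping, assuming a small fixed gesture vocabulary. The gesture names, the command strings, and the keyword matching are my own illustration, not OM1's actual action interface.

```python
# Hypothetical sketch of mapping VLM text output to movement commands.
# Gesture names and commands are illustrative, not OM1's real interface.

GESTURE_TO_ACTION = {
    "thumbs up": "move forward",
    "open palm": "stop",
    "point left": "turn left",
    "point right": "turn right",
}

def map_vlm_description(description: str) -> str:
    """Map the VLM's free-text gesture description to a movement command."""
    text = description.lower()
    for gesture, action in GESTURE_TO_ACTION.items():
        if gesture in text:
            return action
    return "stop"  # default to a safe action when no known gesture appears

if __name__ == "__main__":
    print(map_vlm_description("A person giving a thumbs up to the camera."))
    # -> "move forward"
```

Simple substring matching like this is tolerant of the VLM wrapping the gesture in extra narration, which is why constraining the prompt to a fixed vocabulary (see the next section) matters so much.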

How I built it

I researched the OM1 codebase and altered a JSON5 config file, using extensive prompt engineering to get accurate gesture descriptions out of the VLM.
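For illustration, a config change in this spirit might look like the sketch below. The field names and prompt text are hypothetical, not OM1's actual schema, but they show the prompt-engineering idea: restrict the VLM to a fixed gesture vocabulary so its output maps cleanly onto movement commands.

```json5
// Hypothetical sketch — field names are illustrative, not OM1's real schema.
{
  // Vision input: the laptop camera feed routed to the VILA VLM
  agent_inputs: [
    { type: "VLM_Vila" },
  ],
  // Prompt engineering: constrain the VLM to a fixed gesture vocabulary so
  // its text output can be mapped unambiguously onto movement commands
  system_prompt: "You see a video feed of a person. Describe only their hand \
gesture, answering with exactly one of: thumbs up, open palm, \
point left, point right.",
}
```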

Challenges I ran into

  • It was difficult to connect to the robot at first, and some missing packages were hard to track down during setup
  • The robot was not fully charged due to shared use
  • The robot was inconsistent in getting input data from the camera

Accomplishments that I'm proud of

  • The robot can accurately recognize hand gestures and moves in the correct direction based on them
  • Learned about robot software tools such as Zenoh
  • Learned how to navigate and implement features in large legacy codebases

What's next for Untitled

  • Hand gesture recognition even with noise in the video (e.g., multiple people in the same frame doing different things)
  • Full body gesture recognition
