Inspiration
We spent many hours brainstorming concepts, exploring possible implementations, and assessing the impact of each. Many of our initial ideas proved infeasible, low-impact, or unoriginal. While brainstorming limitations that people encounter, our thoughts turned to physical constraints, specifically those that prevent individuals from using a computer. After some research, we found that many people cannot use a computer because of such limitations, so we started working in that direction. This avenue appealed to us greatly: it was a solution we could envision not only using ourselves but also one that could greatly benefit others!
What it does
Visual Assistant enables individuals with physical or cognitive impairments to navigate and interact with a computer using visual gestures instead of traditional mouse input. By recognizing these gestures, the system translates them into corresponding computer actions, enhancing accessibility and usability.
How we built it
Visual Assistant was developed using Mediapipe, a framework for detecting and tracking hand landmarks in real time. Our primary task was mapping a range of hand gestures to corresponding mouse and keyboard operations, ensuring smooth interaction with the computer.
Our methodology centered on an algorithmic approach to gesture detection: we defined a set of criteria for recognizing gestures based on the positions of key hand landmarks. This streamlined approach gave us highly customizable gesture recognition without extensive model training or neural network architectures, enabling fast development and iteration.
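Criteria of this kind can be sketched as simple geometric predicates over the 21 hand landmarks that Mediapipe Hands reports per frame. The landmark indices below follow Mediapipe's official landmark map, but the gesture names, thresholds, and helper functions are illustrative assumptions, not the project's actual code.

```python
import math

# Mediapipe Hands returns 21 (x, y) landmarks per hand, normalized to [0, 1].
# Indices follow the official landmark map: 4 = thumb tip, 8 = index fingertip,
# 6 = index PIP joint.
THUMB_TIP, INDEX_TIP, INDEX_PIP = 4, 8, 6

def distance(a, b):
    """Euclidean distance between two (x, y) landmark points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def is_pinch(landmarks, threshold=0.05):
    """Hypothetical 'click' criterion: thumb tip close to index fingertip.

    Coordinates are normalized, so the threshold is a fraction of the frame
    size; 0.05 is an illustrative value, not a tuned one.
    """
    return distance(landmarks[THUMB_TIP], landmarks[INDEX_TIP]) < threshold

def is_index_extended(landmarks):
    """Hypothetical 'point' criterion: index fingertip above its PIP joint.

    In image coordinates y grows downward, so a smaller y means higher up.
    """
    return landmarks[INDEX_TIP][1] < landmarks[INDEX_PIP][1]
```

A gesture is then just a conjunction of such predicates evaluated once per frame, which is what makes this approach easy to customize: adding a gesture means writing one more small function rather than retraining a model.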
In developing Visual Assistant, we delegated tasks according to each team member's strengths, which fostered collaboration and steady progress. Through collective research and iterative adjustments, we kept the project to a high standard of quality and functionality.
Challenges we ran into
Gesture Data Overfitting: Early on, our gesture criteria were prone to overfitting, matching specific hand poses too tightly and producing spurious detections. To mitigate this, we introduced an additional confidence-value parameter to the detector. While this adjustment alleviated much of the problem, the solution is not yet fully optimized, and we intend to refine it in future iterations. For the purposes of our demonstration, however, the current approach works reliably and provides a solid foundation for showcasing the system's capabilities.
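The confidence-value idea can be illustrated with a small accumulator that only fires a gesture once it has been detected for several consecutive frames. The class name, thresholds, and decay scheme here are our own illustrative choices, not the project's actual implementation.

```python
class GestureConfidence:
    """Debounce raw per-frame detections with a simple confidence score.

    The score rises while the gesture is seen and decays when it is not;
    the gesture only fires once the score crosses a threshold, which
    suppresses one-frame false positives.
    """

    def __init__(self, threshold=3, decay=1):
        self.threshold = threshold  # frames of evidence needed to fire
        self.decay = decay          # how fast confidence fades when unseen
        self.score = 0

    def update(self, detected):
        """Feed one frame's raw detection; return True once confident."""
        if detected:
            self.score += 1
        else:
            self.score = max(0, self.score - self.decay)
        return self.score >= self.threshold
```

With a threshold of 3, a single noisy frame never triggers an action, while a gesture held for a fraction of a second still registers promptly.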
GUI Implementation: Another hurdle was managing two concurrent infinite loops, one for gesture tracking and one for the GUI. Threading initially seemed to resolve this, but tkinter is not thread-safe and expects to run on the main thread, which led to occasional disruptions. We ultimately integrated the tracking loop into the GUI's own event loop, sidestepping the threading complications entirely. This adjustment streamlined the code and ensured smooth functionality.
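One standard way to merge the two loops is tkinter's `after()` scheduler, which re-queues the tracking step on the GUI's own event loop so everything stays on the main thread. The frame-processing stub and interval below are placeholders for the real capture-and-recognition code.

```python
import tkinter as tk

FRAME_INTERVAL_MS = 16  # roughly 60 updates per second (illustrative)

def process_frame():
    """Placeholder for one camera-capture + gesture-recognition step."""
    return "no gesture"

def make_loop(root, label):
    """Build a callback that runs one tracking step, then re-schedules itself."""
    def loop():
        label.config(text=process_frame())   # one tracking step
        root.after(FRAME_INTERVAL_MS, loop)  # re-queue on the GUI event loop
    return loop

if __name__ == "__main__":
    root = tk.Tk()
    label = tk.Label(root, text="starting...")
    label.pack()
    make_loop(root, label)()  # kick off the combined loop
    root.mainloop()           # tkinter stays on the main thread
```

Because `after()` schedules the next tracking step as an ordinary GUI event, there is no second thread to coordinate and no risk of touching tkinter widgets from outside the main thread.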
Accomplishments that we're proud of
Despite the challenges, we're proud to have successfully developed a functional and accessible solution that addresses the needs of individuals with disabilities. Our system's ability to empower users to interact with computers using intuitive gestures represents a significant achievement in enhancing digital accessibility. Moreover, the customizable nature of our gesture recognition system and its potential to adapt to users' unique needs are accomplishments that we find particularly rewarding.
What we learned
Our journey with Visual Assistant taught us valuable skills, including working with Mediapipe, processing and storing hand landmark data, and executing input actions programmatically. On the development side, we improved at writing scalable code that is easy to maintain and extend; the codebase was designed to be modular and easy to understand, which allowed us to work on different parts of the project simultaneously.
What's next for Visual Assistant
Our future plans for Visual Assistant involve enhancing gesture recognition, refining or redesigning mouse control, and introducing customization options for gestures. We are committed to continuous improvement and expanding the capabilities of our system.