ASL Gemini Application

Inspiration

The ASL Gemini App grew out of a desire to build something that, as far as we knew, had not been attempted before: using AI to teach American Sign Language (ASL) through interactive gameplay. The novelty of the idea kept the project moving throughout development.

What it does

The ASL Gemini App teaches ASL through interactive gameplay and AI-powered feedback. In each session, users are prompted to sign ASL words or phrases in front of their device's camera. The app analyzes how accurately the user signed and returns real-time feedback and guidance generated with Gemini.
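
As a rough illustration of the feedback step, the sketch below compares the letters recognized from the camera with the prompted word and assembles a text prompt that could be passed to Gemini. The function names, the position-by-position scoring, and the prompt wording are all assumptions made for this example; the app's actual scoring logic and prompt are not described on this page, and the call to Gemini itself is left out.

```python
from typing import List, Optional, Tuple

# (position, expected letter, recognized letter or None if nothing was signed)
Mistake = Tuple[int, str, Optional[str]]


def compare_attempt(target: str, recognized: List[str]) -> Tuple[float, List[Mistake]]:
    """Compare recognized letters with the target word, position by position."""
    expected = [c for c in target.upper() if c.isalpha()]
    mistakes: List[Mistake] = []
    for i, letter in enumerate(expected):
        got = recognized[i] if i < len(recognized) else None
        if got != letter:
            mistakes.append((i, letter, got))
    accuracy = 1 - len(mistakes) / len(expected) if expected else 0.0
    return accuracy, mistakes


def build_feedback_prompt(target: str, accuracy: float, mistakes: List[Mistake]) -> str:
    """Build a plain-text prompt (hypothetical wording) describing the attempt."""
    lines = [
        f'The learner tried to fingerspell "{target.upper()}" in ASL.',
        f"Position-by-position accuracy: {accuracy:.0%}.",
    ]
    for i, letter, got in mistakes:
        seen = got if got is not None else "nothing"
        lines.append(f"Letter {i + 1}: expected {letter}, recognized {seen}.")
    lines.append("Give short, encouraging tips on hand shape for the missed letters.")
    return "\n".join(lines)


accuracy, mistakes = compare_attempt("cat", ["C", "A", "D"])
print(build_feedback_prompt("cat", accuracy, mistakes))
```

Keeping the scoring separate from the prompt text means the game can show a score immediately, while the Gemini response, which takes longer, arrives afterwards.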

How we built it

Development began with training an ASL model on a dataset of 26,000 images per character, which produced a model with 89% accuracy. Next, a function named asl_video was written to identify the ASL characters shown in video frames. The Gemini API was then integrated to provide AI-based feedback. The project is implemented in Python and uses the cv2 (OpenCV), mediapipe, tensorflow, and gradio libraries.
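
The per-frame recognition performed by a function like asl_video can be sketched as follows. This is not the project's actual code: the real signature and internals of asl_video are not shown on this page, so `recognize_letters`, `classify_frame`, the 26-letter label order, and the 0.6 confidence threshold are illustrative assumptions. A stub classifier stands in for the trained TensorFlow model so the example runs on its own.

```python
import string
from typing import Callable, Iterable, List, Optional, Sequence

# Assumed label set: one class per letter, in alphabetical order.
LETTERS = string.ascii_uppercase


def top_letter(probs: Sequence[float], threshold: float = 0.6) -> Optional[str]:
    """Return the most probable letter, or None if confidence is below threshold."""
    if len(probs) != len(LETTERS):
        raise ValueError(f"expected {len(LETTERS)} probabilities, got {len(probs)}")
    best = max(range(len(probs)), key=lambda i: probs[i])
    return LETTERS[best] if probs[best] >= threshold else None


def recognize_letters(
    frames: Iterable[object],
    classify_frame: Callable[[object], Sequence[float]],
    threshold: float = 0.6,
) -> List[str]:
    """Classify every frame and collapse consecutive repeats into one letter."""
    letters: List[str] = []
    previous: Optional[str] = None
    for frame in frames:
        letter = top_letter(classify_frame(frame), threshold)
        # A held sign spans many frames, so only record a letter when it changes.
        if letter is not None and letter != previous:
            letters.append(letter)
        previous = letter
    return letters


def stub_classifier(frame: object) -> List[float]:
    """Stand-in for the trained model: treats each 'frame' as the letter it shows."""
    probs = [0.1 / 25] * 26
    probs[LETTERS.index(str(frame))] = 0.9
    return probs


print(recognize_letters(["H", "H", "E", "L", "L", "O"], stub_classifier))
# ['H', 'E', 'L', 'O']
```

One limitation of collapsing repeats this way is that a doubled letter, such as the two Ls in "HELLO", is recorded once unless a low-confidence frame separates the two signs.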

Challenges we ran into

One of the major challenges was improving the accuracy of the model. Even with a substantial training dataset, reaching the desired level of accuracy proved difficult. The project was originally conceived in JavaScript but had to be moved to Python, where the essential tools were available. Integrating a frontend framework such as React also posed difficulties, which led to the adoption of gradio for the UI.

Accomplishments that we're proud of

We are proud of the ASL Gemini App's potential to bridge communication gaps and foster inclusivity. If developed into a full application, it could significantly benefit people who want to learn sign language.

What we learned

The project was a valuable learning experience, particularly in working with Gemini. How easily it could be integrated showed how well AI technologies can be applied to educational purposes.

What's next for ASL Gemini Application

Enhancing the model's accuracy to near-perfection through further training with an expanded dataset.
Enabling support for pre-recorded video files and other video streams, so that the app is not limited to a live camera feed.
Transitioning into a finalized application with features like user authentication, progress tracking, and database management, thereby making it market-ready.

Built With

dotenv
gemini
git
github
google-generativeai
gradio
markdown
mediapipe
opencv-(cv2)
python
tensorflow

Updates

SachithRKA Ranaweera started this project — May 02, 2024 09:36 PM EDT
