Inspiration

Our project was inspired by the need to create a more inclusive communication experience for individuals who are hard of hearing and for those facing language barriers. We recognized the challenges that the deaf and hard-of-hearing community encounters in accessing spoken information in real time, especially in social settings. In today’s globalized world, seamless communication across languages is essential. By integrating advanced transcription and translation features into wearable technology like Snapchat Spectacles, we aim to empower users to engage fully in conversations, fostering understanding and connection in diverse environments.

What it does

Our project converts spoken language from Snapchat Spectacles into real-time text captions. By capturing audio from conversations, we create accessible written text for individuals who are hard of hearing. Furthermore, our translation features facilitate communication across different languages, ensuring that spoken words are accurately represented. We have enhanced the clarity and flow of the text to make it easy to read and understand. Once processed, the live captions—including translations—are displayed on the Spectacles, enriching communication and promoting inclusivity in multicultural interactions.

How we built it

To realize our vision, we built a modular pipeline that combines several technologies. We began with AWS for real-time speech-to-text transcription, which let us capture spoken language from the Spectacles efficiently. This initial transcription served as the foundation for the subsequent steps.
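Streaming speech-to-text services typically emit a stream of partial hypotheses followed by finalized segments, and the captioner has to fold those into one readable line. The sketch below illustrates that assembly logic; the event dictionaries are our own simplified shape, not the exact AWS event format.

```python
class CaptionBuffer:
    """Accumulates finalized transcript segments and tracks the live partial.

    Illustrative sketch of the partial/final result model used by streaming
    speech-to-text services; the event shape here is an assumption.
    """

    def __init__(self):
        self.finalized = []  # segments the service has committed to
        self.partial = ""    # current in-flight hypothesis, may still change

    def on_event(self, event):
        text = event["transcript"]
        if event["is_partial"]:
            self.partial = text          # partials supersede one another
        else:
            self.finalized.append(text)  # commit, then clear the live hypothesis
            self.partial = ""

    def current_caption(self):
        """Text to render on the display right now."""
        tail = [self.partial] if self.partial else []
        return " ".join(self.finalized + tail)
```

The key detail is that each partial replaces the previous one rather than appending, so the on-screen caption stays stable as the recognizer revises its guess.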

Next, we fine-tuned the Llama 3.1-8B model to improve the accuracy and coherence of the transcribed text. This involved training the model on diverse datasets, enabling it to handle both language translation and text refinement effectively. By addressing these needs, we ensured that users received clearer, more cohesive output.
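As an illustration of how a single model call can cover both refinement and translation, a prompt along these lines could be assembled per caption. The wording and the `target_lang` parameter are our own sketch, not the actual prompts or training format used:

```python
def build_refine_prompt(raw_transcript, target_lang=None):
    """Compose an instruction prompt asking the model to clean up a raw
    transcript and, optionally, translate it. Illustrative only; not the
    exact prompt used in fine-tuning."""
    task = (
        "Fix punctuation, casing, and obvious transcription errors in the "
        "caption below. Keep the meaning unchanged and the text concise."
    )
    if target_lang:
        task += f" Then translate the result into {target_lang}."
    return f"{task}\n\nCaption: {raw_transcript}\nRefined:"
```

Keeping cleanup and translation in one prompt avoids a second model round-trip per caption, which matters when the captions must keep pace with live speech.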

After processing, the refined text is sent back to the Snapchat Spectacles, where it is displayed as live captions. Throughout development, we prioritized user experience, conducting iterative testing to refine functionality and ensure real-time operation. Our collaborative efforts culminated in a solution that enhances communication while promoting inclusivity and accessibility.
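Before the refined text reaches the display, long captions have to fit a narrow heads-up view; a simple wrap-and-window step like this sketch keeps only the most recent lines on screen. The line width and line count here are assumed values for illustration, not measured Spectacles constraints:

```python
import textwrap

def caption_window(text, width=32, max_lines=2):
    """Wrap caption text to `width` characters per line and keep only the
    last `max_lines` lines, so new words scroll older ones off the display.
    `width` and `max_lines` are illustrative, not actual device limits."""
    lines = textwrap.wrap(text, width=width)
    return "\n".join(lines[-max_lines:])
```

Windowing from the end of the transcript means the viewer always sees the newest speech, which matches how live captions behave on television.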

Challenges we ran into

Throughout the development of our project, we encountered several significant challenges, primarily around extracting audio data from the Snapchat Spectacles. None of our team members had prior experience working with audio streams, which presented a steep learning curve. We had to familiarize ourselves with the technical requirements for capturing and transmitting audio effectively (big thanks to David from Snapchat!).

Accomplishments that we're proud of

We take pride in several key accomplishments achieved throughout the development of Snaption Live. One major success was learning to harness Snapchat's augmented reality (AR) capabilities, enabling us to create a seamless integration between the Spectacles and our live captioning system. This experience significantly broadened our understanding of AR technology and its potential to enhance user interactions.

Additionally, we effectively utilized Intel AI and the Llama 3.1-8B model to refine our transcription and translation processes. Navigating these advanced technologies was a major learning experience, equipping us with valuable skills in artificial intelligence and machine learning.

Moreover, we demonstrated resilience and teamwork throughout the project. Despite facing technical hurdles, we persevered to bring our vision to fruition, ultimately delivering a functional and innovative solution. Completing Snaption Live not only showcases our technical abilities but also reflects our commitment to enhancing user communication and accessibility.

What we learned

The development of Snaption Live provided us with invaluable insights into teamwork and technical skills. Collaborating effectively was crucial; we learned the importance of clear communication, role delegation, and leveraging each member's strengths to achieve our shared goal. This experience has significantly improved our ability to work together as a cohesive unit.

We also gained hands-on experience with Snapchat Spectacles within Lens Studio. This exposure allowed us to explore creative ways to integrate our live captioning system into the Spectacles, deepening our understanding of AR applications and their potential impact on user experience.

Overall, this project has not only expanded our technical capabilities but also strengthened our teamwork skills, preparing us for future collaborative endeavors.

What's next for Snaption Live

Looking ahead, we envision several enhancements for Snaption Live that will improve user experience and broaden its applicability. First, we plan to support additional languages, allowing us to cater to a more diverse audience and facilitate cross-cultural communication. By expanding our translation capabilities, we aim to further dismantle language barriers.

We also intend to refine our machine-learning model by incorporating user feedback and real-world data. This will enhance the accuracy and contextual relevance of our transcriptions and translations, ensuring that users receive the most coherent captions possible.

Additionally, we are exploring partnerships with organizations focused on accessibility to promote our solution within the deaf and hard-of-hearing community. These collaborations could lead to features that enhance user engagement, such as customizable caption styles and interactive elements.

Finally, we aim to extend Snaption Live beyond Snapchat Spectacles by integrating our technology into other wearable devices and platforms. This will help us create a versatile communication tool that fosters inclusivity and connectivity across various contexts.
