Inspiration
Our project, "Jarvis," was born out of a deep-seated desire to empower individuals with visual impairments by providing them with a groundbreaking tool for comprehending and navigating their surroundings. Our aspiration was to bridge the accessibility gap and ensure that blind individuals can fully grasp their environment. By providing the visually impaired community access to auditory descriptions of their surroundings, a personal assistant, and an understanding of non-verbal cues, we have built the world's most advanced tool for the visually impaired community.
What it does
"Jarvis" is a revolutionary technology that boasts a multifaceted array of functionalities. It not only perceives and identifies elements in the blind person's surroundings but also offers auditory descriptions, effectively narrating the environmental aspects they encounter. We utilize a speech-to-text and text-to-speech model similar to Siri / Alexa, enabling ease of access. Moreover, our model possesses the remarkable capability to recognize and interpret the facial expressions of individuals who stand in close proximity to the blind person, providing them with invaluable social cues. Furthermore, users can ask questions that may require critical reasoning, such as what to order from a menu or navigating complex public-transport-maps. Our system is extended to the Amazfit, enabling users to get a description of their surroundings or identify the people around them with a single press.
How we built it
The development of "Jarvis" was a meticulous and collaborative endeavor that involved a comprehensive array of cutting-edge technologies and methodologies. Our team harnessed state-of-the-art machine learning frameworks and sophisticated computer vision techniques to get analysis about the environment, like , Hume, LlaVa, OpenCV, a sophisticated computer vision techniques to get analysis about the environment, and used next.js to create our frontend which was established with the ZeppOS using Amazfit smartwatch.
Challenges we ran into
Throughout the development process, we encountered a host of formidable challenges. These included training a model to recognize and interpret a diverse range of environmental elements and human expressions, optimizing the model for real-time usage on the Zepp smartwatch, triggering vibrations based on the Hume emotion analysis, and integrating OCR (Optical Character Recognition) with the text-to-speech model. However, our team's relentless commitment and problem-solving skills enabled us to surmount these challenges.
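As one illustration of the OCR-to-speech hand-off, the minimal sketch below uses pytesseract and pyttsx3 as stand-in OCR and text-to-speech components; the library choice here is illustrative rather than a description of our final stack.

```python
import pytesseract
import pyttsx3
from PIL import Image

def read_text_aloud(image_path: str) -> str:
    """OCR an image and speak the recognized text."""
    text = pytesseract.image_to_string(Image.open(image_path)).strip()
    if not text:
        text = "No readable text was found."
    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()
    return text

# Example: read a restaurant menu aloud.
# read_text_aloud("menu.jpg")
```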
Accomplishments that we're proud of
Our proudest achievements in the course of this project encompass several remarkable milestones. These include the successful development of "Jarvis," a system that can audibly describe complex environments to blind individuals, thus enhancing their situational awareness. Furthermore, our model's ability to discern and interpret human facial expressions stands as a noteworthy accomplishment.
What we learned
Hume
Hume is instrumental to our project's emotion analysis. By capturing and analyzing facial expressions, our system can provide feedback on the emotions displayed by individuals in the user's vicinity. This information is then translated into audio descriptions and vibrations on the Amazfit smartwatch, giving users valuable insight into their surroundings. This feature is particularly beneficial in social interactions, as it helps users understand non-verbal cues.
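The snippet below sketches how a set of emotion scores can be turned into a short spoken cue. The list of name/score pairs is an assumed, simplified shape for the payload coming back from Hume, and `social_cue` is a hypothetical helper.

```python
def top_emotion(scores):
    """Pick the highest-scoring emotion from a list of {"name": ..., "score": ...} pairs."""
    return max(scores, key=lambda e: e["score"])

def social_cue(scores) -> str:
    """Turn the dominant emotion into a one-line audio description."""
    emotion = top_emotion(scores)
    return f"The person in front of you appears to be feeling {emotion['name'].lower()}."

# Example with a hypothetical, simplified Hume response:
sample = [{"name": "Joy", "score": 0.81}, {"name": "Surprise", "score": 0.42}]
print(social_cue(sample))  # -> "The person in front of you appears to be feeling joy."
```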
Zepp
Our project involved a deep dive into the capabilities of ZeppOS, and we successfully integrated the Amazfit smartwatch into our web application. This integration is not just a technical achievement; it has far-reaching implications for the visually impaired. With this technology, we've created a user-friendly application that provides an in-depth understanding of the user's surroundings, significantly enhancing their daily experiences. Vibrations on the watch notify visually impaired users of detected events, and the intensity of the vibration is proportional to the intensity of the emotion measured through Hume.
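A minimal sketch of that mapping, under our assumptions: the dominant emotion score is scaled into a vibration intensity and pushed to a bridge service that relays it to the watch. The `/api/vibrate` endpoint and the 0-100 intensity range are placeholders; the ZeppOS side is handled by the watch app itself.

```python
import requests

WATCH_SERVICE = "http://localhost:3000/api/vibrate"  # hypothetical bridge to the ZeppOS watch app

def emotion_to_vibration(score: float, max_intensity: int = 100) -> int:
    """Scale an emotion score in [0, 1] to a vibration intensity in [0, max_intensity]."""
    score = min(max(score, 0.0), 1.0)
    return round(score * max_intensity)

def notify_watch(emotion_name: str, score: float) -> None:
    """Send the vibration intensity for the dominant emotion to the watch bridge."""
    intensity = emotion_to_vibration(score)
    requests.post(WATCH_SERVICE, json={"emotion": emotion_name, "intensity": intensity}, timeout=5)

# Example: a strong "Joy" reading produces a strong vibration.
notify_watch("Joy", 0.81)  # posts {"emotion": "Joy", "intensity": 81}
```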
Zilliz
We used Zilliz to host Milvus online and stored a dataset of images together with their vector embeddings. Each image was labeled with the person it depicts, which let us build an identity-classification tool on top of Zilliz's reverse-image-search capability. We also set a minimum similarity threshold: when the closest match falls below it (i.e., the person's data is not in Zilliz), no identity is reported. We estimate the accuracy of this model to be around 95%.
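The sketch below shows the rough shape of that lookup with pymilvus against a Zilliz-hosted collection. The collection name, field names, and the cutoff (expressed here as a maximum L2 distance rather than a minimum similarity) are placeholders for illustration.

```python
from pymilvus import connections, Collection

# Connect to the Zilliz-hosted Milvus instance (URI and token are placeholders).
connections.connect(alias="default", uri="https://YOUR-ZILLIZ-ENDPOINT", token="YOUR-API-KEY")
people = Collection("face_embeddings")  # hypothetical collection of labeled face vectors
people.load()

DISTANCE_THRESHOLD = 0.35  # placeholder cutoff; larger distances mean "unknown person"

def identify(embedding):
    """Reverse-image search: return the closest known identity, or None if no match is close enough."""
    results = people.search(
        data=[embedding],
        anns_field="embedding",
        param={"metric_type": "L2", "params": {"nprobe": 10}},
        limit=1,
        output_fields=["person_name"],
    )
    hit = results[0][0] if results and len(results[0]) else None
    if hit is None or hit.distance > DISTANCE_THRESHOLD:
        return None  # not in the database; identity is not reported
    return hit.entity.get("person_name")
```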
Github
We acquired a comprehensive understanding of version control with Git and set up a GitHub organization. Within it, we assigned specific tasks labeled "TODO" to each team member, and used Git to facilitate team discussions, manage workflows, and flag issues in each other's contributions.
The overall development of "Jarvis" has been a rich learning experience for our team. We have acquired a deep understanding of cutting-edge machine learning, computer vision, and speech synthesis techniques. Moreover, we have gained invaluable insights into the complexities of real-world application, particularly when adapting technology for wearable devices. This project has not only broadened our technical knowledge but has also instilled in us a profound sense of empathy and a commitment to enhancing the lives of visually impaired individuals.
What's next for Jarvis
The future holds exciting prospects for "Jarvis." We envision continuous development and refinement of our model, with a focus on expanding its capabilities to provide even more comprehensive environmental descriptions. In the pipeline are plans to extend its compatibility to a wider range of wearable devices, ensuring its accessibility to a broader audience. Additionally, we are exploring opportunities for collaboration with organizations dedicated to the betterment of accessibility technology. The journey ahead involves further advancements in assistive technology and greater empowerment for individuals with visual impairments.