Inspiration
In a constantly evolving world where humans have to adapt to technological developments, it can sometimes be challenging to foster relationships, make friends, and form lasting connections. For certain groups of people like the visually impaired, new technologies are often inaccessible due to the need to see a screen. Bhuvan, a childhood friend of one of Echo's co-founders, falls under this category. Bhuvan was born able to see as well as any other kid. However, his vision gradually faded due to macular degeneration, an eye disease leading to a loss of vision. Bhuvan's life drastically changed during his time in high school; once a generally outgoing and social guy, Bhuvan started to feel more and more isolated from his friends. Without vision, he and his friends could no longer share the same experiences.
Bhuvan's story has been our inspiration for this project and we believe our product can help the blind and visually impaired feel more connected to the people around them. With more accessible and inclusive technologies like Echo, we hope to remove any barriers to technology and help individuals experiencing isolation like Bhuvan's.
What it does
Echo is the world's freshest audio-based social media platform, accessible to the blind and visually impaired. Echo keeps the traditional components of social media while reimagining the way people communicate online. Posts, comments, and even usernames are all in audio form, allowing users to browse through posts without needing to see the screen. Our design also allows any user to easily navigate the app using voice commands and audio cues.
How we built it
The frontend and backend were built on Next.js. We maintained two WebSocket connections at the same time: one to handle real-time speech-to-text (STT) with Deepgram, and one for function calling with a Mistral 7B Instruct model fine-tuned for our use case. We also stored all Posts, Users, and Comments in AWS DynamoDB, with all audio files publicly available on S3.
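As a rough illustration of the storage layout described above, here is a minimal sketch of what the records and audio links could look like. The type names, field names, bucket, and region are all hypothetical and not Echo's actual schema; the only fixed detail is the standard virtual-hosted-style URL format for a publicly readable S3 object.

```typescript
// Hypothetical record shapes: every field name here is an assumption,
// not the schema Echo actually used in DynamoDB.
interface EchoUser {
  userId: string;
  usernameAudioKey: string; // S3 key of the spoken username clip
}

interface AudioPost {
  postId: string;
  authorId: string;
  audioKey: string; // S3 key of the post's audio file
  createdAt: string; // ISO 8601 UTC timestamp
}

interface AudioComment {
  commentId: string;
  postId: string;
  authorId: string;
  audioKey: string;
  createdAt: string;
}

// Build the virtual-hosted-style URL of a publicly readable S3 object.
// Each path segment is percent-encoded so keys with spaces still resolve.
function publicAudioUrl(bucket: string, region: string, key: string): string {
  const encodedKey = key.split("/").map(encodeURIComponent).join("/");
  return `https://${bucket}.s3.${region}.amazonaws.com/${encodedKey}`;
}

// Return a copy of the posts ordered newest first. ISO 8601 UTC timestamps
// in the same format compare correctly as plain strings.
function newestFirst(posts: AudioPost[]): AudioPost[] {
  return [...posts].sort((a, b) =>
    a.createdAt < b.createdAt ? 1 : a.createdAt > b.createdAt ? -1 : 0
  );
}
```

A client could then call `publicAudioUrl` on each post's `audioKey` and play the clips back in feed order.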
Challenges we ran into
Our main challenges were working with real-time voice recognition and fine-tuning an LLM for function calling, specifically to navigate the app.
Even though we were able to fine-tune a model on the Intel Developer Cloud (IDC), we weren't able to expose an endpoint for it, which was crucial.
To process voice commands, we initially wanted to deploy our own fine-tuned LLM, but we were not able to get a server to host it on. Additionally, training was difficult due to a lack of GPUs available on IDC. Our fix was to use the OpenAI API to make calls with a custom prompt.
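To give a feel for the custom-prompt approach, here is a minimal sketch of how a transcript could be turned into a prompt and how the model's reply could be validated before the app acts on it. The action names, prompt wording, and function names are hypothetical rather than the ones we actually used, and the network call to the model is left out entirely.

```typescript
// Hypothetical set of navigation actions; Echo's real command list may differ.
const ACTIONS = ["next_post", "previous_post", "play_comments", "record_post"] as const;

type NavAction = (typeof ACTIONS)[number] | "unknown";

interface NavCommand {
  action: NavAction;
}

// Build the text prompt sent to the model for one spoken command.
// JSON.stringify quotes the transcript so stray quotes cannot break the prompt.
function buildPrompt(transcript: string): string {
  return [
    "You control navigation for an audio-only social app.",
    'Reply with JSON only, shaped like {"action": "<name>"}.',
    `Allowed names: ${ACTIONS.join(", ")}.`,
    `User said: ${JSON.stringify(transcript)}`,
  ].join("\n");
}

// Parse the model's reply, falling back to "unknown" for anything that is
// not valid JSON or names an action outside the allowed list.
function parseCommand(reply: string): NavCommand {
  try {
    const parsed: unknown = JSON.parse(reply);
    if (typeof parsed === "object" && parsed !== null && "action" in parsed) {
      const action = (parsed as { action: unknown }).action;
      const match = ACTIONS.find((a) => a === action);
      if (match) {
        return { action: match };
      }
    }
  } catch {
    // Malformed JSON: fall through to the "unknown" result below.
  }
  return { action: "unknown" };
}
```

Treating every unexpected reply as `unknown` means the app can simply ask the user to repeat the command instead of acting on a bad guess.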
WORKING WITH AUDIO ON BROWSER SUCKS
Accomplishments that we're proud of
We are most proud of our product ideation - our biggest accomplishment is streamlining our technical abilities and experience in a way that aims to make the world a better place.
Setting up and fine-tuning a model with IDC took a while and had its challenges, but we were able to fine-tune a model for navigating our social media web app purely through voice. Though we weren't able to host the model and expose an endpoint for use in the web app, it did provide immense experience with AI training in the cloud.
What we learned
We learned that voice technology is currently very easy to work with, and that using IDC shortens the fine-tuning needed to make LLMs that perform actions.
What's next for ECHO
Currently we are only able to post within Echo, but eventually we want to integrate with Facebook, Instagram, Twitter, and any other social media so that we can make any platform accessible to the visually impaired and blind.
Thank you Bhuvan!!!