Inspiration
I'm a chronic coder, but my typing skills are not excellent. With all the models that now exist, I wanted to make coding by speech not just possible but something that works well.
What it does
It combines video, audio, and HTML context to better understand which part of the site the user wants to change.
How we built it
I used APIs to transcribe the audio and timestamp each moment the word 'this' is said. I then use those timestamps to retrieve a screenshot from the video and the HTML data at that point.
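As a rough illustration of the timestamp step, here is a minimal sketch in Python. It is not the project's actual code: it assumes the transcription API returns word-level entries shaped like {"word": ..., "start": ...} with times in seconds, and that HTML snapshots were saved as (timestamp, html) pairs sorted by time. The function names find_keyword_times, frame_index, and latest_snapshot are hypothetical.

```python
import bisect
import string


def find_keyword_times(words, keyword="this"):
    """Return the start time (in seconds) of every occurrence of keyword.

    `words` is an assumed transcript shape: a list of dicts with
    "word" and "start" keys. Real transcription APIs may differ.
    """
    times = []
    for w in words:
        # Normalize: drop surrounding whitespace and punctuation, ignore case.
        token = w["word"].strip().strip(string.punctuation).lower()
        if token == keyword:
            times.append(w["start"])
    return times


def frame_index(t, fps):
    """Map a timestamp in seconds to the nearest video frame number."""
    return round(t * fps)


def latest_snapshot(snapshots, t):
    """Return the HTML of the latest snapshot taken at or before time t.

    `snapshots` is an assumed list of (timestamp, html) tuples sorted by
    timestamp. Returns None if no snapshot exists at or before t.
    """
    stamps = [s[0] for s in snapshots]
    i = bisect.bisect_right(stamps, t)
    return snapshots[i - 1][1] if i > 0 else None


if __name__ == "__main__":
    transcript = [
        {"word": "Make", "start": 0.0},
        {"word": "this,", "start": 0.5},
        {"word": "bigger", "start": 0.9},
        {"word": "This", "start": 2.5},
    ]
    snapshots = [(0.0, "<p>a</p>"), (2.0, "<p>b</p>")]
    for t in find_keyword_times(transcript):
        print(t, frame_index(t, fps=30), latest_snapshot(snapshots, t))
```

The frame number can then be passed to whatever video library is in use to grab the screenshot, and the matching HTML snapshot gives the page state at the moment the user said 'this'.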
Challenges we ran into
I wasn't able to get this working in real time, so I pivoted to prerecorded video and analyzed it after the fact to get the context needed.
Accomplishments that we're proud of
I was able to identify which element the user was referring to when they said 'this'.
What we learned
I learned how to integrate and manage multiple input modalities and use them together to complete a goal.
What's next for Clarify This
Getting it to work in real time instead of relying on prerecorded videos.