A brief recap of the inspiration for Presentalk 1.0: we wanted to make presentations easier to navigate. Handheld clickers are useful for moving to the next or previous slide, but they cannot skip to a specific slide. We also wanted to make it easier to pull up additional information such as maps, charts, and pictures without breaking the visual continuity of the presentation. To do that, we added the ability to search for and display images using voice commands, without leaving the presentation.
Last year we finished our prototype, but it was a hacky, messy implementation of Presentalk. Despite our code's problems, we heard positive feedback after the event, so we resolved to come back this year and turn the product into something we could actually host online and let everyone use.
## What it does
Presentalk solves this problem with voice commands that allow you to move forward and back, skip to specific slides and keywords, and go to specific images in your presentation using image recognition. Presentalk recognizes voice commands, including:
- Next Slide
  - Goes to the next slide.
- Last Slide
  - Goes to the previous slide.
- Go to Slide 3
  - Goes to the third slide.
- Go to the slide with the dog
  - Uses Google Cloud Vision to label each slide's images and takes you to the slide it thinks contains a dog.
- Go to the slide titled APIs
  - Goes to the first slide with "APIs" in its title.
- Search for "voice recognition"
  - Searches the text of each slide for a matching phrase and goes to that slide.
- Show me a picture of UC Berkeley
  - Uses Bing Image Search to fetch the first image result for UC Berkeley.
- Zoom in on the Graph
  - Uses Google Cloud Vision to identify an object and, if it matches the query, zooms in on that object.
- Tell me the product of 857 and 458
  - Uses Wolfram Alpha's Short Answers API to answer computation- and knowledge-based questions.
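To make the command list concrete, here is a minimal Python sketch of how a recognized transcript could be mapped to an action and then to a slide. It is not our actual implementation: the names `Command`, `parse_command`, and `resolve_slide`, the action labels, and the slide dictionaries with `title` and `text` keys are all hypothetical. It also assumes the speech recognizer returns digits ("3") rather than words ("three").

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Command:
    action: str
    argument: Optional[str] = None


# Hypothetical phrase patterns, checked in order (assumption: not the real grammar).
_PATTERNS = [
    (re.compile(r"^next slide$"), "next"),
    (re.compile(r"^last slide$"), "previous"),
    (re.compile(r"^go to slide (\d+)$"), "goto_number"),
    (re.compile(r"^go to the slide titled (.+)$"), "goto_title"),
    (re.compile(r"^go to the slide with (?:the |a |an )?(.+)$"), "goto_image"),
    (re.compile(r"^search for (.+)$"), "search_text"),
    (re.compile(r"^show me a picture of (.+)$"), "image_search"),
    (re.compile(r"^zoom in on (?:the )?(.+)$"), "zoom"),
    (re.compile(r"^tell me (.+)$"), "question"),
]


def parse_command(transcript: str) -> Optional[Command]:
    """Turn a speech transcript into a Command, or None if nothing matches."""
    # Normalize case, whitespace, and trailing punctuation before matching.
    text = re.sub(r"\s+", " ", transcript.strip().lower()).rstrip(".!?")
    for pattern, action in _PATTERNS:
        match = pattern.match(text)
        if match:
            # Patterns without a capture group carry no argument.
            arg = match.group(1).strip(" \"'") if match.lastindex else None
            return Command(action, arg)
    return None


def resolve_slide(cmd: Command, slides: list, current: int) -> int:
    """Return the 0-based index of the slide to show; stay put if nothing matches."""
    if cmd.action == "next":
        return min(current + 1, len(slides) - 1)
    if cmd.action == "previous":
        return max(current - 1, 0)
    if cmd.action == "goto_number":
        index = int(cmd.argument) - 1  # spoken slide numbers are 1-based
        return index if 0 <= index < len(slides) else current
    if cmd.action in ("goto_title", "search_text"):
        field = "title" if cmd.action == "goto_title" else "text"
        for i, slide in enumerate(slides):
            if cmd.argument in slide[field].lower():
                return i  # first matching slide wins
    return current
```

Commands that need an external service (image labels, image search, zooming, Wolfram Alpha) are only parsed here; `resolve_slide` leaves the current slide unchanged for them, and the calls to those services are deliberately left out of the sketch.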
## How we built it
- Built a backend in Python that connects to our voice recognition; all of our other features are built on top of it.
## Challenges we ran into
- Accepting microphone input through Google Chrome (people can have different security settings)
- Refactoring the entire messy, undocumented codebase from last year
## Accomplishments that we're proud of
Taking Presentalk from a weekend pet project to something that could actually scale to many users on a server, in yet another weekend.
## What we learned
- Sometimes the best APIs are hidden right under your nose. (Web Speech API was released in 2013 and we didn't use it last year. It's awesome!)
- Refactoring code you don't really remember is difficult.
## What's next for Presentalk
Release to the general public! (Hopefully)