Inspiration
Anyone who listens to podcasts knows the producer who sits off screen and pulls things up for the host.
But hiring a dedicated Googler is expensive overkill. We created a solution for smaller channels looking for a similar experience.
What it does
Pull That Up listens to your conversation and displays supplemental media and information to support the discussion. If you mention a YouTube video, it displays that video, which you can select to watch. Misinformation spreading on podcasts is a widely discussed problem, so Pull That Up provides a fact-check whenever a contentious statement is made. It also looks up relevant articles and search results, and displays images of any product or public figure you mention.
Additionally, you can provide Pull That Up with a trigger word which it’s always listening for, enabling hands-free control. The system could also be seen as a novel accessibility tool for hands-free internet navigation.
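As an illustration of the trigger-word idea, here is a minimal sketch of how a command could be pulled out of an unpunctuated transcript. The function name `extractCommand` and the example trigger word are hypothetical and not taken from our codebase; it assumes the transcript arrives as plain text.

```typescript
// Hypothetical helper (not the project's actual code): returns the words
// spoken after the trigger word, or null if the trigger word was not heard.
function extractCommand(transcript: string, triggerWord: string): string | null {
  // Normalize: lowercase and drop anything that is not a letter, digit,
  // apostrophe, or whitespace, then split into words.
  const normalize = (text: string): string[] =>
    text
      .toLowerCase()
      .replace(/[^a-z0-9\s']/g, " ")
      .split(/\s+/)
      .filter((word) => word.length > 0);

  const words = normalize(transcript);
  const triggerWords = normalize(triggerWord);
  if (triggerWords.length === 0) return null;

  // Slide over the transcript looking for the (possibly multi-word) trigger.
  for (let i = 0; i + triggerWords.length <= words.length; i++) {
    const matches = triggerWords.every((word, j) => words[i + j] === word);
    if (matches) {
      return words.slice(i + triggerWords.length).join(" ");
    }
  }
  return null;
}
```

Matching on whole words rather than substrings avoids firing on a longer word that happens to contain the trigger word.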
How we built it
We used Vercel, Chrome's built-in speech-to-text API, standard LLMs, and a handful of search engine APIs to create a prompt chain.
Challenges we ran into
One of the main weaknesses of our current implementation is the speech-to-text. We're currently using Chrome's built-in solution, which doesn't include punctuation. Ideally, we would send transcriptions to our generative pipeline as complete sentences, not just groups of words. Because we can't, statements sometimes get broken in half, and the intent or sentiment is lost as a result. In the future, we would also need to track separate speakers, which isn't part of our demo.
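One possible mitigation, sketched below, is to group fragments into larger chunks before sending them on, treating a long pause as a stand-in for a sentence break. This is not what the demo does; the class name `UtteranceBuffer`, the pause threshold, and the word cap are all assumptions chosen for illustration.

```typescript
// Hypothetical buffer (an assumption, not the demo's code): groups
// unpunctuated transcript fragments into chunks, flushing when the speaker
// pauses long enough or when the chunk grows past a word cap.
class UtteranceBuffer {
  private fragments: string[] = [];
  private lastFragmentAt: number | null = null;
  private readonly pauseMs: number;
  private readonly maxWords: number;

  constructor(pauseMs: number = 1200, maxWords: number = 40) {
    this.pauseMs = pauseMs;   // assumed pause threshold in milliseconds
    this.maxWords = maxWords; // assumed upper bound on chunk size
  }

  // Adds a fragment heard at `timestampMs` and returns any chunks that
  // became complete as a result.
  push(fragment: string, timestampMs: number): string[] {
    const completed: string[] = [];

    // A long enough gap since the previous fragment closes the current chunk.
    if (this.lastFragmentAt !== null && timestampMs - this.lastFragmentAt >= this.pauseMs) {
      const chunk = this.flush();
      if (chunk !== null) completed.push(chunk);
    }

    const text = fragment.trim();
    if (text.length > 0) {
      this.fragments.push(text);
      this.lastFragmentAt = timestampMs;
    }

    // Do not let a chunk grow without bound if the speaker never pauses.
    if (this.wordCount() >= this.maxWords) {
      const chunk = this.flush();
      if (chunk !== null) completed.push(chunk);
    }
    return completed;
  }

  // Returns whatever is buffered as one chunk, or null if nothing is buffered.
  flush(): string | null {
    if (this.fragments.length === 0) return null;
    const chunk = this.fragments.join(" ");
    this.fragments = [];
    this.lastFragmentAt = null;
    return chunk;
  }

  private wordCount(): number {
    return this.fragments.join(" ").split(/\s+/).filter((w) => w.length > 0).length;
  }
}
```

In the browser, each final result from the speech recognizer could be passed to `push` along with the current time. Pause-based chunking is only a heuristic; it would not recover real punctuation or distinguish speakers.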
Accomplishments that we're proud of
We feel like we were able to leverage generative tools to get a lot done in less than 36 hours!
What we learned
It's important to break requests down into smaller components so that the temperature can be tuned appropriately for each response.
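To make the idea concrete, here is a minimal sketch of a prompt chain in which every step carries its own temperature. The step names, prompts, temperature values, and the `Completer` function type are hypothetical; `Completer` stands in for whichever LLM client is used and is not a real library API.

```typescript
// Hypothetical stand-in for an LLM client call (an assumption, not a real API).
type Completer = (prompt: string, temperature: number) => Promise<string>;

interface ChainStep {
  name: string;
  temperature: number; // tuned per step rather than once for the whole chain
  buildPrompt: (input: string) => string;
}

// Illustrative steps only: extraction wants determinism, summaries can be looser.
const exampleSteps: ChainStep[] = [
  {
    name: "extract-claim",
    temperature: 0,
    buildPrompt: (input) => `Extract the main factual claim from: ${input}`,
  },
  {
    name: "build-search-query",
    temperature: 0.2,
    buildPrompt: (input) => `Write a short web search query to check: ${input}`,
  },
  {
    name: "summarize-for-display",
    temperature: 0.7,
    buildPrompt: (input) => `Summarize for an on-screen card: ${input}`,
  },
];

// Runs the steps in order, feeding each step's output into the next prompt.
async function runChain(
  input: string,
  steps: ChainStep[],
  complete: Completer,
): Promise<string> {
  let current = input;
  for (const step of steps) {
    current = await complete(step.buildPrompt(current), step.temperature);
  }
  return current;
}
```

Passing the completer in as a parameter keeps the chain independent of any one provider and makes it easy to exercise with a fake during testing.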
What's next for Pull That Up
Many additional features could be added. The system could take the entirety of the conversation and produce timestamped annotations, references, and affiliate links, which could be valuable to listeners. There's a lot of improvement to be made to the heart of the system as well.
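As a rough sketch of what timestamped annotations might look like, the snippet below turns a list of lookups into show notes. Nothing here exists in the demo; the `Annotation` shape and both function names are assumptions made for illustration.

```typescript
// Hypothetical annotation shape (an assumption): one entry per lookup made
// during the conversation.
interface Annotation {
  offsetSeconds: number; // seconds from the start of the recording
  note: string;
  url?: string;          // optional reference or affiliate link
}

// Formats seconds as m:ss, or h:mm:ss once the recording passes an hour.
function formatTimestamp(totalSeconds: number): string {
  const whole = Math.max(0, Math.floor(totalSeconds));
  const hours = Math.floor(whole / 3600);
  const minutes = Math.floor((whole % 3600) / 60);
  const seconds = whole % 60;
  const pad = (n: number): string => (n < 10 ? "0" + n : String(n));
  return hours > 0
    ? `${hours}:${pad(minutes)}:${pad(seconds)}`
    : `${minutes}:${pad(seconds)}`;
}

// Renders annotations in chronological order, one per line.
function renderShowNotes(annotations: Annotation[]): string {
  return annotations
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => a.offsetSeconds - b.offsetSeconds)
    .map((a) => {
      const link = a.url ? ` (${a.url})` : "";
      return `${formatTimestamp(a.offsetSeconds)} ${a.note}${link}`;
    })
    .join("\n");
}
```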
Built With
- chrome
- google-web-speech-api
- llm
- next
- nextjs
- vercel
- youtube
