Inspiration
On the All-In Podcast, David Sacks frequently says "I didn't say that" when bringing up a misquote from the New York Times.
What it does
"I DIDN'T SAY THAT" parses YouTube transcripts to perform topic modeling and statement indexing, so you can verify what was actually said.
Currently, it's a Chrome extension that overlays YouTube.
How we built it
Technical implementation: YouTube transcripts -> Embedding + Index -> Topic modeling -> Prompt engineering/tuning -> Caching
Deployment: FastAPI to host LangChain applications
UX/UI: Chrome extension to provide a seamless user experience
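The "Embedding + Index" and "Caching" steps above can be illustrated with a minimal sketch. It is not the project's actual code: the hashed bag-of-words `embed` function is a toy stand-in for a real embedding model, and the `Statement` and `StatementIndex` names, the vector size, and the sample transcript lines are all assumptions made up for this example.

```python
import hashlib
import math
import re
from dataclasses import dataclass
from functools import lru_cache

DIM = 256  # size of the toy embedding vector (arbitrary assumption)


@dataclass(frozen=True)
class Statement:
    """One transcript line with hypothetical speaker and timestamp fields."""
    speaker: str
    start_seconds: float
    text: str


def _tokens(text: str) -> list:
    # Lowercase and split into simple word tokens.
    return re.findall(r"[a-z0-9']+", text.lower())


@lru_cache(maxsize=4096)  # caching step: repeated texts are embedded once
def embed(text: str) -> tuple:
    """Toy stand-in for an embedding model: hashed bag of words, L2-normalized."""
    vec = [0.0] * DIM
    for tok in _tokens(text):
        bucket = int(hashlib.md5(tok.encode()).hexdigest(), 16) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return tuple(v / norm for v in vec)


class StatementIndex:
    """In-memory index that ranks statements by cosine similarity to a quote."""

    def __init__(self):
        self._items = []

    def add(self, statement: Statement) -> None:
        self._items.append((statement, embed(statement.text)))

    def search(self, quote: str, top_k: int = 1) -> list:
        q = embed(quote)
        # Vectors are unit length, so the dot product is the cosine similarity.
        scored = [(sum(a * b for a, b in zip(q, vec)), st) for st, vec in self._items]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:top_k]


# Usage with made-up transcript lines (not real podcast quotes).
index = StatementIndex()
index.add(Statement("Speaker A", 12.0, "I think interest rates will stay high through next year"))
index.add(Statement("Speaker B", 47.5, "The startup market is going to recover slowly"))
score, best = index.search("interest rates will stay high")[0]
print(best.speaker, best.start_seconds, round(score, 2))
```

A low best score for a quoted claim would be a signal that the speaker may not have said it, which is the kind of check the extension is meant to surface. A real deployment would presumably swap `embed` for a learned embedding model and the list scan for a vector store.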
Challenges we ran into
- Collecting high-quality transcript data with speaker tags for better parsing and summarization.
- Not enough time or GPUs to train models in one day. With more resources, we could fine-tune LLMs in parallel to better suit our use case.
Accomplishments that we're proud of
- Collecting and indexing ALL All-In Podcast data and embeddings to support various NLP tasks, e.g. question answering, semantic search, and summarization.
- Prototyping the Chrome extension to improve the YouTube/podcast watching and search experience.
- This could serve as a general fact-checking tool for any media content.
What we learned
The default behavior CHANGES EVERYTHING.
What's next for I didnt say that! - the Youtube fact checker
We could run this live across all podcasts, and during discussions of contentious topics and presidential debates.
Built With
- langchain
- react