The home page of Stampyboi. In the bottom right are the options to filter by specific sources, including YouTube, Netflix, or video uploads.
The auto suggest feature.
The results page for the search "eleanor" with results from YouTube and Netflix.
The timestamp selection page. Each timestamp on the right links to the instance of the quote in the video.
What is Stampyboi?
Stampyboi is a tool to help you quickly and easily find the timestamped video clips you're looking for.
Many people, including us, often send funny clips from shows and videos to their friends. Normally this means sending a link to the entire video, or clicking around at random to find where the moment you want to share occurs. We built Stampyboi to solve this. Stampyboi takes the difficulty out of sharing clips by searching YouTube and Netflix for that target moment.
- Easy file uploads
- Easy conversion to .gif format
- Easy sharing to Facebook, Twitter, Reddit, and other social media
- Autosuggester: generated from Stampyboi's index so that suggestions are guaranteed to return results
- Spellcheck: also generated from Stampyboi's index
- Word stemmer: Porter Stemming Algorithm
- Stop word filter: List of Stampyboi stop words
- Phonetic matching filter: Double metaphone algorithm
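As a rough illustration of two of the search features above, here is a minimal sketch of a stop word filter and an index-backed autosuggester. It is not Stampyboi's actual implementation: the `STOP_WORDS` set, the function names, and the sample phrases are all assumptions made for the example. The point it shows is why suggestions drawn from the index always return results: every suggestion is, by construction, a phrase that is already indexed.

```python
from bisect import bisect_left

# Hypothetical stop word list; Stampyboi's real list is not reproduced here.
STOP_WORDS = {"the", "a", "an", "of", "to"}


def filter_stop_words(tokens):
    """Drop tokens that appear in the stop word list (case-insensitive)."""
    return [t for t in tokens if t.lower() not in STOP_WORDS]


def suggest(prefix, indexed_phrases, limit=5):
    """Return up to `limit` indexed phrases that start with `prefix`.

    `indexed_phrases` must be a sorted list of lowercase phrases.
    """
    prefix = prefix.lower()
    start = bisect_left(indexed_phrases, prefix)  # first phrase >= prefix
    matches = []
    for phrase in indexed_phrases[start:]:
        if not phrase.startswith(prefix):
            break  # sorted input: no later phrase can match
        matches.append(phrase)
        if len(matches) == limit:
            break
    return matches


indexed = sorted(["eleanor shellstrop", "elephant in the room", "hello there"])
print(suggest("ele", indexed))
print(filter_stop_words(["The", "good", "place"]))
```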
Simply type a quote from a YouTube video or Netflix show you're looking for and hit "Search".
- Quote search bar: Takes in a quote to query. Includes autosuggest functionality.
- "Options" button: Toggles options menu
- "About" button: Links to this repository
- Options menu: Allows user to narrow the search based on video type.
- YouTube source (optional): Allows user to paste in a link to a YouTube video to search. If left blank, the query will be searched against all YouTube videos in Stampyboi's index.
- Netflix source (optional): Same as YouTube source.
- File upload: Allows user to upload one or more audio/video files to be searched using speech-to-text. Files can be selected from the file explorer or dragged and dropped. Currently supported file formats: wav, ogv, mp4, avi, mov, mpeg.
- Stampyboi logo: Returns user to search page.
- Quote search bar (top right): Allows user to submit a new general query.
- Video result: Shows thumbnail, title, and list of timestamps that match the query. The user can jump to a specific part of the video by selecting the desired timestamp. Selecting the right-hand tab will link directly to the source video.
- List of timestamps: Selecting the desired timestamp allows the user to seek to a specific part of the video.
- Share this boi (Netflix or YouTube videos only): Allows user to copy the currently selected timestamped link or share it to Facebook, Twitter, or Reddit. YouTube videos can also be converted into GIFs.
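For reference, a timestamped YouTube link of the kind shared above can be built as in the sketch below. The function name and the sample video ID are hypothetical; the `v` and `t` query parameters are the ones YouTube watch URLs use, with `t` given in whole seconds.

```python
from urllib.parse import urlencode


def youtube_timestamp_link(video_id, seconds):
    """Build a YouTube watch URL that starts playback at `seconds`."""
    # Fractional seconds are truncated to a whole number.
    query = urlencode({"v": video_id, "t": f"{int(seconds)}s"})
    return f"https://www.youtube.com/watch?{query}"


# "abc123" is a placeholder, not a real video ID.
print(youtube_timestamp_link("abc123", 95.7))
```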
How Stampyboi works
Stampyboi indexes videos by extracting and storing their timestamped transcripts. When a query is submitted, Stampyboi searches its index of over 330,000 videos to find transcripts containing the queried phrase. When a video link is specified, Stampyboi first checks whether that video is already stored in its index. If the video is found, Stampyboi filters the results to show only that specific video. If not, the video is transcribed, indexed, then searched for the queried phrase (user-uploaded video/audio files are searched and then immediately deleted from the server). A newly indexed linked video will then show up in the results when future users make general queries.
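The flow described above can be sketched in a few lines. This is a simplified in-memory stand-in, not the real system: Stampyboi uses Solr, while the `TranscriptIndex` class, its linear scan, and the `transcribe` callback here are assumptions chosen only to make the check-index, transcribe-if-missing, then-search logic concrete.

```python
class TranscriptIndex:
    """Toy in-memory index of timestamped transcripts (illustrative only)."""

    def __init__(self):
        # video_id -> list of (start_seconds, lowercase caption text)
        self.videos = {}

    def add_video(self, video_id, segments):
        self.videos[video_id] = [(start, text.lower()) for start, text in segments]

    def search(self, phrase, video_id=None, transcribe=None):
        """Return {video_id: [timestamps]} for captions containing `phrase`.

        If `video_id` is given but not indexed yet, `transcribe(video_id)`
        (a hypothetical callback) supplies its segments, which are indexed
        before searching so later general queries can find them too.
        """
        phrase = phrase.lower()
        if video_id is not None and video_id not in self.videos:
            if transcribe is None:
                return {}
            self.add_video(video_id, transcribe(video_id))
        candidates = [video_id] if video_id is not None else list(self.videos)
        results = {}
        for vid in candidates:
            hits = [start for start, text in self.videos[vid] if phrase in text]
            if hits:
                results[vid] = hits
        return results


index = TranscriptIndex()
index.add_video("v1", [(0.0, "Hello there"), (4.5, "Eleanor, what is this place?")])
print(index.search("eleanor"))
```

A real deployment would replace the linear scan with a query against the search engine; the control flow around it is the part this sketch is meant to show.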
Above all else, we faced the challenge of learning to work entirely virtually as opposed to being able to meet and work together in person. On previous projects, we would often meet in person to brainstorm ideas and to help each other solve issues in our projects, but this wasn't possible due to the pandemic. This shift was a challenge for all of us.
As for technical challenges, we had many when it came to collecting the information we needed to fill our Solr index. One of the most important was finding a way to store the transcripts so that each word would be associated with a corresponding timestamp without storing a lot of redundant data. It was also daunting to go through all of the Solr documentation and figure out which features had the functionality we were looking for. Even then, it took a lot of work to correctly process the response objects from Solr into a usable format.
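To make the storage problem concrete, here is one possible layout, offered as an assumption rather than the scheme we actually used. Instead of attaching a timestamp to every word, it stores one timestamp per caption segment plus the word offset where that segment begins, and recovers a word's timestamp with a binary search. All names are hypothetical.

```python
from bisect import bisect_right


def pack_transcript(segments):
    """Flatten (start_seconds, text) segments into words plus two short arrays.

    Returns (words, seg_word_starts, seg_times), where seg_word_starts[i] is
    the index in `words` of the first word of segment i.
    """
    words, seg_word_starts, seg_times = [], [], []
    for start, text in segments:
        seg_word_starts.append(len(words))
        seg_times.append(start)
        words.extend(text.split())
    return words, seg_word_starts, seg_times


def timestamp_for_word(word_index, seg_word_starts, seg_times):
    """Look up the start time of the segment containing `word_index`."""
    seg = bisect_right(seg_word_starts, word_index) - 1
    return seg_times[seg]


words, starts, times = pack_transcript(
    [(0.0, "hello there"), (4.5, "what is this place")]
)
print(words[2], timestamp_for_word(2, starts, times))
```

The trade-off is precision: every word in a segment resolves to that segment's start time, which is usually close enough to seek to the right moment.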
For Netflix videos, we originally used 8flix, a database of free transcripts for Netflix shows and movies, but its coverage was shallow and it contained only a small number of videos. Midway through the project, we had to scrap this idea and shift to addic7ed, which came with its own problems, as it limits the number of transcripts that can be downloaded per user per day. For YouTube, we used web crawlers to access YouTube's transcripts, but these went through many iterations before they could get us the information we needed in a timely manner.
What We Learned
Before this project, none of us had any experience with CSS or website design. Stampyboi challenged us to learn both quickly in order to make it function exactly how we imagined.
On top of this, we learned how to set up a Solr index and link it to Stampyboi so that Stampyboi would run quickly and smoothly. We learned how to interact with difficult interfaces to get the results we wanted. And most of all, we learned how to work as a group in a completely virtual environment due to the current state of the world.
Moving forward, we hope to expand Stampyboi's searching capabilities to even more platforms, including Hulu, Prime Video, Disney+, and more. On top of that, we will be adding support for languages besides English, whether translated or native. This will expand the database and greatly diversify the quotes our users can find.