We wanted a way to quickly and effectively find specific parts of a video and summarize it, whether it's a news broadcast, an educational video or online lecture, or simply a movie or show.
What it does
Videtect uses NLP, OCR, and STT (speech-to-text) technologies to analyze every frame and sound clip of a video, indexing the video into specific segments. This lets the user semantically search for any part of the video and easily find what they are looking for.
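The idea of searching over indexed segments can be sketched as follows. This is a minimal illustration, not the actual Videtect implementation: it assumes the transcript of each segment is already available, and it uses simple bag-of-words cosine similarity as a stand-in for real semantic embeddings.

```rust
use std::collections::HashMap;

// A transcribed/OCR'd segment of the video (timestamps in seconds).
struct Segment {
    start: f64,
    end: f64,
    text: String,
}

// Bag-of-words term frequencies for a piece of text (a toy stand-in
// for the embeddings a real NLP model would produce).
fn term_freqs(text: &str) -> HashMap<String, f64> {
    let mut tf = HashMap::new();
    for word in text.to_lowercase().split_whitespace() {
        let cleaned: String = word.chars().filter(|c| c.is_alphanumeric()).collect();
        *tf.entry(cleaned).or_insert(0.0) += 1.0;
    }
    tf.remove(""); // drop tokens that were pure punctuation
    tf
}

// Cosine similarity between two term-frequency maps.
fn cosine(a: &HashMap<String, f64>, b: &HashMap<String, f64>) -> f64 {
    let dot: f64 = a.iter().filter_map(|(k, v)| b.get(k).map(|w| v * w)).sum();
    let norm = |m: &HashMap<String, f64>| m.values().map(|v| v * v).sum::<f64>().sqrt();
    let denom = norm(a) * norm(b);
    if denom == 0.0 { 0.0 } else { dot / denom }
}

// Return the segment whose text best matches the query.
fn search<'a>(segments: &'a [Segment], query: &str) -> Option<&'a Segment> {
    let q = term_freqs(query);
    segments.iter().max_by(|x, y| {
        cosine(&term_freqs(&x.text), &q)
            .partial_cmp(&cosine(&term_freqs(&y.text), &q))
            .unwrap() // safe: cosine never returns NaN
    })
}
```

A query like "weather tomorrow" would then resolve to the timestamp range of the segment whose transcript best overlaps those terms, which the player can seek to directly.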
How we built it
Our application uses a novel two-layer, four-tier architecture. The user interacts with an intuitive, user-friendly web interface that follows Material UI guidelines and is built with React. Data is served by a Rust processing engine, which connects to the data aggregation service over an efficient network protocol. The Rust engine is compiled to WebAssembly, so it runs entirely in-browser for an optimal user experience.
Challenges we ran into
During the creation of Videtect we faced many challenges. One was that simply scanning for words spoken in a video did not meet our rigorous standards. We wanted our users to have full control over search, so we added OCR processing, which scans through the video's frames and locates relevant on-screen text.
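One way to fold OCR output into the index is to sample frames at intervals and keep only the moments where the on-screen text changes. The sketch below is an assumption about how this could work, not the Videtect pipeline itself; `ocr_frame` is a hypothetical stub standing in for a real OCR engine such as Tesseract.

```rust
// Hypothetical stand-in for a real OCR call: for this sketch, a
// "frame" is just the text we pretend is visible in it.
fn ocr_frame(frame: &str) -> String {
    frame.to_string()
}

// Index on-screen text from sampled frames (timestamp in seconds),
// skipping empty frames and runs where the text has not changed
// since the previous sample.
fn index_onscreen_text(frames: &[(f64, &str)]) -> Vec<(f64, String)> {
    let mut out: Vec<(f64, String)> = Vec::new();
    for &(t, frame) in frames {
        let text = ocr_frame(frame);
        if text.is_empty() {
            continue;
        }
        if out.last().map(|(_, prev)| prev == &text).unwrap_or(false) {
            continue; // same text as the previous sample; skip it
        }
        out.push((t, text));
    }
    out
}
```

Deduplicating consecutive identical results keeps the index compact: a slide that stays on screen for a minute produces one entry, not sixty.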