Today, a single hour-long video can be anywhere from 250 MB to 1.6 GB. With just a handful of videos, finding a particular object or event in that heap of data can be nearly impossible. Moreover, body cams have become widespread in urban military operations, and these cameras can produce terabytes of footage in a matter of hours or days. To tackle this mass of video we created Fugazi, an automated video summarization application that detects objects and events in large videos.
What it does
Fugazi allows users to feed in large videos and have a summarized report of the events and objects in each video generated. Users can then search through the generated report and find timestamps of moments that may be of interest. They can also upload their desired video to the dashboard.
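Searching the report amounts to filtering detected labels and returning their timestamps. A minimal sketch of that lookup, assuming a hypothetical report shape of `{ label, timestamp }` rows (not Fugazi's actual schema):

```javascript
// Hypothetical report entry: one row per detected object/event, with the
// label and the timestamp (in seconds) where it appears. The names here
// (searchReport, label, timestamp) are illustrative.
function searchReport(report, query) {
  const q = query.toLowerCase();
  return report
    .filter((entry) => entry.label.toLowerCase().includes(q))
    .map((entry) => entry.timestamp);
}

// Example: find every moment a "car" was detected.
const report = [
  { label: 'car', timestamp: 12.4 },
  { label: 'person', timestamp: 30.1 },
  { label: 'car', timestamp: 95.0 },
];
console.log(searchReport(report, 'car')); // [ 12.4, 95 ]
```

A case-insensitive substring match keeps the search forgiving; a production version might rank by detection confidence instead.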
How we built it
Fugazi is built as a web app, with Node.js and React.js making up the backend and frontend. CockroachDB is the database of choice, storing user information and video information. Once a video is uploaded to Fugazi, it is passed through Google's Cloud Video Intelligence API, which detects moments of interest. The user can then search through the video using the timestamps returned on their dashboard.
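The timestamps on the dashboard come from flattening the API's object-tracking response into rows. A sketch of that step, using the documented `annotationResults`/`objectAnnotations` response shape with a mock payload (a real call would go through the `@google-cloud/video-intelligence` client's `annotateVideo({ inputUri, features: ['OBJECT_TRACKING'] })`):

```javascript
// The API encodes time offsets as { seconds, nanos }.
function toSeconds(offset) {
  return Number(offset.seconds || 0) + (offset.nanos || 0) / 1e9;
}

// Flatten an OBJECT_TRACKING result into { label, startSec, endSec } rows.
function extractObjects(result) {
  const annotations = result.annotationResults[0].objectAnnotations || [];
  return annotations.map((a) => ({
    label: a.entity.description,
    startSec: toSeconds(a.segment.startTimeOffset),
    endSec: toSeconds(a.segment.endTimeOffset),
  }));
}

// Mock response standing in for the real API result.
const mockResult = {
  annotationResults: [{
    objectAnnotations: [{
      entity: { description: 'car' },
      segment: {
        startTimeOffset: { seconds: 12, nanos: 500000000 },
        endTimeOffset: { seconds: 15, nanos: 0 },
      },
    }],
  }],
};
console.log(extractObjects(mockResult));
// [ { label: 'car', startSec: 12.5, endSec: 15 } ]
```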
Challenges we ran into
- Setting up the Video Intelligence API
- Linking front and back end to work seamlessly
- Setting up CockroachDB
Accomplishments that we’re proud of
- Having a functional prototype
- Working past timezone differences
- Learning platforms that were completely new to us and implementing a project with them over the course of 2 days
What we learned
- Zafir: React development
- Vidur: Database (CockroachDB), React development
- Nandini: Node.js, Database (CockroachDB)
- Jared: React development, Database (CockroachDB)
How we used CockroachDB
- For user authentication & authorization - login, signup.
- For storing analyzed videos. We store the analysis in JSONB format and use CockroachDB's operators, like '->', to access key components of the videos.
- We also planned to use an in-memory cache in front of the DB, which would let us access elements faster than querying the database directly. Due to a shortage of time, we couldn't implement the in-memory cache.
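A minimal sketch of drilling into the stored JSONB with the '->' operator, as a parameterized query for node-postgres. The table and column names ('videos', 'analysis') are assumptions, not Fugazi's actual schema:

```javascript
// Build a parameterized query against the JSONB 'analysis' column.
// '->' keeps the result as JSONB; '->>' would return it as text.
// With node-postgres this would run as: await pool.query(q.text, q.values)
function buildAnnotationQuery(videoId) {
  return {
    text: "SELECT analysis -> 'objectAnnotations' AS objects FROM videos WHERE id = $1",
    values: [videoId],
  };
}

const q = buildAnnotationQuery(42);
console.log(q.values); // [ 42 ]
```

Parameterizing the video id keeps the query safe from injection while the JSONB path stays a literal.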
How we used Google Cloud Platform
- Storage of videos. We used a Cloud Storage bucket to store the videos for analysis and public use.
- Cloud Video Intelligence API for object detection. We use its object tracking feature to label multiple objects in the video.
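The Video Intelligence API reads videos by their Cloud Storage URI, so after uploading (e.g. with `@google-cloud/storage`'s `bucket.upload(localPath)`) the backend hands a gs:// URI to `annotateVideo`. A small sketch of building that URI; the bucket and object names are illustrative:

```javascript
// Build the gs:// input URI the Video Intelligence API expects.
// Object names may contain spaces, so percent-encode the object part.
function gcsUri(bucketName, objectName) {
  return `gs://${bucketName}/${encodeURIComponent(objectName)}`;
}

console.log(gcsUri('fugazi-videos', 'patrol 01.mp4'));
// gs://fugazi-videos/patrol%2001.mp4
```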
Due to the very nature of video files, their large size is a hurdle in any project. The main limitation Fugazi faces is that processing time grows with video size. Because of our limited experience with React development, we spent most of our time learning and writing React, which kept us from optimizing the video detection model.
What's next for Fugazi
- Audio analysis of videos
- Improved processing times
- Event detection enhancement
- Improved searching capabilities
- Transcript generator
- Stitching multiple ground videos into a panorama and detecting in 360°
- Uploading multiple perspectives of the same environment to detect the same objects across different videos