Inspiration

In recent years, artificial intelligence models have grown dramatically more capable. Among the many tasks they can now perform is generating realistic video, and malicious actors can exploit such tools to spread misinformation through convincing videos that have no basis in reality. We wanted to build a way to detect these videos, notify the user of any AI-generated content, and assess whether the information portrayed in the video is factual.

What it does

The project lets a user select a video (either by uploading it or providing a link), and the system goes to work: it parses the video and reports whether, and to what extent, the video is factually correct. The user also receives a "Community Notes"-style response that explains the verdict and cites sources to back it up.

How we built it

We built this using TwelveLabs' and Google Gemini's APIs. TwelveLabs provides an AI-based video analysis service that produces deep insights into uploaded videos. We send that analysis to Gemini, which processes it further and reaches a conclusion with grounded responses, mitigating any knowledge-cutoff issues.
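To make the hand-off concrete, here is a minimal sketch of how an analysis result might be folded into a single fact-checking prompt for Gemini. The field names (`transcript`, `scenes`) and the prompt wording are illustrative assumptions, not the project's actual schema or the real TwelveLabs response format.

```python
def build_factcheck_prompt(analysis: dict) -> str:
    """Fold a TwelveLabs-style analysis dict into one Gemini prompt.

    `analysis` is a hypothetical shape: {"transcript": str, "scenes": [str]}.
    """
    transcript = analysis.get("transcript", "")
    scenes = analysis.get("scenes", [])
    scene_lines = "\n".join(f"- {s}" for s in scenes)
    return (
        "You are a fact-checker. Using only the video evidence below, "
        "decide whether the claims made are factual and cite sources.\n\n"
        f"Transcript:\n{transcript}\n\n"
        f"Scene summaries:\n{scene_lines}"
    )


if __name__ == "__main__":
    prompt = build_factcheck_prompt({
        "transcript": "The moon landing was filmed in a studio.",
        "scenes": ["Grainy archival footage", "Narrator on camera"],
    })
    print(prompt)
```

The resulting string would then be sent to Gemini as the user message; keeping the prompt construction in one pure function makes it easy to test and tweak without burning API quota.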

Challenges we ran into

The first big challenge was the Gemini API usage limit: during testing we kept getting rate-limited because we were exhausting our daily quota. We worked around this with help from mentors and sponsors, who pointed us to additional credits. We also faced a technical challenge in making sure we could properly pass information from the TwelveLabs API to the Gemini API.
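A standard way to soften (though not eliminate) rate-limit errors like the ones we hit is retrying with exponential backoff. This is a generic sketch, not the project's actual code; `RuntimeError` stands in for whatever rate-limit exception the API client raises.

```python
import random
import time


def with_backoff(fn, max_tries=5, base_delay=1.0):
    """Call fn(), retrying on RuntimeError (a stand-in for an HTTP 429).

    Sleeps base_delay * 2**attempt plus a little jitter between tries,
    and re-raises if the final attempt still fails.
    """
    for attempt in range(max_tries):
        try:
            return fn()
        except RuntimeError:
            if attempt == max_tries - 1:
                raise
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))


if __name__ == "__main__":
    calls = {"n": 0}

    def flaky():
        # Simulate a call that is rate-limited twice, then succeeds.
        calls["n"] += 1
        if calls["n"] < 3:
            raise RuntimeError("429 rate limited")
        return "ok"

    print(with_backoff(flaky, base_delay=0.01))
```

Backoff only spreads requests out; it cannot recover a fully exhausted daily quota, which is why the extra credits mattered.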

Accomplishments that we're proud of

We managed to get a browser extension working that lets users easily run this project on any website that supports videos.

What we learned

We learned that there are many real-world limits on how fast we can work, so communicating well is essential to making the best use of our resources and time.

What's next for ClankerDetector

We will continue working on the project, refining the UI and usability and making it accessible to anyone through free API tiers.


Updates


We built the project around TwelveLabs' multimodal video indexing to make it uniquely evidence-driven: every uploaded video is indexed for its transcript, visible text, scene summaries, audio characteristics, and visual-quality cues, and those concrete, time-aligned outputs are fed directly into our LLM (Gemini) and web fact-checker (Tavily) instead of relying on model-only impressions. Coupled with C2PA content-credentials checks and a SynthID multimodal detector, this lets the system cross-check audio against visuals, extract checkable claims, and attach verifiable source snippets to every verdict, producing explainable, auditable decisions rather than opaque scores. Operationally, TwelveLabs' asynchronous asset + index workflow enables robust handling of long videos and scene-level reasoning, and the frontend surfaces the raw transcript and scene evidence alongside trust scores so users can validate findings themselves. TwelveLabs supplies the multimodal evidence layer that turns AI reasoning into verifiable, transparent forensic outputs. We also aim to use the API's indexing to reduce output time.
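The per-claim verdicts described above could be modeled with a small data structure like the following. This is a hypothetical sketch of the shape, not the project's actual types; the field names and verdict labels are assumptions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ClaimVerdict:
    """One checkable claim extracted from a video, with its verdict."""
    claim: str
    verdict: str                      # e.g. "supported", "refuted", "unverified"
    sources: List[str] = field(default_factory=list)  # snippets/URLs as evidence


def community_note(verdicts: List[ClaimVerdict]) -> str:
    """Render verdicts as a Community Notes-style summary, one claim per line."""
    lines = []
    for v in verdicts:
        srcs = "; ".join(v.sources) if v.sources else "no sources found"
        lines.append(f"{v.verdict.upper()}: {v.claim} ({srcs})")
    return "\n".join(lines)


if __name__ == "__main__":
    print(community_note([
        ClaimVerdict("The footage shows a 2024 event.", "refuted",
                     ["Reuters archive snippet"]),
        ClaimVerdict("The speaker is the named official.", "unverified"),
    ]))
```

Keeping each verdict paired with its source snippets is what makes the final output auditable: a user can trace every line of the note back to a specific piece of evidence rather than trusting a single opaque score.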
