We built the project around TwelveLabs’ multimodal video indexing to make it uniquely evidence-driven. Every uploaded video is indexed for its transcript, visible on-screen text, scene summaries, audio characteristics, and visual-quality cues, and these concrete, time-aligned outputs are fed directly into our LLM (Gemini) and web fact-checker (Tavily) instead of relying on model-only impressions.

Coupled with C2PA content-credentials checks and a SynthID multimodal detector, this lets the system cross-check audio against visuals, extract checkable claims, and attach verifiable source snippets to every verdict, producing explainable, auditable decisions rather than opaque scores.

Operationally, TwelveLabs’ asynchronous asset-and-index workflow gives us robust handling of long videos and scene-level reasoning, and the frontend surfaces the raw transcript and scene evidence alongside the trust scores so users can validate the findings themselves. In short, TwelveLabs supplies the multimodal evidence layer that turns AI reasoning into verifiable, transparent forensic output. We also aim to reuse the API’s indexing to reduce output latency.
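To make the flow concrete, here is a minimal sketch of that pipeline. Every function, field, and return value below is an illustrative placeholder standing in for the real TwelveLabs, Gemini, and Tavily calls, not their actual SDK signatures; it only shows how the indexed evidence travels from the asynchronous indexing step through claim extraction and fact-checking, with the evidence kept attached to each verdict.

```python
"""Illustrative sketch of the evidence pipeline (no real SDK calls)."""
import time
from dataclasses import dataclass, field


@dataclass
class VideoEvidence:
    # Time-aligned outputs pulled from the video index (illustrative fields).
    transcript: list[dict] = field(default_factory=list)      # [{"start", "end", "text"}]
    on_screen_text: list[dict] = field(default_factory=list)
    scene_summaries: list[dict] = field(default_factory=list)


def index_video(path: str) -> str:
    """Placeholder: submit the video to the asynchronous asset + index
    workflow and return a task id (the real API call is not shown)."""
    return "task-123"


def wait_for_index(task_id: str) -> VideoEvidence:
    """Placeholder: poll the indexing task until it finishes, then collect
    the transcript, on-screen text, and scene summaries."""
    time.sleep(0)  # real code would poll with backoff
    return VideoEvidence(
        transcript=[{"start": 0.0, "end": 4.2, "text": "example spoken line"}],
        scene_summaries=[{"start": 0.0, "end": 12.0, "summary": "example scene"}],
    )


def extract_claims(evidence: VideoEvidence) -> list[str]:
    """Placeholder for the Gemini step: turn time-aligned evidence into
    discrete, checkable claims."""
    return [segment["text"] for segment in evidence.transcript]


def fact_check(claim: str) -> dict:
    """Placeholder for the Tavily step: search the web and return the claim
    together with supporting source snippets."""
    return {"claim": claim, "verdict": "unverified", "sources": []}


def analyze(path: str) -> list[dict]:
    evidence = wait_for_index(index_video(path))
    # Each verdict keeps its originating evidence attached, so the frontend
    # can surface transcript/scene snippets next to the trust score.
    return [fact_check(claim) for claim in extract_claims(evidence)]


if __name__ == "__main__":
    print(analyze("upload.mp4"))
```

The point of the structure is that the time-aligned evidence is carried all the way to the final verdicts, which is what lets the frontend show the raw snippets users need to validate each finding themselves.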