Why Gemini

Gemini 3 Integration — Tab by Tab

TAB 1 — Overview

Gemini 3 powers the executive intelligence layer of Omnilytics. It synthesises structured outputs from every downstream analytical module into a decision-grade overview that answers four critical questions in seconds: what happened, why it happened, how certain we are, and what would have changed the outcome. Crucially, Gemini is not generating free text. It is constrained to structured causal graphs, uncertainty tensors, and intervention models. This prevents hallucination and ensures every sentence is tethered to evidence via frame IDs.

Only Gemini’s long-context multimodal reasoning allows simultaneous grounding in video events, measurement uncertainty, and causal topology. This enables executive summaries that are both readable and forensically traceable, something traditional LLMs cannot achieve at this fidelity.
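The evidence-tethering constraint can be sketched as a small data contract: a summary sentence is only valid if it carries the frame IDs that support it. The names and fields below are illustrative assumptions, not the production schema.

```python
from dataclasses import dataclass

# Hypothetical data contract (names and fields are illustrative): a summary
# sentence is rejected outright unless it cites the frames that support it.
@dataclass(frozen=True)
class TetheredClaim:
    text: str
    frame_ids: tuple[int, ...]

    def __post_init__(self):
        if not self.frame_ids:
            raise ValueError("claim has no evidentiary frames")

def build_overview(claims):
    # Render only claims that passed the evidence check at construction.
    return " ".join(
        f"{c.text} [frames {','.join(map(str, c.frame_ids))}]" for c in claims
    )
```

Because the check runs at construction time, untethered free text can never reach the rendered overview.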

TAB 2 — Evidence & Integrity

Gemini 3 operates as the forensic validator of the pipeline. After FFmpeg extraction and change-point detection produce metadata, Gemini classifies video integrity severity and explains degradation risks in regulatory language. It links frame anomalies to downstream causal confidence, ensuring that corrupted evidence cannot silently influence reasoning.

Gemini’s multimodal verification ensures the system remains legally defensible. Rather than assuming footage is trustworthy, the AI explicitly audits provenance, compression artefacts, and temporal continuity before analysis proceeds.
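The severity classification can be pictured as a triage rubric over extraction metadata. The field names and thresholds below are assumptions for the sketch, not the production values.

```python
# Illustrative severity rubric for integrity triage. Field names and
# thresholds are assumptions for the sketch, not the production values.
def classify_integrity(meta):
    dropped = meta.get("dropped_frame_ratio", 0.0)   # fraction of frames lost
    artefacts = meta.get("artefact_score", 0.0)      # 0..1 compression damage
    gaps = meta.get("timestamp_gaps", 0)             # temporal discontinuities
    if dropped > 0.10 or artefacts > 0.8 or gaps > 5:
        return "severe"    # evidence must not silently feed causal reasoning
    if dropped > 0.02 or artefacts > 0.4 or gaps > 0:
        return "degraded"  # proceed, but propagate reduced confidence
    return "intact"
```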

TAB 3 — Perception & Signals

In this tab Gemini functions as a perception auditor, not a detector. Detection and tracking models generate motion signals, but Gemini validates their semantic coherence. It flags perceptual ambiguity, occlusion-driven signal collapse, and tracking inconsistencies.

This is critical because causal inference built on unstable perception is invalid. Gemini’s role is to refuse reasoning when signals fall below observability thresholds. This safeguard transforms the system from visually impressive to scientifically reliable.
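The refusal safeguard amounts to a gate: if observability drops below a threshold, no causal reasoning is attempted. The threshold value and the way observability is composed here are illustrative assumptions.

```python
# Minimal sketch of the refusal safeguard. The threshold and the way
# observability is composed are illustrative assumptions.
OBSERVABILITY_THRESHOLD = 0.6

def gate_signal(track_confidence, occlusion_ratio):
    # Discount tracker confidence by how much of the target was occluded.
    observability = track_confidence * (1.0 - occlusion_ratio)
    if observability < OBSERVABILITY_THRESHOLD:
        # Refuse to reason rather than build causal claims on noise.
        return {"reason_allowed": False, "observability": observability}
    return {"reason_allowed": True, "observability": observability}
```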

TAB 4 — Measurements & Dynamics

Gemini 3 interprets physical motion through a physics-aware reasoning layer. It reviews derived kinematic signals such as velocity, acceleration, and time-to-collision, validating whether the dynamics obey real-world constraints.

It flags impossible accelerations, implausible deceleration curves, and measurement artefacts. It then translates quantitative findings into structured summaries without detaching them from confidence intervals. This ensures numerical analysis remains interpretable without losing rigour.
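A plausibility check of this kind can be sketched with finite differences over the velocity track. The 12 m/s² cap below is an illustrative bound for ground vehicles, not a system constant.

```python
# Sketch of the plausibility check on derived kinematics. The 12 m/s^2 cap
# is an illustrative bound for ground vehicles, not a system constant.
MAX_ABS_ACCEL = 12.0  # m/s^2

def acceleration_series(velocities, dt):
    # Finite-difference acceleration between successive velocity samples.
    return [(v1 - v0) / dt for v0, v1 in zip(velocities, velocities[1:])]

def implausible_steps(velocities, dt, limit=MAX_ABS_ACCEL):
    # Indices of steps whose implied acceleration violates the bound.
    return [i for i, a in enumerate(acceleration_series(velocities, dt))
            if abs(a) > limit]
```

Flagged indices point at likely measurement artefacts (tracking jumps, dropped frames) rather than real motion.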

TAB 5 — Uncertainty & Observability

Gemini models epistemic and perceptual uncertainty as first-class analytical objects. It aggregates occlusion, sensor visibility, and signal degradation into node-level confidence distributions.

Rather than hiding doubt, Gemini explains where and why the system cannot know. This transparency is essential for regulatory and industrial deployment, where overconfidence is a risk.
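One simple way to picture the aggregation: treat each degradation factor as an independent survival probability and take their product as the node-level confidence. The factor names below are assumptions for the sketch.

```python
# Illustrative aggregation of degradation factors into a node-level
# confidence, treating each factor as an independent survival probability.
def node_confidence(occlusion, visibility, degradation):
    # occlusion, degradation: 0 = none, 1 = total; visibility: 0..1.
    c = (1.0 - occlusion) * visibility * (1.0 - degradation)
    return max(0.0, min(1.0, c))
```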

TAB 6 — Causal Graph

Here Gemini performs structural reasoning rather than descriptive reasoning. It analyses directed acyclic graphs constructed from temporal precedence and constraint validation.

Gemini evaluates latent variables, edge stability, and causal sufficiency. It explains not just what caused an outcome, but how robust that causal claim is under graph perturbation. This elevates the system from analytics to true causal inference.
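Robustness under graph perturbation can be sketched as a single-edge-deletion test: a causal claim "cause reaches outcome" is only as strong as the fraction of edge removals it survives. This is a minimal sketch, not the production graph engine.

```python
# Minimal sketch of edge-perturbation robustness: a claim "cause -> outcome"
# is scored by the fraction of single-edge deletions it survives.
def reachable(graph, src, dst):
    # Depth-first search over a DAG given as {node: [children]}.
    stack, seen = [src], set()
    while stack:
        n = stack.pop()
        if n == dst:
            return True
        if n not in seen:
            seen.add(n)
            stack.extend(graph.get(n, []))
    return False

def edge_robustness(graph, cause, outcome):
    edges = [(u, v) for u, vs in graph.items() for v in vs]
    if not edges:
        return 0.0
    survived = 0
    for u, v in edges:
        # Delete exactly one edge and re-test reachability.
        pruned = {n: [c for c in cs if (n, c) != (u, v)]
                  for n, cs in graph.items()}
        if reachable(pruned, cause, outcome):
            survived += 1
    return survived / len(edges)
```

A score of 1.0 means the causal claim has redundant support; a score near 0 means it hangs on a single fragile edge.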

TAB 7 — Counterfactuals

Gemini enables structured counterfactual simulation without fabricating video. It perturbs causal variables within structural causal models and computes divergence pathways.

Because Gemini can reason over graph deltas rather than pixels, it explains how outcomes change while preserving evidentiary grounding. This avoids speculative visual generation while still enabling actionable foresight.
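Reasoning over graph deltas can be illustrated with a toy structural causal model: each variable is a function of its parents, an intervention overrides one variable, and re-evaluating the descendants yields the counterfactual as a delta, never as pixels. The example variables are hypothetical.

```python
# Toy structural causal model: each variable is a function of its parents,
# listed in topological order. Intervening on a variable and re-evaluating
# its descendants yields the counterfactual as a graph delta, not as video.
def evaluate(scm, interventions=None):
    interventions = interventions or {}
    values = {}
    for name, (parents, fn) in scm.items():
        if name in interventions:
            values[name] = interventions[name]   # do-operator: override
        else:
            values[name] = fn(*(values[p] for p in parents))
    return values

def counterfactual_delta(scm, interventions):
    factual = evaluate(scm)
    counterfactual = evaluate(scm, interventions)
    # Report only the variables whose values diverged.
    return {k: (factual[k], counterfactual[k])
            for k in scm if factual[k] != counterfactual[k]}
```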

TAB 8 — Interventions

Gemini translates causal insight into operational strategy. It evaluates intervention sets using cost effectiveness, causal leverage, and side-effect modelling.

It ranks minimal-change actions capable of preventing catastrophic outcomes. This transforms analysis into prevention, which is the core mission of Omnilytics.
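The ranking criterion can be sketched as causal leverage per unit cost, discounted by a modelled side-effect penalty. Field names and scores below are assumptions for the illustration.

```python
# Illustrative ranking: causal leverage per unit cost, discounted by a
# modelled side-effect penalty. Field names and scores are assumptions.
def rank_interventions(candidates):
    def score(c):
        leverage = c["risk_reduction"] * (1.0 - c["side_effect_penalty"])
        return leverage / c["cost"]
    return sorted(candidates, key=score, reverse=True)
```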

TAB 9 — Stress Tests

Gemini stress tests epistemic resilience. By simulating evidence degradation and perceptual loss, it measures the survivability of conclusions.

It identifies brittleness thresholds where causal claims collapse. This ensures decisions are not made on fragile analytical foundations.
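A brittleness threshold can be found with a simple degradation sweep: discount confidence in steps and record the level at which the conclusion stops holding. The 0.5 decision boundary and 0.05 step size are illustrative.

```python
# Sketch of an evidence-degradation sweep: discount confidence in steps and
# record the level at which the conclusion stops holding. The 0.5 decision
# boundary and 0.05 step are illustrative values.
def brittleness_threshold(base_confidence, decision_boundary=0.5, step=0.05):
    degradation = 0.0
    while degradation <= 1.0:
        if base_confidence * (1.0 - degradation) < decision_boundary:
            return round(degradation, 2)  # the causal claim collapses here
        degradation += step
    return None  # claim survives the full sweep
```

A low threshold signals a fragile analytical foundation; a claim that already fails at 0.0 should never inform a decision.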

TAB 10 — Reports & Audit

Gemini converts complex multimodal reasoning into procurement-grade documentation. It generates audit-ready reports, immutable logs, and executive briefings grounded entirely in system evidence.

This makes outputs deployable across legal, industrial, and regulatory environments.

Inspiration

During a robotics work experience placement, I watched hours of operational footage being reviewed after minor system failures. What struck me was not the technology, but the limitation of hindsight. We could see what went wrong, but never the smallest moment where intervention could have prevented it.

That stayed with me. Disasters rarely come from single catastrophic errors. They emerge from tiny, compounding signals that no one has time to analyse holistically.

I wanted to build something that did not just explain failure, but prevented it.

Omnilytics was born from the question: what is the smallest possible change that could stop the biggest possible consequence?


What it does

Omnilytics is a causal video intelligence platform that transforms raw footage into intervention-grade insight.

From a single uploaded video, the system reconstructs events, validates evidence integrity, extracts motion signals, models physical dynamics, quantifies uncertainty, builds causal graphs, simulates counterfactuals, and computes minimal interventions capable of changing outcomes.

It does not generate speculative video. It generates defensible causal reasoning grounded entirely in observed evidence.


How we built it

The entire platform was architected around Gemini 3 via the Gemini API in AI Studio.

Each analytical tab functions as a modular reasoning layer, with structured outputs feeding downstream causal computation. Video perception pipelines generate signals which Gemini validates, contextualises, and reasons over using long context multimodal understanding.

A key engineering challenge was reliability. Every module's output had to agree with the others: no hallucinations, no speculative leaps, and no reasoning beyond evidentiary support. This required tightly constrained prompting, structured data contracts, and cross-tab validation logic.

The result is a system where language generation is never detached from physical or causal evidence.


Challenges we ran into

This domain was entirely new to me.

My prior experience was in productivity applications, not video intelligence, causal inference, or multimodal AI reasoning. Learning how to orchestrate perception systems, physics modelling, and LLM reasoning into a single reliable pipeline required constant iteration.

Ensuring that Gemini reasoned without hallucinating was the hardest challenge. I had to design prompts that constrained interpretation while still leveraging its reasoning power.


Accomplishments that we’re proud of

I am most proud that Omnilytics does not just analyse events. It explains prevention.

The intervention engine identifies the smallest actionable changes capable of avoiding catastrophic outcomes. That shift from hindsight to foresight is what makes the platform meaningful.

I am also proud of the system’s transparency. Every conclusion is traceable to evidence, uncertainty, and causal structure.


What we learned

Building Omnilytics transformed how I think about AI.

I learned that multimodal intelligence is not about generating impressive outputs, but about grounding reasoning in reality. I also learned how to design systems where AI augments scientific analysis rather than replacing it.

Most importantly, I learned that stepping into unfamiliar technical domains is where the most meaningful innovation happens.


What’s next for Omnilytics

The next evolution is counterfactual video simulation, enabling visualisation of how events could have unfolded under alternative interventions.

I also plan to expand beyond single-domain incidents into sectors such as industrial safety, autonomous transport, healthcare operations, and infrastructure monitoring.

The long-term vision is a universal causal intelligence layer capable of analysing any real-world system where video evidence exists.

Omnilytics is not just about understanding the past. It is about engineering safer futures.

Built With
