OMNIGEN — Project Story

An Adaptive Multimodal AI Agent for Workforce Wellbeing & Performance
Hackathon Writeup · Dec 12, 2025


🌍 Inspiration

Modern executives and knowledge workers live inside fragmented systems:

  • one app for productivity,
  • another for health,
  • another for research,
  • and endless feeds for “staying informed.”

The result isn’t leverage — it’s cognitive overload.

OMNIGEN was inspired by a simple question:

What if an AI didn’t just answer questions, but actively optimized the human operating system in real time?

Instead of building another chatbot, the goal was to create a multimodal executive co-pilot — something that watches posture, understands context, tracks the global environment, supports focus, and intervenes before burnout or distraction takes hold.

The vision:
One dashboard. One agent. Total situational awareness.


🧠 What We Built

OMNIGEN is a proactive, adaptive AI system that combines vision, audio, reasoning, search, and code execution into a single wellbeing-and-performance engine.

At a high level:

\[ \text{Human State} + \text{Environmental Context} + \text{Global Signals} \xrightarrow{\text{Gemini Reasoning}} \text{Actionable Insight} \]

Core Capabilities

1. OMNI-SENSE — Zero-Latency Biometric Telemetry

  • Uses MediaPipe GPU-accelerated vision at 60 FPS
  • Tracks face distance, head tilt, and posture in real time
  • Generates an “exoskeleton” overlay for instant ergonomic feedback

Key insight: latency matters. Feedback delayed by even 1–2 seconds breaks habit correction loops.
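To keep that loop fast, the derived posture signals can be computed as pure math on the landmark coordinates MediaPipe emits each frame. The sketch below shows one way to do it; the eye-line heuristic and both thresholds are illustrative assumptions, not OMNIGEN's exact values.

```typescript
// Sketch: deriving posture signals from normalized face landmarks.
// The heuristics and thresholds are illustrative assumptions.

interface Landmark {
  x: number; // normalized [0, 1] image coordinates
  y: number;
}

// Head tilt: angle of the eye line relative to horizontal, in degrees.
function headTiltDeg(leftEye: Landmark, rightEye: Landmark): number {
  const dy = rightEye.y - leftEye.y;
  const dx = rightEye.x - leftEye.x;
  return (Math.atan2(dy, dx) * 180) / Math.PI;
}

// Face-distance proxy: inter-eye distance in normalized image space.
// A larger value means the face is closer to the camera.
function faceProximity(leftEye: Landmark, rightEye: Landmark): number {
  return Math.hypot(rightEye.x - leftEye.x, rightEye.y - leftEye.y);
}

// Simple ergonomic verdict used to drive the overlay color.
function postureStatus(tiltDeg: number, proximity: number): "ok" | "warn" {
  const TILT_LIMIT = 10;        // degrees (illustrative threshold)
  const PROXIMITY_LIMIT = 0.18; // normalized units (illustrative)
  return Math.abs(tiltDeg) > TILT_LIMIT || proximity > PROXIMITY_LIMIT
    ? "warn"
    : "ok";
}
```

Because these are a few arithmetic operations per frame, they add effectively nothing on top of MediaPipe's own inference time.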


2. Strategic Executive Dashboard

  • Gemini 3 Pro synthesizes live news streams into a Global Pulse Vector
  • Radar chart dimensions:
    • Volatility
    • Innovation
    • Risk
    • Sentiment

Instead of doom-scrolling, users get a compressed macro snapshot in seconds.
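One way to produce that snapshot is to have the model score each article on the four radar dimensions and then collapse the scores client-side. The sketch below shows only that aggregation step; the field names and 0–100 scale are assumptions for illustration, and the per-article scoring itself would come from a Gemini structured-output call.

```typescript
// Sketch: collapsing per-article scores (assumed to come back from a
// Gemini 3 Pro structured-output call) into one Global Pulse Vector.
// Field names and the 0-100 scale are illustrative assumptions.

interface ArticleScores {
  volatility: number; // each dimension scored 0-100 by the model
  innovation: number;
  risk: number;
  sentiment: number;
}

type PulseVector = ArticleScores;

function globalPulse(articles: ArticleScores[]): PulseVector {
  if (articles.length === 0) throw new Error("no scored articles");
  const clamp = (v: number) => Math.min(100, Math.max(0, v));
  const mean = (pick: (a: ArticleScores) => number) =>
    clamp(articles.reduce((sum, a) => sum + pick(a), 0) / articles.length);
  return {
    volatility: mean((a) => a.volatility),
    innovation: mean((a) => a.innovation),
    risk: mean((a) => a.risk),
    sentiment: mean((a) => a.sentiment),
  };
}
```

An object of this shape maps directly onto a radar-chart data series, so the vector can be fed to the dashboard without further transformation.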


3. OmniChat — Research & Code Engine

  • Multimodal Gemini 2.5 / 3 Pro
  • Google Search grounding for factual accuracy
  • Python & JavaScript execution for live charts and analysis

This transforms the chat from “assistant” into a research analyst + junior developer hybrid.


4. FocusVision Pro — Workspace Intelligence

  • Upload a desk photo
  • Gemini Vision analyzes:
    • monitor height
    • lighting
    • ergonomics
    • cable clutter

Actionable recommendations are generated instantly, turning static advice into situational coaching.


5. MoodMap GPS — Vibe-Based Navigation

  • Search places by feeling, not category
  • “Cozy,” “high-energy,” “quiet focus,” etc.
  • Combines Gemini logic with Google Maps grounding

This reframes location search as emotional optimization.


6. Neural Relaxation — AI Audio Comedy

  • Gemini TTS with PCM streaming
  • Context-aware jokes, spoken in a human-like neural voice

A surprisingly powerful stress reset during high-intensity work.
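Streaming PCM into the browser means converting the raw 16-bit samples into the Float32 range the Web Audio API expects. A minimal sketch of that conversion, assuming a little-endian 16-bit stream (the 24 kHz rate is also an assumption about the format):

```typescript
// Sketch: converting raw 16-bit PCM (as a TTS stream delivers it) into
// the Float32 samples the Web Audio API expects. The 24 kHz sample
// rate is an assumption about the stream format.

const PCM_SAMPLE_RATE = 24_000;

function pcm16ToFloat32(bytes: Uint8Array): Float32Array {
  // Interpret the byte stream as little-endian signed 16-bit samples.
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const out = new Float32Array(bytes.byteLength / 2);
  for (let i = 0; i < out.length; i++) {
    // Scale from [-32768, 32767] to roughly [-1, 1).
    out[i] = view.getInt16(i * 2, true) / 32768;
  }
  return out;
}
```

The resulting array can be copied into an `AudioBuffer` channel created at `PCM_SAMPLE_RATE`, which is what lets the voice start playing before the full clip has arrived.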


🛠️ How We Built It

Tech Stack

  • Core Intelligence:
    • Gemini 3 Pro (reasoning & synthesis)
    • Gemini 2.5 Flash (low-latency interactions)
  • Vision:
    • MediaPipe Tasks Vision (WASM + GPU acceleration)
  • Frontend:
    • React 19
    • TypeScript
    • Vite
  • UI / UX:
    • Tailwind CSS
    • Glassmorphism + Cyberpunk aesthetic
  • Data Visualization:
    • Recharts (responsive SVG)

Architecture Philosophy

  • Multimodal-first: vision, audio, text are peers — not add-ons
  • Low-latency feedback loops for health-related interventions
  • Composable agents, not monolithic prompts

Each OMNI module operates independently but shares context through a unified reasoning layer.


⚠️ Challenges We Faced

1. Latency vs. Intelligence

High-level reasoning models are powerful but slow.
We solved this by:

  • Using Gemini Flash for real-time feedback
  • Escalating to Gemini 3 Pro only when deep synthesis was needed
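The escalation rule above can be sketched as a small router that classifies each task before picking a model. The task taxonomy and model ID strings here are illustrative assumptions, not OMNIGEN's exact routing table.

```typescript
// Sketch of the escalation rule: real-time feedback goes to Flash,
// deep synthesis goes to Pro. Task kinds and model ID strings are
// illustrative assumptions.

type Task =
  | { kind: "posture-feedback" }
  | { kind: "chat-turn"; needsGrounding: boolean }
  | { kind: "news-synthesis" };

function pickModel(task: Task): string {
  switch (task.kind) {
    case "posture-feedback":
      return "gemini-2.5-flash"; // latency-critical feedback loop
    case "chat-turn":
      // Grounded research answers justify the slower, deeper model.
      return task.needsGrounding ? "gemini-3-pro" : "gemini-2.5-flash";
    case "news-synthesis":
      return "gemini-3-pro"; // multi-source synthesis, latency-tolerant
  }
}
```

Keeping the routing decision in one pure function also makes the latency/quality trade-off easy to audit and tune.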

2. Signal Overload

Combining biometrics, news, mood, and environment risked overwhelming users.

Solution:

  • Aggressive abstraction
  • Visual compression (radar charts, overlays)
  • “Intervene only when it matters” philosophy
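The "intervene only when it matters" rule can be made concrete as a gate that fires only after a bad state has persisted, and then enforces a cooldown before nagging again. The 5 s persistence and 60 s cooldown below are illustrative values, not OMNIGEN's tuned ones.

```typescript
// Sketch of the "intervene only when it matters" rule: an alert fires
// only after the bad state has persisted, then goes quiet for a
// cooldown period. Both durations are illustrative assumptions.

class InterventionGate {
  private badSince: number | null = null;
  private lastFired = -Infinity;

  constructor(
    private persistMs = 5_000,   // bad state must last this long
    private cooldownMs = 60_000, // minimum gap between alerts
  ) {}

  // Call once per frame; returns true when an intervention should fire.
  update(isBad: boolean, nowMs: number): boolean {
    if (!isBad) {
      this.badSince = null; // good posture resets the timer
      return false;
    }
    this.badSince ??= nowMs;
    const persisted = nowMs - this.badSince >= this.persistMs;
    const cooledDown = nowMs - this.lastFired >= this.cooldownMs;
    if (persisted && cooledDown) {
      this.lastFired = nowMs;
      return true;
    }
    return false;
  }
}
```

Feeding the per-frame posture verdict through a gate like this is what turns raw telemetry into occasional, well-timed nudges instead of a constant stream of alerts.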

3. Human Trust

Anything that watches posture or environment must feel helpful — not invasive.

We focused on:

  • Transparency
  • Immediate, visible benefits
  • Clear boundaries (no silent monitoring)

📚 What We Learned

  • Wellbeing is a systems problem, not a reminder problem
  • Multimodal AI becomes powerful when modes collaborate, not coexist
  • Executives don’t want more data — they want clarity under pressure

Most importantly:

The future of AI isn’t reactive assistance.
It’s continuous, context-aware optimization of human potential.


🚀 Why OMNIGEN Matters

OMNIGEN isn’t just another productivity app.
It’s an adaptive cognitive exosystem — one that:

  • Protects your body
  • Sharpens your mind
  • Filters the noise of the world
  • And meets you where you are, in real time

One agent. Total awareness. Human-first AI.

