OMNIGEN — Project Story

An Adaptive Multimodal AI Agent for Workforce Wellbeing & Performance
Hackathon Writeup · Dec 12, 2025


🌍 Inspiration

Modern executives and knowledge workers live inside fragmented systems:

  • one app for productivity,
  • another for health,
  • another for research,
  • and endless feeds for “staying informed.”

The result isn’t leverage — it’s cognitive overload.

OMNIGEN was inspired by a simple question:

What if an AI didn’t just answer questions, but actively optimized the human operating system in real time?

Instead of building another chatbot, the goal was to create a multimodal executive co-pilot — something that watches posture, understands context, tracks the global environment, supports focus, and intervenes before burnout or distraction takes hold.

The vision:
One dashboard. One agent. Total situational awareness.


🧠 What We Built

OMNIGEN is a proactive, adaptive AI system that combines vision, audio, reasoning, search, and code execution into a single wellbeing-and-performance engine.

At a high level:

\[ \text{Human State} + \text{Environmental Context} + \text{Global Signals} \xrightarrow{\text{Gemini Reasoning}} \text{Actionable Insight} \]

Core Capabilities

1. OMNI-SENSE — Zero-Latency Biometric Telemetry

  • Uses MediaPipe GPU-accelerated vision at 60 FPS
  • Tracks face distance, head tilt, and posture in real time
  • Generates an “exoskeleton” overlay for instant ergonomic feedback

Key insight: latency matters. Feedback delayed by even 1–2 seconds breaks habit correction loops.
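To keep that loop fast, the derived posture signals can be computed as pure math on the landmark coordinates MediaPipe emits each frame. The sketch below shows one way to do it; the eye-line heuristic and both thresholds are illustrative assumptions, not OMNIGEN's exact values.

```typescript
// Sketch: deriving posture signals from normalized face landmarks.
// The heuristics and thresholds are illustrative assumptions.

interface Landmark {
  x: number; // normalized [0, 1] image coordinates
  y: number;
}

// Head tilt: angle of the eye line relative to horizontal, in degrees.
function headTiltDeg(leftEye: Landmark, rightEye: Landmark): number {
  const dy = rightEye.y - leftEye.y;
  const dx = rightEye.x - leftEye.x;
  return (Math.atan2(dy, dx) * 180) / Math.PI;
}

// Face-distance proxy: inter-eye distance in normalized image space.
// A larger value means the face is closer to the camera.
function faceProximity(leftEye: Landmark, rightEye: Landmark): number {
  return Math.hypot(rightEye.x - leftEye.x, rightEye.y - leftEye.y);
}

// Simple ergonomic verdict used to drive the overlay color.
function postureStatus(tiltDeg: number, proximity: number): "ok" | "warn" {
  const TILT_LIMIT = 10;        // degrees (illustrative threshold)
  const PROXIMITY_LIMIT = 0.18; // normalized units (illustrative)
  return Math.abs(tiltDeg) > TILT_LIMIT || proximity > PROXIMITY_LIMIT
    ? "warn"
    : "ok";
}
```

Because these are a few arithmetic operations per frame, they add effectively nothing on top of MediaPipe's own inference time.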


2. Strategic Executive Dashboard

  • Gemini 3 Pro synthesizes live news streams into a Global Pulse Vector
  • Radar chart dimensions:
    • Volatility
    • Innovation
    • Risk
    • Sentiment

Instead of doom-scrolling, users get a compressed macro snapshot in seconds.
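One way to produce that snapshot is to have the model score each article on the four radar dimensions and then collapse the scores client-side. The sketch below shows only that aggregation step; the field names and 0–100 scale are assumptions for illustration, and the per-article scoring itself would come from a Gemini structured-output call.

```typescript
// Sketch: collapsing per-article scores (assumed to come back from a
// Gemini 3 Pro structured-output call) into one Global Pulse Vector.
// Field names and the 0-100 scale are illustrative assumptions.

interface ArticleScores {
  volatility: number; // each dimension scored 0-100 by the model
  innovation: number;
  risk: number;
  sentiment: number;
}

type PulseVector = ArticleScores;

function globalPulse(articles: ArticleScores[]): PulseVector {
  if (articles.length === 0) throw new Error("no scored articles");
  const clamp = (v: number) => Math.min(100, Math.max(0, v));
  const mean = (pick: (a: ArticleScores) => number) =>
    clamp(articles.reduce((sum, a) => sum + pick(a), 0) / articles.length);
  return {
    volatility: mean((a) => a.volatility),
    innovation: mean((a) => a.innovation),
    risk: mean((a) => a.risk),
    sentiment: mean((a) => a.sentiment),
  };
}
```

An object of this shape maps directly onto a radar-chart data series, so the vector can be fed to the dashboard without further transformation.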


3. OmniChat — Research & Code Engine

  • Multimodal Gemini 2.5 / 3 Pro
  • Google Search grounding for factual accuracy
  • Python & JavaScript execution for live charts and analysis

This transforms the chat from “assistant” into a research analyst + junior developer hybrid.


4. FocusVision Pro — Workspace Intelligence

  • Upload a desk photo
  • Gemini Vision analyzes:
    • monitor height
    • lighting
    • ergonomics
    • cable clutter

Actionable recommendations are generated instantly, turning static advice into situational coaching.


5. MoodMap GPS — Vibe-Based Navigation

  • Search places by feeling, not category
  • “Cozy,” “high-energy,” “quiet focus,” etc.
  • Combines Gemini logic with Google Maps grounding

This reframes location search as emotional optimization.


6. Neural Relaxation — AI Audio Comedy

  • Gemini TTS with PCM streaming
  • Context-aware jokes, spoken in a human-like neural voice

A surprisingly powerful stress reset during high-intensity work.
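Streaming PCM into the browser means converting the raw 16-bit samples into the Float32 range the Web Audio API expects. A minimal sketch of that conversion, assuming a little-endian 16-bit stream (the 24 kHz rate is also an assumption about the format):

```typescript
// Sketch: converting raw 16-bit PCM (as a TTS stream delivers it) into
// the Float32 samples the Web Audio API expects. The 24 kHz sample
// rate is an assumption about the stream format.

const PCM_SAMPLE_RATE = 24_000;

function pcm16ToFloat32(bytes: Uint8Array): Float32Array {
  // Interpret the byte stream as little-endian signed 16-bit samples.
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const out = new Float32Array(bytes.byteLength / 2);
  for (let i = 0; i < out.length; i++) {
    // Scale from [-32768, 32767] to roughly [-1, 1).
    out[i] = view.getInt16(i * 2, true) / 32768;
  }
  return out;
}
```

The resulting array can be copied into an `AudioBuffer` channel created at `PCM_SAMPLE_RATE`, which is what lets the voice start playing before the full clip has arrived.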


🛠️ How We Built It

Tech Stack

  • Core Intelligence:
    • Gemini 3 Pro (reasoning & synthesis)
    • Gemini 2.5 Flash (low-latency interactions)
  • Vision:
    • MediaPipe Tasks Vision (WASM + GPU acceleration)
  • Frontend:
    • React 19
    • TypeScript
    • Vite
  • UI / UX:
    • Tailwind CSS
    • Glassmorphism + Cyberpunk aesthetic
  • Data Visualization:
    • Recharts (responsive SVG)

Architecture Philosophy

  • Multimodal-first: vision, audio, text are peers — not add-ons
  • Low-latency feedback loops for health-related interventions
  • Composable agents, not monolithic prompts

Each OMNI module operates independently but shares context through a unified reasoning layer.


⚠️ Challenges We Faced

1. Latency vs. Intelligence

High-level reasoning models are powerful but slow.
We solved this by:

  • Using Gemini Flash for real-time feedback
  • Escalating to Gemini 3 Pro only when deep synthesis was needed
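The escalation rule above can be sketched as a small router that classifies each task before picking a model. The task taxonomy and model ID strings here are illustrative assumptions, not OMNIGEN's exact routing table.

```typescript
// Sketch of the escalation rule: real-time feedback goes to Flash,
// deep synthesis goes to Pro. Task kinds and model ID strings are
// illustrative assumptions.

type Task =
  | { kind: "posture-feedback" }
  | { kind: "chat-turn"; needsGrounding: boolean }
  | { kind: "news-synthesis" };

function pickModel(task: Task): string {
  switch (task.kind) {
    case "posture-feedback":
      return "gemini-2.5-flash"; // latency-critical feedback loop
    case "chat-turn":
      // Grounded research answers justify the slower, deeper model.
      return task.needsGrounding ? "gemini-3-pro" : "gemini-2.5-flash";
    case "news-synthesis":
      return "gemini-3-pro"; // multi-source synthesis, latency-tolerant
  }
}
```

Keeping the routing decision in one pure function also makes the latency/quality trade-off easy to audit and tune.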

2. Signal Overload

Combining biometrics, news, mood, and environment risked overwhelming users.

Solution:

  • Aggressive abstraction
  • Visual compression (radar charts, overlays)
  • “Intervene only when it matters” philosophy
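The "intervene only when it matters" rule can be made concrete as a gate that fires only after a bad state has persisted, and then enforces a cooldown before nagging again. The 5 s persistence and 60 s cooldown below are illustrative values, not OMNIGEN's tuned ones.

```typescript
// Sketch of the "intervene only when it matters" rule: an alert fires
// only after the bad state has persisted, then goes quiet for a
// cooldown period. Both durations are illustrative assumptions.

class InterventionGate {
  private badSince: number | null = null;
  private lastFired = -Infinity;

  constructor(
    private persistMs = 5_000,   // bad state must last this long
    private cooldownMs = 60_000, // minimum gap between alerts
  ) {}

  // Call once per frame; returns true when an intervention should fire.
  update(isBad: boolean, nowMs: number): boolean {
    if (!isBad) {
      this.badSince = null; // good posture resets the timer
      return false;
    }
    this.badSince ??= nowMs;
    const persisted = nowMs - this.badSince >= this.persistMs;
    const cooledDown = nowMs - this.lastFired >= this.cooldownMs;
    if (persisted && cooledDown) {
      this.lastFired = nowMs;
      return true;
    }
    return false;
  }
}
```

Feeding the per-frame posture verdict through a gate like this is what turns raw telemetry into occasional, well-timed nudges instead of a constant stream of alerts.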

3. Human Trust

Anything that watches posture or environment must feel helpful — not invasive.

We focused on:

  • Transparency
  • Immediate, visible benefits
  • Clear boundaries (no silent monitoring)

📚 What We Learned

  • Wellbeing is a systems problem, not a reminder problem
  • Multimodal AI becomes powerful when modes collaborate, not coexist
  • Executives don’t want more data — they want clarity under pressure

Most importantly:

The future of AI isn’t reactive assistance.
It’s continuous, context-aware optimization of human potential.


🚀 Why OMNIGEN Matters

OMNIGEN isn’t just another productivity app.
It’s an adaptive cognitive exosystem — one that:

  • Protects your body
  • Sharpens your mind
  • Filters the noise of the world
  • And meets you where you are, in real time

One agent. Total awareness. Human-first AI.

