OMNIGEN — Project Story
An Adaptive Multimodal AI Agent for Workforce Wellbeing & Performance
Hackathon Writeup · Dec 12, 2025
🌍 Inspiration
Modern executives and knowledge workers live inside fragmented systems:
- one app for productivity,
- another for health,
- another for research,
- and endless feeds for “staying informed.”
The result isn’t leverage — it’s cognitive overload.
OMNIGEN was inspired by a simple question:
What if an AI didn’t just answer questions, but actively optimized the human operating system in real time?
Instead of building another chatbot, the goal was to create a multimodal executive co-pilot — something that watches posture, understands context, tracks the global environment, supports focus, and intervenes before burnout or distraction takes hold.
The vision:
One dashboard. One agent. Total situational awareness.
🧠 What We Built
OMNIGEN is a proactive, adaptive AI system that combines vision, audio, reasoning, search, and code execution into a single wellbeing-and-performance engine.
At a high level:
Human State + Environmental Context + Global Signals → (Gemini Reasoning) → Actionable Insight
Core Capabilities
1. OMNI-SENSE — Zero-Latency Biometric Telemetry
- Uses MediaPipe GPU-accelerated vision at 60 FPS
- Tracks face distance, head tilt, and posture in real time
- Generates an “exoskeleton” overlay for instant ergonomic feedback
Key insight: latency matters. Feedback delayed by even 1–2 seconds breaks habit correction loops.
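In spirit, the post-processing on MediaPipe's face landmarks looks like this. This is an illustrative sketch, not the production code: the landmark shape is simplified, and the distance calibration is an assumption.

```typescript
// Simplified landmark: MediaPipe Tasks Vision emits normalized (0..1)
// image coordinates per facial landmark.
interface Landmark { x: number; y: number; }

// Head roll (tilt) in degrees, from the line connecting the two eyes.
// A perfectly level head yields 0.
function headTiltDegrees(leftEye: Landmark, rightEye: Landmark): number {
  const dy = rightEye.y - leftEye.y;
  const dx = rightEye.x - leftEye.x;
  return (Math.atan2(dy, dx) * 180) / Math.PI;
}

// Rough face-distance proxy: the inter-ocular distance in normalized
// image units shrinks as the user moves away from the camera. The
// inversion here is a stand-in for a real calibration, not a measurement.
function faceDistanceProxy(leftEye: Landmark, rightEye: Landmark): number {
  const d = Math.hypot(rightEye.x - leftEye.x, rightEye.y - leftEye.y);
  return 1 / Math.max(d, 1e-6); // larger value => further from the screen
}

const left = { x: 0.4, y: 0.5 };
const right = { x: 0.6, y: 0.5 };
console.log(headTiltDegrees(left, right)); // 0 for level eyes
```

Because these are a few multiplications per frame, the feedback overlay can redraw at camera rate; the model inference, not the math, is the latency budget.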
2. Strategic Executive Dashboard
- Gemini 3 Pro synthesizes live news streams into a Global Pulse Vector
- Radar chart dimensions:
- Volatility
- Innovation
- Risk
- Sentiment
Instead of doom-scrolling, users get a compressed macro snapshot in seconds.
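The aggregation behind the radar chart can be sketched as follows. The per-article schema is an assumption about what Gemini is asked to emit; only the four dimensions are from the writeup.

```typescript
// Assumed shape of Gemini's per-article assessment (each axis 0..100).
interface ArticleSignal {
  volatility: number;
  innovation: number;
  risk: number;
  sentiment: number;
}

type PulseVector = ArticleSignal;

// Collapse many article-level signals into one Global Pulse Vector by
// averaging each dimension and clamping to the 0..100 radar scale.
function globalPulse(signals: ArticleSignal[]): PulseVector {
  const zero: PulseVector = { volatility: 0, innovation: 0, risk: 0, sentiment: 0 };
  if (signals.length === 0) return zero;
  const sum = signals.reduce((acc, s) => ({
    volatility: acc.volatility + s.volatility,
    innovation: acc.innovation + s.innovation,
    risk: acc.risk + s.risk,
    sentiment: acc.sentiment + s.sentiment,
  }), zero);
  const clamp = (v: number) => Math.min(100, Math.max(0, v / signals.length));
  return {
    volatility: clamp(sum.volatility),
    innovation: clamp(sum.innovation),
    risk: clamp(sum.risk),
    sentiment: clamp(sum.sentiment),
  };
}
```

A vector like this maps directly onto a Recharts `RadarChart` with four axes, which is what makes the "macro snapshot in seconds" framing work.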
3. OmniChat — Research & Code Engine
- Multimodal Gemini 2.5 / 3 Pro
- Google Search grounding for factual accuracy
- Python & JavaScript execution for live charts and analysis
This transforms the chat from “assistant” into a research analyst + junior developer hybrid.
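Wiring both capabilities into one request looks roughly like this. The payload shape follows our understanding of the `@google/genai` SDK's tool configuration and should be treated as an assumption; the model names are from the stack listed below.

```typescript
// Assumed request shape: one Gemini call with Google Search grounding
// and server-side code execution both enabled as tools.
const omniChatRequest = {
  model: "gemini-2.5-flash", // escalate to a Pro model for deep synthesis
  contents: "Chart the 5-year revenue trend and cite your sources.",
  config: {
    tools: [
      { googleSearch: {} },  // factual grounding via live search
      { codeExecution: {} }, // sandboxed Python for charts and analysis
    ],
  },
};
console.log(omniChatRequest.config.tools.length); // 2
```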
4. FocusVision Pro — Workspace Intelligence
- Upload a desk photo
- Gemini Vision analyzes:
- monitor height
- lighting
- ergonomics
- cable clutter
Actionable recommendations are generated instantly, turning static advice into situational coaching.
5. MoodMap GPS — Vibe-Based Navigation
- Search places by feeling, not category
- “Cozy,” “high-energy,” “quiet focus,” etc.
- Combines Gemini logic with Google Maps grounding
This reframes location search as emotional optimization.
6. Neural Relaxation — AI Audio Comedy
- Gemini TTS with PCM streaming
- Context-aware jokes, spoken in a human-like neural voice
A surprisingly powerful stress reset during high-intensity work.
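Streaming playback hinges on one conversion: Gemini TTS emits raw 16-bit PCM, while the Web Audio API consumes Float32 samples in [-1, 1]. A minimal sketch of that step (function name is ours):

```typescript
// Convert a chunk of little-endian signed 16-bit PCM into Float32
// samples suitable for an AudioBuffer channel.
function pcm16ToFloat32(bytes: Uint8Array): Float32Array {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const out = new Float32Array(bytes.byteLength / 2);
  for (let i = 0; i < out.length; i++) {
    // Scale the signed 16-bit sample into [-1, 1).
    out[i] = view.getInt16(i * 2, true) / 32768;
  }
  return out;
}

// Example: two samples, 0 and -32768 (full-scale negative).
const chunk = new Uint8Array([0x00, 0x00, 0x00, 0x80]);
console.log(Array.from(pcm16ToFloat32(chunk))); // [0, -1]
```

Doing this per chunk, rather than waiting for the full clip, is what lets the joke start playing while the model is still speaking.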
🛠️ How We Built It
Tech Stack
Core Intelligence:
- Gemini 3 Pro (reasoning & synthesis)
- Gemini 2.5 Flash (low-latency interactions)
Vision:
- MediaPipe Tasks Vision (WASM + GPU acceleration)
Frontend:
- React 19
- TypeScript
- Vite
UI / UX:
- Tailwind CSS
- Glassmorphism + Cyberpunk aesthetic
Data Visualization:
- Recharts (responsive SVG)
Architecture Philosophy
- Multimodal-first: vision, audio, text are peers — not add-ons
- Low-latency feedback loops for health-related interventions
- Composable agents, not monolithic prompts
Each OMNI module operates independently but shares context through a unified reasoning layer.
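One way to picture that shared layer (the names here are illustrative, not the project's actual API): each module publishes typed facts to a bus, and any module snapshots the latest state before building its prompt.

```typescript
// Sketch of a unified context layer: modules publish facts; any module
// reads the merged snapshot when it calls the model.
type ContextFact = { source: string; key: string; value: unknown; at: number };

class ContextBus {
  private facts = new Map<string, ContextFact>();

  publish(source: string, key: string, value: unknown): void {
    this.facts.set(key, { source, key, value, at: Date.now() });
  }

  // Merged snapshot consumed by a module when it builds a prompt.
  snapshot(): Record<string, unknown> {
    const out: Record<string, unknown> = {};
    for (const f of this.facts.values()) out[f.key] = f.value;
    return out;
  }
}

const bus = new ContextBus();
bus.publish("omni-sense", "postureScore", 0.82);
bus.publish("pulse", "volatility", 61);
console.log(bus.snapshot()); // { postureScore: 0.82, volatility: 61 }
```

The payoff of this decoupling is that a module like MoodMap can factor in posture or stress signals without ever importing the vision code.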
⚠️ Challenges We Faced
1. Latency vs. Intelligence
High-level reasoning models are powerful but slow.
We solved this by:
- Using Gemini Flash for real-time feedback
- Escalating to Gemini 3 Pro only when deep synthesis was needed
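The escalation policy reduces to a small routing function. This is a hedged sketch: how a task gets classified is an assumption, and the model identifiers simply mirror the stack named in this writeup.

```typescript
// Route real-time feedback to the low-latency Flash model; escalate
// only deep-synthesis work to the slower, stronger Pro model.
type Task = { needsDeepSynthesis: boolean };

function routeModel(task: Task): string {
  return task.needsDeepSynthesis ? "gemini-3-pro" : "gemini-2.5-flash";
}

console.log(routeModel({ needsDeepSynthesis: false })); // gemini-2.5-flash
```

Keeping the router this dumb is deliberate: the classification happens upstream, so the latency-critical path never waits on a second model call to decide which model to use.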
2. Signal Overload
Combining biometrics, news, mood, and environment risked overwhelming users.
Solution:
- Aggressive abstraction
- Visual compression (radar charts, overlays)
- “Intervene only when it matters” philosophy
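The "intervene only when it matters" philosophy can be made concrete with a small gate: fire only when a signal stays bad for a sustained window, and respect a cooldown so the user is never nagged. The thresholds below are illustrative assumptions.

```typescript
// Intervention gate: returns true only when `score` has been below
// `threshold` for at least `sustainMs`, and at most once per `cooldownMs`.
function makeInterventionGate(threshold: number, sustainMs: number, cooldownMs: number) {
  let badSince: number | null = null;
  let lastFired = -Infinity;
  return (score: number, now: number): boolean => {
    if (score >= threshold) { badSince = null; return false; } // healthy: reset
    if (badSince === null) badSince = now;                     // just went bad
    const sustained = now - badSince >= sustainMs;
    const cooled = now - lastFired >= cooldownMs;
    if (sustained && cooled) { lastFired = now; badSince = null; return true; }
    return false;
  };
}

// Example: posture must be bad for 2s, with a 60s cooldown between nudges.
const gate = makeInterventionGate(0.5, 2000, 60000);
console.log(gate(0.3, 0));    // false: just went bad
console.log(gate(0.3, 2500)); // true: bad for >2s, cooldown satisfied
```

A momentary slouch while reaching for coffee never triggers; only a sustained pattern does, which is what keeps the system from feeling like a nag.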
3. Human Trust
Anything that watches posture or environment must feel helpful — not invasive.
We focused on:
- Transparency
- Immediate, visible benefits
- Clear boundaries (no silent monitoring)
📚 What We Learned
- Wellbeing is a systems problem, not a reminder problem
- Multimodal AI becomes powerful when modes collaborate, not coexist
- Executives don’t want more data — they want clarity under pressure
Most importantly:
The future of AI isn’t reactive assistance.
It’s continuous, context-aware optimization of human potential.
🚀 Why OMNIGEN Matters
OMNIGEN isn’t just another productivity app.
It’s an adaptive cognitive exosystem — one that:
- Protects your body
- Sharpens your mind
- Filters the noise of the world
- And meets you where you are, in real time
One agent. Total awareness. Human-first AI.
Built With
- gemini2.5
- gemini3
- react
- tailwindcss
- typescript
- vite