Oki-Doki is a context-aware AI companion designed to support users in their daily lives by understanding behavior, detecting emotional signals, and providing real-time, meaningful assistance.

Unlike traditional apps that require users to actively seek help, Oki-Doki works continuously in the background—observing patterns, interpreting context, and offering proactive support through conversation and actionable nudges. This makes the system more natural, accessible, and impactful in everyday situations.

At the heart of the system is the Gemini API (Project Number: 742990808149), which serves as the primary intelligence layer of the entire project. Gemini is responsible for reasoning, interpreting user context, and generating meaningful, human-like responses. It enables Oki-Doki to move beyond simple automation into a system that can understand, decide, and act intelligently.

The system integrates AI, real-time processing, and IoT to create a seamless experience across devices. By combining behavioral AI (Me-Do framework) with modern technologies, Oki-Doki bridges the gap between understanding and action—helping users improve their well-being, productivity, and daily habits.

Overall, the project tackles a real-world problem: the lack of continuous, accessible, and proactive support in people’s everyday lives. The result is a solution that aims to be both technically robust and socially impactful.

GitHub: https://github.com/AnshumanAtrey/oki-doki
Video demo: https://drive.google.com/file/d/1n9O97OijyY3VhWEWojuTXZnZ2BTYCbyY/view

Updates

Event vocabulary (14 types, 4 routing tiers)

| User action | Event | Route | Result |
|---|---|---|---|
| ⌘⇧Space | HOTKEY | BIG_LLM | Vision + mood reply + animation |
| ⌘⌃F | MEAL_LOG | BIG_LLM | Nutrition extraction + photo upload + meals row |
| ⌘⌃R | MEDICAL_REPORT_LOG | BIG_LLM | OCR + meds/conditions extracted, auto-merged into profile |
| ⌘⌃D | DOCTOR_SUMMARY | BIG_LLM | 1-page markdown summary → /tmp + clipboard |
| "Hey Shelly, …" | SPEECH → wake | BIG_LLM | Conversational reply (60s follow-up window) |
| "chest pain" | SPEECH → EMERGENCY | BIG_LLM | Urgent reply + severity='urgent' observation, bypasses budget |
| Ambient talk | SPEECH | LOG | Fed into passive tracker |
| 60s speech + 30s silence | CONVERSATION_ENDED | BIG_LLM | Auto-summary + durable memory extraction |
| Med schedule hits | MED_REMINDER | CHEAP_LLM | Profile-driven medication nudge |
| First activity ≥6am | MORNING | CHEAP_LLM | Good-morning line |
| 9pm first tick | DAILY_SUMMARY | BIG_LLM | Day rollup |
| Random 30–90m active | CHECK_IN | CHEAP_LLM | Proactive warm nudge, 5/day cap |
| Idle >10m → active | ACTIVE (with was_idle_seconds) | CHEAP_LLM | Welcome back |
| >45m social media | APP_DURATION | CHEAP_LLM | Doomscroll nudge |
| Night idle ≥4h | (observation) | — | Logged as inferred sleep |
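The four routing tiers above (SKIP, LOG, CHEAP_LLM, BIG_LLM) amount to a small decision function. A minimal sketch follows; the event names come from the table, but the function name, set names, and the `speech_kind` parameter are illustrative assumptions, not the project's actual identifiers:

```python
from __future__ import annotations

# Event sets per the routing table above (names of these sets are hypothetical).
BIG_LLM_EVENTS = {
    "HOTKEY", "MEAL_LOG", "MEDICAL_REPORT_LOG", "DOCTOR_SUMMARY",
    "CONVERSATION_ENDED", "DAILY_SUMMARY",
}
CHEAP_LLM_EVENTS = {
    "MED_REMINDER", "MORNING", "CHECK_IN", "ACTIVE", "APP_DURATION",
}

def decide(event_type: str, speech_kind: str | None = None) -> str:
    """Map an event to one of the four routing tiers."""
    if event_type == "SPEECH":
        # Wake word and emergency phrases escalate to the big model;
        # ambient talk is only fed to the passive tracker.
        if speech_kind in ("wake", "EMERGENCY"):
            return "BIG_LLM"
        return "LOG"
    if event_type in BIG_LLM_EVENTS:
        return "BIG_LLM"
    if event_type in CHEAP_LLM_EVENTS:
        return "CHEAP_LLM"
    return "SKIP"
```

Keeping the decision a pure function of the event makes the tiers easy to unit-test and to extend with new event types.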

Rate limit: OMI_LLM_BUDGET_PER_HOUR (default 10). Emergencies bypass it.
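The hourly budget with an emergency bypass can be sketched as a rolling-window counter. Class and method names here are assumptions for illustration; only the per-hour default of 10 and the bypass behavior come from the write-up:

```python
from __future__ import annotations

import time
from collections import deque

class HourlyBudget:
    """Rolling one-hour cap on LLM calls; emergencies are exempt."""

    def __init__(self, per_hour: int = 10):  # default mirrors OMI_LLM_BUDGET_PER_HOUR
        self.per_hour = per_hour
        self._calls: deque[float] = deque()  # timestamps of recent allowed calls

    def allow(self, emergency: bool = False, now: float | None = None) -> bool:
        if emergency:
            return True  # emergencies bypass the budget entirely
        now = time.monotonic() if now is None else now
        # Drop timestamps that fell out of the one-hour window.
        while self._calls and now - self._calls[0] >= 3600:
            self._calls.popleft()
        if len(self._calls) >= self.per_hour:
            return False
        self._calls.append(now)
        return True
```

A rolling window avoids the burst-at-the-boundary problem of resetting a counter on the hour.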

Architecture

┌── On the recipient's Mac ──────────────────────────────────────┐
│                                                                │
│  Mic → WebRTC VAD → faster-whisper (local)                     │
│                                     │                          │
│  Camera (on-demand) ───┐            │                          │
│  Screen/app sampler ───┤            │                          │
│  HID idle detector ────┤            │                          │
│  Hotkeys ⌘⇧Space,⌘⌃F,R,D├─────▶ Event Bus ─▶ Scheduler         │
│  Cron (morning / 9pm /  │                 │                    │
│       random check-ins /│       ┌─────────┴─────────┐          │
│       med reminders) ───┘       │ rules.decide:     │          │
│  Conversation tracker ──────────▶ SKIP/LOG/         │          │
│  (passive capture)              │ CHEAP_LLM/BIG_LLM │          │
│                                  └─────────┬─────────┘         │
│                                            │                   │
│                  ┌─────────────────────────▼──────────────┐    │
│                  │ Gemini 2.5 pro/flash · structured JSON │    │
│                  │   ↑ profile + permanent memories +     │    │
│                  │     semantic-relevant memories injected│    │
│                  │     into every system prompt           │    │
│                  └─────────────────────────┬──────────────┘    │
│                                            │                   │
│            ┌───────────────────────────────┼──────────┐        │
│            ▼                               ▼          ▼        │
│   TTS (pyttsx3 / NSSpeech)         JSON over USB   SQLite      │
│   → Mac speaker                    → ESP32 firmware  + pgvec   │
│                                      (OLED eyes +   embeddings │
│                                       2 flippers)      │       │
└─────────────────────────────────────────────────────┼──────────┘
                                                      │
                                    (every 60s async) │
                                                      ▼
              ┌── Supabase (shared source of truth) ────────────┐
              │ users.profile (JSONB)                            │
              │ memories + vector(1536) + RPC match_memories()   │
              │ meals · observations · messages · activity       │
              │ Storage: photos bucket (food + reports)          │
              └──────────────────────────────────────────────────┘
                                      │
                       ┌──────────────┴──────────────┐
                       ▼                             ▼
             Gifter web (Next.js)           Emergency / daily-summary
             · daily feed                   · FCM push (TODO post-hack)
             · attention feed
             · medical history
             · meal log
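The diagram's memory path (pgvector embeddings queried through a Supabase RPC named `match_memories`) could be called roughly as below. Only the RPC name and the `vector(1536)` dimension come from the diagram; the parameter names, threshold, and helper functions are assumptions following the common pgvector similarity-search pattern:

```python
def build_match_params(embedding: list[float], count: int = 5,
                       threshold: float = 0.75) -> dict:
    """Payload for the similarity RPC (parameter names are hypothetical)."""
    assert len(embedding) == 1536, "diagram specifies vector(1536)"
    return {
        "query_embedding": embedding,
        "match_count": count,
        "match_threshold": threshold,
    }

def fetch_relevant_memories(url: str, key: str, embedding: list[float]):
    # Imported lazily so the module loads without supabase-py installed.
    from supabase import create_client

    client = create_client(url, key)
    # Postgres function call via supabase-py; rows come back ordered by
    # similarity when the RPC follows the usual pgvector pattern.
    return client.rpc("match_memories", build_match_params(embedding)).execute().data
```

The returned rows would be the "semantic-relevant memories" the diagram shows being injected into every system prompt.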
