Inspiration
Staying focused is hard — not because we lack discipline, but because modern work constantly pulls our attention across tabs, apps, and screens. Most focus tools only work if you keep looking at them. The moment you stop, they stop helping. We wanted to explore a different idea: what if focus felt like companionship instead of control?
What it does
You start by chatting with the agent to share what you want to work on and for how long. While you work across different tabs and screens, the companion quietly observes context and checks in only when it matters. It can:
- Notice when you’re working on the wrong task
- Recognize when you’re back on track
- Gently detect prolonged distractions
- Let you pause, switch, or stop sessions through conversation
How we built it
StayWithMe is built as a web-based experience powered by Gemini 3.
- Gemini 3 (text) handles conversation, intention setting, session management, and gentle reminders.
- Gemini 3 (vision) is used selectively to analyze screen context and reason about whether the current activity aligns with the user’s stated goal.
- A lightweight agent loop coordinates intent, session state, and tool calls.
- A picture-in-picture desktop companion provides ambient presence and subtle feedback without requiring the user to stay on the app page.
We carefully limit when vision analysis runs to keep the experience respectful and efficient.
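As a rough illustration of how an agent loop might limit when vision analysis runs, the sketch below gates screen checks on session state and a minimum interval. Every name here (`SessionState`, `VisionPolicy`, `shouldRunVision`, `recordVisionCheck`) and the interval value are assumptions made for this example, not the project's actual code.

```typescript
// Hypothetical sketch: none of these names come from the StayWithMe source.
type SessionStatus = "idle" | "active" | "paused";

interface SessionState {
  goal: string;                     // what the user said they want to work on
  status: SessionStatus;
  lastVisionCheckMs: number | null; // timestamp of the last screen analysis
}

interface VisionPolicy {
  minIntervalMs: number; // assumed minimum gap between screen analyses
}

// Decide whether the loop may run a vision call right now.
function shouldRunVision(
  state: SessionState,
  nowMs: number,
  policy: VisionPolicy,
): boolean {
  // Never look at the screen unless a focus session is running.
  if (state.status !== "active") return false;
  // The first check of a session is always allowed.
  if (state.lastVisionCheckMs === null) return true;
  // Otherwise wait until the minimum interval has passed.
  return nowMs - state.lastVisionCheckMs >= policy.minIntervalMs;
}

// Return a new state that remembers when the last check happened.
function recordVisionCheck(state: SessionState, nowMs: number): SessionState {
  return { ...state, lastVisionCheckMs: nowMs };
}
```

In a loop like this, the caller would test `shouldRunVision(state, Date.now(), policy)` before each screen capture and call `recordVisionCheck` afterward. Passing the timestamp in as a parameter keeps the gate easy to test.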
Challenges we ran into
- Balancing awareness and intrusiveness: Knowing when to check in without annoying the user required careful timing and throttling.
- Limited system access in the browser: We had to creatively infer context using screen sharing and visual cues.
- Model availability and quota constraints: We designed fallback behaviors and manual controls so the system remains usable even when AI services are temporarily unavailable.
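One way the timing and fallback ideas above could fit together is sketched here: a check-in fires only after several consecutive off-task observations and a cooldown, and a failed model call degrades to an "unknown" result instead of breaking the session. The names (`Alignment`, `nextCheckIn`, `classifyWithFallback`) and both thresholds are illustrative assumptions, not the team's implementation.

```typescript
// Hypothetical sketch: names and thresholds are assumptions for illustration.
type Alignment = "on_task" | "off_task" | "unknown";

interface CheckInState {
  offTaskStreak: number;        // consecutive off-task observations
  lastCheckInMs: number | null; // when the companion last spoke up
}

interface CheckInOptions {
  streakThreshold: number; // off-task observations required before nudging
  cooldownMs: number;      // minimum quiet time between nudges
}

function nextCheckIn(
  state: CheckInState,
  alignment: Alignment,
  nowMs: number,
  opts: CheckInOptions,
): { state: CheckInState; checkIn: boolean } {
  // No signal (for example, the model was unavailable): change nothing.
  if (alignment === "unknown") return { state, checkIn: false };
  // Back on track: clear the streak and stay quiet.
  if (alignment === "on_task") {
    return { state: { ...state, offTaskStreak: 0 }, checkIn: false };
  }
  const streak = state.offTaskStreak + 1;
  const cooledDown =
    state.lastCheckInMs === null ||
    nowMs - state.lastCheckInMs >= opts.cooldownMs;
  if (streak >= opts.streakThreshold && cooledDown) {
    return { state: { offTaskStreak: 0, lastCheckInMs: nowMs }, checkIn: true };
  }
  return { state: { ...state, offTaskStreak: streak }, checkIn: false };
}

// Wrap any classifier call so an outage or quota error yields "unknown".
async function classifyWithFallback(
  classify: () => Promise<Alignment>,
): Promise<{ alignment: Alignment; degraded: boolean }> {
  try {
    return { alignment: await classify(), degraded: false };
  } catch {
    return { alignment: "unknown", degraded: true };
  }
}
```

When `degraded` is true, the interface could surface manual pause and stop controls while `nextCheckIn` simply leaves its state untouched.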
Accomplishments that we're proud of
- Building a working context-aware focus agent, not just a timer
- Designing a companion that reacts gently instead of aggressively
- Integrating multimodal reasoning in a way that feels human-centered
- Creating a demo that shows real understanding, not scripted behavior
What we learned
Focus is emotional as much as technical. Multimodal AI is powerful, but restraint matters more than frequency. Small design choices — tone, timing, silence — shape user trust. AI systems feel more helpful when they stay with you instead of supervising you.
What's next for StayWithMe
- A native desktop app for deeper system awareness
- Personalization over time based on user habits
- Richer emotional expression for the companion
- Support for creative work, studying, and deep reading
- Exploring long-term “focus memory” and reflection
Built With
- css
- gemini
- html
- javascript
- picture-in-picture
- react
- typescript
- using-google-gemini-3-(text-+-vision)-via-google-ai-studio
- vite