Inspiration
Many productivity tools require users to constantly log tasks, manage timers, or reflect manually at the end of the day. Over time, this friction often leads to burnout or abandonment.
We were inspired by a simple question:
What if your computer could quietly understand your day and help you reflect on it—without asking you to do anything extra?
Instead of optimizing for productivity scores, we wanted to design a system focused on reflection, care, and context. Mosaic explores how passive signals from everyday computer use can be transformed into a meaningful narrative of one’s day.
What it does
Mosaic is a backend system that turns raw desktop activity into a structured daily story.
It:
- Passively captures desktop screenshots at a fixed interval
- Automatically pauses capture when blacklisted apps or windows are visible
- Uses Gemini to interpret screenshots into human-readable activities
- Merges frames into a coherent, time-aware daily timeline
- Generates lightweight feedback events
- Produces a daily reflection report and a stylized “redraw of the day” image
All outputs are locally saved as stable artifacts (JSON + images), making Mosaic easy to extend with future web or desktop interfaces.
How we built it
Mosaic is implemented as a modular, end-to-end pipeline.
Capture Stage
The system periodically captures desktop screenshots and saves them with real timestamps.
To respect user boundaries, capture is automatically paused whenever a blacklisted application or window is detected.
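The pause rule can be illustrated with a small sketch. It assumes the capture loop already has a list of visible window titles (enumerating windows is platform-specific and is not shown), and the function name `should_capture` and the substring-matching rule are illustrative assumptions, not Mosaic's actual implementation.

```python
from typing import Iterable


def should_capture(visible_titles: Iterable[str], blacklist: Iterable[str]) -> bool:
    """Return False when any visible window title contains a blacklisted term.

    Hypothetical helper: matching is a case-insensitive substring test.
    """
    # Normalize the blacklist once and ignore empty entries.
    terms = [t.strip().lower() for t in blacklist if t.strip()]
    for title in visible_titles:
        lowered = title.lower()
        if any(term in lowered for term in terms):
            return False  # a blacklisted window is visible, so pause capture
    return True
```

A capture loop would call this check before each screenshot and skip the frame whenever it returns `False`.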
Timeline Analysis (Gemini-powered)
Captured screenshots are analyzed using Gemini’s multimodal reasoning capabilities to infer high-level activities.
Rather than labeling each frame independently, Mosaic:
- merges consecutive screenshots into timeline segments,
- smooths transitions using midpoint boundaries,
- and detects idle periods when screenshots are far apart and visually similar.
The result is a structured Timeline JSON representing how the day unfolded over time.
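The three merging rules above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the `Frame` and `Segment` types, the `fingerprint` field (standing in for some perceptual hash of the screenshot), and the thresholds `idle_gap` and `max_diff_bits` are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Frame:
    ts: float         # capture time in seconds
    label: str        # activity label inferred for this frame
    fingerprint: int  # hypothetical perceptual hash of the screenshot


@dataclass
class Segment:
    start: float
    end: float
    label: str


def build_timeline(frames: List[Frame], idle_gap: float = 600.0,
                   max_diff_bits: int = 4) -> List[Segment]:
    """Merge labeled frames into segments with midpoint boundaries and idle gaps."""
    if not frames:
        return []
    frames = sorted(frames, key=lambda f: f.ts)
    segments = [Segment(frames[0].ts, frames[0].ts, frames[0].label)]
    for prev, cur in zip(frames, frames[1:]):
        gap = cur.ts - prev.ts
        # Frames count as visually similar when few hash bits differ.
        similar = bin(prev.fingerprint ^ cur.fingerprint).count("1") <= max_diff_bits
        last = segments[-1]
        if gap > idle_gap and similar:
            # Long gap with an unchanged screen: insert an idle segment.
            last.end = prev.ts
            segments.append(Segment(prev.ts, cur.ts, "idle"))
            segments.append(Segment(cur.ts, cur.ts, cur.label))
        elif cur.label == last.label:
            # Same activity continues: extend the current segment.
            last.end = cur.ts
        else:
            # Activity changed: place the boundary halfway between the frames.
            mid = prev.ts + gap / 2
            last.end = mid
            segments.append(Segment(mid, cur.ts, cur.label))
    return segments
```

Placing the boundary at the midpoint avoids attributing the whole unobserved interval to either activity, which matters when capture intervals are irregular.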
Behavior Feedback
From the timeline, Mosaic generates low-frequency feedback events (roughly hour-level).
These are designed to be supportive and reflective rather than corrective or intrusive.
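One way to keep feedback at roughly hour-level frequency is to bucket the timeline by hour and emit at most one event per bucket. The sketch below assumes segments are `(start, end, label)` tuples in seconds; the function name `hourly_feedback` and the event fields are hypothetical, and how each event is phrased for the user is left open.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def hourly_feedback(segments: List[Tuple[float, float, str]],
                    bucket: float = 3600.0) -> List[dict]:
    """Emit at most one event per time bucket, naming the dominant activity."""
    totals: Dict[int, Dict[str, float]] = defaultdict(lambda: defaultdict(float))
    for start, end, label in segments:
        t = start
        # Split segments that cross a bucket boundary.
        while t < end:
            index = int(t // bucket)
            chunk_end = min(end, (index + 1) * bucket)
            totals[index][label] += chunk_end - t
            t = chunk_end
    events = []
    for index in sorted(totals):
        label, seconds = max(totals[index].items(), key=lambda kv: kv[1])
        events.append({
            "bucket_start": index * bucket,
            "dominant": label,
            "minutes": round(seconds / 60),
        })
    return events
```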
Daily Report (Gemini-powered)
At the end of the day, Gemini is used again to synthesize:
- a reflective daily summary,
- a caring, human-readable message,
- and a stylized “redraw of the day” image capturing the overall vibe of the day.
System Architecture Overview
Mosaic follows a clear, stage-based architecture from passive capture to daily reflection.
Pipeline stages:
- Capture screenshots with blacklist-aware pausing
- Analyze activity with Gemini
- Merge frames into a time-aware timeline
- Generate lightweight feedback events
- Produce a daily report and visual redraw
- Store all outputs locally as stable artifacts (JSON + images)
Optionally, Mosaic can sync relevant information with Google Calendar and Tasks, integrating with existing workflows without becoming intrusive.
Challenges we ran into
Ambiguity of visual signals
A single screenshot rarely provides enough context, requiring careful temporal merging.
Maintaining time accuracy
Screenshot intervals can vary due to pauses and system conditions, so the pipeline needed to remain robust to irregular timing.
Balancing insight and intrusion
We intentionally avoided constant nudges or productivity pressure.
Structured output from LLMs
Ensuring reliable JSON from Gemini required careful prompt design and defensive parsing.
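As an illustration of what defensive parsing can look like, the sketch below extracts the outermost JSON object from a model reply that may be wrapped in code fences or surrounding chatter, then checks for required keys. The function name `parse_model_json` and the `activity` key are assumptions for the example, not the schema Mosaic uses.

```python
import json
from typing import Optional, Sequence


def parse_model_json(raw: str, required: Sequence[str] = ("activity",)) -> Optional[dict]:
    """Return the parsed JSON object from a model reply, or None if unusable."""
    text = raw.strip()
    # Slice from the first "{" to the last "}" to drop fences and extra prose.
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        data = json.loads(text[start:end + 1])
    except json.JSONDecodeError:
        return None
    # Reject replies that parse but do not match the expected shape.
    if not isinstance(data, dict) or any(key not in data for key in required):
        return None
    return data
```

Returning `None` instead of raising lets the pipeline skip or retry a single frame without interrupting the rest of the day's analysis.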
Accomplishments that we're proud of
- Built a complete multimodal pipeline within a hackathon timeframe
- Used Gemini for both reasoning and creative synthesis
- Designed the system around extensible artifacts instead of a fixed UI
- Created a reflection-focused experience rather than a productivity tracker
What we learned
- Passive systems are more sustainable when they minimize user friction
- Temporal reasoning is as important as classification accuracy
- Multimodal models become far more powerful when paired with thoughtful post-processing
- Designing for extensibility improves both clarity and robustness
What's next for Mosaic: Piece together your life
- A lightweight web or desktop UI built on top of the generated artifacts
- Personalized reflection styles and emotional tones
- Long-term trend analysis across multiple days
- Local-first or on-device inference for stronger privacy guarantees