Inspiration

Many productivity tools require users to constantly log tasks, manage timers, or reflect manually at the end of the day. Over time, this friction often leads to burnout or abandonment.

We were inspired by a simple question:

What if your computer could quietly understand your day and help you reflect on it—without asking you to do anything extra?

Instead of optimizing for productivity scores, we wanted to design a system focused on reflection, care, and context. Mosaic explores how passive signals from everyday computer use can be transformed into a meaningful narrative of one’s day.


What it does

Mosaic is a backend system that turns raw desktop activity into a structured daily story.

It:

  • Passively captures desktop screenshots at a fixed interval
  • Automatically pauses capture when blacklisted apps or windows are visible
  • Uses Gemini to interpret screenshots into human-readable activities
  • Merges frames into a coherent, time-aware daily timeline
  • Generates lightweight feedback events
  • Produces a daily reflection report and a stylized “redraw of the day” image

All outputs are saved locally as stable artifacts (JSON + images), making Mosaic easy to extend with future web or desktop interfaces.


How we built it

Mosaic is implemented as a modular, end-to-end pipeline.

Capture Stage

The system periodically captures desktop screenshots and saves them with real timestamps.
To respect user boundaries, capture is automatically paused whenever a blacklisted application or window is detected.
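
As a rough illustration of the blacklist check, here is a minimal Python sketch. It is not Mosaic's actual code: the blacklist terms, the case-insensitive substring matching on window titles, and the names `should_pause` and `capture_once` are all assumptions made for this example. Window enumeration and screen grabbing are platform-specific, so they are passed in as plain callables.

```python
from datetime import datetime
from typing import Callable, Iterable, Optional

# Hypothetical blacklist: case-insensitive substrings of window titles.
BLACKLIST = ("1password", "banking", "private browsing")


def should_pause(visible_titles: Iterable[str],
                 blacklist: Iterable[str] = BLACKLIST) -> bool:
    """Return True if any visible window title contains a blacklisted term."""
    terms = [term.lower() for term in blacklist]
    return any(term in title.lower()
               for title in visible_titles
               for term in terms)


def capture_once(
    list_titles: Callable[[], Iterable[str]],
    grab: Callable[[], bytes],
    now: Callable[[], datetime] = datetime.now,
) -> Optional[tuple[str, bytes]]:
    """Capture one frame, or return None while capture is paused.

    `list_titles` and `grab` are injected so the platform-specific parts
    (window enumeration, screen grabbing) stay outside this sketch.
    """
    if should_pause(list_titles()):
        return None
    # Name the frame after its real capture time so later stages can order it.
    filename = now().strftime("%Y%m%d_%H%M%S") + ".png"
    return filename, grab()
```

A scheduler would call `capture_once` at the fixed interval and simply skip the tick whenever it returns `None`, so no frame is ever written while a blacklisted window is on screen.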

Timeline Analysis (Gemini-powered)

Captured screenshots are analyzed using Gemini’s multimodal reasoning capabilities to infer high-level activities.
Rather than labeling each frame independently, Mosaic:

  • merges consecutive screenshots into timeline segments,
  • smooths transitions using midpoint boundaries,
  • and detects idle periods when screenshots are far apart and visually similar.

The result is a structured Timeline JSON representing how the day unfolded over time.
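
The merging rules above can be sketched in a few lines of Python. This is an illustrative reconstruction rather than the project's implementation: the `Frame` and `Segment` types, the 10-minute `IDLE_GAP` threshold, and the precomputed `similar_to_prev` flag (standing in for whatever visual-similarity check is used) are assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Frame:
    time: datetime
    activity: str                  # label inferred for this screenshot
    similar_to_prev: bool = False  # assumed to be computed by an earlier step


@dataclass
class Segment:
    start: datetime
    end: datetime
    activity: str


IDLE_GAP = timedelta(minutes=10)  # assumed threshold for "far apart"


def build_timeline(frames: list[Frame]) -> list[Segment]:
    """Merge labeled frames into contiguous, time-ordered segments."""
    if not frames:
        return []
    frames = sorted(frames, key=lambda f: f.time)
    segments = [Segment(frames[0].time, frames[0].time, frames[0].activity)]
    for prev, cur in zip(frames, frames[1:]):
        last = segments[-1]
        gap = cur.time - prev.time
        if gap > IDLE_GAP and cur.similar_to_prev:
            # Long gap and nothing changed on screen: mark the gap as idle.
            segments.append(Segment(prev.time, cur.time, "idle"))
            segments.append(Segment(cur.time, cur.time, cur.activity))
        elif cur.activity == last.activity:
            # Same activity: extend the current segment.
            last.end = cur.time
        else:
            # Activity changed: place the boundary at the midpoint.
            midpoint = prev.time + gap / 2
            last.end = midpoint
            segments.append(Segment(midpoint, cur.time, cur.activity))
    return segments
```

For frames labeled coding at 9:00 and 9:01 and email at 9:03, this sketch places the coding/email boundary at 9:02. A segment that has seen only one frame so far has zero length until a later frame extends it.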

Behavior Feedback

From the timeline, Mosaic generates low-frequency feedback events (roughly one per hour).
These are designed to be supportive and reflective rather than corrective or intrusive.
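
One way hour-level feedback could be derived from a timeline is sketched below in Python. The input shape (a list of `(start, end, activity)` tuples), the "dominant activity per hour" rule, and the message wording are assumptions for illustration, not the project's actual logic.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def hourly_feedback(segments):
    """Build one gentle feedback event per clock hour.

    `segments` is an iterable of (start, end, activity) tuples.
    """
    # minutes[hour_start][activity] -> minutes spent in that hour
    minutes = defaultdict(lambda: defaultdict(float))
    for start, end, activity in segments:
        cursor = start
        while cursor < end:
            # Split segments that cross an hour boundary.
            hour_start = cursor.replace(minute=0, second=0, microsecond=0)
            chunk_end = min(end, hour_start + timedelta(hours=1))
            minutes[hour_start][activity] += (chunk_end - cursor).total_seconds() / 60
            cursor = chunk_end

    events = []
    for hour_start in sorted(minutes):
        activity, spent = max(minutes[hour_start].items(), key=lambda kv: kv[1])
        if activity == "idle":
            message = "Looks like you stepped away for a while."
        else:
            message = f"Most of this hour went to {activity}."
        events.append({
            "hour": hour_start.isoformat(),
            "dominant_activity": activity,
            "minutes": round(spent),
            "message": message,
        })
    return events
```

Emitting at most one event per hour keeps the feedback low-frequency by construction, and the messages describe what happened without scoring it.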

Daily Report (Gemini-powered)

At the end of the day, Gemini is used again to synthesize:

  • a reflective daily summary,
  • a caring, human-readable message,
  • and a stylized “redraw of the day” image capturing the overall vibe of the day.

System Architecture Overview

Mosaic follows a clear, stage-based architecture from passive capture to daily reflection.

Pipeline stages:

  1. Capture screenshots with blacklist-aware pausing
  2. Analyze activity with Gemini
  3. Merge frames into a time-aware timeline
  4. Generate lightweight feedback events
  5. Produce a daily report and visual redraw
  6. Store all outputs as stable artifacts (JSON + images)
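
The final storage stage might look like the following Python sketch. The per-day directory layout, the `save_artifact` name, and the write-to-temp-then-rename approach are assumptions chosen to illustrate what "stable" could mean here; the writeup does not describe the actual on-disk format.

```python
import json
import os
import tempfile
from pathlib import Path


def save_artifact(root, day, name, payload):
    """Write `payload` to <root>/<day>/<name>.json and return the path.

    The file is written to a temporary path first and then renamed, so a
    reader never observes a half-written artifact.
    """
    day_dir = Path(root) / day
    day_dir.mkdir(parents=True, exist_ok=True)
    target = day_dir / f"{name}.json"

    fd, tmp_path = tempfile.mkstemp(dir=day_dir, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        # Sorted keys keep the output byte-stable across runs.
        json.dump(payload, handle, indent=2, sort_keys=True)
    os.replace(tmp_path, target)  # atomic when both paths share a filesystem
    return target
```

A future UI could then read files such as `2025-01-01/timeline.json` directly, without depending on the pipeline's internals.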

Optionally, Mosaic can sync relevant information with Google Calendar and Tasks, integrating with existing workflows without becoming intrusive.


Challenges we ran into

  • Ambiguity of visual signals
    A single screenshot rarely provides enough context, requiring careful temporal merging.

  • Maintaining time accuracy
    Screenshot intervals can vary due to pauses and system conditions, so the pipeline needed to remain robust to irregular timing.

  • Balancing insight and intrusion
    Feedback had to be useful without becoming nagging, so we intentionally avoided constant nudges and productivity pressure.

  • Structured output from LLMs
    Ensuring reliable JSON from Gemini required careful prompt design and defensive parsing.
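
To make "defensive parsing" concrete, here is a minimal Python sketch of the kind of guardrail that helps when a model wraps its JSON in a code fence or surrounding prose. The function name `parse_model_json` and the `required_keys` check are hypothetical; no Gemini API call is shown, only handling of the returned text.

```python
import json
import re

# Matches a fenced block, with or without a "json" language tag.
_TICKS = "`" * 3
FENCE = re.compile(_TICKS + r"(?:json)?\s*(.*?)" + _TICKS, re.DOTALL)


def parse_model_json(text, required_keys=()):
    """Return a dict parsed from model output, or None if it is unusable."""
    match = FENCE.search(text)
    candidate = match.group(1) if match else text

    # Tolerate prose around the object by taking the outermost {...} span.
    start, end = candidate.find("{"), candidate.rfind("}")
    if start == -1 or end < start:
        return None
    try:
        data = json.loads(candidate[start:end + 1])
    except json.JSONDecodeError:
        return None

    # Reject output that parses but lacks the fields downstream stages need.
    if not isinstance(data, dict) or any(key not in data for key in required_keys):
        return None
    return data
```

Returning `None` instead of raising lets the caller decide whether to retry the request, fall back to a default label, or skip the frame.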


Accomplishments that we're proud of

  • Built a complete multimodal pipeline within a hackathon timeframe
  • Used Gemini for both reasoning and creative synthesis
  • Designed the system around extensible artifacts instead of a fixed UI
  • Created a reflection-focused experience rather than a productivity tracker

What we learned

  • Passive systems are more sustainable when they minimize user friction
  • Temporal reasoning is as important as classification accuracy
  • Multimodal models become far more powerful when paired with thoughtful post-processing
  • Designing for extensibility improves both clarity and robustness

What's next for Mosaic: Piece together your life

  • A lightweight web or desktop UI built on top of the generated artifacts
  • Personalized reflection styles and emotional tones
  • Long-term trend analysis across multiple days
  • Local-first or on-device inference for stronger privacy guarantees
