Inspiration
Developer workflows are often complex, entangled with layers of tooling, evolving codebases, and frequent interruptions (meetings, breaks, weekends). Maintaining context across those interruptions becomes central to reasoning through tasks and making sound software decisions.
Today, large language models are great at answering questions, but they're fundamentally stateless. In Turing City, we imagine a future where AI assistants operate alongside your work, enabling prompts like "Pick up where I left off last week." Realizing this vision requires a way to capture high-quality context from real workflows, which is where Iconic Memory comes in.
What it does
Iconic Memory is a developer observability and recontextualization tool for your personal workflow. It can:
- Track activity in real time: After starting a session, a vision language model continuously observes the screen and produces structured insights about your work
- Compress outputs into structured memory: Instead of storing raw video or even the verbose outputs of the vision language model, we apply a custom compression algorithm to produce concise input for downstream LLM use
- Store knowledge as a queryable timeline: Compressed insights are indexed for search, enabling re-entry into past work.
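The queryable timeline can be sketched as a nearest-neighbor lookup over stored insight embeddings. This is an illustrative sketch under assumptions, not our exact implementation: the `TimelineEntry` shape and the toy embedding vectors are invented for the example.

```typescript
// Hypothetical shape of one compressed insight on the timeline.
interface TimelineEntry {
  timestamp: number;   // Unix epoch millis
  summary: string;     // concise natural-language insight
  embedding: number[]; // vector embedding of the summary
}

// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k entries most similar to a query embedding,
// letting a downstream LLM "pick up where you left off."
function searchTimeline(
  entries: TimelineEntry[],
  queryEmbedding: number[],
  k = 3,
): TimelineEntry[] {
  return [...entries]
    .sort(
      (x, y) =>
        cosineSimilarity(y.embedding, queryEmbedding) -
        cosineSimilarity(x.embedding, queryEmbedding),
    )
    .slice(0, k);
}
```

In practice the query embedding would come from embedding the user's prompt with the same model used for the stored summaries.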
How we built it
We built an end-to-end system with three core pieces:
Frontend Web UI
- Built with TypeScript, React, and Tailwind CSS.
- Allows developers to start and stop workflow sessions with one click
- Renders a session timeline where past insights can be reviewed and searched
Overshoot Vision Model Integration
- Built with JavaScript and the Overshoot SDK
- Streams screen frames for continuous analysis
- Produces structured natural-language observations about on-screen activity
Data Compression Algorithm
- Built with TypeScript on the backend
- Leverages the intuition that closely spaced vision-model inference calls often produce similar outputs
- Embeds the natural-language descriptions into vectors and computes cosine similarity between adjacent outputs
- Filters similar outputs to produce concise input for downstream LLM use
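The filtering step above can be sketched as follows. This is a minimal illustration of the idea, assuming each insight already carries an embedding vector; the `0.92` similarity threshold is a hypothetical value for the example, not our production setting.

```typescript
interface Insight {
  text: string;        // verbose vision-model output
  embedding: number[]; // vector embedding of the text
}

// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Walk the time-ordered insights and keep one only if it differs
// enough from the most recently kept insight, dropping near-duplicates
// produced by closely spaced inference calls.
function compressInsights(
  insights: Insight[],
  threshold = 0.92, // hypothetical cutoff; tune per workload
): Insight[] {
  const kept: Insight[] = [];
  for (const insight of insights) {
    const last = kept[kept.length - 1];
    if (!last || cosineSimilarity(last.embedding, insight.embedding) < threshold) {
      kept.push(insight);
    }
  }
  return kept;
}
```

Because comparisons are only against the previously kept insight, the pass is linear in the number of outputs, which keeps long sessions tractable.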
Challenges we ran into
- Streaming Reliability: Observing an entire workflow requires keeping a long-running stream stable, so we tuned the Overshoot processing parameters to fit our use case.
- Large Volumes of Data: For long workflows, the vision language model often produced too much data for a downstream LLM to reliably handle, so we needed to devise a compression algorithm that ensured scalability while preserving meaning.
Accomplishments that we're proud of
- Configuring the web UI, vision model, and data compression algorithm into a cohesive application
- Enabling cross-application functionality beyond just code (the screen capture can observe any app)
- Compressing structured data so downstream LLMs can consume it efficiently
What we learned
- Working with vision language model outputs, whose raw form is often very verbose
- Applying LLM reasoning to long-term developer workflows, rather than simple chat interfaces
What's next for Iconic Memory
- Adaptive Focus Modes: Modify Overshoot processing parameters based on the application currently in use (e.g., higher precision for VS Code, lower for Slack)
- Applying Workflow Data to Personalized AI: Aggregate high-quality, privacy-preserving workflow patterns to be used for training/evaluation data for personalized AI systems
Built With
- css
- gpt
- javascript
- openrouter
- overshoot
- react
- typescript
- vite