Inspiration
While working on my final year project report, I spent a lot of time researching online. I opened dozens of tabs, visited many websites, and read through documentation, articles, and tutorials. It felt productive at the time, but later I ran into a frustrating problem: I could not easily find some of that information again.
Browser history stores little more than URLs, page titles, and timestamps. It does not capture the meaning of what you read, so searching it is difficult when you don’t remember the exact site or keywords. On top of that, browsing activity is often scattered across different browsers, devices, or accounts.
Even though I knew I had seen useful information somewhere before, rediscovering it later was surprisingly difficult. This experience inspired me to build Retrace.
What it does
Retrace is an AI-powered memory layer for browsing. Instead of relying on traditional browser history, it captures page visits and highlights while you browse and stores them as searchable memory events.
Retrace uses semantic embeddings and a language model so that users can ask natural language questions about their past browsing activity.
For example, users can ask:
- “What did I read about vector databases?”
- “What did I highlight yesterday?”
- “What pages did I visit on the 6th?”
Retrace searches stored browsing events semantically and returns relevant results with links to the original pages.
In simple terms, Retrace turns browsing history into something you can search and interact with using AI.
How I built it
Retrace consists of three main components:
Browser Extension
A lightweight browser extension records page visits automatically and allows users to save highlights from any webpage.
Backend API (FastAPI)
The backend stores browsing events as memory records and exposes endpoints for querying the stored data.
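As an illustration, storing a memory event and answering a date question such as "What pages did I visit on the 6th?" might look like the sketch below. It uses Python's built-in sqlite3 module. The `MemoryEvent` fields, the `events` table layout, and the function names are assumptions made for this example, not Retrace's actual schema.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class MemoryEvent:
    # Hypothetical fields; the real record format may differ.
    kind: str        # "visit" or "highlight"
    url: str
    title: str
    text: str        # highlighted text, or empty for a plain visit
    timestamp: str   # ISO 8601 string, e.g. "2024-03-06T09:15:00+00:00"


def init_db(conn: sqlite3.Connection) -> None:
    """Create the events table if it does not exist yet."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS events (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            kind TEXT NOT NULL,
            url TEXT NOT NULL,
            title TEXT,
            text TEXT,
            timestamp TEXT NOT NULL
        )
        """
    )


def save_event(conn: sqlite3.Connection, event: MemoryEvent) -> int:
    """Insert one event and return its row id."""
    cur = conn.execute(
        "INSERT INTO events (kind, url, title, text, timestamp) "
        "VALUES (?, ?, ?, ?, ?)",
        (event.kind, event.url, event.title, event.text, event.timestamp),
    )
    conn.commit()
    return cur.lastrowid


def events_on_day(conn: sqlite3.Connection, day: str) -> list[MemoryEvent]:
    """Return events whose ISO timestamp starts with `day` (YYYY-MM-DD)."""
    rows = conn.execute(
        "SELECT kind, url, title, text, timestamp FROM events "
        "WHERE timestamp LIKE ? ORDER BY timestamp",
        (day + "%",),
    ).fetchall()
    return [MemoryEvent(*row) for row in rows]
```

Keeping timestamps as ISO 8601 strings means a date question can be answered with a simple prefix match, without going through the embedding search at all.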
Semantic Search Layer
Browsing events are converted into embeddings and stored in a vector database (Chroma), enabling semantic search based on meaning rather than exact keywords.
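To make "search based on meaning" concrete, here is a minimal sketch of the ranking step that a vector database performs: comparing a query vector against stored vectors by cosine similarity. This is plain Python for illustration only, not Chroma's API. The function names and the `embedding` key are assumptions, and the vectors would come from an embedding model rather than being written by hand.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_matches(query_vec: list[float], records: list[dict], top_k: int = 3):
    """Return up to top_k (score, record) pairs, most similar first.

    Each record is assumed to carry its vector under the "embedding" key.
    """
    scored = [(cosine_similarity(query_vec, r["embedding"]), r) for r in records]
    # Sort on the score only, so the record dicts are never compared.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]
```

Because the comparison works on vectors rather than words, a page can match a query even when the two share no exact keywords, provided the embedding model places them close together.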
For AI-powered responses, Retrace uses Amazon Bedrock, specifically Nova Lite for answering questions and Titan Text Embeddings V2 for generating embeddings used in semantic search.
This architecture enables a Retrieval-Augmented Generation (RAG) workflow where user queries retrieve relevant browsing events before generating a contextual response.
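The retrieve-then-generate flow can be sketched as follows. The retrieval and generation steps are passed in as plain functions so the example runs without any external service. `build_prompt`, `answer`, the prompt wording, and the event keys (`title`, `url`, `text`) are all hypothetical choices for this sketch; in Retrace the two callables would be backed by the vector search and the Bedrock model.

```python
from typing import Callable


def build_prompt(question: str, events: list[dict]) -> str:
    """Format retrieved browsing events as numbered context for the model."""
    lines = [
        f"[{i}] {e['title']} ({e['url']}): {e['text']}"
        for i, e in enumerate(events, start=1)
    ]
    context = "\n".join(lines) if lines else "(no matching browsing events)"
    return (
        "Answer the question using only the browsing events below. "
        "Cite events by their number.\n\n"
        f"Browsing events:\n{context}\n\n"
        f"Question: {question}"
    )


def answer(
    question: str,
    retrieve: Callable[[str, int], list[dict]],  # (query, top_k) -> events
    generate: Callable[[str], str],              # prompt -> model reply
    top_k: int = 5,
) -> dict:
    """Retrieve relevant events first, then ask the model with that context."""
    events = retrieve(question, top_k)
    prompt = build_prompt(question, events)
    return {
        "answer": generate(prompt),
        "sources": [e["url"] for e in events],
    }
```

Returning the source URLs alongside the generated text is what lets the interface link each answer back to the original pages.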
Challenges I ran into
One challenge was dealing with the unstructured nature of web content. Every webpage has a different structure, so instead of trying to store everything in a rigid format, the system stores flexible memory events such as page visits and highlights.
Another challenge was balancing usefulness and privacy. Since browsing activity can contain sensitive information, it is important to design the system so that users control what gets saved and highlighted.
Finally, implementing semantic search required designing a pipeline that converts browsing data into embeddings while still keeping the system lightweight enough to run locally for demonstration purposes.
Accomplishments that I am proud of
I successfully built a working end-to-end prototype that demonstrates how AI can turn browsing activity into a searchable memory system.
Some accomplishments include:
- Building a browser extension that captures browsing events and highlights
- Implementing semantic search over browsing data using embeddings
- Integrating Amazon Bedrock models for AI-powered query responses
- Creating a working interface where users can ask questions about their browsing history
The result is a system where browsing history becomes interactive and searchable using natural language.
What I learned
Through building Retrace, I gained practical experience working with:
- Browser extension development
- Semantic embeddings and vector databases
- Retrieval-Augmented Generation (RAG)
- Amazon Bedrock models, such as Nova Lite and Titan embeddings
More importantly, this project showed how AI can transform everyday activity data into a useful knowledge system that helps people rediscover information they previously encountered online.
What's next for Retrace
Retrace is currently a local prototype, but there are several exciting directions for future development.
One major improvement would be adding user accounts and synchronization, so that browsing memory can be securely accessed across multiple devices instead of being limited to a single machine.
Another direction is smarter memory selection, where AI models help determine which pages are worth storing. Retrace could automatically ignore sensitive contexts such as login pages, password fields, private dashboards, incognito sessions, and payment flows, while also filtering out low-value pages like generic search results or error pages. Explicit user highlights would always be preserved, ensuring users retain full control over what becomes part of their browsing memory while protecting privacy and sensitive information.
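A first version of this filtering could be rule-based before any AI model is involved. The sketch below is a simple URL heuristic, not the model-driven selection described above. The keyword sets and the `should_store` function are assumptions chosen for illustration, and a real filter would need a much more careful list.

```python
from urllib.parse import urlparse

# Hypothetical path segments that suggest a sensitive or low-value page.
SENSITIVE_SEGMENTS = {"login", "signin", "password", "checkout", "payment"}
LOW_VALUE_SEGMENTS = {"search", "404"}


def should_store(url: str, is_highlight: bool = False) -> bool:
    """Decide whether a page event should become part of browsing memory."""
    # Explicit highlights are always kept: the user chose to save them.
    if is_highlight:
        return True

    parsed = urlparse(url)
    # Skip browser-internal pages such as chrome:// or about: URLs.
    if parsed.scheme not in ("http", "https"):
        return False

    # Compare whole path segments so "/blog/login-tips" is not blocked.
    segments = {s.lower() for s in parsed.path.split("/") if s}
    if segments & SENSITIVE_SEGMENTS:
        return False
    if segments & LOW_VALUE_SEGMENTS:
        return False
    return True
```

Checking highlights first reflects the rule that explicit user highlights are always preserved, regardless of what the automatic filter would decide about the page.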
Retrace could also generate lightweight summaries of important pages so that users can quickly understand the context of what they previously read without needing to reopen every page.
Over time, the system could evolve into a richer personal knowledge system that connects browsing events, highlights, and topics into a structured memory of a user's learning and exploration on the web.
Built With
- chromadb
- css
- fastapi
- html
- javascript
- manifestv3
- novalite
- sqlite
- titantextembeddingsv2