Stalk
Inspiration
Shopping algorithms mostly learn from what you click. But the strongest desire signal isn't a click: it's spending 3 minutes staring at a coat, closing the tab, and coming back 4 days later across 3 different sites. That passive behavior is largely invisible to click-driven recommendation systems.
We also noticed that 90% of wishlist items never get purchased — not because you stopped wanting them, but because something blocked you (price, availability, condition) and nobody acted on it. The intent was there. The data was there. Nobody was doing anything with it.
Stalk started from one question: what if your browser already knows what you want, and the only missing piece is something to act on it?
What it does
Stalk is a Chrome extension paired with an autonomous buying agent.
The extension runs silently in the background and captures passive desire signals from every product page you visit — how long you dwell, whether you come back within 4 days, whether you're comparing the same item across multiple sites simultaneously. No clicks required. No wishlists. Just browsing.
These signals feed a desire graph that scores every item you've looked at from 0–100. The higher the score, the more you want it — even if you never saved it anywhere.
The agent runs 24/7 in the background. It detects when you're blocked from buying something (price too high, wrong size, bad condition), finds a stylistically identical alternative from across the resale web that removes that specific blocker, and queues it for one-tap approval. It doesn't surface random recommendations — it surfaces the exact thing you already want, just in a version you can actually buy.
How we built it
Chrome Extension (Manifest V3) — vanilla JS, no build step. Tracks dwell time with an active-tab timer, return visits using per-domain visit history in chrome.storage.local, and cross-tab comparisons by querying all open tabs every 30 seconds. Injects a live shadow DOM overlay onto every product page showing the item's current desire score.
Backend (FastAPI + Python) — 22 REST endpoints, SQLite in WAL mode for concurrent-safe reads/writes, APScheduler running the agent loop as a daemon thread inside the FastAPI process.
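To make the WAL choice concrete, here is a minimal sketch using only the standard-library sqlite3 module. The file name, the signals table, and its columns are hypothetical and not taken from the project. The two connections stand in for the API and the agent daemon, and the point is that a read proceeds while a write is still uncommitted.

```python
import os
import sqlite3
import tempfile

# Hypothetical database file; WAL needs a real file, not ":memory:".
db_path = os.path.join(tempfile.mkdtemp(), "stalk_demo.db")

# "API" connection: switch the database to write-ahead logging.
writer = sqlite3.connect(db_path, timeout=5.0)
journal_mode = writer.execute("PRAGMA journal_mode=WAL").fetchone()[0]

# Illustrative schema only; the real tables are not described in the writeup.
writer.execute("CREATE TABLE signals (item_id TEXT, dwell_seconds REAL)")
writer.execute("INSERT INTO signals VALUES (?, ?)", ("coat-1", 180.0))
writer.commit()

# Start a second write and leave it uncommitted for now.
writer.execute("INSERT INTO signals VALUES (?, ?)", ("coat-2", 45.0))

# "Agent" connection: the read does not wait on the open write
# and sees only rows that have been committed.
reader = sqlite3.connect(db_path, timeout=5.0)
visible = reader.execute("SELECT item_id FROM signals").fetchall()

writer.commit()
after_commit = reader.execute(
    "SELECT item_id FROM signals ORDER BY item_id"
).fetchall()

reader.close()
writer.close()
```

WAL mode is stored in the database file itself, so the second connection does not need to set the pragma again.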
Dual-layer desire intelligence engine:
Layer 1 continuously learns from implicit behavioral signals (dwell time, return visits, cross-site comparisons) using Alternating Least Squares (ALS) matrix factorization, the same family of algorithms associated with recommenders like Spotify Discover Weekly and Netflix. ALS decomposes the full user × item interaction matrix into compressed embeddings, 32 latent dimensions per user and per item, that encode taste without any explicit labels. Desire scores update as signals arrive, without waiting for a model refresh. They are normalized across price tiers so a $45 hoodie doesn't dominate a $4,800 bag, and penalized by session competition, because attention is finite and the model knows it.
The model trains continuously. Every browsing session adds new cells to the interaction matrix. Every product page view triggers a background scrape that seeds 5–15 similar items into the catalog — so the matrix grows silently as you browse, the same way production recommendation systems at scale pipe streaming event data into periodic model refreshes. Every approval or rejection tightens the embedding constraints: approvals add high-confidence cells the model must satisfy; rejections suppress item vectors away from the user embedding. The model never sees a static dataset — it's always retraining against a matrix that reflects exactly what you've been obsessing over.
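For readers unfamiliar with ALS on implicit feedback, the following is a rough, dependency-free sketch of the general technique, not the project's code. It assumes the common confidence weighting c = 1 + alpha * r with a binary preference target; the alpha and reg values, the toy matrix, and every function name here are illustrative assumptions. A real implementation would more likely use NumPy or a dedicated library.

```python
import random

def solve(A, b):
    """Solve the linear system A x = b with Gaussian elimination."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def update_side(fixed, strengths, reg, alpha):
    """Re-solve one side's vectors while the other side (`fixed`) is frozen."""
    k = len(fixed[0])
    result = []
    for row in strengths:  # row[j] = signal strength against fixed[j]
        A = [[reg if a == b else 0.0 for b in range(k)] for a in range(k)]
        rhs = [0.0] * k
        for vec, r in zip(fixed, row):
            conf = 1.0 + alpha * r        # confidence grows with the signal
            pref = 1.0 if r > 0 else 0.0  # binary "did the user engage"
            for a in range(k):
                rhs[a] += conf * pref * vec[a]
                for b in range(k):
                    A[a][b] += conf * vec[a] * vec[b]
        result.append(solve(A, rhs))
    return result

def implicit_als(R, factors=32, reg=0.1, alpha=40.0, iters=10, seed=0):
    """Alternate between solving for user vectors and item vectors."""
    rng = random.Random(seed)
    n_items = len(R[0])
    Y = [[rng.gauss(0, 0.01) for _ in range(factors)] for _ in range(n_items)]
    Rt = [list(col) for col in zip(*R)]  # item-major view of the matrix
    X = []
    for _ in range(iters):
        X = update_side(Y, R, reg, alpha)   # users, items frozen
        Y = update_side(X, Rt, reg, alpha)  # items, users frozen
    return X, Y

# Toy matrix: rows are users, columns are items, values are signal strength.
R = [[5, 0, 0, 3],
     [0, 4, 0, 0],
     [4, 0, 0, 5]]
X, Y = implicit_als(R, factors=8)

def score(u, i):
    """Predicted affinity is the dot product of user and item embeddings."""
    return sum(a * b for a, b in zip(X[u], Y[i]))
```

In this toy run, user 0 scores item 0 (which they dwelled on) above item 2 (which has no interactions at all). An item with no interactions ends up with an all-zero vector.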
Layer 2 maps every item into an 83-dimensional aesthetic embedding space spanning color, material, silhouette, style, brand, and price tier. As you browse, the system continuously updates your taste centroid — your learned position in aesthetic space — using desire scores as confidence weights. Any new item that lands close to your centroid gets surfaced, even if you've never seen it. It's the difference between recommending what you clicked and understanding what you're drawn to.
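A minimal sketch of the centroid idea, assuming the centroid is a desire-weighted average of item vectors and that "close" means high cosine similarity. The 3-dimensional vectors, item names, and scores below are made up for readability; the real space is described as 83-dimensional.

```python
import math

def taste_centroid(viewed):
    """Desire-weighted mean of item vectors. `viewed` is a list of
    (feature_vector, desire_score) pairs; scores act as weights."""
    total = sum(score for _, score in viewed)
    if total == 0:
        raise ValueError("need at least one item with a nonzero desire score")
    dim = len(viewed[0][0])
    return [sum(vec[d] * score for vec, score in viewed) / total
            for d in range(dim)]

def cosine(a, b):
    """Cosine similarity, returning 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical browsing history: (aesthetic vector, desire score 0-100).
viewed = [
    ([1.0, 0.0, 0.0], 90),
    ([0.8, 0.2, 0.0], 60),
    ([0.0, 0.0, 1.0], 10),
]
centroid = taste_centroid(viewed)

# Hypothetical catalog items the user has never seen.
catalog = {
    "wool coat": [0.9, 0.1, 0.0],
    "running sneaker": [0.0, 0.1, 0.9],
}
ranked = sorted(catalog, key=lambda name: cosine(centroid, catalog[name]),
                reverse=True)
```

Because the high-desire items dominate the weighted average, the unseen wool coat ranks ahead of the sneaker even though neither was ever viewed.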
Agent — blocker detection queries live listings for price, condition, and size constraints. Alternative finding scores each candidate two ways: cosine similarity on explicit aesthetic features (color, material, silhouette), and a dot product against the user's ALS embedding — which captures implicit patterns that feature extraction misses, like a consistent preference for European resellers or a revealed price sweet spot the user never explicitly set. The final ranking is a weighted blend of both scores. ALS catches what features can't describe. GPT-4o then generates a one-sentence explanation naming the specific blocker resolved. Fires only when a real blocker is detected and a real alternative is found — zero wasted API calls.
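The two-way scoring can be sketched as follows. The blend weight is not stated in the writeup, so the 0.6 below is an assumption, as are all vectors and names. Note that a raw dot product is unbounded while cosine similarity is not, so a real system would probably rescale one of them before blending.

```python
import math

def cosine(a, b):
    """Cosine similarity on explicit aesthetic features."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def blended_score(target_features, cand_features, user_emb, cand_emb,
                  w_aesthetic=0.6):
    """Weighted blend of feature similarity and latent-factor affinity.
    w_aesthetic is a hypothetical weight, not the project's value."""
    aesthetic = cosine(target_features, cand_features)
    implicit = sum(u * v for u, v in zip(user_emb, cand_emb))
    return w_aesthetic * aesthetic + (1.0 - w_aesthetic) * implicit

# The blocked item the user wants, plus the user's latent embedding.
target_features = [1.0, 0.0, 0.5]
user_emb = [1.0, 0.0]

# Candidate name -> (aesthetic features, latent item embedding).
candidates = {
    "A": ([1.0, 0.0, 0.5], [0.1, 0.1]),  # identical look, weak implicit fit
    "B": ([0.9, 0.1, 0.5], [0.9, 0.0]),  # near-identical look, strong fit
}
ranked = sorted(
    candidates,
    key=lambda name: blended_score(target_features, candidates[name][0],
                                   user_emb, candidates[name][1]),
    reverse=True,
)
```

Candidate A is a perfect feature match, yet B ranks first because the latent term picks up an affinity the explicit features do not encode.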
Frontend (React + Vite + Tailwind + D3) — force-directed desire graph where node size is desire score, live signal feed, taste profile panel, and an agent queue showing discovery cards with match percentage and GPT-4o reasoning.
Challenges we ran into
The cold start problem. With no prior purchase history, we had no ground truth labels to train against. We solved this by using dwell time and return behavior as direct proxies for intent — treating the signal itself as the label rather than predicting a downstream outcome. This let us ship a working desire model on day one with zero historical data.
Session competition. A user with 10 tabs open shouldn't have every item score equally high, because attention is diluted. We built a session competition penalty that divides each signal by √n, where n is the number of items competing in the same 30-minute window, which meaningfully separates focused desire from casual browsing.
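As an illustration of the penalty, the sketch below divides each raw signal by the square root of n. It assumes fixed 30-minute buckets and that n counts every distinct item in the bucket, including the item itself, so a focused single-item session is not penalized. Both choices are guesses at details the writeup leaves open, and the event tuples are invented.

```python
import math

WINDOW_SECONDS = 30 * 60  # the 30-minute window

def penalize(events):
    """events: list of (timestamp_seconds, item_id, raw_signal).
    Returns (item_id, penalized_signal) pairs in the same order."""
    # Group distinct items by the 30-minute bucket they were seen in.
    items_per_window = {}
    for ts, item, _ in events:
        window = int(ts // WINDOW_SECONDS)
        items_per_window.setdefault(window, set()).add(item)

    penalized = []
    for ts, item, raw in events:
        n = len(items_per_window[int(ts // WINDOW_SECONDS)])
        penalized.append((item, raw / math.sqrt(n)))
    return penalized

events = [
    (0, "coat", 100.0),        # focused session: one item in its window
    (4000, "bag", 100.0),      # scattered session: four items compete
    (4100, "boots", 100.0),
    (4200, "scarf", 100.0),
    (4300, "hat", 100.0),
]
result = penalize(events)
```

With these inputs the coat keeps its full signal of 100, while each of the four competing items is halved to 50.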
Generalizing to unseen items. ALS matrix factorization can only score items that already have interactions in the matrix; an item with no recorded interactions has no learned embedding. To surface catalog items the user has never seen, we built the taste vector layer, a weighted average position in aesthetic feature space that can rank any item by proximity to the user's learned preferences.
Making the agent not fire constantly. Early versions spammed the queue with low-quality matches. We tightened the pipeline so the agent only queues something when a specific blocker is detected AND a listing is found that resolves that exact blocker. This made the output feel surgical rather than noisy.
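The gating rule can be sketched as a function that returns nothing unless both conditions hold. The Listing fields, the set of acceptable conditions, and the budget and size parameters are hypothetical. This version also assumes an alternative must be free of every blocker, which is stricter than resolving only the detected one.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical set of condition labels the user would accept.
ACCEPTABLE_CONDITIONS = {"new", "excellent", "good"}

@dataclass
class Listing:
    title: str
    price: float
    size: str
    condition: str

def detect_blocker(listing: Listing, budget: float, size: str) -> Optional[str]:
    """Return the first blocker found, or None if the listing is buyable."""
    if listing.price > budget:
        return "price"
    if listing.size != size:
        return "size"
    if listing.condition not in ACCEPTABLE_CONDITIONS:
        return "condition"
    return None

def maybe_queue(wanted: Listing, alternatives: List[Listing],
                budget: float, size: str) -> Optional[Tuple[str, Listing]]:
    """Queue only when a blocker exists AND an alternative clears it."""
    blocker = detect_blocker(wanted, budget, size)
    if blocker is None:
        return None  # nothing is stopping the purchase, so stay quiet
    for alt in alternatives:
        if detect_blocker(alt, budget, size) is None:
            return blocker, alt
    return None  # blocker found, but no listing resolves it

wanted = Listing("Wool coat", 480.0, "M", "new")
alternatives = [
    Listing("Wool coat (retail)", 520.0, "M", "new"),
    Listing("Wool coat (resale)", 260.0, "M", "good"),
]
queued = maybe_queue(wanted, alternatives, budget=300.0, size="M")
```

Here the agent reports a price blocker and queues the resale listing. With a budget of 500 it would return None, because the original item is already buyable.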
Accomplishments that we're proud of
The signal scoring formula. Price-tier normalization plus session competition penalty is a genuinely novel framing — to our knowledge nobody has applied attention dilution modeling to passive browsing signals before.
The blocker-resolution framing for the agent. Most recommendation systems ask "what else might you like?" Ours asks "what's stopping you from buying what you already want, and how do we fix it?" That reframe made the output feel meaningfully different from any recommender we've seen.
The whole thing runs locally with zero infrastructure of our own. No deployment, no cloud, no Redis, no queues. SQLite in WAL mode handles concurrent reads from the daemon and the API simultaneously. The entire stack from extension to agent runs on localhost; the only outbound traffic is fetching live listings and calling GPT-4o for explanations.
What we learned
Passive signals are dramatically underexploited. The gap between "user clicked add-to-cart" and "user stared at this for 4 minutes across 3 separate sessions" is enormous in terms of signal quality, and to our knowledge almost nobody is capturing the latter.
The cold start problem is solvable without historical data if you reframe what you're modeling. Instead of predicting purchase probability (which requires purchase labels), measure desire directly from behavior. The signal is the label.
Blocker detection is more useful than ranking. Surfacing the 10th version of something you might like is much less valuable than identifying the one thing stopping you from buying something you already want.
What's next for Stalk
Real purchase labels. Once users approve agent recommendations, those approvals become ground truth. The path to a proper supervised model — XGBoost or a two-tower neural net — is just collecting enough of those labels. The feature set is already right.
Multi-user taste graphs. Right now the desire graph is single-user. The natural extension is social — people with overlapping taste vectors influencing each other's scores, or collaborative filtering across users who stalk the same items.
Proactive price tracking. The agent currently fires when it detects a blocker. The next version fires before the blocker is fully formed — alerting when a price is trending down toward your threshold, not after it's already there.
Native mobile. Browser history on mobile is where the most passive signal lives. A Safari extension or iOS share sheet could capture the same dwell and return signals on the device where most browsing actually happens.