Inspiration

We built this for ourselves. As students, we know what it feels like to sit down with every intention of getting work done and then just... disappear into your phone or a random YouTube rabbit hole. For people with ADHD this happens constantly, and the worst part is every existing tool treats it like a discipline problem. Blockers, timers, alarms. They add friction and shame to something that already feels overwhelming. We wanted to build something that actually understood the problem, something that felt like a friend checking in on you rather than a system punishing you.

What it does

Anchor is an AI focus companion that lives on your computer during work sessions. You tell it your task and it gets to work. In the background it reads your active window title and uses a combination of a Markov chain model and Gemini to classify whether what you are doing is actually related to your task or whether you have drifted, while your webcam simultaneously picks up physical cues like grabbing your phone, looking away, or going idle, using only body landmarks so nothing is ever recorded.

When drift is detected, K2 Think V2 steps in as the reasoning brain, analyzing your full session context to figure out not just that you drifted but why, what your state is, and what the right response is for that exact moment. Your Smiski buddy then walks onto the screen and checks in with a warm, personalized message that is different every single time. You can take a break, get pulled back, or end the session. Over time Anchor builds a picture of your personal drift patterns using Supermemory, learning your specific triggers and getting ahead of them before they even happen.

Mid-session you can also just say "Hey Buddy" and talk to Smiski, and it logs your thoughts automatically. At the end you get a full session summary with your focus streaks, drift patterns, and all your notes packaged into a downloadable PDF so nothing gets lost. And through our Chrome extension, everything follows you across every browser tab too.

How we built it

The backend is a FastAPI server in Python running the full pipeline. Window monitoring reads your active window title and feeds it through two layers of classification: a Markov chain model that learns transition probabilities from your app-switching behavior, and Gemini 2.5 Flash, which understands the semantic meaning of what you are doing relative to your specific task.

The core reasoning brain of Anchor is K2 Think V2. When drift is detected, the full session context, including your task, elapsed time, drift count, complete behavioral timeline, past session profile, and HMM state prediction, gets sent to K2. It performs deep multi-step reasoning to analyze drift patterns, infer your psychological state (for example, whether you are fatigued or avoidant), evaluate break urgency, and then select the optimal intervention strategy from options like gentle redirect, task chunking, suggest break, or stay silent. It then crafts a personalized message referencing your actual task and the specific app you drifted to. A LangChain agent and Gemini serve as fallbacks if K2 is ever unavailable, but K2 is always the primary decision maker.

We use Supermemory to persist drift patterns across sessions so the system builds a long-term picture of each user's behavior. Voice nudges come through ElevenLabs TTS, and voice input is transcribed so users can talk to Smiski mid-session and have their thoughts logged automatically.

The webcam activity monitor uses MediaPipe Tasks to detect phone use, idle states, and head position in real time. The frontend is React with TypeScript and Tailwind, connected to the backend via WebSocket for real-time events. The Chrome extension injects a content script into every page so the Smiski buddy can appear anywhere in the browser.
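The first classification layer can be sketched as a simple first-order Markov chain over app names. This is an illustrative simplification, not our exact implementation, and the app names are made up:

```python
from collections import defaultdict


class AppTransitionModel:
    """First-order Markov chain over active-window app names.

    Counts observed transitions between apps and exposes the probability
    of a given switch, which a drift classifier can combine with a
    semantic check of the window title against the user's task.
    """

    def __init__(self):
        self.counts = defaultdict(lambda: defaultdict(int))

    def observe(self, prev_app: str, next_app: str) -> None:
        """Record one observed app switch."""
        self.counts[prev_app][next_app] += 1

    def transition_prob(self, prev_app: str, next_app: str) -> float:
        """P(next_app | prev_app) under the learned counts."""
        row = self.counts[prev_app]
        total = sum(row.values())
        if total == 0:
            return 0.0  # unseen state: defer entirely to the semantic layer
        return row[next_app] / total


model = AppTransitionModel()
model.observe("VS Code", "Chrome - docs")
model.observe("VS Code", "YouTube")
model.observe("VS Code", "YouTube")
print(model.transition_prob("VS Code", "YouTube"))  # 2 of 3 observed switches
```

In the real pipeline this score is only one signal; the semantic layer decides whether the destination app is actually relevant to the task.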

Challenges we ran into

One of the biggest technical challenges was syncing the Smiski character with the ElevenLabs voice output. Getting the buddy to visually react and animate in sync with what it was actually saying in real time was a lot harder than we expected, since audio playback timing and frontend state updates do not naturally align and required careful coordination between the WebSocket events and the UI layer.
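The coordination pattern boils down to the backend announcing the clip's duration alongside the speech event, so the UI can start and stop the talking animation on its own clock. A simplified sketch, assuming 16-bit mono PCM audio and a hypothetical `ws_send` callable standing in for the WebSocket send:

```python
import asyncio
import json


async def announce_speech(ws_send, audio_bytes: bytes, sample_rate: int = 22050):
    """Send a speech_start event carrying the clip duration, wait out the
    playback window, then send speech_end so the buddy stops animating
    in step with the audio. Event names here are illustrative."""
    duration = len(audio_bytes) / (2 * sample_rate)  # 16-bit samples: bytes -> seconds
    await ws_send(json.dumps({"type": "speech_start", "duration": duration}))
    await asyncio.sleep(duration)
    await ws_send(json.dumps({"type": "speech_end"}))
```

Shipping the duration with the event means the frontend never has to guess when the audio finishes.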

Building the MediaPipe activity monitor was another major challenge. Getting reliable real time detection of phone use, head position, and idle states from just body landmarks without false positives required a lot of tuning. Small things like someone leaning back or looking at a second monitor would incorrectly trigger drift detection, so we had to carefully calibrate thresholds and add smoothing logic to make it actually usable.
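The smoothing logic amounts to requiring a cue to persist across a window of frames before it counts as drift. A simplified sketch of the idea; the window size and threshold here are illustrative, not our tuned values:

```python
from collections import deque


class DriftSmoother:
    """Only flag drift when a raw detection persists across most of a
    sliding window of frames, so a momentary glance away or a lean back
    does not trigger a false positive."""

    def __init__(self, window: int = 5, threshold: float = 0.8):
        self.window = window        # number of recent frames considered
        self.threshold = threshold  # fraction that must be positive to fire
        self.history = deque(maxlen=window)

    def update(self, raw_detection: bool) -> bool:
        """Feed one per-frame detection; returns the smoothed decision."""
        self.history.append(raw_detection)
        if len(self.history) < self.window:
            return False  # not enough evidence yet
        return sum(self.history) / self.window >= self.threshold


smoother = DriftSmoother(window=5, threshold=0.8)
flags = [smoother.update(True) for _ in range(5)]  # fires only on the 5th frame
```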

Making the AI agent feel human and not robotic was harder than expected too. Early versions repeated the same message every single time because Gemini at temperature zero is completely deterministic. We fixed this by building a dedicated buddy-message endpoint with randomized tone prompts and a higher temperature so every response feels fresh. We also hit Gemini API quota limits on the free tier mid-build, which forced us to debug fast and switch models. On the frontend, getting the Smiski character to fire its session-start greeting only once, rather than re-triggering every time the user went on break, took more work than expected. And merging changes from two teammates working on the same component simultaneously created some interesting conflicts we had to carefully resolve.
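The shape of that fix looks roughly like this. A simplified sketch; the tone list, prompt wording, and temperature value are illustrative stand-ins, not our production prompt:

```python
import random

# Hypothetical tone options sampled per check-in so repeated drifts
# on the same task still get differently flavored messages.
TONES = ["encouraging", "playful", "gently curious", "calm and matter-of-fact"]


def build_buddy_prompt(task: str, drifted_to: str) -> dict:
    """Assemble a request for the buddy-message endpoint: randomized tone
    in the system prompt plus a nonzero temperature so outputs vary."""
    tone = random.choice(TONES)
    return {
        "system": f"You are Smiski, a {tone} focus buddy. One short sentence, no shame.",
        "user": f"The user was working on '{task}' but switched to '{drifted_to}'. Check in.",
        "temperature": 0.9,  # higher than 0 so repeated calls differ
    }
```

Randomizing the instruction itself, not just the sampling temperature, is what keeps the messages from converging on one phrasing.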

Accomplishments that we're proud of

Getting the full pipeline working end to end is something we are genuinely proud of. Screen monitoring, webcam activity detection, Markov chain and Gemini classification, K2 Think V2 reasoning, ElevenLabs voice, Smiski companion, voice note logging, and the Chrome extension all talking to each other in real time is no small thing for a hackathon timeline.

We are really proud of how deep the reasoning goes. K2 does not just detect drift; it infers why you drifted, what your psychological state is, and picks the intervention that actually fits that moment. That level of personalization is something no existing focus tool comes close to.

We are also proud of the personalized drift pattern learning. Using Supermemory, the system builds a picture of your behavior across sessions and starts getting ahead of your triggers before they happen, which feels like a genuinely meaningful step toward something that adapts to you as an individual rather than treating everyone the same.

And we are proud of how the companion actually feels. The messages are warm, varied, and contextual in a way that does not feel like a notification or a system. It genuinely feels like something is watching out for you, and getting that right took a lot of iteration.

What we learned

That ADHD is not a willpower problem and building for it requires a completely different design philosophy. We learned how to integrate a deep reasoning model like K2 Think V2 into a real time pipeline without introducing latency that breaks the user experience. We learned a lot about running LangChain agents without letting them become too aggressive, how to use MediaPipe Tasks for real time body landmark detection, and how to keep a WebSocket driven UI feeling snappy. We also learned that making AI feel human is way harder than making it functional, and that the emotional design of a product matters just as much as the technical architecture.

What's next for Anchor

We want to add Google Calendar integration so Anchor knows when your deadlines are and adjusts how hard it nudges you. We are building a mood check-in at the start of each session so Smiski adapts its tone to how you are actually feeling that day. A parent and coach dashboard is on the roadmap for younger users so they can share session summaries with someone who supports them without any privacy concerns. The bigger vision is a focus layer that learns your personal drift patterns over time and gets ahead of them before they even happen.

Built With

  • apis
  • cloud-services
  • databases
  • elevenlabs
  • fastapi
  • frameworks
  • google
  • google-gemini
  • k2-think
  • langchain
  • llm
  • markov-model
  • mediapipe
  • multi-agent-pipeline
  • platforms
  • python
  • real-time-event-loop
  • supermemory-api
  • typescript
  • yolov8n