WatchDog: The Agentic Browser Safeguard

Inspiration

As students, we’ve all experienced the "tab creep" phenomenon. You start by researching a simple linear algebra concept, and thirty minutes later you're six videos deep into a YouTube rabbit hole. Traditional website blockers are too binary: they either block a site entirely or not at all, and they fail to account for the fact that sometimes you need a "distracting" site for legitimate research.

What if we could build a tool that doesn't just block URLs, but actually understands your academic context?

That question led us to build WatchDog, an agentic browser safeguard that keeps you on task by analyzing your intent.

What it does

WatchDog is a live task engine that:

  • Monitors active browser tabs using a high-frequency async loop
  • Analyzes the content of each site to determine whether it is educational or a distraction
  • Autonomously corrects task drift by triggering a pop-up that sets up a workspace for your next task
  • Gathers Notion data and user input to tailor the browsing experience
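
The monitoring loop in the first bullet can be sketched in a few lines of asyncio. This is a minimal illustration rather than WatchDog's actual code: get_active_url and on_change are hypothetical callables standing in for whatever the Browser Use SDK integration provides, and the interval and max_polls parameters are assumptions added so the loop can be tuned and tested.

```python
import asyncio
from typing import Awaitable, Callable, List, Optional


async def watch_active_tab(
    get_active_url: Callable[[], Awaitable[str]],  # hypothetical: returns the active tab's URL
    on_change: Callable[[str], Awaitable[None]],   # hypothetical: reacts to a newly seen URL
    interval: float = 0.5,
    max_polls: Optional[int] = None,               # None means poll until cancelled
) -> None:
    """Poll the active tab and call on_change only when its URL changes."""
    last_url: Optional[str] = None
    polls = 0
    while max_polls is None or polls < max_polls:
        url = await get_active_url()
        if url != last_url:
            last_url = url
            await on_change(url)
        polls += 1
        await asyncio.sleep(interval)


def run_demo() -> List[str]:
    """Drive the loop with a scripted sequence of URLs instead of a real browser."""
    urls = iter([
        "https://a.example/notes",
        "https://a.example/notes",      # unchanged tab: no callback expected
        "https://video.example/watch",
    ])
    seen: List[str] = []

    async def fake_get_active_url() -> str:
        return next(urls)

    async def record(url: str) -> None:
        seen.append(url)

    asyncio.run(watch_active_tab(fake_get_active_url, record, interval=0, max_polls=3))
    return seen
```

Reacting only to URL changes keeps the expensive steps (database lookup, LLM call) off the hot path, so the polling interval can stay short.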

How we built it

WatchDog is built as an autonomous state machine powered by Python and the Browser Use SDK.

  • Core Architecture: Async loops and the Browser Use SDK continuously monitor active tabs and compare their URLs against a SQLite database that links URLs, tasks, and task groups
  • Productivity Agent: Sends page URLs to the Gemini API to analyze their content, distinguishing productive sites from distracting ones
  • Workspace Automation: Autonomously organizes Chrome windows for the user, letting the agent group and label tabs by task
  • Persistence: A SQLite database maps specific assignments and seed URLs to higher-level course groups, allowing the agent to determine whether new tabs align with active tasks
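
To make the persistence layer concrete, here is one possible shape for it using Python's built-in sqlite3 module. The table names, column names, and the host-based matching rule are all assumptions made for illustration; the real schema may differ.

```python
import sqlite3
from typing import Iterable, Optional, Tuple
from urllib.parse import urlparse

# Hypothetical schema: course groups contain tasks, and tasks own seed hosts.
SCHEMA = """
CREATE TABLE task_groups (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE tasks (
    id       INTEGER PRIMARY KEY,
    group_id INTEGER NOT NULL REFERENCES task_groups(id),
    title    TEXT NOT NULL
);
CREATE TABLE seed_urls (
    id      INTEGER PRIMARY KEY,
    task_id INTEGER NOT NULL REFERENCES tasks(id),
    host    TEXT NOT NULL
);
"""


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Create a fresh database with the sketch schema (in memory by default)."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_task(conn: sqlite3.Connection, group: str, title: str,
             seed_urls: Iterable[str]) -> int:
    """Insert a task under a course group and register the hosts of its seed URLs."""
    conn.execute("INSERT OR IGNORE INTO task_groups (name) VALUES (?)", (group,))
    group_id = conn.execute(
        "SELECT id FROM task_groups WHERE name = ?", (group,)
    ).fetchone()[0]
    task_id = conn.execute(
        "INSERT INTO tasks (group_id, title) VALUES (?, ?)", (group_id, title)
    ).lastrowid
    conn.executemany(
        "INSERT INTO seed_urls (task_id, host) VALUES (?, ?)",
        [(task_id, urlparse(u).netloc.lower()) for u in seed_urls],
    )
    conn.commit()
    return task_id


def match_task(conn: sqlite3.Connection, url: str) -> Optional[Tuple[str, str]]:
    """Return (group name, task title) if the URL's host matches a seed, else None."""
    host = urlparse(url).netloc.lower()
    return conn.execute(
        """
        SELECT g.name, t.title
        FROM seed_urls s
        JOIN tasks t ON t.id = s.task_id
        JOIN task_groups g ON g.id = t.group_id
        WHERE s.host = ?
        LIMIT 1
        """,
        (host,),
    ).fetchone()
```

Matching on host alone is deliberately coarse: it answers "is this tab obviously part of an active task?" cheaply, and leaves ambiguous pages to the Productivity Agent.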

Challenges

  • The False Positive Problem: Blocking "youtube.com" is easy, but blocking "procrastination" is hard. We struggled with sites that are "dual-use" (like YouTube or Reddit). We solved this by implementing the Productivity Agent, which performs a "second look" at the page content before taking action.
  • DOM Complexity: Sites like Canvas and Notion use heavy nesting and dynamic loading. We had to fine-tune our agent's "Execution Protocol" to handle login redirects and extract deep-linked assignment data without getting stuck in infinite loops.
  • State Management: Keeping the browser state, the local database, and the LLM's context in sync during fast-paced browsing required a robust async event loop that could handle interruptions gracefully.
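
The "second look" described under the False Positive Problem can be sketched as a two-stage decision: a cheap host check first, then a content classifier only for dual-use sites. Everything named here is an assumption for illustration. The host sets are hypothetical, and keyword_classifier is a trivial stand-in for the Gemini-backed Productivity Agent, not a description of how that agent works.

```python
from enum import Enum
from typing import Callable
from urllib.parse import urlparse


class Verdict(Enum):
    ALLOW = "allow"
    REDIRECT = "redirect"


# Hypothetical host lists; a real deployment would load these from configuration.
DUAL_USE_HOSTS = {"www.youtube.com", "www.reddit.com"}
BLOCKED_HOSTS = {"games.example"}


def keyword_classifier(page_text: str, task: str) -> bool:
    """Stand-in classifier: 'productive' if any task word appears in the page text."""
    words = {w for w in task.lower().split() if len(w) > 3}
    text = page_text.lower()
    return any(w in text for w in words)


def decide(url: str, page_text: str, task: str,
           classify: Callable[[str, str], bool] = keyword_classifier) -> Verdict:
    """Stage 1: decide by host. Stage 2: ask the classifier only for dual-use hosts."""
    host = urlparse(url).netloc.lower()
    if host in BLOCKED_HOSTS:
        return Verdict.REDIRECT
    if host not in DUAL_USE_HOSTS:
        # Assumption of this sketch: hosts on neither list are left alone.
        return Verdict.ALLOW
    return Verdict.ALLOW if classify(page_text, task) else Verdict.REDIRECT
```

Passing the classifier in as a parameter keeps the decision logic testable without network access and lets the LLM call be swapped or mocked.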

Takeaways

We learned that agentic browsing is the next frontier of productivity. Moving from "Search" (user finds info) to "Mission" (agent gathers context) changes how we interact with the web. We also learned the power of structured prompting (XML) to get more consistent and reliable behavior out of non-deterministic LLMs in a setting like focus management, where a wrong call interrupts the user.
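
As an illustration of the structured prompting idea, the sketch below builds an XML request and strictly parses an XML reply. The tag names, the label attribute, and the "unknown" fallback are assumptions chosen for this example; they are not WatchDog's actual prompt format, and no model is called here.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

ALLOWED_LABELS = {"productive", "distracting"}


def build_prompt(task: str, url: str, page_title: str) -> str:
    """Wrap the inputs in XML so the model sees clearly delimited fields."""
    return (
        "<request>\n"
        f"  <task>{escape(task)}</task>\n"
        f"  <url>{escape(url)}</url>\n"
        f"  <page_title>{escape(page_title)}</page_title>\n"
        "  <instructions>Reply with a single verdict element whose label "
        "attribute is productive or distracting.</instructions>\n"
        "</request>"
    )


def parse_verdict(reply: str) -> str:
    """Accept only a well-formed <verdict label="..."> reply; otherwise 'unknown'."""
    try:
        root = ET.fromstring(reply.strip())
    except ET.ParseError:
        return "unknown"
    if root.tag != "verdict":
        return "unknown"
    label = root.get("label", "").lower()
    return label if label in ALLOWED_LABELS else "unknown"
```

The model's output is still non-deterministic, but a strict parser with an explicit fallback means the surrounding state machine only ever sees one of three values.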
