We all have that moment: "I'll just watch one YouTube video," and suddenly it's 3 hours later.

I realized that when I take an online proctored exam, I am hyper-focused. I don't dare open a new tab because I know the system is watching. I wanted to bring that same "ruthless accountability" to my daily deep work sessions.

The existing tools (website blockers) are too easy to bypass or too annoying to set up. I wanted something that felt like a human proctor: intelligent enough to see what I'm doing and strict enough to call me out instantly.

🤖 What it does

MindTussle is an AI-powered tactical focus shield.

  1. You define a Mission: "I want to solve 5 LeetCode problems."
  2. You define Allowed Tools: "LeetCode, StackOverflow, Spotify."
  3. The AI Watches You: The app uses your screen-sharing stream + a Chrome Extension to monitor your activity in real time.

If you open an unauthorized site (like Twitter/X or Netflix), MindTussle triggers a Red Alert immediately, penalizing your "Integrity Score". It's not just a blocker; it's an Active Defense System for your attention.
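
To make the flow above concrete, here is a minimal sketch of how a Mission, its Allowed Tools, and the Integrity Score could be modeled. The field names, the 0 to 100 score range, and the 10-point penalty are all assumptions for illustration; the actual scoring rules are not described here.

```typescript
// Hypothetical shapes: the write-up names a Mission, Allowed Tools, and an
// Integrity Score, but does not specify their fields or the penalty size.
interface Mission {
  goal: string;
  allowedTools: string[];
}

interface SessionState {
  integrityScore: number; // assumed to run from 0 to 100
  alerts: string[];
}

const PENALTY_PER_VIOLATION = 10; // assumed value

// A fresh session starts with a full score and no alerts.
function startSession(): SessionState {
  return { integrityScore: 100, alerts: [] };
}

// Returns a new state with the score reduced (never below 0) and a
// Red Alert entry appended for the offending site.
function recordViolation(state: SessionState, site: string): SessionState {
  return {
    integrityScore: Math.max(0, state.integrityScore - PENALTY_PER_VIOLATION),
    alerts: [...state.alerts, `RED ALERT: ${site}`],
  };
}

const exampleMission: Mission = {
  goal: "Solve 5 LeetCode problems",
  allowedTools: ["LeetCode", "StackOverflow", "Spotify"],
};
```

Keeping the state immutable like this would let a React component store it directly with `useState`, though that is a design choice of the sketch, not a statement about the real app.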

⚙️ How we built it

We built a dual-layer detection system so that each layer covers the other's blind spots:

Layer 1: The "Visual Cortex" (Gemini 1.5 Flash)

We use the Google Gemini API (1.5 Flash) to analyze screenshots of the user's screen every 5 seconds.

  • It "reads" tab titles, URL bars, and even application windows.
  • It understands context, handling complex "Allowed" rules that regex can't express.
  • Challenge: Disabling the default "Safety Filters" was crucial because screen text often triggers false positives!
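
As a rough illustration of Layer 1, the sketch below builds a request body for the Gemini REST `generateContent` endpoint with a screenshot attached and every safety category set to `BLOCK_NONE`. The function name, the prompt wording, and the JPEG encoding are assumptions; the project's real prompt and capture code are not shown in this write-up.

```typescript
// Sketch of a generateContent request body for the Gemini REST API.
// buildProctorRequest and the prompt text are hypothetical, not the project's code.
type Part =
  | { text: string }
  | { inline_data: { mime_type: string; data: string } };

interface GeminiRequest {
  contents: { parts: Part[] }[];
  safetySettings: { category: string; threshold: string }[];
}

// The four adjustable harm categories exposed by the Gemini API.
const HARM_CATEGORIES = [
  "HARM_CATEGORY_HARASSMENT",
  "HARM_CATEGORY_HATE_SPEECH",
  "HARM_CATEGORY_SEXUALLY_EXPLICIT",
  "HARM_CATEGORY_DANGEROUS_CONTENT",
];

function buildProctorRequest(
  mission: string,
  allowedTools: string[],
  screenshotBase64: string, // assumed to be a base64-encoded JPEG frame
): GeminiRequest {
  const prompt =
    `The user's mission is: "${mission}". ` +
    `Allowed tools: ${allowedTools.join(", ")}. ` +
    `Look at the screenshot and answer ALLOWED or VIOLATION with a one-line reason.`;

  return {
    contents: [
      {
        parts: [
          { text: prompt },
          { inline_data: { mime_type: "image/jpeg", data: screenshotBase64 } },
        ],
      },
    ],
    // Relax every category so text-heavy screenshots are not refused.
    safetySettings: HARM_CATEGORIES.map((category) => ({
      category,
      threshold: "BLOCK_NONE",
    })),
  };
}
```

A caller would POST this body as JSON to the `gemini-1.5-flash:generateContent` endpoint once per captured frame (every 5 seconds in this project) and parse the text of the response.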

Layer 2: The "Nervous System" (Chrome Extension)

For absolute precision, we built a Manifest V3 Chrome Extension that acts as a background agent.

  • It monitors all open tabs (even hidden ones).
  • If it detects a blacklisted domain, it sends a PROCTOR_VIOLATION message directly to the React frontend.
  • This creates a Cross-Context Bridge between the isolated extension world and our Next.js app.
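
The domain check at the heart of Layer 2 can be sketched as a pure function. Only the `PROCTOR_VIOLATION` message name comes from the project; the message fields, the helper names, and the suffix-matching rule are assumptions made for this example.

```typescript
// Hypothetical message shape; only the PROCTOR_VIOLATION name comes from the write-up.
interface ProctorViolation {
  type: "PROCTOR_VIOLATION";
  host: string;
}

// Extracts the lowercase hostname from a URL like "https://www.example.com/path".
// Returns null for strings without a "scheme://host" prefix (e.g. "about:blank").
function hostnameOf(url: string): string | null {
  const match = /^[a-z][a-z0-9+.-]*:\/\/([^/?#:]+)/i.exec(url);
  const host = match?.[1];
  return host ? host.toLowerCase() : null;
}

// Returns a violation message if the tab's host is a blocked domain or one of
// its subdomains, otherwise null.
function checkTab(url: string, blockedDomains: string[]): ProctorViolation | null {
  const host = hostnameOf(url);
  if (host === null) return null;
  const blocked = blockedDomains.some(
    (domain) => host === domain || host.endsWith("." + domain),
  );
  return blocked ? { type: "PROCTOR_VIOLATION", host } : null;
}
```

In the extension's background service worker, a function like this would be called from a `chrome.tabs.onUpdated` listener, and any non-null result forwarded to the web app. Matching on the full domain or a dot-prefixed suffix avoids false hits such as `max.com` matching a blocked `x.com`.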

Tech Stack:

  • Frontend: Next.js 14, React, Framer Motion (for the Cyberpunk UI).
  • AI Core: Google Gemini 1.5 Flash (Multimodal Vision).
  • Extension: Chrome Extension API (Tabs, Runtime, Messaging).
  • Styling: Tailwind CSS.

🧠 Challenges we ran into

  1. "AI Unavailable" Errors: We initially hit 404 errors when calling older Gemini models. We added fallback logic and switched to the stable gemini-1.5-flash model.
  2. Safety Filters: The AI kept refusing to analyze screenshots of text-heavy coding sites, flagging them as "unsafe". We had to manually override the safety thresholds to BLOCK_NONE to get it to work as a proctor.
  3. Real-Time Latency: Sending images to an API is slow. We optimized this by adding the secondary Chrome Extension Layer which provides instant (<100ms) feedback for tab switching, while the AI handles the deeper visual context.

🏆 Accomplishments that we're proud of

  • Building a "Ruthless Proctor" mode that genuinely feels like having an invigilator watching you.
  • Successfully bridging the gap between a Web App (Next.js) and a Browser Extension for seamless two-way communication.
  • Creating a UI that feels "Tactical" and "Gamified" rather than boring and restrictive.

🚀 What's next for MindTussle

  • Eye Tracking: Using the webcam to detect when you look away from the screen (using TensorFlow.js).
  • Team Mode: "Multiplayer Focus" where you and your friends can see each other's Integrity Scores in real-time.
  • Hardware Integration: Maybe a physical red light that turns on when you get distracted!

Built With

  • mern
  • next