We all have that moment: "I'll just watch one YouTube video," and suddenly it's 3 hours later.
I realized that when I take an online proctored exam, I am hyper-focused. I don't dare open a new tab because I know the system is watching. I wanted to bring that same "ruthless accountability" to my daily deep work sessions.
The existing tools (website blockers) are too easy to bypass or too annoying to set up. I wanted something that felt like a human proctor: intelligent enough to see what I'm doing and strict enough to call me out instantly.
🤖 What it does
MindTussle is an AI-powered tactical focus shield.
- You define a Mission: "I want to solve 5 LeetCode problems."
- You define Allowed Tools: "LeetCode, StackOverflow, Spotify."
- The AI Watches You: The app uses your screen sharing stream + a Chrome Extension to monitor your activity in real-time.
If you open an unauthorized site (like Twitter/X or Netflix), MindTussle triggers a Red Alert immediately, penalizing your "Integrity Score". It's not just a blocker; it's an Active Defense System for your attention.
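To make the Mission / Allowed Tools / Integrity Score idea concrete, here is a minimal sketch of how a session could be modeled. The field names, the starting score of 100, the 10-point penalty, and the mapping from tool names to domains are all illustrative assumptions, not the app's actual code.

```typescript
// Hypothetical shapes: every name and number here is an assumption for illustration.
interface Mission {
  goal: string;
  allowedDomains: string[]; // e.g. "leetcode.com" standing in for the "LeetCode" tool
}

interface Session {
  mission: Mission;
  integrityScore: number; // assumed to start at 100 and never drop below 0
  alerts: string[];
}

const PENALTY = 10; // assumed cost of one violation

function startSession(mission: Mission): Session {
  return { mission, integrityScore: 100, alerts: [] };
}

// A host is allowed if it equals an allowed domain or is a subdomain of one.
function isAllowed(hostname: string, mission: Mission): boolean {
  const host = hostname.toLowerCase();
  return mission.allowedDomains.some(
    (domain) => host === domain || host.endsWith("." + domain)
  );
}

// Returns a new session; an unauthorized visit records a Red Alert and costs points.
function recordVisit(session: Session, url: string): Session {
  const hostname = new URL(url).hostname;
  if (isAllowed(hostname, session.mission)) return session;
  return {
    ...session,
    integrityScore: Math.max(0, session.integrityScore - PENALTY),
    alerts: [...session.alerts, `RED ALERT: ${hostname}`],
  };
}

const mission: Mission = {
  goal: "Solve 5 LeetCode problems",
  allowedDomains: ["leetcode.com", "stackoverflow.com", "spotify.com"],
};

let session = startSession(mission);
session = recordVisit(session, "https://leetcode.com/problems/two-sum/");
session = recordVisit(session, "https://www.netflix.com/browse");
console.log(session.integrityScore, session.alerts);
```

Matching on the dot-prefixed suffix (rather than a plain substring) keeps a lookalike host such as "notleetcode.com" from slipping through as allowed.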
⚙️ How we built it
We built a dual-layer detection system so that each layer covers the other's weaknesses:
Layer 1: The "Visual Cortex" (Gemini 1.5 Flash)
We use the Google Gemini API (1.5 Flash) to analyze screenshots of the user's screen every 5 seconds.
- It "reads" tab titles, URL bars, and even application windows.
- It understands context, handling complex "Allowed" rules that regex can't.
- Challenge: Disabling the default "Safety Filters" was crucial because screen text often triggers false positives!
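As a rough illustration of the safety-filter override, the sketch below builds a request body for the Gemini REST `generateContent` endpoint with every harm category set to `BLOCK_NONE`. The helper name `buildProctorRequest`, the prompt wording, and the PNG encoding are assumptions; the actual project may well use the official SDK instead of raw REST.

```typescript
// Sketch only: the helper name and prompt text are hypothetical, not the project's code.
type SafetySetting = { category: string; threshold: string };

const HARM_CATEGORIES = [
  "HARM_CATEGORY_HARASSMENT",
  "HARM_CATEGORY_HATE_SPEECH",
  "HARM_CATEGORY_SEXUALLY_EXPLICIT",
  "HARM_CATEGORY_DANGEROUS_CONTENT",
];

function buildProctorRequest(
  screenshotBase64: string,
  mission: string,
  allowedTools: string[]
) {
  // Relax every category so text-heavy screenshots are not refused outright.
  const safetySettings: SafetySetting[] = HARM_CATEGORIES.map((category) => ({
    category,
    threshold: "BLOCK_NONE",
  }));

  // Assumed prompt: ask for a one-word verdict that is easy to parse.
  const prompt =
    `The user's mission is: "${mission}". ` +
    `Allowed tools: ${allowedTools.join(", ")}. ` +
    `Look at the screenshot and answer ALLOWED or VIOLATION.`;

  return {
    contents: [
      {
        parts: [
          { text: prompt },
          { inline_data: { mime_type: "image/png", data: screenshotBase64 } },
        ],
      },
    ],
    safetySettings,
  };
}

// The body would be POSTed as JSON to this endpoint (API key supplied separately).
const MODEL = "gemini-1.5-flash";
const endpoint =
  `https://generativelanguage.googleapis.com/v1beta/models/${MODEL}:generateContent`;

const body = buildProctorRequest("AAAA", "Solve 5 LeetCode problems", [
  "LeetCode",
  "StackOverflow",
  "Spotify",
]);
console.log(endpoint, JSON.stringify(body).length);
```

Asking for a single-word verdict is just one option; a structured JSON answer would also work if the frontend needs a reason to display alongside the alert.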
Layer 2: The "Nervous System" (Chrome Extension)
For absolute precision, we built a Manifest V3 Chrome Extension that acts as a background agent.
- It monitors all open tabs (even hidden ones).
- If it detects a blacklisted domain, it sends a PROCTOR_VIOLATION message directly to the React frontend.
- This creates a Cross-Context Bridge between the isolated extension world and our Next.js app.
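The two ends of such a bridge can be sketched as a pair of pure functions: one the extension could use to decide whether a tab URL is a violation, and one the frontend could use to validate whatever arrives in a `message` event. Only the `PROCTOR_VIOLATION` name comes from the description above; the payload fields and function names are hypothetical. The browser wiring is left out so the snippet runs anywhere: in practice the check would probably sit in a `chrome.tabs.onUpdated` listener, with a content script relaying the result to the page via `window.postMessage`.

```typescript
// Hypothetical payload: only the "PROCTOR_VIOLATION" type name is from the write-up.
interface ProctorViolation {
  type: "PROCTOR_VIOLATION";
  hostname: string;
  detectedAt: number; // epoch milliseconds
}

// Extension side: turn a tab URL into a violation message, or null if it is fine.
function checkTab(
  url: string,
  blacklist: string[],
  now: number = Date.now()
): ProctorViolation | null {
  let hostname: string;
  try {
    hostname = new URL(url).hostname.toLowerCase();
  } catch {
    return null; // not a parseable URL, e.g. an empty string
  }
  const hit = blacklist.some(
    (domain) => hostname === domain || hostname.endsWith("." + domain)
  );
  return hit ? { type: "PROCTOR_VIOLATION", hostname, detectedAt: now } : null;
}

// Frontend side: message data is untrusted, so check its shape before using it.
function isProctorViolation(data: unknown): data is ProctorViolation {
  if (typeof data !== "object" || data === null) return false;
  const msg = data as Record<string, unknown>;
  return (
    msg.type === "PROCTOR_VIOLATION" &&
    typeof msg.hostname === "string" &&
    typeof msg.detectedAt === "number"
  );
}

const blacklist = ["twitter.com", "x.com", "netflix.com"];
const violation = checkTab("https://www.netflix.com/browse", blacklist, 1234);
console.log(violation, isProctorViolation(violation));
```

Validating on the receiving side matters because any script on the page can post messages, not just the extension.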
Tech Stack:
- Frontend: Next.js 14, React, Framer Motion (for the Cyberpunk UI).
- AI Core: Google Gemini 1.5 Flash (Multimodal Vision).
- Extension: Chrome Extension API (Tabs, Runtime, Messaging).
- Styling: Tailwind CSS.
🧠 Challenges we ran into
- "AI Unavailable" Errors: We initially struggled with 404 errors from older Gemini models. We had to implement robust fallback logic and switch to the stable gemini-1.5-flash model.
- Safety Filters: The AI kept refusing to analyze screenshots of text-heavy coding sites, flagging them as "unsafe". We had to manually override the safety thresholds to BLOCK_NONE to get it to work as a proctor.
- Real-Time Latency: Sending images to an API is slow. We optimized this by adding the secondary Chrome Extension Layer, which provides instant (<100ms) feedback for tab switching, while the AI handles the deeper visual context.
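One way the model fallback could look is sketched below: try each model name in order and move on only when the call fails with a 404. The `callModel` parameter is an injected stand-in for the real API call, the `status` property on the error is an assumed convention, and "some-retired-model" is a made-up placeholder name.

```typescript
// `callModel` is a hypothetical stand-in for whatever actually calls the API.
type CallModel = (model: string) => Promise<string>;

// Assumed convention: "model not found" errors carry a numeric `status` of 404.
function isNotFound(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { status?: unknown }).status === 404
  );
}

// Try each model in order; fall through only on 404, rethrow anything else.
async function generateWithFallback(
  models: string[],
  callModel: CallModel
): Promise<{ model: string; text: string }> {
  let lastError: unknown = new Error("no models configured");
  for (const model of models) {
    try {
      return { model, text: await callModel(model) };
    } catch (err) {
      if (!isNotFound(err)) throw err;
      lastError = err;
    }
  }
  throw lastError;
}

// Fake caller for demonstration: every model except gemini-1.5-flash "404s".
const fakeCall: CallModel = async (model) => {
  if (model !== "gemini-1.5-flash") {
    throw Object.assign(new Error("model not found"), { status: 404 });
  }
  return "ALLOWED";
};

generateWithFallback(["some-retired-model", "gemini-1.5-flash"], fakeCall).then(
  (result) => console.log(result.model, result.text)
);
```

Rethrowing non-404 errors keeps real problems (bad API key, rate limits) visible instead of silently burning through the whole model list.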
🏆 Accomplishments that we're proud of
- Building a "Ruthless Proctor" mode that genuinely feels like having an invigilator watching you.
- Successfully bridging the gap between a Web App (Next.js) and a Browser Extension for seamless two-way communication.
- Creating a UI that feels "Tactical" and "Gamified" rather than boring and restrictive.
🚀 What's next for MindTussle
- Eye Tracking: Using the webcam to detect when you look away from the screen (using TensorFlow.js).
- Team Mode: "Multiplayer Focus" where you and your friends can see each other's Integrity Scores in real-time.
- Hardware Integration: Maybe a physical red light that turns on when you get distracted!
Built With
- mern
- next