Inspiration

As students, we often struggle to stay focused during online classes or self-study sessions. Distractions like drowsiness, yawning, or simply looking away from the screen can go unnoticed, leading to lost productivity. We wanted to build a tool that gives students real-time feedback on their focus, using only their webcam and machine learning models trained from scratch, with no black-box APIs and no shortcuts.

What it does

  • Tracks student distraction in real time using the webcam.
  • Detects eye closure (drowsiness, micro-sleep) and yawning (fatigue) using custom-trained CNNs.
  • Ignores normal blinks (blink smoothing) so only real distraction is flagged.
  • Shows live overlays for eyes and mouth, a distraction timeline, and session stats (focused/distracted time, streaks).
  • All ML models are trained from scratch—no LLMs, no pre-trained APIs, no wrappers.
  • Runs in the browser plus a backend you host yourself. The only data that leaves your device is the cropped eye and mouth images sent to that backend.
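
As a rough illustration of the session stats mentioned above (focused/distracted time and streaks), here is a minimal sketch. The class name `SessionStats`, its fields, and the fixed sampling interval are assumptions made for this example, not the project's actual code.

```python
from dataclasses import dataclass


@dataclass
class SessionStats:
    """Hypothetical accumulator for focus statistics (illustrative only)."""

    focused_seconds: float = 0.0
    distracted_seconds: float = 0.0
    current_streak_seconds: float = 0.0
    longest_streak_seconds: float = 0.0

    def update(self, focused: bool, interval: float = 0.5) -> None:
        """Record one sample; `interval` is the assumed time between samples."""
        if focused:
            self.focused_seconds += interval
            self.current_streak_seconds += interval
            self.longest_streak_seconds = max(
                self.longest_streak_seconds, self.current_streak_seconds
            )
        else:
            self.distracted_seconds += interval
            # A distracted sample ends the current focus streak.
            self.current_streak_seconds = 0.0


stats = SessionStats()
for sample in [True, True, False, True, True, True]:
    stats.update(sample)
```

Keeping the streak as a running counter means each new sample is handled in constant time, which suits a tracker that updates continuously during a session.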

How we built it

  • Datasets:
    • MRL Eye Dataset (Kaggle) for eye state (open/closed).
    • Yawn Eye Dataset (Kaggle) for yawn detection.
  • ML Models:
    • Trained two lightweight CNNs (PyTorch): one for eye state, one for yawn detection.
    • All training scripts are open source and reproducible.
  • Backend:
    • FastAPI (Python) serves a /api/frame endpoint.
    • Loads our trained models and runs inference on left eye, right eye, and mouth crops sent from the frontend.
    • Implements blink smoothing and session stats.
  • Frontend:
    • HTML/JS with MediaPipe Face Mesh for live landmark detection and region cropping.
    • Sends left eye, right eye, and mouth crops to the backend every 0.5s.
    • Shows overlays, distraction timeline, and session stats.
  • Deployment:
    • Backend on Render (free tier), frontend on GitHub Pages or Vercel.
    • All code and training scripts are public and reproducible.
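
To make the backend step more concrete, the sketch below shows one way the three per-crop model outputs (left eye, right eye, mouth) could be combined into a single per-frame result. The function `classify_frame`, the `FrameResult` type, the 0.5 threshold, and the rule that both eyes must be closed are all assumptions for illustration; the real backend may combine predictions differently.

```python
from typing import NamedTuple


class FrameResult(NamedTuple):
    """Hypothetical per-frame result returned to the frontend."""

    eyes_closed: bool
    yawning: bool


def classify_frame(
    left_closed_prob: float,
    right_closed_prob: float,
    yawn_prob: float,
    threshold: float = 0.5,
) -> FrameResult:
    """Turn per-crop probabilities into boolean flags.

    Assumption: eyes count as closed only when BOTH eye crops exceed the
    threshold, so a single occluded or badly cropped eye is not flagged.
    """
    eyes_closed = (
        left_closed_prob >= threshold and right_closed_prob >= threshold
    )
    yawning = yawn_prob >= threshold
    return FrameResult(eyes_closed=eyes_closed, yawning=yawning)


both_closed = classify_frame(0.9, 0.8, 0.1)
one_closed_yawning = classify_frame(0.9, 0.2, 0.7)
```

In a setup like the one described, a function of this kind would run after model inference inside the handler for /api/frame, once per request from the frontend.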

Challenges we ran into

  • Dataset diversity: Real-world webcam conditions are very different from curated datasets. We had to tune our models and logic to handle normal blinks, varying lighting, and changes in head pose.
  • Blink smoothing: Naively marking every closed-eye frame as “distracted” led to false positives. We implemented temporal smoothing to ignore normal blinks.
  • Deployment: We had to make sure the backend and frontend could talk to each other across different hosting platforms, and fix Python import paths so they worked both locally and in the cloud.
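
The blink smoothing idea can be sketched as a small consecutive-frame counter: a closed-eye reading only counts as distraction once it has persisted for several samples in a row. The class name `BlinkSmoother` and the default of two consecutive samples are assumptions for this example, not the project's actual parameters.

```python
class BlinkSmoother:
    """Hypothetical temporal filter that ignores brief eye closures."""

    def __init__(self, min_closed_frames: int = 2) -> None:
        # Number of consecutive closed-eye samples required before flagging.
        self.min_closed_frames = min_closed_frames
        self._closed_run = 0

    def update(self, eyes_closed: bool) -> bool:
        """Feed one sample; return True only for a sustained closure."""
        if eyes_closed:
            self._closed_run += 1
        else:
            # Any open-eye sample resets the run, so isolated blinks never flag.
            self._closed_run = 0
        return self._closed_run >= self.min_closed_frames


smoother = BlinkSmoother(min_closed_frames=2)
samples = [False, True, False, True, True, True, False]
flags = [smoother.update(s) for s in samples]
```

Here the single closed sample at position 1 is treated as a blink and ignored, while the run of three closed samples is flagged from its second sample onward. The trade-off is a short detection delay in exchange for fewer false positives.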

Accomplishments that we're proud of

  • End-to-end real ML web app: All models trained from scratch, no black-box APIs.
  • Robust, real-time distraction detection: Works in real-world conditions, ignores normal blinks, and provides actionable feedback.
  • Open source and reproducible: Anyone can retrain the models, run the app, and extend it for their own needs.
  • Professional UI/UX: Live overlays, timeline, and stats make the app easy and fun to use.

What we learned

  • How to train and deploy lightweight CNNs for real-time inference.
  • How to use MediaPipe Face Mesh for robust, browser-based landmark detection.
  • The importance of temporal logic (blink smoothing) for real-world usability.
  • How to build, document, and deploy a full-stack ML web app for hackathon judging.

What's next for Student Productivity Tracker

  • Add gaze direction detection to track if the student is looking away from the screen.
  • Personalized feedback and analytics for students and teachers.
  • Mobile support for distraction tracking on any device.
  • More distraction cues: hand-to-face, phone usage, etc.
  • Open source community: Invite contributions and new features from other students and developers.
