Inspiration
MetaCog OS grew out of a problem I kept seeing around me: productivity apps and study tools. What I find are timers, to-do lists, or unnecessarily complicated applications that take 20 YouTube tutorials to figure out. It feels as if these apps are made with money or fame in mind, not student experience and productivity.
One aspect in particular was on my mind while building: AI tutors. Dozens of websites claim to offer a "personalized AI tutor", but they are just reactive LLMs or agents that give away the answer to every question. No real accountability. No cognitive load on your brain.
Finally, frustrated by having to pay hundreds of dollars to talk to a chatbot and by not finding tools that actually help me stay on track, I decided to build MetaCog OS. Simple features, but with clever twists that deliver results. For free.
What Problem Does MetaCog OS Solve?
Students today face two critical challenges that existing tools fail to address:
1. Distraction epidemic: Phones and tab-switching destroy focus sessions, but traditional blockers are either too rigid or easily bypassed. Students need real-time accountability that adapts to their workflow.
2. Passive learning trap: AI tutors that directly answer questions create dependency, not understanding. Students need tools that force active recall and genuine cognitive engagement, not chatbots that do the thinking for them.
MetaCog OS tackles both: AI-powered distraction detection that holds you accountable in real-time, and a learning system designed around active recall and gradual guidance, not instant answers.
What it does
MetaCog OS is currently split into two cognitive aspects plus one unifying portal: Cognitive Discipline and Cognitive Training, along with a portal that analyzes performance across both aspects and generates a weekly report, explains what MetaCog OS is, and lets users sign up for or log in to the app.
Cognitive Discipline:
Features:
1. Real-Time Distraction Detection:
- AI-powered computer vision detects phone usage during focus sessions
- Detects tab switches to keep you on task
- Toggle detection on/off anytime via settings
- Instant accountability with timer pause and alert sounds
2. Comprehensive Analytics Dashboard:
- Weekly focus data visualization with bar graphs
- Best/worst day performance comparisons
- Session duration analysis
- Focus-to-distraction ratios via interactive donut charts
3. Built-In Productivity Tools:
- Integrated to-do lists, no need for external apps
- Seamless workflow without app-switching
- AI-powered text summarization (Gemini API) for research and study sessions
Cognitive Training:
Features:
1. Context-Aware AI Chat (Saathi):
- Powered by Gemini 2.5 Flash for low latency and reliable reasoning
- Breaks down complex STEM problems into digestible steps
- Never gives direct answers; it guides you to the solution
- Context-aware reasoning for accurate, meaningful responses
2. Instant Quiz Generation:
- Upload PDFs, .txt, .docx, or paste text
- Generates 5-20 question MCQ quizzes in seconds
- Supports mixed input (text + document) for comprehensive quizzes
- Perfect for active recall practice
3. Intelligent Performance Tracking:
- Chart.js visualizations showing accuracy trends across your last 10, 15, 20, 25, and 30 quizzes
- Overall statistics dashboard: total quizzes, questions answered, current streak
- Brutal reality check: see exactly where you stand
4. Automatic Revision System:
- After 15 wrong answers, Saathi auto-generates a revision quiz
- Forces you to confront your weak spots
- No more avoiding what you don't know
5. Personalized Learning Prompts:
- After each quiz, get a custom prompt explaining what you missed and why
- Copy-paste into Saathi's chat for targeted explanations
- Bridges the gap between practice and understanding
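The Automatic Revision System above can be sketched as a small counter that collects missed questions and hands back a batch once the threshold of 15 is reached. This is a minimal illustration, not the app's actual code: the `RevisionTracker` class, its method names, and the choice to reset the list after each trigger are all assumptions.

```python
from typing import List, Optional

REVISION_THRESHOLD = 15  # wrong answers before a revision quiz, per the feature list


class RevisionTracker:
    """Hypothetical helper that collects missed questions for one user."""

    def __init__(self, threshold: int = REVISION_THRESHOLD) -> None:
        self.threshold = threshold
        self.missed: List[str] = []

    def record_wrong(self, question: str) -> Optional[List[str]]:
        """Store a missed question; return a revision batch once the threshold is hit."""
        self.missed.append(question)
        if len(self.missed) >= self.threshold:
            # Assumption: the counter starts over after each revision quiz.
            batch, self.missed = self.missed, []
            return batch
        return None


tracker = RevisionTracker()
results = [tracker.record_wrong(f"question {i}") for i in range(15)]
```

In this sketch the first 14 calls return `None`, and the 15th returns the full list of missed questions, which could then be passed to the quiz generator.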
Portal:
MetaCog OS isn't just two separate apps; it's a unified system.
Weekly AI Reports
- Generated every Sunday
- Analyzes performance across both Cognitive Training and Cognitive Discipline
- Identifies patterns, strengths, and areas for improvement
- Actionable insights to optimize your learning workflow
How I built it
Tech Stack
Backend: Django
Database: PostgreSQL and Supabase for auth
AI Models:
- Gemini 2.5 Flash
- Cerebras
- PyTorch (fine-tuned VGG16 model for phone detection)
Frontend: HTML, Tailwind CSS, JavaScript, Chart.js (built using Stitch)
Computer Vision: OpenCV, PyTorch, Ray Tune, and a custom-generated dataset for both training and testing, due to the lack of similar datasets online
I used multiple Django apps to manage the endpoints, templates, and static files efficiently. I chose Django for its built-in CSRF protection and SQL injection prevention.
I started by training the model, which reached a precision of 92% and an F1 score of 71%. I used Ray Tune to find good hyperparameters, created a custom dataset that I made as diverse as I could with limited resources, and fine-tuned VGG16 with those hyperparameters on that dataset. To reduce false positives and compensate for the model's average performance, I implemented a multi-layer validation system that pauses the timer only when several checks pass, such as the average of the past 10 confidence scores exceeding 0.6.
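One layer of that validation, the rolling average over the past 10 confidence scores, might look like the sketch below. The window of 10 and the 0.6 threshold come from the description above; the `DistractionGate` class, its method names, and the choice to wait for a full window before pausing are assumptions made for illustration.

```python
from collections import deque


class DistractionGate:
    """Hypothetical gate: pause only when recent confidences are high on average."""

    def __init__(self, window: int = 10, threshold: float = 0.6) -> None:
        self.scores = deque(maxlen=window)  # keeps only the newest `window` scores
        self.threshold = threshold

    def update(self, confidence: float) -> bool:
        """Record one frame's phone confidence; return True if the timer should pause."""
        self.scores.append(confidence)
        # Assumption: require a full window so a single early spike cannot pause.
        if len(self.scores) < self.scores.maxlen:
            return False
        return sum(self.scores) / len(self.scores) > self.threshold


gate = DistractionGate()
# One false-positive spike, then nine clear frames, then sustained phone use.
frames = [0.95] + [0.1] * 9 + [0.9] * 10
decisions = [gate.update(c) for c in frames]
```

With this input, the isolated 0.95 spike never pauses the timer; the gate only fires after several consecutive high-confidence frames push the window's mean above 0.6.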
I built the frontend using Stitch while also developing the backend with Django.
Challenges I ran into
1. Supabase RLS policies: Row Level Security prevents unauthenticated users from inserting data into tables like profiles. But new users need a profile row before they can authenticate. The result was a circular dependency: new users couldn't create profiles because they weren't authenticated, and couldn't authenticate without an existing profile. So, instead of using Supabase for cloud storage, I connected PostgreSQL to Django and set up a local database.
2. Model overfitting: Initially, model performance was below what I expected, and I had no idea how to approach it besides scaling; at some point, scaling stopped working too. That's when I started researching and found out about Ray Tune, along with techniques like dropout and data augmentation.
3. Quiz parsing: No matter what prompt I used, Gemini did not always follow it and occasionally returned a plain-text response, which broke quiz generation because json.loads() failed immediately. Manually parsing the text into a dictionary of lists was possible, but there would still be no guarantee about the structure of the response. After a LOT of googling, I came across the documentation, which told me I could just change the response type. Now the model returns only a JSON object every time.
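The response type change mentioned here presumably refers to the Gemini generation config option `response_mime_type`, set to `"application/json"`. Even with JSON guaranteed, it can help to check the shape of the object before building a quiz from it. The sketch below shows one way to do that; the `parse_quiz` function and the `questions` / `question` / `options` / `answer` keys are a hypothetical schema, not the one MetaCog OS actually uses.

```python
import json
from typing import Dict, List


def parse_quiz(raw: str) -> List[Dict]:
    """Parse a JSON quiz string and validate a hypothetical MCQ schema."""
    data = json.loads(raw)  # raises json.JSONDecodeError on plain-text replies
    if not isinstance(data, dict) or not isinstance(data.get("questions"), list):
        raise ValueError("expected an object with a 'questions' list")
    for q in data["questions"]:
        if not isinstance(q.get("question"), str):
            raise ValueError("each question needs a text 'question' field")
        options = q.get("options")
        if not isinstance(options, list) or len(options) < 2:
            raise ValueError("each question needs at least two options")
        if q.get("answer") not in options:
            raise ValueError("'answer' must be one of the options")
    return data["questions"]


# Assumed example of what a JSON-mode response body could look like.
sample = '{"questions": [{"question": "2 + 2 = ?", "options": ["3", "4"], "answer": "4"}]}'
quiz = parse_quiz(sample)
```

A conversational reply such as "Sure! Here is your quiz" fails at `json.loads()`, while a well-formed but incomplete object fails with a `ValueError` that says which field is missing.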
Accomplishments that I'm proud of
1. Building the whole app, completely alone
2. Learning so much along the way, and genuinely having fun with it
3. Building something that helps students, something I know for sure that I and the people around me need
4. Fine-tuning a model with a custom dataset. That was NOT an easy task for sure
What I learned
1. Learned to use the Ray Tune library: how to define the search space, the objective function, and so on.
2. Learned a lot about the importance of Git and GitHub, and how easy they make it to view different versions of the code and go back to any of them. I had to look at past code many times while building the website to compare efficiency, check for bugs, etc.
3. Learned how to implement vector databases, especially using LangChain, and how to build an agentic AI with LangChain and define tools for it
4. Also realized the problems fellow students face. I took many of their struggles as things I should work on. For example, one friend told me that even when he puts away his phone, he struggles to resist the urge to switch to other tabs
5. How simple features with just a little twist can make a difference. I always thought I needed a revolutionary, mind-blowing idea. But it's the little changes you make in your surroundings that can nudge the world in the right direction
What's Next for MetaCog OS
My goal is to achieve production-ready model performance through user testing and data collection. Here's how I'm planning to get there:
Data Collection & Enhancement
Volunteer Program Expansion: I'm planning to launch a structured volunteer program where students from different educational backgrounds can contribute phone and non-phone samples. This will help me capture the natural variation in study spaces, lighting, facial features, etc.
Aggressive Data Augmentation: Beyond standard transformations, I want to implement context-aware augmentation that simulates real-world conditions: a phone far away from the webcam, phones with different cases or shapes, and much more.
Strategic Web Scraping: I'll be gathering diverse samples from educational resources, public datasets, and academic repositories to ensure the model sees writing styles from different countries, age groups, and educational systems.
Beta Testing Phase
Diverse Test Group: I'm recruiting beta testers from multiple countries and educational backgrounds: high school students, university students, and working professionals. The goal is to ensure MetaCog OS works regardless of where you're from.
Performance Benchmark: I'm setting a strict target of 95% accuracy on a geographically diverse test dataset.
Iterative Feedback Loop: Beta testers will use MetaCog OS in their actual study workflows, and I'll collect both quantitative metrics (accuracy rates, processing speed) and qualitative feedback (user experience pain points, feature requests) to refine the system.
Pre-Launch Validation
Before launch, I want to validate that MetaCog OS handles edge cases gracefully: mixed languages in notes, heavy annotations and diagrams, different paper types, and various scanning conditions. The model needs to be robust enough for real students in real study situations.
Infrastructure Scaling
Migration to Groq API
Currently, I'm limited by rate restrictions and inference speed with my existing setup. I'm planning to migrate to Groq's API, which should provide significantly faster inference and higher rate limits. This is critical for scaling MetaCog OS to handle multiple concurrent users without lag. Students need real-time feedback, not delayed detection that interrupts their flow state.