Inspiration

Graduate CS is brutal — not because any single concept is impossible, but because the semester is a distributed system with no central monitor. You're tracking 4 courses, 40+ assignments, 12 concepts, and a knowledge graph of dependencies spread across Canvas, Slack, your notes, and your memory. By the time you realize something is at risk, it's already too late.

We wanted to build something that treated the fragmented semester as a live reasoning problem — not a task list or a to-do app. Every student we knew used ChatGPT, but they were manually pasting context every session, losing continuity, and getting shallow answers with no grounding in their actual coursework. We asked: what if an AI had complete, structured knowledge of your semester and could reason across it proactively?

That question became Brain Brew.

What it does

Brain Brew is a reasoning-first academic mission control system for graduate students. It maintains a living model of your entire semester — courses, assignments, deadlines, lecture concepts, and how those concepts relate — and uses it to power five interconnected capabilities:

  1. Command Center (Triage Dashboard): A real-time feed that prioritizes what's at risk, surfaces course announcements, and flags every assignment as danger/warning/OK — before you ask. A voice briefing reads your day's priority alerts out loud using Kokoro TTS.

  2. Semester-Aware Chat: A chat interface powered by K2 Think V2 with full context injection — your courses, syllabi, upcoming deadlines, assignment triage, and knowledge graph state. It reasons across all of it. Ask it "which of my in-progress assignments has the most cross-course dependencies?" — it actually knows. Responses stream token-by-token via Server-Sent Events, and you can speak your questions and hear the responses.

  3. Study Mission Lab: Generate custom flashcards, adaptive quizzes, and structured study guides for any topic — grounded in your actual course material, not Wikipedia. Quiz performance auto-updates your mastery tier (Novice → Familiar → Proficient → Expert). A full audit trail tracks every mastery change and its source.

  4. Knowledge Graph: A 3D force-directed visualization of concepts across all four courses, with explicit labeled edges (e.g., "NP-Completeness motivates approximation tradeoffs in ML Generalization"). Nodes are colored by mastery tier. Click any node to inspect the concept and launch a topic-scoped reasoning context.

  5. University Ingestion Pipeline: Upload a PDF transcript, match it against Rutgers or Princeton course catalogs scraped from official sources, and link your enrolled courses to university records. Built for future expansion to degree-progress tracking across 25+ universities.
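The token-by-token streaming in the Semester-Aware Chat can be illustrated with a small sketch of the SSE framing. The function name and payload shape here are ours, not Brain Brew's actual API; in a FastAPI backend, a generator like this would feed a `StreamingResponse` with `media_type="text/event-stream"`.

```python
import json

def sse_frames(tokens):
    # Wrap each model token in a Server-Sent Events frame ("data: ...\n\n").
    # The payload shape and the "[DONE]" sentinel are illustrative choices.
    for token in tokens:
        yield f"data: {json.dumps({'token': token})}\n\n"
    yield "data: [DONE]\n\n"
```

The client's EventSource (or a fetch-based reader) consumes one frame per token, which is what makes the response appear to type itself out.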

How we built it

Stack

AI / Reasoning

- K2 Think V2 (MBZUAI-IFM/K2-Think-v2) as the primary reasoning engine for all multi-step academic reasoning (chat, study generation, concept extraction, graph analysis)
- Google Gemini 2.5 Flash as an automatic fallback — if K2 errors, the system switches without any user-facing disruption
- ElevenLabs Scribe V2 for speech-to-text transcription
- Kokoro (open-weight TTS) for text-to-speech, running locally with the af_heart voice

Backend

- FastAPI (async Python) with Uvicorn
- HTTPX for async K2/Gemini API calls
- pdfplumber for PDF transcript extraction
- 11 specialized prompt builders that assemble dynamic semester context for every K2 call
- Pydantic for schema validation of all model outputs
- Multi-strategy JSON recovery heuristics: control-character sanitization, markdown fence stripping, preferred-start-position detection — because LLMs don't always output clean JSON

Frontend

- React 19 + TypeScript + Vite
- Zustand for state management (courses, assignments, concepts, chat, mastery history, voice state)
- TailwindCSS 4 + shadcn/Radix UI for component primitives
- Framer Motion for animations
- React Force Graph 3D + Three.js for the knowledge graph visualization
- Server-Sent Events for streaming chat responses
- react-dropzone for file upload

iOS App

- SwiftUI with xcodegen project generation
- Custom APIClient.swift over HTTP (works on localhost and LAN for physical device testing)
- Native AVFoundation voice recording pipeline → ElevenLabs transcription

Architecture

The backend is structured in three layers: Routers (API endpoints), a Service Layer (K2/Gemini clients, university scrapers, voice pipeline), and a Prompt Layer (11 specialized context builders). The data layer represents a complete mock semester for a Rutgers MSCS student across 4 real courses, with 12 concepts, 13 explicit graph edges, and 40+ assignments.

The knowledge graph is not just a visualization — the graph edges are explicitly injected into K2's context during chat, so it can reason about cross-course concept dependencies. K2's internal chain-of-thought reasoning (when it leaks into responses) is filtered out before reaching the client via a reasoning-leak detector that checks for metadata patterns.
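As a rough illustration of the edge injection (function name and tuple shape are ours), labeled edges can be serialized into plain sentences the model can reason over and quote back:

```python
def edges_to_context(edges):
    # Render labeled knowledge-graph edges as plain sentences so the model
    # can reason about cross-course concept dependencies from its prompt.
    # Each edge is assumed to be a (source, label, target) triple.
    lines = ["Known concept relationships:"]
    for src, label, dst in edges:
        lines.append(f"- {src} {label} {dst}")
    return "\n".join(lines)
```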

Challenges we ran into

Context engineering at scale. Giving K2 "the full semester" sounds simple, but the semester has overlapping concerns: deadlines, mastery gaps, concept dependencies, syllabi, announcements. We spent significant time on context_builder.py, which assembles a structured knowledge document injected into every prompt without exceeding context limits or diluting the signal.
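A priority-budgeted context builder can be sketched in a few lines. The section names, character budget, and drop-the-tail truncation rule here are our illustrative assumptions, not the actual context_builder.py logic:

```python
def build_context(sections, max_chars=12000):
    # Assemble prompt context from sections ordered highest-priority first,
    # stopping when the character budget is exhausted so the highest-signal
    # material always fits and the tail is dropped, not truncated mid-block.
    out, used = [], 0
    for title, body in sections:
        block = f"## {title}\n{body}\n"
        if used + len(block) > max_chars:
            break
        out.append(block)
        used += len(block)
    return "".join(out)
```

Ordering sections by priority before budgeting is what keeps a long syllabus from crowding out tomorrow's deadlines.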

LLM output robustness. K2 returns rich, multi-step reasoning, but that comes with occasional quirks: markdown fences around JSON, escaped control characters, prose wrapped around structured data. We built a multi-strategy JSON parser with fallback heuristics so generation never fails silently.
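A minimal version of such a parser, using the three heuristics named elsewhere in this writeup (fence stripping, control-character removal, preferred-start-position detection). This is a sketch, not the production code:

```python
import json
import re

def parse_model_json(raw: str):
    # Heuristic 1: strip surrounding markdown code fences.
    text = re.sub(r"^```(?:json)?\s*|\s*```$", "", raw.strip())
    # Heuristic 2: remove stray control characters that break json.loads.
    text = re.sub(r"[\x00-\x08\x0b\x0c\x0e-\x1f]", "", text)
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Heuristic 3: find the first '{' or '[' and parse from there,
        # tolerating prose wrapped around the structured payload.
        starts = [i for i in (text.find("{"), text.find("[")) if i != -1]
        if not starts:
            raise
        obj, _ = json.JSONDecoder().raw_decode(text[min(starts):])
        return obj
```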

Mastery drift. Early versions of mastery tracking had continuous updates that drifted — a 61% quiz score barely moved the needle, but a 59% score wouldn't drop you. We solved this with tier snapping: quiz scores above 60% advance you to the next tier boundary, below 40% drop you one tier, with a "proficiency maintained" message in between. This makes mastery feel meaningful.
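The tier-snapping rule fits in a few lines. Tier names come from the app; the thresholds follow the description above, while the exact boundary handling (strict inequalities) is our assumption:

```python
TIERS = ["Novice", "Familiar", "Proficient", "Expert"]

def snap_tier(current: str, quiz_score: float) -> str:
    # Scores above 60% snap up one tier, below 40% snap down one tier,
    # and anything in between maintains the current tier.
    i = TIERS.index(current)
    if quiz_score > 0.60:
        return TIERS[min(i + 1, len(TIERS) - 1)]
    if quiz_score < 0.40:
        return TIERS[max(i - 1, 0)]
    return current  # "proficiency maintained"
```

Snapping to discrete boundaries is what removes the drift: a 61% and a 59% now produce visibly different outcomes instead of near-identical continuous nudges.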

K2 Think V2 reasoning leaks. K2 sometimes surfaces its own chain-of-thought metadata in responses (references to "system message", "we have"). We added a response sanitizer in the chat router that detects these patterns and returns a safe fallback, keeping the UX clean.
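A sanitizer like this can be as small as a pattern scan. The patterns below are the two leak markers mentioned above; the fallback message is an invented placeholder:

```python
import re

LEAK_PATTERNS = [r"\bsystem message\b", r"\bwe have\b"]
FALLBACK = "I hit a snag reasoning through that one. Could you rephrase?"

def sanitize_response(text: str) -> str:
    # If chain-of-thought metadata leaks into the reply, return a safe
    # fallback instead of exposing the model's internal reasoning.
    lowered = text.lower()
    if any(re.search(p, lowered) for p in LEAK_PATTERNS):
        return FALLBACK
    return text
```

A pattern as broad as "we have" will occasionally false-positive on legitimate replies, so in practice the pattern list has to be tuned against real model output.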

iOS ↔ Backend on physical devices. Simulator testing is one thing; getting the iOS app talking to the FastAPI backend on a real iPhone across a local network required making the backend IP configurable in APIClient.swift and ensuring CORS was permissive enough for LAN origins.
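For reference, the permissive-CORS piece is a standard FastAPI middleware configuration; this is a generic sketch, not the project's exact settings, and a wildcard origin should be tightened outside of LAN testing:

```python
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

app = FastAPI()

# Permissive CORS so an iPhone on the same LAN (arbitrary origin and port)
# can reach the backend during physical-device testing.
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_methods=["*"],
    allow_headers=["*"],
)
```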

Building a full 3D force graph with mastery coloring. Connecting Zustand concept mastery state to React Force Graph 3D's node rendering in real time required carefully managing prop reactivity across Three.js's rendering loop.

Accomplishments that we're proud of

We didn’t just build features — we built a coherent academic reasoning system that actually closes the loop between understanding and action.

- We turned a fragmented semester into a single structured reasoning graph, where assignments, concepts, and mastery are all connected and actively updated.
- We made a large reasoning model (K2 Think V2) operate as a stateful academic decision engine, not a one-shot chatbot.
- We built real dual-model resilience (K2 → Gemini) that fails over silently without breaking the user experience — something most hackathon systems don't even attempt.
- We shipped a full voice-to-reasoning-to-voice pipeline that feels natural enough for real student workflows, not just demos.
- We implemented a live-updating 3D knowledge graph tied to actual mastery state, not a static visualization — every interaction changes the system's internal understanding of the student.
- We connected four platforms (web, iOS, backend, AI pipeline) in one system within a 24-hour window while maintaining a shared state model.
- Most importantly: we made the system proactive instead of reactive — it doesn't wait for questions; it surfaces risk and prioritization automatically.

What we learned

- The hardest problem is not AI — it's structure. Models like K2 are powerful, but they only become useful when the surrounding data is deeply structured. Context engineering mattered more than prompt engineering.
- Stateful AI systems are fundamentally different from chatbots. Once you maintain persistent academic state (mastery, deadlines, graph relationships), the model stops being a responder and becomes a reasoning layer over reality.
- Reliability is an AI feature, not a backend concern. JSON failures, malformed outputs, and partial responses are not edge cases — they are the default in real LLM systems. Robust parsing was as important as model selection.
- Voice changes usage patterns more than expected. Students don't "open apps to study" — they use voice when they are already in motion. That shifted how we designed the entire interface.
- Graph structure unlocks cross-domain reasoning. Once concepts were explicitly connected across courses, we saw emergent reasoning, like identifying shared theoretical dependencies between algorithms and ML, without being explicitly prompted.
- Hackathon systems break at integration boundaries, not features. Most of our debugging time went into iOS ↔ backend networking, streaming sync, and state consistency — not AI or UI.

What's next for Brain Brew

We’re moving from a “semester assistant” to a full academic operating system for students.

- Real Canvas LMS integration: Replace mock data with live assignments, grades, announcements, and submissions from actual university systems.
- Degree-aware planning engine: Convert transcripts and catalogs into a structured degree graph that answers: "What do I need to graduate, and what's the optimal path?"
- Predictive risk modeling: Go beyond current triage to predict: "This assignment will likely slip based on your current trajectory."
- Personalized study scheduling engine: Auto-generate weekly plans based on deadlines, mastery gaps, and cognitive load across courses.
- Multi-university deployment layer: Expand our ingestion pipeline into a scalable system that supports multiple universities with minimal configuration.
- Long-term retention system (SRS integration): Tie the mastery graph to spaced repetition so concepts don't decay after exams.
- Advisor collaboration layer: Enable professors and advisors to view structured student understanding instead of raw grades.
- From assistant to autonomous academic agent: Eventually, Brain Brew won't just answer questions — it will actively manage academic workload, surface risks early, and recommend interventions.
