Inspiration
Most AI tutors start the same way: here's a syllabus, now go learn. The problem is that working engineers learning something new on the side aren't starting from zero — they have partial knowledge that generic tutorials completely ignore. They sit through content they already know and never reach the gaps that actually matter.
We were also inspired by research from Dartmouth showing that short, targeted quizzes can map a learner's conceptual knowledge topology before teaching begins. That idea — interrogate first, then teach — became the entire design premise. Khanmigo and DeepTutor both do Socratic interaction, but neither interrogates the learner's prior knowledge before building a plan. That gap is what LearnFlow AI fills.
What it does
LearnFlow AI is a Socratic learning coach that runs a full learning session in under 20 minutes:
- Diagnostic assessment — before teaching anything, the coach asks 2–5 targeted questions to map what you already know, your learning goal, and how deep you want to go.
- Personalized plan generation — based on your answers, it generates a structured learning plan (phases, lessons, estimated time) shown live in the right panel.
- Socratic quiz loop — the coach quizzes you one concept at a time. Correct answers get a deepening follow-up. Partial answers get clarification then a nudge. Wrong answers get a brief explanation and a retry. If you miss the same concept twice, the coach surfaces it in chat and slows down.
- Live plan tracking — lesson status updates automatically as you progress (⬜ → 🟡 → ✅), with a real-time progress bar. The coach owns all progress updates — you never have to click anything.
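The feedback rules in the quiz loop can be written down as a small decision function. This is only a sketch of the policy described above, not the app's code: the names `Grade`, `CoachMove`, `nextMove`, and `STATUS_ICON` are hypothetical, and it assumes that only fully wrong answers count as a "miss".

```typescript
// Hypothetical types: the write-up describes the behaviour, not these names.
type Grade = "correct" | "partial" | "wrong";
type CoachMove =
  | "deepen"
  | "clarify_and_nudge"
  | "explain_and_retry"
  | "flag_and_slow_down";
type LessonStatus = "todo" | "in_progress" | "done";

// Status icons as shown in the plan panel.
const STATUS_ICON: Record<LessonStatus, string> = {
  todo: "⬜",
  in_progress: "🟡",
  done: "✅",
};

// Decide the coach's next move for one concept.
// Assumption: only "wrong" answers increment the miss counter.
function nextMove(
  grade: Grade,
  priorMisses: number
): { move: CoachMove; misses: number } {
  if (grade === "correct") return { move: "deepen", misses: priorMisses };
  if (grade === "partial") {
    return { move: "clarify_and_nudge", misses: priorMisses };
  }
  const misses = priorMisses + 1;
  // A second miss on the same concept is surfaced and the pace slows.
  const move: CoachMove =
    misses >= 2 ? "flag_and_slow_down" : "explain_and_retry";
  return { move, misses };
}
```

In an app like this one the model itself would likely make these decisions under prompt constraints; the function only makes the branching explicit.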
How we built it
Stack: Next.js 16 (App Router), TypeScript, Tailwind CSS, Groq (llama-3.3-70b-versatile via OpenAI-compatible API), Vercel.
Architecture: The core is a pure coachReducer, a single-source-of-truth state machine with three domains (session, chat, plan) and explicit actions for every transition. This keeps the UI deterministic: no component manages its own state, so every change goes through one reducer and UI updates cannot race each other.
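A minimal sketch of what a reducer with this shape could look like. The three domains and the action names APPEND_STREAM, COMMIT_STREAM, and APPLY_STATE_PATCH come from this write-up; every field name, the other two actions, and the patch shape are assumptions made for illustration.

```typescript
type SubPhase = "diagnostic" | "planning" | "quiz";
type LessonStatus = "todo" | "in_progress" | "done";

interface Lesson { id: string; title: string; status: LessonStatus }
interface Message { role: "user" | "assistant"; content: string }

// Three domains, as described above. Field names are hypothetical.
interface CoachState {
  session: { subPhase: SubPhase };
  chat: { messages: Message[]; streaming: string | null };
  plan: { lessons: Lesson[] };
}

type PlanPatch = { lessonId: string; status: LessonStatus }[];

type CoachAction =
  | { type: "SET_SUB_PHASE"; subPhase: SubPhase } // hypothetical
  | { type: "ADD_USER_MESSAGE"; content: string } // hypothetical
  | { type: "APPEND_STREAM"; delta: string }
  | { type: "COMMIT_STREAM" }
  | { type: "APPLY_STATE_PATCH"; patch: PlanPatch };

const initialState: CoachState = {
  session: { subPhase: "diagnostic" },
  chat: { messages: [], streaming: null },
  plan: { lessons: [] },
};

// Pure function: returns a new state and never mutates its input.
function coachReducer(state: CoachState, action: CoachAction): CoachState {
  switch (action.type) {
    case "SET_SUB_PHASE":
      return { ...state, session: { subPhase: action.subPhase } };
    case "ADD_USER_MESSAGE": {
      const msg: Message = { role: "user", content: action.content };
      const messages = [...state.chat.messages, msg];
      return { ...state, chat: { ...state.chat, messages } };
    }
    case "APPEND_STREAM": {
      const streaming = (state.chat.streaming ?? "") + action.delta;
      return { ...state, chat: { ...state.chat, streaming } };
    }
    case "COMMIT_STREAM": {
      // Move the in-flight text into the message list.
      if (state.chat.streaming === null) return state;
      const msg: Message = { role: "assistant", content: state.chat.streaming };
      return {
        ...state,
        chat: { messages: [...state.chat.messages, msg], streaming: null },
      };
    }
    case "APPLY_STATE_PATCH": {
      const next = new Map<string, LessonStatus>();
      for (const p of action.patch) next.set(p.lessonId, p.status);
      const lessons = state.plan.lessons.map((l) => {
        const status = next.get(l.id);
        return status === undefined ? l : { ...l, status };
      });
      return { ...state, plan: { lessons } };
    }
  }
}
```

Because the reducer is pure, each transition can be unit-tested without rendering a component or calling the model.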
Phase-keyed prompts: The system prompt changes based on the current subPhase (diagnostic → planning → quiz). Each phase gives the model a completely different role and set of constraints. This is what makes the coach feel pedagogically deliberate rather than free-chat.
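One way to express phase-keyed prompts is a lookup table indexed by subPhase. The sketch below is illustrative only: the prompt wording, `PHASE_PROMPTS`, and `buildSystemPrompt` are assumptions, not the project's actual prompts.

```typescript
type SubPhase = "diagnostic" | "planning" | "quiz";

// Placeholder prompt text. Each phase gets a distinct role and constraints.
const PHASE_PROMPTS: Record<SubPhase, string> = {
  diagnostic:
    "You are assessing prior knowledge. Ask 2-5 targeted questions. Do not teach yet.",
  planning:
    "You are a curriculum planner. Return only JSON describing phases and lessons.",
  quiz:
    "You are a Socratic coach. Quiz one concept at a time and report progress via tool calls.",
};

// Build the system prompt for the current phase of the session.
function buildSystemPrompt(subPhase: SubPhase, topic: string): string {
  return `${PHASE_PROMPTS[subPhase]}\n\nTopic: ${topic}`;
}
```

Using `Record<SubPhase, string>` means the compiler rejects a new phase that has no prompt defined.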
Server-side tool buffering: During the quiz, the model emits update_plan tool calls alongside its text response. Rather than applying plan updates mid-stream (which creates race conditions), the server accumulates all tool calls in a buffer, validates them against the current plan state at stream end, and sends a single atomic STATE_PATCH event. The client applies the patch after committing the streamed text — plan and chat stay consistent.
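The buffering step could be sketched as follows. The validation rule used here (the lesson must exist and its status may only move forward) is an assumption, as are the class name `ToolCallBuffer` and the field names; the write-up only says that calls are validated against the current plan state.

```typescript
type LessonStatus = "todo" | "in_progress" | "done";
interface Lesson { id: string; status: LessonStatus }
interface UpdatePlanCall { lessonId: string; status: LessonStatus }
interface StatePatch { type: "STATE_PATCH"; updates: UpdatePlanCall[] }

// Ordering used to reject updates that would move a lesson backwards.
const RANK: Record<LessonStatus, number> = { todo: 0, in_progress: 1, done: 2 };

class ToolCallBuffer {
  private calls: UpdatePlanCall[] = [];

  // Called for every update_plan tool call seen while streaming.
  push(call: UpdatePlanCall): void {
    this.calls.push(call);
  }

  // Called once at stream end: validate, collapse, and emit one patch.
  flush(plan: Lesson[]): StatePatch {
    const current = new Map<string, LessonStatus>();
    for (const lesson of plan) current.set(lesson.id, lesson.status);

    const changed = new Map<string, LessonStatus>();
    for (const call of this.calls) {
      const now = current.get(call.lessonId);
      if (now === undefined) continue; // unknown lesson: drop
      if (RANK[call.status] <= RANK[now]) continue; // no-op or backwards: drop
      current.set(call.lessonId, call.status);
      changed.set(call.lessonId, call.status); // later valid call wins
    }
    this.calls = [];

    const updates = Array.from(changed, ([lessonId, status]) => ({
      lessonId,
      status,
    }));
    return { type: "STATE_PATCH", updates };
  }
}
```

The client then receives at most one update per lesson, so applying the patch is a single state transition.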
Streaming: SSE token-by-token from a Next.js route handler. The client parses the stream line by line via useChatStream, dispatching APPEND_STREAM deltas and a final COMMIT_STREAM + APPLY_STATE_PATCH on the done event.
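A sketch of the line-by-line parsing such a hook might perform. The wire format (`data:` lines carrying JSON with a `type` of `delta` or `done`) and the name `createSseParser` are assumptions; only the dispatched action names are taken from the description above.

```typescript
// Assumed event payloads carried on each "data:" line.
type StreamEvent =
  | { type: "delta"; text: string }
  | { type: "done"; patch: unknown };

type Dispatched =
  | { type: "APPEND_STREAM"; delta: string }
  | { type: "COMMIT_STREAM" }
  | { type: "APPLY_STATE_PATCH"; patch: unknown };

// Returns a feed() function that accepts raw text chunks from the response.
function createSseParser(dispatch: (action: Dispatched) => void) {
  let pending = ""; // holds an incomplete trailing line between chunks

  return function feed(chunk: string): void {
    pending += chunk;
    const lines = pending.split("\n");
    pending = lines.pop() ?? ""; // last piece may be a partial line

    for (const raw of lines) {
      const line = raw.replace(/\r$/, "");
      if (!line.startsWith("data:")) continue; // skip blanks and comments
      const payload = line.slice("data:".length).trim();
      if (payload === "") continue;

      let event: StreamEvent;
      try {
        event = JSON.parse(payload) as StreamEvent;
      } catch {
        continue; // ignore malformed lines rather than abort the stream
      }

      if (event.type === "delta") {
        dispatch({ type: "APPEND_STREAM", delta: event.text });
      } else if (event.type === "done") {
        // Commit the text first, then apply the plan patch.
        dispatch({ type: "COMMIT_STREAM" });
        dispatch({ type: "APPLY_STATE_PATCH", patch: event.patch });
      }
    }
  };
}
```

Keeping the unfinished tail in `pending` matters because a network chunk can end in the middle of a line.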
The entire build followed a spec-driven process (scope doc → PRD → technical spec → build checklist), completed before a single line of code was written.
Challenges we ran into
Plan generation reliability: the planning phase asks the model to return pure JSON with a strict schema. Getting consistent output without structured output support required tight prompt engineering and careful fallback handling when parsing fails.
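Defensive parsing of the model's plan output might look like the sketch below. The schema (`phases`, `lessons`, `minutes`) is a guess based on the plan description earlier in this write-up, and `parsePlan` is a hypothetical helper; returning `null` is one possible fallback signal that lets the caller retry.

```typescript
// Hypothetical schema: phases containing lessons with an estimated time.
interface Lesson { title: string; minutes: number }
interface Phase { title: string; lessons: Lesson[] }
interface Plan { phases: Phase[] }

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null;
}

// Runtime check that parsed JSON matches the expected schema.
function isPlan(value: unknown): value is Plan {
  if (!isRecord(value) || !Array.isArray(value.phases)) return false;
  if (value.phases.length === 0) return false;
  return value.phases.every(
    (p: unknown) =>
      isRecord(p) &&
      typeof p.title === "string" &&
      Array.isArray(p.lessons) &&
      p.lessons.every(
        (l: unknown) =>
          isRecord(l) &&
          typeof l.title === "string" &&
          typeof l.minutes === "number"
      )
  );
}

// Returns a validated plan, or null so the caller can retry or fall back.
function parsePlan(raw: string): Plan | null {
  // Models often wrap JSON in prose or code fences; keep the outermost braces.
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  if (start === -1 || end <= start) return null;
  try {
    const parsed: unknown = JSON.parse(raw.slice(start, end + 1));
    return isPlan(parsed) ? parsed : null;
  } catch {
    return null;
  }
}
```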
Phase transition detection — moving from diagnostic to planning happens when the model signals {"action":"generate_plan","ready":true} inline in its text response. Detecting and stripping that marker from the streamed text without showing it to the user required careful SSE parsing logic.
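A streaming filter for this kind of marker could be sketched as below. It assumes the model emits the marker byte-for-byte as shown, with no whitespace variation, and `createMarkerFilter` is a hypothetical name. The idea is to hold back any trailing text that could still turn out to be the start of the marker.

```typescript
const MARKER = '{"action":"generate_plan","ready":true}';

function createMarkerFilter() {
  let held = ""; // text not yet released to the UI
  let detected = false;

  // Feed one streamed delta; returns the text that is safe to display.
  function push(delta: string): string {
    held += delta;
    let out = "";

    // Remove every complete marker found so far.
    let idx = held.indexOf(MARKER);
    while (idx !== -1) {
      detected = true;
      out += held.slice(0, idx);
      held = held.slice(idx + MARKER.length);
      idx = held.indexOf(MARKER);
    }

    // Keep back the longest suffix that is a proper prefix of the marker.
    let keep = 0;
    const max = Math.min(held.length, MARKER.length - 1);
    for (let n = max; n > 0; n--) {
      if (MARKER.startsWith(held.slice(held.length - n))) {
        keep = n;
        break;
      }
    }
    out += held.slice(0, held.length - keep);
    held = held.slice(held.length - keep);
    return out;
  }

  // Call at stream end to release text that never completed a marker.
  function end(): string {
    const rest = held;
    held = "";
    return rest;
  }

  return { push, end, wasDetected: () => detected };
}
```

Ordinary text containing braces passes through unchanged; only a suffix that matches the beginning of the marker is delayed until the next delta resolves it.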
Atomic plan updates — our first instinct was to apply tool call events as they arrived mid-stream. That created visual flickers and state inconsistencies. The server-side buffer + atomic patch approach solved it, but it took a few iterations to get right.
Layout bugs — a flex-col on body in the root layout was fighting the h-screen flex on main, causing the two-panel layout to stack vertically instead of side by side. Invisible input text in dark mode was another early surprise — CSS variables resolving to near-white required explicit text-gray-900 bg-white.
Accomplishments that we're proud of
The tool-buffered STATE_PATCH pattern — applying plan updates atomically after stream commit was a non-obvious architectural decision that made the UI rock-solid. It's the kind of thing that feels obvious in hindsight but requires deliberate design upfront.
The full interrogation-first arc works end-to-end — entering a topic, getting diagnosed, receiving a personalized plan, and getting Socratically quizzed with live plan updates all flow together seamlessly in a single session.
Spec-driven development as a repeatable process — we went from idea to working app in one session, with no architectural revisions during the build. Every piece fit exactly as designed because the spec was detailed enough to eliminate ambiguity before coding started. That's the meta-accomplishment.
What we learned
Structured prompting produces qualitatively different agents than free chat. Phase-keyed system prompts — where the model's entire role, constraints, and output format change based on session state — give you control over the teaching arc that a single system prompt never can.
State machines and LLMs pair well. The coachReducer keeps the AI's non-determinism quarantined to the network layer. Everything the UI renders comes from a pure state machine — the model only influences state through validated actions, not directly.
Tool calls as structured side effects. Using update_plan tool calls for plan progress (rather than trying to parse intent from text) made the plan-tracking feature reliable in a way that text-based parsing never would have been.
Spec quality determines build quality. The three architectural decisions that most improved the final product (reducer domain split, tool buffering, phase-keyed prompts) were all made during the spec phase, not the build phase.
What's next for LearnFlow AI
- Cross-session persistence — save progress so learners can pick up where they left off across multiple 20-minute sessions.
- Weakness detection dashboard — a structured view of concepts a learner has consistently missed, built from session history across multiple topics.
- Daily guidance — "Today's goal: 2 lessons" driven by session history, learning velocity, and spaced repetition intervals.
- Lesson click-through — clicking a lesson in the right panel surfaces a structured brief (explanation + key takeaways) without leaving the session flow.
- Multi-topic paths — chaining related topics automatically ("You finished Kubernetes basics — next up: Helm") to support longer learning arcs.
Built With
- anthropic-claude
- next.js
- tailwind-css
- typescript
- vercel