Pachu

Inspiration

We've all experienced the same cycle: take detailed notes during a lecture, feel confident in the moment, and then realize days later that almost nothing stuck. The bottleneck isn't motivation — it's the friction between raw notes and actually studying them. Tools like Anki require you to manually create every card. AI flashcard generators often hallucinate definitions that were never in your notes. And none of them adapt to what you already know.

We wanted to build something that removes that friction entirely: paste your notes, and get a personalized study session powered by the science of spaced repetition — without trusting the AI blindly.


What it does

Pachu is a mobile-first study app that turns your own notes into interactive puzzles using a local LLM. Here's the flow:

  1. Paste or upload your notes. Pachu ingests them and runs an LLM pipeline to extract the key terms — with a built-in anti-hallucination verifier that rejects any term not literally found in your text.
  2. Choose your puzzle. Three formats are available: Flashcards, Cloze (fill-in-the-blank), and a Crossword — all generated from your extracted terms.
  3. Study and get scheduled. Every answer you give is rated (Again / Hard / Good / Easy) and fed into the FSRS spaced repetition algorithm, which schedules each term individually so you review it right before you'd forget it.
  4. Difficulty adapts. Terms you know well get freshly generated cloze sentences instead of the original wording, forcing deeper recall. Brand-new terms always show the exact sentence from your notes.
  5. Stuck? Ask the coach. A real-time AI coach (over WebSocket) offers three tiers of hints: a structural hint (letter count, first/last letter — no LLM, can't leak the answer), a contextual nudge written in the tone of your own notes, and a definition reveal as a last resort.
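
The first coach tier in step 5 needs no model call, so it can be a pure function of the answer. Here is a minimal sketch of that idea in TypeScript; the `StructuralHint` shape, the `structuralHint` name, and the masked `pattern` field are illustrative assumptions, not Pachu's actual code.

```typescript
// Hypothetical shape for a tier-one hint; field names are assumptions.
interface StructuralHint {
  letterCount: number;
  firstLetter: string;
  lastLetter: string;
  pattern: string;
}

// Matches a single Unicode letter or digit.
const LETTER_OR_DIGIT = /[\p{L}\p{N}]/u;

function structuralHint(answer: string): StructuralHint {
  // Array.from iterates by code point, so a precomposed accented letter counts once.
  const chars = Array.from(answer.trim());
  const letters = chars.filter((ch) => LETTER_OR_DIGIT.test(ch));
  if (letters.length === 0) {
    throw new Error("answer must contain at least one letter or digit");
  }
  // Mask letters and digits but keep spaces and hyphens, so multi-word answers keep their shape.
  const pattern = chars.map((ch) => (LETTER_OR_DIGIT.test(ch) ? "_" : ch)).join("");
  return {
    letterCount: letters.length,
    firstLetter: (letters[0] ?? "").toUpperCase(),
    lastLetter: (letters[letters.length - 1] ?? "").toUpperCase(),
    pattern,
  };
}
```

Because the hint is computed from the stored answer alone, it reveals only what the function returns: the length, the two boundary letters, and the word shape.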

How we built it

  • Backend: Bun + TypeScript, with a SQLite store via better-sqlite3. The LLM adapter connects to a locally running Ollama instance (tested on gemma4:26b), keeping all data on-device with no API costs.
  • LLM pipeline: A multi-stage pipeline handles notes ingestion (chunked at paragraph boundaries), term extraction, span verification, cloze generation, crossword clue styling, and coach hint generation — each with its own prompt and verifier.
  • Anti-hallucination system: Two tiers of verification run on every LLM output. Tier 1 checks that every extracted term and its source sentence are literal substrings of the input notes. Tier 2 checks that generated cloze sentences contain no proper nouns, numbers, or dates not found in the source chunk.
  • Spaced repetition: The ts-fsrs library powers the scheduling. Each term's FSRS card is serialized to SQLite. The stability router uses the card's stability value to decide whether the next cloze sentence should be anchored (verbatim from notes) or generated (LLM-fresh).
  • App: React Native + Expo, with a custom dithered visual style. Real-time progress during term extraction streams over WebSocket, so users see exactly what the LLM is doing. The coach also runs over a persistent WebSocket connection.
  • Monorepo: Bun workspaces with three packages — shared (types + constants), backend, and app.
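
The two verification tiers described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's code: `ExtractedTerm`, `verifySpan`, and `ungroundedTokens` are hypothetical names, and the Tier 2 check approximates "proper nouns, numbers, or dates" with a crude heuristic (any token containing a digit, or any capitalized token that is not the first word of the sentence).

```typescript
// Hypothetical shape of one LLM extraction result.
interface ExtractedTerm {
  term: string;
  sourceSentence: string;
}

// Tier 1: both the term and its source sentence must appear verbatim in the notes.
// A single added comma or dropped word makes includes() return false.
function verifySpan(notes: string, candidate: ExtractedTerm): boolean {
  if (candidate.term.length === 0 || candidate.sourceSentence.length === 0) {
    return false;
  }
  return notes.includes(candidate.term) && notes.includes(candidate.sourceSentence);
}

// Tier 2 (heuristic): list number-like or capitalized tokens in a generated
// sentence that never occur in the source chunk. An empty result means "grounded".
function ungroundedTokens(generated: string, sourceChunk: string): string[] {
  const source = sourceChunk.toLowerCase();
  const tokens: string[] = generated.match(/[\p{L}\p{N}][\p{L}\p{N}'-]*/gu) ?? [];
  return tokens.filter((token, index) => {
    const hasDigit = /\p{N}/u.test(token);
    // Assumes one sentence per call, so only the first token is capitalized by position.
    const looksProper = index > 0 && /^\p{Lu}/u.test(token);
    // Case-insensitive substring lookup: simple, but it can miss some invented tokens.
    return (hasDigit || looksProper) && !source.includes(token.toLowerCase());
  });
}
```

A caller would drop any extraction for which `verifySpan` returns false, and treat a non-empty `ungroundedTokens` result as a signal to fall back to the sentence taken verbatim from the notes.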

Challenges we ran into

  • Hallucination is subtle. Getting the LLM to return verbatim source spans without silently paraphrasing them took significant prompt iteration. The verifier rejects a span with even a single added comma or a missing word — which is the right behavior, but required the prompt to be very explicit with examples of both correct and incorrect outputs.
  • Crossword layout on mobile. Fitting a crossword grid that's both solvable and readable on a phone screen required tuning answer-length bounds and targeting a total-letter budget across all entries, not just a count.
  • Cloze sentence quality. LLM-generated fill-in-the-blank sentences at high temperature introduced invented facts (dates, names) not in the source. The grounding verifier was built specifically to catch this and silently fall back to the anchored sentence.
  • FSRS serialization. The ts-fsrs library's Card type includes Date fields. Round-tripping those through SQLite required a careful serialize/revive step, because a date that comes back as a plain string would corrupt the scheduler state.
  • Register mimicry. Making hints and generated sentences feel like they came from the learner's own notes — not generic textbook prose — required threading a styleAnchor (the original sentence containing the term) through every LLM prompt that writes user-facing text.
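
The serialize/revive step from the FSRS bullet can be illustrated with plain JSON. The sketch below uses a local `CardLike` interface as a stand-in so it runs without dependencies; its field list is an assumption modeled on a scheduler card with two Date fields, and `serializeCard` and `reviveCard` are hypothetical helper names.

```typescript
// Local stand-in for a scheduler card; not imported from ts-fsrs.
interface CardLike {
  due: Date;
  last_review?: Date;
  stability: number;
  difficulty: number;
  reps: number;
  lapses: number;
}

// Fields that must come back as Date objects rather than strings.
const DATE_FIELDS = ["due", "last_review"] as const;

function serializeCard(card: CardLike): string {
  // JSON.stringify calls Date.prototype.toJSON, which emits ISO 8601 strings.
  return JSON.stringify(card);
}

function reviveCard(json: string): CardLike {
  const raw = JSON.parse(json) as Record<string, unknown>;
  for (const field of DATE_FIELDS) {
    const value = raw[field];
    if (value === undefined || value === null) continue; // optional field absent
    const date = new Date(String(value));
    if (Number.isNaN(date.getTime())) {
      // Fail loudly instead of handing the scheduler an Invalid Date.
      throw new Error(`invalid date in field "${field}"`);
    }
    raw[field] = date;
  }
  return raw as unknown as CardLike;
}
```

The serialized string fits in a single TEXT column, and reviving it on read restores real Date objects before the card is handed back to the scheduler.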

Accomplishments that we're proud of

  • A two-tier anti-hallucination pipeline that catches hallucinated terms and invented facts and drops them before they ever reach the user.
  • A spaced repetition system that adapts puzzle difficulty to each term's FSRS stability rather than sorting terms into fixed "hard/easy" buckets.
  • A coach that gives contextually relevant hints without ever leaking the answer, written in the same voice as the learner's notes.
  • A fully local, offline-capable study tool — no cloud, no subscriptions, no data leaving the device.
  • Real-time streaming progress for the LLM extraction step, so the wait feels transparent rather than opaque.

What we learned

  • Prompt engineering and code-level verification are complementary, not substitutes. A clearer prompt reduces rejections; the verifier is the actual safety gate. Relying on either alone produces worse results than both together.
  • FSRS is far more principled than simple interval doubling, because it tracks an explicit stability value for each term. Exposing that stability metric as a first-class value (the stability router) unlocked a natural difficulty progression that a simpler scheduler could not have supported.
  • Style anchoring is underrated. Feeding the original sentence back into every LLM prompt that generates user-facing content produces outputs that feel personal and natural — a detail that meaningfully improves the study experience.
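
The stability router mentioned above reduces to a small decision function. The sketch below is one way it could look: the 7-day threshold, the `routeCloze` and `chooseSentence` names, and the null convention for a rejected generation are all assumptions made for illustration, since the writeup does not state the actual cutoff.

```typescript
type ClozeMode = "anchored" | "generated";

// Hypothetical cutoff: FSRS stability is expressed in days, and 7 is an arbitrary example value.
const STABILITY_THRESHOLD_DAYS = 7;

// Decide whether a term gets its verbatim sentence or an LLM-fresh one.
function routeCloze(stability: number | undefined): ClozeMode {
  // Brand-new terms (no stability yet) always get the exact sentence from the notes.
  if (stability === undefined || !Number.isFinite(stability)) {
    return "anchored";
  }
  return stability >= STABILITY_THRESHOLD_DAYS ? "generated" : "anchored";
}

// Pick the sentence to show. A null generatedSentence means generation failed
// or the grounding check rejected it, so the anchored sentence is used instead.
function chooseSentence(
  stability: number | undefined,
  anchoredSentence: string,
  generatedSentence: string | null,
): string {
  if (routeCloze(stability) === "generated" && generatedSentence !== null) {
    return generatedSentence;
  }
  return anchoredSentence;
}
```

Keeping the fallback inside `chooseSentence` means a rejected generation never blocks a review: the learner simply sees the original wording again.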

What's next for Pachu

  • Multi-format imports: PDF and image (OCR) support, so students can upload photos of handwritten notes or slide decks directly.
  • Cloud sync (optional, end-to-end encrypted): Let users sync their spaces and review history across devices without giving up the privacy guarantee.
  • Collaborative spaces: Share a space with a study group so everyone reviews from the same note set, with individual FSRS schedules.
  • Streak and progress dashboard: A full history view with retention curves per term, so learners can see which topics they've mastered and which are still fragile.
  • More puzzle types: Matching exercises, multiple-choice, and short-answer with LLM-graded responses.
  • Smarter term picker: Incorporate forgetting-curve predictions to surface terms at the exact moment of maximum learning gain, not just "due now."