OpenTutor: AI-Powered Personalized Learning Platform

Inspiration

Learners today face persistent challenges: lack of personalization, scattered resources, passive content consumption, inefficient study approaches, unclear learning paths, and difficulty integrating their own materials. OpenTutor addresses these problems through adaptive AI, multi-agent systems, document intelligence, and retrieval-augmented learning.


What It Does

OpenTutor builds personalized learning paths, generates structured educational content, and provides interactive study tools through coordinated AI agents operating on a real-time backend.

Intelligent Roadmap Generation

  • Understands goals from natural language
  • Asks clarification questions to refine objectives
  • Supports file uploads + OCR to incorporate external context
  • Generates hierarchical learning roadmaps
  • Visualizes paths using React Flow
  • Adapts dynamically based on user progress

Dynamic Content Generation

  • Creates slide-based lessons: theory, examples, exercises, summaries
  • Outputs in Markdown + LaTeX
  • RAG-enhanced using user-uploaded documents + web research
  • Adjusts content complexity based on learner level

Interactive Study Tools

  • Flashcards – automatically generated, LaTeX-compatible, animated
  • Quizzes – MCQs, T/F, short answers, explanations, analytics
  • Notes – concise topic summaries

Deep Research Agent

  • Uses Perplexity and Tavily for multi-source research
  • Aggregates and verifies sources
  • Supports two deep research modes: 20–30 sources and 70–80 sources
  • Produces structured research reports with citations
  • Note: During development, MCP servers inside Kiro IDE (Firecrawl, Fetch, Context7, Browser MCP, etc.) were used only to fetch documentation and contextual information, not as part of the production agent pipeline.
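
The aggregation step can be pictured roughly as follows. This is a minimal sketch, not the project's actual code: the `SourceHit` shape, the mode names `standard` and `extended`, and the URL normalization rules are all assumptions made for illustration, and the caps simply reuse the upper bounds of the two modes listed above.

```typescript
// Hypothetical shape for a search hit; real provider responses differ.
interface SourceHit {
  url: string;
  title: string;
  provider: "perplexity" | "tavily";
}

// Assumed mode names; the caps are the upper bounds of the two modes.
const MODE_LIMITS = { standard: 30, extended: 80 } as const;
type ResearchMode = keyof typeof MODE_LIMITS;

// Normalize a URL so trivially different links compare equal.
// Lowercasing the whole URL is lossy for case-sensitive paths;
// it is acceptable for a sketch but may merge distinct pages.
function normalizeUrl(raw: string): string {
  return raw
    .trim()
    .toLowerCase()
    .replace(/^https?:\/\//, "")
    .replace(/^www\./, "")
    .replace(/#.*$/, "")
    .replace(/\/+$/, "");
}

// Merge hits from several providers, drop duplicates, and cap by mode.
function aggregateSources(hits: SourceHit[], mode: ResearchMode): SourceHit[] {
  const seen = new Set<string>();
  const unique: SourceHit[] = [];
  for (const hit of hits) {
    const key = normalizeUrl(hit.url);
    if (seen.has(key)) continue; // first occurrence wins
    seen.add(key);
    unique.push(hit);
  }
  return unique.slice(0, MODE_LIMITS[mode]);
}
```

Deduplicating before capping keeps the source budget from being spent on the same page returned by both providers.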

Document Management & RAG

  • PDF, image OCR, text extraction
  • Semantic chunking with overlap
  • Gemini embeddings (768D)
  • Fast Convex vector search
  • Metadata-aware retrieval for higher relevance
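
One way to picture chunking with overlap is the sketch below. It is an illustration under stated assumptions, not the project's chunker: it treats sentence boundaries as a cheap stand-in for semantic boundaries, and the `Chunk` type, the `maxChars` soft limit, and the `overlapSentences` parameter are hypothetical names.

```typescript
interface Chunk {
  index: number;
  text: string;
}

// Greedily pack sentences into chunks of roughly maxChars characters,
// repeating the last few sentences of each chunk at the start of the next.
function chunkWithOverlap(text: string, maxChars = 500, overlapSentences = 1): Chunk[] {
  // Naive sentence split: runs of text ending in ., ! or ?
  const sentences = (text.match(/[^.!?]+[.!?]*/g) ?? [])
    .map((s) => s.trim())
    .filter((s) => s.length > 0);

  const chunks: Chunk[] = [];
  let current: string[] = [];

  for (const sentence of sentences) {
    const candidate = [...current, sentence].join(" ");
    if (current.length > 0 && candidate.length > maxChars) {
      chunks.push({ index: chunks.length, text: current.join(" ") });
      // Carry the tail of the finished chunk forward as overlap.
      current = overlapSentences > 0 ? current.slice(-overlapSentences) : [];
    }
    current.push(sentence);
  }
  if (current.length > 0) {
    chunks.push({ index: chunks.length, text: current.join(" ") });
  }
  return chunks;
}
```

The overlap means a fact that straddles a chunk boundary still appears whole in at least one chunk, at the cost of some duplicated text in the index. `maxChars` is a soft limit here: a chunk can exceed it when the carried-over sentences plus the next sentence are long.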

Workspace System

  • Multiple workspaces
  • Real-time synchronization
  • Role-based collaboration
  • Organizes roadmaps, content, quizzes, notes, documents

MiniDrona Assistant

Provides contextual guidance, concept explanations, navigation help, and learning support throughout the platform.

Modern UI

  • Dark mode
  • Responsive layout
  • React Flow visualizations
  • Smooth animations via Framer Motion
  • Accessible, study-friendly interface
  • UI elements designed using Spec Mode in Kiro IDE

Technical Architecture

Frontend

  • Next.js 16
  • React 19
  • Tailwind CSS
  • Shadcn UI
  • Framer Motion
  • React Flow
  • KaTeX for mathematical rendering
  • UI prototyping accelerated with Kiro IDE’s Spec Mode

Backend

  • Convex real-time database
  • Vector index for semantic retrieval
  • Live subscriptions
  • Long-running actions for agent workflows

AI & Agents

  • LangGraph multi-agent workflow orchestration
  • LangChain tools for structured LLM interaction
  • Gemini + GPT models for content generation
  • OCR engines for documents
  • Perplexity + Tavily for research
  • MCP servers were used only during development inside Kiro IDE for documentation lookup, context retrieval, and faster iteration, not in production.

How We Built It

Development Environment

OpenTutor was built end-to-end inside Kiro IDE, using:

  • Spec Mode to prototype UI and UX components rapidly
  • MCP servers (Context7, Firecrawl, Fetch, Browser MCP, etc.) strictly for development-time tasks, such as:
    • Fetching documentation
    • Looking up API references
    • Doing architectural research
    • Accelerating coding and debugging
  • Local + cloud orchestration for agents during development

These MCP tools are NOT part of the production runtime.

Architecture Foundation

Designed for multi-agent orchestration, real-time sync, semantic retrieval, and scalable content generation.

Backend Development

  • Comprehensive schema for documents, chunks, embeddings, roadmaps, slides, quizzes, notes
  • Vector indexing with metadata filters
  • Real-time subscriptions via Convex

Multi-Agent System

  • Roadmap Agent: OCR → Clarification → Research → Roadmap generation
  • Content Agent: RAG → Research → Slide planning → Slide generation → Quiz creation
  • Research Agent: Queries → Verification → Citations → Report
  • All agents follow the same LangGraph workflow pattern: each step reads from and writes to a shared state
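
The Roadmap Agent's step order can be sketched as a small state pipeline. The code below is a plain TypeScript stand-in written for illustration; it deliberately does not use the LangGraph API, and the `RoadmapState` fields, the `Step` type, and the placeholder step bodies are assumptions. Real steps would be asynchronous calls to OCR, LLM, and search services.

```typescript
// Shared state passed from step to step; field names are hypothetical.
interface RoadmapState {
  goal: string;
  ocrText?: string;
  clarifications?: string[];
  research?: string[];
  roadmap?: string[];
}

interface Step {
  name: string;
  // Field that must be present once the step has finished.
  produces: keyof RoadmapState;
  run: (state: RoadmapState) => RoadmapState;
}

// Placeholder implementations in the order described above.
const roadmapSteps: Step[] = [
  { name: "ocr", produces: "ocrText", run: (s) => ({ ...s, ocrText: "(extracted text)" }) },
  { name: "clarification", produces: "clarifications", run: (s) => ({ ...s, clarifications: ["target level?"] }) },
  { name: "research", produces: "research", run: (s) => ({ ...s, research: ["(source summary)"] }) },
  { name: "roadmap", produces: "roadmap", run: (s) => ({ ...s, roadmap: ["Basics of " + s.goal, "Practice"] }) },
];

// Run steps in order and validate the state after each one,
// so a step that silently produced nothing fails fast.
function runPipeline(steps: Step[], initial: RoadmapState): RoadmapState {
  let state = initial;
  for (const step of steps) {
    state = step.run(state);
    if (state[step.produces] === undefined) {
      throw new Error(`Step "${step.name}" did not produce "${step.produces}"`);
    }
  }
  return state;
}
```

Checking the state after every step localizes failures to the step that caused them instead of surfacing them later as a malformed roadmap.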

RAG Pipeline

  • Semantic chunking
  • Page-aware metadata
  • Similarity search and fallback logic
  • Batch embeddings for performance
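
As an illustration of similarity search with a fallback, the sketch below ranks stored chunks by cosine similarity in memory. It is not the Convex vector search API; the `StoredChunk` shape, the `minScore` threshold, and the particular fallback rule (return the best-ranked chunks anyway and flag the result) are assumptions, since the fallback behaviour is not specified above.

```typescript
interface StoredChunk {
  id: string;
  page: number; // page-aware metadata carried alongside the vector
  embedding: number[];
}

interface RetrievalResult {
  hits: { chunk: StoredChunk; score: number }[];
  usedFallback: boolean;
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return up to k chunks scoring at least minScore. If none qualify,
// fall back to the top-k chunks regardless of score and say so.
function retrieve(query: number[], chunks: StoredChunk[], k = 3, minScore = 0.5): RetrievalResult {
  const ranked = chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((a, b) => b.score - a.score);

  const confident = ranked.filter((r) => r.score >= minScore).slice(0, k);
  if (confident.length > 0) return { hits: confident, usedFallback: false };
  return { hits: ranked.slice(0, k), usedFallback: true };
}
```

Exposing `usedFallback` lets the caller treat low-confidence context differently, for example by leaning more on web research.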

Frontend Development

  • Modular UI components
  • Interactive roadmap viewer
  • Flashcard, quiz, and notes interfaces
  • Real-time updates with Convex hooks
  • Spec Mode accelerated design iteration

Testing & Integration

  • Progress indicators
  • Error boundaries
  • Retry mechanisms
  • UI rendering and performance tuning

Deployment

  • Vercel for frontend hosting
  • Convex backend
  • Production environment setup and monitoring

Challenges We Overcame

Content Truncation

Solved via retry logic, validation checks, and improved prompting.
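
A minimal sketch of the retry-plus-validation idea, under stated assumptions: `generateWithRetry` and `looksComplete` are hypothetical helpers, and the completeness heuristic (balanced `$$` math delimiters and a terminal punctuation mark) is only one plausible check, not the project's actual validation.

```typescript
// Heuristic check for truncated output: display-math delimiters must be
// balanced and the text must end on a closing character.
function looksComplete(markdown: string): boolean {
  const trimmed = markdown.trim();
  if (trimmed.length === 0) return false;
  const mathDelimiters = (trimmed.match(/\$\$/g) ?? []).length;
  return mathDelimiters % 2 === 0 && /[.!?)\]$]$/.test(trimmed);
}

// Call the generator until the validator accepts its output.
// The attempt number is passed in so the caller can adjust the prompt.
async function generateWithRetry(
  generate: (attempt: number) => Promise<string>,
  isValid: (output: string) => boolean,
  maxAttempts = 3,
): Promise<string> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const output = await generate(attempt);
    if (isValid(output)) return output;
  }
  throw new Error(`No valid output after ${maxAttempts} attempts`);
}
```

Passing the attempt number to the generator leaves room for the "improved prompting" part, for example asking for a shorter slide on later attempts.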

Agent Coordination

Addressed with LangGraph’s state-machine workflows and strict state validation.

RAG Quality

Improved chunking logic, metadata preservation, and filtering rules.

Real-time Sync

Added optimistic updates, loading states, and progress mapping.
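
Progress mapping can be as simple as assigning each pipeline stage a share of the progress bar. The sketch below assumes the Content Agent stages listed earlier; the stage identifiers and the weights are invented for illustration.

```typescript
// Content Agent stages in execution order (identifiers are hypothetical).
const CONTENT_STAGES = [
  "rag",
  "research",
  "slide_planning",
  "slide_generation",
  "quiz_creation",
] as const;
type ContentStage = (typeof CONTENT_STAGES)[number];

// Assumed relative cost of each stage; the weights sum to 100.
const STAGE_WEIGHTS: Record<ContentStage, number> = {
  rag: 10,
  research: 25,
  slide_planning: 10,
  slide_generation: 40,
  quiz_creation: 15,
};

// Convert "current stage + fraction of that stage done" into a percentage.
function progressPercent(stage: ContentStage, fraction = 0): number {
  const clamped = Math.min(1, Math.max(0, fraction));
  let completed = 0;
  for (const s of CONTENT_STAGES) {
    if (s === stage) break;
    completed += STAGE_WEIGHTS[s];
  }
  return Math.round(completed + STAGE_WEIGHTS[stage] * clamped);
}
```

For example, `progressPercent("slide_generation", 0.5)` returns 65 with these weights: 45 for the three finished stages plus half of the 40 assigned to slide generation.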

Document Processing

Addressed with batch embeddings, asynchronous processing, and clearer progress feedback.
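
A rough sketch of batched embedding with progress feedback follows. The embedding call itself is injected as `embedBatch`, a hypothetical function standing in for whatever client the backend uses; the batch size of 16 is likewise an arbitrary assumption.

```typescript
// Split a list into consecutive batches of at most batchSize items.
function toBatches<T>(items: T[], batchSize: number): T[][] {
  if (batchSize < 1) throw new Error("batchSize must be at least 1");
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}

// Hypothetical signature for a function that embeds several texts at once.
type EmbedBatch = (texts: string[]) => Promise<number[][]>;

// Embed all texts batch by batch, reporting progress after each batch.
async function embedAll(
  texts: string[],
  embedBatch: EmbedBatch,
  batchSize = 16,
  onProgress?: (done: number, total: number) => void,
): Promise<number[][]> {
  const vectors: number[][] = [];
  for (const batch of toBatches(texts, batchSize)) {
    vectors.push(...(await embedBatch(batch)));
    onProgress?.(vectors.length, texts.length);
  }
  return vectors;
}
```

The `onProgress` callback is where a progress indicator would be updated while a large document is processed.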

JSON Failures

Built a JSON-repair + validation pipeline to handle malformed outputs.
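
To give a sense of what such a pipeline involves, here is a minimal sketch. It assumes two common failure modes (prose wrapped around the JSON and trailing commas) and a hypothetical `QuizQuestion` shape; the actual repair rules and schemas are not described above.

```typescript
// Hypothetical target shape for one quiz question.
interface QuizQuestion {
  question: string;
  options: string[];
  answerIndex: number;
}

// Repair step: keep only the outermost JSON span and drop trailing commas.
// The comma rule is naive and would also touch ",}" inside string values.
function repairJson(raw: string): string {
  const start = raw.search(/[{[]/);
  const end = Math.max(raw.lastIndexOf("}"), raw.lastIndexOf("]"));
  if (start === -1 || end <= start) throw new Error("No JSON object or array found");
  return raw.slice(start, end + 1).replace(/,\s*([}\]])/g, "$1");
}

// Validation step: check the parsed value against the expected shape.
function isQuizQuestion(value: unknown): value is QuizQuestion {
  if (typeof value !== "object" || value === null) return false;
  const { question, options, answerIndex } = value as {
    question?: unknown;
    options?: unknown;
    answerIndex?: unknown;
  };
  return (
    typeof question === "string" &&
    Array.isArray(options) &&
    options.every((o) => typeof o === "string") &&
    typeof answerIndex === "number" &&
    Number.isInteger(answerIndex) &&
    answerIndex >= 0 &&
    answerIndex < options.length
  );
}

function parseQuizQuestion(raw: string): QuizQuestion {
  const parsed: unknown = JSON.parse(repairJson(raw));
  if (!isQuizQuestion(parsed)) throw new Error("Output does not match the quiz schema");
  return parsed;
}
```

Keeping repair and validation separate means a response that parses but has the wrong shape is still rejected and can be regenerated.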

Multi-API Search Complexity

Unified Perplexity + Tavily workflows; MCP tools used only for dev-time research.

Roadmap Visualization Scaling

Used hierarchical layouts and virtualization in React Flow.
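
A hierarchical layout for a roadmap tree can be computed with a small recursive pass, sketched below. This covers only the layout half (not virtualization) and is an assumption about how positions might be derived: leaves take consecutive columns and each parent is centered over its children. The output mirrors the `id` / `position` / `data` fields of a React Flow node, while `RoadmapTopic` and the gap sizes are hypothetical.

```typescript
interface RoadmapTopic {
  id: string;
  label: string;
  children?: RoadmapTopic[];
}

interface PositionedNode {
  id: string;
  position: { x: number; y: number };
  data: { label: string };
}

// Depth decides y; leaves get consecutive x columns; parents sit
// halfway between their first and last child.
function layoutTree(root: RoadmapTopic, xGap = 200, yGap = 120): PositionedNode[] {
  const nodes: PositionedNode[] = [];
  let nextLeafColumn = 0;

  function place(topic: RoadmapTopic, depth: number): number {
    const children = topic.children ?? [];
    let x: number;
    if (children.length === 0) {
      x = nextLeafColumn * xGap;
      nextLeafColumn += 1;
    } else {
      const xs = children.map((child) => place(child, depth + 1));
      x = (xs[0] + xs[xs.length - 1]) / 2;
    }
    nodes.push({ id: topic.id, position: { x, y: depth * yGap }, data: { label: topic.label } });
    return x;
  }

  place(root, 0);
  return nodes;
}
```

Because sibling subtrees never share leaf columns, nodes on the same level cannot overlap, which keeps large roadmaps readable.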

LaTeX Rendering Issues

Integrated remark-math and rehype-katex for consistent rendering.

Permissions & Security

Implemented role-based workspace access control.
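
Role-based workspace access usually reduces to a lookup from role to allowed actions. The sketch below is illustrative only: the role names, the action names, and the `Membership` shape are assumptions, since the actual roles are not listed here.

```typescript
// Hypothetical roles and actions.
type Role = "owner" | "editor" | "viewer";
type Action = "read" | "edit" | "invite" | "delete_workspace";

const PERMISSIONS: Record<Role, readonly Action[]> = {
  owner: ["read", "edit", "invite", "delete_workspace"],
  editor: ["read", "edit"],
  viewer: ["read"],
};

interface Membership {
  userId: string;
  workspaceId: string;
  role: Role;
}

// A user may act only if they are a member of the workspace
// and their role grants the requested action.
function can(
  memberships: Membership[],
  userId: string,
  workspaceId: string,
  action: Action,
): boolean {
  const membership = memberships.find(
    (m) => m.userId === userId && m.workspaceId === workspaceId,
  );
  if (membership === undefined) return false; // non-members get nothing
  return PERMISSIONS[membership.role].indexOf(action) !== -1;
}
```

A check like this belongs on the backend, in front of every query and mutation, so that hiding a button in the UI is never the only safeguard.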


Accomplishments We're Proud Of

Multi-Agent AI System

Roadmap, Content, Notes, Flashcards, Quizzes, and Research agents working in harmony.

Development Acceleration via MCP (Dev-Only)

Used Kiro IDE’s MCP servers only during development to fetch fresh documentation and contextual data—significantly speeding up engineering.

Robust RAG Implementation

Seamless integration of user documents with semantic search and contextual retrieval.

Polished UI

Dark mode, smooth animations, accessibility, and Spec Mode–designed components.

Real-time Collaboration

Multi-workspace system with roles, permissions, and live updates.

Deep Research Engine

Produces academic-style reports with verified sources.

Strong Architecture

Modular, scalable, maintainable, and ready for production.

Error Resilience

Validation layers, retry logic, fallbacks, and graceful degradation throughout.


What We Learned

Technical Insights

  • Effective LangGraph orchestration
  • Deep integration of RAG pipelines
  • Real-time architecture with Convex
  • Importance of structured LLM output and retries
  • OCR and document extraction quirks
  • MCP tools dramatically speed up development, but don’t belong in production runtime

Architectural Insights

  • Modularity and isolation
  • Observability and debugging techniques
  • Performance optimization strategies

Product Insights

  • Progress indicators reduce friction
  • Structured content boosts understanding
  • Personalization increases engagement
  • Accessibility is crucial

Process Insights

  • Incremental development
  • Value of documentation
  • Coordinating multiple tools + APIs
  • Testing across the entire pipeline

What's Next for OpenTutor

Upcoming Features (3–6 Months)

  • Adaptive personalization
  • Learning analytics dashboards
  • Video lessons + interactive simulations
  • Text-to-speech + multi-language output
  • Peer review, study groups, collaborative notes
  • Mobile apps with offline learning
  • Voice-based tutoring

Mid-Term (6–12 Months)

  • Adaptive testing
  • Integrations: LMS, calendars, Notion, Anki
  • Enterprise workspace support
  • Community content marketplace
  • Advanced research tooling

Long-Term (1–2 Years)

  • Fully multimodal AI tutor
  • VR/AR immersive learning
  • Knowledge graphs + predictive analytics
  • Offline-first architecture
  • University & institutional partnerships

Technical Improvements

  • Edge computing
  • Multi-layer caching
  • SDKs & plugin system
  • Enhanced security & privacy
  • Automated testing frameworks

Business Expansion

  • Freemium + premium tiers
  • API monetization
  • Educational and enterprise partnerships

Conclusion

OpenTutor blends AI agents, retrieval-augmented generation, real-time collaboration, and a beautifully designed interface into a powerful personalized learning platform. Built using Kiro IDE, Spec Mode, and modern web technologies—with MCP servers used only during development to streamline engineering—it represents a complete AI-powered learning ecosystem designed for depth, adaptability, and real educational impact.

Built with ❤️ by the OpenTutor team.
