OpenTutor: AI-Powered Personalized Learning Platform

Inspiration

Learners today face persistent challenges: lack of personalization, scattered resources, passive content consumption, inefficient study approaches, unclear learning paths, and difficulty integrating their own materials. OpenTutor addresses these problems through adaptive AI, multi-agent systems, document intelligence, and retrieval-augmented learning.

What It Does

OpenTutor builds personalized learning paths, generates structured educational content, and provides interactive study tools through coordinated AI agents operating on a real-time backend.

Intelligent Roadmap Generation

Understands goals from natural language
Asks clarification questions to refine objectives
Supports file uploads + OCR to incorporate external context
Generates hierarchical learning roadmaps
Visualizes paths using React Flow
Adapts dynamically based on user progress

Dynamic Content Generation

Creates slide-based lessons: theory, examples, exercises, summaries
Outputs in Markdown + LaTeX
RAG-enhanced using user-uploaded documents + web research
Adjusts content complexity based on learner level

Interactive Study Tools

Flashcards – automatically generated, LaTeX-compatible, animated
Quizzes – MCQs, T/F, short answers, explanations, analytics
Notes – concise topic summaries

Deep Research Agent

Uses Perplexity and Tavily for multi-source research
Aggregates and verifies sources
Supports two deep research modes: 20–30 sources and 70–80 sources
Produces structured research reports with citations
Note: During development, MCP servers inside Kiro IDE (Firecrawl, Fetch, Context7, Browser MCP, etc.) were used only to fetch documentation and contextual information, not as part of the production agent pipeline.

Document Management & RAG

PDF, image OCR, text extraction
Semantic chunking with overlap
Gemini embeddings (768D)
Fast Convex vector search
Metadata-aware retrieval for higher relevance
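
The chunking-with-overlap step listed above can be illustrated with a short sketch. This is a simplified stand-in, not the project's code: it uses fixed word windows rather than semantic boundaries, and the name `chunk_text` and the `chunk_size` and `overlap` parameters are hypothetical.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 40) -> List[str]:
    """Split text into word windows; consecutive chunks share `overlap` words."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of embedding some words twice.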

Workspace System

Multiple workspaces
Real-time synchronization
Role-based collaboration
Organizes roadmaps, content, quizzes, notes, documents

MiniDrona Assistant

Provides contextual guidance, concept explanations, navigation help, and learning support throughout the platform.

Modern UI

Dark mode
Responsive layout
React Flow visualizations
Smooth animations via Framer Motion
Accessible, study-friendly interface
UI elements designed using Spec Mode in Kiro IDE

Technical Architecture

Frontend

Next.js 16
React 19
Tailwind CSS
Shadcn UI
Framer Motion
React Flow
KaTeX for mathematical rendering
UI prototyping accelerated with Kiro IDE’s Spec Mode

Backend

Convex real-time database
Vector index for semantic retrieval
Live subscriptions
Long-running actions for agent workflows

AI & Agents

LangGraph multi-agent workflow orchestration
LangChain tools for structured LLM interaction
Gemini + GPT models for content generation
OCR engines for documents
Perplexity + Tavily for research
MCP servers were used only during development inside Kiro IDE for documentation lookup, context retrieval, and faster iteration, not in production.

How We Built It

Development Environment

OpenTutor was built end-to-end inside Kiro IDE, using:

Spec Mode to prototype UI and UX components rapidly
MCP servers (Context7, Firecrawl, Fetch, Browser MCP, etc.) strictly for development-time tasks, such as:
- Fetching documentation
- Looking up API references
- Doing architectural research
- Accelerating coding and debugging
Local + cloud orchestration for agents during development

These MCP tools are NOT part of the production runtime.

Architecture Foundation

Designed for multi-agent orchestration, real-time sync, semantic retrieval, and scalable content generation.

Backend Development

Comprehensive schema for documents, chunks, embeddings, roadmaps, slides, quizzes, notes
Vector indexing with metadata filters
Real-time subscriptions via Convex

Multi-Agent System

Roadmap Agent: OCR → Clarification → Research → Roadmap generation
Content Agent: RAG → Research → Slide planning → Slide generation → Quiz creation
Research Agent: Queries → Verification → Citations → Report
All agents are implemented as LangGraph state-machine workflows, so they share the same flow logic
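
To make the Roadmap Agent flow concrete, here is a minimal plain-Python sketch of a linear state-machine pipeline with a validation check after each step. It deliberately does not use the LangGraph API; the node functions, the state keys, and the `REQUIRED_AFTER` table are hypothetical placeholders for illustration only.

```python
from typing import Any, Callable, Dict, List

State = Dict[str, Any]
Node = Callable[[State], State]


def ocr(state: State) -> State:
    # Placeholder: a real node would extract text from the uploaded files.
    return {**state, "context": f"text from {len(state.get('files', []))} file(s)"}


def clarify(state: State) -> State:
    # Placeholder: a real node would ask the learner follow-up questions.
    return {**state, "questions": [f"What is your current level in {state['goal']}?"]}


def research(state: State) -> State:
    # Placeholder: a real node would call the research tools.
    return {**state, "sources": ["placeholder source"]}


def generate_roadmap(state: State) -> State:
    goal = state["goal"]
    return {**state, "roadmap": [f"Foundations of {goal}", f"Advanced {goal}"]}


PIPELINE: List[Node] = [ocr, clarify, research, generate_roadmap]

# Key each node must have written before the next node is allowed to run.
REQUIRED_AFTER = {
    "ocr": "context",
    "clarify": "questions",
    "research": "sources",
    "generate_roadmap": "roadmap",
}


def run(state: State) -> State:
    """Run nodes in order, failing fast if a node leaves the state incomplete."""
    trace: List[str] = []
    for node in PIPELINE:
        state = node(state)
        key = REQUIRED_AFTER[node.__name__]
        if key not in state:
            raise RuntimeError(f"{node.__name__} did not produce '{key}'")
        trace.append(node.__name__)
    return {**state, "trace": trace}
```

Calling `run({"goal": "linear algebra", "files": ["notes.pdf"]})` walks the four steps in order and returns the accumulated state, including the `trace` of visited nodes.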

RAG Pipeline

Semantic chunking
Page-aware metadata
Similarity search and fallback logic
Batch embeddings for performance
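
The similarity search with fallback logic can be sketched in memory as follows. This is an assumption-laden illustration rather than the Convex vector search itself: the chunk shape, the `search` function, and the `min_score` threshold are hypothetical, and the vectors are 2-dimensional instead of 768-dimensional to keep the example readable.

```python
import math
from typing import Any, Dict, List, Optional, Tuple

Chunk = Dict[str, Any]  # hypothetical shape: {"text", "embedding", "doc_id", "page"}


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(
    query_vec: List[float],
    chunks: List[Chunk],
    doc_id: Optional[str] = None,
    k: int = 3,
    min_score: float = 0.5,
) -> List[Tuple[float, Chunk]]:
    def rank(candidates: List[Chunk]) -> List[Tuple[float, Chunk]]:
        scored = [(cosine(query_vec, c["embedding"]), c) for c in candidates]
        scored = [(s, c) for s, c in scored if s >= min_score]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]

    if doc_id is not None:
        # First try the metadata-filtered search.
        hits = rank([c for c in chunks if c["doc_id"] == doc_id])
        if hits:
            return hits
    # Fallback: no filter requested, or the filtered search found nothing.
    return rank(chunks)
```

The design choice here is to prefer results from the requested document but never return an empty answer when relevant chunks exist elsewhere.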

Frontend Development

Modular UI components
Interactive roadmap viewer
Flashcard, quiz, and notes interfaces
Real-time updates with Convex hooks
Spec Mode accelerated design iteration

Testing & Integration

Progress indicators
Error boundaries
Retry mechanisms
UI rendering and performance tuning

Deployment

Vercel for frontend hosting
Convex backend
Production environment setup and monitoring

Challenges We Overcame

Content Truncation

Solved via retry logic, validation checks, and improved prompting.
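
A minimal sketch of the retry-plus-validation idea, assuming a hypothetical `generate` callable that wraps the model call and a deliberately simple completeness heuristic (the checks actually used in the project are not described here):

```python
from typing import Callable


def looks_complete(text: str) -> bool:
    """Heuristic: non-empty, ends with sentence punctuation, and every
    display-math delimiter ($$) that was opened has also been closed."""
    stripped = text.strip()
    return bool(stripped) and stripped[-1] in ".!?" and stripped.count("$$") % 2 == 0


def generate_with_retry(generate: Callable[[int], str], max_attempts: int = 3) -> str:
    """Call `generate(attempt)` until the output passes validation."""
    last = ""
    for attempt in range(1, max_attempts + 1):
        last = generate(attempt)
        if looks_complete(last):
            return last
    raise ValueError(
        f"output still incomplete after {max_attempts} attempts: {last[-40:]!r}"
    )
```

Passing the attempt number to `generate` lets the caller adjust the prompt on later attempts, for example by asking for shorter output.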

Agent Coordination

Addressed with LangGraph’s state-machine workflows and strict state validation.

RAG Quality

Improved chunking logic, metadata preservation, and filtering rules.

Real-time Sync

Added optimistic updates, loading states, and progress mapping.

Document Processing

Introduced batch embeddings, async processing, and better feedback indicators.
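
Batching embedding requests might look like the sketch below. The `embed_batch` callable stands in for whatever embedding client is used; its name and signature, and the `batch_size` default, are assumptions for illustration. Async processing is not shown.

```python
from typing import Callable, Iterator, List

Vector = List[float]


def batched(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]


def embed_all(
    texts: List[str],
    embed_batch: Callable[[List[str]], List[Vector]],
    batch_size: int = 16,
) -> List[Vector]:
    """Embed texts in batches, keeping output order aligned with input order."""
    vectors: List[Vector] = []
    for batch in batched(texts, batch_size):
        result = embed_batch(batch)
        if len(result) != len(batch):
            raise ValueError("embedding service returned a different number of vectors")
        vectors.extend(result)
    return vectors
```

Sending several chunks per request reduces the number of round trips, which is where most of the time goes when processing a large document.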

JSON Failures

Built a JSON-repair + validation pipeline to handle malformed outputs.
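
One way such a repair-then-validate step could work is sketched below. The repair rules (strip surrounding prose, drop trailing commas) and the quiz field names are hypothetical examples, not the project's actual rules; note that the trailing-comma regex is naive and would also alter a string value that happens to contain a comma followed by a closing bracket.

```python
import json
import re
from typing import Any, Dict, Sequence


def repair_json(raw: str) -> str:
    """Apply cheap textual repairs to a model response that should be a JSON object."""
    text = raw.strip()
    # Keep only the outermost {...} span, dropping any prose around it.
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found")
    text = text[start:end + 1]
    # Remove trailing commas before a closing brace or bracket.
    return re.sub(r",\s*([}\]])", r"\1", text)


def parse_quiz(
    raw: str, required: Sequence[str] = ("question", "options", "answer")
) -> Dict[str, Any]:
    """Parse strictly first, repair only on failure, then validate required fields."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        data = json.loads(repair_json(raw))
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = [key for key in required if key not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return data
```

Trying the strict parse first means well-formed output is never touched by the repair heuristics.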

Multi-API Search Complexity

Unified Perplexity + Tavily workflows; MCP tools used only for dev-time research.

Roadmap Visualization Scaling

Used hierarchical layouts and virtualization in React Flow.
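
A hierarchical layout can be computed before the nodes are handed to the renderer. The sketch below is a simple level-by-level placement that assumes the roadmap is a tree (one parent per node, no cycles); the function name, the gap values, and the sample topic names are hypothetical, and virtualization is not shown.

```python
from collections import deque
from typing import Dict, List


def layout(
    children: Dict[str, List[str]], root: str, x_gap: int = 200, y_gap: int = 120
) -> Dict[str, Dict[str, int]]:
    """Place each node at y = depth * y_gap and x = (index within its level) * x_gap."""
    positions: Dict[str, Dict[str, int]] = {}
    level_counts: Dict[int, int] = {}  # how many nodes have been placed per depth
    queue = deque([(root, 0)])
    while queue:
        node, depth = queue.popleft()
        index = level_counts.get(depth, 0)
        level_counts[depth] = index + 1
        positions[node] = {"x": index * x_gap, "y": depth * y_gap}
        for child in children.get(node, []):
            queue.append((child, depth + 1))
    return positions
```

Each resulting `{"x": ..., "y": ...}` pair can be used as a node's `position` when building the graph for display.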

LaTeX Rendering Issues

Integrated remark-math and rehype-katex for consistent rendering.

Permissions & Security

Implemented role-based workspace access control.
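
A role check of this kind can be very small. In the sketch below the role names (viewer, editor, owner) and the action names are hypothetical; the source does not list the actual roles.

```python
from enum import IntEnum
from typing import Dict


class Role(IntEnum):
    VIEWER = 1
    EDITOR = 2
    OWNER = 3


# Minimum role needed for each action (hypothetical action names).
REQUIRED_ROLE = {
    "read": Role.VIEWER,
    "edit": Role.EDITOR,
    "manage_members": Role.OWNER,
}


def can(members: Dict[str, Role], user_id: str, action: str) -> bool:
    """Return True only if the user is a workspace member with a sufficient role."""
    role = members.get(user_id)
    if role is None or action not in REQUIRED_ROLE:
        return False  # deny by default: unknown users and unknown actions
    return role >= REQUIRED_ROLE[action]
```

Denying unknown actions by default means a newly added action stays locked until someone explicitly assigns it a required role.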

Accomplishments We're Proud Of

Multi-Agent AI System

Roadmap, Content, Notes, Flashcards, Quizzes, and Research agents working in harmony.

Development Acceleration via MCP (Dev-Only)

Used Kiro IDE’s MCP servers only during development to fetch fresh documentation and contextual data—significantly speeding up engineering.

Robust RAG Implementation

Seamless integration of user documents with semantic search and contextual retrieval.

Polished UI

Dark mode, smooth animations, accessibility, and Spec Mode–designed components.

Real-time Collaboration

Multi-workspace system with roles, permissions, and live updates.

Deep Research Engine

Produces academic-style reports with verified sources.

Strong Architecture

Modular, scalable, maintainable, and ready for production.

Error Resilience

Validation layers, retry logic, fallbacks, and graceful degradation throughout.

What We Learned

Technical Insights

Effective LangGraph orchestration
Deep integration of RAG pipelines
Real-time architecture with Convex
Importance of structured LLM output and retries
OCR and document extraction quirks
MCP tools dramatically speed up development, but don’t belong in production runtime

Architectural Insights

Modularity and isolation
Observability and debugging techniques
Performance optimization strategies

Product Insights

Progress indicators reduce friction
Structured content boosts understanding
Personalization increases engagement
Accessibility is crucial

Process Insights

Incremental development
Value of documentation
Coordinating multiple tools + APIs
Testing across the entire pipeline

What's Next for OpenTutor

Upcoming Features (3–6 Months)

Adaptive personalization
Learning analytics dashboards
Video lessons + interactive simulations
Text-to-speech + multi-language output
Peer review, study groups, collaborative notes
Mobile apps with offline learning
Voice-based tutoring

Mid-Term (6–12 Months)

Adaptive testing
Integrations: LMS, calendars, Notion, Anki
Enterprise workspace support
Community content marketplace
Advanced research tooling

Long-Term (1–2 Years)

Fully multimodal AI tutor
VR/AR immersive learning
Knowledge graphs + predictive analytics
Offline-first architecture
University & institutional partnerships

Technical Improvements

Edge computing
Multi-layer caching
SDKs & plugin system
Enhanced security & privacy
Automated testing frameworks

Business Expansion

Freemium + premium tiers
API monetization
Educational and enterprise partnerships

Conclusion

OpenTutor blends AI agents, retrieval-augmented generation, real-time collaboration, and a beautifully designed interface into a powerful personalized learning platform. Built using Kiro IDE, Spec Mode, and modern web technologies—with MCP servers used only during development to streamline engineering—it represents a complete AI-powered learning ecosystem designed for depth, adaptability, and real educational impact.

Built with ❤️ by the OpenTutor team.