Screenshots: landing page (dark mode), homepage (dark mode), dashboard (light mode), sample question bank (light mode), sample question analysis (light mode), question bank (light mode), quiz mode (light mode), test mode (light mode), quiz summary (light mode), quiz summary with AI tutor (light mode), flashcards (light mode), wrong question notebook (light mode), achievements (light mode), analytics dashboard (light mode)
Inspiration
It started with a frustrated friend. She was preparing for her midterm exams and complained that the sample questions her professor provided were useless - the difficulty level was way off, the question style didn't match what would actually be on the exam, and she felt completely unprepared despite studying for hours.
She tried using ChatGPT to generate practice questions, but it was painfully inefficient. She had to copy-paste her notes, carefully craft prompts, and manually format the output, and the questions it generated were often too generic or missed the key concepts her professor emphasized. Other study tools like Quizlet or Anki required her to manually create every flashcard - they didn't offer AI generation that could match her professor's testing style.
I realized there was a gap: no tool existed that could analyze your study materials AND learn the specific style of questions you need. Not just generic quiz generation, but questions that match the difficulty, format, and focus areas of your actual exams.
The Kiroween "Frankenstein" theme pushed me further. I decided to stitch together technologies that seem incompatible: 1987 spaced repetition algorithms (SM-2 from SuperMemo), modern multi-agent AI (8 specialized agents working together), Web3 blockchain verification (Base L2 + IPFS for permanent achievement records), and RAG pipelines with semantic chunking. The result is a chimera of educational technology - each piece from a different era, but together they create something more powerful than any single approach.
What it does
StudyForge transforms any document into personalized study materials that match YOUR professor's testing style:
Core Features:
- Document Upload & Processing: Upload PDFs, images, Word docs, or even handwritten notes. The AI extracts and understands the content.
- Style-Matched Question Generation: Upload sample questions from your professor, and the Analysis Agent learns the style - difficulty level, question format, emphasis areas. All future questions match that style.
- 8 Specialized AI Agents:
  - Controller Agent: Orchestrates the workflow
  - Analysis Agent: Learns question styles from samples
  - Generation Agent: Creates questions matching learned styles
  - Grading Agent: Provides partial credit with detailed feedback
  - Handwriting Agent: Extracts text from handwritten notes (GPT-4o Vision)
  - Explanation Agent: Generates concept explanations when you're stuck
  - Chapter Agent: Organizes documents into logical sections
  - SLM Router: Routes simple tasks to cheaper models for cost optimization
Study Modes:
- Quiz Mode: Multiple choice, true/false, short answer, written response - with AI grading and partial credit
- Flashcard Mode: Spaced repetition scheduling using SM-2 algorithm for optimal retention
- Notebook: Tracks your wrong answers for targeted review
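For readers unfamiliar with SM-2, here is a minimal sketch of one common formulation of the scheduling step. It is not StudyForge's actual code: the names CardState and sm2_review are hypothetical, and implementations differ on details such as whether the ease factor changes after a failed review (this sketch lowers it).

```python
from dataclasses import dataclass

MIN_EASE = 1.3  # SM-2 never lets the ease factor drop below 1.3


@dataclass(frozen=True)
class CardState:
    """Hypothetical per-flashcard scheduling state."""
    repetitions: int = 0    # consecutive successful reviews
    interval_days: int = 0  # days until the next review
    ease: float = 2.5       # SM-2 starting ease factor


def sm2_review(state: CardState, quality: int) -> CardState:
    """Return the next state after a review graded 0 (blackout) to 5 (perfect)."""
    if not 0 <= quality <= 5:
        raise ValueError("quality must be between 0 and 5")

    # Standard SM-2 ease update, clamped at the minimum.
    miss = 5 - quality
    ease = max(MIN_EASE, state.ease + 0.1 - miss * (0.08 + miss * 0.02))

    # A grade below 3 counts as a lapse: restart the repetition sequence.
    if quality < 3:
        return CardState(repetitions=0, interval_days=1, ease=ease)

    if state.repetitions == 0:
        interval = 1
    elif state.repetitions == 1:
        interval = 6
    else:
        interval = round(state.interval_days * ease)
    return CardState(state.repetitions + 1, interval, ease)
```

Three perfect reviews in a row push the interval from 1 day to 6 days and then to roughly 6 times the ease factor, which is why well-known cards quickly drop out of the daily queue.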
Blockchain-Verified Achievements:
- Earn badges for study milestones (accuracy streaks, volume, mastery)
- Certificates stored permanently on IPFS
- Proof anchored to Base L2 blockchain (~$0.001 per verification)
- Verifiable forever - even if StudyForge disappears, your achievements persist
Analytics Dashboard:
- Learning Score: AI-calculated metric based on accuracy, consistency, and improvement
- Performance trends over time with interactive charts
- Hardest questions identification for focused review
- Category-by-category progress tracking
How I built it
Frontend:
- React 18 with TypeScript for type safety
- Vite for fast development builds
- Tailwind CSS for responsive styling
- ECharts for interactive analytics visualizations
- Framer Motion for smooth animations
- React Router for navigation
Backend:
- FastAPI (Python 3.11) with async/await throughout
- SQLAlchemy 2.0 with async support for database operations
- PostgreSQL 15 with pgvector extension for semantic search
- Alembic for database migrations
- Pydantic 2.5 for request/response validation
- Structlog for structured logging
AI Integration:
- Multi-provider support: Claude (Anthropic), GPT-4o (OpenAI), Groq (Llama 3.3 70B), Moonshot
- Intelligent routing: Simple tasks → cheap models, complex tasks → powerful models
- RAG pipeline with semantic chunking (MC-Indexing/SCAN algorithms)
- Vector embeddings for concept similarity matching
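As a rough illustration of structure-aware chunking, the sketch below splits a document on headers and paragraphs instead of fixed character counts. It is not the MC-Indexing or SCAN algorithm named above, and it makes several assumptions: headers are Markdown-style lines starting with "#", blocks are separated by blank lines, and chunk_by_structure, Chunk, and max_chars are hypothetical names.

```python
import re
from dataclasses import dataclass

# Assumption: headers look like Markdown ("# Title") and sit in their own block.
HEADER_RE = re.compile(r"^#{1,6}\s+(.*)$")


@dataclass
class Chunk:
    heading: str  # nearest preceding header, kept as context for the model
    text: str     # one or more whole paragraphs


def chunk_by_structure(document: str, max_chars: int = 1200) -> list[Chunk]:
    """Group whole paragraphs under their header, never cutting mid-paragraph."""
    chunks: list[Chunk] = []
    heading = ""
    buffer: list[str] = []

    def flush() -> None:
        # Emit the buffered paragraphs as one chunk under the current heading.
        if buffer:
            chunks.append(Chunk(heading, "\n\n".join(buffer)))
            buffer.clear()

    for block in re.split(r"\n\s*\n", document.strip()):
        block = block.strip()
        if not block:
            continue
        match = HEADER_RE.match(block)
        if match:
            flush()  # a new section always starts a new chunk
            heading = match.group(1).strip()
            continue
        # Start a new chunk if adding this paragraph would exceed the budget.
        if buffer and sum(len(p) for p in buffer) + len(block) > max_chars:
            flush()
        buffer.append(block)

    flush()
    return chunks
```

Keeping the heading on every chunk gives a question generator some context even when a long section is split across several chunks.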
Blockchain:
- Pinata for IPFS uploads (certificate storage)
- Base L2 for on-chain anchoring (proof of achievement)
- Self-send transactions with IPFS hash in data field (minimal gas cost)
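The self-send anchoring idea can be sketched without any blockchain library: the IPFS content identifier (CID) is hex-encoded into the transaction's data field, and anyone can later decode it to find the certificate. This is an illustration under assumptions, not the project's code. The function names are hypothetical, gas fields are omitted, and signing and broadcasting are left to a library such as web3.py.

```python
BASE_MAINNET_CHAIN_ID = 8453  # chain ID of Base mainnet


def cid_to_calldata(cid: str) -> str:
    """Hex-encode an IPFS CID so it fits in a transaction's data field."""
    return "0x" + cid.encode("utf-8").hex()


def calldata_to_cid(data: str) -> str:
    """Recover the CID from a transaction's data field (the verification step)."""
    return bytes.fromhex(data.removeprefix("0x")).decode("utf-8")


def build_anchor_tx(address: str, cid: str, nonce: int) -> dict:
    """Build an unsigned self-send transaction that carries only the CID.

    Sender and recipient are the same address and no value moves, so the
    cost is just the transaction fee. Gas fields are intentionally left out.
    """
    return {
        "chainId": BASE_MAINNET_CHAIN_ID,
        "from": address,
        "to": address,
        "value": 0,
        "nonce": nonce,
        "data": cid_to_calldata(cid),
    }
```

A verifier would fetch the transaction by hash, decode its data field with calldata_to_cid, and then load the certificate from any IPFS gateway using that CID.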
Infrastructure:
- Docker & Docker Compose for local development
- AWS Elastic Beanstalk for backend hosting
- AWS RDS for PostgreSQL database
- AWS S3 for frontend hosting
- Sentry for error tracking and performance monitoring
Kiro Integration:
- 12 steering documents guiding architecture decisions
- 11 automation hooks for testing, migrations, code quality
- Extensive vibe coding sessions for rapid prototyping
Challenges I ran into
1. Multi-Agent Coordination: Getting 8 specialized agents to work together without conflicts was harder than expected. The Controller Agent needed careful design to route tasks appropriately and combine results. I solved this with a clear interface contract defined in the steering docs.
2. Style Learning Accuracy: The Analysis Agent needed to extract subtle patterns from sample questions - not just "multiple choice" but the specific way a professor phrases distractors, the depth of knowledge tested, the balance of recall vs. application questions. I iterated through many prompt versions before achieving reliable style extraction.
3. Handwriting Recognition: GPT-4o Vision is powerful but inconsistent with messy handwriting. Mathematical notation, arrows, and diagrams were particularly challenging. I added preprocessing guidance and structured prompts to improve accuracy.
4. Blockchain Gas Optimization: Initially, I considered minting NFTs for achievements, but gas costs were prohibitive for a study app. So I pivoted to a simpler approach: store the full certificate on IPFS (free), then anchor just the IPFS hash to Base L2 using a minimal self-send transaction (~$0.001). Same verifiability, 100x cheaper.
5. RAG Pipeline Tuning: Finding the right chunk size and overlap for question generation was tricky. Too small and questions lack context; too large and the AI gets overwhelmed. I implemented adaptive chunking based on document structure (headers, paragraphs) rather than fixed character counts.
6. Cost Management: AI API costs add up quickly. I implemented SLM (Small Language Model) routing - simple tasks like formatting or basic classification go to Groq's Llama 3.3 70B (fast and cheap), while complex reasoning tasks go to Claude or GPT-4o. This reduced the API costs by ~60%.
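The routing rule from the last item can be sketched in a few lines. This is a simplified illustration, not the actual SLM Router. The model labels are placeholders, not real API identifiers, and the prompt-length cutoff is an added assumption that does not come from the project.

```python
from dataclasses import dataclass

# Placeholder labels, NOT real API model identifiers.
SMALL_MODEL = "small-fast-model"       # e.g. Llama 3.3 70B served by Groq
LARGE_MODEL = "large-reasoning-model"  # e.g. Claude or GPT-4o

# Task kinds the text describes as simple; any other kind counts as complex.
SIMPLE_TASKS = frozenset({"formatting", "classification"})


@dataclass(frozen=True)
class Task:
    kind: str    # hypothetical label such as "formatting" or "grading"
    prompt: str


def choose_model(task: Task, max_small_prompt_chars: int = 4000) -> str:
    """Send simple, short tasks to the cheap model and everything else upmarket.

    The length cutoff is an assumption added for illustration only.
    """
    is_simple = task.kind in SIMPLE_TASKS
    is_short = len(task.prompt) <= max_small_prompt_chars
    return SMALL_MODEL if is_simple and is_short else LARGE_MODEL
```

Defaulting unknown task kinds to the larger model trades some cost for safety: a misrouted complex task is more damaging than an overpaid simple one.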
Accomplishments that I'm proud of
- 8 specialized AI agents working in harmony with clear interfaces
- Style-matched question generation - questions that actually feel like your professor wrote them
- Blockchain-verified achievements - first study platform with permanent, verifiable on-chain proof of learning
- Handwriting recognition that handles real student notes, not just perfect text
- About $0.001 per achievement verification on Base L2
- 23 Kiro configuration files (12 steering docs + 11 hooks) demonstrating deep platform integration
- Production deployment on AWS with full monitoring and error tracking
- 60% cost reduction through intelligent SLM routing
- Sub-second response times for most operations despite complex AI pipelines
What I learned
Steering docs are game-changers: Documenting my architecture patterns in .kiro/steering/ meant Kiro consistently generated code that fit my conventions. No more "that's not how I do it here" corrections.
Agent hooks save hours daily: The check-api-contracts hook alone probably prevented 20+ bugs by catching frontend-backend mismatches before they hit production.
Multi-agent > monolithic prompts: Breaking AI tasks into specialized agents (analysis, generation, grading) produced better results than one massive prompt trying to do everything.
Web3 + Education has real potential: Students genuinely care about verifiable credentials. The blockchain verification feature got the most excited reactions in user testing.
Small models punch above their weight: Groq's Llama 3.3 70B handles 80% of the workload at 10% of GPT-4 costs. Smart routing matters more than always using the biggest model.
Kiro accelerates iteration speed dramatically: What would have taken weeks of manual coding was accomplished in days through vibe coding sessions with proper steering context.
What's next for StudyForge
Browser-Based Local AI (Transformers.js/WebLLM): Run small models directly in the browser for offline study mode and zero-latency hints. Students could study on airplanes or in areas with poor connectivity.
Voice Quiz Mode: Take quizzes hands-free using Web Speech API. Perfect for studying while commuting, exercising, or doing chores.
NFT Achievement Badges: Mint actual NFTs on Base for legendary achievements. Share your verified accomplishments on social media or LinkedIn.
Collaborative Study Groups: Share categories with classmates, compete on leaderboards, and see aggregated analytics for your study group.
Professor Dashboard: Let professors upload their question bank style, then students automatically get style-matched practice questions.
Mobile App: React Native version for true mobile-first studying with push notification reminders for spaced repetition reviews.
Built With
- alembic
- amazon-web-services
- anthropic-claude
- aws-elastic-beanstalk
- aws-rds
- base-l2
- docker
- echarts
- fastapi
- framer-motion
- groq
- ipfs
- openai-gpt-4o
- pgvector
- pinata
- postgresql
- pydantic
- python
- react
- sentry
- sqlalchemy
- structlog
- tailwind-css
- typescript
- vite