📖 About the Project

🎯 Inspiration

In Peru, the digital divide is more than a statistic; it's a barrier locking millions of rural students out of modern education. Rural connectivity has grown, but in 2024 it still reached only 55% of the population. Network expansion projects are still years away (2025-2026), but students need to learn today.

I was inspired by a simple, urgent question: Why should a student's education depend on their internet access? I decided to build a solution that bypasses the connectivity problem entirely, delivering a high-quality, intelligent learning experience to anyone with a basic smartphone.

When Google announced the Built-in AI Challenge 2025 and introduced Gemini Nano, I saw an incredible opportunity to solve these problems while respecting user privacy. The idea of running AI models directly in the browser opened up possibilities for creating a truly private, fast, and accessible learning tool.

💡 What It Does

The AI-Powered Learning Assistant is a comprehensive study companion for students and educators. Its features fall into two groups:

Core Features (Powered by Gemini Nano - Local Processing)

  1. 📄 PDF Summarization: Upload any study material and get intelligent summaries highlighting key concepts, main ideas, and important takeaways - all processed locally in your browser.

  2. 📝 Exercise Generation: Create five different types of practice exercises:

    • Multiple Choice questions with detailed explanations
    • True/False statements for quick review
    • Fill in the Blanks for active recall
    • Short Answer questions for deeper understanding
    • Matching exercises to connect related concepts

Each exercise includes difficulty levels, correct answers, explanations, and learning objectives.

  3. 🎴 Flashcard Creation: Generate smart flashcards from any document with:
    • Automatic key term extraction
    • Topic and subtopic organization
    • Difficulty levels
    • Tags for categorization
    • Detailed explanations
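
As a rough illustration of how the exercise and flashcard outputs described above could be typed, here is a minimal sketch using a discriminated union. Every field name below (`type`, `correctIndex`, `learningObjective`, `subtopic`, and so on) is an assumption made for this example, not the project's actual schema.

```typescript
// Hypothetical data shapes; field names are assumptions, not the project's schema.
type Difficulty = "easy" | "medium" | "hard";

interface ExerciseBase {
  difficulty: Difficulty;
  explanation: string;
  learningObjective: string;
}

interface MultipleChoice extends ExerciseBase {
  type: "multiple_choice";
  question: string;
  options: string[];
  correctIndex: number;
}

interface TrueFalse extends ExerciseBase {
  type: "true_false";
  statement: string;
  correct: boolean;
}

interface FillInTheBlank extends ExerciseBase {
  type: "fill_blank";
  sentence: string; // contains "____" where the answer goes
  answer: string;
}

interface ShortAnswer extends ExerciseBase {
  type: "short_answer";
  question: string;
  sampleAnswer: string;
}

interface Matching extends ExerciseBase {
  type: "matching";
  pairs: Array<{ term: string; definition: string }>;
}

type Exercise = MultipleChoice | TrueFalse | FillInTheBlank | ShortAnswer | Matching;

// Returns the correct answer as display text, one branch per exercise type.
function answerText(ex: Exercise): string {
  switch (ex.type) {
    case "multiple_choice":
      return ex.options[ex.correctIndex] ?? "";
    case "true_false":
      return ex.correct ? "True" : "False";
    case "fill_blank":
      return ex.answer;
    case "short_answer":
      return ex.sampleAnswer;
    case "matching":
      return ex.pairs.map((p) => `${p.term} -> ${p.definition}`).join("; ");
  }
}

interface Flashcard {
  term: string;
  explanation: string;
  topic: string;
  subtopic: string;
  difficulty: Difficulty;
  tags: string[];
}

// Groups cards as topic -> subtopic -> cards, preserving input order.
function groupByTopic(cards: Flashcard[]): Map<string, Map<string, Flashcard[]>> {
  const byTopic = new Map<string, Map<string, Flashcard[]>>();
  for (const card of cards) {
    const subtopics = byTopic.get(card.topic) ?? new Map<string, Flashcard[]>();
    const bucket = subtopics.get(card.subtopic) ?? [];
    bucket.push(card);
    subtopics.set(card.subtopic, bucket);
    byTopic.set(card.topic, subtopics);
  }
  return byTopic;
}
```

With a union like this, the UI can switch on `type` to pick a renderer, and under strict compiler settings a missing branch for a newly added exercise type is flagged at build time rather than at runtime.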

Additional Features (Powered by Gemini API)

  1. 🗺️ Learning Roadmaps: Create personalized learning paths with milestone tracking and prerequisite mapping.

  2. 🎮 Educational Games: Generate interactive word searches and crossword puzzles for fun, engaging review sessions.

🛠️ How I Built It

Technology Stack

Frontend:

  • Next.js 14 with App Router for a modern, performant web application
  • TypeScript for type-safe code and better developer experience
  • Tailwind CSS for responsive, beautiful UI design
  • Gemini Nano via Chrome's Built-in AI API for local processing
  • Gemini API for cloud-based features

Backend:

  • FastAPI (Python) for high-performance REST APIs
  • LangChain for managing AI prompts and chains
  • PyPDF2 & pdfplumber for PDF text extraction
  • Google Generative AI SDK for Gemini API integration

Architecture Decisions

  1. Hybrid AI Approach: I designed a hybrid architecture where privacy-sensitive features (summarization, exercises, flashcards) use Gemini Nano for local processing, while less sensitive features (roadmaps, games) leverage the cloud-based Gemini API for more complex operations.

  2. Modular Structure: Following SOLID principles, I created a modular structure with:

    • Reusable prompt templates
    • Structured JSON responses
    • Type-safe interfaces across frontend and backend
    • Utility functions for response parsing and validation
  3. Response Parser: Built a robust JSON parser that handles markdown-wrapped responses from AI models, ensuring clean data extraction.
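
The hybrid approach above could be expressed as a small routing function. This is a sketch under stated assumptions: the feature names come from this write-up, but `PREFERRED_BACKEND`, `chooseBackend`, and the rule that local features never fall back to the cloud are illustrative choices, not a description of the actual code.

```typescript
// Feature names come from the write-up; the routing table itself is an assumption.
type Feature = "summary" | "exercises" | "flashcards" | "roadmap" | "games";
type Backend = "local" | "cloud";

const PREFERRED_BACKEND: Record<Feature, Backend> = {
  summary: "local",
  exercises: "local",
  flashcards: "local",
  roadmap: "cloud",
  games: "cloud",
};

interface RouteDecision {
  backend: Backend | null; // null means the feature cannot run right now
  reason: string;
}

// Picks a backend for a feature. In this sketch, privacy-sensitive features
// never fall back to the cloud, so document text stays on the device even
// when the local model is missing (an assumed policy).
function chooseBackend(feature: Feature, localModelReady: boolean, online: boolean): RouteDecision {
  const preferred = PREFERRED_BACKEND[feature];
  if (preferred === "local") {
    return localModelReady
      ? { backend: "local", reason: "on-device model available" }
      : { backend: null, reason: "on-device model not ready; document is not sent to the cloud" };
  }
  return online
    ? { backend: "cloud", reason: "cloud feature and network available" }
    : { backend: null, reason: "cloud feature requires a network connection" };
}
```

Keeping the decision in one pure function makes the privacy rule easy to audit: the only path that returns `"cloud"` is for features that were marked as cloud features in the table.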

Development Process

Phase 1: Research & Planning

  • Studied Gemini Nano's capabilities and limitations
  • Designed the architecture to maximize local processing benefits
  • Created TypeScript interfaces matching backend structures

Phase 2: Core Implementation

  • Set up Next.js frontend with TypeScript
  • Built FastAPI backend with LangChain integration
  • Implemented Gemini Nano client wrapper
  • Created prompt templates for each feature

Phase 3: Feature Development

  • PDF processing and text extraction
  • Exercise generation with 5 different types
  • Flashcard creation with intelligent formatting
  • Response parsing and validation utilities

Phase 4: Integration & Testing

  • Connected frontend with Gemini Nano
  • Tested on Chrome Canary
  • Validated JSON response structures
  • Refined prompts for better output quality

🚧 Challenges I Faced

1. Gemini Nano Availability & Setup

Challenge: Getting Gemini Nano to work required Chrome Canary, enabling experimental flags, and manually downloading the model.

Solution: Created comprehensive documentation in the README with step-by-step instructions. Added clear browser requirement sections to guide users through the setup process.
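
Beyond documentation, the app can detect the setup state at runtime and show a matching message. The sketch below assumes the shape of Chrome's Prompt API as I understand it at the time of writing (a global `LanguageModel` object with an `availability()` method returning one of four states). That surface has changed between Chrome releases, so treat the interface as an assumption and check the current documentation. `checkLocalModel` and `SetupStatus` are hypothetical names.

```typescript
// Availability states assumed from Chrome's Prompt API documentation;
// the exact API shape may differ in the Chrome version you are using.
type Availability = "unavailable" | "downloadable" | "downloading" | "available";

interface LanguageModelLike {
  availability(): Promise<Availability>;
}

interface SetupStatus {
  ready: boolean;
  message: string;
}

// Takes the global scope as a parameter so it can be exercised outside a browser.
async function checkLocalModel(scope: { LanguageModel?: LanguageModelLike }): Promise<SetupStatus> {
  const api = scope.LanguageModel;
  if (!api) {
    return {
      ready: false,
      message: "Built-in AI is not exposed in this browser. See the README for the required flags.",
    };
  }
  const state = await api.availability();
  switch (state) {
    case "available":
      return { ready: true, message: "Gemini Nano is ready." };
    case "downloadable":
    case "downloading":
      return { ready: false, message: "The model still needs to finish downloading." };
    default:
      return { ready: false, message: "This device does not support the on-device model." };
  }
}
```

In the browser this would be called as `checkLocalModel(globalThis as { LanguageModel?: LanguageModelLike })`, and the returned message shown in the setup banner.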

2. Response Format Inconsistency

Challenge: AI responses sometimes came wrapped in markdown code blocks (```json ... ```), making JSON parsing unreliable.

Solution: Built a robust parseJSONResponse utility that:

  • Detects and removes markdown formatting
  • Handles both wrapped and raw JSON
  • Provides clear error messages
  • Validates response structure before returning

export function parseJSONResponse<T>(response: string): T {
    // Work on a trimmed copy so the anchored regex below can match
    let cleanedResponse = response.trim();

    // Remove markdown code blocks if present
    const markdownRegex = /^```(?:json)?\s*\n?([\s\S]*?)\n?```$/;
    const match = cleanedResponse.match(markdownRegex);

    if (match) {
        cleanedResponse = match[1].trim();
    }

    // The cast does not check the shape; callers should validate the result
    return JSON.parse(cleanedResponse) as T;
}
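
The snippet above shows the fence-stripping step only. A sketch of how the error-message and structure-validation steps from the list could be layered on top follows. `parseAndValidate`, `Guard`, `Summary`, and `isSummary` are hypothetical names introduced for this example, and the `Summary` shape is an assumption.

```typescript
// A type guard decides whether a parsed value has the expected shape.
type Guard<T> = (value: unknown) => value is T;

const FENCE = "\u0060\u0060\u0060"; // three backticks
const FENCED_JSON = new RegExp("^" + FENCE + "(?:json)?\\s*\\n?([\\s\\S]*?)\\n?" + FENCE + "$");

// Strips an optional markdown fence, parses the JSON, then validates it.
// Throws an Error whose message says which step failed.
function parseAndValidate<T>(response: string, guard: Guard<T>): T {
  const trimmed = response.trim();
  const fenced = trimmed.match(FENCED_JSON);
  const jsonText = fenced ? fenced[1].trim() : trimmed;

  let parsed: unknown;
  try {
    parsed = JSON.parse(jsonText);
  } catch (err) {
    const detail = err instanceof Error ? err.message : String(err);
    throw new Error("Model response is not valid JSON: " + detail);
  }
  if (!guard(parsed)) {
    throw new Error("Model response parsed as JSON but does not match the expected structure");
  }
  return parsed;
}

// Hypothetical summary shape used only for this example.
interface Summary {
  title: string;
  keyPoints: string[];
}

function isSummary(value: unknown): value is Summary {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.title === "string" &&
    Array.isArray(v.keyPoints) &&
    v.keyPoints.every((p) => typeof p === "string")
  );
}
```

Calling `parseAndValidate(modelOutput, isSummary)` returns a typed `Summary` or throws, so malformed model output is caught at the boundary instead of surfacing later in the UI.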

3. Type Safety Across Frontend & Backend

Challenge: Maintaining consistent data structures between TypeScript frontend and Python backend.

Solution: Created matching interfaces/models:

  • TypeScript interfaces in gobal.ts
  • Pydantic models in backend
  • Shared structure definitions
  • Comprehensive type checking

4. Prompt Engineering for Multiple Exercise Types

Challenge: Each exercise type requires different JSON structures and instructions.

Solution:

  • Created dedicated prompt templates for each type
  • Used LangChain's PromptTemplate for variable substitution
  • Included clear structure examples in prompts
  • Iteratively refined prompts based on output quality
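
To illustrate the per-type template idea, here is a minimal sketch of placeholder substitution in TypeScript. It is a stand-in written for this example, not LangChain's `PromptTemplate`; `fillTemplate`, `TEMPLATES`, `buildPrompt`, and the prompt wording are all assumptions.

```typescript
// Minimal stand-in for a prompt template: replaces {name} placeholders.
// Braces in the JSON examples are left alone because they are not {word} tokens.
function fillTemplate(template: string, values: Record<string, string>): string {
  return template.replace(/\{(\w+)\}/g, (_match, key: string) => {
    const value = values[key];
    if (value === undefined) {
      throw new Error(`Missing value for placeholder "${key}"`);
    }
    return value;
  });
}

type ExerciseKind = "true_false" | "fill_blank";

// One template per exercise type, each with its own structure example.
const TEMPLATES: Record<ExerciseKind, string> = {
  true_false:
    "Write {count} true/false statements about the text below.\n" +
    'Reply with JSON only, shaped like: [{"statement": "...", "correct": true, "explanation": "..."}]\n\n' +
    "Text:\n{text}",
  fill_blank:
    "Write {count} fill-in-the-blank sentences about the text below.\n" +
    'Reply with JSON only, shaped like: [{"sentence": "The ____ is ...", "answer": "...", "explanation": "..."}]\n\n' +
    "Text:\n{text}",
};

function buildPrompt(kind: ExerciseKind, text: string, count: number): string {
  return fillTemplate(TEMPLATES[kind], { count: String(count), text });
}
```

Throwing on a missing placeholder surfaces template mistakes during development instead of sending a prompt with a literal `{text}` in it to the model.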

5. PDF Text Extraction Quality

Challenge: Different PDF formats produced varying quality of extracted text.

Solution: Implemented multiple extraction strategies:

  • Primary: PyPDF2 for standard PDFs
  • Fallback: pdfplumber for complex layouts
  • Text cleaning and normalization
  • Error handling for corrupted files
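
The project's extraction runs in the Python backend with PyPDF2 and pdfplumber. The sketch below only illustrates the fallback-and-clean strategy in TypeScript, to match the other examples here. The `Extractor` type, the usability heuristic, and its thresholds are assumptions for illustration.

```typescript
// An extractor turns raw PDF bytes into text; real ones would wrap a PDF library.
type Extractor = (pdf: Uint8Array) => string;

// Joins words hyphenated across line breaks and collapses runs of whitespace.
function cleanText(raw: string): string {
  return raw
    .replace(/(\w)-\n(\w)/g, "$1$2")
    .replace(/[ \t]+/g, " ")
    .replace(/\n{3,}/g, "\n\n")
    .trim();
}

// Heuristic (an assumption): text is usable if it is long enough and mostly
// made of letters, digits, or whitespace (including accented Latin letters).
function looksUsable(text: string, minChars = 50): boolean {
  if (text.length < minChars) return false;
  const readable = text.replace(/[^\w\s\u00C0-\u024F]/g, "").length;
  return readable / text.length > 0.7;
}

// Tries each extractor in order and returns the first usable, cleaned result.
function extractWithFallback(pdf: Uint8Array, extractors: Extractor[]): string {
  const errors: string[] = [];
  for (const extract of extractors) {
    try {
      const text = cleanText(extract(pdf));
      if (looksUsable(text)) return text;
      errors.push("output too short or mostly unreadable");
    } catch (err) {
      errors.push(err instanceof Error ? err.message : String(err));
    }
  }
  throw new Error("No extractor produced usable text: " + errors.join("; "));
}
```

Ordering the extractors from fastest to most thorough means the slower fallback only runs for PDFs that actually need it.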

6. Local Processing Limitations

Challenge: Gemini Nano has context window and capability limitations compared to cloud models.

Solution:

  • Optimized prompts to be concise but effective
  • Chunked large documents for processing
  • Set realistic expectations in UI
  • Used cloud API for complex features (roadmaps, games)
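
A minimal sketch of the chunking step, assuming character counts as a rough proxy for the model's token limit. `chunkText`, `summarizeLong`, and the `summarize` callback (a hypothetical wrapper around the on-device model) are names made up for this example.

```typescript
// Splits text into chunks of at most maxChars, preferring paragraph boundaries.
function chunkText(text: string, maxChars: number): string[] {
  if (maxChars <= 0) throw new Error("maxChars must be positive");
  const chunks: string[] = [];
  let current = "";

  for (const paragraph of text.split(/\n\s*\n/)) {
    const para = paragraph.trim();
    if (para.length === 0) continue;

    // A single oversized paragraph is hard-split at maxChars.
    if (para.length > maxChars) {
      if (current) {
        chunks.push(current);
        current = "";
      }
      for (let i = 0; i < para.length; i += maxChars) {
        chunks.push(para.slice(i, i + maxChars));
      }
      continue;
    }

    const candidate = current ? current + "\n\n" + para : para;
    if (candidate.length > maxChars) {
      chunks.push(current);
      current = para;
    } else {
      current = candidate;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}

// Summarizes each chunk, then summarizes the combined partial summaries.
async function summarizeLong(
  text: string,
  maxChars: number,
  summarize: (input: string) => Promise<string>,
): Promise<string> {
  const chunks = chunkText(text, maxChars);
  if (chunks.length <= 1) return summarize(text);
  const partials: string[] = [];
  for (const chunk of chunks) {
    partials.push(await summarize(chunk)); // sequential: one request at a time
  }
  return summarize(partials.join("\n\n"));
}
```

One limitation of this sketch: the combined partial summaries are not re-chunked, so a very long document could still exceed the limit in the final call.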

🎓 What I Learned

Technical Skills

  1. On-Device AI: Deep understanding of browser-based AI capabilities and limitations
  2. Hybrid Architecture: Designing systems that balance local and cloud processing
  3. Prompt Engineering: Crafting effective prompts for consistent, structured outputs
  4. Type Safety: Building type-safe applications across different languages
  5. Error Handling: Creating robust parsers and validation systems

Design Principles

  1. Privacy by Design: Putting user privacy first in architectural decisions
  2. Progressive Enhancement: Core features work locally, enhanced features use cloud
  3. User Experience: Making complex AI features simple and intuitive
  4. Performance: Optimizing for instant responses with local processing

AI & ML Insights

  1. Model Capabilities: Understanding what's possible with lightweight vs. full models
  2. Prompt Design: The importance of clear instructions and structure examples
  3. Output Validation: Never trust AI responses without validation
  4. Context Management: Working within context window limitations

🌟 What's Next

Planned Features

  • 📊 Progress Tracking: Monitor student performance over time
  • 🤝 Collaboration: Share study sets with classmates
  • 🎯 Adaptive Learning: AI-powered difficulty adjustment based on performance
  • 📱 Mobile App: Native mobile experience with offline capabilities
  • 🌍 Multi-language Support: Support for more languages in content generation
  • 🔊 Audio Summaries: Text-to-speech for accessibility
  • 📈 Analytics Dashboard: Insights for educators and students

Technical Improvements

  • Implement caching for frequently accessed content
  • Add progressive web app (PWA) capabilities
  • Optimize bundle size for faster loading
  • Add comprehensive unit and integration tests
  • Implement real-time collaboration features

🎯 Impact & Vision

This project demonstrates that powerful AI capabilities can run locally, respecting user privacy while providing instant, helpful features. The vision is to make quality educational tools accessible to everyone, regardless of internet connectivity or privacy concerns.

By leveraging Gemini Nano, we're proving that the future of AI isn't just in the cloud - it's right here in your browser, protecting your data while empowering your learning journey.


Acknowledgments

Special thanks to:

  • Google Chrome Team for creating the Built-in AI APIs
  • Google AI Team for Gemini Nano and Gemini API
  • The open-source community for amazing tools and libraries
  • Educators and students who provided feedback and inspiration

Built with ❤️ for the Google Built-in AI Challenge 2025

Built With

  • fastapi
  • geminiapi
  • gemininano
  • langchain
  • next.js
  • pdfplumber
  • pypdf2
  • tailwindcss
  • typescript