Learnable.ai Hackathon Project Description
Inspiration
The idea for Learnable.ai was born from the challenge of pursuing a Master's degree in Germany while juggling a part-time job. To make studying more efficient, I explored AI tools and identified a need for a solution that could transform diverse educational content into digestible resources. This led to Learnable.ai, a tool designed to accelerate learning by using AI to create structured study materials.
What it does
Learnable.ai is an AI-powered educational tool that uses the Google Gemini API to process multimodal inputs: text (PDF, DOCX, plain text), video (YouTube, MP4, Zoom recordings), and audio (MP3, WAV, M4A). It generates:
Summaries: Concise overviews of input content.
Quizzes: Multiple-choice questions (MCQs), true/false, and fill-in-the-blank formats.
Flashcards: Key terms with definitions and explanations for quick review.
Mind Maps: Hierarchical JSON structures visualized using Mermaid.
Gemini’s multimodal capabilities handle all input types, producing tailored learning resources to enhance study efficiency.
How we built it
Learnable.ai was built with a streamlined tech stack centered around the Google Gemini API:
Frontend: React for a dynamic, user-friendly interface.
Backend: Django Ninja for efficient API management.
Multimodal Processing: Gemini API processes text, audio, and video inputs directly, handling transcription, summarization, quiz generation, flashcard creation, and JSON mind map generation.
Mind Map Visualization: Mermaid renders interactive mind maps from Gemini-generated JSON structures.
Storage: Firebase for scalable data storage.
Setup: Integrated the Gemini API using pip install google-generativeai and Python scripts to process inputs and generate outputs.
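
A minimal sketch of what the Gemini integration could look like with the `google-generativeai` package. The prompt wording, the `generate_resource` and `build_prompt` names, the `GEMINI_API_KEY` environment variable, and the default model name are all assumptions for illustration, not the project's actual code.

```python
import os
from typing import Optional

# Hypothetical task prompts; the project's real prompts are not published here.
PROMPTS = {
    "summary": "Summarize the material in at most five bullet points.",
    "quiz": "Write five multiple-choice questions with four options each "
            "and mark the correct answer.",
    "flashcards": "List the key terms, each with a one-sentence definition.",
    "mindmap": "Return only JSON of the form "
               '{"title": str, "children": [...]} for the topic hierarchy.',
}


def build_prompt(task: str) -> str:
    """Look up the instruction text for a supported task."""
    if task not in PROMPTS:
        raise ValueError(f"unknown task: {task}")
    return PROMPTS[task]


def generate_resource(
    task: str,
    text: Optional[str] = None,
    media_path: Optional[str] = None,
    model_name: str = "gemini-1.5-flash",  # assumed model name
) -> str:
    """Send a prompt plus text and/or an uploaded media file to Gemini."""
    # Imported lazily so the prompt helpers work without the SDK installed.
    import google.generativeai as genai

    genai.configure(api_key=os.environ["GEMINI_API_KEY"])
    model = genai.GenerativeModel(model_name)

    parts = [build_prompt(task)]
    if media_path is not None:
        # Audio/video goes through the File API; large videos may need
        # a short wait until the upload has finished processing.
        parts.append(genai.upload_file(media_path))
    if text is not None:
        parts.append(text)

    response = model.generate_content(parts)
    return response.text
```

A call such as `generate_resource("summary", media_path="lecture.mp3")` would then return the model's text output, with no separate transcription step in between.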
The development focused on leveraging Gemini’s multimodal strengths to simplify the pipeline, eliminating the need for additional tools like ffmpeg or Whisper for audio/video processing.
Challenges we ran into
Input Diversity: Ensuring Gemini consistently processed varied inputs (e.g., low-quality audio or lengthy videos) required careful prompt engineering.
JSON Consistency: Generating well-structured JSON for mind maps that Mermaid could render accurately was challenging.
Performance Optimization: Balancing Gemini’s processing speed with output quality for large inputs was a hurdle.
UI Integration: Rendering Mermaid mind maps seamlessly in the React frontend required fine-tuning.
Hackathon Time Constraints: Prioritizing core features while ensuring a polished user experience within the deadline was demanding.
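
One way to approach the JSON consistency problem is to parse and validate the model's reply before it reaches the renderer. The sketch below assumes the mind map is a tree of objects with a `title` string and an optional `children` list, and that the reply may arrive wrapped in a Markdown code fence; both the schema and the helper names are hypothetical.

```python
import json
import re
from typing import Any, List

FENCE = "`" * 3  # Markdown code-fence marker, built here to keep this block intact
_FENCED = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


def extract_json(raw: str) -> Any:
    """Parse JSON from a model reply, unwrapping a code fence if present."""
    match = _FENCED.search(raw)
    payload = match.group(1) if match else raw
    return json.loads(payload)  # raises json.JSONDecodeError on bad output


def validate_node(node: Any, path: str = "root") -> List[str]:
    """Return a list of problems; an empty list means the tree is usable."""
    if not isinstance(node, dict):
        return [f"{path}: expected an object"]
    errors = []
    title = node.get("title")
    if not isinstance(title, str) or not title.strip():
        errors.append(f"{path}: missing or empty 'title'")
    children = node.get("children", [])
    if not isinstance(children, list):
        errors.append(f"{path}: 'children' must be a list")
    else:
        for i, child in enumerate(children):
            errors.extend(validate_node(child, f"{path}.children[{i}]"))
    return errors
```

If parsing fails or `validate_node` reports problems, the backend could retry the request with the error list appended to the prompt instead of passing a broken structure on to Mermaid.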
Accomplishments that we're proud of
Built a fully multimodal tool using Gemini to process text, audio, and video inputs into diverse learning resources.
Successfully generated and visualized hierarchical mind maps using Gemini’s JSON output and Mermaid.
Created an intuitive React interface for easy user interaction.
Streamlined the tech stack by relying on Gemini’s multimodal capabilities, reducing dependency on external processing tools.
Delivered a scalable solution with Django Ninja and Firebase, ready for future expansion.
What we learned
Multimodal AI: Gained expertise in using Gemini’s multimodal capabilities for text, audio, and video processing.
Prompt Engineering: Learned to craft precise prompts to ensure consistent Gemini outputs for summaries, quizzes, flashcards, and JSON structures.
Visualization: Mastered integrating Mermaid with Gemini-generated JSON for interactive mind maps.
Efficient Development: Developed strategies for rapid prototyping and debugging under hackathon constraints.
User Experience: Understood the importance of seamless UI for educational tools to maximize usability.
What's next for Learnable.ai
Feature Expansion: Add real-time collaboration, adaptive quizzes, and personalized learning paths.
Enhanced Multimodal Support: Improve the handling of longer or noisier audio/video inputs.
Mobile App: Develop iOS and Android apps for accessible learning.
Multilingual Capabilities: Leverage Gemini for non-English content processing and output.
Analytics: Introduce a dashboard to track study progress and optimize resource generation.
Community Engagement: Open-source parts of the codebase to foster collaborative development.