System architecture

Inspiration

Educational videos are everywhere, but real learning often isn't. Learners tend to watch passively without retaining or truly understanding the content. At the same time, self-driven learners have their own goals, paces, and preferences, yet most platforms treat them all the same. This insight sparked the idea for Cognito Stream, a platform that reimagines video learning. By breaking content down into interactive, personalized learning experiences, we're transforming passive viewing into active mastery.

What it does

Cognito Stream ingests any educational video and creates a personalized learning experience around it. Key features include:
• Transcribe the video using Gemini models
• Extract key concepts and learning objectives
• Generate concise summaries
• Build personalized quizzes and flashcards
• Recommend adaptive learning paths
• Maintain a persistent learner profile
• Offer multilingual support and accessibility options

How we built it

Under the hood, our platform is built on a modular architecture. We use the YouTube API for video input and transcript extraction. The core intelligence comes from Google's Agent Development Kit (ADK) and Gemini models, which we tailored with custom prompting for education-focused output. This lets us chunk content intelligently, generate questions, and craft personalized recommendations. The user interface is built with Streamlit, which provides a clean, responsive experience. We implemented a persistent user profile and learning history, so progress and preferences are saved across sessions. One of our key achievements was integrating these AI capabilities into a coherent, real-time learning ecosystem that makes complex AI accessible for everyday learning. We deployed the app to Google App Engine and secured it with Identity-Aware Proxy (IAP).

Challenges we ran into

Developing the AI-Enhanced Video Learning Platform within the demanding environment of a hackathon presented several significant challenges that pushed our team's problem-solving capabilities:

Ensuring Robust Data Persistence & Scalability: Our primary hurdle was establishing true, cross-session user data persistence. While our vision included comprehensive progress tracking and personalized learning paths, integrating a full-fledged backend database solution (as described in our long-term roadmap) was beyond the scope of a hackathon timeframe. This meant making a conscious trade-off, relying on Streamlit's session state, which resets, limiting the long-term user experience and the platform's scalability for a large user base.
Taming Generative AI for Structured Output: Leveraging Google's Gemini models was central to our platform, but integrating generative AI for highly structured outputs (like JSON for quizzes, summaries, and learning paths) proved complex. We faced challenges in consistently prompting the models to return the exact format required, handling potential malformed responses, and building robust fallback mechanisms to ensure a smooth user experience even when the AI didn't perfectly adhere to the schema.
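One way to handle the malformed responses described above is to parse defensively and drop anything that does not match the expected shape. The following is a minimal sketch, not the project's actual code: the quiz schema (`question`, `options`, `answer`), the `questions` wrapper key, and the function names are all assumptions made for illustration. It works on raw model text only and makes no API calls.

```python
import json
import re
from typing import Any

# Hypothetical schema: the write-up does not specify the real quiz format.
REQUIRED_KEYS = {"question", "options", "answer"}

# Markdown code-fence marker, built indirectly to keep this listing readable.
FENCE = "`" * 3
FENCED_BLOCK = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


def extract_json(raw: str) -> Any:
    """Parse model output that may be fenced or surrounded by prose."""
    fenced = FENCED_BLOCK.search(raw)
    candidate = fenced.group(1) if fenced else raw
    try:
        return json.loads(candidate)
    except json.JSONDecodeError:
        pass
    # Last resort: take the outermost [...] or {...} span and try again.
    starts = [i for i in (candidate.find("["), candidate.find("{")) if i != -1]
    end = max(candidate.rfind("]"), candidate.rfind("}"))
    if not starts or end <= min(starts):
        raise ValueError("no JSON found in model output")
    return json.loads(candidate[min(starts):end + 1])


def parse_quiz(raw: str) -> list[dict[str, Any]]:
    """Return only well-formed quiz items; an empty list is the fallback."""
    try:
        data = extract_json(raw)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return []
    if isinstance(data, dict):
        data = data.get("questions", [])
    if not isinstance(data, list):
        return []
    return [
        item for item in data
        if isinstance(item, dict)
        and REQUIRED_KEYS <= item.keys()
        and isinstance(item["options"], list)
        and item["answer"] in item["options"]
    ]
```

Returning an empty list rather than raising lets the UI show a "could not generate a quiz, try again" message instead of crashing. A stricter variant could re-prompt the model with the validation error.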
Managing Content Processing at Scale: Processing extensive video transcripts and feeding them into AI models introduced challenges related to context window limitations and API rate limits. For longer videos, we had to devise strategies like intelligent chunking and sampling to ensure the AI could process the content effectively without exceeding token limits, which sometimes meant making compromises on the depth of analysis for very lengthy materials.
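The chunking and sampling strategy mentioned above could look roughly like the sketch below. It is an illustration under stated assumptions, not the team's implementation: word counts stand in for model tokens (a real system would use the model's own token counter), and the function names, default sizes, and overlap value are hypothetical.

```python
def chunk_transcript(text: str, max_words: int = 800, overlap: int = 50) -> list[str]:
    """Split a transcript into overlapping word windows.

    Word count is only a rough proxy for tokens; the limits here are
    placeholder values, not figures from the project.
    """
    if max_words <= 0 or not 0 <= overlap < max_words:
        raise ValueError("need max_words > 0 and 0 <= overlap < max_words")
    words = text.split()
    step = max_words - overlap
    chunks: list[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the transcript
    return chunks


def sample_chunks(chunks: list[str], limit: int) -> list[str]:
    """Keep at most `limit` evenly spaced chunks, always including the first."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    if len(chunks) <= limit:
        return list(chunks)
    stride = len(chunks) / limit
    return [chunks[int(i * stride)] for i in range(limit)]
```

The overlap keeps a sentence that straddles a boundary visible in both neighbouring chunks. Sampling is where the depth trade-off for very long videos shows up: chunks that are skipped are never seen by the model.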
Balancing Ambitious Vision with Hackathon Constraints: Our product vision was expansive, aiming for a truly adaptive and personalized learning ecosystem. The compressed timeline of the hackathon meant we had to prioritize core functionalities. This led to some features, such as advanced video metadata retrieval, full UI customization application, and comprehensive data management logic (e.g., the "reset marker file"), being designed but not fully implemented to the depth outlined in our initial concept.

Accomplishments that we're proud of

Building the AI-Enhanced Video Learning Platform within the hackathon's intense environment allowed us to achieve several milestones that we are particularly proud of:
• Successfully Harnessing Advanced AI for Dynamic Content Transformation: We are incredibly proud of our ability to leverage Google Gemini AI models to not just generate text, but to intelligently analyze video content and transform it into structured, actionable learning materials. This includes multi-level summaries, interactive quizzes, dynamic flashcards, and personalized learning path recommendations, fundamentally changing passive consumption into active engagement.
• Developing a Robust and Scalable Modular Architecture: Our agent-based design allowed us to build a complex system with clear separation of concerns. This modularity not only accelerated our development within the hackathon's tight constraints but also ensures the platform is highly extensible and maintainable for future enhancements and integration of new AI capabilities.
• Delivering an Intuitive and Engaging User Experience: Despite the sophisticated AI operating in the background, we prioritized a clean, user-friendly interface. Our rapid iteration with Streamlit allowed us to create an intuitive workflow that makes complex AI features accessible and enjoyable for learners of all levels, demonstrating that powerful technology can also be user-friendly.
• Rapid Prototyping and End-to-End Deployment: From concept to a fully functional, AI-powered web application deployed securely on Google App Engine with Identity-Aware Proxy (IAP), we demonstrated exceptional agility. Delivering a comprehensive solution that integrates multiple advanced technologies within a hackathon timeframe is a testament to our team's efficiency, technical prowess, and collaborative spirit.
• Pioneering Personalized Learning: We successfully demonstrated the potential for AI to adapt educational content to individual learning styles and goals, moving beyond a one-size-fits-all approach. This personalization is a core differentiator and a significant step towards our vision for the future of education.

What we learned

The Art of Prompt Engineering is Paramount: We learned that the success of AI-driven applications heavily relies on meticulous prompt engineering. Crafting clear, concise, and well-structured prompts, especially when aiming for specific JSON outputs, was an iterative and often challenging process. We gained a deeper appreciation for how subtle changes in phrasing or instruction can significantly impact the quality and format of AI-generated content.
Balancing Vision with Practical Constraints: A hackathon environment forces critical decision-making. We had an ambitious vision for a fully persistent and scalable platform, but quickly learned to prioritize core functionalities that could be delivered within the tight timeframe. This meant making strategic trade-offs, such as initially relying on session state for user data rather than a full-fledged database, to ensure a working prototype.
Modularity Accelerates Development: Our agent-based architecture proved invaluable. By encapsulating distinct AI functionalities into separate modules (e.g., QuizAgent, SummarizerAgent), we could develop, test, and debug components independently. This modularity not only streamlined our workflow but also made the codebase more manageable and easier to extend.
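The separation described above can be sketched in plain Python. This is not the Google Agent Development Kit API and not the project's code. It is a hypothetical illustration in which each agent wraps one prompt and a `Model` callable stands in for the Gemini call, so agents can be tested with a fake model. Only the names `QuizAgent` and `SummarizerAgent` come from the write-up; the prompts, fields, and `Pipeline` class are assumptions.

```python
from dataclasses import dataclass, field
from typing import Callable, Protocol

# Stand-in for a model call: prompt in, text out (hypothetical interface).
Model = Callable[[str], str]


class Agent(Protocol):
    """Anything with a name and a run() method can join the pipeline."""
    name: str

    def run(self, transcript: str) -> str: ...


@dataclass
class SummarizerAgent:
    model: Model
    name: str = "summary"

    def run(self, transcript: str) -> str:
        return self.model(f"Summarize for a learner:\n{transcript}")


@dataclass
class QuizAgent:
    model: Model
    name: str = "quiz"
    num_questions: int = 3

    def run(self, transcript: str) -> str:
        return self.model(
            f"Write {self.num_questions} quiz questions about:\n{transcript}"
        )


@dataclass
class Pipeline:
    agents: list[Agent] = field(default_factory=list)

    def run(self, transcript: str) -> dict[str, str]:
        # Each agent is independent, so one can be swapped or tested alone.
        return {agent.name: agent.run(transcript) for agent in self.agents}
```

Because every agent takes the model as a constructor argument, a unit test can pass a lambda that echoes the prompt and check the prompt text without any network access.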
Streamlit's Power and Limitations: Streamlit was a game-changer for rapid prototyping and building an interactive UI with Python. It allowed us to focus on the AI logic and immediate user feedback. However, we also learned about its limitations for complex, multi-user applications requiring robust data persistence and intricate UI customizations, which will inform our future architectural decisions.
The Importance of Robust Error Handling: When working with external APIs and generative AI, unexpected responses and errors are inevitable. We learned the critical importance of implementing comprehensive error handling and graceful fallback mechanisms to ensure the application remains stable and provides a coherent user experience even when external services don't behave as expected.
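A common pattern for the graceful fallback described above is to retry a flaky call with exponential backoff and return a safe default if every attempt fails. The helper below is a minimal sketch under assumptions: the name `call_with_fallback`, the retry count, and the delays are illustrative and are not taken from the project.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_fallback(
    fn: Callable[[], T],
    fallback: T,
    retries: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn up to `retries` times, doubling the wait after each failure.

    Returns `fallback` if every attempt raises. `sleep` is injectable so
    tests do not have to wait.
    """
    for attempt in range(retries):
        try:
            return fn()
        except Exception:
            # A real app would catch only the API client's error types
            # and log the failure before retrying.
            if attempt < retries - 1:
                sleep(base_delay * 2 ** attempt)
    return fallback
```

For example, a summary request could be wrapped as `call_with_fallback(lambda: summarize(chunk), fallback="Summary unavailable.")`, where `summarize` is whatever function performs the model call.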
Agile Iteration is Key: The hackathon's fast pace reinforced the value of agile development. We continuously iterated on features, gathered feedback (even if just internal), and adapted our plans based on discoveries and challenges encountered, leading to a more refined and functional product.

What's next for Cognito Stream

Enhanced Content Versatility:
o Direct File Uploads & Multi-Platform Support: Beyond YouTube, we aim to integrate support for direct video file uploads (e.g., MP4, MOV) and expand compatibility to other popular video hosting platforms (e.g., Vimeo, educational portals). This will significantly broaden the range of content users can leverage.
o PDF Document Processing Integration: To create a truly comprehensive learning hub, we plan to extend our AI processing capabilities to static documents like PDFs. This would allow users to upload textbooks, research papers, or lecture notes and generate summaries, quizzes, and flashcards directly from text-based content.
Advanced Learning & Collaborative Features:
o Custom Quiz Creation Tools: Empowering users to create their own quizzes from processed content or even from scratch, fostering active recall and knowledge consolidation.
o Collaborative Learning Capabilities: Introducing features that allow users to share learning paths, collaborate on study materials, or engage in group discussions, transforming individual learning into a community experience.
o Gamification & Achievement System: Implementing badges, points, and leaderboards to further motivate learners and provide tangible recognition for their progress and milestones.
Robustness & Scalability:
o Database Integration for Persistent User Data: This is a critical next step. Moving from temporary session state to a robust backend database will enable true persistence of user profiles, learning history, progress, and personalized recommendations across sessions and devices, fulfilling a core promise of the platform.
o Improved AI Model Management: Implementing more sophisticated model selection and fine-tuning capabilities, allowing for dynamic switching between models based on task complexity, user preference (e.g., speed vs. accuracy), and cost efficiency.
User Experience & Accessibility Enhancements:
o Mobile-Responsive UI: Optimizing the user interface for seamless experience across various devices, including smartphones and tablets, ensuring learning is accessible anytime, anywhere.
o Offline Functionality: Exploring options for users to download processed learning content (summaries, flashcards) for offline study, enhancing flexibility and accessibility in low-connectivity environments.
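As a rough idea of what the database integration item in this roadmap could start from, here is a minimal sketch of a profile store built on Python's standard `sqlite3` module. Everything in it is an assumption for illustration: the `ProfileStore` class, the table layout, and the choice of SQLite itself. Local disk on platforms such as App Engine or Cloud Run is generally not durable, so a deployed version would likely need a managed database behind the same interface.

```python
import json
import sqlite3


class ProfileStore:
    """Keeps one JSON document per user so profiles survive a session reset."""

    def __init__(self, path: str = "profiles.db") -> None:
        # Hypothetical file name; ":memory:" works for tests.
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS profiles ("
            "user_id TEXT PRIMARY KEY, data TEXT NOT NULL)"
        )
        self.conn.commit()

    def save(self, user_id: str, profile: dict) -> None:
        # The connection context manager commits on success, rolls back on error.
        with self.conn:
            self.conn.execute(
                "INSERT OR REPLACE INTO profiles (user_id, data) VALUES (?, ?)",
                (user_id, json.dumps(profile)),
            )

    def load(self, user_id: str) -> dict:
        row = self.conn.execute(
            "SELECT data FROM profiles WHERE user_id = ?", (user_id,)
        ).fetchone()
        return json.loads(row[0]) if row else {}
```

In a Streamlit app, the session state would then act as a cache: load the profile once at the start of a session and call `save` whenever progress changes.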

Built With

appengine
gemini
googleadk
iap
streamlit
youtube

Updates

Sampann Nigam posted an update — Jun 23, 2025 10:19 PM EDT

Besides deploying on Google App Engine, we also deployed the app on Google Cloud Run for better scalability. Cloud Run version: https://video-learning-831489983730.us-central1.run.app/

Ipsita Nanda started this project — Jun 23, 2025 07:47 PM EDT
