Inspiration
Cramming for exams or hunting for one specific detail? Struggling to find the right moment in a YouTube video?
We hear you. While YouTube is a treasure trove of knowledge, sifting through the entire video for specific topics during quick revision can be a pain.
This is why we envisioned an application that transforms videos into concise notes, making revision a breeze.
What it does
This application tackles the challenge of video-based learning by automatically converting YouTube videos into transcripts. These transcripts are then intelligently transformed into well-formatted markdown files.
Markdown's popularity in note-taking and writing stems from its simplicity and clean aesthetics, making it a perfect choice for presenting key information gleaned from videos.
How we built it
- Extracting the Essence: Upon receiving a YouTube video link, the application cleverly extracts the audio track.
- Transcription Magic: The extracted audio is then fed into AssemblyAI, a popular and reliable service, for accurate transcription.
- From Speech to Structured Notes: The transcribed text is passed on to the mighty Gemini 1.5, a state-of-the-art AI model, which transforms it into well-organized markdown notes.
- Seamless Delivery: Finally, the generated markdown file is automatically downloaded and presented to you, ready for efficient revision.
- Deployment: The application is deployed with Streamlit.
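The steps above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not DoneNote's actual code. The write-up does not say how the audio is downloaded, so `yt-dlp` is assumed here. The `gemini-1.5-pro` model name, the prompt wording, the environment variable names, and every function name are also illustrative. Third-party packages are imported inside the functions that use them, so the pure helpers can be tried without any API keys.

```python
import os
import re


def download_audio(video_url: str, out_dir: str = ".") -> str:
    """Download the best available audio stream and return its local path."""
    import yt_dlp  # assumption: the write-up does not name the downloader

    options = {
        "format": "bestaudio/best",
        "outtmpl": os.path.join(out_dir, "%(id)s.%(ext)s"),
        "quiet": True,
    }
    with yt_dlp.YoutubeDL(options) as ydl:
        info = ydl.extract_info(video_url, download=True)
        return ydl.prepare_filename(info)


def transcribe(audio_path: str) -> str:
    """Send a local audio file to AssemblyAI and return the transcript text."""
    import assemblyai as aai

    aai.settings.api_key = os.environ["ASSEMBLYAI_API_KEY"]  # assumed variable name
    transcript = aai.Transcriber().transcribe(audio_path)
    if transcript.status == aai.TranscriptStatus.error:
        raise RuntimeError(transcript.error)
    return transcript.text


def build_notes_prompt(transcript: str) -> str:
    """Wrap the transcript in note-taking instructions (hypothetical wording)."""
    return (
        "Rewrite the following video transcript as concise revision notes "
        "in Markdown, using headings and bullet points.\n\n"
        "Transcript:\n" + transcript.strip()
    )


def transcript_to_markdown(transcript: str) -> str:
    """Call Gemini directly through Google's SDK, with no LangChain layer."""
    import google.generativeai as genai

    genai.configure(api_key=os.environ["GOOGLE_API_KEY"])  # assumed variable name
    model = genai.GenerativeModel("gemini-1.5-pro")  # assumed model variant
    response = model.generate_content(build_notes_prompt(transcript))
    return response.text


def notes_filename(title: str) -> str:
    """Turn a video title into a safe Markdown file name."""
    slug = re.sub(r"[^A-Za-z0-9]+", "-", title).strip("-").lower()
    return (slug or "notes") + ".md"


def video_to_notes(video_url: str) -> str:
    """Run the pipeline: audio download, transcription, then note generation."""
    audio_path = download_audio(video_url)
    return transcript_to_markdown(transcribe(audio_path))


def run_app() -> None:
    """Streamlit front end: paste a link, read the notes, download the file."""
    import streamlit as st

    st.title("DoneNote")
    url = st.text_input("YouTube video link")
    if st.button("Generate notes") and url:
        # Keep the result in session state so it survives Streamlit reruns.
        st.session_state["notes"] = video_to_notes(url)
    notes = st.session_state.get("notes")
    if notes:
        st.markdown(notes)
        st.download_button(
            "Download notes (.md)",
            data=notes,
            file_name="notes.md",
            mime="text/markdown",
        )
```

Calling `run_app()` at the bottom of a script launched with `streamlit run` would start the interface. Storing the notes in `st.session_state` is a design choice made for this sketch, so that the notes stay on screen after the download button triggers a rerun.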
Challenges we ran into
- Taming the Audio Stream: Downloading YouTube audio directly can be tricky.
- The Quest for Accurate Transcription: Finding an AI model that consistently delivered high-fidelity transcripts proved to be a challenge. We experimented with various options before settling on AssemblyAI, known for its impressive accuracy.
- From Verbatim to Valuable Notes: Transforming lengthy transcripts into concise and well-structured markdown notes required a powerful AI with exceptional context comprehension. Thankfully, Gemini 1.5 stepped up to the challenge, transforming raw text into notes that make revision a breeze.
Accomplishments that we're proud of
- Cracking the Transcription Challenge: Overcoming the hurdle of transcribing YouTube video content is a win in itself.
- Taming the Power of Gemini 1.5: We successfully harnessed the capabilities of Gemini 1.5, a powerful AI model, by calling it directly rather than through helper libraries such as LangChain and LlamaIndex. This demonstrates our ability to work with cutting-edge technology and achieve our goals.
What we learned
- AssemblyAI in Action: We delved into the world of AssemblyAI, a powerful tool for audio transcription, and gained insights into its functionalities and potential applications.
- Unlocking Gemini 1.5's Potential: We explored the capabilities of Gemini 1.5, Google's latest AI model at the time, and discovered effective methods for using it, even without relying on helper libraries such as LangChain and LlamaIndex. This broadened our understanding of cutting-edge AI models and their potential to solve real-world problems.
What's next for DoneNote
The Future of DoneNote
- Interactive Learning: We envision transforming DoneNote into a more interactive and engaging platform. Flask, a popular Python framework, will be instrumental in achieving this goal.
- Visual Learning Powerhouse: We're exploring the potential of image generation to create visual representations alongside the markdown notes. This will cater to learners who benefit from a multimodal approach to grasping information.
Built With
- assemblyai
- python
- streamlit





