Inspiration

Cramming for exams or hunting for one detail? Struggling to find the right point in a YouTube video?

We hear you. While YouTube is a treasure trove of knowledge, sifting through the entire video for specific topics during quick revision can be a pain.

This is why we envisioned an application that transforms videos into concise notes, making revision a breeze.

What it does

This application tackles the challenge of video-based learning by automatically converting YouTube videos into transcripts. These transcripts are then intelligently transformed into well-formatted markdown files.

Markdown's popularity in note-taking and writing stems from its simplicity and clean aesthetics, making it a perfect choice for presenting key information gleaned from videos.

How we built it

  • Extracting the Essence: Upon receiving a YouTube video link, the application cleverly extracts the audio track.
  • Transcription Magic: The extracted audio is then fed into AssemblyAI, a popular and reliable service, for accurate transcription.
  • From Speech to Structured Notes: The transcribed text is passed on to the mighty Gemini 1.5, a state-of-the-art AI model, which transforms it into well-organized markdown notes.
  • Seamless Delivery: Finally, the generated markdown file is automatically downloaded and presented to you, ready for efficient revision.
  • Deployment: The application is deployed with Streamlit.
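
The steps above can be sketched as a small pipeline. This is only an illustration of the flow, not our actual source: the names `video_to_notes` and `NotesResult` are hypothetical, and each stage is passed in as a plain function so the sketch runs offline with stand-ins in place of the audio downloader, AssemblyAI, and Gemini 1.5.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class NotesResult:
    """Hypothetical container for the file offered as a download."""
    filename: str
    markdown: str


def video_to_notes(
    url: str,
    extract_audio: Callable[[str], str],
    transcribe: Callable[[str], str],
    to_markdown: Callable[[str], str],
    filename: str = "notes.md",
) -> NotesResult:
    """Run the three stages in order and package the result."""
    audio_path = extract_audio(url)      # stage 1: YouTube link -> local audio file
    transcript = transcribe(audio_path)  # stage 2: audio file -> plain text
    if not transcript.strip():
        raise ValueError("transcription returned no text")
    markdown = to_markdown(transcript)   # stage 3: plain text -> markdown notes
    return NotesResult(filename=filename, markdown=markdown)


# Stand-in stages (assumptions for illustration only). In the real app these
# would wrap the audio extraction step, AssemblyAI, and Gemini 1.5.
def fake_extract_audio(url: str) -> str:
    return "audio.mp3"


def fake_transcribe(audio_path: str) -> str:
    return "Binary search halves the search range on every step."


def fake_to_markdown(transcript: str) -> str:
    return "# Notes\n\n- " + transcript


result = video_to_notes(
    "https://www.youtube.com/watch?v=VIDEO_ID",  # placeholder link
    fake_extract_audio,
    fake_transcribe,
    fake_to_markdown,
)
print(result.markdown)
```

Keeping the stages as interchangeable functions means one service can be swapped for another without touching the rest of the flow, and the returned `NotesResult` holds everything a download widget in the Streamlit front end would need.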

Challenges we ran into

  1. Taming the Audio Stream: Downloading YouTube audio directly can be tricky.
  2. The Quest for Accurate Transcription: Finding an AI model that consistently delivered high-fidelity transcripts proved difficult. We experimented with several options before settling on AssemblyAI, known for its impressive accuracy.
  3. From Verbatim to Valuable Notes: Transforming lengthy transcripts into concise and well-structured markdown notes required a powerful AI with exceptional context comprehension. Thankfully, Gemini 1.5 stepped up to the challenge, transforming raw text into notes that make revision a breeze.
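
For the third challenge, much of the work sits in the instructions sent along with the transcript. The sketch below shows one way such a prompt could be assembled; the function name `build_notes_prompt` and the wording of the rules are assumptions for illustration, not the prompt we actually shipped.

```python
def build_notes_prompt(transcript: str, title: str = "Untitled video") -> str:
    """Assemble a hypothetical instruction prompt for turning a transcript into notes."""
    # Collapse line breaks and repeated spaces left over from transcription.
    cleaned = " ".join(transcript.split())
    rules = [
        "Output markdown only, with no preamble.",
        "Start with a single top-level heading naming the topic.",
        "Group related ideas under second-level headings.",
        "Use short bullet points instead of full paragraphs.",
        "Leave out filler, greetings, and sponsor messages.",
    ]
    rule_lines = "\n".join(f"- {rule}" for rule in rules)
    return (
        f"Turn the transcript of the video '{title}' into concise revision notes.\n\n"
        f"Rules:\n{rule_lines}\n\n"
        f"Transcript:\n{cleaned}"
    )


prompt = build_notes_prompt(
    "Today we cover   binary search.\nIt halves the range each step.",
    title="Binary search basics",
)
print(prompt)
```

The resulting string would be sent to the model as a single request, and the reply saved as the markdown file.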

Accomplishments that we're proud of

  1. Cracking the Transcription Challenge: Overcoming the hurdle of transcribing YouTube video content is a win in itself.
  2. Taming the Power of Gemini 1.5: We successfully harnessed the capabilities of Gemini 1.5, a powerful AI model, even without the aid of specific libraries like langchain and llama-index. This demonstrates our ability to work with cutting-edge technology and achieve our goals.

What we learned

  • AssemblyAI in Action: We delved into the world of AssemblyAI, a powerful tool for audio transcription, and gained insights into its functionalities and potential applications.
  • Unlocking Gemini 1.5's Potential: We explored the capabilities of Gemini 1.5, Google's latest AI model, and discovered effective methods for utilizing it, even without relying on specific libraries like langchain and llama-index. This broadened our understanding of cutting-edge AI models and their potential to solve real-world problems.

What's next for DoneNote

The Future of DoneNote

  • Interactive Learning: We envision transforming DoneNote into a more interactive and engaging platform. Flask, a popular Python framework, will be instrumental in achieving this goal.
  • Visual Learning Powerhouse: We're exploring the potential of image generation to create visual representations alongside the markdown notes. This will cater to learners who benefit from a multimodal approach to grasping information.
