Inspiration
I like learning and wanted to do it more efficiently, so I looked for a way to pre-study video material quickly: to check whether a video actually interests me and to get a rough idea of its content before committing to a full watch. The core idea was to shift from a linear video timeline to a non-linear Knowledge Graph, allowing users to quickly navigate the video's content.
What it does
MelMap transforms YouTube videos into interactive Knowledge Graphs, making it easier to digest complex information.
- Visual Knowledge Compass: It converts video transcripts into a hierarchical graph of nodes (Chapters, Concepts, Evidence), allowing you to virtually "walk" through the video's ideas.
- Navigation: Instead of scrubbing a timeline, you click on nodes to see the specific context and timestamp.
- AI Tutor: An AI-powered chat assistant that has the full context of the video. It acts as a personal tutor, answering questions and clarifying specific nodes.
- Interactive Quizzes: The app generates multiple-choice questions to test your understanding of the relationships between two concepts.
- Summaries: Automatically generates narrative summaries for specific clusters of information to give you the gist before you dive deep.
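The node-click navigation described above can be sketched with a toy graph: instead of scrubbing, a click resolves a node to its timestamp and transcript context. This is a minimal illustration with hypothetical field names, not MelMap's actual data model:

```python
# Minimal sketch of non-linear navigation: clicking a node jumps to its
# transcript context instead of scrubbing the timeline (names hypothetical).
graph = {
    "nodes": [
        {"id": "ch1", "type": "Chapter", "label": "Introduction", "start": 0.0},
        {"id": "c1", "type": "Concept", "label": "Attention", "start": 42.5,
         "context": "Attention lets the model weigh input tokens..."},
    ],
    "edges": [{"source": "ch1", "target": "c1"}],
}

def on_node_click(graph, node_id):
    """Return the timestamp and transcript context for a clicked node."""
    node = next(n for n in graph["nodes"] if n["id"] == node_id)
    return node["start"], node.get("context", "")

print(on_node_click(graph, "c1"))  # → (42.5, 'Attention lets the model weigh input tokens...')
```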
How it was built
The project connects a modern frontend visualization with a powerful AI backend.
Backend:
- FastAPI: Used for building a high-performance, asynchronous REST API.
- Google Gemini (Flash): The brain of the operation. It processes massive transcripts, extracts structured knowledge (JSON), generates quizzes, and powers the "Professor Proxy" chatbot.
- YouTube Transcript API: To extract the raw text data from videos.
- Pydantic: For robust data validation and defining the Graph schema (Nodes, Edges).
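As a rough sketch of what the Pydantic graph schema could look like (field names are illustrative, not the actual MelMap models), validating Gemini's raw JSON against typed models catches malformed output early:

```python
from enum import Enum
from typing import List, Optional
from pydantic import BaseModel

class NodeType(str, Enum):
    CHAPTER = "Chapter"
    CONCEPT = "Concept"
    EVIDENCE = "Evidence"

class Node(BaseModel):
    id: str
    type: NodeType
    label: str
    timestamp: float            # seconds into the video
    summary: Optional[str] = None

class Edge(BaseModel):
    source: str                 # id of the parent node
    target: str                 # id of the child node

class KnowledgeGraph(BaseModel):
    nodes: List[Node]
    edges: List[Edge]

# Validating the model's raw JSON raises a clear error if the structure
# or hierarchy labels are wrong, instead of failing later in the UI.
raw = '{"nodes": [{"id": "n1", "type": "Concept", "label": "Attention", "timestamp": 42.5}], "edges": []}'
graph = KnowledgeGraph.model_validate_json(raw)
```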
Frontend:
- React + Vite: For a fast, responsive user interface.
- React Flow: The core library used to render the interactive, zoomable knowledge graph.
- TailwindCSS: For modern, clean, and responsive styling.
- Dagre: Utilized for the automatic layout algorithms that organize the graph nodes logically.
Challenges I ran into
- Prompt Engineering: Tuning the Gemini system prompt to consistently output valid JSON graphs with the correct hierarchy (Chapters -> Concepts -> Evidence) was tricky.
- Graph Visualization: Preventing the visual "Spaghetti Monster" effect where too many nodes create a chaotic web.
- Context Management: Ensuring the "AI Tutor" had enough context from the transcript without exceeding token limits or hallucinating information was a balancing act.
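One way to handle the context-management balancing act is to send the tutor only the transcript segments nearest the node being discussed, capped by a rough token budget. The sketch below is an assumption about the approach (the 4-characters-per-token heuristic and function names are mine, not MelMap's actual code):

```python
def select_context(segments, node_time, max_tokens=2000):
    """Pick transcript segments closest to a node's timestamp until a
    rough token budget is exhausted (approx. 4 characters per token)."""
    budget = max_tokens * 4  # character budget
    ranked = sorted(segments, key=lambda s: abs(s["start"] - node_time))
    chosen = []
    for seg in ranked:
        if budget - len(seg["text"]) < 0:
            break
        budget -= len(seg["text"])
        chosen.append(seg)
    # Re-sort chronologically so the tutor reads the excerpt in order.
    return sorted(chosen, key=lambda s: s["start"])

segments = [
    {"start": 0.0, "text": "Welcome to the lecture."},
    {"start": 40.0, "text": "Attention weighs each input token."},
    {"start": 90.0, "text": "Now for something unrelated."},
]
print([s["start"] for s in select_context(segments, node_time=42.5)])
```

Grounding the tutor in only the relevant excerpt both stays under token limits and reduces the room for hallucination.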
Accomplishments that I'm proud of
- Video-to-Graph Conversion: Successfully creating a pipeline that takes a URL and outputs a navigable map.
- Educational Utility: The "Quiz" feature tests the understanding of how concepts are related.
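The first step of that URL-to-map pipeline, turning a pasted URL into the video ID the transcript API needs, can be sketched like this (a simplified helper, not MelMap's exact code; the transcript fetch is shown commented out since it requires network access, and its exact call shape depends on the youtube-transcript-api version):

```python
from urllib.parse import urlparse, parse_qs

def extract_video_id(url: str) -> str:
    """Pull the video ID out of common YouTube URL shapes."""
    parsed = urlparse(url)
    if parsed.hostname == "youtu.be":        # short links: youtu.be/<id>
        return parsed.path.lstrip("/")
    if parsed.path == "/watch":              # youtube.com/watch?v=<id>
        return parse_qs(parsed.query)["v"][0]
    raise ValueError(f"Unrecognized YouTube URL: {url}")

video_id = extract_video_id("https://www.youtube.com/watch?v=dQw4w9WgXcQ")
# The transcript is then fetched and handed to Gemini, e.g.:
# from youtube_transcript_api import YouTubeTranscriptApi
# transcript = YouTubeTranscriptApi.get_transcript(video_id)
```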
What's next for MelMap
- Deployment & Proxying: Currently local-only. The next step is setting up a robust backend proxy for the YouTube API to avoid rate limits and IP blocks in a production environment.
- Persistent Storage: Adding a database (like Neo4j or PostgreSQL) to save generated maps so users don't have to re-process the same video.
- Multi-Video Maps: Linking related videos into a larger "super-graph" for entire courses or playlists.
Built With
- css3
- dagre
- fastapi
- google-gemini-api
- html5
- pydantic
- python
- react
- react-flow
- tailwind-css
- typescript
- vite
- youtubetranscriptapi