Inspiration

We've all been there: halfway through a 2-hour lecture or podcast, realising we've absorbed almost nothing. The problem isn't attention span, it's signal density. Long-form video content is emotionally flat on the surface but rich underneath: a professor's tone shifts when something is truly critical, a debater's cadence tightens when they're uncertain, an interviewer leans into excitement at a breakthrough moment. None of that subtext is visible in a transcript or a progress bar. We built ToneTrack because we wanted to watch smarter, not harder, and to know how something is being said, not just what.
What it does

ToneTrack is a Chrome extension that runs silently alongside any YouTube video. The moment you land on a watch page, it injects a sleek glassmorphic sidebar to the right of the player and gets to work. It pulls the video's auto-generated transcript directly from YouTube's page data (no YouTube API key, no scraping friction), chunks it into ~2-minute segments, and sends each chunk to the Claude API. Claude tags every segment with one emotional tone (cautious, analytical, excited, hopeful, concerned, or neutral) and returns a one-line summary.

The sidebar then renders the full transcript colour-coded by emotion: amber for cautious moments, coral for concern, teal for hope, purple for dense analytical content. As the video plays, the active line scrolls into view and highlights in sync. A live summary card at the top updates every two minutes. At the bottom, an emotion timeline bar maps the entire video's emotional arc at a glance.

Everything is stored locally via chrome.storage.local, keyed by video ID. No login. No backend. Nothing leaves your device except the transcript text sent to the API.
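To make the "~2-minute segments" step concrete, here is a minimal sketch of one way to group caption cues into fixed time windows. It is not ToneTrack's actual code: the `Cue` and `Segment` shapes and the `chunkCues` name are hypothetical, and YouTube's real caption data is structured differently.

```typescript
// Hypothetical shape for one caption cue (an assumption for this sketch).
interface Cue {
  start: number; // seconds from the start of the video
  text: string;
}

interface Segment {
  start: number; // start time of the first cue in the segment
  end: number;   // start time of the last cue in the segment
  text: string;  // cue texts joined with single spaces
}

// Group cues into fixed windows of `windowSeconds` (120 s is "~2 minutes").
// Windows that contain no cues produce no segment.
function chunkCues(cues: Cue[], windowSeconds = 120): Segment[] {
  const segments: Segment[] = [];
  // Sort a copy so the caller's array is left untouched.
  const sorted = [...cues].sort((a, b) => a.start - b.start);
  let currentWindow = -1;
  for (const cue of sorted) {
    const windowIndex = Math.floor(cue.start / windowSeconds);
    const last = segments[segments.length - 1];
    if (last === undefined || windowIndex !== currentWindow) {
      // First cue of a new window starts a new segment.
      segments.push({ start: cue.start, end: cue.start, text: cue.text });
      currentWindow = windowIndex;
    } else {
      // Same window: extend the segment that is already open.
      last.end = cue.start;
      last.text += " " + cue.text;
    }
  }
  return segments;
}
```

Each resulting `text` is what would be sent to the model as one chunk. Fixed windows are the simplest choice; splitting on sentence boundaries near the two-minute mark would be a reasonable refinement.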
How we built it

We used Lovable to build ToneTrack as a Chrome extension. The extension scrapes the transcript directly from any YouTube video, chunks it into segments, and sends each chunk to the Gemini API for emotional analysis. The results render as a colour-coded sidebar that syncs live with video playback. Everything runs in the browser, with no backend and no server.
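The live sync with playback comes down to answering "which transcript line is active at the current playback time?". A minimal sketch, assuming the cue start times are kept in a sorted array; `findActiveCueIndex` is a hypothetical name, not taken from the project.

```typescript
// Return the index of the last cue whose start time is <= currentTime,
// or -1 if playback has not reached the first cue yet.
// Assumes `starts` is sorted in ascending order.
function findActiveCueIndex(starts: number[], currentTime: number): number {
  let lo = 0;
  let hi = starts.length - 1;
  let answer = -1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    const start = starts[mid] as number; // mid is always within bounds here
    if (start <= currentTime) {
      answer = mid; // candidate; a later cue may also qualify
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }
  return answer;
}
```

In a content script this could be called from a `timeupdate` listener on the page's video element, passing `video.currentTime`, and the matching line element could then be highlighted and brought into view with `scrollIntoView`. Binary search keeps each lookup cheap even for transcripts with thousands of lines.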
Challenges we ran into

Getting the transcript scraping to work reliably was harder than expected: YouTube's caption data is inconsistently structured and breaks in edge cases. The Gemini API needed significant prompt tuning before it returned clean, consistent emotion labels. Building a sidebar that sits alongside YouTube's player without breaking the page layout also took multiple attempts.
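Alongside prompt tuning, a common way to get clean, consistent labels is to validate the model's reply before rendering it. The sketch below assumes the model was asked to reply with JSON of the form `{"tone": "...", "summary": "..."}`. That reply format, the `parseModelReply` name, and the fallback to "neutral" are all assumptions for illustration, not the project's confirmed behaviour.

```typescript
// The six tones listed in the project description.
const TONES = ["cautious", "analytical", "excited", "hopeful", "concerned", "neutral"] as const;
type Tone = (typeof TONES)[number];

interface TaggedSegment {
  tone: Tone;
  summary: string;
}

// Turn a raw model reply into a validated tag.
// Assumption: anything unparseable or off-list falls back to "neutral".
function parseModelReply(raw: string): TaggedSegment {
  const fallback: TaggedSegment = { tone: "neutral", summary: "" };
  // Replies may be wrapped in prose or code fences; keep only the outermost {...}.
  const open = raw.indexOf("{");
  const close = raw.lastIndexOf("}");
  if (open === -1 || close <= open) return fallback;
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw.slice(open, close + 1));
  } catch {
    return fallback;
  }
  if (typeof parsed !== "object" || parsed === null) return fallback;
  const record = parsed as Record<string, unknown>;
  const rawTone = record["tone"];
  const rawSummary = record["summary"];
  // Normalise case and whitespace before matching against the allowed list.
  const label = typeof rawTone === "string" ? rawTone.trim().toLowerCase() : "";
  const tone: Tone = TONES.find((t) => t === label) ?? "neutral";
  const summary = typeof rawSummary === "string" ? rawSummary.trim() : "";
  return { tone, summary };
}
```

With a guard like this, a reply such as `{"tone": "angry", ...}` never reaches the sidebar as an unknown colour; it is simply rendered as neutral.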
Accomplishments that we're proud of

ToneTrack works on real YouTube videos with no login and no backend. Clicking a highlighted transcript line and watching the video jump to that exact moment is genuinely satisfying. We shipped a fully functional, installable Chrome extension in under 5 hours, built by a team of four who'd never made a Chrome extension before.
What we learned

Chrome extensions have strict rules that a normal web app doesn't, and we learned that the hard way. Prompting an AI model for structured output is a skill in itself; the wording of your instructions matters enormously. And Lovable made it possible to go from idea to working demo in a single hackathon session.
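As an illustration of what "wording matters" can look like in practice, here is a minimal sketch of a prompt builder that spells out the allowed tones and the expected reply format. The wording, the `buildTonePrompt` name, and the JSON reply shape are hypothetical; this is not the prompt ToneTrack actually uses.

```typescript
// The six tones listed in the project description.
const ALLOWED_TONES = ["cautious", "analytical", "excited", "hopeful", "concerned", "neutral"];

// Build one prompt per transcript segment.
// Explicit constraints (closed list, JSON only) leave the model less room to improvise.
function buildTonePrompt(segmentText: string): string {
  return [
    "You label the emotional tone of a transcript segment.",
    `Choose exactly one tone from this list: ${ALLOWED_TONES.join(", ")}.`,
    "Write a summary of the segment in one sentence.",
    'Reply with JSON only, in the form {"tone": "...", "summary": "..."}.',
    "Do not add any text before or after the JSON.",
    "",
    "Segment:",
    segmentText.trim(),
  ].join("\n");
}
```

Naming the closed list of labels and forbidding extra text are the two instructions that tend to matter most for machine-readable replies; the exact phrasing would still need tuning against the model in use.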
What's next for ToneTrack

We want to add speaker identification so emotions are tagged per person, not just per time segment. Expanding beyond YouTube to Zoom recordings and podcast platforms is a natural next step. Longer term, a personal emotion library (your entire watching history mapped by tone) would let you search across hundreds of hours of content by how something was said, not just what was said.