Inspiration

It wasn't a high-tech deepfake that fooled my parents; it was a simple, misleading thumbnail. My father shared a YouTube video in our family chat with a thumbnail showing a famous politician in handcuffs under a "Breaking News" banner. He was genuinely shocked. But when I clicked the video, it was just a 10-minute commentary full of rumors; no arrest had taken place. The thumbnail was a lie designed to get a click.

I realized that for my parents, the thumbnail is the news. They don't have the habit, or the energy, to cross-check every sensational image against official reports.

I'm not a digital activist trying to save the world. Honestly, I was just tired of being the family fact-checker every time a link popped up in our chat. I realized that the problem wasn't necessarily "AI-generated video" but "Clickbait Engineering." I built Qolor AI simply to handle this repetitive verification task. I wanted a tool that automatically compares the "hook" (thumbnail and title) with the "reality" (web news), so my parents can see at a glance whether they are being informed or just being baited.

What it does

Qolor AI is a Chrome extension that serves as a real-time credibility co-pilot for YouTube. It works silently in the background, analyzing the video you are currently watching to provide an instant, objective assessment of its trustworthiness.

The Traffic Light Verdict: We simplified complex analysis into a UI anyone can understand. A Green/Yellow/Red indicator sits right next to the video title, offering an immediate signal of safety.

Multimodal "Bait" Detection: It doesn't just read the title; it "sees" the thumbnail. It detects if the image contains fabricated scenarios or if the text in the thumbnail contradicts the actual video content (clickbait).

Fact-Grounding: It cross-references the video's claims against real-time data from Google News, flagging content that presents rumors as confirmed facts.

How we built it

We built Qolor AI using JavaScript, HTML, and CSS within the Chrome Extension (Manifest V3) architecture.

Core Logic: The extension extracts metadata from the active YouTube tab.

AI Integration: We utilized the Google Gemini API to process natural language. Gemini plays a dual role: first, to understand the video's context and formulate effective search queries; and second, to synthesize the search results into a concise credibility report.

Backend: We focused on a client-side approach to ensure privacy and speed, leveraging Gemini's fast inference capabilities.
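The metadata extraction mentioned under Core Logic starts with identifying which video the tab is showing. Below is a minimal sketch of that step, assuming the video ID is read from the tab's URL. The function name `extractVideoId` and the set of URL shapes it handles are illustrative assumptions, not the extension's actual code.

```javascript
// Hypothetical helper: pull the video ID out of a YouTube URL.
// Handles /watch?v=..., youtu.be/..., and /shorts/... links only.
function extractVideoId(urlString) {
  let url;
  try {
    url = new URL(urlString);
  } catch {
    return null; // not a parseable URL
  }

  // Normalize "www.youtube.com" and "m.youtube.com" to "youtube.com".
  const host = url.hostname.replace(/^www\.|^m\./, "");

  if (host === "youtu.be") {
    return url.pathname.slice(1) || null;
  }
  if (host !== "youtube.com") {
    return null; // not a YouTube page, nothing to analyze
  }
  if (url.pathname === "/watch") {
    return url.searchParams.get("v"); // null when the parameter is absent
  }
  const shorts = url.pathname.match(/^\/shorts\/([^/]+)/);
  return shorts ? shorts[1] : null;
}
```

Returning `null` for anything that is not a recognizable video page lets the caller skip analysis without special cases.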

Challenges we ran into

(1) The "Real-Time" Paradox: Our biggest hurdle was latency and cost. Users won't wait 10 seconds for a verification badge, and calling three different APIs for every single view would be prohibitively expensive.

Solution: We implemented a Firestore caching layer to eliminate redundant analysis and prevent API waste. Our system checks the database first to ensure that the same video is never analyzed twice. Since viral content is viewed by thousands, this means only the very first viewer triggers the expensive AI pipeline, while all subsequent users receive the result instantly from the cache—optimizing both speed and resource efficiency.
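The cache-first flow described above can be sketched as follows. This is a sketch under stated assumptions: an in-memory `Map` stands in for Firestore so the example runs without Firebase, and the names `getVerdict`, `createMemoryCache`, and `analyze` are hypothetical.

```javascript
// Cache-first lookup. `cache` is any object with async get/set methods
// (a stand-in for the Firestore layer); `analyze` is the expensive AI pipeline.
async function getVerdict(videoId, cache, analyze) {
  const cached = await cache.get(videoId);
  if (cached !== undefined) {
    // A previous viewer already paid for the analysis.
    return { ...cached, fromCache: true };
  }
  const fresh = await analyze(videoId);
  await cache.set(videoId, fresh);
  return { ...fresh, fromCache: false };
}

// In-memory stand-in for the database, for illustration only.
function createMemoryCache() {
  const store = new Map();
  return {
    get: async (key) => store.get(key),
    set: async (key, value) => {
      store.set(key, value);
    },
  };
}
```

This sketch does not handle two first viewers arriving at the same moment; both would miss the cache and trigger the pipeline.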

(2) Hallucinations vs. Verification: Early versions of the AI would sometimes confidently hallucinate a verdict without evidence.

Solution: We rigidly structured Gemini’s prompt. We forced it to output strictly in JSON format and required it to "cite" the specific search result from Serper that supported its decision. If no evidence was found, the system defaults to a "Neutral" state rather than guessing.
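A minimal sketch of that validation step is shown here. The response shape (`verdict` and `citation` fields), the allowed verdict labels, and the function name `parseModelVerdict` are assumptions made for illustration; the actual prompt schema is not shown in this write-up.

```javascript
// Fallback used whenever the model output cannot be trusted.
const NEUTRAL = { verdict: "Neutral", citation: null };

// Hypothetical response shape: { "verdict": "Green" | "Red", "citation": "<url>" }.
// `searchResultUrls` is the list of URLs actually returned by the search step.
function parseModelVerdict(rawText, searchResultUrls) {
  let parsed;
  try {
    parsed = JSON.parse(rawText);
  } catch {
    return NEUTRAL; // output was not valid JSON
  }
  if (parsed === null || typeof parsed !== "object") {
    return NEUTRAL;
  }
  if (!["Green", "Red"].includes(parsed.verdict)) {
    return NEUTRAL; // unknown or missing verdict label
  }
  // The cited source must be one we really retrieved, not an invented one.
  if (!searchResultUrls.includes(parsed.citation)) {
    return NEUTRAL;
  }
  return { verdict: parsed.verdict, citation: parsed.citation };
}
```

Checking the citation against the retrieved URLs means a verdict with a made-up source is treated the same as no evidence at all.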

(3) Defining "Fake" & Scope: "Fake" is subjective. Is a prank video fake news? We needed to prevent the AI from over-analyzing harmless entertainment.

Solution: We implemented a strict Category Whitelist in our backend logic. We defined variables containing only specific high-risk categories (e.g., "News & Politics"). The system checks the video's metadata first; if the category doesn't match our target list, the analysis is bypassed entirely, and the UI defaults to "Yellow" (Neutral). This simple logic gate prevents false positives on gaming or comedy videos while conserving API usage.
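The gate can be sketched in a few lines. The category names below are copied from the list in the Gemini integration section; the identifiers `TARGET_CATEGORIES` and `categoryGate`, and the shape of the returned object, are hypothetical.

```javascript
// Categories taken from this write-up; everything else is skipped.
const TARGET_CATEGORIES = new Set([
  "News & Politics",
  "Nonprofits & Activism",
  "Science & Technology",
  "Education",
  "People & Blogs",
  "Howto & Style",
  "Auto & Vehicles",
]);

// Decide whether a video should enter the AI pipeline at all.
// `video.category` is assumed to hold the category name from the metadata.
function categoryGate(video) {
  if (!TARGET_CATEGORIES.has(video.category)) {
    // Out of scope: skip analysis and show the neutral light.
    return { analyze: false, light: "Yellow" };
  }
  return { analyze: true, light: null }; // light is decided later by the pipeline
}
```

A video with a missing or unrecognized category also falls through to "Yellow", which matches the default-to-neutral behavior described above.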

Accomplishments that we're proud of

The ACCA Algorithm: We are most proud of moving beyond simple prompt engineering to "Algorithm Engineering." We successfully quantified abstract concepts like "Fabricated Urgency" into a mathematical penalty score (0 to -100).

Multimodal Integrity: We tackled the "Bait-and-Switch" problem. Many videos have innocent titles but misleading thumbnails. By using the Vision API to "read" the thumbnail, our system catches inconsistencies that text-only models miss.

Family-Approved UI: I showed the prototype to my parents, and they understood the "Green/Red" signal immediately without explanation. That was our biggest victory.

What we learned

Prompting is Programming: We learned that natural language prompts are insufficient for production apps. We had to treat our prompt like code, defining variable types, logic gates, and fallback scenarios to get consistent JSON outputs.

The Importance of Context: A text claiming "Aliens landed" is fake news on a news channel, but fiction on a movie review channel. We learned that context classification must happen before fact-checking.

The Power of Serverless: Integrating Firebase allowed us to focus 90% of our time on the AI logic and 10% on infrastructure, which was crucial for a hackathon timeline.

What's next for Qolor AI

Our vision is to evolve Qolor AI from a browser utility into a ubiquitous layer of digital protection that lives wherever users consume content.

Beyond the Browser (Mobile Expansion): Currently, we operate within the safe confines of a Chrome Extension. Our immediate next step is to develop a mobile-native application (utilizing Android Accessibility Services) that can overlay our "Traffic Light" UI on top of native apps. We want Qolor AI to be platform-agnostic, protecting users whether they are on a laptop or a smartphone.

Conquering the "Shorts" Ecosystem: Misinformation spreads fastest in short-form content (YouTube Shorts, Instagram Reels, TikTok) where context is often stripped away. We plan to optimize our ACCA algorithm for high-velocity, vertical video feeds, ensuring that safety checks can keep up with the rapid scroll speed of modern consumption.

A Universal Standard: Ultimately, we don't just want to be an "app." We aim to establish Qolor AI as a standard protocol for digital literacy—a "digital seatbelt" that everyone, from teenagers to the elderly, implicitly relies on for a safer internet experience.

Gemini integration

Our service operates on a three-tier architecture that maximizes efficiency. The frontend Chrome extension extracts YouTube video IDs and passes them to the backend. To optimize resources, the backend first retrieves data from Firestore. For previously analyzed videos, the stored results are returned immediately, minimizing latency and API usage.

When no stored result exists, our AI pipeline runs only for videos in specific categories, such as "News & Politics", "Nonprofits & Activism", "Science & Technology", "Education", "People & Blogs", "Howto & Style", and "Auto & Vehicles".

This analysis is performed through three API steps:
Vision API: Runs OCR on the thumbnail, converting visual data into text.
Serper API: Gathers evidence by running Google News or web searches based on the extracted text.
Gemini 3 Flash API: The core engine, which generates optimized search queries and uses its multimodal capabilities to assess video credibility.

The collected raw data is evaluated using our proprietary ACCA algorithm, and a final confidence score is calculated. The results are stored in Firestore, completing the data cycle. Users can immediately view the intuitive confidence score UI.
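Putting the steps together, the flow for an uncached video might look like the sketch below. Each step is passed in as a function so the example runs without API keys. The step names, the argument shapes, and the ordering of the two Gemini calls around the search are assumptions inferred from the description above, and `steps.score` is only a placeholder for the ACCA algorithm, whose details are not given here.

```javascript
// Orchestrates one analysis. In the real project the injected steps would
// wrap the Vision, Gemini, and Serper APIs plus the Firestore write.
async function runPipeline(video, steps) {
  // 1. OCR the thumbnail into text.
  const thumbnailText = await steps.readThumbnail(video.thumbnailUrl);

  // 2. Ask the model for a search query based on the title and thumbnail text.
  const query = await steps.buildQuery({ title: video.title, thumbnailText });

  // 3. Gather evidence from news or web search.
  const evidence = await steps.searchNews(query);

  // 4. Ask the model to assess the video against the evidence.
  const report = await steps.assess({ video, thumbnailText, evidence });

  // 5. Turn the report into a final score (placeholder for ACCA).
  const score = steps.score(report);

  // 6. Store the result so later viewers get it from the cache.
  const result = { videoId: video.id, score, evidence };
  await steps.save(result);
  return result;
}
```

Injecting the steps also makes it straightforward to swap any one provider or to test the flow with stubs.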

By integrating Gemini 3 Flash, we can derive reliable insights in real time. This architecture ensures a seamless, data-driven experience for users.
