yt-sentiment-analysis

A YouTube comment sentiment analysis application built on Large Language Models (LLMs). It analyzes sentiment in comments on YouTube videos about Malaysian politics, with the aim of giving insight into public opinion on political entities and topics.

The project consists of two main components:

  • gui_app.py: A user-friendly graphical interface for managing inputs (such as YouTube video URLs), initiating the sentiment analysis process, and viewing the summarized results.
  • scraper_v2.py: The backend script that fetches YouTube comment data, uses AI models to filter relevant comments and perform sentiment analysis, and generates detailed output reports.

Demo

https://github.com/user-attachments/assets/3f6d7fd2-c67b-44eb-8538-7f8f6732b394

Features

  • User-Friendly Interface: A GUI (gui_app.py) for easy interaction, input management, and results visualization.
  • Targeted Video Fetching: Retrieves videos from a specified YouTube channel.
  • AI-Powered Political Video Filtering: Utilizes LLMs to identify and select videos relevant to Malaysian politics based on titles and descriptions.
  • Automated Audio Processing: Downloads audio from selected videos.
  • Speech-to-Text Transcription: Transcribes video audio using AI (Fireworks Whisper).
  • Content Summarization: Generates concise contextual summaries of video content using LLMs.
  • In-Depth Comment Analysis:
    • Fetches comments from YouTube videos.
    • Performs targeted sentiment analysis on individual comments using LLMs.
    • Identifies sarcasm and provides reasoning.
    • Determines the primary political entity targeted by the comment.
    • Assigns a sentiment label (positive, negative, or neutral) and a numerical score (-1.0 to 1.0).
    • Provides reasoning for sentiment and entity identification.
    • Estimates the confidence level of the analysis.
  • Comprehensive Reporting:
    • Generates an overall sentiment summary across multiple videos and comments.
    • Creates a visual plot of sentiment distribution.
    • Saves detailed results for each video in a structured folder format.
  • Flexible API Key Management: Supports API key input via GUI or a .env file.
  • Customizable Analysis Parameters: Allows users to define the maximum number of videos to scan and comments to analyze.
  • CLI Availability: Offers a command-line interface (scraper_v2.py) for backend operations and scripting.
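
The per-comment fields listed under In-Depth Comment Analysis can be pictured as a small record that is parsed from the LLM's reply and then rolled up into an overall summary. The following is a minimal sketch only: the field names, the JSON reply shape, the 0.0 to 1.0 confidence range, and the helpers parse_analysis and summarize are assumptions made for illustration, not the schema used by scraper_v2.py.

```python
import json
from collections import Counter
from dataclasses import dataclass

VALID_LABELS = {"positive", "negative", "neutral"}


@dataclass
class CommentAnalysis:
    """One analyzed comment. Field names are hypothetical."""
    sentiment: str        # "positive", "negative", or "neutral"
    score: float          # numerical score in [-1.0, 1.0]
    target_entity: str    # primary political entity the comment targets
    is_sarcastic: bool
    confidence: float     # assumed here to lie in [0.0, 1.0]
    reasoning: str


def parse_analysis(raw: str) -> CommentAnalysis:
    """Parse an (assumed) JSON reply from the LLM and validate its fields."""
    data = json.loads(raw)
    label = str(data["sentiment"]).strip().lower()
    if label not in VALID_LABELS:
        raise ValueError(f"unexpected sentiment label: {label!r}")
    # Clamp numeric fields so a slightly off reply cannot skew the summary.
    score = max(-1.0, min(1.0, float(data["score"])))
    confidence = max(0.0, min(1.0, float(data.get("confidence", 0.0))))
    return CommentAnalysis(
        sentiment=label,
        score=score,
        target_entity=str(data.get("target_entity", "unknown")),
        is_sarcastic=bool(data.get("is_sarcastic", False)),
        confidence=confidence,
        reasoning=str(data.get("reasoning", "")),
    )


def summarize(results):
    """Count labels and average the scores across analyzed comments."""
    counts = Counter(r.sentiment for r in results)
    mean = sum(r.score for r in results) / len(results) if results else 0.0
    return {"counts": dict(counts), "mean_score": mean}
```

The label counts returned by summarize are the kind of data a sentiment distribution plot could be drawn from; how the project actually builds its plot is not shown here.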

Tech Stack

  • Programming Language:
    • Python (specifically Python 3.9)
  • Graphical User Interface (GUI):
    • Tkinter
    • customtkinter (for modern theming)
  • Data Acquisition & Web Scraping:
    • google-api-python-client (for YouTube Data API v3)
    • yt-dlp (for downloading YouTube video audio)
  • Artificial Intelligence & Machine Learning (LLM):
    • langchain (for orchestrating LLM interactions)
    • Fireworks AI API (for accessing LLM models)
      • Models used: accounts/fireworks/models/qwen3-235b-a22b (for filtering, summarization, sentiment analysis), whisper-v3 (for audio transcription via Fireworks' OpenAI-compatible endpoint)
    • openai (Python client, used for Fireworks AI's Whisper-compatible API)
  • Data Handling & Manipulation:
    • Standard Python libraries (json, os, re, etc.)
  • Data Visualization:
    • matplotlib (for generating sentiment plots)
  • Environment & Configuration:
    • python-dotenv (for managing API keys from .env files)
  • Utilities:
    • Pillow (PIL - Python Imaging Library, used by customtkinter and potentially matplotlib)
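
Since API keys can come from either the GUI or a .env file, one plausible way to combine the two sources is to prefer a key typed into the GUI and fall back to the environment. The sketch below assumes that precedence, and the variable name FIREWORKS_API_KEY and the helper resolve_api_key are hypothetical. It reads only os.environ; in an application using python-dotenv, a call to load_dotenv() at startup would populate os.environ from the .env file first.

```python
import os

ENV_VAR = "FIREWORKS_API_KEY"  # assumed name, not confirmed by the project


def resolve_api_key(gui_value=None, env_var=ENV_VAR):
    """Return the API key, preferring a GUI-supplied value over the environment."""
    # A non-blank value typed into the GUI wins.
    if gui_value and gui_value.strip():
        return gui_value.strip()
    # Otherwise fall back to the environment (e.g. filled from .env at startup).
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"No API key found: enter one in the GUI or set {env_var} in .env"
        )
    return key
```

Failing early with a clear message, rather than passing an empty key to the API client, keeps configuration errors separate from network or model errors.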
