posted an update

To automate the "rewatching" and learning process without explicit human feedback, we implemented an Auto-Learning Feedback Loop. This system treats user edits (after the video is live) as the "Ground Truth" of their preferences.

If the AI posted a video with "Natural" grading, but the user manually changed it to "Teal and Orange" on YouTube, the system should treat that delta as a high-priority instruction to update its internal models.

The Auto-Learning Architecture

The process involves three distinct stages: Observation, Diffing, and Reinforcement.


1. The Observation Engine

We use a background task to periodically poll the YouTube API for the current state of a video. We compare the "Live" version against the "Snapshot" stored in our MemoryAgent at the time of upload.

def check_for_user_edits(video_id: str, stored_metadata: dict):
    # Fetch current state from YouTube API
    live_metadata = youtube_api.get_video_details(video_id)

    # 1. Detect Metadata Edits (Titles/Tags)
    if live_metadata['title'] != stored_metadata['title']:
        update_preference_cluster('title_style', live_metadata['title'])

    # 2. Detect Color/Visual Edits 
    # This requires downloading a single frame and comparing color histograms
    live_frame = download_thumbnail(video_id)
    original_frame = get_stored_thumbnail(video_id)

    visual_delta = compare_visual_profiles(original_frame, live_frame)
    if visual_delta['change_detected']:
        # Extract specific grading preference (e.g., increased saturation)
        apply_visual_learning(visual_delta['new_profile'])


2. Preference Fingerprinting

Instead of just recording one-off changes, the system builds a Preference Fingerprint. If a user edits 5 videos in a row to be "Unlisted" despite the AI suggesting "Public," the PrivacyAgent should receive a permanent weight adjustment.

Edit Detected Learning Action Impact
Title Change NLP Keyword Extraction Updates suggested_title generation logic.
Color Grading Histogram Shift Detection Adjusts default LUT or saturation parameters.
Privacy Toggle Binary Update Strongest signal: Overrides PrivacyAgent weights.
Tag Removal Negative Association Prunes specific tags from future analysis results.

3. Integrated with learning_agent.py

LearningAgent handles these "Implicit Feedback."

class PreferenceLearningAgent(LearningAgent):
    """
    learn from user-initiated edits (Implicit Feedback).
    """
    def process_implicit_feedback(self, video_id: str):
        # 1. Diff the current YouTube state vs our memory
        diff = self.memory.get_edit_diff(video_id)

        if not diff:
            return # No changes made by user

        # 2. Convert edits into "Rewards"
        # If the user changed our decision, it's a negative reward for the old model
        # and a positive reward for the new user-provided state.
        for field, new_value in diff.items():
            self.memory.update_weights(
                feature=field,
                value=new_value,
                weight_increment=0.25 # Incremental learning
            )

        self.security_logger.log_policy_update(f"Auto-learned preference for {field}")

Explanation

  1. Thumbnail Diffing: OpenCV compares the color distribution of the original upload versus the live YouTube thumbnail. This is the fastest way to detect "Color Grading" changes without downloading the whole video.
  2. Scheduled Polling: file_watcher.py (or a separate cron job) "checks-in" on videos 24 hours and 7 days after upload.

OpenCV identifies if a user has applied a warmer or cooler color grade to their post-upload video.

Log in or sign up for Devpost to join the conversation.