To automate the "rewatching" and learning process without explicit human feedback, we implemented an Auto-Learning Feedback Loop. This system treats user edits (after the video is live) as the "Ground Truth" of their preferences.
If the AI posted a video with "Natural" grading, but the user manually changed it to "Teal and Orange" on YouTube, the system should treat that delta as a high-priority instruction to update its internal models.
The Auto-Learning Architecture
The process involves three distinct stages: Observation, Diffing, and Reinforcement.
1. The Observation Engine
We use a background task to periodically poll the YouTube API for the current state of a video. We compare the "Live" version against the "Snapshot" stored in our MemoryAgent at the time of upload.
```python
def check_for_user_edits(video_id: str, stored_metadata: dict):
    # Fetch current state from YouTube API
    live_metadata = youtube_api.get_video_details(video_id)

    # 1. Detect Metadata Edits (Titles/Tags)
    if live_metadata['title'] != stored_metadata['title']:
        update_preference_cluster('title_style', live_metadata['title'])

    # 2. Detect Color/Visual Edits
    # This requires downloading a single frame and comparing color histograms
    live_frame = download_thumbnail(video_id)
    original_frame = get_stored_thumbnail(video_id)
    visual_delta = compare_visual_profiles(original_frame, live_frame)

    if visual_delta['change_detected']:
        # Extract specific grading preference (e.g., increased saturation)
        apply_visual_learning(visual_delta['new_profile'])
```
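The `compare_visual_profiles` helper is not shown in this post. As a rough sketch of what a histogram comparison of this kind could look like, the example below assumes both frames are already decoded into lists of `(r, g, b)` tuples and uses plain Python rather than OpenCV. The bin count, the 0.9 threshold, the `similarity` key, and the `warmth_shift` field are illustrative assumptions, not the project's actual values.

```python
from typing import Dict, List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (r, g, b), each in 0-255


def channel_histogram(pixels: Sequence[Pixel], channel: int, bins: int = 8) -> List[float]:
    """Return a normalized histogram of one color channel."""
    counts = [0] * bins
    for px in pixels:
        counts[px[channel] * bins // 256] += 1
    total = len(pixels) or 1
    return [c / total for c in counts]


def histogram_intersection(a: Sequence[float], b: Sequence[float]) -> float:
    """1.0 for identical distributions, 0.0 for fully disjoint ones."""
    return sum(min(x, y) for x, y in zip(a, b))


def mean_warmth(pixels: Sequence[Pixel]) -> float:
    """Mean red minus mean blue: a crude indicator of a warm or cool grade."""
    n = len(pixels) or 1
    return (sum(p[0] for p in pixels) - sum(p[2] for p in pixels)) / n


def compare_visual_profiles(original: Sequence[Pixel], live: Sequence[Pixel],
                            threshold: float = 0.9) -> Dict[str, object]:
    """Hypothetical stand-in for the helper called in check_for_user_edits."""
    per_channel = [
        histogram_intersection(channel_histogram(original, ch),
                               channel_histogram(live, ch))
        for ch in range(3)
    ]
    similarity = sum(per_channel) / 3
    return {
        "change_detected": similarity < threshold,
        "similarity": similarity,
        # Positive values suggest a warmer grade, negative a cooler one.
        "new_profile": {"warmth_shift": mean_warmth(live) - mean_warmth(original)},
    }
```

A low intersection score only says that the color distribution moved; the sign of `warmth_shift` gives a first hint about the direction of the change.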
2. Preference Fingerprinting
Instead of just recording one-off changes, the system builds a Preference Fingerprint. If a user edits 5 videos in a row to be "Unlisted" despite the AI suggesting "Public," the PrivacyAgent should receive a permanent weight adjustment.
| Edit Detected | Learning Action | Impact |
|---|---|---|
| Title Change | NLP Keyword Extraction | Updates suggested_title generation logic. |
| Color Grading | Histogram Shift Detection | Adjusts default LUT or saturation parameters. |
| Privacy Toggle | Binary Update | Strongest signal: Overrides PrivacyAgent weights. |
| Tag Removal | Negative Association | Prunes specific tags from future analysis results. |
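One way to separate a one-off edit from a stable preference is to count consecutive identical overrides per field. The sketch below is a hypothetical illustration of that idea; the `PreferenceFingerprint` class, its `record` method, and the default threshold of five (taken from the "5 videos in a row" example above) are assumptions rather than the project's actual code.

```python
from typing import Dict, Hashable, Optional, Tuple


class PreferenceFingerprint:
    """Tracks consecutive identical user overrides for each field."""

    def __init__(self, streak_threshold: int = 5) -> None:
        self.streak_threshold = streak_threshold
        # field -> (last override value, consecutive override count)
        self._streaks: Dict[str, Tuple[Optional[Hashable], int]] = {}

    def record(self, field: str, suggested: Hashable, final: Hashable) -> bool:
        """Record one video's outcome.

        Returns True once the same override has been seen often enough
        in a row to justify a lasting weight adjustment.
        """
        if final == suggested:
            # The user kept the AI's suggestion, so the streak is broken.
            self._streaks[field] = (None, 0)
            return False
        last_value, count = self._streaks.get(field, (None, 0))
        count = count + 1 if final == last_value else 1
        self._streaks[field] = (final, count)
        return count >= self.streak_threshold
```

With the default threshold, five consecutive "Public" to "Unlisted" overrides would return `True` on the fifth call, which is the point at which an agent such as PrivacyAgent could be given its adjustment.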
3. Integration with learning_agent.py
A LearningAgent subclass handles this "Implicit Feedback."
```python
class PreferenceLearningAgent(LearningAgent):
    """
    Learn from user-initiated edits (Implicit Feedback).
    """

    def process_implicit_feedback(self, video_id: str):
        # 1. Diff the current YouTube state vs our memory
        diff = self.memory.get_edit_diff(video_id)
        if not diff:
            return  # No changes made by user

        # 2. Convert edits into "Rewards"
        # If the user changed our decision, it's a negative reward for the old model
        # and a positive reward for the new user-provided state.
        for field, new_value in diff.items():
            self.memory.update_weights(
                feature=field,
                value=new_value,
                weight_increment=0.25  # Incremental learning
            )
            self.security_logger.log_policy_update(f"Auto-learned preference for {field}")
```
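The storage behind `memory.update_weights` is not shown here. As a minimal sketch of the reward idea described in the comments, the in-memory store below adds the increment to the user's chosen value and subtracts it from the value the AI originally picked. The `PreferenceWeights` class, the `old_value` parameter, and the `preferred` helper are hypothetical additions for illustration only.

```python
from collections import defaultdict
from typing import Dict, Hashable, Optional


class PreferenceWeights:
    """Hypothetical in-memory stand-in for the store behind update_weights."""

    def __init__(self) -> None:
        # feature -> value -> accumulated weight
        self._weights: Dict[str, Dict[Hashable, float]] = defaultdict(
            lambda: defaultdict(float)
        )

    def update_weights(self, feature: str, value: Hashable,
                       weight_increment: float = 0.25,
                       old_value: Optional[Hashable] = None) -> None:
        # Positive reward for the state the user chose.
        self._weights[feature][value] += weight_increment
        # Negative reward for the AI's overridden choice, when it is known.
        if old_value is not None and old_value != value:
            self._weights[feature][old_value] -= weight_increment

    def preferred(self, feature: str) -> Optional[Hashable]:
        """Return the highest-weighted value for a feature, if any."""
        options = self._weights.get(feature)
        if not options:
            return None
        return max(options, key=options.get)
```

Because each edit moves the weights by a fixed step, a single correction nudges the preference while repeated corrections accumulate.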
Explanation
- Thumbnail Diffing: OpenCV compares the color distribution of the original upload versus the live YouTube thumbnail. This is the fastest way to detect "Color Grading" changes without downloading the whole video.
- Scheduled Polling: file_watcher.py (or a separate cron job) "checks in" on videos 24 hours and 7 days after upload.
OpenCV identifies if a user has applied a warmer or cooler color grade to their post-upload video.
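The 24-hour and 7-day check-ins mentioned above could be computed along the following lines. This is only a sketch: the function names and the `last_checked` bookkeeping are assumptions, and how the result is wired into file_watcher.py or a cron job is left open.

```python
from datetime import datetime, timedelta, timezone
from typing import List, Optional

# Check-in offsets taken from the text: 24 hours and 7 days after upload.
CHECK_IN_OFFSETS = (timedelta(hours=24), timedelta(days=7))


def check_in_times(uploaded_at: datetime) -> List[datetime]:
    """Return the moments at which a video should be re-inspected."""
    return [uploaded_at + offset for offset in CHECK_IN_OFFSETS]


def due_check_ins(uploaded_at: datetime, last_checked: Optional[datetime],
                  now: datetime) -> List[datetime]:
    """Return check-ins that have come due since the last poll."""
    return [
        t for t in check_in_times(uploaded_at)
        if t <= now and (last_checked is None or t > last_checked)
    ]


if __name__ == "__main__":
    uploaded = datetime(2024, 1, 1, tzinfo=timezone.utc)
    now = datetime(2024, 1, 3, tzinfo=timezone.utc)
    for due in due_check_ins(uploaded, None, now):
        print("check-in due:", due.isoformat())
```

A periodic job can call `due_check_ins` for each tracked video and run the edit check only when the returned list is non-empty.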