FearNot!

Inspired by SkipIt's IP, which lets people skip movie scenes that trigger their personal trauma, we designed a proof of concept that lets people define their own customizable triggers. Limited to OD.

Trigger Detect — Architecture Overview

This document summarizes the current proof‑of‑concept architecture and the flow we built.

High‑Level Flow

User describes triggers in free‑form text.
LLM extracts tags from that text.
User edits tags (add/remove/update).
User uploads a video.
User starts playback on the live‑processed view.
Backend processes frames in 5‑second batches and streams base64 JPEG frames to the client.
Frontend buffers 10 seconds and then renders frames to a <canvas> at the original FPS.

Backend Components

1) Tag Extraction

Route: POST /extract-tags
File: user_input.py
Uses OpenAI to return tags as a list of strings.

2) Tag Editing

Route: PATCH /edit-tags
File: user_input.py
Accepts edited list of tags, cleans them, stores in session.
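The cleaning step is not spelled out here, so the following is only a minimal sketch of what it might look like. The function name clean_tags and the specific rules (trim whitespace, lowercase, drop empties, non-strings, and duplicates) are assumptions for illustration, not the project's actual code.

```python
def clean_tags(raw_tags):
    """Normalize an edited tag list (hypothetical rules, see lead-in)."""
    seen = set()
    cleaned = []
    for tag in raw_tags:
        if not isinstance(tag, str):
            continue  # ignore anything that is not text
        # Collapse internal whitespace, trim the ends, and lowercase.
        normalized = " ".join(tag.split()).lower()
        if normalized and normalized not in seen:
            seen.add(normalized)
            cleaned.append(normalized)  # keep first-seen order
    return cleaned
```

Keeping first-seen order means the chips shown to the user would not jump around after an edit.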

3) Video Upload

Route: POST /upload-video
File: video_upload.py
Stores video as temp_videos/temp.mp4.

4) Live Frame Streaming

Route: POST /stream/start
- Initializes a stream session.
- Reads FPS, width, height from the original video.
- Starts a background worker.
Route: GET /stream/next?session_id=...&last_index=...
- Returns the next processed 5‑second batch of frames as base64 JPEG.
- Includes FPS and video metadata for playback.
Route: POST /stream/stop?session_id=...
- Cancels the current stream and allows replay.
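The start / next / stop lifecycle above can be modeled without Flask as a small session object. This is a sketch under stated assumptions: the class StreamSession, its method names, and the "pending" and "done" status values are hypothetical (only "ready" appears in this document), and the last_index parameter is left out for brevity.

```python
import queue
import threading
import uuid


class StreamSession:
    """Hypothetical in-memory stream session backed by a worker thread."""

    def __init__(self, batches):
        self.id = uuid.uuid4().hex
        self._queue = queue.Queue()
        self._cancelled = threading.Event()
        self._worker = threading.Thread(
            target=self._run, args=(batches,), daemon=True
        )

    def _run(self, batches):
        # Stand-in for the real worker: enqueue each processed batch in order.
        for index, frames in enumerate(batches):
            if self._cancelled.is_set():
                return  # stop requested, abandon remaining batches
            self._queue.put({"status": "ready", "index": index, "frames": frames})
        self._queue.put({"status": "done"})  # assumed end-of-stream marker

    def start(self):
        self._worker.start()
        return self.id

    def next_batch(self, timeout=1.0):
        try:
            return self._queue.get(timeout=timeout)
        except queue.Empty:
            return {"status": "pending"}  # assumed "not ready yet" reply

    def stop(self):
        self._cancelled.set()
```

A replay would then amount to calling stop() on the old session and creating a fresh one.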

Processing Worker

Reads the uploaded video using OpenCV.
Splits into 5‑second batches using the source FPS.
Runs YOLO‑World on each frame via process_frames().
Encodes annotated frames as JPEG and pushes them into a queue.
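The batching step can be sketched as a generator that groups frames into 5-second chunks based on the source FPS. The name batch_frames is hypothetical, and the frames here can be any iterable; the OpenCV read and the process_frames() call are deliberately omitted so the sketch stays self-contained.

```python
from itertools import islice


def batch_frames(frames, fps, seconds=5):
    """Yield lists of frames, each covering `seconds` of video at `fps`."""
    per_batch = max(1, round(fps * seconds))  # e.g. 30 FPS * 5 s = 150 frames
    iterator = iter(frames)
    while True:
        batch = list(islice(iterator, per_batch))
        if not batch:
            return  # source exhausted
        yield batch  # the final batch may be shorter than per_batch
```

For example, 400 frames at 30 FPS would produce batches of 150, 150, and 100 frames.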

Frontend Components

1) Trigger Editor

Template: templates/index.html
JS: static/app.js
Calls /extract-tags and /edit-tags.
Shows editable tag “chips”.

2) Video Upload Page

Template: templates/video_upload.html
Uploads a file and then navigates to the play page.

3) Live Playback (Canvas)

Template: templates/play_video.html
Uses <canvas> for the edited video.
Uses <video> to show the original video side by side.
Implements two loops:
- Producer: fetches batches via /stream/next.
- Consumer: draws frames at the original FPS.
Buffers 10 seconds before playback.
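The real producer and consumer loops live in the page's JavaScript. As a language-neutral illustration of the buffering rule, here is a small Python model; the class PlaybackBuffer and its members are hypothetical and only mirror the behavior described above (hold frames until 10 seconds are buffered, then release them one per tick at the source FPS).

```python
from collections import deque


class PlaybackBuffer:
    """Hypothetical model of the client-side frame buffer."""

    def __init__(self, fps, buffer_seconds=10):
        self.fps = fps
        self.threshold = int(fps * buffer_seconds)  # frames needed to start
        self.frames = deque()
        self.playing = False

    def push_batch(self, batch):
        # Producer side: append a fetched batch; start once enough is buffered.
        self.frames.extend(batch)
        if not self.playing and len(self.frames) >= self.threshold:
            self.playing = True

    def pop_frame(self):
        # Consumer side: called once per frame interval by the draw loop.
        if not self.playing or not self.frames:
            return None
        return self.frames.popleft()

    @property
    def frame_interval_ms(self):
        return 1000.0 / self.fps
```

With 5-second batches and a 10-second buffer, playback in this model starts after the second batch arrives.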

Data Model

Each processed frame is encoded as base64 JPEG and returned in JSON:

{
  "status": "ready",
  "index": 0,
  "fps": 30,
  "width": 1280,
  "height": 720,
  "frames": ["...base64...", "...base64..."]
}
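A payload of this shape could be assembled and decoded roughly as follows. The helper names build_batch_payload and decode_frames are hypothetical, and the byte strings in the usage below are stand-ins for real JPEG data.

```python
import base64
import json


def build_batch_payload(index, fps, width, height, jpeg_frames):
    """Wrap raw JPEG bytes into the JSON-serializable batch structure."""
    return {
        "status": "ready",
        "index": index,
        "fps": fps,
        "width": width,
        "height": height,
        # JSON cannot carry raw bytes, so each frame becomes an ASCII string.
        "frames": [base64.b64encode(f).decode("ascii") for f in jpeg_frames],
    }


def decode_frames(payload):
    """Recover the original JPEG bytes from a received payload."""
    return [base64.b64decode(f) for f in payload["frames"]]


# Usage with placeholder bytes instead of real encoded frames.
fake_frames = [b"\xff\xd8frame-0\xff\xd9", b"\xff\xd8frame-1\xff\xd9"]
wire = json.dumps(build_batch_payload(0, 30, 1280, 720, fake_frames))
restored = decode_frames(json.loads(wire))
```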

Key Files

app.py — Flask app setup, registers blueprints.
user_input.py — Tag extraction + editing routes.
video_upload.py — Upload + streaming pipeline.
video_processing.py — YOLO‑World model + frame processing.
templates/index.html — Trigger input UI.
templates/video_upload.html — Upload UI.
templates/play_video.html — Playback UI.
static/app.js / static/styles.css — Frontend logic + styling.

Notes / Limitations

Base64 frame streaming is heavy at high FPS (demo‑friendly only).
Canvas rendering at 300 FPS may exceed browser limits.
Streaming is in memory; no persistent queue is used.
Replay resets the session and restarts processing.
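To put a rough number on the base64 cost: base64 turns every 3 bytes into 4 characters, so each frame grows by about a third. The sketch below estimates one batch's frame data; the function names are hypothetical and the 90 KB average JPEG size is purely an illustrative assumption, not a measurement from this project.

```python
import math


def base64_len(n_bytes):
    """Length in characters of the padded base64 encoding of n_bytes."""
    return 4 * math.ceil(n_bytes / 3)


def batch_payload_bytes(fps, seconds, avg_jpeg_bytes):
    """Approximate size of the frame data in one batch (JSON overhead ignored)."""
    frames = round(fps * seconds)
    return frames * base64_len(avg_jpeg_bytes)


# Assumed figures: 30 FPS, 5-second batches, 90 KB average per JPEG frame.
estimate = batch_payload_bytes(fps=30, seconds=5, avg_jpeg_bytes=90_000)
```

Under those assumed figures, a single 5-second batch carries about 18 MB of base64 text, which is why the approach is described as demo-friendly only.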

Future Improvements

Switch to binary streaming (multipart or WebCodecs) to reduce payload size.
Add backpressure / queue limits to avoid memory spikes.
Use Web Workers to decode frames without blocking the UI thread.
Persist detection metadata to allow highlighting / skipping.
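One possible shape for the backpressure idea is a bounded queue between the worker and the route that serves batches, so the worker pauses when the client falls behind. This is a sketch only: the class BoundedBatchQueue, its method names, and the default limit of 4 batches are assumptions.

```python
import queue


class BoundedBatchQueue:
    """Hypothetical wrapper that caps how many batches are held in memory."""

    def __init__(self, max_batches=4):
        self._queue = queue.Queue(maxsize=max_batches)

    def offer(self, batch, timeout=None):
        # Blocks while the queue is full; returns False if the timeout expires.
        try:
            self._queue.put(batch, timeout=timeout)
            return True
        except queue.Full:
            return False

    def take(self, timeout=None):
        # Returns the oldest batch, or None if nothing arrives in time.
        try:
            return self._queue.get(timeout=timeout)
        except queue.Empty:
            return None
```

With a limit in place, memory use is bounded by the batch size times the queue capacity rather than by the length of the video.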

Updates

Kunj Joshi started this project — Mar 15, 2026 11:07 AM EDT
