VidScan

An advanced solution that takes video inputs, processes them to identify products introduced in the video, and directs users to TikTok Shop to purchase similar products.

Inspiration

When users watch a video on TikTok, they are often captivated by the content, which can spark an interest in related products. However, finding these products within the TikTok Shop can be cumbersome due to the lack of a personalized, streamlined recommendation system that bridges the gap between video content and relevant product suggestions.

What it does

Extract key frames with FFmpeg, describe them with Phi-3-Vision, and transcribe the audio with OpenAI Whisper
Combine the frame descriptions and transcript, then use Llama3 to generate precise product keywords
Display relevant products on TikTok videos to enhance discoverability
Redirect users from keywords to product pages
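The last step, turning a generated keyword into a link the user can follow, might look roughly like the sketch below. The base URL and the `q` query parameter are placeholders for illustration only; the actual TikTok Shop search endpoint is not described here, and `keyword_to_search_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Placeholder endpoint (assumption): swap in the real shop search URL.
DEFAULT_SEARCH_URL = "https://shop.example.com/search"


def keyword_to_search_url(keyword: str, base_url: str = DEFAULT_SEARCH_URL) -> str:
    """Build a search link for one product keyword.

    The query parameter name "q" is an assumption, not a documented API.
    """
    cleaned = " ".join(keyword.split())  # collapse stray whitespace
    if not cleaned:
        raise ValueError("keyword must not be empty")
    return f"{base_url}?{urlencode({'q': cleaned})}"


if __name__ == "__main__":
    print(keyword_to_search_url("wireless  earbuds"))
```

Encoding the keyword with `urlencode` keeps spaces and punctuation in model-generated keywords from breaking the link.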

How we built it

Multi-modal LLM (Phi-3-Vision): Generates descriptions of each video by analysing extracted frames
Video frames: FFmpeg extracts a fixed number of frames from each video
Audio transcript: OpenAI Whisper (large model)
Llama3: Keyword generation with Chain of Thought (CoT) and multi-prompt techniques
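The frame-extraction step above can be sketched as follows. This is a minimal illustration rather than the project's actual code: it assumes the video duration is already known, and the function names, output file pattern, and even spacing of frames are all assumptions. It uses FFmpeg's `fps` filter together with `-frames:v` to cap the number of frames written.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(video_path: str, out_dir: str,
                         duration_s: float, n_frames: int) -> list[str]:
    """Build an FFmpeg command that samples n_frames evenly spaced frames."""
    if duration_s <= 0 or n_frames <= 0:
        raise ValueError("duration_s and n_frames must be positive")
    rate = n_frames / duration_s  # output frames per second of video
    return [
        "ffmpeg", "-y",
        "-i", video_path,
        "-vf", f"fps={rate:.6f}",     # resample to the target frame rate
        "-frames:v", str(n_frames),   # stop after n_frames images
        str(Path(out_dir) / "frame_%03d.jpg"),
    ]


def extract_frames(video_path: str, out_dir: str,
                   duration_s: float, n_frames: int) -> list[Path]:
    """Run FFmpeg (must be installed) and return the extracted frame paths."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    cmd = build_ffmpeg_command(video_path, out_dir, duration_s, n_frames)
    subprocess.run(cmd, check=True)
    return sorted(Path(out_dir).glob("frame_*.jpg"))


if __name__ == "__main__":
    # Print the command for a 60-second clip sampled at 8 frames.
    print(" ".join(build_ffmpeg_command("clip.mp4", "frames", 60.0, 8)))
```

Keeping the frame count fixed bounds the number of images passed to the vision model per video, which matters when inference runs on limited hardware.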

Challenges we ran into

Lack of Compute Power: We ran our LLMs locally on consumer-grade hardware, resulting in long processing times
Prompt Engineering: Difficulty in tuning prompts so that the LLM avoids hallucinations and extracts relevant keywords
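One common way to limit hallucinated keywords, shown here only as an illustrative sketch and not as the approach the team used, is a post-processing check that keeps a keyword only if every word in it appears in the transcript or the frame description. The function name and the strict word-overlap rule are assumptions.

```python
import re


def _tokens(text: str) -> set[str]:
    """Lowercase a string and split it into alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def filter_grounded_keywords(keywords: list[str], transcript: str,
                             frame_description: str) -> list[str]:
    """Keep keywords whose words all occur in the source texts.

    Results are normalized to lowercase and de-duplicated, preserving
    the order in which the model produced them.
    """
    source = _tokens(transcript) | _tokens(frame_description)
    kept: list[str] = []
    seen: set[str] = set()
    for keyword in keywords:
        normalized = " ".join(keyword.lower().split())
        words = _tokens(normalized)
        if words and words <= source and normalized not in seen:
            kept.append(normalized)
            seen.add(normalized)
    return kept


if __name__ == "__main__":
    print(filter_grounded_keywords(
        ["Wireless Earbuds", "smart watch", "wireless earbuds"],
        "these wireless earbuds last all day",
        "a person holding white earbuds and a charging case",
    ))
```

A strict overlap check like this trades recall for precision: it drops synonyms and paraphrases the model may have produced legitimately, so it suits cases where a wrong product suggestion is worse than a missing one.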

What we learned

Prompt Engineering
Event Messaging Architecture
Video Streaming

What's next for VidScan

Scrape Comment Sections: Obtain product keywords from user comments to enhance the accuracy of product recognition and recommendation
User Feedback Loop: Incorporate user feedback to further refine and personalize keyword recommendations
Improved Prompt Interpretation: Develop prompt interpretation mechanisms to reduce misinterpretation of video content, ensuring more accurate and relevant product suggestions

Built With

docker
express.js
fastapi
kafka
llama3
minio
mongodb
phi3vision
react

Updates

Rowen Tey started this project — Jul 07, 2024 09:16 PM EDT
