🚀 Inspiration

Over the past year, Mani and Bhanu have been diving deep into one of AI's toughest challenges: video understanding. Despite progress in foundation models, reliably interpreting videos, especially in real-world, dynamic settings, remains unsolved. Our direction was shaped by:

  • The Twelve Labs blog on Context Engineering, which argued that the next leap won’t come from bigger models, but richer, adaptive context and self-healing memory.
  • A discussion at the All-In Summit 2025 where Mark Cuban and Tucker Carlson debated the future of video AI, reinforcing the need for systems that are both context-aware and user-aware.

These ideas led us to build mem[v]: the context and memory layer for multimodal agents.

🧠 What It Does

mem[v] creates a persistent memory graph from video content, extracting:

  • Episodic context (what happened)
  • Temporal context (when and in what order)
  • Semantic context (relationships and meaning)

Instead of re-processing videos repeatedly, AI agents query this memory layer instantly, enabling real-time insights 40x faster and at 1% of the cost. Process once. Remember everything. Query instantly.

mem[v] also integrates external business documents, such as brand guidelines, product specs, and campaign briefs, into a unified graph, turning raw video data into actionable business intelligence.
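The three context types above can be sketched as a minimal graph schema. This is a hypothetical illustration (the node kinds, relation names, and `MemoryGraph` API are ours, not mem[v]'s actual schema):

```python
from dataclasses import dataclass, field

@dataclass
class MemoryNode:
    node_id: str
    kind: str           # "episodic" for events, "semantic" for concepts
    label: str          # e.g. "Product X appears on screen"
    start_s: float = 0.0  # temporal context: when the event happened
    end_s: float = 0.0

@dataclass
class MemoryEdge:
    src: str
    dst: str
    relation: str       # e.g. "happens_before", "mentions", "aligns_with"

@dataclass
class MemoryGraph:
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def add_node(self, node: MemoryNode) -> None:
        self.nodes[node.node_id] = node

    def link(self, src: str, dst: str, relation: str) -> None:
        self.edges.append(MemoryEdge(src, dst, relation))

    def neighbors(self, node_id: str, relation: str) -> list:
        # Follow outgoing edges of one relation type.
        return [self.nodes[e.dst] for e in self.edges
                if e.src == node_id and e.relation == relation]

# Episodic nodes with temporal ordering: Product X after a competitor mention.
g = MemoryGraph()
g.add_node(MemoryNode("m1", "episodic", "competitor mention", 12.0, 15.5))
g.add_node(MemoryNode("m2", "episodic", "Product X appears", 18.0, 22.0))
g.link("m1", "m2", "happens_before")
print([n.label for n in g.neighbors("m1", "happens_before")])
```

Once events live in a graph like this, "what happened after X" becomes an edge traversal instead of another pass over the video.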

🛠️ How We Built It

Tech Stack:

  • Video Understanding: Twelve Labs (Pegasus + Marengo)
  • Reasoning: OpenAI GPT-4
  • Context Graph: Neon Postgres (Graph schema)
  • Query Layer: Redis cache + GPT-powered logic
  • Frontend: Next.js
  • Auth: Clerk

We built intelligent chunking, stateful context tracking, and custom prompt pipelines to overcome API context-length limits and the lack of multi-turn capabilities.
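The chunking idea is simple: split each video into fixed windows that overlap, so an event straddling a boundary is seen by both neighboring chunks. A minimal sketch (the window and overlap sizes are illustrative, not our production values):

```python
def chunk_video(duration_s: float, chunk_s: float = 300.0,
                overlap_s: float = 30.0) -> list:
    """Split a video into (start, end) windows with overlap so events
    on a chunk boundary are captured by both neighboring chunks."""
    if overlap_s >= chunk_s:
        raise ValueError("overlap must be shorter than the chunk")
    chunks, start = [], 0.0
    while start < duration_s:
        end = min(start + chunk_s, duration_s)
        chunks.append((start, end))
        if end >= duration_s:
            break
        start = end - overlap_s  # step back to create the overlap
    return chunks

print(chunk_video(700.0))  # three overlapping 5-minute windows
```

Each window is then sent through the video-understanding API independently, and the resulting events are merged into the shared graph, with the overlap used to deduplicate boundary events.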

⚔️ Challenges We Faced

  • No multi-turn chat support in Twelve Labs → Built our own context manager
  • Rate limiting & unclear errors → Upgraded mid-hackathon to pay-as-you-go
  • Limited video context length → Engineered smart chunking strategies
  • No fine-tuning options → Relied on prompt engineering for domain-specific graphs
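The first workaround above, our own context manager, amounts to folding prior question-answer turns into each new single-turn prompt. A minimal sketch of that idea (class and method names are ours; the real manager also tracks video and chunk state):

```python
class VideoContextManager:
    """Simulate multi-turn chat on top of a single-turn video Q&A API
    by folding recent turns into each new prompt."""

    def __init__(self, max_turns: int = 6):
        self.history = []          # list of (question, answer) pairs
        self.max_turns = max_turns

    def build_prompt(self, question: str) -> str:
        # Keep only the most recent turns to respect context limits.
        recent = self.history[-self.max_turns:]
        lines = [f"Q: {q}\nA: {a}" for q, a in recent]
        lines.append(f"Q: {question}\nA:")
        return "\n\n".join(lines)

    def record(self, question: str, answer: str) -> None:
        self.history.append((question, answer))

mgr = VideoContextManager()
mgr.record("What product appears first?", "Product X at 0:18.")
prompt = mgr.build_prompt("Does it follow a competitor mention?")
print(prompt)
```

The single-turn API then receives `prompt` as its query, so follow-up questions can refer back to earlier answers even though the API itself is stateless.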

✅ Accomplishments

🔧 Technical Wins

  1. Built a memory layer on top of Twelve Labs
    • From one-time API calls → persistent, queryable memory
  2. Integrated external business context
    • PDFs, decks, catalogs, and performance data into a multimodal graph
  3. 40x speed improvement
    • From 30s+ video queries → <100ms with Redis + Graph
  4. Graph-based video reasoning

"Find moments where Product X appears after a competitor mention and aligns with brand guidelines (section 3.2)"

  5. First working prototype in 24 hours
    • Processed 20+ ad videos
    • Ingested 5+ docs
    • Created 500+ graph nodes and 2K+ relationships
  6. Tackled $80B ad waste problem
    • Reuses video memory across campaigns, teams, and platforms
  7. Built a “single source of truth” for video intelligence
    • Unifies video content with business knowledge
  8. Context as infrastructure
    • Democratizing memory + context for all video AI applications
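The 40x speedup above comes from a cache-aside pattern: answer repeated questions from the cache, and only fall through to the slow graph-plus-LLM path on a miss. A runnable sketch, with an in-memory dict standing in for Redis so no server is needed (the TTL and key scheme are illustrative):

```python
import hashlib
import time

class QueryCache:
    """Cache-aside layer mapping a hashed query to (answer, expiry)."""

    def __init__(self, ttl_s: float = 3600.0):
        self.ttl_s = ttl_s
        self.store = {}

    def _key(self, query: str) -> str:
        # Stable key for the exact query text.
        return hashlib.sha256(query.encode()).hexdigest()

    def get_or_compute(self, query: str, slow_fn) -> str:
        key = self._key(query)
        hit = self.store.get(key)
        if hit and hit[1] > time.monotonic():
            return hit[0]                       # fast path: cache hit
        answer = slow_fn(query)                 # slow path: graph + LLM
        self.store[key] = (answer, time.monotonic() + self.ttl_s)
        return answer

calls = 0
def graph_query(q: str) -> str:
    global calls
    calls += 1                 # counts how often the slow path runs
    return f"answer({q})"

cache = QueryCache()
cache.get_or_compute("Where does Product X appear?", graph_query)
cache.get_or_compute("Where does Product X appear?", graph_query)
print(calls)  # → 1  (second call served from cache)
```

In production the dict is replaced by Redis (e.g. `SET key value EX ttl` / `GET key`), which gives the same hit path in sub-millisecond time across processes.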

🔍 Why It Matters

  1. We amplify, not compete with, Twelve Labs
Just as Pinecone powers OpenAI, we power Twelve Labs outputs
  2. Closed the context gap
Bridge between raw video understanding and institutional knowledge
  3. Unlocked real-world scalability
40x faster and 100x cheaper = deployable at scale
  4. Built what the industry theorized
First working prototype of context-engineered video memory
  5. Immediate revenue path
Ad industry needs this now: massive ROI, immediate need
  6. Multimodal data lake
Videos, documents, and structured data, all queryable via natural language

📚 What We Learned

  • Context beats model size
  • Memory compounds
  • Graphs + vectors = 🔥
  • Most AI failures = context failures, not model limitations
  • Foundation models need infrastructure to become usable

🚧 What’s Next

  • Launching SDKs: memvai on pip + npm (already registered)
  • Collaborating with Twelve Labs to become a selected customer for fine-tuning
  • Onboarding 5–10 design partners in advertising
  • Proving 40%+ CPM improvements in real-world campaigns
  • Building privacy-preserving federated memory for cross-customer learning
  • Expanding into fashion, e-learning, and media AI

Long-term: mem[v] becomes the universal memory layer for multimodal AI.

🌐 The Bigger Picture

Twelve Labs democratized video understanding models. We're making video memory + business context usable. Together, we're building the infrastructure for next-gen AI agents, where video understanding meets institutional memory and insights become truly actionable.
