Media Manager: An Agentic AI Studio for Social Media

Login page for users
All Contents page, used to create content
Content creation page with title and content details
Upon selecting Brainstorm, the AI brings in business context and processes the request
The user can debate and discuss the content with the AI, grounded in business context
AI checks all aspects by asking follow-up questions about the content
I extracted the audio from a recording of the same product developed here
AI-generated content: caption, hook/headline, hashtags
AI started processing
Script Panel in Studio
Voice generator using ElevenLabs

Modern social media tools are optimized for clicks, templates, and scheduling — not for how humans actually create content. Real ideation is conversational, iterative, and deeply tied to business context such as brand tone, audience, goals, and market positioning.

While working with conversational voice models and LLM agents, we identified a major gap: AI can generate text, but it rarely understands why a business is saying something.

At the same time, companies are expected to produce high-volume, multilingual, platform-specific content without losing brand consistency. This inspired us to build Media Manager, a business-context-aware, voice-first agentic AI system that behaves like a real Creative Director, not a text generator.

What It Does

Media Manager is a business-context-aware, conversational AI platform for end-to-end social media content creation.

Core Capabilities

Voice-First Interaction: Users interact entirely through speech. Responses are generated using ElevenLabs conversational agents.

Agentic Content Reasoning: A multi-turn “Creative Director” agent that asks clarifying questions tied to business intent, reasons over brand goals, audience type, and platform norms, and iteratively refines scripts into structured social content.

Business Context Injection: Every prompt is enriched with brand voice and tone, target audience demographics, marketing objectives (awareness, conversion, trust), and platform constraints (LinkedIn vs. X).

Multilingual Localization: Context-preserving translation and neural voice dubbing.

Unified Content Pipeline: Script → Translation → Audio → Scheduling-ready assets.
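
The Script → Translation → Audio → Scheduling-ready flow can be pictured as a chain of stages that each take and return the same asset object. The sketch below is only an illustration under assumptions: the `Asset` structure, the stage helpers, and the stand-in `fake_translate` and `fake_synthesize` backends are hypothetical names, not the project's actual code or any vendor's API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Asset:
    """Content as it moves through the pipeline (hypothetical structure)."""
    script: str
    language: str = "en"
    translations: Dict[str, str] = field(default_factory=dict)
    audio: Dict[str, bytes] = field(default_factory=dict)
    ready_to_schedule: bool = False


# A stage is any function that takes an Asset and returns an Asset.
Stage = Callable[[Asset], Asset]


def make_translate_stage(targets: List[str],
                         translate: Callable[[str, str], str]) -> Stage:
    """Build a stage that translates the script into each target language."""
    def stage(asset: Asset) -> Asset:
        for lang in targets:
            asset.translations[lang] = translate(asset.script, lang)
        return asset
    return stage


def make_audio_stage(synthesize: Callable[[str, str], bytes]) -> Stage:
    """Build a stage that produces audio for the script and every translation."""
    def stage(asset: Asset) -> Asset:
        texts = {asset.language: asset.script, **asset.translations}
        for lang, text in texts.items():
            asset.audio[lang] = synthesize(text, lang)
        return asset
    return stage


def mark_ready(asset: Asset) -> Asset:
    """Flag the asset as scheduling-ready once at least one audio track exists."""
    asset.ready_to_schedule = bool(asset.audio)
    return asset


def run_pipeline(asset: Asset, stages: List[Stage]) -> Asset:
    """Apply the stages in order."""
    for stage in stages:
        asset = stage(asset)
    return asset


def fake_translate(text: str, lang: str) -> str:
    """Stand-in for a real translation backend (assumption)."""
    return f"[{lang}] {text}"


def fake_synthesize(text: str, lang: str) -> bytes:
    """Stand-in for a real voice synthesis backend (assumption)."""
    return text.encode("utf-8")


if __name__ == "__main__":
    stages = [
        make_translate_stage(["es", "fr"], fake_translate),
        make_audio_stage(fake_synthesize),
        mark_ready,
    ]
    result = run_pipeline(Asset(script="Launch day is here."), stages)
    print(sorted(result.audio), result.ready_to_schedule)
```

Passing the backends in as plain callables keeps each stage testable without network access and lets a real translation or voice service be swapped in later.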

How We Built It

Architecture Overview

The platform is designed as a context-driven AI orchestration system, where business metadata directly influences reasoning and output quality.

Context Modeling (Key Innovation)

Each company is modeled as a Business Context Vector: C = {tone, audience, goals, industry, platform_rules}

This context is:

Persisted in PostgreSQL
Injected into every agent prompt
Used as a constraint during generation, translation, and dubbing

This ensures brand consistency across languages and formats.
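
One way to picture the Business Context Vector is as a small record with the five fields listed above, serialized to JSON for storage and rendered into a preamble for each agent prompt. The following is a minimal sketch under assumptions: the class name, the JSON serialization (which could be stored in a PostgreSQL JSON column), the prompt wording, and the `inject_context` helper are all hypothetical and not taken from the project's codebase.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict, List


@dataclass(frozen=True)
class BusinessContext:
    """The five fields of the context vector C (field types are assumptions)."""
    tone: str
    audience: str
    goals: List[str]
    industry: str
    platform_rules: Dict[str, str]

    def to_json(self) -> str:
        """Serialize for persistence, e.g. in a JSON column."""
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, raw: str) -> "BusinessContext":
        """Rebuild the context from its stored JSON form."""
        return cls(**json.loads(raw))


def inject_context(ctx: BusinessContext, platform: str, user_request: str) -> str:
    """Prefix a user request with the business context for one platform."""
    rule = ctx.platform_rules.get(platform, "no platform-specific rule recorded")
    return "\n".join([
        "You are a Creative Director for this brand.",
        f"Tone: {ctx.tone}",
        f"Audience: {ctx.audience}",
        f"Goals: {', '.join(ctx.goals)}",
        f"Industry: {ctx.industry}",
        f"Platform ({platform}): {rule}",
        f"Request: {user_request}",
    ])


if __name__ == "__main__":
    ctx = BusinessContext(
        tone="confident but friendly",
        audience="early-stage founders",
        goals=["awareness", "trust"],
        industry="B2B SaaS",
        platform_rules={"LinkedIn": "professional, longer form",
                        "X": "short and punchy"},
    )
    restored = BusinessContext.from_json(ctx.to_json())
    print(inject_context(restored, "X", "Announce our new onboarding flow."))
```

Because the same record feeds every prompt, generation, translation, and dubbing requests all see identical constraints, which is the property the section attributes to context injection.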

Challenges We Ran Into

Maintaining context across multi-turn voice conversations
Filtering LLM metadata leakage into audio pipelines
Latency optimization for real-time speech responses
Fallback handling when voice synthesis APIs fail
Ensuring translation preserves intent, not just words
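
Two of these challenges, metadata leaking into audio and voice synthesis failures, lend themselves to a short illustration. The sketch below is a rough heuristic under assumptions, not the project's implementation: `sanitize_for_tts` strips bracketed annotations and markdown symbols before text reaches a voice engine, and `synthesize_with_fallback` tries a list of provider callables in order. The provider functions are hypothetical stand-ins and do not represent any real vendor API.

```python
import re
from typing import Callable, List, Optional

# Heuristic patterns for text that should never be spoken aloud.
_CODE_FENCE = re.compile(r"`{3}.*?`{3}", re.DOTALL)   # fenced code blocks
_BRACKETED = re.compile(r"\[[^\]]*\]|\{[^}]*\}")      # [notes] and {json-like} spans
_MARKDOWN = re.compile(r"[*#`>]+")                    # emphasis, headings, quotes


def sanitize_for_tts(text: str) -> str:
    """Remove annotations and markup so only speakable text remains."""
    text = _CODE_FENCE.sub(" ", text)
    text = _BRACKETED.sub(" ", text)
    text = _MARKDOWN.sub("", text)
    return re.sub(r"\s+", " ", text).strip()


def synthesize_with_fallback(
    text: str,
    providers: List[Callable[[str], bytes]],
) -> Optional[bytes]:
    """Try each provider in order and return the first audio produced.

    Returns None when every provider fails, so the caller can fall back
    to showing the text response instead of audio.
    """
    clean = sanitize_for_tts(text)
    for provider in providers:
        try:
            return provider(clean)
        except Exception:
            # A production system would log the failure before moving on.
            continue
    return None


def failing_provider(text: str) -> bytes:
    """Hypothetical primary voice provider that is currently unavailable."""
    raise RuntimeError("primary voice provider unavailable")


def backup_provider(text: str) -> bytes:
    """Hypothetical secondary provider; returns bytes as a stand-in for audio."""
    return text.encode("utf-8")


if __name__ == "__main__":
    reply = "**Hook:** Save time [tone: upbeat] today"
    print(sanitize_for_tts(reply))
    print(synthesize_with_fallback(reply, [failing_provider, backup_provider]))
```

Sanitizing once, before the provider loop, means every fallback provider receives the same cleaned text.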

Accomplishments That We’re Proud Of

Built a fully voice-driven social media studio
Implemented business-context-aware agentic reasoning
Achieved natural multilingual voice output at scale
Designed a production-ready, extensible SaaS architecture

What’s Next for Media Manager

Autonomous publishing agents
Video + avatar generation
Performance-feedback loops into agent memory
Enterprise-grade analytics and governance

Built With

a2a
adk
cloudrun
elevenlabs
fastapi
flask
gemini
nextjs
postgresql
python
vertexai

Updates

Harish Ramachandran started this project — Dec 31, 2025 04:21 PM EST
