Inspiration
Creative operations, whether in corporate marketing, game development studios, or publishing houses, suffer from tool fragmentation. Teams lose hours bouncing between separate tools for project management, copy drafting, asset design, and character alignment testing.
We were inspired to build Pixels AI to unify the creative pipeline: an intelligent, high-efficiency command center where creative directors, copywriters, and multimedia designers can execute a product roadmap in one hub.
What it does
Pixels AI is an automated, end-to-end production environment optimized for content creation workflows and rapid media asset prototyping:
The Design Canvas: Streamlines design operations via specialized multi-turn text-to-image and reference-based image-to-image generation layers, letting teams iteratively edit and refine visual assets through standard conversation loops.
Autonomous Planning Pipeline: Eliminates the blank-page bottleneck. Creative operators enter a core concept, and the AI autonomously generates full character matrices, world or branding locations, and structured plot or campaign blueprints.
Contextual Execution & Human-Like Feedback: Generates long-form content with real-time text streaming, drawing on the parameters saved in your project database. To stress-test alignment, creators can chat with up to five distinct character persona agents to check that the script's dialogue matches each character's behavior and speech profile.
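The five-persona cap and per-character speech profiles described above could be modeled roughly as follows. This is a minimal sketch, not the project's actual code: the `Persona` shape, `MAX_PERSONAS`, `addPersona`, and `buildSystemPrompt` are hypothetical names introduced for illustration.

```typescript
// Hypothetical sketch: a persona roster capped at five, each with its own system prompt.
interface Persona {
  name: string;
  role: string;
  speechStyle: string; // e.g. "terse and formal"
}

const MAX_PERSONAS = 5; // cap taken from the feature description

// Returns a new roster; rejects duplicates and anything past the cap.
function addPersona(roster: Persona[], persona: Persona): Persona[] {
  if (roster.length >= MAX_PERSONAS) {
    throw new Error(`A session supports at most ${MAX_PERSONAS} personas`);
  }
  if (roster.some((p) => p.name === persona.name)) {
    throw new Error(`Persona "${persona.name}" already exists`);
  }
  return [...roster, persona];
}

// Builds the instruction that keeps a chat agent in character.
function buildSystemPrompt(p: Persona): string {
  return `You are ${p.name}, the ${p.role}. Speak in a ${p.speechStyle} style and stay in character.`;
}
```

Returning a new array instead of mutating the roster keeps the sketch easy to use with React state, where updates are expected to produce new references.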
How we built it
The platform is engineered with a hyper-focused, minimal workspace aesthetic, utilizing responsive tab layouts, an intuitive settings drawer console, and hardware-accelerated CSS effects like custom conic-gradient indicators and fluid multi-row masonry asset grids.
The entire automated data flow and intelligence engine is integrated using Medo Skills APIs and Plugins:
Asset Management Layer: The Medo Skills Image Generation (Lite Version) plugin handles all raw image vectors, reference-to-image conversions, and iterative state updates for multi-turn editing.
Autonomous Operation Layer: Gemini 2.5 Flash integrated through the Medo Skills ecosystem drives the background planning agents, handles real-time text streaming outputs, and hosts the individualized conversational testbed instances.
Challenges we ran into
Our primary technical bottleneck was state preservation and contextual orchestration. Managing highly granular corporate style guides, localized character relationship constraints, and specific dialogue parameters across a multi-tab system required building an incredibly robust data tree.
We had to ensure that when the background planner updated parameters, those details instantly and dynamically mapped onto both the streaming story module and the character-agent chat memory maps without introducing server latency.
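One common way to get this kind of propagation is a single client-side store that the planner writes to and that the story and chat modules subscribe to, so an update reaches both without a server round trip. The sketch below illustrates that pattern under stated assumptions; `ProjectState`, `ProjectStore`, and their fields are hypothetical and do not describe the project's real data tree.

```typescript
// Hypothetical sketch: one shared store; the planner writes, other modules subscribe.
interface ProjectState {
  styleGuide: string;
  characters: Record<string, { role: string; speechStyle: string }>;
}

type Listener = (state: ProjectState) => void;

class ProjectStore {
  private state: ProjectState;
  private listeners = new Set<Listener>();

  constructor(initial: ProjectState) {
    this.state = initial;
  }

  getState(): ProjectState {
    return this.state;
  }

  // Registers a listener and returns a function that unsubscribes it.
  subscribe(listener: Listener): () => void {
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  }

  // Merges a partial update into a new state object, then notifies
  // every subscriber synchronously in the same tick.
  update(patch: Partial<ProjectState>): void {
    this.state = { ...this.state, ...patch };
    this.listeners.forEach((listener) => listener(this.state));
  }
}
```

In this arrangement the streaming story module and each character-chat instance would each call `subscribe` once and rebuild their prompt context whenever the planner calls `update`.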
Accomplishments that we're proud of
We turned a process that traditionally takes creative agencies days of brainstorming into an autonomous blueprint pipeline that runs in about 60 seconds.
Watching the AI structure character maps, role hierarchies, and chapter objectives, and then seeing those variables accurately govern the real-time text stream and character chat instances, was a major milestone for our team's operational goals.
What we learned
We gained critical insight into optimizing developer velocity and reducing infrastructure overhead by centralizing our backend, storage pools, and multimodal AI configurations directly into a single unified workspace. Building an interconnected agent loop taught us how to pass dense JSON context arrays effectively across concurrent LLM tasks without dropping key operational variables.
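To illustrate the idea of sharing one JSON context across concurrent LLM tasks, here is a minimal sketch. It assumes a generic `ModelCall` function standing in for the real model request (the actual Medo Skills or Gemini calls are not shown), and `SharedContext`, `buildPrompt`, and `runTasks` are hypothetical names.

```typescript
// Hypothetical sketch: fan one context object out to several concurrent tasks.
interface SharedContext {
  concept: string;
  characters: string[];
  styleGuide: string;
}

// Stand-in for a real model request; swap in the actual client call.
type ModelCall = (prompt: string) => Promise<string>;

const REQUIRED_KEYS: (keyof SharedContext)[] = ["concept", "characters", "styleGuide"];

// Fails fast if a required field is missing, so no task runs with partial context.
function buildPrompt(task: string, ctx: SharedContext): string {
  for (const key of REQUIRED_KEYS) {
    if (ctx[key] == null) {
      throw new Error(`Context is missing "${key}"`);
    }
  }
  return `${task}\n\nCONTEXT:\n${JSON.stringify(ctx)}`;
}

async function runTasks(
  tasks: string[],
  ctx: SharedContext,
  call: ModelCall
): Promise<Record<string, string>> {
  // Shallow-freeze a copy so no task can reassign top-level fields mid-flight.
  const frozen = Object.freeze({ ...ctx });
  const results: Record<string, string> = {};
  await Promise.all(
    tasks.map(async (task) => {
      results[task] = await call(buildPrompt(task, frozen));
    })
  );
  return results;
}
```

Every task serializes the same frozen snapshot, so the planner, story, and chat prompts cannot drift apart while requests are in flight. Note that `Object.freeze` is shallow; nested arrays such as `characters` would need a deep copy if tasks might mutate them.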
What's next for Pixels AI
We aim to push creative workspace productivity even further:
Concurrent Multi-Agent Team Chats: Introducing collaborative meeting rooms where multiple automated character personas can interact with one another and the human operator to audit script consistency simultaneously.
Automated Asset Mapping: Integrating a pipeline that matches the autonomous character text outputs with the image-to-image canvas to automatically generate matching avatar sets from text descriptions.
Enterprise Document Exporting: Implementing advanced document compiling to let project managers instantly bundle planning documents, visual mockups, and final text scripts into professional EPUB, PDF, or markdown manifests.
Built With
- ai-agents
- ai-image-generation-api
- ai-story-generation
- automation
- content-creation
- css
- gemini-2.5-flash
- generative-ai
- html
- javascript
- llm
- medo
- react
- typescript
- vite