Inspiration

Eidos was born from a frustration with the "black box" nature of modern AI image generation. While tools like Midjourney and DALL-E 3 are powerful, they often lack the precision needed for professional workflows. We wanted to build a tool where the AI acts not just as a generator but as a collaborator—an agent that understands intent and provides tactile, semantic controls. We envisioned a world where you could manipulate an image as easily as a 3D scene, with lighting, composition, and style at your fingertips.

What it does

Eidos is a multi-agent image editing platform that bridges the gap between natural language prompting and professional-grade editing.

  • Agentic Collaboration: A backend powered by Gemini 3 orchestrates multiple specialized agents to analyze images, enhance prompts, and execute complex edits.
  • 3D Spatial Control: A real-time 3D "Composer" allows users to set up lighting, camera angles, and object positions, which are then translated directly into semantic prompt parameters.
  • Dynamic Semantic UIs: Instead of generic text boxes, Eidos generates custom sliders, dropdowns, and color pickers tailored to the specific generation model and intent (see the control-spec sketch after this list).
  • Pro-grade Workflows: Multi-model support (fal.ai, Replicate), image-to-image refinement, and a seamless desktop experience built with Electron.
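
To make the dynamic-UI idea concrete, here is a minimal sketch of what an agent-generated control spec could look like. The ControlSpec shape, field names, and toPromptFragment helper are illustrative assumptions, not Eidos's actual schema:

```typescript
// Hypothetical shape of an agent-generated control spec
// (names are illustrative, not Eidos's real schema).
type ControlSpec =
  | { kind: "slider"; id: string; label: string; min: number; max: number; step: number; value: number }
  | { kind: "dropdown"; id: string; label: string; options: string[]; value: string }
  | { kind: "color"; id: string; label: string; value: string }; // hex color

// Example payload the backend might emit after analyzing an image:
const controls: ControlSpec[] = [
  { kind: "slider", id: "rim_light", label: "Rim light intensity", min: 0, max: 1, step: 0.05, value: 0.4 },
  { kind: "dropdown", id: "style", label: "Film stock", options: ["Portra 400", "Ektachrome", "CineStill 800T"], value: "Portra 400" },
  { kind: "color", id: "key_tint", label: "Key light tint", value: "#ffd9a0" },
];

// Each widget edit is folded back into the prompt parameters.
function toPromptFragment(c: ControlSpec): string {
  switch (c.kind) {
    case "slider":   return `${c.label.toLowerCase()}: ${c.value.toFixed(2)}`;
    case "dropdown": return `${c.label.toLowerCase()}: ${c.value}`;
    case "color":    return `${c.label.toLowerCase()}: ${c.value}`;
  }
}
```

Because the renderer only switches on `kind`, the backend can invent new controls for a given model and intent without any frontend changes.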

How we built it

We leveraged a modern, high-performance stack to ensure a fluid creative experience:

  • Frontend: A React-based renderer using Tailwind CSS. We used Three.js (React Three Fiber) for the 3D controls and Framer Motion for smooth, interactive transitions.
  • Backend: A robust FastAPI service managing the agentic lifecycle. We integrated pydantic-ai and Agno for agent orchestration, using Gemini 3 as the primary "brain" and Nano Banana Pro for image generation.

Challenges we ran into

  • 3D-to-Text Mapping: Translating 3D spatial properties (like directional lighting and tilt) into text that AI models actually respect took nontrivial vector math and careful prompt engineering (a sketch of the idea follows this list).
  • WebGPU & Threading: Implementing high-performance client-side features required navigating the complexities of WebGPU and Emscripten, specifically offloading heavy tasks to pthreads to keep the UI responsive.
  • State Synchronicity: Orchestrating state between a real-time 3D canvas, a dynamic chat interface, and a remote agent backend demanded a carefully structured Zustand store and resilient WebSocket handling (a minimal store sketch also follows this list).
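
For the 3D-to-text challenge, here is a simplified sketch of the kind of mapping involved: quantizing a light's direction vector into the nearest named direction a text-to-image model reliably understands. The thresholds, vocabulary, and describeLight helper are our own illustrative assumptions, not the production mapping:

```typescript
type Vec3 = { x: number; y: number; z: number };

function normalize(v: Vec3): Vec3 {
  const len = Math.hypot(v.x, v.y, v.z) || 1;
  return { x: v.x / len, y: v.y / len, z: v.z / len };
}

function describeLight(dir: Vec3): string {
  const d = normalize(dir);
  // Horizontal angle picks left/right/front/back; elevation picks high/low.
  const azimuth = Math.atan2(d.x, d.z); // radians, 0 = from the front
  const elevation = Math.asin(d.y);     // radians, positive = from above
  const side =
    Math.abs(azimuth) < Math.PI / 8 ? "front" :
    Math.abs(azimuth) > (7 * Math.PI) / 8 ? "behind (backlit)" :
    azimuth > 0 ? "camera left" : "camera right";
  const height =
    elevation > Math.PI / 6 ? "high" :
    elevation < -Math.PI / 6 ? "low" : "eye-level";
  return `soft key light from ${height} ${side}`;
}

// e.g. describeLight({ x: -0.5, y: 0.8, z: 0.3 })
//   -> "soft key light from high camera right"
```

The real mapping also covers camera angle and subject position, but the pattern is the same: continuous 3D state in, a small controlled vocabulary of prompt fragments out.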
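
And for the state-synchronicity challenge, a minimal Zustand sketch of the pattern we mean: one store as the single source of truth, fed by one WebSocket. The state shape, event types, and endpoint below are hypothetical, not Eidos's actual store:

```typescript
import { create } from "zustand";

// Illustrative state tying the 3D canvas, chat, and agent backend together.
interface EidosState {
  lightDirection: [number, number, number];
  messages: { role: "user" | "agent"; text: string }[];
  generating: boolean;
  setLightDirection: (d: [number, number, number]) => void;
  applyAgentEvent: (ev: { type: string; payload: unknown }) => void;
}

export const useEidosStore = create<EidosState>()((set) => ({
  lightDirection: [0, 1, 0],
  messages: [],
  generating: false,
  setLightDirection: (d) => set({ lightDirection: d }),
  // A single entry point for server events keeps canvas, chat, and
  // agent status from drifting out of sync.
  applyAgentEvent: (ev) =>
    set((s) => {
      switch (ev.type) {
        case "message":
          return { messages: [...s.messages, { role: "agent" as const, text: String(ev.payload) }] };
        case "generation:start":
          return { generating: true };
        case "generation:done":
          return { generating: false };
        default:
          return s;
      }
    }),
}));

// One WebSocket feeds the store; the 3D canvas and chat UI subscribe
// to slices of it rather than talking to the socket directly.
const ws = new WebSocket("ws://localhost:8000/agent"); // hypothetical endpoint
ws.onmessage = (e) => useEidosStore.getState().applyAgentEvent(JSON.parse(e.data));
```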

Accomplishments that we're proud of

We've created an experience where "AI" doesn't just mean "typing a prompt." Being able to rotate a light source in 3D and watch the image update is a game-changer. Our system doesn't just generate; it reasons. It can look at an image, identify its flaws, and suggest specific semantic controls to fix them. We built a design language that feels premium and focused, minimizing distraction and maximizing the creative "flow."

What we learned

We learned that users don't just want "better" images; they want their images. Precision is the most valuable feature in generative AI. Designing interfaces for agents requires a shift from static forms to dynamic, progressive disclosure: you have to design for what the agent might need next. And in generative AI especially, where model latency is high, a snappy UI with immediate feedback (like the 3D canvas) is crucial for maintaining the illusion of direct manipulation.

What's next for Eidos

  • Multi-Object Compositions: Expanding the 3D composer to handle multiple subjects with individual controls.
  • Local-First Capabilities: Moving more model execution to the edge using WebGPU for near-instant latency.
  • Fine-Tuning Integration: Allowing users to train or bring their own LoRAs directly into the agentic workflow.
  • Collaborative Editing: Real-time multi-user "whiteboarding" with AI agents.
