Echo

Track: Creative Tools & Media

Inspiration

When commissioning posters, images, and other creative media, people turn to freelance designers or design services for professional-grade results. To outsource media creation, a customer must vividly describe what they want, specifying every minute detail of their idea. But a picture is worth a thousand words, and customers can almost never convey their exact vision through words alone. Echo aims to bridge this creative divide, helping designers understand precisely what their customers want so they can fulfill requests with minimal back-and-forth communication.

What it does

This project solves a fundamental problem in designer-client relationships: the difficulty of translating abstract requirements into visual concepts. By combining conversational AI and real-time image generation, the system creates a collaborative specification process where:

  • Clients answer guided questions about their design needs
  • After each response, an AI generates an updated visual concept
  • Clients provide feedback on these images, creating an iterative refinement loop
  • The system builds a progressively more accurate understanding of the client's vision
  • Designers receive both detailed written specifications and approved visual concepts

This approach significantly reduces miscommunication, revision cycles, and frustration for both parties while producing better-aligned final designs. The client gets to "see their thoughts" materialize in real-time, while designers start with a clear, client-validated direction.
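The refinement loop above can be sketched in a few lines of Python. The helper functions here are hypothetical stand-ins for the real model calls (Claude for questions, Stable Image Ultra for rendering); only the loop structure reflects how the system works.

```python
# Sketch of the iterative refinement loop. ask_question and generate_image
# are hypothetical placeholders for the actual Bedrock model calls.

def ask_question(history):
    # Placeholder: Claude would generate the next clarifying question here.
    return f"Question {len(history) // 2 + 1}: can you tell me more?"

def generate_image(history):
    # Placeholder: Stable Image Ultra would render a concept here.
    return f"<concept image based on {len(history)} messages>"

def refine(client_answers):
    """Run one guided Q&A pass, producing an updated image after each answer."""
    history, images = [], []
    for answer in client_answers:
        history.append(("assistant", ask_question(history)))
        history.append(("client", answer))
        images.append(generate_image(history))
    return history, images

history, images = refine(["A concert poster", "Dark background, neon colors"])
```

Each pass through the loop both extends the shared conversation history and yields a fresh concept image, so the client's feedback always targets the latest render.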

How we built it

We used the following technologies:

  • AWS Bedrock (Claude 3.5 Sonnet) to ask intelligent prompt-refinement questions
  • AWS Bedrock (Stability AI's Stable Image Ultra v1.1) to generate high-quality, production-grade images
  • Amazon Lightsail for fast prototyping and hosting our project
  • Amazon SageMaker to manage our model interaction logic and orchestrate calls between Claude 3.5 Sonnet and Stable Image Ultra
  • AWS Lambda to launch simulations and test the Lightsail server
  • AWS IAM to configure permissions
  • Python to manage project logic, service integration, conversation memory, and image handling
  • HTML to create the front-end user interface
  • Base64 image encoding to pass images through Claude
  • Git and GitHub for version control and collaboration

We began by integrating Claude 3.5 Sonnet from AWS Bedrock to enable intelligent prompt refinement through natural conversation. Once Claude understood the user's request, we connected it to Stability AI's Stable Image Ultra v1.1, which generated images from the refined prompt. These images were shown back to the user. To close the loop, we passed each resulting image back into Claude so it could analyze what was created and ask thoughtful follow-up questions to further refine and personalize the output. Finally, we tied everything together with Amazon Lightsail, creating a lightweight, accessible deployment pipeline that showcases Echo's full end-to-end functionality. We also ran rigorous simulations with Lambda and Bedrock to identify the best models and tune the pipeline for an engaging user experience.
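Passing a generated image back to Claude works by base64-encoding the raw bytes into an Anthropic-style message block. A minimal sketch of building that request body (the surrounding `invoke_model` call and model ID are omitted, and the follow-up question text is illustrative):

```python
import base64
import json

def image_feedback_message(image_bytes: bytes, question: str) -> dict:
    """Build a user message showing Claude a generated image alongside
    a follow-up question (Anthropic Messages API format)."""
    encoded = base64.b64encode(image_bytes).decode("utf-8")
    return {
        "role": "user",
        "content": [
            {
                "type": "image",
                "source": {
                    "type": "base64",
                    "media_type": "image/png",
                    "data": encoded,
                },
            },
            {"type": "text", "text": question},
        ],
    }

msg = image_feedback_message(b"\x89PNG...", "Does this match your vision?")
body = json.dumps({
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 512,
    "messages": [msg],
})
# body would then be passed to bedrock_runtime.invoke_model(...)
```

Base64 keeps the binary image data valid inside the JSON request body, which is why it appears in the technology list above.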

Challenges we ran into

  • Handling model restrictions and shortcomings, including token limits and model capabilities
  • Image restrictions on Claude 3.5 Haiku
  • Preserving conversation history and preventing the loss of details across long conversations
  • Creating multiple pipelines for images generated from Stable Image Ultra v1.1 to send to the user interface and Claude 3.5 Sonnet

Accomplishments that we're proud of

  • Learned a variety of AWS services, enabling complete project integration
  • Built a working multimodal consumer interface in less than 48 hours
  • Fully integrated AI chat and image generation pipeline
  • Resolved model challenges and optimized product based on model capabilities
  • Enabled three-way text and image communication between the user, Claude, and Stability
  • Created a system that allows for back-and-forth refinement like a real creative collaborator

What we learned

  • Hands-on experience with various AWS services
  • Accessing APIs through AWS Bedrock
  • Managing interactions within AWS SageMaker
  • Navigating AI model shortcomings and improving product resilience
  • Managing multimodal workflows in real-time applications
  • Deploying and debugging AI apps on Amazon Lightsail
  • Keeping UX simple while using powerful backend tools
  • Coordinating isolated development within a team

What's next for Echo

Integrating additional AWS services, such as AWS Amplify, would allow Echo to scale and grow more easily. We also hope to add new features, including:

  • Allowing reference images to be passed into Stability AI models for improved specificity and consistency
  • Enabling in-image editing to modify specific parts of a generated image while preserving others
  • Integrating specialized models for niche tasks such as anime generation, 3D animation, and illustrations
