Echo

Track: Creative Tools & Media

Inspiration

When commissioning posters, images, and other creative media, people turn to freelance designers or design services for professional-grade results. To outsource media creation, a customer must vividly describe what they want, specifying every minute detail of their idea. But a picture is worth a thousand words, and customers can almost never convey their exact vision through words alone. Echo aims to bridge this creative divide, helping designers understand precisely what their customers want so they can fulfill requests with minimal back-and-forth communication.

What it does

This project solves a fundamental problem in designer-client relationships: the difficulty of translating abstract requirements into visual concepts. By combining conversational AI and real-time image generation, the system creates a collaborative specification process where:

  • Clients answer guided questions about their design needs
  • After each response, an AI generates an updated visual concept
  • Clients provide feedback on these images, creating an iterative refinement loop
  • The system builds a progressively more accurate understanding of the client's vision
  • Designers receive both detailed written specifications and approved visual concepts

This approach significantly reduces miscommunication, revision cycles, and frustration for both parties while producing better-aligned final designs. The client gets to "see their thoughts" materialize in real-time, while designers start with a clear, client-validated direction.
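The refinement loop above can be sketched in a few lines of Python. The helper functions here are hypothetical stand-ins for the real model calls (Claude for questions, Stable Image Ultra for rendering); only the loop structure reflects how the system works.

```python
# Sketch of the iterative refinement loop. ask_question and generate_image
# are hypothetical placeholders for the actual Bedrock model calls.

def ask_question(history):
    # Placeholder: Claude would generate the next clarifying question here.
    return f"Question {len(history) // 2 + 1}: can you tell me more?"

def generate_image(history):
    # Placeholder: Stable Image Ultra would render a concept here.
    return f"<concept image based on {len(history)} messages>"

def refine(client_answers):
    """Run one guided Q&A pass, producing an updated image after each answer."""
    history, images = [], []
    for answer in client_answers:
        history.append(("assistant", ask_question(history)))
        history.append(("client", answer))
        images.append(generate_image(history))
    return history, images

history, images = refine(["A concert poster", "Dark background, neon colors"])
```

Each pass through the loop both extends the shared conversation history and yields a fresh concept image, so the client's feedback always targets the latest render.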

How we built it

We used the following technologies:

  • AWS Bedrock (Claude 3.5 Sonnet) to ask intelligent prompt-refinement questions
  • AWS Bedrock (Stability AI's Stable Image Ultra v1.1) to generate high-quality, production-grade images
  • Amazon Lightsail for fast prototyping and hosting our project
  • Amazon SageMaker to manage our model interaction logic and orchestrate calls between Claude 3.5 Sonnet and Stable Image Ultra
  • AWS Lambda to launch simulations and test the Lightsail server
  • AWS IAM to configure permissions
  • Python to manage project logic, service integration, conversation memory, and image handling
  • HTML to create the front-end user interface
  • Base64 image encoding to pass images through Claude
  • Git and GitHub for version control and collaboration

We began by integrating Claude 3.5 Sonnet from AWS Bedrock to enable intelligent prompt refinement through natural conversation. Once Claude understood the user's request, we connected it to Stability AI's Stable Image Ultra v1.1, which generated images from the refined prompt. These images were shown back to the user. To close the loop, we passed each resulting image back into Claude so it could analyze what was created and ask thoughtful follow-up questions to further refine and personalize the output. Finally, we tied everything together with Amazon Lightsail, creating a lightweight, accessible deployment pipeline that showcases Echo's full end-to-end functionality. We also ran rigorous simulations with Lambda and Bedrock to identify the best models and tune the pipeline for an engaging user experience.
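Passing a generated image back to Claude works by base64-encoding the raw bytes into an Anthropic-style message block. A minimal sketch of building that request body (the surrounding `invoke_model` call and model ID are omitted, and the follow-up question text is illustrative):

```python
import base64
import json

def image_feedback_message(image_bytes: bytes, question: str) -> dict:
    """Build a user message showing Claude a generated image alongside
    a follow-up question (Anthropic Messages API format)."""
    encoded = base64.b64encode(image_bytes).decode("utf-8")
    return {
        "role": "user",
        "content": [
            {
                "type": "image",
                "source": {
                    "type": "base64",
                    "media_type": "image/png",
                    "data": encoded,
                },
            },
            {"type": "text", "text": question},
        ],
    }

msg = image_feedback_message(b"\x89PNG...", "Does this match your vision?")
body = json.dumps({
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 512,
    "messages": [msg],
})
# body would then be passed to bedrock_runtime.invoke_model(...)
```

Base64 keeps the binary image data valid inside the JSON request body, which is why it appears in the technology list above.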

Challenges we ran into

  • Handling model restrictions and shortcomings, including token limits and model capabilities
  • Image restrictions on Claude 3.5 Haiku
  • Preserving conversation history and preventing the loss of details across long conversations
  • Creating multiple pipelines for images generated from Stable Image Ultra v1.1 to send to the user interface and Claude 3.5 Sonnet

Accomplishments that we're proud of

  • Learned a variety of AWS services, enabling complete project integration
  • Built a working multimodal consumer interface in less than 48 hours
  • Fully integrated AI chat and image generation pipeline
  • Resolved model challenges and optimized product based on model capabilities
  • Enabled three-way text and image communication between the user, Claude, and Stability
  • Created a system that allows for back-and-forth refinement like a real creative collaborator

What we learned

  • Hands-on experience with various AWS services
  • Accessing APIs through AWS Bedrock
  • Managing interactions within AWS SageMaker
  • Navigating AI model shortcomings and improving product resilience
  • Managing multimodal workflows in real-time applications
  • Deploying and debugging AI apps on Amazon Lightsail
  • Keeping UX simple while using powerful backend tools
  • Coordinating isolated development within a team

What's next for Echo

Integrating additional AWS services, such as AWS Amplify, would allow Echo to scale and grow more easily. We also hope to add new features, including:

  • Allowing reference images to be passed into Stability AI models for improved specificity and consistency
  • Enabling in-image editing to modify specific parts of a generated image while preserving others
  • Integrating specialized models for niche tasks such as anime generation, 3D animation, and illustrations
