Inspiration

Modern brands increasingly rely on visually memorable merchandise and creative packaging to build identity and deepen user engagement. Image-generation models are powerful tools for this, yet they are often unpredictable: users struggle to understand how the model identifies the main subject, maintains stability across variations, or responds to different guidance signals such as text prompts, reference images, or structural constraints. This opacity makes it difficult for creators to intentionally shape the final result.

What it does

Our project builds an explainable, controllable, and compliance-aware image-generation agent designed for real-world creative workflows. The agent combines a large language model with a safety detector to automatically craft effective prompts, ensure content compliance, and assist users in producing brand-consistent outputs.

How we built it

To address transparency, we expose the internal steps of diffusion generation by capturing the model’s predicted noise (ε), visualizing each denoising iteration, and showing how text and image conditions influence the reverse diffusion trajectory. This provides users with a clear understanding of why an image looks the way it does and how to guide the model toward their intended design.
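The epsilon-logging idea can be shown with a toy 1-D reverse-diffusion loop. This is a simplified sketch: `predict_eps` is a hypothetical stand-in for the trained U-Net, and the stochastic noise term of the DDPM sampler is omitted for determinism.

```python
import numpy as np

# Toy DDPM reverse pass that records the predicted noise (epsilon) at
# every denoising step, mirroring the logging we expose for real
# diffusion models. predict_eps is a stand-in for the learned U-Net.
T = 10
betas = np.linspace(1e-4, 0.02, T)        # forward-noise schedule
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def predict_eps(x, t):
    """Stand-in for the noise-prediction network."""
    return 0.1 * x

def reverse_diffusion(x_T):
    x = x_T
    eps_log = []  # one epsilon snapshot per step, for visualization
    for t in reversed(range(T)):
        eps = predict_eps(x, t)
        eps_log.append(eps.copy())
        # DDPM posterior-mean update (stochastic term omitted):
        # x_{t-1} = (x_t - beta_t / sqrt(1 - alpha_bar_t) * eps) / sqrt(alpha_t)
        x = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
    return x, eps_log

x0, eps_log = reverse_diffusion(np.ones(4))
```

Plotting each entry of `eps_log` (as images, for a real model) is exactly what lets users watch the denoising trajectory and see where a text or image condition starts to shape the result.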

Challenges we ran into

- Setting up the AWS instance to run the models
- Choosing an appropriate image-generation model
- Understanding how diffusion models work, including ablation and cross-attention
- Managing GPU memory requirements
- Finding the best safety methods for prompting and detecting potential threats

Accomplishments that we're proud of

We went from zero to fully understanding how diffusion models work, and independently designed a complete visualization system for the denoising process. Every component—from epsilon logging to cross-attention analysis and word-ablation tests—was built entirely by us, giving the project strong originality and technical depth.
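The word-ablation test mentioned above can be sketched as: remove one word at a time from the prompt, regenerate, and measure how much the output changes. Here `score` is a hypothetical stand-in for an image-similarity metric against the full-prompt generation; a real run would compare generated images instead.

```python
# Sketch of the word-ablation test. score() is a toy stand-in for an
# image-similarity metric; in practice each ablated prompt would be
# re-generated and compared to the full-prompt image.
def score(prompt: str) -> float:
    """Toy proxy metric: here, simply the word count."""
    return float(len(prompt.split()))

def word_ablation(prompt: str) -> dict:
    """Estimate each word's influence as the score drop when it is removed."""
    words = prompt.split()
    base = score(prompt)
    influence = {}
    for i, w in enumerate(words):
        ablated = " ".join(words[:i] + words[i + 1:])
        influence[w] = base - score(ablated)
    return influence
```

Words whose removal changes the output most are the ones the model is actually attending to, which complements the cross-attention maps.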

What we learned

We learned how to collaborate effectively on a complex system. Each team member owned a different part of the pipeline—agent workflow, safety checks, visualization, and diffusion tracing—and the integration came together smoothly. Through this, we created a transparent, controllable generation system that not only functions end-to-end but also reveals the inner mechanics of modern generative models.

What's next for GenAiExplainer

We plan to deepen the agent’s understanding of the underlying mechanisms of image-generation algorithms, expanding our transparency tools into more advanced architectures and real-world creative pipelines. Beyond technical improvements, our goal is to bring GenAiExplainer to market—turning it into a practical product that helps brands, designers, and creators generate controlled, trustworthy, and explainable visuals at scale.
