Inspiration
Most text-based image editing methods I've come across focus on making drastic changes to images: "Replace the Eiffel Tower with a bagel" or something like that. But when editing photos, you mostly want to retain the original image and make more targeted edits, e.g. improve the lighting, or change the backdrop while keeping the subject.
What it does
The Image Editor takes a link to an image and a prompt, then edits the image according to the prompt. It's a sort of "AI filter".
How we built it
The user's image is first converted to grayscale and then passed to a Stable Diffusion ControlNet. The ControlNet constrains the model to the original shape/structure of the image, and the model then fills in the rest.
This output is then blended with the original image. The resolution is then restored to the original image's quality using Laplacian reconstruction.
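The non-model steps above (grayscale conversion, blending, and restoring detail) can be sketched roughly as follows. This is a minimal NumPy illustration under stated assumptions, not the project's actual code: the function names, the blend weight `alpha`, the 2x box-filter pyramid, and the single-level Laplacian band are all hypothetical choices, and the ControlNet call itself is replaced by a simple brightening stand-in.

```python
import numpy as np

def to_grayscale(rgb):
    """Luma-weighted grayscale of an (H, W, 3) float image in [0, 1]."""
    weights = np.array([0.299, 0.587, 0.114])  # ITU-R BT.601 luma weights
    return rgb[..., :3] @ weights

def downsample(img):
    """Halve resolution by averaging 2x2 blocks (assumes even H and W)."""
    h, w = img.shape[:2]
    return img.reshape(h // 2, 2, w // 2, 2, -1).mean(axis=(1, 3))

def upsample(img):
    """Double resolution by nearest-neighbour repetition."""
    return img.repeat(2, axis=0).repeat(2, axis=1)

def blend(original, edited, alpha=0.6):
    """Linear mix: alpha weights the edited image (alpha is an assumed knob)."""
    return alpha * edited + (1.0 - alpha) * original

def restore_detail(original, edited_low):
    """Upsample the low-res edit and add back the original's finest
    Laplacian band (original minus its own down/up-sampled copy)."""
    detail = original - upsample(downsample(original))
    return np.clip(upsample(edited_low) + detail, 0.0, 1.0)

# Example run on random data standing in for a photo.
rng = np.random.default_rng(0)
original = rng.random((8, 8, 3))          # full-resolution input
low = downsample(original)                # working resolution for the model
control = to_grayscale(low)               # grayscale conditioning image
edited_low = np.clip(low * 1.2, 0.0, 1.0) # stand-in for the ControlNet output
blended_low = blend(low, edited_low, alpha=0.6)
result = restore_detail(original, blended_low)
```

One property of this formulation is that an edit which changes nothing reproduces the original exactly, since the upsampled low-resolution image and the detail band sum back to the input. That is the sense in which it helps retain the original photo while only the low-frequency content is edited.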
Challenges we ran into
Retaining the likeness of the subject of the original photo is a challenge. The editor works much better on single-subject portrait photos than on multi-subject photos.
Accomplishments that we're proud of
What we learned
What's next for Image Editor - AI Filters
Improving the generality and reliability of the results. Training a more general ControlNet would be beneficial here.
Built With
- controlnet
- python
- stablediffusion