Inspiration
A Google post announcing the Nano Banana release. They released it one day, and I released my first version the next.
What it does
- Use the LLMs, image generators, and video generators of your choice
- Can run completely locally, using only free downloads
- Or use the best image generators
- For online services, authenticate with:
  - API keys
  - Cloud authentication for Google and Anthropic
- Use reference images for image and video generation
  - Up to the maximum allowed for each model
- Use AI to generate or enhance image prompts
- Ask about files
- Ask about image prompt
- Build prompts using a built-in database of artists, styles, etc.
- Built-in examples and templates
- Full history
- Complete help
How I built it
Primarily with Claude Code. I used ChatGPT and Gemini a few times, early on.
Challenges I ran into
Too many to list here. Most are documented. These stand out:
- On the video tab, getting the prompt generator to work with lyrics and music, or with text only, depending on what you use, was especially challenging.
- Getting all the AI services to play nicely together was sometimes tricky.
- Most recently, the font generator posed particular challenges in recognizing and aligning glyphs.
Accomplishments that I'm proud of
The font generator and AI video storyboard generation, complete with lip-sync and the video prompt generators of your choice.
What I learned
Everything in this project has been a learning experience for me.
What's next for ImageAI
Enhance video generation.