Inspiration

A post from Google about the Nano Banana drop. They dropped it one day; I dropped my first version the next.

What it does

  • Use LLMs, image generators, and video generators of your choice
    • Can work completely local, with only free downloads
    • Or use the best image generators
  • For online services, authenticate with
    • API keys
    • Cloud authentication for Google and Anthropic
  • Use reference images for image and video
    • Up to the maximum allowed by each model
  • Use AI to generate or enhance image prompts
  • Ask questions about files
  • Ask questions about an image prompt
  • Build prompts using a built-in database of artists, styles, etc.
  • Built-in examples and templates
  • Full history
  • Complete help

How I built it

Primarily with Claude Code. I used ChatGPT and Gemini a few times, early on.

Challenges I ran into

Too many to list here. Most are documented. These stand out:

  • The video tab was especially challenging: the prompt generator had to work with lyrics and music, or with text only, depending on what you use.
  • Getting all the AI services to play nicely together was sometimes tricky.
  • Recently, the font generator posed special challenges in recognizing and aligning glyphs.

Accomplishments that I'm proud of

The font generator and AI video storyboard generation, complete with lip-sync and video prompt generators of your choice.

What I learned

Everything in this project was a learning experience for me.

What's next for ImageAI

Enhance video generation.
