Recently I have seen a lot of projects for content creators, usually based on OpenAI GPT-3, that generate marketing copy, ad titles, and so on. That made me think that generative assistant tools might be very useful and very powerful, though I had never used them myself. I decided to try to make such a tool, as simple as possible yet as useful as possible.

Because I write a decent number of blog posts, writing alt texts for all the images is a minor hassle I keep running into. I thought it would be much better for accessibility and image SEO if such texts were generated automatically and fine-tuned manually afterward.

What it does

It generates alt text for a photo and hosts the photo on a CDN for further use. It's as simple as that. A Telegram chatbot was chosen as an easy-to-use UI for this project.

How we built it

The project consists of several parts, but the primary one is an Azure Function that links all the parts together.

The function is a webhook that receives an image from a Telegram chatbot, checks that the user has any calls left, stores the image on the CDN, and describes it with Azure Cognitive Services. I use Azure NoSql to store user data (profile, ImagesLeft) and Azure Blob Storage for image storage.
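The flow described above could be sketched roughly as follows. This is a minimal illustration, not the actual function: the `User` record, the `FakeServices` class, the `handle_image` name, and the placeholder URL are all assumptions, and the storage and captioning calls are replaced with in-memory stand-ins so that only the order of steps is shown.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    """Hypothetical user record; only the ImagesLeft counter comes from the write-up."""
    user_id: str
    images_left: int


@dataclass
class FakeServices:
    """In-memory stand-ins for the user store, image storage and captioning call."""
    users: dict = field(default_factory=dict)
    blobs: dict = field(default_factory=dict)

    def store_image(self, name: str, data: bytes) -> str:
        # Keep the bytes in memory and return a placeholder (assumed) CDN URL.
        self.blobs[name] = data
        return f"https://cdn.example.invalid/{name}"

    def describe_image(self, data: bytes) -> str:
        # A real implementation would call the image description service here.
        return "a placeholder caption"


def handle_image(services: FakeServices, user_id: str, name: str, data: bytes) -> dict:
    """Check the quota, store the image, describe it, then decrement the quota."""
    user = services.users.get(user_id)
    if user is None or user.images_left <= 0:
        return {"ok": False, "message": "No image descriptions left."}
    url = services.store_image(name, data)
    alt = services.describe_image(data)
    user.images_left -= 1
    return {"ok": True, "alt": alt, "url": url}
```

Checking the quota before storing anything keeps a user with no calls left from consuming storage; whether the real function does it in exactly this order is not stated.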

For the chatbot, I used my very own chatbot platform. It consumes an XML scenario to run the chatbot flow and maintains the user state.

  <state name="Start">
    <transition input="start" next="Start" pending_keyboard="Markdown,HTML">Hello, {username}! This chatbot will generate a description (ALT text) for your image and provide you with CDN link if you need to host image.
    ===What kinds of links should I generate? *You could change it later by /reset command.*</transition>
    <transition input="Markdown" next="MarkdownMain">Markdown it is. Send me an image or image URL and I will generate a link like `![ALT](path/to/img)` for you.</transition>
    <transition input="HTML" next="HtmlMain">HTML it is. Send me an image or image URL and I will generate a link like `&lt;img src="path/to/img" alt="ALT"/&gt;` for you.</transition>
    <transition input="*" next="Start">Send me /start to begin</transition>
  </state>
  <state name="MarkdownMain">
    <transition input="unsubscribe" next="Start">Your unsubscribe request taken, I will cancel your subscription in 24h</transition>
    <transition input="reset" next="Start" no_stop="true"/>
    <transition input="*" next="MarkdownMain" morphology="msg" action="" action_type="post"/>
  </state>
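To illustrate how a scenario like this can drive a conversation, here is a small sketch of a state machine that reads states and transitions from XML. It is not the platform's real interpreter: the `<scenario>` root element, the shortened reply texts, the `load_scenario` and `step` helpers, the leading-slash handling, and the treatment of `*` as a catch-all are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# A trimmed, well-formed scenario; the <scenario> root is a hypothetical wrapper.
SCENARIO = """
<scenario>
  <state name="Start">
    <transition input="Markdown" next="MarkdownMain">Markdown it is.</transition>
    <transition input="*" next="Start">Send me /start to begin</transition>
  </state>
  <state name="MarkdownMain">
    <transition input="reset" next="Start">Back to the start.</transition>
    <transition input="*" next="MarkdownMain">Image received.</transition>
  </state>
</scenario>
"""


def load_scenario(xml_text):
    """Map each state name to its ordered list of (input, next_state, reply)."""
    root = ET.fromstring(xml_text)
    states = {}
    for state in root.findall("state"):
        states[state.get("name")] = [
            (t.get("input"), t.get("next"), (t.text or "").strip())
            for t in state.findall("transition")
        ]
    return states


def step(states, current, user_input):
    """Return (next_state, reply) for the first matching transition."""
    # Assumption: commands such as /reset match the input name without the slash.
    normalized = user_input.strip().lstrip("/").lower()
    for pattern, next_state, reply in states[current]:
        if pattern == "*" or pattern.lower() == normalized:
            return next_state, reply
    return current, ""  # no transition matched: stay in the same state
```

Because transitions are checked in document order, the `*` transition only fires when nothing more specific above it matched.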

Challenges we ran into

I decided to participate on the last day because I didn't have any good ideas before that. So I needed to create my micro-product fast and still make it useful. That's why I decided to go with two major features:

  • describe the photo for an alt tag
  • host the photo on a CDN for further use

One of the problems I spent the most time on was caused by AzureBlobCopy. It often took up to 5 minutes to copy a file from one storage to another, so the chatbot had to wait all that time before answering the user. I fixed it by running the copy asynchronously.
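The general idea of not blocking the reply on a slow copy can be sketched with Python's `asyncio`. The write-up does not say how the fix was actually implemented, so everything here is an assumption: `slow_copy` is an artificial stand-in for the storage copy, and `copy_and_record`, `handle_upload`, and the `events` list exist only to make the ordering visible.

```python
import asyncio


async def slow_copy(name, delay=0.05):
    """Stand-in for the slow storage-to-storage copy; the delay is artificial."""
    await asyncio.sleep(delay)


async def copy_and_record(name, events):
    """Run the copy and record when it has finished."""
    await slow_copy(name)
    events.append(f"copied:{name}")


async def handle_upload(name, events):
    """Start the copy in the background and reply without waiting for it."""
    task = asyncio.create_task(copy_and_record(name, events))
    events.append(f"replied:{name}")  # the user gets an answer immediately
    return task


async def main():
    events = []
    task = await handle_upload("photo.jpg", events)
    await task  # awaited here only so the demo can show the final order
    return events


if __name__ == "__main__":
    print(asyncio.run(main()))
```

In a serverless function, background work started this way may not be guaranteed to finish after the response is sent, so a real deployment might hand the copy to a queue or another function instead; the sketch only shows the reply no longer waiting on the copy.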

Accomplishments that I'm proud of

It works. I built it in 8 hours and even managed to fit in some kind of monetization. I also really like how compact yet powerful it ended up.

What I learned

Despite my solid experience with Azure Search, QnA, and BotFramework, Cognitive Services is not something I had used before. I was impressed when it recognized a local ex-minister in a photo of me with him. Quite sad that he has been charged with fraud and forming a criminal group, but that's 100% irrelevant.

What's next for Image Alt Text Generator Chatbot

It's a simple tool that might be useful to developers and content creators. It might work even better if embedded into some blog engines through an API.

If it gets some attention, I plan to wrap it as an API-first product and rewrite it on top of something like Azure Service Bus to ensure scalability. Trying more languages from Cognitive Services also sounds like fun.

Oh! And I definitely need to process payments automatically.
