Inspiration

We are currently observing a remarkable phenomenon: creative prompt-based AIs are becoming more and more powerful by the day, and their abilities seem limitless if one knows how to use them.

A cybersecurity saying came to mind: most mistakes are committed between the chair and the keyboard. And we believe this is true for AI use as well. The potential of any model is severely limited by the ability of its users to both explore what it can do and formulate their thoughts in a way that the machine understands. We seek to solve this problem.

In a way, Echo is a magnifier of already existing AI technologies, and it is made to scale with them.

If you were at the MAIS club meeting earlier this week, you will know how hard it is to bend an AI to your will. Not anymore :)

What it does

Echo takes a prompt that one would have entered in a creative AI, and adds a step before the generation of results: mutation.

Inspired by genetic algorithms, it creates alternative versions of the entered prompt, which we call "echoes", and enters all of them in the generator instead of just the initial one.

Then, the user picks the best result, and the process repeats with the winning prompt until the output is satisfying enough.
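To make the loop concrete, here is a minimal sketch of the mutate, generate, select cycle described above. It is not Echo's actual code: `echo_loop`, `toy_mutate`, `toy_generate` and `pick_longest` are hypothetical names, and the "user" is replaced by a stand-in function so the example runs without any external service.

```python
from typing import Callable, List


def echo_loop(
    prompt: str,
    mutate: Callable[[str], List[str]],
    generate: Callable[[str], str],
    pick: Callable[[List[str]], int],
    rounds: int = 3,
) -> str:
    """Repeat mutate -> generate -> select, keeping the preferred prompt."""
    for _ in range(rounds):
        candidates = [prompt] + mutate(prompt)       # original plus its echoes
        outputs = [generate(p) for p in candidates]  # one result per candidate
        prompt = candidates[pick(outputs)]           # selection step (the user)
    return prompt


# Stand-ins (assumptions for illustration only).
def toy_mutate(p: str) -> List[str]:
    return [p + " at sunset", p + " in watercolor"]


def toy_generate(p: str) -> str:
    return f"<result for: {p}>"


def pick_longest(outputs: List[str]) -> int:
    # Pretend the user always prefers the longest output.
    return max(range(len(outputs)), key=lambda i: len(outputs[i]))


final = echo_loop("a lighthouse", toy_mutate, toy_generate, pick_longest, rounds=2)
```

In the real thing, `pick` would be the user clicking on a favorite result, and `generate` would call the text or image model.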

How we built it

The echoes are generated using two mutation methods. First, we generate alternative formulations by replacing words with their synonyms. We then pick a few of the resulting sentences, and pass them through an advanced natural language processing API (google translate lol) to generate prompts close to the original one in meaning, but that might cause the creative AI to behave differently. Then, the user acts as the selection agent to explore promising prompts further.
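The first mutation method could look roughly like the sketch below. The synonym table is a tiny hand-written example (an assumption, not the word source Echo uses), and `synonym_echoes` is a hypothetical name. The translation step is left out because it depends on an external service.

```python
import random
from typing import Dict, List, Optional

# Tiny hand-written synonym table, purely illustrative (hypothetical data).
SYNONYMS: Dict[str, List[str]] = {
    "big": ["large", "huge"],
    "house": ["home", "cottage"],
    "old": ["ancient", "aged"],
}


def synonym_echoes(
    prompt: str, count: int = 5, rng: Optional[random.Random] = None
) -> List[str]:
    """Return up to `count` distinct variants with some words swapped for synonyms."""
    rng = rng or random.Random()
    words = prompt.split()
    echoes: List[str] = []
    for _ in range(count * 10):  # bounded number of attempts
        variant = [
            # Swap a known word for a random synonym about half of the time.
            rng.choice(SYNONYMS[w.lower()])
            if w.lower() in SYNONYMS and rng.random() < 0.5
            else w
            for w in words
        ]
        text = " ".join(variant)
        if text != prompt and text not in echoes:
            echoes.append(text)
        if len(echoes) == count:
            break
    return echoes


echoes = synonym_echoes("a big old house", count=4, rng=random.Random(0))
```

Passing a seeded `random.Random` keeps the echoes reproducible, which helps when comparing runs.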

This approach to creative AIs is general, and can be applied to anything that generates content from a text prompt. To show this, we integrated radically different models in our final product. The text generator used for messages, emails and simple text continuation is the Cohere API, with handmade training sets. The image generator, on the other hand, is the DeepAI text-to-image model. As a future improvement, a music generator could also be added very simply.

To speed things up, everything computation-heavy in the backend runs in parallel using multiprocessing. So while the demo might not be very fast, this limitation is mainly due to the hardware it runs on, and it would greatly benefit from a server with many cores.

On the frontend, the website itself was made using Velo by Wix. It is connected to the backend (written in Python) using Flask.

Challenges we ran into

Asking AIs to generate 6 different pieces of content based on 6 different prompts is very computation-heavy, and required some work to optimize in order to run at an acceptable speed. The main solution has been parallelization: we generate every piece of content simultaneously, using multiprocessing, instead of sequentially, allowing us to exploit an entire multi-core machine's potential. This has allowed us to cut the loading time from 2 min to sometimes less than 20 s to generate 24 different images, a necessary prerequisite to fully explore the variations of an idea.
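The parallelization idea can be sketched with Python's standard `multiprocessing.Pool`. This is a simplified illustration rather than our backend code: `generate_all` is a hypothetical helper, and `generate` stands for whatever function calls the model.

```python
import multiprocessing as mp
from typing import Callable, List, TypeVar

T = TypeVar("T")


def generate_all(
    prompts: List[str], generate: Callable[[str], T], workers: int = 4
) -> List[T]:
    """Run `generate` on every prompt in parallel worker processes.

    Results come back in the same order as `prompts`.
    """
    with mp.Pool(processes=workers) as pool:
        return pool.map(generate, prompts)
```

Two practical notes: `generate` has to be a picklable, top-level function so it can be sent to the worker processes, and in a script the call should sit under an `if __name__ == "__main__":` guard.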

Accomplishments that we're proud of

Everyone knows the hardest part of programming is giving names to things. And this project itself was no exception! Our solution? Feed the list of mediocre names we brainstormed to our creation, and have it name itself. "Echo" is actually one of the many good suggestions it gave.

What we learned

We learned how to use Wix to make amazing-looking websites super quickly. We also learned that really, it's not about the juicy prizes at the end, but about the friendships we made along the way.

What's next for Echo

Anything goes! We built it keeping in mind that it should scale as future technologies arrive. It amplifies users' ability to interact with AIs in general. Therefore, as long as new, better prompt-based AIs get created, Echo will keep on improving the results we are able to squeeze out of them.

As stated before, it has absolutely no reason to be limited to text and image generation. It can be used with generators in general, making it infinitely scalable. Music, videos, programs, dance moves... anything you can make a prompt-based generator of, you can apply Echo to.
