Inspiration

We envisioned a future of learning that engages several of a child's senses at once. Reading, listening, and visual engagement each support understanding in their own way, so we brought them together. We set out to pair children's natural curiosity with an interactive, visual learning environment.

What it does

After the child picks a topic and a character, our app offers three AI-powered learning modes:

1) Story-telling

2) Interactive chat for Q&A

3) Quizzing

Our app creates a visual story based on the child's chosen character and topic. The child can then ask questions about the story, and the model continues to answer with visually rich responses. Finally, the model gently quizzes the child on the topic.

How we built it

We used a leading model for each of three forms of generative AI:

1) Text Generation: GPT-4

2) Image Generation: Stable Diffusion XL

3) Speech Generation: Google Cloud TTS

We used GPT-4 as a planner, a story-teller, and a source of knowledge.

Stable Diffusion XL is one of the most advanced image generation models available. It generates the images shown during story-telling, Q&A, and evaluation.

Google Cloud TTS generates speech with humanlike intonation and allows us to modify the voice based on the chosen character.

We used Node.js, Next.js, and React to build our web app.

Challenges we ran into

Our first challenge was that GPT-4 took too long to respond to our requests. We fixed this by combining our multiple GPT-4 API calls into a single call that requests everything we need in one go: story-telling, planning, etc.
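To illustrate the idea of folding several requests into one, here is a minimal TypeScript sketch. It is not our actual prompt or code: the `LessonBundle` shape, the field names, and both helper functions are hypothetical, and the call to the model itself is left out. The sketch only shows how a single prompt can ask for the plan, story, image prompts, and quiz together, and how the single reply can be checked before use.

```typescript
// Hypothetical shape of everything the app needs for one lesson.
// The field names are assumptions made for this sketch.
interface LessonBundle {
  plan: string[];
  story: string[];
  imagePrompts: string[];
  quiz: { question: string; answer: string }[];
}

// Build ONE prompt that asks for all parts at once,
// instead of sending a separate request for each part.
function buildCombinedPrompt(topic: string, character: string): string {
  return [
    `You are ${character}, teaching a child about ${topic}.`,
    "Reply with a single JSON object with exactly these keys:",
    '"plan": array of short lesson steps,',
    '"story": array of story paragraphs,',
    '"imagePrompts": array with one illustration prompt per story paragraph,',
    '"quiz": array of objects with "question" and "answer".',
    "Return only the JSON, with no extra text.",
  ].join("\n");
}

// Parse the model's single reply and check that every part is present.
function parseLessonBundle(raw: string): LessonBundle {
  const data = JSON.parse(raw) as Partial<LessonBundle>;
  for (const key of ["plan", "story", "imagePrompts", "quiz"] as const) {
    if (!Array.isArray(data[key])) {
      throw new Error(`Missing or invalid field: ${key}`);
    }
  }
  return data as LessonBundle;
}
```

The trade-off in a design like this is that one larger response replaces several round trips, so the fixed per-request latency is paid only once, at the cost of having to validate a more complex reply.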

Accomplishments that we're proud of

We are proud that we were able to combine the three forms of generative AI into one intuitive package. We are happy with the reduction in response time that we were able to achieve. Finally, we are proud of our child-friendly UI.

What we learned

We learned the subtleties of prompting large language models, and how to combine multiple APIs with wildly different response times into one reasonably fast product.
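One common way to keep a product fast when its APIs differ in speed is to start the slow calls concurrently rather than one after another. The TypeScript sketch below shows that pattern under stated assumptions: `renderScenes`, `AsyncStep`, and `Scene` are hypothetical names, and the image and speech generators are passed in as plain async functions instead of real API clients. It is an illustration of the technique, not a description of our exact implementation.

```typescript
// A hypothetical async step: takes text, eventually returns some result.
type AsyncStep<T> = (input: string) => Promise<T>;

// One story paragraph together with its generated image and audio.
interface Scene<I, A> {
  text: string;
  image: I;
  audio: A;
}

// For every paragraph, start image and speech generation at the same time,
// and process all paragraphs concurrently. The total wait is then close to
// the slowest single call instead of the sum of all calls.
async function renderScenes<I, A>(
  paragraphs: string[],
  generateImage: AsyncStep<I>,
  generateSpeech: AsyncStep<A>,
): Promise<Scene<I, A>[]> {
  return Promise.all(
    paragraphs.map(async (text) => {
      const [image, audio] = await Promise.all([
        generateImage(text),
        generateSpeech(text),
      ]);
      return { text, image, audio };
    }),
  );
}
```

`Promise.all` preserves input order, so the scenes come back in story order even if a later image finishes first. It also rejects as soon as any call fails, so a real app would likely want per-scene error handling or retries on top of this.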

What's next for Imaginate

If AI-generated work becomes commercially usable, we see immense potential for this app to transform education.
