WhatsThat

Inspiration

Textbooks for primary-school children often rely on a small, repetitive set of images to illustrate objects and teach the meaning of words. They also offer no feedback mechanism to help a child associate the same word with pictures of the object in different settings. This significantly limits a young learner's understanding. Determined to overcome these limitations, we built a program that aims to strengthen the cognitive abilities of young minds and better shape their future.

What it does

Our program broadens the learner's knowledge and improves their perception skills from a young age. It reinforces word-to-object association by showing multiple images of each object in different orientations, angles, backdrops, and varieties, and then testing the child on them. This is something that conventional educational resources fail to accomplish.

Our program consists of two levels, and the user is free to choose which level to play and how many times. The first level picks an image from one of more than 200 object categories stored in an online database hosted on GitHub. The computer speaks the name of the object and asks the child to repeat it. For example, the computer will display an image of an airplane and say the word ‘airplane’. Because the database holds multiple images of airplanes, the child learns to recognize airplanes of all shapes and sizes.
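The level 1 selection step described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the `CATALOGUE` dictionary, its example.com URLs, and the function names are all hypothetical stand-ins for the GitHub-hosted database.

```python
import random

# Hypothetical catalogue: category name -> image URLs.
# These URLs are placeholders, not the project's real database.
CATALOGUE = {
    "airplane": [
        "https://example.com/airplane/1.jpg",
        "https://example.com/airplane/2.jpg",
    ],
    "dog": [
        "https://example.com/dog/1.jpg",
    ],
}


def pick_random_image(catalogue, rng=random):
    """Pick a random category, then a random image URL within it."""
    category = rng.choice(sorted(catalogue))
    url = rng.choice(catalogue[category])
    return category, url


def level_one_prompt(category):
    """Build the sentence the computer would speak for level 1."""
    return f"{category}. Can you say {category}?"
```

Choosing the category first and the image second gives every object an equal chance of appearing, even when some categories hold more images than others.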

The second level is a guessing game, which uses an Artificial Intelligence model to predict the object inside a random image. The computer asks the child whether the object is really what the AI model predicted, and the child replies either “yes” or “no”. The reply is compared with the correct answer, and the child earns a point if they are right.

The final score is displayed after the game finishes.
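The level 2 scoring rule can be written out in a few lines. The sketch below assumes each round is available as a (predicted label, true label, child's answer) triple; the function names and that data layout are illustrative assumptions, not taken from the project.

```python
def is_child_correct(predicted_label, true_label, child_says_yes):
    """The child is right when their yes/no matches whether the AI was right."""
    prediction_is_right = (
        predicted_label.strip().lower() == true_label.strip().lower()
    )
    return child_says_yes == prediction_is_right


def final_score(rounds):
    """Count one point per round in which the child judged the AI correctly.

    `rounds` is an iterable of (predicted_label, true_label, child_says_yes).
    """
    return sum(
        1
        for predicted, true, answer in rounds
        if is_child_correct(predicted, true, answer)
    )
```

Note that the child also earns a point for saying "no" when the model's prediction is wrong, so a mistaken prediction still makes a useful question.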

How we built it

Our program is written in Python 3.7 and relies on libraries such as TensorFlow, gTTS, SpeechRecognition, and PocketSphinx. At startup, the user enters the number of images they would like to see, and the program stops once that number is reached. Images are imported from an online database hosted on GitHub. After an image is imported, the AI model classifies it and the prediction is saved in a variable. What happens next depends on the level the user selected. In level 1, the program displays the image with the matplotlib library and speaks the name of the object. In level 2, the AI predicts the object in the image displayed on the user’s screen and waits for the user’s approval. The user’s reply is recorded from their microphone and transcribed by the program, which then gives feedback based on the answer. This feedback is spoken through the user’s selected output speaker using the Google Text-to-Speech library (gTTS).
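A rough sketch of the audio round trip might look like the following. It assumes gTTS for speech output and SpeechRecognition with the PocketSphinx backend for input, as listed above; the function names, the `feedback.mp3` path, and the yes/no parsing rule are assumptions made for illustration, and the project's real code may differ. The library imports are deferred into the functions so that the parsing helper can be used without audio hardware.

```python
import re


def parse_yes_no(transcript):
    """Return True for 'yes', False for 'no', None if unclear or both appear."""
    words = re.findall(r"[a-z']+", transcript.lower())
    said_yes = "yes" in words
    said_no = "no" in words
    if said_yes == said_no:
        return None
    return said_yes


def speak_to_file(text, path="feedback.mp3"):
    """Synthesize `text` with gTTS and save it as an MP3 (needs network access)."""
    from gtts import gTTS  # third-party: pip install gTTS

    gTTS(text=text, lang="en").save(path)
    return path


def listen_for_answer():
    """Record one utterance from the microphone and interpret it as yes/no."""
    import speech_recognition as sr  # third-party; Microphone needs PyAudio

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        audio = recognizer.listen(source)
    try:
        transcript = recognizer.recognize_sphinx(audio)  # offline PocketSphinx
    except sr.UnknownValueError:
        return None  # speech was unintelligible
    return parse_yes_no(transcript)
```

Returning `None` for an unclear reply lets the caller ask the child to repeat the answer instead of scoring a guess.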

Challenges we ran into

Not everyone in the group was familiar with Python 3.7, so there was a lot of learning involved while developing the program. Many of our team members also had trouble installing certain Python libraries, such as TensorFlow.
It was challenging to display the image in a non-blocking manner. Displaying the image with Tkinter blocked execution, so the code that followed would not run. Hence, we had to revert to using matplotlib.
Our team had to overcome the communication challenge of working remotely and coordinating team tasks online.
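For the non-blocking display problem mentioned above, one common matplotlib pattern is to turn on interactive mode and call `plt.pause` so the window is drawn while the script carries on. The sketch below shows that pattern under the assumption that the image is already loaded as an array-like of pixels; the function names are hypothetical, and this is not necessarily how the project solved it.

```python
def show_image_nonblocking(image, title="", pause_seconds=0.1):
    """Draw `image` in a matplotlib window and return without blocking."""
    # Imported lazily so the rest of the program loads without matplotlib.
    import matplotlib.pyplot as plt

    plt.ion()                 # interactive mode: show() no longer blocks
    fig, ax = plt.subplots()
    ax.imshow(image)          # accepts a 2-D or 3-D array-like of pixels
    ax.set_title(title)
    ax.axis("off")
    plt.show(block=False)
    plt.pause(pause_seconds)  # give the GUI event loop time to draw
    return fig


def close_image(fig):
    """Close the window once the round is over."""
    import matplotlib.pyplot as plt

    plt.close(fig)
```

After `show_image_nonblocking` returns, the program can go on to speak the object's name or listen for the child's answer while the picture stays on screen, then call `close_image` before the next round.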

Accomplishments that we're proud of

Using Artificial Intelligence in level 2 of our program
Designing a refined dataset specifically for the program to strengthen object-word association in children
Building an interactive program that keeps the user engaged with audio responses and feedback
Building it as a team of 4, with thorough communication, in a span of 36 hours

What we learned

Coding simultaneously online
Integrating multiple Python libraries to improve the user experience
Improved video editing skills
Collaboration and teamwork

What's next for WhatsThat

We will work on refining and polishing the program to increase its speed, and eventually launch it into the real world.
The program could also be used to train AI models
Expanding the dataset of object words
Reducing the AI's error rate
Developing a more child-friendly user interface

Domains that we registered

letsnotgo.online - Kushagra Goel
allyour.space - Ajeya Madhava Rao Vijayakumar
nextdimension.space - Charu Tyagi
4dimensional.space - Jhanavi Gera

Disclaimer

Everything we submitted was created during Same Home Different Hacks.

Built With

beautiful-soup
github
gtts
imageai
keras
matplotlib
pocketsphinx
python
tensorflow
wget

Submitted to

Same Home Different Hacks

Created by

I worked on creating and curating the database that powers WhatsThat, and on web-scraping from GitHub.
Ultimately, my contribution allowed us to use all types of images and pull them from the online database in a completely random fashion.

Kushagra Goel
I came up with the idea for the project, which we then refined as a team. Since I had no prior experience in Python, I had a lot to learn from this project. I was in charge of designing the logo and helped write the description of the project.

Jhanavi Gera
I helped create the level 2 stage of the WhatsThat software. I worked on integrating AI into the program and also helped integrate all the sub-functions to create a better user experience.

Ajeya Madhava Rao Vijayakumar
I worked on coding level 1 of the program (random image selection and text-to-speech), along with scripting, refining the dataset, conceptualizing the idea, and helping keep the team's workflow smooth.

Charu Tyagi