Textbooks for primary school children often rely on a limited, repetitive set of images to illustrate objects and teach the meaning of words. They also offer no feedback mechanism to help a child associate the same word with pictures of the object in different settings. This limits how well a young learner understands what a word refers to. To overcome these limitations, we built a program that aims to strengthen the cognitive abilities of young learners and help shape their future.
What it does
Our program broadens the learner's knowledge and improves their perception skills from a young age. It helps children reinforce the word-to-object association by showing multiple images of each object in different orientations, angles, backdrops, and varieties, and then testing what they have learned. This is something that conventional educational resources fail to accomplish.
Our program consists of two levels, and the user is free to choose which level to play and how many times. The first level picks an image from one of more than 200 object categories in an online database hosted on GitHub. The computer speaks the name of the object and asks the child to repeat it. For example, the computer will display an image of an airplane and say the word ‘airplane’. As we have multiple images of airplanes, the child will learn to recognize airplanes of all shapes and sizes.
The second level is a guessing game, which uses an Artificial Intelligence model to predict the object inside a random image. The computer asks the child if the object is really what the AI model predicted, and the child replies either “yes” or “no”. This is compared to the correct answer and the child is awarded a point if they are correct.
The final score is displayed after the game finishes.
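The scoring rule for the second level can be sketched as follows. This is only an illustration, not our exact code: the names `score_round` and `total_score` are hypothetical, and it assumes the correct answer is the category label that comes with the image. The child earns a point when their "yes" or "no" matches whether the AI's prediction was actually right.

```python
def score_round(predicted_label, true_label, child_answer):
    """Return 1 if the child's yes/no answer correctly judges the prediction, else 0.

    Assumption: true_label is the dataset category the image was drawn from.
    """
    prediction_is_correct = (
        predicted_label.strip().lower() == true_label.strip().lower()
    )
    answer = child_answer.strip().lower()
    if answer not in ("yes", "no"):
        # Anything other than a clear yes/no earns no point.
        return 0
    child_said_yes = answer == "yes"
    return 1 if child_said_yes == prediction_is_correct else 0


def total_score(rounds):
    """Sum the points over an iterable of (predicted, true, answer) tuples."""
    return sum(score_round(pred, true, ans) for pred, true, ans in rounds)
```

For instance, if the AI says "car" for an airplane image and the child answers "no", the child is right to disagree and gets the point.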
How we built it
Our program is written in Python 3.7 and relies on libraries such as TensorFlow, gTTS, SpeechRecognition, and PocketSphinx. At the start, the user is asked how many images they would like to see, and the program stops once that number is reached. Each image is imported from an online database hosted on GitHub. After an image is imported, the AI model classifies it and the program stores the prediction in a variable. What happens next depends on the level the user selected. In level 1, the program displays the image using the matplotlib library and speaks the name of the object. In level 2, the program announces the AI’s prediction for the image displayed on the user’s screen and waits for the user’s approval. The user’s spoken answer is recorded through their microphone and converted to text by our program. Based on that answer, the program gives relevant feedback, which is delivered through the user’s selected output speaker using the Google Text-to-Speech library (gTTS).
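The overall control flow described above might look roughly like the sketch below. It is a simplified outline under stated assumptions rather than our actual source: `run_session` and its parameters are hypothetical names, and the image fetching, TensorFlow prediction, gTTS speech, and speech recognition are passed in as plain functions so that the loop itself stays easy to follow. It also assumes that level 1 speaks the dataset label of the image.

```python
from typing import Callable, Tuple


def run_session(
    level: int,
    num_images: int,
    fetch_image: Callable[[], Tuple[object, str]],  # returns (image, dataset label)
    predict: Callable[[object], str],               # stand-in for the AI model
    speak: Callable[[str], None],                   # stand-in for text-to-speech
    listen: Callable[[], str],                      # stand-in for speech recognition
) -> int:
    """Show num_images images at the chosen level and return the level 2 score."""
    score = 0
    for _ in range(num_images):
        image, true_label = fetch_image()
        prediction = predict(image)  # the prediction is stored before branching

        if level == 1:
            # Level 1: say the object's name and let the child repeat it.
            speak(f"This is {true_label}. Can you say {true_label}?")
            listen()
        else:
            # Level 2: ask the child to approve or reject the AI's guess.
            speak(f"Is this {prediction}?")
            answer = listen().strip().lower()
            prediction_is_correct = prediction == true_label
            if answer in ("yes", "no") and (answer == "yes") == prediction_is_correct:
                score += 1
                speak("Correct!")
            else:
                speak("Not quite.")
    return score
```

In a real run, `speak` would wrap the text-to-speech library and `listen` would wrap the microphone and speech recognizer; passing them in as arguments also makes it possible to try the loop with simple stand-ins such as `print` and `input`.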
Challenges we ran into
Not everyone in the group was familiar with Python 3.7, so there was a lot of learning involved while developing the program. Many of our team members also had trouble installing certain Python libraries, such as TensorFlow.
It was challenging to display the image in a non-blocking manner. Displaying the image with Tkinter blocked the program, so the code that followed would not run. Hence, we had to revert to using matplotlib.
Our team had to overcome the communication challenges of working remotely and coordinating tasks online.
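For the image display challenge, one common way to show a picture with matplotlib without blocking the rest of the program is sketched below. This is a minimal example, not necessarily the exact approach in our code: the helper names `show_image_nonblocking` and `close_image` are hypothetical. It uses interactive mode together with `plt.show(block=False)` and a very short `plt.pause` so the window gets a chance to draw before the program moves on to speaking and listening.

```python
import matplotlib.pyplot as plt


def show_image_nonblocking(image, title):
    """Draw an image and return right away so the caller can keep running."""
    plt.ion()                 # interactive mode: showing a figure does not block
    fig, ax = plt.subplots()
    ax.imshow(image)          # image can be a NumPy array or a nested list
    ax.set_title(title)
    ax.axis("off")            # hide the axis ticks around the picture
    plt.show(block=False)     # return immediately instead of waiting for a close
    plt.pause(0.001)          # let the GUI event loop process the draw
    return fig


def close_image(fig):
    """Close the figure once the round is over."""
    plt.close(fig)
```

The caller keeps the returned figure, runs the audio part of the round, and then calls `close_image` before the next picture is shown.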
Accomplishments that we're proud of
Usage of Artificial Intelligence in level 2 of our program
Designed a refined dataset specifically for the program to help strengthen the object-word association in children
An interactive program to keep the user engaged by giving audio responses and feedback.
Built it in a team of 4 with thorough communication in a span of 36 hours
What we learned
Coding simultaneously online
Integrating multiple Python libraries to improve the user experience
Improved video editing skills
Collaboration and teamwork
What's next for WhatsThat
We will work on refining and polishing the program to make it faster, and eventually launch it into the real world.
Can be used to train AI models too
Increasing the data set of object words
Reducing AI error percentage
Developing a more child-friendly user interface
Domains that we registered
letsnotgo.online - Kushagra Goel
allyour.space - Ajeya Madhava Rao Vijayakumar
nextdimension.space - Charu Tyagi
4dimensional.space - Jhanavi Gera
Everything we submitted was created during Same Home Different Hacks