Inspiration

We wanted to use the Google Cloud APIs to address a real-world problem. Our application combines AutoML and the Text-to-Speech API so that blind people can hear written text read aloud.

What it does

We used AutoML to train a model that recognizes each uppercase letter. Our Python code isolates the letters of the words in an input .jpg file and sends them to this model. The result is spoken audio of the written word(s) in the input.
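The letter-isolation step could look roughly like the sketch below. It is not our exact code. It assumes a single line of dark writing on a light background, and the function names and the min_area noise threshold are made up for illustration.

```python
def order_boxes(boxes):
    """Sort (x, y, w, h) bounding boxes left to right so letters keep reading order."""
    return sorted(boxes, key=lambda box: box[0])


def isolate_letters(image_path, min_area=100):
    """Return one cropped grayscale image per detected letter, left to right.

    min_area is a hypothetical threshold for discarding specks of noise.
    """
    import cv2  # imported here so order_boxes works without OpenCV installed

    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    if gray is None:
        raise FileNotFoundError(image_path)

    # Otsu threshold, inverted: dark ink becomes white blobs on black.
    _, binary = cv2.threshold(
        gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU
    )

    # [-2] picks the contour list on both OpenCV 3.x and 4.x return shapes.
    contours = cv2.findContours(
        binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE
    )[-2]

    boxes = [cv2.boundingRect(c) for c in contours]
    boxes = [b for b in boxes if b[2] * b[3] >= min_area]

    return [gray[y:y + h, x:x + w] for x, y, w, h in order_boxes(boxes)]
```

Sorting by the x coordinate matters because contours are not returned in reading order. Without it, the letters could reach the model scrambled.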

How I built it

We trained the AutoML model on a dataset of uppercase-letter images, each labeled with the letter it contains, plus images containing no letter, which were labeled as spaces. In the Python code, we sent each isolated letter image to the model, which returned a prediction for it. These predicted letters were then concatenated into one string. Finally, the string was converted to an mp3 file by the Text-to-Speech API and played aloud with playsound().
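The last two steps, joining the predictions and reading them aloud, can be sketched as follows. This is a minimal illustration rather than our exact code. It assumes the model's labels arrive as a list of strings, that the no-letter class is named "space", and that the google-cloud-texttospeech and playsound packages are installed with Google Cloud credentials configured. The function names and the output.mp3 path are hypothetical.

```python
def predictions_to_text(labels, space_label="space"):
    """Join per-letter labels into one string, mapping the space label to ' '."""
    return "".join(" " if label == space_label else label for label in labels)


def read_aloud(text, mp3_path="output.mp3"):
    """Synthesize text to an mp3 with Cloud Text-to-Speech, then play it."""
    # Imported lazily so predictions_to_text can be used without these packages.
    from google.cloud import texttospeech
    from playsound import playsound

    client = texttospeech.TextToSpeechClient()
    response = client.synthesize_speech(
        input=texttospeech.SynthesisInput(text=text),
        voice=texttospeech.VoiceSelectionParams(language_code="en-US"),
        audio_config=texttospeech.AudioConfig(
            audio_encoding=texttospeech.AudioEncoding.MP3
        ),
    )

    # The API returns the encoded audio as bytes; write it out before playing.
    with open(mp3_path, "wb") as out:
        out.write(response.audio_content)
    playsound(mp3_path)
```

For example, predictions_to_text(["H", "I", "space", "M", "O", "M"]) gives "HI MOM", which would then be passed to read_aloud().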

Challenges I ran into

A major challenge was an insufficient dataset, which caused our model to misclassify letters (most notably confusing K with X and P with D). We responded by adding many more images of these specific letters, giving us a larger dataset and a more accurate model.
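One way to find which letters need more training images is to tally the pairs the model mixes up on a labeled test set. The sketch below shows the idea. The function name is hypothetical, and the labels in the usage example are invented for illustration, not our measured results.

```python
from collections import Counter


def confused_pairs(true_labels, predicted_labels):
    """Count misclassifications per unordered letter pair, most frequent first."""
    counts = Counter()
    for truth, guess in zip(true_labels, predicted_labels):
        if truth != guess:
            # frozenset treats K->X and X->K as the same confusion.
            counts[frozenset((truth, guess))] += 1
    return [(tuple(sorted(pair)), n) for pair, n in counts.most_common()]


# Made-up example data, purely to show the output shape.
truth = list("KKXPDA")
guess = list("XXKDPA")
print(confused_pairs(truth, guess))
```

The pairs at the top of the list are the letters where extra images are likely to help most.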

Accomplishments that I'm proud of

I am proud of integrating these APIs into software that addresses a real-world problem.

What I learned

I learned many of the concepts involved in machine learning and how to use the Google Cloud APIs in an application written in Python. I also learned how to combine several APIs and open-source libraries: OpenCV to isolate each letter, the Google AutoML API to label each letter, the Google Text-to-Speech API to convert the concatenated string to an mp3 file, and playsound to play the speech.

What's next for ByteWrite

Since we were short on time, we only included capital letters. Next for ByteWrite is a more complete dataset that includes lowercase letters and symbols. We can also add more training images to make the model stronger.
