Ever stared intensely at a picture and wondered what people from all around the world thought of it?
What it does
Our program uploads pictures to Google Cloud. The program interprets each picture and searches for sentences related to its content, then returns a filtered picture overlaid with the generated sentence, plus an audio translation of that sentence in a language of the user's choosing. The program currently supports 17 languages for audio/visual language learning.
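A minimal sketch of the core idea: detected image labels are matched against candidate sentences. The label extraction is stubbed out here; the real project sends the picture to Google Cloud for analysis, and `pick_sentence` and the sentence bank are hypothetical names for illustration.

```python
# Sketch: match image labels to a related sentence.
# Labels are stubbed; the real project gets them from Google Cloud.

def pick_sentence(labels, sentence_bank):
    """Return the first candidate sentence that mentions any detected label."""
    for sentence in sentence_bank:
        if any(label.lower() in sentence.lower() for label in labels):
            return sentence
    return None

bank = [
    "The cat sleeps all day.",
    "A dog runs happily through the park.",
]
print(pick_sentence(["dog", "park"], bank))
# → A dog runs happily through the park.
```

The chosen sentence is what gets overlaid on the image and translated for audio playback.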
How we built it
Using the Google Cloud Python libraries, we isolated each Google Cloud API and modularized every component. We then developed a process flow to generate sentences in the user-chosen language from any given image. The generated sentence is scraped from an online sentence generator and passed to an overlay layer that modifies the original image. The result is pulled back client-side and displayed as a visual aid. Translated sentences are passed to an audio handler, which converts text to speech in the desired language.
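The process flow above can be sketched as a small pipeline. Each Google Cloud call is replaced here by a stub so the wiring between the modular components is visible; function names, stub outputs, and language codes are illustrative, not the project's actual API.

```python
# Sketch of the process flow, with each external call stubbed.
# Real code would use the Vision, Translation, and Text-to-Speech
# client libraries plus the online sentence generator.

def detect_labels(image_bytes):        # stub for image analysis
    return ["mountain", "snow"]

def fetch_sentence(labels):            # stub for the sentence generator
    return f"A {labels[0]} covered in {labels[1]}."

def translate(sentence, target_lang):  # stub for translation
    return f"[{target_lang}] {sentence}"

def synthesize(sentence):              # stub for text-to-speech
    return sentence.encode("utf-8")    # stands in for audio bytes

def process_image(image_bytes, target_lang):
    labels = detect_labels(image_bytes)
    sentence = fetch_sentence(labels)
    translated = translate(sentence, target_lang)
    audio = synthesize(translated)
    return translated, audio

text, audio = process_image(b"...", "fr")
print(text)
# → [fr] A mountain covered in snow.
```

Keeping each stage behind its own function is what made it possible to swap implementations (e.g. the Expo workaround described below) without touching the rest of the flow.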
Challenges we ran into
We originally planned to build a frontend in React Native and deploy with Expo, but the Node cloud-storage API was not compatible with Expo, so we developed a workaround using the command-line interface. Another issue we ran into was uploading the Python modules properly. Google Cloud authentication was also a big hurdle to figure out, but we pulled it off.
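For the authentication hurdle, the standard pattern for Google Cloud Python client libraries is a service-account key file referenced by the `GOOGLE_APPLICATION_CREDENTIALS` environment variable; the key path below is a placeholder.

```python
# Point the Google Cloud client libraries at a service-account key.
# The path is a placeholder for your downloaded key file.
import os

os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "/path/to/service-account.json"

# Clients then pick up the credentials automatically, e.g.:
#   from google.cloud import storage
#   client = storage.Client()
```

Setting the variable once in the shell (or in the deployment environment) avoids hardcoding credentials in each module.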
Accomplishments that we're proud of
Integrating all of the Google Cloud APIs together into one pipeline.
What we learned
What's next for Linguo
- Incorporate more images
- Make it work on mobile phones