Project Architecture and Roles
VocabViz

Learning new language vocabulary through an interactive AR experience.

Team

Hector Castillo - MIT '20 | Integration, Infrastructure
Evan Hostetler - MIT '22 | Hardware Specialist, Branding
Anthony Nardomarino - MIT '22 | Object Classification Specialist
Tony Terrasa - MIT '21 | AR Specialist, Integration
Grady Thomas - MIT '23 | Translation API Specialist

Inspiration

We saw a huge opportunity to transform the way people learn languages by using computer vision to reimagine the most effective language learning strategy: immersion. At VocabViz, we are driven not only to help the world learn new languages with practical tools, but also to bring the world a little closer together.

What It Does

VocabViz helps you learn vocabulary in other languages: it uses the camera on your device to detect what an object is and translates the object's name in real time.

How We Built It

VocabViz runs on four main technologies.

We start with a video stream. This can be fed in through any input, but for simplicity we chose the built-in webcam on our laptop. A section of the screen is then selected for recognition and sent to the IBM API.
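
As a rough illustration of the capture-and-select step, here is a minimal sketch. It assumes the selected section is a box centered in the frame; the function names `center_roi` and `grab_roi` and the `fraction` parameter are hypothetical and not taken from the project code.

```python
def center_roi(frame_width, frame_height, fraction=0.5):
    """Return (x, y, w, h) for a centered box spanning `fraction` of each side."""
    w = int(frame_width * fraction)
    h = int(frame_height * fraction)
    x = (frame_width - w) // 2
    y = (frame_height - h) // 2
    return x, y, w, h


def grab_roi(camera_index=0, fraction=0.5):
    """Read one webcam frame and return the cropped region (a NumPy array)."""
    import cv2  # imported lazily so center_roi works without OpenCV installed

    capture = cv2.VideoCapture(camera_index)
    try:
        ok, frame = capture.read()
        if not ok:
            raise RuntimeError("could not read a frame from the camera")
        height, width = frame.shape[:2]
        x, y, w, h = center_roi(width, height, fraction)
        # OpenCV frames are indexed as [row, column], i.e. [y, x].
        return frame[y:y + h, x:x + w]
    finally:
        capture.release()
```

The cropped array returned by `grab_roi` is what would then be encoded as an image and passed to the classifier.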

Challenges Encountered

With such a short time span, one of the hardest challenges was installing the proper dependencies.

Another challenge was deciding which of the outputs from IBM’s API to display. We ended up using a system that works like a weighted average by class number and percentage match.
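
The exact weighting is not spelled out here, so the following is only one hypothetical reading of it: each candidate's match score is scaled by a weight that shrinks with its position in the returned list, and the candidate with the highest weighted value is displayed. The function name `pick_label` and the linear position weight are assumptions made for illustration.

```python
def pick_label(classes):
    """Pick a label from a list of {"class": ..., "score": ...} entries.

    Assumption: earlier entries get a larger position weight, and that
    weight is multiplied by the classifier's match score.
    """
    total = len(classes)
    best_label = None
    best_value = -1.0
    for index, entry in enumerate(classes):
        position_weight = float(total - index) / total
        value = position_weight * entry["score"]
        if value > best_value:
            best_label = entry["class"]
            best_value = value
    return best_label
```

With this weighting, a slightly lower-scoring label that appears earlier in the list can win over a higher-scoring label that appears later.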

One of the hardest things about working with several parts coded by different people is ensuring that the inputs and outputs of the different modules are compatible. We ran into an issue where the output of the IBM API was a string, but the most convenient way to read that information was to parse it as JSON into a dictionary.
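
A minimal sketch of that string-to-dictionary step is shown below. The sample string is made up for illustration and only assumes the nested `images` / `classifiers` / `classes` layout of a classification response; `parse_classes` is a hypothetical helper name.

```python
import json

# Illustrative response string; the values are invented for this example.
SAMPLE_RESPONSE = (
    '{"images": [{"classifiers": [{"classes": ['
    '{"class": "banana", "score": 0.95}, '
    '{"class": "fruit", "score": 0.8}]}]}]}'
)


def parse_classes(raw_response):
    """Parse the JSON string and return a list of (label, score) pairs."""
    response = json.loads(raw_response)
    pairs = []
    for image in response.get("images", []):
        for classifier in image.get("classifiers", []):
            for entry in classifier.get("classes", []):
                pairs.append((entry["class"], entry["score"]))
    return pairs
```

Agreeing early that every module exchanges plain dictionaries or lists of pairs like these avoids each module re-parsing strings in its own way.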

Accomplishments

We are very proud of the integration between our four main technologies, which gave us a functional visual translator. We are especially pleased that we accomplished so much within a day with limited coding experience.

What We Learned

We learned how to use object tracking in OpenCV.

Furthermore, since this was one of the first major collaborative projects for some members of the team, several of us learned to use Git through it. We also learned the importance of documenting and agreeing as early as possible on the formats of the inputs and outputs of different modules. When we ran into an incompatibility, we were able to resolve it and ensure that the data was passed as effectively as possible through the workflow.

What's Next

This application could be quite powerful as a mobile application. This would give people the ability to learn new vocabulary on the go.

We also foresee several improvements to the user interface. For example, plugging into a dictionary API would let us not just translate the word, but also offer the option to scroll over for more information, including definitions, example sentences, synonyms, and antonyms. Features could be added to the GUI to make it easier to change languages and to display more than two languages on the screen.

Furthermore, one of the powerful things about IBM’s API is that it gives you several possible identifications of the object. A useful future feature would be to cycle through these identifications as a way to learn different ways of describing the object you are looking at in another language.

Dependencies

The following are required to run VocabViz:

Python 2.7
Pillow==6.1.0
OpenCV-Contrib==3.4.4
Numpy==1.16.5
ibm-watson==3.4.0
google-cloud-translate==1.6.0

  pip install opencv-contrib-python==3.4.4.19
  pip install pillow==6.1.0
  pip install numpy==1.16.5
  pip install ibm-watson==3.4.0
  pip install google-cloud-translate==1.6.0

To use Google Translate, you need a private key for Google Cloud Services. Download your key and make sure the following environment variable is set:

  export GOOGLE_APPLICATION_CREDENTIALS="/path/to/key"
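
A small sketch of checking that variable before translating is shown below. The helper names `credentials_path` and `translate_word` are hypothetical, and the translation call assumes the `translate.Client` interface shipped with google-cloud-translate 1.x (the version listed above); it is not taken from the project code.

```python
import os


def credentials_path(environ=None):
    """Return the key path from the environment, or raise if it is unusable."""
    environ = os.environ if environ is None else environ
    path = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    if not os.path.isfile(path):
        raise RuntimeError("key file not found: " + path)
    return path


def translate_word(word, target_language="es"):
    """Translate one word; needs network access and valid credentials."""
    # Imported lazily so the credential check works without the library.
    from google.cloud import translate  # assumes google-cloud-translate 1.x

    credentials_path()  # fail early with a clear message
    client = translate.Client()
    result = client.translate(word, target_language=target_language)
    return result["translatedText"]
```

Calling `credentials_path()` at startup gives a clear error message when the key is missing, rather than a failure on the first translation request.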

Submitted to

HackMIT 2019

Created by

Hector Castillo
4th year undergrad at MIT studying mechanical engineering with a concentration in robotics and control
I worked on backend infrastructure between the classifier and translator. I helped with logo design and documentation.

Evan Hostetler
I worked on sourcing and integrating the hardware as well as visual design and branding for our team.

Anthony Nardomarino
I worked on integrating the IBM-Watson object classification system, as well as constructing the system for object detection.

Tony (Gabriel) Terrasa
I developed the OpenCV I/O pipeline as well as integrated the different API modules.

Grady Thomas (gradythomas)