[Gallery screenshots: a photo of a mug of coffee, and the app's description of the photo with translated text]
Inspiration
Communication is critical in the modern world, especially with the ever-growing frequency of international travel and global interaction. Unfortunately, travelers who do not speak the local language are at a huge disadvantage, sometimes unable to communicate even the simplest of questions to local residents. We were inspired to create an app that lets adults and children alike learn the basics of a new language in an interactive way, with simple translation for the casual user and (eventually!) guided lessons for the active learner.
What it does
Show and Tell is an image recognition app for iOS phones and tablets. The premise is fairly simple: you, the user, take a photo of an object, person, or other item in your vicinity, and the app identifies the subject, provides a short description of it in English, and translates that description into the widely spoken languages of French and Spanish. Using the translated text, users can pick up new words and patterns in a language besides their own, while at the same time learning the word for something they did not know before. Great for both the curious tourist and the dedicated student!
How we built it
Show and Tell was created with Xamarin (in Microsoft's Visual Studio), Microsoft's Computer Vision API (part of Microsoft Cognitive Services), and the Microsoft Azure Translator Text API. Xamarin provided the framework for mobile app development, allowing us to write the app in C# using .NET Core packages. To give the app native camera access, we built a camera component that takes a photo and records the file path where it is saved. We then used the Azure Computer Vision API to perform image recognition, identifying the predominant object in the image and the main action it is performing. Finally, the Azure Translator Text API let us translate the description into various languages and present a selection of them to the user.
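The photo-to-translation flow above can be sketched over the two services' REST endpoints. This is a minimal illustration, not our exact app code: the region and API version in the Computer Vision URL are placeholders, and the keys are supplied by the caller.

```csharp
// Sketch of the recognize-then-translate pipeline using the Computer Vision
// "describe" endpoint and the Translator Text v3 endpoint. Region, API
// version, and key handling are placeholder assumptions.
using System;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Threading.Tasks;

public static class ShowAndTellPipeline
{
    static readonly HttpClient Http = new HttpClient();

    // Build the Translator Text v3 URL for a set of target languages.
    public static string BuildTranslateUrl(params string[] targets)
    {
        var url = new StringBuilder(
            "https://api.cognitive.microsofttranslator.com/translate?api-version=3.0");
        foreach (var t in targets)
            url.Append("&to=").Append(t);
        return url.ToString();
    }

    // Ask Computer Vision for an English caption of the photo at photoPath.
    public static async Task<string> DescribeImageAsync(string photoPath, string visionKey)
    {
        // Placeholder region and API version.
        var endpoint = "https://westus.api.cognitive.microsoft.com/vision/v3.2/describe";
        using (var request = new HttpRequestMessage(HttpMethod.Post, endpoint))
        {
            request.Headers.Add("Ocp-Apim-Subscription-Key", visionKey);
            request.Content = new ByteArrayContent(File.ReadAllBytes(photoPath));
            request.Content.Headers.ContentType =
                new MediaTypeHeaderValue("application/octet-stream");
            var response = await Http.SendAsync(request);
            // JSON response; the caption lives at description.captions[0].text.
            return await response.Content.ReadAsStringAsync();
        }
    }

    // Translate the English caption into French and Spanish.
    public static async Task<string> TranslateAsync(string caption, string translatorKey)
    {
        var body = "[{\"Text\":\"" + caption.Replace("\"", "\\\"") + "\"}]";
        using (var request = new HttpRequestMessage(HttpMethod.Post, BuildTranslateUrl("fr", "es")))
        {
            request.Headers.Add("Ocp-Apim-Subscription-Key", translatorKey);
            request.Content = new StringContent(body, Encoding.UTF8, "application/json");
            var response = await Http.SendAsync(request);
            // JSON response; each element's translations[] holds text per language.
            return await response.Content.ReadAsStringAsync();
        }
    }
}
```

In the app, the caption text is extracted from the first response and handed to the second call, so recognition and translation run back to back from a single photo.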
Challenges we ran into
When we first came up with the idea for an app like this, we realized we would need a way to store the photo that was taken and to know the directory on the phone where it was saved, so that we could then run image recognition on it. It took us quite a while to figure out how to do this, but we finally made it happen. We also had difficulty tying our camera/photo-saving implementation together with the Microsoft Computer Vision API, since our camera code was built specifically for iOS, while the API client was a more general .NET implementation. After several hours (including a particularly memorable half-hour debugging session with a member of the Microsoft team), we got the two working together properly.
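The iOS-specific piece of that challenge can be sketched with Xamarin.iOS's native camera bindings. This is an illustrative sketch, not our exact implementation: the class name, the file-naming scheme, and the `LastPhotoPath` property are our own inventions here, but `UIImagePickerController` and the JPEG save call are the standard Xamarin.iOS APIs.

```csharp
// Sketch (Xamarin.iOS): present the native camera, save the captured image
// as a JPEG in the app's Documents folder, and remember the file path so a
// later Computer Vision call can read the photo back from disk.
using System;
using System.IO;
using Foundation;
using UIKit;

public class CameraService
{
    // Path of the most recently captured photo, consumed by the recognizer.
    public string LastPhotoPath { get; private set; }

    public void TakePhoto(UIViewController host)
    {
        var picker = new UIImagePickerController
        {
            SourceType = UIImagePickerControllerSourceType.Camera
        };

        picker.FinishedPickingMedia += (sender, e) =>
        {
            // Write the JPEG into Documents and record where it went.
            var docs = Environment.GetFolderPath(Environment.SpecialFolder.MyDocuments);
            var path = Path.Combine(docs, $"photo_{DateTime.Now.Ticks}.jpg");
            using (NSData jpeg = e.OriginalImage.AsJPEG())
            {
                jpeg.Save(path, true); // atomic write
            }
            LastPhotoPath = path;
            picker.DismissViewController(true, null);
        };

        picker.Canceled += (sender, e) => picker.DismissViewController(true, null);
        host.PresentViewController(picker, true, null);
    }
}
```

Keeping only the saved file's path (rather than the in-memory image) is what lets the platform-agnostic .NET code open the photo as a plain stream for the Computer Vision API.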
Accomplishments that we're proud of
Our team is proud to have created an app that helps people communicate in languages unknown to them, and to have some fun while doing it. We are proud that users can take their own pictures to feed to the image recognizer, and that translation happens alongside recognition, all in one go. We hope this app can truly help people learn the basics of a new language in a fun and easy way.
What we learned
This project was a major learning experience for all three members of the team. Rahil, Pavan, and Anirudh were all relatively new to app development, and had a very enjoyable time learning the quirks of C#, Xamarin, the various APIs, camera functionality, Xcode, and the ever-difficult task of getting everything to work together. This was a great hackathon for all of us, and we look forward to attending many more in the future.
What's next for Show and Tell
Show and Tell is still at an early stage, but with a bit more work it could very well be ready for the market. As of right now, it is a fun, useful tool for turning photos of objects and people into descriptions in English, French, and Spanish. The next step is to let the user select their target language, for broader use in tourism. After that come guided lessons, including speech-to-text checks on word pronunciation, which would let young children use the app as an interactive way of learning new languages.