Homepage which users see when the skill is opened on a device with a screen
First page with a translated word
Second page with a translated word with animation and larger font
Echo Show 5
Echo Show 10'' HD
Fire TV 65''
To be released next: a history page that lets users scroll through previously asked words. The words will also be read aloud.
I always wanted to improve my Spanish vocabulary. I used books and apps for this, but they are not very convenient. My girlfriend is from Spain and I would constantly ask her how to say this or that word in Spanish. At first that was fine, but she's not into languages, and it did not take long before she got tired of it. This was especially true when I kept asking about the same words, ones that are very similar and that I found confusing and hard to remember. For example, spoon and knife: both are used for eating and sound almost the same in Spanish, cuchara and cuchillo.
So Alexa became my occasional helper, but Alexa is not designed to teach vocabulary: after a few words it becomes tiring and sometimes even confusing, as you also want Alexa for other use cases. Other available skills were usually built for multiple languages and were quite cumbersome to use for what I wanted.
What it does
Spanish Girlfriend translates the words you tell her into Spanish without fuss. After she repeats the English word (so the user is sure that the correct word was heard), she says the word in Spanish and pauses. She then repeats the same word more slowly and leaves a longer pause, allowing the learner to repeat it for themselves too. Repetition helps with memorising.
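The speech pattern described above (confirm the English word, say the Spanish word, pause, repeat it slowly, pause longer) could be expressed in SSML roughly as follows. This is only a sketch: the helper names `buildWordSsml` and `escapeXml`, the pause lengths and the `es-ES` locale are assumptions for illustration, not the skill's actual values.

```javascript
// Hypothetical helper: builds the SSML for one translated word.
// Pause lengths, the "slow" rate and the es-ES locale are illustrative assumptions.
function buildWordSsml(englishWord, spanishWord) {
  const spanish = `<lang xml:lang="es-ES">${escapeXml(spanishWord)}</lang>`;
  return [
    "<speak>",
    escapeXml(englishWord),                       // confirm the word that was heard
    '<break time="500ms"/>',
    spanish,                                      // first pass at normal speed
    '<break time="1s"/>',                         // short pause
    `<prosody rate="slow">${spanish}</prosody>`,  // second pass, slower
    '<break time="2s"/>',                         // longer pause so the learner can repeat
    "</speak>",
  ].join("");
}

// Escapes characters that are not allowed in SSML text content.
function escapeXml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}
```

Calling `buildWordSsml("spoon", "cuchara")` would produce a single `<speak>` string that can be passed to the response builder's speech output.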
Spanish Girlfriend only says “Next Word”, plays a beep and nothing else. This is because a Spanish vocabulary learner does not need any additional information within the skill. In addition, the user can say “Alexa” followed by a word to learn at any time.
The requested English word is displayed at the top of the Echo Show or Fire TV screen and the Spanish word is displayed in a pleasant, clear and big font. The second time the word is shown, it reappears in an even bigger font.
This serves a couple of purposes. Association is important when learning new words, hence the English and Spanish words appear together. Spelling is also important, and when the word is displayed for the second time, a light animation and a bigger font reinforce the word further in memory (together with the slowed audible pronunciation). The font sizes also ensure that the Spanish words can be seen easily from pretty much anywhere in the room.
On Echo Show, the screen also has a button below the displayed Spanish word. It allows a learner who is sitting next to the device to tap it and ask for a word just by saying it, without having to wait for the skill to ask for the “Next Word”. (A user can also say “Alexa” at any time to ask for a new word.)
For the Echo Show devices, the screen is designed to avoid any distractions to the learner. It is simple but focused. We researched which colours are good for reading and learning and used a special palette of such colours to make sure that the display is calm and unobtrusive.
(A history of the asked words is in development, but has not made it into this build of the skill.)
How we built it
The skill was built as an Alexa-hosted skill in a Node.js environment. For translations we used the Google Translate API and a custom dictionary of Spanish words.
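One way the two translation sources could be combined is to check the custom dictionary first and fall back to the translation API only when the word is missing. The sketch below assumes that ordering, which the write-up does not confirm. `customDictionary`, `lookupCustom`, `translateToSpanish` and the injected `translateWithApi` function are hypothetical names, and the dictionary entries are illustrative only.

```javascript
// Illustrative entries only; the real custom dictionary is not shown in this write-up.
const customDictionary = new Map([
  ["spoon", "cuchara"],
  ["knife", "cuchillo"],
]);

// Returns the curated translation, or undefined if the word is not in the dictionary.
function lookupCustom(englishText) {
  return customDictionary.get(englishText.trim().toLowerCase());
}

// `translateWithApi` is a hypothetical async function (text) => Promise<string>
// wrapping whichever translation client the skill uses.
async function translateToSpanish(englishText, translateWithApi) {
  const curated = lookupCustom(englishText);
  if (curated !== undefined) {
    return curated; // curated entry wins over the machine translation
  }
  return translateWithApi(englishText.trim().toLowerCase());
}
```

Injecting the API call as a parameter keeps the lookup logic testable without network access.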
Challenges we ran into
UI/VUI - it was quite challenging to make the skill super simple but at the same time natural to use. We went through many iterations and tweaks until we locked in the VUI and UI that we felt were best for language learners. For example, we could have put images next to words and used more colours, but after using the skill for a while we found that those would distract more than add value.
The Google Translate API does not always give the most relevant translation. For example, “What is your name?” is translated by Google literally, word by word, as “¿Cuál es tu nombre?”, when Spanish people mostly say “¿Cómo te llamas?”. There are many examples like this. We have a solution for this, but it is not included in this release.
Dictionary - multiple meanings. It is hard to know the user’s intent when they say a single word. For example, when they say ‘C’, do they mean sea or see? Or take the word light: do they mean not heavy, or as in sunlight? We have solutions planned for this.
Utterance conflicts with built-in intents. Translation functionality would normally cover the words used by the built-in intents, but we did not want to lock users out of those intents (as many similar skills do) in case they wanted to leave the skill or do something they would expect a skill or Alexa to do. For now we have removed those 20 or so words from our dictionary, but we have an elegant solution for the next release that will translate them too, without impacting Alexa’s functionality.
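The interim workaround of removing conflicting words from the dictionary could look something like the sketch below. The word list is an illustrative subset chosen for this example (the actual 20 or so words are not listed here), and `RESERVED_UTTERANCES`, `isTranslatable` and `removeReserved` are hypothetical names.

```javascript
// Illustrative subset of words that would collide with built-in intents
// such as stop, cancel and help. The skill's real list is not shown here.
const RESERVED_UTTERANCES = new Set(["stop", "cancel", "help", "pause", "resume"]);

// True when the word can safely be treated as a translation request.
function isTranslatable(word) {
  return !RESERVED_UTTERANCES.has(word.trim().toLowerCase());
}

// Returns a copy of the dictionary without the reserved words,
// leaving the original Map untouched.
function removeReserved(dictionary) {
  return new Map([...dictionary].filter(([english]) => isTranslatable(english)));
}
```

With this approach a user who says "stop" still reaches Alexa's normal behaviour, at the cost of not being able to ask for that word's translation.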
Accomplishments that we're proud of
No existing skill could do what I wanted. Most skills are verbose and slow to interact with. Nothing would repeat a word slowly, giving me a good chance to say it too, while also showing it to me very clearly, so that I would really have a chance to learn it.
There was also no skill that let me stand in a room, drive, or walk in a park or supermarket and, just by looking at things and saying them in English, get their translation almost instantly and with almost no effort.
I have seen nothing on the market, neither a skill nor an app, that works this well. Only a real person could do this, but even they would get distracted or tired very quickly answering the same questions again and again.
We are very proud of our Spanish Girlfriend!
What we learned
The limitations of API / ML translations (Google Translate), as well as of the data they return
The challenges of translation in general, especially for words with multiple meanings and words that sound similar or the same
The importance of performance, simplicity and focus in designing very user-centric skills
How the different devices in the Alexa ecosystem work and how best to approach their design based on use cases that relate to users’ locations and what they do there. For testing we used the Alexa phone app on the go, an Alexa Echo in the bedroom, an Echo Show 5 on the desk, an Echo Show 10 HD in the kitchen and a Fire TV in the living room. All of those gave us different insights and ideas.
What's next for Spanish Girlfriend
These are the first steps for Spanish Girlfriend with Alexa. The skill has a high potential to become very interactive and fun to use for language learners.
We will keep researching improvements to our visual and voice UI. There is a lot we can do in that area; for example, we would like to improve the transitions and animations on the screen.
Interactive requests and conversations