Build the vocabulary in a foreign language you understand

This Alexa skill exposes you to random sentences in a foreign language that are just a tiny step beyond your current level. It keeps track of the vocabulary you already know and helps you to progress.

Table of contents

  1. How learning a language works
  2. How the skill works
  3. Where the data comes from

How learning a language works

The four pillars of language learning (reading, listening, writing and speaking) all depend on acquiring a sufficiently large vocabulary. This requires extensive exposure to chunks of the target language, such as whole sentences.

Once the learner has been introduced to a language by other means, we can build on their knowledge of a few dozen words and show them sentences that consist mostly of known words plus one or two new ones. By comparing these sentences to their human-made English translations, the learner can work out the meaning of the new words.

After some time and repeated exposure, the learner will recognize these words and can mark them as familiar, which immediately makes more sentences available.

How the skill works

The user chooses a language to practice. If the language is new to the skill, it makes a rough estimate of the user's vocabulary size by bisecting a frequency list: for each word it picks, the user says whether they understand it in the target language or not.
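The bisection described above could be sketched roughly as follows. This is an illustration, not the skill's actual code: the function name `estimate_vocabulary_size` and the `knows_word` callback are hypothetical, and the sketch assumes that a user who knows a word at some frequency rank also knows all more frequent words.

```python
def estimate_vocabulary_size(frequency_list, knows_word):
    """Estimate how many of the most frequent words the user knows.

    frequency_list: words sorted from most to least frequent.
    knows_word: hypothetical callback that asks the user about one word
                and returns True if they understand it.
    """
    low, high = 0, len(frequency_list)
    while low < high:
        mid = (low + high) // 2
        if knows_word(frequency_list[mid]):
            # The user knows this word, so the boundary lies further down the list.
            low = mid + 1
        else:
            # The user does not know it, so the boundary is at or before mid.
            high = mid
    return low


if __name__ == "__main__":
    words = ["the", "of", "and", "to", "a", "in", "is", "it"]
    known = set(words[:5])  # pretend the user knows the five most frequent words
    print(estimate_vocabulary_size(words, lambda w: w in known))
```

With a list of n words this asks about log2(n) questions, so even a list of 10,000 words needs only around 14 answers. Real answers will not be perfectly monotone, so the result is only an approximation.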

The main loop of the skill then searches the database for random sentences that contain a certain number of new words and reads them to the user. The user can ask Alexa to repeat the sentence, translate it, or skip to the next one. If the user finds the whole sentence easy, all of its unknown words are marked as known.
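A minimal sketch of this selection step, assuming the sentences are held in memory as plain strings and words are split with a simple regular expression. All names here (`tokenize`, `unknown_words`, `pick_sentence`, `mark_sentence_known`, `max_new`) are hypothetical and not taken from the skill itself.

```python
import random
import re


def tokenize(sentence):
    """Split a sentence into lowercase words (a simplification)."""
    return re.findall(r"\w+", sentence.lower())


def unknown_words(sentence, known_words):
    """Return the distinct words of the sentence that the user does not know yet."""
    result = []
    for word in tokenize(sentence):
        if word not in known_words and word not in result:
            result.append(word)
    return result


def pick_sentence(sentences, known_words, max_new=2, rng=random):
    """Pick a random sentence with at least one and at most max_new unknown words."""
    candidates = [
        s for s in sentences
        if 1 <= len(unknown_words(s, known_words)) <= max_new
    ]
    return rng.choice(candidates) if candidates else None


def mark_sentence_known(sentence, known_words):
    """The user found the sentence easy: mark every word in it as known."""
    known_words.update(tokenize(sentence))


if __name__ == "__main__":
    known = {"el", "gato", "come"}
    sentences = [
        "El gato come",
        "El gato come pescado",
        "Los perros corren rápido",
    ]
    sentence = pick_sentence(sentences, known, max_new=1)
    print(sentence, unknown_words(sentence, known))
    mark_sentence_known(sentence, known)
```

Scanning every sentence on each turn is fine for a sketch. A real database would more likely index sentences by the words they contain.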

If the user asks Alexa to explain the sentence, it reads out the unknown words one by one, each with its translation, and the user confirms whether the word is now familiar or not.
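The explanation step might look something like the sketch below. The source does not say where the word-level translations come from, so the `glossary` dictionary is an assumption, as are the names `explain_sentence` and `confirm_familiar`.

```python
import re


def explain_sentence(sentence, known_words, glossary, confirm_familiar):
    """Go through the unknown words of a sentence one by one.

    glossary: assumed dict mapping a word to its translation.
    confirm_familiar: hypothetical callback that reads the word and its
                      translation to the user and returns True if the user
                      says the word is now familiar.
    Returns the list of (word, translation) pairs that were read out.
    """
    explained = []
    for word in re.findall(r"\w+", sentence.lower()):
        already_read = any(word == w for w, _ in explained)
        if word in known_words or already_read:
            continue
        translation = glossary.get(word)  # None if no translation is available
        explained.append((word, translation))
        if confirm_familiar(word, translation):
            known_words.add(word)
    return explained


if __name__ == "__main__":
    known = {"el", "gato"}
    glossary = {"come": "eats", "pescado": "fish"}
    # Pretend the user confirms only "come" as familiar.
    pairs = explain_sentence(
        "El gato come pescado", known, glossary, lambda w, t: w == "come"
    )
    print(pairs, sorted(known))
```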

In the following diagram, Alexa's speech is shown in blue, the user's speech in yellow, and the backend processes in black-and-white boxes:

Where the data comes from

Sentence pairs come from (released under CC-BY).

Frequency lists come from Wiktionary:Frequency_lists.

