What it does
This bot helps you to learn and pronounce new English words.
How I built it
With AWS Lex, Polly, API Gateway, S3, and Lambda. Resources are created and deployed with CloudFormation and Serverless. Word searches and user profiles are stored in an Azure SQL database. The bot is connected to the Facebook Messenger Platform. Many thanks to skyeng.ru for providing a public API and data.
Challenges I ran into
AWS Lex provides Facebook integration out of the box, but in this case it's not enough. I needed to receive voice and to send multiple messages in one bot reply, including audio, images, and quick replies. I solved it with another Lambda and API Gateway that communicate with Facebook directly. Fortunately, this is fairly easy to set up with the Serverless framework. This gave me full control over what I send.
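To give an idea of what "multiple messages in one bot reply" means, here is a minimal sketch that turns one logical reply into several Messenger Send API payloads. The `reply` object shape, the function names, and the injected `post` function are hypothetical and only for illustration; the actual bot code may look different.

```javascript
// Turn one logical bot reply into a list of Send API payloads.
// The `reply` shape ({ imageUrl, audioUrl, text, quickReplies }) is an assumption.
function buildReplies(recipientId, reply) {
  const messages = [];
  if (reply.imageUrl) {
    messages.push({ attachment: { type: "image", payload: { url: reply.imageUrl } } });
  }
  if (reply.audioUrl) {
    messages.push({ attachment: { type: "audio", payload: { url: reply.audioUrl } } });
  }
  if (reply.text) {
    const message = { text: reply.text };
    if (reply.quickReplies && reply.quickReplies.length > 0) {
      // Quick replies are attached to the text message they follow.
      message.quick_replies = reply.quickReplies.map((title) => ({
        content_type: "text",
        title,
        payload: title,
      }));
    }
    messages.push(message);
  }
  return messages.map((message) => ({ recipient: { id: recipientId }, message }));
}

// Send payloads one at a time so they arrive in order.
// `post` is a hypothetical function that POSTs one payload to the Send API.
async function sendAll(payloads, post) {
  for (const payload of payloads) {
    await post(payload);
  }
}
```

Sending sequentially rather than in parallel keeps the image, audio, and text in the order the user should see them.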
After I finally sent my first audio message to Lex, I realized that Lex wants a different audio format! Facebook sends mp4 and Lex wants PCM. Can I transcode in Node? Probably not... Well, ffmpeg to the rescue! You can bundle a static x64 binary with the Lambda and call it with Node.js child_process. Feels like cgi-bin, but works :)
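A minimal sketch of the transcoding step, assuming the static binary is packaged under a hypothetical `bin/ffmpeg` path and that Lex should receive 16-bit, 16 kHz, mono PCM (the `audio/l16; rate=16000; channels=1` content type). The function names and file paths are illustrative, not the bot's real ones.

```javascript
const { execFile } = require("child_process");
const path = require("path");

// Hypothetical location of the bundled static binary inside the deployment package.
// LAMBDA_TASK_ROOT is set by the Lambda runtime; fall back to cwd when run locally.
const FFMPEG_PATH = path.join(process.env.LAMBDA_TASK_ROOT || process.cwd(), "bin", "ffmpeg");

// Build ffmpeg arguments: raw signed 16-bit little-endian PCM, 16 kHz, one channel.
function buildFfmpegArgs(inputPath, outputPath) {
  return [
    "-y",                   // overwrite the output file if it exists
    "-i", inputPath,        // input file downloaded from Facebook
    "-f", "s16le",          // raw PCM container (no WAV header)
    "-acodec", "pcm_s16le", // 16-bit little-endian samples
    "-ar", "16000",         // 16 kHz sample rate
    "-ac", "1",             // mono
    outputPath,
  ];
}

// Run ffmpeg and resolve with the output path once it exits successfully.
function transcodeToPcm(inputPath, outputPath) {
  return new Promise((resolve, reject) => {
    execFile(FFMPEG_PATH, buildFfmpegArgs(inputPath, outputPath), (err) => {
      if (err) reject(err);
      else resolve(outputPath);
    });
  });
}
```

In Lambda, both files would typically live under `/tmp`, the writable directory of the function's filesystem.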
Once Lex started accepting voice input from Facebook, I realized that it tries hard to match one of the slot's example values, and that this behavior differs from text input. Let me explain. Imagine you have a custom slot with "cat" and "cow" as example values. When you type "dog", Lex accepts it. But when you say "dog", it gets confused and goes with "cat" as its best guess. Probably it has something to do with probability and confidence levels. This is not what I want! To solve this, I put the words from the training dictionary in as example values. Now that the bot works, I know that the limit is 10000 examples per slot :)
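Roughly, filling the slot from the dictionary could look like the sketch below. It assumes the dictionary is a plain array of words and produces an object in the shape of a Lex slot type definition (`name` plus `enumerationValues`); the function name and the normalization rules are my own illustration.

```javascript
// Limit on example values per slot, as mentioned above.
const MAX_SLOT_VALUES = 10000;

// Build a slot type definition from dictionary words:
// trim, lowercase, drop empties and duplicates, then cap at the limit.
function buildSlotType(name, words, limit = MAX_SLOT_VALUES) {
  const normalized = words
    .map((word) => word.trim().toLowerCase())
    .filter((word) => word.length > 0);
  const unique = [...new Set(normalized)];
  return {
    name,
    enumerationValues: unique.slice(0, limit).map((value) => ({ value })),
  };
}
```

Deduplicating before applying the cap means no slot entries are wasted on repeated words.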
Accomplishments that I'm proud of
It accepts voice from Facebook Messenger and says something back. Well, speech-to-text and text-to-speech have been around for a while, but now these technologies are really easy to use.
What I learned
It was a deep dive into Serverless, Facebook API, Lex and Polly. Perhaps I also learned a couple of English words ;)
What's next for ESLBot
I'm wondering if I can now extend the bot with more exercises, for example, mock dialogues. Voice capability is appealing, and speaking is exactly what falls behind when you learn a language using apps.