I love using Alexa to control my smart home, listen to music, and play quizzes. But I always missed features that would help me stay connected with my friends while I am home alone. I wish I could say "Alexa, I want to talk with my friend", but none of my friends have an Alexa-enabled device yet, and calling phone numbers is not available in my country.

If you are an Alexa user in the US or Great Britain, you always have the latest features and can freely call any person on their mobile phone. But all of that is unavailable elsewhere, which means for most people in the world. Even if you live in one of these countries, Alexa can't help you stay connected with parents or friends living in other countries. You can't send text messages to people who haven't signed up for Alexa Messaging. This makes it harder to convince friends to use smart speakers along with me and enjoy all the possibilities we could have.

I thought I could actually solve this problem and solve it for others too.

What it does

The Cell Phone skill helps you call or text your friends in any country via mobile phone, whether or not they have an Alexa device. Tell the skill your phone number and your friend's phone number. The skill will call both you and your callee over the phone and join the two calls into a conference so you can talk to each other. You can also use this skill to ask another person to call you back if you miss them, need attention, or just want to talk. The skill will call or text that person and kindly ask them to call you back.

How I built it

The skill is written in Python and runs on AWS as a Lambda function. For cellular connectivity, I use Twilio. For data storage, the skill uses PostgreSQL deployed in RDS.
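To illustrate the conference idea, here is a minimal sketch of how the two call legs could be described. It is not the skill's actual code: the function names and the room name are hypothetical, and it only builds the TwiML document (Twilio's XML call instructions, using the Dial verb with a Conference noun) with the standard library. In a real deployment, each leg would be handed to Twilio's call-creation API from the Lambda function.

```python
import xml.etree.ElementTree as ET


def conference_twiml(room_name: str) -> str:
    """Build TwiML that puts whoever answers the call into a named conference."""
    response = ET.Element("Response")
    dial = ET.SubElement(response, "Dial")
    conference = ET.SubElement(dial, "Conference")
    conference.text = room_name
    return ET.tostring(response, encoding="unicode")


def plan_conference(caller: str, callee: str, room_name: str) -> list:
    """Describe two outbound call legs (hypothetical helper).

    Both legs get the same TwiML, so both parties land in the same room.
    """
    twiml = conference_twiml(room_name)
    return [{"to": number, "twiml": twiml} for number in (caller, callee)]


if __name__ == "__main__":
    # Placeholder numbers and room name, for illustration only.
    for leg in plan_conference("+15550100", "+15550101", "room-42"):
        print(leg["to"], leg["twiml"])
```

Because both parties receive an ordinary incoming call, neither of them needs an Alexa device to take part in the conversation.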

Challenges I ran into

Initially, I started building this skill the old-fashioned way, using intents. I had to write tons of code to handle all the possible paths of a conversation. But after the announcement of Alexa Conversations, I tried it and instantly loved how much simpler and faster it made building complex skills. In this skill, I combined the two approaches to serve the initial goal of making phone calls available to Alexa users in all countries (since Alexa Conversations, for now, is available for the US locale only).

Another challenge was gathering user input for sending text messages. The built-in slot type "AMAZON.SearchQuery" doesn't work inside phrases that contain other words, making it impossible to say the natural "text {message} to my mom". I tried to create a custom slot type and write down possible variations for all use cases, but this didn't work well either. So I finally decided to switch back to "AMAZON.SearchQuery" and split the conversation flow into two steps: first saying "Text my mom", then saying the message. I ran into the same problem when I tried to use this slot type with Alexa Conversations, so for the dialogs-enabled version I had to use a custom slot type.
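The two-step flow can be sketched as a small state machine over session attributes. This is only an illustration under assumptions: the intent names (TextContactIntent, CaptureMessageIntent), the slot names, and the "outbox" list are hypothetical, and the real skill would send the message through Twilio rather than store it.

```python
def handle_turn(session: dict, intent: str, slots: dict) -> str:
    """Return the spoken reply for one turn and update the session in place."""
    if intent == "TextContactIntent":
        # Step 1 ("Text my mom"): remember the recipient, then ask for the text.
        session["pending_recipient"] = slots["contact"]
        return f"What should I text to {slots['contact']}?"

    if intent == "CaptureMessageIntent" and "pending_recipient" in session:
        # Step 2: the whole utterance arrives in a single free-form slot.
        recipient = session.pop("pending_recipient")
        outbox = session.setdefault("outbox", [])
        outbox.append((recipient, slots["message"]))
        return f"Okay, texting {recipient}."

    # A message arrived with no recipient on record, or an unknown intent.
    return "Sorry, who would you like to text?"


if __name__ == "__main__":
    session = {}
    print(handle_turn(session, "TextContactIntent", {"contact": "mom"}))
    print(handle_turn(session, "CaptureMessageIntent", {"message": "call me back"}))
```

Keeping the recipient in the session between turns is what lets the second utterance consist of nothing but the message itself.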

The third challenge that I faced was integrating in-skill purchases. For ordinary mobile phone operators, it's natural to offer their clients several plans or packages that include a limited number of minutes and SMS messages. And it's natural for clients to switch easily between these plans depending on their calling needs. But for Alexa skills, it is not possible to simply "upgrade" or "downgrade" an existing subscription: the user has to cancel one first and then purchase another. It is also impossible to offer discounts to some users.

To overcome this barrier, I developed the following monetization scheme. The user purchases a basic plan and then has the option to purchase upgrades that increase its limits:

Medium plan = Small plan + upgrade 1
Big plan = Small plan + upgrade 1 + upgrade 2

Technically these are three different subscriptions, but they work together. This greatly increases the number of in-skill products (ISPs) that have to be defined (especially if we want to offer a discount), but it gives some flexibility and personalization.
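As a rough sketch of how stacked subscriptions could translate into limits, the snippet below adds up the allowance of every product the user is entitled to. The product IDs and all minute and text numbers are made up for illustration, and the rule that upgrade 2 only counts on top of upgrade 1 is an assumption drawn from the plan formulas above, not a documented detail of the skill.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Allowance:
    minutes: int
    texts: int

    def __add__(self, other: "Allowance") -> "Allowance":
        return Allowance(self.minutes + other.minutes, self.texts + other.texts)


# Hypothetical product IDs and limits; the real values are not in this writeup.
PRODUCTS = {
    "small_plan": Allowance(minutes=30, texts=20),
    "upgrade_1": Allowance(minutes=30, texts=30),
    "upgrade_2": Allowance(minutes=60, texts=50),
}


def monthly_allowance(entitled: set) -> Allowance:
    """Sum the limits of the stacked subscriptions the user holds."""
    if "small_plan" not in entitled:
        # Upgrades alone grant nothing: they only extend the basic plan.
        return Allowance(0, 0)
    total = PRODUCTS["small_plan"]
    if "upgrade_1" in entitled:
        total = total + PRODUCTS["upgrade_1"]  # Medium plan
        if "upgrade_2" in entitled:
            total = total + PRODUCTS["upgrade_2"]  # Big plan
    return total


if __name__ == "__main__":
    print(monthly_allowance({"small_plan", "upgrade_1"}))
```

With this layout, "downgrading" amounts to cancelling only the topmost upgrade, so the user never has to give up the basic plan first.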

Accomplishments that I'm proud of

I love that now many people will have more ways to connect with each other and request someone's attention when they need it using Alexa.

It wasn't an easy task to build this skill while having a full-time job and an ongoing revolution in my country (Belarus). I faced many challenges and had to keep many small nuances in mind, but I am proud that I did this alone during the hackathon and even submitted it before the deadline :)

Also, when I was choosing the right pricing model to keep the skill running and pay the Twilio bills, I found out that people will actually be able to save money on phone calls by using my skill. In most countries, you are not charged for incoming calls, and roaming calls are much more expensive than the subscription required to use the skill.

What I learned

I learned how to use Alexa Conversations and went deeper into designing engaging and easy-going voice user interfaces. I have mastered building complex skills that combine several approaches and work with a relational database, webhooks, and third-party services.

What's next for Cell phone

I want to add the ability to enter phone numbers using a dial pad on screen-enabled Alexa devices and add some more visuals with APL. I will definitely add localizations and launch the skill in more countries. If more people learn about this skill and use it, I will be able to launch it even in non-monetizing locales to make it available to a larger audience.
