Inspiration

When walking to class or driving, I like to listen to my Spotify playlists. When a new song starts playing, I often find myself guessing the song title as a sort of mental game with myself. The idea of being able to play a game tailored to my own quirky habits was the driving force behind building “What's That Song.”

What it does

VUI Diagram

While there are Alexa Skills that have users guess songs, these skills are not customizable. With “What's That Song,” users can play the game using their followed Spotify playlists as their song pack. This allows users to play with public playlists or their own personalized, custom playlists.

Along with being customizable, “What's That Song” is available in the following locales:

  • US English: en-US
  • UK English: en-GB
  • Canadian English: en-CA
  • Australian English: en-AU
  • German: de-DE

Additionally, “What's That Song” is playable as a singleplayer or multiplayer game for up to four players with the use of Amazon Echo Buttons.

It also supports the Alexa Presentation Language (APL), enabling the skill to display beautiful visuals on screened devices, such as the Echo Show or Echo Spot.

How "What's That Song" works

Play Song Flow on Screened Devices

"What's That Song" has many parts, but it can be summed up into three general categories, each with importance in the functionality of the skill. Let's check them out!

Account Linking

When a user first enables "What's That Song" within the Alexa App, they are prompted to link their Spotify Account to the skill. This allows me to make API calls to Spotify on behalf of the user.

Account Linking is required for the skill to function: it gives the skill access to the user's followed Spotify playlists, which is what makes the game customizable and is required to play.
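
Concretely, after linking, each Alexa request carries the user's Spotify access token in its envelope. A minimal sketch (assuming the standard request-envelope shape; the helper names are my own, not the skill's) of extracting it and building the playlists call:

```javascript
// Extract the Spotify access token that Account Linking places in the
// Alexa request envelope (assumed standard envelope shape).
function getAccessToken(requestEnvelope) {
  return requestEnvelope.context.System.user.accessToken;
}

// Build HTTPS request options for Spotify's "current user's playlists"
// endpoint, authorized with the linked token.
function buildPlaylistsRequest(accessToken) {
  return {
    hostname: 'api.spotify.com',
    path: '/v1/me/playlists?limit=50',
    method: 'GET',
    headers: { Authorization: `Bearer ${accessToken}` }
  };
}
```

If the token is missing (the user skipped linking), the skill has to respond with a LinkAccount card instead of making the call.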

The Lambda Function

The bulk of the skill functionality is handled in our Lambda Function and specifically handled in the scripts within the handlers folder and the configuration folder.

Global Methods

When any request hits the Lambda, it is sent to a request interceptor method that formats the request and prepares the Lambda to respond to it. When the response is ready to be sent back to the skill, it is run through a response interceptor method that saves all session attributes to persistent attributes and resets the Lambda function's state.

The Global Methods handle the following:

  • RequestInterceptor: Intercept the incoming request before dispatching to the handler.
  • ResponseInterceptor: Intercept the outgoing response before sending back to Alexa.
  • DefaultHandler: When all else fails, go to the default.
  • HelpHandler: General help prompt for the entire skill.
  • StopCancelHandler: Quits the skill when the user says "Stop" or "Cancel".
  • SessionEndedRequestHandler: Cleans up Lambda when a session is ended.
  • ErrorHandler: Handles all Lambda errors that arise in the skill flow.
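
As a rough sketch (interceptor shape per ASK SDK v2; the merge helper is my own, not the skill's actual code), the load/save cycle looks like:

```javascript
// Pure helper: lay persisted attributes under the current session
// attributes, letting in-session values win.
function mergeAttributes(persistent, session) {
  return Object.assign({}, persistent, session);
}

// RequestInterceptor: hydrate session attributes from persistence
// before the request is dispatched to a handler.
const RequestInterceptor = {
  async process(handlerInput) {
    const mgr = handlerInput.attributesManager;
    const persistent = await mgr.getPersistentAttributes();
    mgr.setSessionAttributes(mergeAttributes(persistent, mgr.getSessionAttributes()));
  }
};

// ResponseInterceptor: write session attributes back to persistence
// before the response leaves the Lambda.
const ResponseInterceptor = {
  async process(handlerInput) {
    const mgr = handlerInput.attributesManager;
    mgr.setPersistentAttributes(mgr.getSessionAttributes());
    await mgr.savePersistentAttributes();
  }
};
```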

Skill Start

When the skill starts, it checks whether there is a past game in progress. If there is, Alexa asks the user whether they would like to resume it or start a new game. Once the user responds, the skill continues to roll call.

The Skill Start Methods handle the following:

  • LaunchPlayGameHandler: Invoked when a user says 'open' or 'play' or some other variant
  • StartNewGameHandler: Invoked when a user wants to start a new game
  • PlayerCountHandler: Invoked when a user responds to the skill regarding player count for the new game
  • NoHandler: The player has responded 'no' to the option of resuming the previous game.
  • YesHandler: The player has responded 'yes' to the option of resuming the previous game.
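
The resume check itself boils down to inspecting the persisted attributes. A hypothetical helper (the attribute keys and prompt wording are illustrative, not the skill's actual ones):

```javascript
// Decide the launch prompt based on whether a previous game was
// persisted mid-play (hypothetical attribute shape).
function launchPrompt(persistentAttributes) {
  const game = persistentAttributes && persistentAttributes.currentGame;
  if (game && !game.finished) {
    return 'You have a game in progress. Would you like to resume it, or start a new game?';
  }
  return "Welcome to What's That Song! How many players are there?";
}
```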

Roll Call

When the user is ready to begin, the skill needs to register the Echo Buttons using a roll call: Alexa asks each player to register their button.

The Roll Call Methods handle the following:

  • GameEventHandler: Events from the game engine (Listens for the Echo Buttons)
  • NoHandler: The player has responded 'no' to the option of continuing roll call.
  • YesHandler: The player has responded 'yes' to the option of continuing roll call.
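
Roll call is driven by a GameEngine.StartInputHandler directive. A sketch in the documented Game Engine shape (the timeout value and recognizer/event names here are illustrative):

```javascript
// Build a StartInputHandler directive that listens for Echo Button
// presses during roll call and times out if no one presses a button.
function buildRollCallDirective(timeoutMs) {
  return {
    type: 'GameEngine.StartInputHandler',
    timeout: timeoutMs,
    recognizers: {
      button_down_recognizer: {
        type: 'match',
        fuzzy: false,
        anchor: 'end',
        pattern: [{ action: 'down' }]
      }
    },
    events: {
      button_down_event: {
        meets: ['button_down_recognizer'],
        reports: 'matches',
        shouldEndInputHandler: false
      },
      timeout: {
        meets: ['timeout'],
        reports: 'history',
        shouldEndInputHandler: true
      }
    }
  };
}
```

Button presses then arrive as GameEngine.InputHandlerEvent requests, which is what GameEventHandler above listens for.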

Gameplay

When the Start and Roll Call flows are done, the skill moves into general gameplay. This is the fun part! While the gameplay intents are intercepted in this part, most of the game logic is handled in game.js.

The Gameplay Methods handle the following:

  • EndGameHandler: The player has responded 'stop', 'cancel', 'no', requesting the game end.
  • GameEventHandler: Events from the game engine (Echo Buttons)
  • PlayGameHandler: The player has asked to play a game while in the middle of a game, continue on
  • YesHandler: Player has responded 'yes' to being ready to start the game
  • AnswerHandler: The player is answering a song.
  • DontKnowNextHandler: The player has responded 'don't know', 'next', or similar.
  • InProgressPackChooserIntent: The player has requested to change song pack, but has not provided us with the pack to change to.
  • CompletedPackChooserIntent: The player has requested to change song pack and has provided us with the pack to change to.
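
As a flavor of what AnswerHandler has to do, here is a hypothetical sketch of comparing a spoken answer against the current song title; the real matching in game.js may well be more forgiving:

```javascript
// Normalize a phrase for comparison: lowercase, strip punctuation,
// collapse whitespace.
function normalize(text) {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9 ]/g, '')
    .replace(/\s+/g, ' ')
    .trim();
}

// True when the spoken answer matches the expected title after
// normalization on both sides.
function isMatch(spoken, expected) {
  return normalize(spoken) === normalize(expected);
}
```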

The Encoder

The encoder is a critical part of the skill, as it is the service that converts the MP3 song previews into usable files to be played using an SSML Audio Tag. The encoder is only used on headless devices, as the APL Video Component handles the playback on screened ones.

The encoder is based on BeSpoken's Encoder found here.
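
Once the encoder produces a compliant HTTPS MP3 URL, playing it on a headless device is just a matter of wrapping the URL in an SSML audio tag, roughly:

```javascript
// Wrap an encoded preview URL in SSML for playback on headless devices.
// Alexa requires the URL to be HTTPS and the MP3 to be in its accepted
// format, which is exactly what the encoder step guarantees.
function buildAudioSsml(previewUrl) {
  return `<speak><audio src="${previewUrl}"/></speak>`;
}
```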

Challenges I ran into

Over the course of the project, I ran into many issues ranging from simple typos to services going down. I will go in-depth into some of these issues below.

APL Integration

When I first created "What's That Song", I built it with limited APL support: simple visuals displaying whether the guessed song was correct or not. Staying true to the groundwork I laid out, I then overhauled the APL, adding transitions, a scoreboard, and improved gameplay mechanics designed with screens in mind.

In the gallery above, you will find some visuals the users are likely to see when they play "What's That Song".

The search for ways to improve audio quality

Before my APL overhaul of "What's That Song", all of the song previews would have to be sent through my MP3 encoder and played through the SSML Audio Tag, which requires audio to be in a specific format, reducing the quality of the song preview.

Near the end of my overhaul, I had an idea to solve my audio quality issue on certain devices, specifically screened devices, by playing the audio through an invisible APL Video Component. This was relatively easy to integrate: I was already sending a SetPage command to my APL Document, so I could easily tack on a PlayMedia command alongside it. A simple modification to the APL Document was all that was needed to add the invisible Video Component.

When the song finishes playing, I use the Video Component's onEnd method to play a reprompt when in singleplayer mode. In multiplayer mode, a button timeout stops the song when the timeout has been reached and no one has buzzed in.
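
A sketch of the resulting directive, with the SetPage and PlayMedia commands sent together (the component IDs are placeholders, not the skill's actual ones):

```javascript
// Build an APL ExecuteCommands directive that advances the pager and
// starts playback in the hidden Video component in one round trip.
function buildPlaySongDirective(token, pageIndex, previewUrl) {
  return {
    type: 'Alexa.Presentation.APL.ExecuteCommands',
    token,
    commands: [
      { type: 'SetPage', componentId: 'gamePager', position: 'absolute', value: pageIndex },
      { type: 'PlayMedia', componentId: 'hiddenVideo', source: previewUrl, audioTrack: 'foreground' }
    ]
  };
}
```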

Now, when "What's That Song" is played on devices with screens, song audio quality is significantly better than when the skill is played on headless devices.

The search for better gameplay

Originally, my "What's That Song" skill listened only for the name of the song and nothing else. After playing through the game with friends and family, I noticed that many of the players would say both the song name and artist. It seemed like it could be a fun addition. So, I added an entirely new point system to the game, including a results page at the end of the game.

  • Correct Artist: 2 points
  • Correct Song: 3 points
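
The rule above reduces to a tiny scoring function; sketched:

```javascript
// Score an answer under the point system above: 2 points for the
// artist, 3 for the song, 5 when the player names both.
function scoreAnswer(gotArtist, gotSong) {
  return (gotArtist ? 2 : 0) + (gotSong ? 3 : 0);
}
```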

With the addition of the point system, I attempted to let answers be single-slot utterances. Originally, a user had to give their answer a prefix, such as "Is it ", but I noticed that a significant number of users were attempting to answer with only the song name, or "". When they did, they would hear the skill's reprompt, as my skill was not expecting such an answer. To combat this, I attempted to add "" to my list of utterances. The result was catastrophic: the skill could no longer be exited, and the entire gameplay broke. I realize now that because I was using the AMAZON.MusicRecording slot (US) or the AMAZON.Movie slot (UK, CA, AU, DE), and not a fully defined slot, the simple addition of "" matched almost any word spoken to Alexa, so the AnswerSongIntent intercepted every request sent to the skill. I never did solve this issue, but I will note that it would be resolved if developers had the ability to turn intents on and off on the fly from within the skill.

BeSpoken Encoder Outage

By far my largest challenge was the outage of a critical service that my skill depended on: the BeSpoken Encoder. The Encoder took the URL of an MP3 file and converted it into the format required for use with the SSML Audio Tag.

When I first deployed the skill, everything worked fine for about a week, but then the Encoder service became unreliable, with a few complete outages. I ended up having to reverse engineer their project on GitHub and deploy it myself on an Amazon EC2 instance.

What's next for "What's That Song"

In the future, I would like to allow users to add a song to their Spotify library with a single, on-screen button press.

How I built "What's That Song"

My project has three parts. As we go through each part, I will walk you through how to construct it. Alternatively, you can enable my published skill for use with your Alexa-enabled device to play "What's That Song?"

Let's begin!

Part 1: Spotify Setup

To read the user's playlists, the skill needs access to their Spotify account. We will achieve this through skill Account Linking, which lets the user log into Spotify using Spotify's OAuth login. This gives our skill an API token, allowing us to query the user's Spotify playlists.

1) Log into the Spotify Developer Console.

Spotify Developer Dashboard Login

2) Click the CREATE A CLIENT ID button.

CREATE A CLIENT ID

3) Fill out the form that pops up and click the NEXT button.

NOTE: Because this app will be an Alexa Skill, select "Voice - Other" under the "What are you building" section.

Form

4) On the next page, click NO, as we are not building a commercial integration.

5) On the final page, check all three (3) boxes and click the SUBMIT button.

SUBMIT

6) After the form is submitted, you will be directed to the App page. You will need to copy the Client ID and Client Secret. Keep these values in a safe place, as we will need them later.

7) Click the green EDIT SETTINGS button.

EDIT SETTINGS

8) After the popup appears, find the redirect URI section.

Redirect URI section

9) Add the following URIs:

10) Click the green SAVE button in the bottom left of the popup.

SAVE

Awesome! We configured Spotify! Next, we need to configure our S3 bucket before we can start building our skill!

Part 2: S3 and Lambda Configuration

Before we create the skill, we need to create an S3 bucket and create credentials in IAM that allow the Lambda function to access these services.

Part 2.1 - S3 Bucket Configuration

1) Sign into the S3 Console.

2) Click the blue Create Bucket button

Create Bucket

3) In the popup, enter a name for your bucket, then click the white Create button in the bottom left of the popup.

Popup

4) When the popup closes, click on the name of the bucket.

Click on the name of your bucket

5) In the navigation bar at the top, click on Permissions, then click the edit button to edit the public access settings for this bucket.

Click permissions

6) Uncheck all 4 fields, then save.

Uncheck all 4 fields

7) Follow the prompts in the popup.

Follow the prompts in the popup

8) Next, click Access Control List in the sub-navigation.

Click Access Control List

9) Click Everyone under Public Access, check the following in the popup, and click the blue Save button.

  • List objects
  • Read bucket permissions

Click Everyone under Public Access

The Access Control List in the sub-navigation should now look like the following.

New look!

10) Now, click CORS configuration in the sub-navigation and paste the following code in the editor:

<?xml version="1.0" encoding="UTF-8"?>
<CORSConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
<CORSRule>
   <AllowedOrigin>*</AllowedOrigin>
   <AllowedMethod>GET</AllowedMethod>
   <MaxAgeSeconds>3000</MaxAgeSeconds>
   <AllowedHeader>Authorization</AllowedHeader>
</CORSRule>
</CORSConfiguration>

CORS configuration

11) Click the blue Save button to save your changes.

Your bucket is now configured, but we still have to set up a user to write to the bucket.

Part 2.2 - IAM User Configuration for S3

1) Navigate to the IAM Console.

2) Click on the Users tab in the left navigation.

Users Tab

3) Near the top of the page, click on the blue Add User button.

Add User

4) The username can be any name you wish. Click next in the bottom right of the page to continue.

Click next in the bottom right of the page to continue

5) Attach the AmazonS3FullAccess Policy. Click next in the bottom right of the page to continue.

Click next in the bottom right of the page to continue

6) We do not need to add any tags. Click next in the bottom right of the page to continue.

Click next in the bottom right of the page to continue

7) Review the user. Click next in the bottom right of the page to continue.

Click next in the bottom right of the page to continue

8) On the last step, your user credentials will be displayed. Make note of the Access Key ID and the Secret Access Key. I highly suggest downloading the .csv file, as it contains these keys and you will not be able to see the secret key again.

Your user credentials will be displayed

9) Click the Close button in the bottom right of the page.

You now have your S3 user set up, which means that we can add objects to our bucket with the credentials. The objects we will add will be 30-second song previews that have been converted to play through the audio SSML tag.
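
For reference, an upload of one encoded preview might look like this with the aws-sdk (the bucket name and key in the usage line are placeholders; the helper just builds the putObject parameters):

```javascript
// Build S3 putObject parameters for an encoded 30-second preview.
// The object is made publicly readable so the audio SSML tag can
// fetch it, matching the bucket ACL configured above.
function buildUploadParams(bucket, key, mp3Buffer) {
  return {
    Bucket: bucket,
    Key: key,
    Body: mp3Buffer,
    ACL: 'public-read',
    ContentType: 'audio/mpeg'
  };
}

// Usage (not run here):
//   new AWS.S3().putObject(buildUploadParams('my-bucket', 'previews/song.mp3', buf)).promise();
```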

Part 2.3 - IAM Role Configuration for Lambda (Only for Developer Console users)

NOTE: This is only needed if you do not want to use the ASK CLI to create your skill.

1) Click on the Roles tab in the left navigation.

Click on the Roles tab

2) Near the top of the page, click on the blue Create Role button.

Create Role button

3) Choose Lambda as the AWS service. Click next in the bottom right of the page to continue.

Click next in the bottom right of the page to continue

4) Add the AmazonDynamoDBFullAccess Policy. Don't click next yet...

Click next in the bottom right of the page to continue

5) Add the CloudWatchFullAccess Policy. Click next in the bottom right of the page to continue.

Click next in the bottom right of the page to continue

6) Adding tags is not needed for our use. Click next in the bottom right of the page to continue.

Click next in the bottom right of the page to continue

7) Name and confirm the role. Click Create Role in the bottom right of the page to create.

Click Create Role in the bottom right of the page to create

8) After you click create, you will be brought back to your roles, where you should see a green confirmation message.

Confirmation message

Now our lambda function can use this role to save user data in DynamoDB and write logs to CloudWatch.

Part 3: Alexa Skill

For the Alexa Skill, there are two paths you can take to create the skill, either through the developer console, or using the ASK CLI. I will go through both paths, so you can choose which one you would rather follow.

Part 3.1 - Skill Creation with ASK CLI

1) Follow the ASK CLI Quick Start through Step 3 and come back here.

2) Create a new folder anywhere on your computer and open it in VS Code.

3) In VS Code open a new terminal by going to Terminal->New Terminal.

4) In the new terminal, clone my repository with the following:

git clone https://github.com/AustinMathuw/WhatsThatSong.git

5) Once cloned, navigate to the following file:

lambda/config/settings.js

This file stores our settings and key information for the skill.

6) On line 23, make sure the value of APL_ENABLED is set to true.

7) On line 25, add your S3_KEYID and S3_SECRET you made note of in Part 2. Then, change S3_BUCKET to the bucket you created earlier.
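
The entries you are editing look roughly like this (placeholder values shown; keep your real credentials out of version control):

```javascript
// Illustrative shape of the relevant settings.js entries; substitute
// your own IAM keys and bucket name from Part 2.
const settings = {
  APL_ENABLED: true,                      // enable screened-device visuals
  S3_KEYID: 'YOUR_ACCESS_KEY_ID',         // from the IAM user in Part 2.2
  S3_SECRET: 'YOUR_SECRET_ACCESS_KEY',    // from the IAM user in Part 2.2
  S3_BUCKET: 'your-bucket-name'           // the bucket from Part 2.1
};
```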

8) Save the settings.js file, go back to your terminal and type:

cd WhatsThatSong/
ask deploy

9) The ask deploy command creates the skill in your developer console and also configures and creates your Lambda function.

10) To set up Account Linking, go to step 32 in Part 3.2.

Part 3.2 - Skill Creation with ASK Developer Console

1) Sign into the Alexa Skills Kit Developer Console.

2) Click the blue Create Skill button.

Create Skill

3) On the next page, enter a name for your skill, keep English as the default language, make sure the Custom model is selected, and click the blue Create Skill button.

Create Skill

4) Make sure the Start from Scratch template is selected and click the blue Choose button.

Start from Scratch

The following page displayed is the Build page for your Alexa Skill. Here we will configure the following:

  • Invocation Name
  • Intents
  • Slots
  • Interfaces
  • Lambda Endpoint
  • Account Linking

Build page

5) Instead of individually configuring all of our intents and slots, we will use the JSON editor. Click on JSON Editor in the left navigation panel.

Left Navigation Panel

6) Copy and paste the JSON here and paste it into the JSON Editor.

JSON Editor

7) Click the blue Save Model button above the JSON Editor to save the model.

Save Model is in the top left of the JSON Editor

Once saved, you should see a green notification in the bottom right of your browser letting you know the model was saved successfully.

Notification appears in the bottom right of the webpage

8) Under the JSON Editor option in the left navigation bar, you will see an option called Interfaces. Click on it and enable the following options:

  • Both Alexa Gadget options (Gadget Controller, Game Engine)
  • Alexa Presentation Language (APL)

9) Click the blue Build Model button above the JSON Editor to build the model. You should see a blue notification in the same place as the saved notification letting you know that the build has started.

Building loading icon

Once built, you should see another green notification letting you know the model was built successfully.

Notification appears in the bottom right of the webpage

10) Add a new tab in your browser and navigate to the Lambda Developer Console.

11) Click the orange Create Function button.

Create Function

12) On the next page, give your function a name and choose the existing role you created in Part 2.3.

Choose the existing role

13) Click the orange Create function button in the bottom right of the page.

14) On the next page, find the Alexa Skills Kit trigger and click on it. It will say configuration needed.

Find the Alexa Skills Kit trigger

15) Click on the Alexa Skills Kit trigger and scroll down. Then, go back to your Alexa Developer Console and go to the Endpoint option in the left navigation bar.

Endpoint option

16) Copy the Skill ID, paste it in your Lambda function and click the Add button.

Copy the Skill ID

17) Scroll up and Save your Lambda function.

18) Copy your Lambda function's ARN.

Copy your Lambda function's ARN

Paste it in the endpoint section of the Alexa Developer Console.

Paste it in the endpoint section

19) Save the Alexa Skill endpoint by clicking the Save Endpoints button on the top left of the page.

Save the Alexa Skill endpoint

20) Create a new folder anywhere on your computer and open it in VS Code.

21) In VS Code open a new terminal by going to Terminal->New Terminal.

22) In the new terminal, clone my repository with the following:

git clone https://github.com/AustinMathuw/WhatsThatSong.git

23) Once cloned, navigate to the following file:

lambda/custom/config/settings.js

This file stores our settings and key information for the skill.

24) On line 23, make sure the value of APL_ENABLED is set to true.

25) On line 25, add your S3_KEYID and S3_SECRET you made note of in Part 2. Then, change S3_BUCKET to the bucket you created earlier.

26) Save the settings.js file, go back to your terminal and type:

cd WhatsThatSong/lambda/custom/
npm install

NOTE: You need to install Node.js for npm to work.

27) In a file explorer, navigate to your project directory and go to lambda/custom. Highlight everything and compress it into a .zip file.

Zip up skill

NOTE: If you want to edit the code in Lambda, you will need to delete the aws-sdk folder under the node_modules folder.

28) In your Lambda Developer Console, click on the name of your function.

Click on the name of your skill

29) Scroll to the Function Code section.

Function Code

30) Click Upload and find your .zip file.

Find your zip folder

31) Click save in the upper right of the page.

Click save

32) Go back into the Alexa Developer Console and select the ACCOUNT LINKING option in the left navigation.

ACCOUNT LINKING

33) Fill out the page as shown in the following images:

1

2

NOTE: The Client ID and Client Secret are those you made note of when setting up Spotify.

34) Click the Save button at the top right of the page, then click CUSTOM in the left navigation bar.

35) Click back on the JSON Editor option, then Save and Build the skill once more.

After you see the notifications with no errors, you are all done and ready to test the skill!

Part 4: Testing

Wow! I'm glad you are still around! It's time for testing!

1) In your browser, navigate to the Alexa Skills Store.

2) In the top right of the page, click the Your Skills button.

Your Skills

3) Click the DEV SKILLS option in the top navigation bar.

DEV SKILLS

4) In the top right of the page, click the Settings button.

Settings button

5) Click the Link Account link.

Blue link on the right side of the page

6) Log into Spotify.

Log into Spotify

7) Confirm that it is you.

Confirm that it is you

8) Close the window as prompted.

Close the window as prompted

9) After the successful account linking flow, go back to your skill in the Alexa Developer Console.

10) Then go to the Test tab in the top navigation bar. Make sure testing is enabled and try launching your skill with "Alexa, open song trivia".

Test tab

Additional Information

My project is not endorsed or sponsored by Spotify AB.

You can get the code from my GitHub.
