Gallery (screenshots from the skill):
- Skill Logo
- Play Song Flow on devices with screens
- Play Song Flow on devices without screens
- Welcome screen
- Game in progress notification
- Ready-to-start prompt
- Screen displayed while a song is playing
- Incorrect song screen
- Correct answer (artist and song)
- Correct answer (song only)
- Results screen
Inspiration
When walking to class or driving, I like to listen to my Spotify playlists. When a new song starts playing, I often find myself guessing the song title as sort of a mental game with myself. The idea of being able to play a game tailored to my own quirky habits was the driving force to build “What's That Song.”
What it does
While there are Alexa Skills that have users guess songs, these skills are not customizable. With “What's That Song,” users can play the game using their followed Spotify playlists as their song pack. This allows users to play with public playlists or their own personalized, custom playlists.
Along with being customizable, “What's That Song” is available in the following locales:
- US English: en-US
- UK English: en-GB
- Canadian English: en-CA
- Australian English: en-AU
- German: de-DE
Additionally, “What's That Song” is playable as a singleplayer or multiplayer game for up to four players with the use of Amazon Echo Buttons.
It also supports the Alexa Presentation Language (APL), enabling the skill to show beautiful visuals on screened devices, such as the Echo Show or Echo Spot.
How "What's That Song" works
"What's That Song" has many parts, but it can be summed up in three general categories, each important to the skill's functionality. Let's check them out!
Account Linking
When a user first enables "What's That Song" within the Alexa App, they are prompted to link their Spotify Account to the skill. This allows me to make API calls to Spotify on behalf of the user.
Account Linking is required for the skill to function: it grants the skill access to the user's followed Spotify playlists, which is what makes the game customizable and is needed to play.
The Lambda Function
The bulk of the skill's functionality lives in our Lambda function, specifically in the scripts within the handlers and configuration folders.
Global Methods
When any request hits the Lambda, it is sent to a request interceptor method that formats the request and prepares the Lambda to respond. When the response is ready to be sent back to the skill, it is run through a response interceptor method that saves all session attributes to persistent attributes and cleans up the Lambda function's state.
The Global Methods handle the following:
- RequestInterceptor: Intercept the incoming request before dispatching to the handler.
- ResponseInterceptor: Intercept the outgoing response before sending back to Alexa.
- DefaultHandler: When all else fails, go to the default.
- HelpHandler: General help prompt for the entire skill.
- StopCancelHandler: Quits the skill when the user says "Stop", or "Cancel".
- SessionEndedRequestHandler: Cleans up Lambda when a session is ended.
- ErrorHandler: Handles all Lambda errors that arise in the skill flow.
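The interceptor pattern described above can be sketched as follows. The shapes mirror the ASK SDK v2 for Node.js; the bodies here are illustrative assumptions, not the skill's actual code:

```javascript
// Request interceptor: load persisted state into session attributes
// before any handler runs (illustrative sketch).
const RequestInterceptor = {
  async process(handlerInput) {
    const attributesManager = handlerInput.attributesManager;
    const persistent = await attributesManager.getPersistentAttributes();
    attributesManager.setSessionAttributes(Object.assign({}, persistent));
  }
};

// Response interceptor: mirror session attributes back into persistent
// storage on every turn so a game can survive across sessions.
const ResponseInterceptor = {
  async process(handlerInput) {
    const attributesManager = handlerInput.attributesManager;
    attributesManager.setPersistentAttributes(
      attributesManager.getSessionAttributes());
    await attributesManager.savePersistentAttributes();
  }
};
```

Registering these once on the skill builder is what lets every handler read and write one shared attribute bag without worrying about persistence.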
Skill Start
When the skill starts, it checks whether there is a past game in progress. If there is, Alexa asks the user whether they would like to resume or start a new game. When the user responds, the skill continues to roll call.
The Skill Start Methods handle the following:
- LaunchPlayGameHandler: Invoked when a user says 'open' or 'play' or some other variant
- StartNewGameHandler: Invoked when a user wants to start a new game
- PlayerCountHandler: Invoked when a user responds to the skill regarding player count for the new game
- NoHandler: The player has responded 'no' to the option of resuming the previous game.
- YesHandler: The player has responded 'yes' to the option of resuming the previous game.
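The resume check at launch boils down to a branch on the persisted state. A minimal sketch (the attribute name gameInProgress and the prompt wording are assumptions for illustration):

```javascript
// Decide the launch prompt based on whether a previous game was saved.
// Illustrative sketch, not the skill's actual code.
function buildLaunchPrompt(persistentAttributes) {
  // A game is resumable if a previous session saved one mid-play.
  if (persistentAttributes && persistentAttributes.gameInProgress) {
    return 'Welcome back! Would you like to resume your previous game, ' +
           'or start a new one?';
  }
  return 'Welcome to What\'s That Song! How many players are there?';
}
```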
Roll Call
When the user is ready to begin, the skill needs to register the Echo Buttons using a roll call; Alexa asks each player to register their button in turn.
The Roll Call Methods handle the following:
- GameEventHandler: Events from the game engine (Listens for the Echo Buttons)
- NoHandler: The player has responded 'no' to the option of continuing roll call.
- YesHandler: The player has responded 'yes' to the option of continuing roll call.
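Roll call is driven by the Game Engine interface: the skill sends a GameEngine.StartInputHandler directive describing which button events it wants reported. A hedged sketch of such a directive (the recognizer and event names here are illustrative assumptions, not the skill's actual ones):

```javascript
// Build a roll-call directive for Echo Buttons. The field layout follows
// the documented GameEngine.StartInputHandler interface; 'timed out' is
// the interface's built-in timeout recognizer.
function buildRollCallDirective(playerCount, timeoutMs) {
  return {
    type: 'GameEngine.StartInputHandler',
    timeout: timeoutMs,
    // One proxy name per expected button, resolved to gadget ids later.
    proxies: Array.from({ length: playerCount }, (_, i) => 'btn' + (i + 1)),
    recognizers: {
      all_pressed: {
        type: 'match',
        fuzzy: true,
        anchor: 'start',
        // Match a press from each button, in any interleaving.
        pattern: Array.from({ length: playerCount },
          (_, i) => ({ gadgetIds: ['btn' + (i + 1)], action: 'down' }))
      }
    },
    events: {
      roll_call_complete: {
        meets: ['all_pressed'],
        reports: 'matches',
        shouldEndInputHandler: true
      },
      roll_call_timeout: {
        meets: ['timed out'],
        reports: 'history',
        shouldEndInputHandler: true
      }
    }
  };
}
```

The GameEventHandler above then receives either roll_call_complete (with the matched gadget ids) or roll_call_timeout.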
Gameplay
When the Start and Roll Call flows are done, the skill moves into general gameplay. This is the fun part! While the gameplay intents are intercepted in this part, most of the game logic is handled in game.js.
The Gameplay Methods handle the following:
- EndGameHandler: The player has responded 'stop', 'cancel', 'no', requesting the game end.
- GameEventHandler: Events from the game engine (Echo Buttons)
- PlayGameHandler: The player has asked to play a game while in the middle of a game, continue on
- YesHandler: Player has responded 'yes' to being ready to start the game
- AnswerHandler: The player is answering a song.
- DontKnowNextHandler: The player has responded 'don't know', 'next', or similar.
- InProgressPackChooserIntent: The player has requested to change song pack, but has not provided us with the pack to change to.
- CompletedPackChooserIntent: The player has requested to change song pack and has provided us with the pack to change to.
The Encoder
The encoder is a critical part of the skill, as it is the service that converts the MP3 song previews into usable files to be played using an SSML Audio Tag. The encoder is only used on headless devices, as the APL Video Component handles the playback on screened ones.
The encoder is based on BeSpoken's Encoder, which is available on GitHub.
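Under the hood, an encoder like this is essentially an ffmpeg re-encode, since the SSML Audio Tag constrains the MP3's bit rate and sample rate. The helper below only builds the argument list; the exact flags are assumptions modeled on Alexa's published audio limits, not BeSpoken's actual implementation:

```javascript
// Build ffmpeg arguments to convert a song preview into an SSML-safe MP3.
// Flag values are illustrative assumptions based on Alexa's audio limits.
function buildFfmpegArgs(inputUrl, outputPath) {
  return [
    '-i', inputUrl,           // source MP3 preview (ffmpeg can read URLs)
    '-ac', '2',               // stereo output
    '-codec:a', 'libmp3lame', // encode with the LAME MP3 encoder
    '-b:a', '48k',            // SSML audio requires a 48 kbps bit rate
    '-ar', '24000',           // ...and a supported sample rate (24 kHz here)
    '-y', outputPath          // overwrite the output file if present
  ];
}

// usage sketch:
//   const { spawn } = require('child_process');
//   spawn('ffmpeg', buildFfmpegArgs(previewUrl, '/tmp/out.mp3'));
```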
Challenges I ran into
Over the course of the project, I ran into many issues ranging from simple typos to services going down. I will go in-depth into some of these issues below.
APL Integration
When I first created "What's That Song", I built it with limited APL support: simple visuals showing whether the guessed song was correct or not. Building on that groundwork, I later overhauled the APL, adding transitions, a scoreboard, and improved gameplay mechanics designed with screens in mind.
In the gallery above, you will find some visuals the users are likely to see when they play "What's That Song".
The search for ways to improve audio quality
Before my APL overhaul of "What's That Song", all of the song previews would have to be sent through my MP3 encoder and played through the SSML Audio Tag, which requires audio to be in a specific format, reducing the quality of the song preview.
Near the end of my overhaul, I thought of a way to solve my audio quality issue on screened devices: playing the audio through an invisible APL Video Component. This was relatively easy to integrate, as I was already sending a SetPage command to my APL Document, so I could tack a PlayMedia command on alongside it. A simple modification to the APL Document added the invisible Video Component.
When the song finishes playing, I use the Video Component's onEnd method to play a reprompt when in singleplayer mode. In multiplayer mode, a button timeout stops the song when the timeout has been reached and no one has buzzed in.
Now, when "What's That Song" is played on devices with screens, song audio quality is significantly better than when the skill is played on headless devices.
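The SetPage-plus-PlayMedia pairing described above might look like the following. The component ids ('pager', 'songPlayer') and the token are illustrative assumptions; the command shapes follow the documented APL ExecuteCommands directive:

```javascript
// Build an ExecuteCommands directive that advances the visuals and starts
// the song preview in a hidden Video component (illustrative sketch).
function buildPlayDirective(token, pageIndex, previewUrl) {
  return {
    type: 'Alexa.Presentation.APL.ExecuteCommands',
    token: token, // must match the token of the rendered APL document
    commands: [
      // Flip the on-screen pager to the "song playing" page.
      { type: 'SetPage', componentId: 'pager',
        position: 'absolute', value: pageIndex },
      // Start playback in the invisible Video component.
      { type: 'PlayMedia', componentId: 'songPlayer',
        source: [previewUrl] }
    ]
  };
}
```

The hidden Video component's onEnd handler is then what fires the singleplayer reprompt once the preview finishes.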
The search for better gameplay
Originally, my "What's That Song" skill listened only for the name of the song and nothing else. After playing through the game with friends and family, I noticed that many of the players would say both the song name and artist. It seemed like it could be a fun addition. So, I added an entirely new point system to the game, including a results page at the end of the game.
- Correct Artist: 2 points
- Correct Song: 3 points
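A sketch of how this scoring could be computed (assumed for illustration, not the skill's actual code):

```javascript
// Award points per the rules above: 2 for the artist, 3 for the song,
// so naming both earns 5 points.
function scoreAnswer(correctArtist, correctSong) {
  let points = 0;
  if (correctArtist) points += 2; // Correct Artist: 2 points
  if (correctSong) points += 3;   // Correct Song: 3 points
  return points;
}
```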
With the addition of the point system, I attempted to make the answers work as single-slot utterances. A user originally had to give their answer a prefix, such as "Is it ...", but I noticed that a significant number of users were attempting to answer with only the song name. Those users would then hear the skill's reprompt, because the skill was not expecting such an answer. To combat this, I tried adding the bare slot by itself to my list of utterances. The result was catastrophic: the skill could no longer be exited, and the entire gameplay broke. I realize now that because I was using the AMAZON.MusicRecording (US) or AMAZON.Movie (UK, CA, AU, DE) slot types rather than a fully defined slot, the bare-slot utterance matched almost any word spoken to Alexa, causing AnswerSongIntent to intercept every request sent to the skill. I never did solve this issue, but it would be resolved if developers had the ability to turn intents on and off on the fly from within the skill.
BeSpoken Encoder Outage
By far my largest challenge was the outage of a critical service my skill depended on: the BeSpoken Encoder. The Encoder takes the URL of an MP3 file and converts it into the format required for use with the SSML Audio Tag.
When I first deployed the skill, everything worked fine for about a week, but then the Encoder service became unreliable, with a few complete outages. I ended up having to reverse engineer their project on GitHub and deploy it myself on an Amazon EC2 instance.
What's next for "What's That Song"
In the future, I would like to allow users to add a song to their Spotify library with a single, on-screen button press.
How I built "What's That Song"
My project has three parts, plus a final testing section. As we go through each part, I will walk you through how to construct it. Alternatively, you can enable my published skill on your Alexa-enabled device to play "What's That Song?"
Let's begin!
Part 1: Spotify Setup
To read the user's playlists, the skill needs access to their account. We achieve this through skill Account Linking, letting the user log in through Spotify's OAuth flow. This gives our skill an API token, allowing us to query the user's Spotify playlists.
1) Log into the Spotify Developer Console.
2) Click the CREATE A CLIENT ID button.
3) Fill out the form that pops up and click the NEXT button.
NOTE: Because this app will be an Alexa Skill, select "Voice - Other" under the "What are you building" section.
4) On the next page, click NO, as we are not building a commercial integration.
5) On the final page, check all three (3) boxes and click the SUBMIT button.
6) After the form is submitted, you will be directed to the App page. Copy the Client ID and Client Secret and keep them in a safe place; we will need them later.
7) Click the green EDIT SETTINGS button.
8) After the popup appears, find the redirect URI section.
9) Add the following URIs:
- https://layla.amazon.com/api/skill/link/MLRELFDJMUDDH
- https://pitangui.amazon.com/api/skill/link/MLRELFDJMUDDH
- https://alexa.amazon.co.jp/api/skill/link/MLRELFDJMUDDH
10) Click the green SAVE button in the bottom left of the popup.
Awesome! We configured Spotify! Next, we need to configure our S3 bucket before we can start building our skill!
Part 2: S3 and Lambda Configuration
Before we create the skill, we need to create an S3 bucket and IAM credentials that allow the Lambda function to access these services.
Part 2.1 - S3 Bucket Configuration
1) Sign into the S3 Console.
2) Click the blue Create Bucket button.
3) In the popup, enter a name for your bucket, then click the white Create button in the bottom left of the popup.
4) When the popup closes, click on the name of the bucket.
5) In the navigation bar at the top, click on Permissions, then click the edit button to edit the public access settings for this bucket.
6) Uncheck all 4 fields, then save.
7) Follow the prompts in the popup.
8) Next, click Access Control List in the sub-navigation.
9) Click Everyone under Public Access, check the following in the popup, and click the blue Save button.
- List objects
- Read bucket permissions
The Everyone entry under Public Access in the Access Control List should now show both permissions granted.
10) Now, click CORS configuration in the sub-navigation and paste the following code in the editor:
<?xml version="1.0" encoding="UTF-8"?>
<CORSConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
  <CORSRule>
    <AllowedOrigin>*</AllowedOrigin>
    <AllowedMethod>GET</AllowedMethod>
    <MaxAgeSeconds>3000</MaxAgeSeconds>
    <AllowedHeader>Authorization</AllowedHeader>
  </CORSRule>
</CORSConfiguration>
11) Click the blue Save button to save your changes.
Your bucket is now configured, but we still have to set up a user to write to the bucket.
Part 2.2 - IAM User Configuration for S3
1) Navigate to the IAM Console.
2) Click on the Users tab in the left navigation.
3) Near the top of the page, click on the blue Add User button.
4) The username can be any name you wish. Click next in the bottom right of the page to continue.
5) Attach the AmazonS3FullAccess Policy. Click next in the bottom right of the page to continue.
6) We do not need to add any tags. Click next in the bottom right of the page to continue.
7) Review the user. Click next in the bottom right of the page to continue.
8) On the last step, your user credentials will be displayed. Make note of the Access Key ID and the Secret Access Key. I highly suggest downloading the .csv file, as it contains these keys and you will not be able to see the secret key again.
9) Click the Close button in the bottom right of the page.
You now have your S3 user set up, which means that we can add objects to our bucket with the credentials. The objects we will add will be 30-second song previews that have been converted to play through the audio SSML tag.
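A sketch of the upload parameters the Lambda would use with these credentials. The aws-sdk v2 call is shown in a usage comment; the bucket and key names are illustrative assumptions:

```javascript
// Build the putObject parameters for uploading a converted song preview.
// Field names follow the aws-sdk v2 S3 API.
function buildUploadParams(bucket, key, mp3Buffer) {
  return {
    Bucket: bucket,
    Key: key,
    Body: mp3Buffer,
    ContentType: 'audio/mpeg',
    ACL: 'public-read' // previews must be publicly readable for the audio tag
  };
}

// usage sketch (aws-sdk v2):
//   const AWS = require('aws-sdk');
//   const s3 = new AWS.S3({ accessKeyId, secretAccessKey });
//   await s3.putObject(
//     buildUploadParams('my-bucket', 'previews/track.mp3', buf)).promise();
```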
Part 2.3 - IAM Role Configuration for Lambda (Only if not using the ASK CLI)
NOTE: This is only needed if you do not want to use the ASK CLI to create your skill.
1) Click on the Roles tab in the left navigation.
2) Near the top of the page, click on the blue Create Role button.
3) Choose Lambda as the AWS service. Click next in the bottom right of the page to continue.
4) Add the AmazonDynamoDBFullAccess Policy. Don't click next yet...
5) Add the CloudWatchFullAccess Policy. Click next in the bottom right of the page to continue.
6) Adding tags is not needed for our use. Click next in the bottom right of the page to continue.
7) Name and confirm the role. Click Create Role in the bottom right of the page to create.
8) After you click create, you will be brought back to your roles page, where you should see a green confirmation message.
Now our Lambda function can use this role to save user data in DynamoDB and write logs to CloudWatch.
Part 3: Alexa Skill
For the Alexa Skill, there are two paths you can take: the developer console or the ASK CLI. I will go through both, so you can choose which one you would rather follow.
Part 3.1 - Skill Creation with ASK CLI
1) Follow the ASK CLI Quick Start through Step 3 and come back here.
2) Create a new folder anywhere on your computer and open it in VS Code.
3) In VS Code open a new terminal by going to Terminal->New Terminal.
4) In the new terminal, clone my repository with the following:
git clone https://github.com/AustinMathuw/WhatsThatSong.git
5) Once cloned, navigate to the following file:
lambda/config/settings.js
This file stores our settings and key information for the skill.
6) On line 23, make sure the value of APL_ENABLED is set to true.
7) On line 25, add your S3_KEYID and S3_SECRET you made note of in Part 2. Then, change S3_BUCKET to the bucket you created earlier.
8) Save the settings.js file, go back to your terminal and type:
cd WhatsThatSong/
ask deploy
9) The ask deploy command creates the skill in your developer console and also configures and creates your Lambda function.
10) To set up Account Linking, go to step 32 in Part 3.2.
Part 3.2 - Skill Creation with ASK Developer Console
1) Sign into the Alexa Skills Kit Developer Console.
2) Click the blue Create Skill button.
3) On the next page, enter a name for your skill, keep English as the default language, make sure the Custom model is selected, and click the blue Create Skill button.
4) Make sure the Start from Scratch template is selected and click the blue Choose button.
The following page displayed is the Build page for your Alexa Skill. Here we will configure the following:
- Invocation Name
- Intents
- Slots
- Interfaces
- Lambda Endpoint
- Account Linking
5) Instead of individually configuring all of our intents and slots, we will use the JSON editor. Click on JSON Editor in the left navigation panel.
6) Copy the skill's interaction model JSON from my repository and paste it into the JSON Editor.
7) Click the blue Save Model button above the JSON Editor to save the model.
Once saved, you should see a green notification in the bottom right of your browser letting you know the model was saved successfully.
8) Under the JSON Editor option in the left navigation bar, you will see an option called Interfaces. Click on it and enable the following options:
- Both Alexa Gadget options (Gadget Controller, Game Engine)
- Alexa Presentation Language (APL)
9) Click the blue Build Model button above the JSON Editor to build the model. You should see a blue notification in the same place as the saved notification letting you know that the build has started.
Once built, you should see another green notification informing you that the model was built successfully.
10) Add a new tab in your browser and navigate to the Lambda Developer Console.
11) Click the orange Create Function button.
12) On the next page, give your function a name and choose the existing role you created in Part 2.3.
13) Click the orange Create function button in the bottom right of the page.
14) On the next page, find the Alexa Skills Kit trigger and click on it. It will say configuration needed.
15) Click on the Alexa Skills Kit trigger and scroll down. Then, go back to your Alexa Developer Console and go to the Endpoint option in the left navigation bar.
16) Copy the Skill ID, paste it in your Lambda function and click the Add button.
17) Scroll up and Save your Lambda function.
18) Copy your Lambda function's ARN and paste it in the endpoint section of the Alexa Developer Console.
19) Save the Alexa Skill endpoint by clicking the Save Endpoints button on the top left of the page.
20) Create a new folder anywhere on your computer and open it in VS Code.
21) In VS Code open a new terminal by going to Terminal->New Terminal.
22) In the new terminal, clone my repository with the following:
git clone https://github.com/AustinMathuw/WhatsThatSong.git
23) Once cloned, navigate to the following file:
lambda/custom/config/settings.js
This file stores our settings and key information for the skill.
24) On line 23, make sure the value of APL_ENABLED is set to true.
25) On line 25, add your S3_KEYID and S3_SECRET you made note of in Part 2. Then, change S3_BUCKET to the bucket you created earlier.
26) Save the settings.js file, go back to your terminal and type:
cd WhatsThatSong/lambda/custom/
npm install
NOTE: You need to install Node.js for npm to work.
27) In a file explorer, navigate to your project directory and go to lambda/custom. Highlight everything and compress it into a .zip archive.
NOTE: If you want to edit the code in Lambda, you will need to delete the aws-sdk folder under the node_modules folder.
28) In your Lambda Developer Console, click on the name of your function.
29) Scroll to the Function Code section.
30) Click Upload and select your .zip archive.
31) Click save in the upper right of the page.
32) Go back into the Alexa Developer Console and select the ACCOUNT LINKING option in the left navigation.
33) Fill out the Account Linking form with Spotify's OAuth details (the authorization and token URIs, plus your Client ID and Client Secret):
NOTE: The Client ID and Client Secret are those you made note of when setting up Spotify.
34) Click the Save button at the top right of the page, then click CUSTOM in the left navigation bar.
35) Click back on the JSON Editor option, then Save and Build the skill once more.
After you see the notifications, with no errors, you are all done and ready to test the skill!
Part 4: Testing
Wow! I'm glad you are still around! It's time for testing!
1) In your browser, navigate to the Alexa Skills Store.
2) In the top right of the page, click the Your Skills button.
3) Click the DEV SKILLS option in the top navigation bar.
4) In the top right of the page, click the Settings button.
5) Click the Link Account link.
6) Log into Spotify.
7) Confirm that it is you.
8) Close the window as prompted.
9) After the successful account linking flow, go back to your skill in the Alexa Developer Console.
10) Then go to the Test tab in the top navigation bar. Make sure testing is enabled and try launching your skill with "Alexa, open song trivia"
Additional Information
My project is not endorsed or sponsored by Spotify AB.
You can get the code from my GitHub.
Built With
- amazon-alexa
- amazon-ec2
- amazon-web-services
- amazon-dynamodb
- amazon-cloudwatch
- amazon-lambda
- amazon-iam
- alexa-skills-kit
- spotify