Inspiration

I am interested in creating voice tools, and when APL (the Alexa Presentation Language) came out, I immediately thought about using it for voice-controlled image creation and manipulation.

What it does

You create pictures by saying things like "make a green triangle," "move it left five times," or "rotate it counterclockwise." Possible instructions include moving in one of the four directions, rotating, growing, shrinking, changing color, and moving up or down layers (since shapes can overlap). You can also name your shapes and then refer to them by name ("move Sam right"), and show or hide the names.
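A minimal sketch of how a shape and a "move" command like the ones above might be represented. The class, field names, and step size are illustrative assumptions, not the skill's actual code:

```python
from dataclasses import dataclass
from typing import Optional

STEP = 50  # hypothetical distance moved per "move" command

@dataclass
class Shape:
    kind: str                   # e.g. "triangle"
    color: str                  # e.g. "green"
    x: int = 0
    y: int = 0
    layer: int = 0
    name: Optional[str] = None  # optional user-given name like "Sam"

def move(shape: Shape, direction: str, times: int = 1) -> Shape:
    """Apply a spoken move command, e.g. 'move Sam left five times'."""
    dx = {"left": -STEP, "right": STEP}.get(direction, 0)
    dy = {"up": -STEP, "down": STEP}.get(direction, 0)
    shape.x += dx * times
    shape.y += dy * times
    return shape

sam = Shape(kind="triangle", color="green", name="Sam")
move(sam, "left", times=5)  # "move Sam left five times"
```

Keeping each shape as a small record like this makes the other commands (rotate, grow, shrink, recolor, change layer) simple field updates as well.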

I tried to make it more entertaining by throwing in encouraging comments when the picture or the work done is complex enough. There are also interjections when Alexa has trouble understanding what you want.

How I built it

The skill is coded in Python on AWS Lambda, using DynamoDB for persistence.
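The core request-handling loop can be sketched as a pure dispatch function: Lambda receives the Alexa request, the handler updates the picture state, and the state is persisted between turns. The intent name, slot names, and response phrasing below are illustrative assumptions, not the skill's actual code:

```python
# Hedged sketch of intent dispatch for an Alexa skill on Lambda.
# Intent and slot names ("CreateShapeIntent", "shape", "color") are
# hypothetical; real state would be loaded/saved via DynamoDB.
def handle_request(event, state):
    """Return (speech, new_state) for one Alexa request."""
    intent = event["request"]["intent"]["name"]
    if intent == "CreateShapeIntent":
        slots = event["request"]["intent"]["slots"]
        shape = {"kind": slots["shape"]["value"],
                 "color": slots["color"]["value"]}
        return f"Okay, a {shape['color']} {shape['kind']}.", state + [shape]
    return "What next?", state

# Simulated request for "make a green triangle":
event = {"request": {"intent": {"name": "CreateShapeIntent",
         "slots": {"shape": {"value": "triangle"},
                   "color": {"value": "green"}}}}}
speech, shapes = handle_request(event, [])
```

Keeping the dispatch pure (state in, state out) makes it easy to test without Lambda or DynamoDB in the loop.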

Challenges I ran into

I submitted for Amazon certification on Jan 14th. They initially said they would have results by the 17th, but then sent the message "It is taking more time than anticipated to review your skill" and said they expected to be done by the 28th. The review actually finished on the 31st: they found problems with CanFulfillIntent and HelpIntent when used on devices without a screen, and noted that I forgot to include the companion app sample phrases directly in my model (although they would work). This feedback got me thinking about how to handle users without a screen (who wants to draw without a screen?), so I added a function that gives a summary of what is on the screen. I fixed all the issues and resubmitted on Feb 11th. They estimated Feb 25th, but failed it on the 19th with one model problem from the phrase I had added last time. I fixed that and resubmitted on the 21st; they estimated it would be done by Mar 7th (a few hours before the contest deadline), but I got the fail on Mar 1st.

This time they found a corner case I had missed: if you raise the level of an object enough times that it tries to go past the top layer, it fails. Moving past the bottom and the "move to the top" intent were fine. It was a simple problem to fix, and I resubmitted on Mar 1st. They estimated Mar 14th, but fortunately passed it on Mar 6th. Yay!
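The fix amounts to clamping the new layer index at the top instead of letting it run past the end of the layer list. A sketch of the idea (function and variable names are illustrative, not the skill's actual code):

```python
# Clamp a "raise the layer" command so repeated raises stop at the top
# instead of failing, as described above. Names are illustrative.
def raise_layer(layers, index, times=1):
    """Move the object at `index` up `times` layers, clamping at the top."""
    top = len(layers) - 1
    new_index = min(index + times, top)  # the missing clamp caused the bug
    obj = layers.pop(index)
    layers.insert(new_index, obj)
    return layers

# Raising past the top now just lands on the top layer:
print(raise_layer(["circle", "square", "triangle"], 0, times=5))
# → ['square', 'triangle', 'circle']
```

The same `min`/`max` clamping pattern covers the bottom direction, which already worked.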

Accomplishments that I'm proud of

Fairly smooth interaction and handling of situations like "move the circle up" when there are two circles: if they differ in color, Alexa asks which color, and if they differ in position, she asks, "The circle on the left or the right?"
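The disambiguation rule above can be sketched as a small function: ask by color when colors differ, otherwise ask by position. The function name, data layout, and phrasing are assumptions for illustration:

```python
# Hedged sketch of the "which circle?" disambiguation described above.
def disambiguate(matches):
    """Return a follow-up question, or None if the target is unambiguous."""
    if len(matches) <= 1:
        return None
    colors = {m["color"] for m in matches}
    if len(colors) > 1:
        options = " or ".join(sorted(colors))
        return f"Which one: {options}?"
    # Same color: fall back to position (actual left/right logic elided).
    return f"The {matches[0]['kind']} on the left or the right?"

circles = [{"kind": "circle", "color": "red"},
           {"kind": "circle", "color": "blue"}]
print(disambiguate(circles))  # → "Which one: blue or red?"
```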

I didn't want the "help" function to get in the way of verbal experimentation, so Alexa only makes suggestions until you have entered a certain number of successful commands. Help lists all the commands using an APL ScrollView with speech synchronization; that way, the automatic halt on speech that happens when you touch a ScrollView becomes a feature.

What I learned

It isn't easy to cover all the ways people intuitively refer to what's on the screen. There are still more changes to make. Also, it is hard to write undo functions!

I also learned about the challenge of giving auditory instructions for a long manual of possible commands. There are over 15 commands, which you can list by saying "help." They are easy to see on the screen, but to help users (especially kids) absorb them in small chunks, the skill gives a couple of command suggestions at the end of each action when you start out. After a while, to avoid becoming boring, it stops that and just asks, "What next?" If you ever say something Alexa doesn't understand, you get a couple of suggestions again. I don't know how effective this is, and I would love feedback on it.
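The onboarding behavior above can be sketched as a small prompt function: suggest commands after each action until the user has enough successful commands, then just ask "What next?", and resume suggesting after a misunderstood utterance. The threshold, command list, and phrasing are assumptions:

```python
import random

SUGGESTION_THRESHOLD = 8  # hypothetical cutoff for "experienced enough"
COMMANDS = ["rotate it", "make it bigger", "change its color",
            "move it left", "hide the names"]

def closing_prompt(successful_commands, last_was_misunderstood=False):
    """Pick the prompt spoken at the end of each action."""
    if successful_commands < SUGGESTION_THRESHOLD or last_was_misunderstood:
        picks = random.sample(COMMANDS, 2)  # two suggestions, small chunks
        return f"You could try: {picks[0]}, or {picks[1]}. What next?"
    return "What next?"
```

Resetting to suggestions on a misunderstood utterance means help arrives exactly when the user seems stuck, without nagging experienced users.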

What's next for Doodlegram

See if people are interested. I want to add the ability to group objects, but that may require changing the internal representation. A "group" function could also be used to email pictures to me (as a kids' skill, it can't email pictures to users), and I could create a gallery where people could add their groups.
