Inspiration

This skill solves a real-world problem. Germany has more than a dozen public TV stations that offer part of their content for free via apps or websites. The problem: if you are looking for a documentary about "Climate Change", "Covid" or "Neil Armstrong", you have to search each of the different media libraries by hand, which is quite a lot of work.

Currently there is no unified interface for all the media libraries. Some unofficial, community-based projects such as "mediathekview" or "mediathekviewweb" try to solve this with desktop applications or web-based solutions. My work makes use of their results and interfaces.

What the skill does

The Alexa skill "Meine Mediathek" (German for "my media library") is the first voice-based solution to this problem. With a command like "Search for movies related to Covid", the skill searches approximately 300,000 movie descriptions within seconds and returns the findings to the user. Selected results can then be viewed on the same device. A task that used to take minutes is now completed within seconds, initiated by a single voice command.

How I built it

The skill uses APL, including paging and vector graphics for smooth scrolling effects, the new motion features of the Echo Show 10, S3 for persistence, VoiceView for better accessibility, and Node.js in the backend.

Surprised that nothing like this existed yet, I started with a small proof of concept. Seeing it work and having fun with it, I decided to turn it into "something serious" right when the hackathon started.

Features:

  • Main feature: search for films via voice. Search terms are matched against the public film descriptions, not only the titles.

  • A "select an episode" feature lets the user pick a specific episode by number when a series returns too many results.

  • Page through result lists. I decided to limit the number of results to 18, as the APL document otherwise tends to get more and more 'laggy'.

  • While the video plays on screen, the user can ask for a description of the movie's content.

  • Via voice, the user can skip seconds or minutes forwards or backwards within the movie, e.g. to review an interesting scene.

  • The user can pause and continue playback.

  • The skill remembers your latest search expression the next time you start it.

  • Using the motion features of the new Echo Show 10, the screen even follows the user as they move through the room.

  • The skill adds a lot of value to Fire TV devices. It also supports the Fire TV remote control where possible.
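
The paging behaviour from the feature list can be sketched as follows. This is a minimal illustration rather than the skill's actual code: the function name `paginate`, the page size and the shape of the returned object are assumptions, and only the cap of 18 results comes from the text above.

```javascript
// Cap taken from the feature list: never show more than 18 results.
const MAX_RESULTS = 18;

// Hypothetical helper: returns one page of an already sorted result list.
// pageIndex is zero-based and is clamped to the valid range.
function paginate(results, pageSize, pageIndex) {
  const capped = results.slice(0, MAX_RESULTS);
  const pageCount = Math.max(1, Math.ceil(capped.length / pageSize));
  const index = Math.min(Math.max(pageIndex, 0), pageCount - 1);
  const start = index * pageSize;
  return {
    items: capped.slice(start, start + pageSize),
    pageIndex: index,
    pageCount,
    hasPrevious: index > 0,
    hasNext: index < pageCount - 1,
  };
}
```

Clamping the page index means a request such as "next page" on the last page simply returns the last page again, which keeps the voice dialogue simple.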

Challenges I ran into

Making the skill work on various devices turned out to be a difficult task. With APL I tried to go to the limits of what is possible, and it turns out that these limits differ from device to device. Displaying multiple preview videos on screen while vector graphics are scrolling at the same time overwhelms some devices, so I had to work with fallbacks for devices with limited hardware capabilities. It was important to me, though, to make this skill work on Echo devices with big and small screens, on the Echo Spot (which has a round display) and on the different Fire TV versions (which do not support touch at all).
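
One way to express such fallbacks is a small function that maps the device's viewport to a layout profile. The sketch below is an assumption throughout: the field names (`shape`, `pixelWidth`, `pixelHeight`, `touch`) are modelled on the viewport information Alexa sends with a request, while the profile names, the 1280 pixel threshold and the number of preview videos are hypothetical and not taken from the skill.

```javascript
// Sketch only: thresholds and profile names are hypothetical.
function chooseLayout(viewport) {
  // No viewport (or no pixel size) is treated as a screenless device such as an Echo Dot.
  if (!viewport || !viewport.pixelWidth || !viewport.pixelHeight) {
    return { layout: "voice-only", previewVideos: 0, vectorScrolling: false, touch: false };
  }
  // Fire TV devices report no touch support, so this flag drives remote-control navigation.
  const touch = Array.isArray(viewport.touch) && viewport.touch.length > 0;
  // Round displays (Echo Spot) get a reduced layout without preview videos.
  if (viewport.shape === "ROUND") {
    return { layout: "round", previewVideos: 0, vectorScrolling: false, touch };
  }
  // Assumption: wider screens are treated as capable of the full effects.
  const large = viewport.pixelWidth >= 1280;
  return {
    layout: large ? "large" : "small",
    previewVideos: large ? 3 : 1,
    vectorScrolling: large,
    touch,
  };
}
```

Screen width is only a rough proxy for hardware capability; a real implementation would probably need per-device testing to decide which effects to switch off.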

Accessibility

The whole skill also works on screenless devices such as the Echo Dot, with additional guidance to allow easy access for blind and visually impaired people. On devices with a screen, the skill supports the VoiceView feature for better accessibility.

German public TV stations offer a lot of content enriched with voice-over scene descriptions ("Audiodeskription"), which blind people can now access in a simple and barrier-free way: the command "Search for comedy audiodeskription" guides them directly to search results, which they can then play on their device without pressing a single button or key.

Accomplishments that I'm proud of

I have been writing code for more than thirty years, but this is the first time I have received dozens of emails saying "Thank you" for something I did. The response came from blind and visually impaired people after I launched the first version of the skill. They told me that they now have barrier-free access to material that was more or less hidden from them before I made the skill available. Their feedback helped me to continuously improve the language model and to make the skill genuinely useful.

What I learned from building the skill

Three things:

  • Getting in contact with visually impaired people gave me some insight into how naturally they take a voice-first approach when discussing user interfaces.

  • It turned out that VoiceView support is quite easy to implement.

  • I now strongly believe that accessibility is a topic that Alexa skill developers have not yet addressed the way they should. I am convinced that Alexa could make life easier for a lot of people in ways that haven't been tried and tested yet.

What's next for Meine Mediathek ("My Media Library")

The content is made available for free by German TV stations, and because of geo-blocking the current version of the skill only makes sense in Germany.

The code, however, decouples the voice-related parts from a separate backend adapter that searches the media archives. The adapter's interface is designed so that it can be swapped for another implementation and tested without any dependency on the voice-related environment. It should therefore be straightforward to implement a skill like "NASA TV" that offers their public material to a broader audience, which is one of the next ideas I have in mind.
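
The decoupling described above can be sketched like this. It is an illustration under stated assumptions, not the skill's actual interface: the names `matchesTerm`, `InMemoryMediaAdapter`, `search` and `buildSearchResponse` are hypothetical, and a real adapter would query a media archive instead of an in-memory array.

```javascript
// Hypothetical matching rule: a film matches if the term occurs in its title or description.
function matchesTerm(film, term) {
  const needle = term.trim().toLowerCase();
  if (!needle) return false;
  return `${film.title} ${film.description}`.toLowerCase().includes(needle);
}

// Hypothetical adapter: anything with an async search(term, limit) method can
// take its place, e.g. an implementation backed by a different media library.
class InMemoryMediaAdapter {
  constructor(films) {
    this.films = films;
  }

  async search(term, limit = 18) {
    return this.films.filter((film) => matchesTerm(film, term)).slice(0, limit);
  }
}

// Voice-side logic that only depends on the adapter's search method,
// so it can be tested without any Alexa environment.
async function buildSearchResponse(adapter, term) {
  const results = await adapter.search(term);
  if (results.length === 0) {
    return `I found nothing for ${term}.`;
  }
  const noun = results.length === 1 ? "result" : "results";
  return `I found ${results.length} ${noun} for ${term}. The first one is ${results[0].title}.`;
}
```

Because the voice side only calls `search`, a test can pass in a stub adapter with a handful of films and check the spoken response without touching the network.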

Monetization: I do not plan to monetize the skill, as I do not own the material it presents. By switching the adapter to another media library, however, it would be possible to build a skill in which non-public content is only available behind a paywall.

Homepage

Since I consider the skill a serious project, I published a homepage dedicated to what it can do. Have a look at https://www.meinemediathek.de (German only).

Finally

This skill shows what the future of television might look like, and it has already proven to make life easier for some people.

And for the record: This project has not been funded in any way. I did it all in my free time.

AmazonAlexaBeyondVoiceChallenge

Built With

  • accessibility
  • apl
  • motion
  • node.js
  • ssml
  • tv
  • video
  • voiceview