The skills that university teams build for the Alexa Prize challenge rely on knowledge-graph databases to sustain a conversation across different topic domains. I wanted to explore using a knowledge graph with Alexa Conversations, and to see whether general data relationships could support conversations outside any single subject domain.
What it does
In Wiki Walk, the user asks for information about something: anything in the Wikidata database. Wiki Walk offers a description that matches the query and asks the user to confirm that it is the topic of interest. If the user wants more options, Wiki Walk asks for something to narrow down the search. Once a topic and its description have been accepted, Wiki Walk traverses the Wikidata database to connect the original topic with its hierarchy of inclusion: e.g. "Earth, third planet from the Sun in the Solar System, is a instance of inner planet, a subclass of planet of the Solar System, a subclass of planet, a subclass of planemo, a subclass of substellar object."
How I built it
Wiki Walk works entirely with its Alexa Conversations model, using four dialogs and three APIs, without depending on an intent-based interaction model. The API handlers live in an Alexa-hosted Lambda function. When the user asks about a topic, the GetAFact API handler searches the Wikidata API for that topic, which returns up to 50 candidate Wikidata entities, and offers the user the description of the first entity. If the user rejects that description and offers a desired description instead, the MatchDescription API handler scores all the remaining descriptions for a best match, and keeps offering candidates until the user accepts one. Once a description has been accepted, the TakeAWalk API handler makes successive calls to the Wikidata API, searching for properties that indicate a hierarchy of inclusion and fetching the next parent entity in the hierarchy (up to six levels). Alexa reads the results using APL-A and displays them, with an image of the original entity, using APL.
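The three handlers described above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual code: it assumes the search uses the Wikidata `wbsearchentities` action, that the "simplistic scoring" is plain word overlap, and that the hierarchy properties are P31 ("instance of") and P279 ("subclass of"). The function names, the `get_parents` lookup, and `TOY_GRAPH` are all hypothetical, and the toy graph uses labels in place of real Wikidata entity IDs so the sketch runs offline.

```python
from typing import Callable, Dict, List, Optional, Tuple
from urllib.parse import urlencode

WIKIDATA_API = "https://www.wikidata.org/w/api.php"

# Wikidata property IDs assumed to express the hierarchy of inclusion.
HIERARCHY_PROPERTIES = {"P31": "instance of", "P279": "subclass of"}


def build_search_url(topic: str, limit: int = 50) -> str:
    """GetAFact step (assumed): URL for an entity search on the topic."""
    params = {
        "action": "wbsearchentities",
        "search": topic,
        "language": "en",
        "limit": limit,
        "format": "json",
    }
    return WIKIDATA_API + "?" + urlencode(params)


def score_description(wanted: str, candidate: str) -> float:
    """Fraction of the user's words that also appear in the candidate."""
    wanted_words = set(wanted.lower().split())
    candidate_words = set(candidate.lower().split())
    if not wanted_words:
        return 0.0
    return len(wanted_words & candidate_words) / len(wanted_words)


def best_match(wanted: str, descriptions: List[str]) -> Optional[str]:
    """MatchDescription step (assumed): pick the highest-scoring description."""
    if not descriptions:
        return None
    return max(descriptions, key=lambda d: score_description(wanted, d))


def walk_hierarchy(
    entity: str,
    get_parents: Callable[[str], Dict[str, str]],
    max_levels: int = 6,
) -> List[Tuple[str, str]]:
    """TakeAWalk step (assumed): follow P31/P279 links up to max_levels.

    get_parents maps an entity to {property ID: parent entity}; in the real
    skill this would be a call to the Wikidata API.
    """
    steps: List[Tuple[str, str]] = []
    seen = {entity}
    current = entity
    for _ in range(max_levels):
        parents = get_parents(current)
        # Prefer "instance of", then fall back to "subclass of".
        prop = next((p for p in HIERARCHY_PROPERTIES if p in parents), None)
        if prop is None or parents[prop] in seen:
            break  # top of the hierarchy, or a cycle
        current = parents[prop]
        seen.add(current)
        steps.append((HIERARCHY_PROPERTIES[prop], current))
    return steps


# Hypothetical stand-in for Wikidata, keyed by label rather than entity ID.
TOY_GRAPH: Dict[str, Dict[str, str]] = {
    "Earth": {"P31": "inner planet"},
    "inner planet": {"P279": "planet of the Solar System"},
    "planet of the Solar System": {"P279": "planet"},
    "planet": {"P279": "planemo"},
    "planemo": {"P279": "substellar object"},
}


def toy_parents(entity: str) -> Dict[str, str]:
    """Look up parent links in the toy graph instead of calling the API."""
    return TOY_GRAPH.get(entity, {})
```

Calling `walk_hierarchy("Earth", toy_parents)` walks the same chain as the Earth example, stopping when no further parent is found. Passing the lookup in as a function keeps the traversal logic testable without network access, and the six-level cap bounds the number of API calls per turn.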
Challenges I ran into
My first challenge was to choose a concept for my skill and scope it to the time available in the hackathon. There was a new-tech learning curve built in, and both the platform and its documentation were in beta, so these needed to be scoped realistically as well. The primary technical challenges were deciphering the error messages relative to the state of the model and the build, the inability to save a snapshot of a working model before changing or extending it, and the longer Alexa Conversations build times, which slowed my development cadence. Because I was learning while the platform itself was being shaken out, I found it hard to distinguish between conceptual errors, programming errors, and platform or documentation errors.
Accomplishments that I'm proud of
In addition to completing the skill within its defined scope, I contributed to the community during the development period, helping other developers in Twitch streams and on Slack, and providing feedback in terms of experience, questions, and documentation suggestions to the Alexa Conversations team. I was not alone in this, however; we were all lifted by the engaged community of developers and the commitment of time and resources by the Amazon team.
What I learned
- I learned that there are general semantic relationships embedded in a knowledge graph that can support conversations around a variety of topics, independent of domain knowledge within the skill.
- I learned to scope an exploration to fit within the time constraints of a learning curve and a hackathon deadline.
- I learned to better differentiate between gaps in my conceptual understanding, coding errors, and platform and documentation maturity issues.
What's next for Wiki Walk
- Find other domain-independent property relationship patterns to offer in a conversational context without domain-specific hacks and heuristics
- Use Alexa Entities, when integration with Alexa Conversations is supported
- Use real sentence embeddings instead of simplistic scoring of potential descriptions in the MatchDescription API
- Maintain context across sessions (get to know the user’s interests for likely disambiguation)
- Integrate more imagery (e.g. sync the spoken output with an image slide show rather than a single image)
- Understand questions relative to entity type (e.g. “Who” rather than “What” would give more weight to person entities)
- Create more dialogs for "unhappy paths"