Recording user's input about desired destination.
System recognises user's intent to go somewhere.
Almost everyone has, at least once in their life, felt the discomfort of getting lost in a big shopping mall or being unable to find a meeting room when invited to an interview. We decided to design a solution that alleviates this by providing information, and even organizing indoor transport facilities, to guide you to your destination through the most natural interface of all: your voice.
What it does
Our broader idea is to bring people to the place they want to reach just in time, while providing the necessary levels of security control and a convenient interface. For demo purposes we limited our scope to one task: understanding the user's intent from their natural speech in the elevator and taking them to the right floor. Our solution understands both simple commands, like "bring me to the basement", and less direct requests, like "I am hungry". You can try it out here.
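To illustrate the idea of mapping an utterance to a floor, here is a minimal sketch. It is not our Watson-based logic: the keyword rules stand in for a trained intent classifier, and the intent names, keywords, and floor numbers are all hypothetical.

```python
import re
from typing import Optional

# Hypothetical floor plan: intent name -> floor number (-1 = basement).
FLOOR_BY_INTENT = {
    "go_basement": -1,
    "find_food": 3,  # assumes the food court is on floor 3
}

# Hypothetical keyword rules standing in for a trained intent classifier.
KEYWORDS_BY_INTENT = {
    "go_basement": {"basement", "parking"},
    "find_food": {"hungry", "eat", "food", "restaurant"},
}


def detect_intent(utterance: str) -> Optional[str]:
    """Return the first intent whose keywords appear in the utterance."""
    words = set(re.findall(r"[a-z']+", utterance.lower()))
    for intent, keywords in KEYWORDS_BY_INTENT.items():
        if words & keywords:
            return intent
    return None


def floor_for(utterance: str) -> Optional[int]:
    """Map an utterance to a floor, or None if no intent is recognised."""
    intent = detect_intent(utterance)
    if intent is None:
        return None
    return FLOOR_BY_INTENT.get(intent)
```

With these assumed rules, both a direct command ("bring me to the basement") and an indirect one ("I am hungry") resolve to a floor, while an unrelated sentence resolves to nothing and would need a follow-up question.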
How I built it
We used IBM Watson services to quickly prototype the backend logic: converting the user's speech to machine-readable text, analysing that text to extract the user's intent, and orchestrating the whole flow. We also built a React.js-based frontend that shows how our logic works and lets our AI communicate with the user.
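The three backend stages described above (speech to text, intent extraction, flow control) can be sketched as a small pipeline. This is only an illustration under stated assumptions: the stages are passed in as plain functions so that no real Watson call is shown, and `ElevatorPipeline`, `fake_transcribe`, `fake_classify`, and the floor table are hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class ElevatorPipeline:
    """Chains the stages: audio -> text -> intent -> spoken reply."""
    transcribe: Callable[[bytes], str]          # stand-in for a speech-to-text service
    classify: Callable[[str], Optional[str]]    # stand-in for an intent-extraction service
    floors: Dict[str, int]                      # hypothetical intent -> floor table

    def handle(self, audio: bytes) -> str:
        text = self.transcribe(audio)
        intent = self.classify(text)
        if intent is None or intent not in self.floors:
            # Flow control: fall back to a clarifying question.
            return "Sorry, I did not understand. Which floor would you like?"
        return f"Taking you to floor {self.floors[intent]}."


# Fake stages so the sketch runs without any external service.
def fake_transcribe(audio: bytes) -> str:
    # Pretends the audio bytes already contain the spoken words.
    return audio.decode("utf-8")


def fake_classify(text: str) -> Optional[str]:
    return "find_food" if "hungry" in text.lower() else None


pipeline = ElevatorPipeline(fake_transcribe, fake_classify, {"find_food": 3})
print(pipeline.handle(b"I am hungry"))
```

Keeping each stage behind a plain function signature means a real speech or intent service could be swapped in later without touching the flow logic, and the frontend only needs the returned reply string.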
Challenges I ran into
It was very hard to narrow our case down to a small, implementable prototype that shows the idea in a nutshell. We also didn't have time to connect it to all the APIs we had planned.
What I learned
We drew on our previous experience with IBM Watson, so our main lesson was about staying productive under time pressure. We also discovered that simpler Conversation models are sometimes more effective when the task is straightforward.
What’s next for I.V.I
I.V.I is just a prototype: it demonstrates a few ideas but is by no means a final destination for us. In the future, we would like to implement additional features, including:
- location-based trigger for recording voice requests (e.g. when one approaches the elevators);
- user identification/authentication via voice recognition, providing an additional level of security;
- curated advertisements and information displays for users inside elevator cars (e.g. supermarket deals, upcoming movies, meeting agendas);
- a scaled-up solution that spans whole buildings and building complexes while maintaining the capability to process context-specific intents;
- customized elevator music based on users’ common preferences.