YouTalk: augmented messaging alerts for video


Background

Often when we are talking in video conferences, we reference outside sources that we'd like the other party to know about. The problem is that we frequently have to pause the discussion to search for the data in question.

With YouTalk, users can keep talking while alerts pop up to give their audience context awareness, especially in online webinars.

How it works

Consider four tiny phrases:
  • "Let's go out for lunch this Sunday!"
  • "I'm not familiar with the 2019 lizard mating season, could you explain that?"
  • "Any update on the weather at the New York office?"
  • "The new guy in the office's name is Sid Sharma, you should look him up."

YouTalk takes what a caller is saying in a video conference and converts it to text using Azure Speech-to-Text.

https://azure.microsoft.com/en-us/services/cognitive-services/speech-to-text/

The NLP component then extracts intent from the transcribed text to form an action.

https://azure.microsoft.com/en-us/services/cognitive-services/directory/know/

Once an action is formed (e.g. schedule lunch + data), YouTalk forwards that actionable input to a UI modal popup shown to the receiver, with a call-to-action button that lets them trigger an existing Node.js function. For example:
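The intent-matching step above could be sketched as a simple keyword matcher over the transcript. This is a minimal illustration, not the actual YouTalk code: the `Action` shape, the recipe names, and the trigger lists are all assumptions.

```typescript
// Sketch of intent matching: a recipe fires when all of its trigger
// phrases appear in the transcribed speech. Names are illustrative.
interface Action {
  intent: string;     // e.g. "schedule_lunch"
  keywords: string[]; // the phrases that triggered the match
}

const RECIPES: { intent: string; triggers: string[] }[] = [
  { intent: "schedule_lunch", triggers: ["let's", "lunch"] },
  { intent: "weather_lookup", triggers: ["weather"] },
  { intent: "web_search",     triggers: ["look him up"] },
];

// Returns the first recipe whose triggers all appear in the transcript,
// or null when nothing matches (no popup is shown in that case).
function matchIntent(transcript: string): Action | null {
  const text = transcript.toLowerCase();
  for (const recipe of RECIPES) {
    const hits = recipe.triggers.filter((t) => text.includes(t));
    if (hits.length === recipe.triggers.length) {
      return { intent: recipe.intent, keywords: hits };
    }
  }
  return null;
}
```

A matched `Action` would then be handed to the UI layer to render the modal and, on click, invoke the corresponding serverless function.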

Person says: "Let's go to lunch next week... maybe Thursday?" YouTalk recognizes "let's", "lunch", "next week", and "Thursday", and calls a serverless function that can make a reservation in the receiver's Google Calendar, if they click the call-to-action button.
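For the calendar recipe, the recognized time words ("next week", "Thursday") have to be resolved to a concrete date before anything can be booked. A minimal sketch of that resolution, assuming JavaScript's `Date.getDay()` convention (0 = Sunday ... 6 = Saturday):

```typescript
// Map spoken weekday names to Date.getDay() indices.
const DAY_INDEX: Record<string, number> = {
  sunday: 0, monday: 1, tuesday: 2, wednesday: 3,
  thursday: 4, friday: 5, saturday: 6,
};

// Resolve "next week" + a weekday name to a concrete Date:
// jump to the Sunday that starts next week, then walk forward
// to the requested day. setDate handles month/year rollover.
function nextWeekDay(today: Date, dayName: string): Date {
  const target = DAY_INDEX[dayName.toLowerCase()];
  const result = new Date(today);
  result.setDate(today.getDate() + (7 - today.getDay()) + target);
  return result;
}
```

The resulting `Date` could then be passed to the Google Calendar API call inside the serverless function.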

The demo focuses on three features from the new Agora.io Web SDK: quality transparency (viewing local audio stats with Client.getLocalAudioStats), switching devices (changing the audio output with Stream.setAudioOutput), and audio/video track management (retrieving the audio track with Stream.getAudioTrack).

TL;DR

YouTalk takes audio input from the Agora.io-powered call and presents actionable popup alerts on the call receiver's end.

Think of it almost as IFTTT for video.

Why it was built

It makes video conferences feel more alive and extends the Agora.io SDK for consumers. Plus, combining automation recipes with video conferencing is a pretty cool concept to bring to life.

What's next?

Additional speech-context integrations for YouTalk + Agora.io in the future, with deeper search capabilities!


APIs used:

  1. Microsoft Cognitive Services & Speech SDK for natural language processing
  2. DarkSky & Bing Web Search API for general web search-related recipes