Inspiration
I've been working remotely for the past two years and spend hours each week in virtual meetings, either running them or passively participating. I've noticed that a lot of information gets lost during and after meetings, especially if the meeting isn't recorded. Even when it is, you have to spend time re-watching the recording to find the information you need. It's also hard to do certain things during a meeting, such as taking notes, creating tickets, or messaging others. I thought there had to be a way to solve some of these problems. That's where VAL comes in.
What it does
VAL is an AI assistant that listens for specific voice commands and executes flows based on them. Audio data from a WebRTC media stream is fed into the AssemblyAI real-time transcription API, which transcribes the user's speech to text. The application then searches the transcribed text for specific commands. A phrase like 'create a ticket in Jira title Business Plan' triggers a flow that creates a Jira ticket titled 'Business Plan'. Upon successful creation, the result is sent back and displayed on screen for the user and other participants to see, all driven by the user's voice. The supported commands are:
- Create a ticket in Jira or Asana
- Send an email to a user with a message
- Make a note of 'some text', which adds the note to the user's 'Notes' section in the meeting
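The command matching described above could be sketched with a small phrase parser. This is an assumed design, not VAL's actual code: the pattern list, `parseCommand`, and the extracted argument names are all hypothetical.

```javascript
// Hypothetical key-phrase parser: each supported command is a regex
// over the transcribed text plus a function that pulls out the
// arguments the flow needs.
const COMMAND_PATTERNS = [
  {
    action: "createTicket",
    // e.g. "create a ticket in Jira title Business Plan"
    regex: /create a ticket in (jira|asana) title (.+)/i,
    extract: (m) => ({ tracker: m[1].toLowerCase(), title: m[2].trim() }),
  },
  {
    action: "sendEmail",
    // e.g. "send an email to Sam message running five minutes late"
    regex: /send an email to (\S+) message (.+)/i,
    extract: (m) => ({ recipient: m[1], message: m[2].trim() }),
  },
  {
    action: "makeNote",
    // e.g. "make a note of follow up with finance"
    regex: /make a note of (.+)/i,
    extract: (m) => ({ text: m[1].trim() }),
  },
];

// Returns { action, args } when the transcript contains a command,
// or null for ordinary speech that should just be transcribed.
function parseCommand(transcript) {
  for (const { action, regex, extract } of COMMAND_PATTERNS) {
    const match = transcript.match(regex);
    if (match) return { action, args: extract(match) };
  }
  return null;
}
```

Each final transcript line from the speech-to-text stream would be run through `parseCommand`, and a non-null result dispatched to the matching Jira, Asana, email, or notes flow.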
A powerful feature that requires no user input at all is the automatic transcription of the meeting audio into text. The transcript also highlights notes and captures any actions/commands that were executed, along with their associated metadata. Best of all, the transcript can be downloaded and searched. There's no need to re-watch the hour-long recording of last week's meeting; just search the transcript for a few keywords and find the information right there. You can even find all of the tasks that were created during that meeting, saving you a lot of time.
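A keyword search over such a transcript could look like the sketch below. The entry shape (`text` plus an optional `action` field holding executed-command metadata) and both helper names are assumptions for illustration.

```javascript
// Hypothetical transcript search: each entry holds the transcribed
// text and, when a command fired on that line, the action it ran.
function searchTranscript(entries, keyword) {
  const needle = keyword.toLowerCase();
  return entries.filter(
    (e) =>
      e.text.toLowerCase().includes(needle) ||
      (e.action && e.action.toLowerCase().includes(needle))
  );
}

// Pull out just the tasks created during the meeting.
function tasksCreated(entries) {
  return entries.filter((e) => e.action === "createTicket");
}
```

Because executed commands are stored as structured metadata rather than plain text, a query like "all tickets created last Tuesday" becomes a simple filter instead of a re-watch of the recording.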
How we built it
I used ASP.NET Core for the backend server app, which interacts with Jira, Asana, AssemblyAI, and Gmail to handle the server-side API calls. SignalR, also part of the ASP.NET Core app, handles the WebRTC signaling. The front end is built with JavaScript, jQuery, VueJS, and WebRTC. I used WebRTC to simulate a virtual meeting and capture the user's audio stream, which is then fed to AssemblyAI to transcribe the speech to text.
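The audio pipeline from the browser to the transcription service could be wired up roughly as below. This is a sketch only: the WebSocket URL, query parameters, and `audio_data`/`FinalTranscript` message shapes reflect AssemblyAI's real-time API as I understand it and should be checked against the current AssemblyAI docs, and `streamToAssemblyAI` is a hypothetical name.

```javascript
// The real-time endpoint expects 16 kHz, 16-bit PCM, so the browser's
// Float32 samples must be converted before sending.
function floatTo16BitPCM(float32Samples) {
  const pcm = new Int16Array(float32Samples.length);
  for (let i = 0; i < float32Samples.length; i++) {
    const s = Math.max(-1, Math.min(1, float32Samples[i]));
    pcm[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
  }
  return pcm;
}

// Browser-side wiring (runs only in a browser, not under Node):
function streamToAssemblyAI(token) {
  const socket = new WebSocket(
    `wss://api.assemblyai.com/v2/realtime/ws?sample_rate=16000&token=${token}`
  );
  navigator.mediaDevices.getUserMedia({ audio: true }).then((stream) => {
    const ctx = new AudioContext({ sampleRate: 16000 });
    const source = ctx.createMediaStreamSource(stream);
    const processor = ctx.createScriptProcessor(4096, 1, 1);
    processor.onaudioprocess = (e) => {
      const pcm = floatTo16BitPCM(e.inputBuffer.getChannelData(0));
      const b64 = btoa(String.fromCharCode(...new Uint8Array(pcm.buffer)));
      if (socket.readyState === WebSocket.OPEN) {
        socket.send(JSON.stringify({ audio_data: b64 }));
      }
    };
    source.connect(processor);
    processor.connect(ctx.destination);
  });
  socket.onmessage = (msg) => {
    const data = JSON.parse(msg.data);
    if (data.message_type === "FinalTranscript" && data.text) {
      // Hand the finished utterance to the command parser.
      console.log(data.text);
    }
  };
}
```

In VAL the audio source is the WebRTC meeting stream rather than a fresh `getUserMedia` capture, but the conversion and streaming steps are the same idea.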
Challenges we ran into
The main challenge I ran into was coming up with the right key phrases to reliably trigger each flow.
Accomplishments that we're proud of
I'm proud of being able to learn the AssemblyAI API within a few hours. The real-time transcription API is very straightforward to use, and I did not run into any blockers. The AssemblyAI samples were also very helpful in getting me started.
What we learned
I learned that transcription and natural language processing are not 100% accurate. There were instances where some words were interpreted as something completely different.
What's next for VAL - The Virtual Assistant Liaison
I'm going to continue working on some additional use cases and continue to explore the AssemblyAI APIs while building upon this project.
Built With
- asp.net
- javascript
- signalr
- vuejs
