Who Said What?

Screenshots: our transcription dialog box and speaker display; the Slack bot saving transcripts with timestamps.

Inspiration

At work, conference calls usually involve multiple people on one side sharing the same microphone. It can be hard to know who's speaking and what their role is. Some details of the meeting also get lost, and it's tedious to note everything down.

What it does

Our app recognizes and distinguishes speakers, shows who's speaking, and automatically transcribes the meeting in real time. When the meeting ends, the app can also export the meeting minutes (a log of who said what at what time).

Features:

display who's currently speaking using speaker recognition
transcribe what's being said and by whom, like a chat application
create and train a new speaker profile within 15 seconds
stream transcription to services such as Slack
export transcription to cloud storage such as Google Sheets
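
The "who said what at what time" log behind the chat view and the export could be modelled roughly as below. This is a minimal sketch, not the app's actual code: `Utterance`, `format_chat_line`, and `to_sheet_rows` are hypothetical names, and the (time, speaker, text) column order is an assumed layout for a spreadsheet export.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Utterance:
    """One transcribed segment attributed to a recognized speaker (hypothetical model)."""
    spoken_at: datetime
    speaker: str
    text: str


def format_chat_line(u: Utterance) -> str:
    """Render an utterance the way a chat app would: [HH:MM:SS] Speaker: text."""
    return f"[{u.spoken_at:%H:%M:%S}] {u.speaker}: {u.text}"


def to_sheet_rows(log: list[Utterance]) -> list[list[str]]:
    """Build meeting minutes as rows, sorted by time, with a header row first."""
    rows = [["Time", "Speaker", "Text"]]
    for u in sorted(log, key=lambda item: item.spoken_at):
        rows.append([u.spoken_at.isoformat(timespec="seconds"), u.speaker, u.text])
    return rows


if __name__ == "__main__":
    # Placeholder speakers and text, listed out of order on purpose.
    log = [
        Utterance(datetime(2019, 1, 27, 10, 0, 5), "Speaker B", "Sounds good."),
        Utterance(datetime(2019, 1, 27, 10, 0, 1), "Speaker A", "Let's get started."),
    ]
    for u in sorted(log, key=lambda item: item.spoken_at):
        print(format_chat_line(u))
```

The same rows could feed either a chat-style stream or a sheet export, since both only need the timestamp, speaker, and text.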

How I built it

Microsoft Speaker Recognition API
Microsoft Speech to Text API
Google Cloud Speech to Text API
Google Sheets API
Slack API
StdLib for integrating backend services such as Slack and SMS
Node.js with Express for the backend
Vue for the frontend
Python scripts for accessing Microsoft's APIs
Love ❤️

Challenges I ran into

Generating an audio file in the format Microsoft's API requires was tougher than expected; it seems audio captured from the Mac's microphone isn't in the format Microsoft expects.
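
One way to catch a mismatch like this early is to inspect a recording before uploading it. The sketch below uses Python's standard `wave` module to compare a WAV file against a target format. The default target (16 kHz, 16-bit, mono PCM) is an assumption about what the service expected, so check the API documentation for the real requirement; `wav_format_problems` is a hypothetical helper name.

```python
import io
import wave


def wav_format_problems(source, rate=16000, sample_width=2, channels=1):
    """Return a list of mismatches between a WAV file and the target format.

    `source` is a path or a binary file object. The defaults (16 kHz,
    16-bit, mono) are an assumed target, not a confirmed API requirement.
    """
    problems = []
    with wave.open(source, "rb") as wav:
        if wav.getframerate() != rate:
            problems.append(f"sample rate is {wav.getframerate()} Hz, expected {rate}")
        if wav.getsampwidth() != sample_width:
            problems.append(f"sample width is {wav.getsampwidth()} bytes, expected {sample_width}")
        if wav.getnchannels() != channels:
            problems.append(f"{wav.getnchannels()} channels, expected {channels}")
    return problems


if __name__ == "__main__":
    # Build a short silent 44.1 kHz stereo clip in memory to stand in for a laptop recording.
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as out:
        out.setnchannels(2)
        out.setsampwidth(2)
        out.setframerate(44100)
        out.writeframes(b"\x00" * 4 * 100)
    buffer.seek(0)
    for problem in wav_format_problems(buffer):
        print(problem)
```

The check only reports mismatches; converting the audio (resampling and downmixing) would still need a separate tool or library.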

Accomplishments that I'm proud of

Learning how to use the APIs and Microsoft Azure, and converting audio input into the format the API needs.
Finishing an app before the deadline.

What I learned

How to use many APIs, record speech, and integrate multiple services.

What's next for Who Said What?

A year-long worldwide tour to show it off.

Built With

Submitted to

nwHacks 2019
Winner: The Wolfram Award

Created by

I worked on the backend and the integration with Microsoft's Cognitive Services API through Azure

Jimmy Huang
I created the design for the website and the JavaScript to display the speaker and transcription

Sorina C
Nicholas Wu
Coding is a treat.

Updates

Jimmy Huang started this project — Jan 27, 2019 02:49 PM EST
