Inspiration

We video call each other... a lot. However, there have been times when we were in a noisy environment and couldn't hear the audio well. Our original idea was therefore to transcribe a video call live and generate captions for the caller to read.

What it does

As a first step toward this goal, we decided to start by recording a call and creating a transcription of that recording. On user input, our Chrome extension generates a file of the audio in the current tab, then sends it to the Rev API to get a transcription of the file.
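The recording step can be sketched roughly as follows. This is a minimal illustration rather than our actual code: `RecordingBuffer` is a hypothetical helper, and it assumes the tab audio is recorded with a `MediaRecorder`, which hands over the audio in pieces that have to be joined into one file before upload. The `audio/webm` type is also an assumption.

```typescript
// Hypothetical helper: collects recorded audio chunks and joins them into one file.
// In an extension, addChunk would be called from MediaRecorder's "dataavailable" event.
class RecordingBuffer {
  private chunks: Blob[] = [];
  private mimeType: string;

  // "audio/webm" is an assumed default; use whatever type the recorder reports.
  constructor(mimeType: string = "audio/webm") {
    this.mimeType = mimeType;
  }

  // Store one chunk; empty chunks carry no audio and are skipped.
  addChunk(chunk: Blob): void {
    if (chunk.size > 0) {
      this.chunks.push(chunk);
    }
  }

  // Join every chunk, in arrival order, into a single Blob and reset the buffer.
  finish(): Blob {
    const recording = new Blob(this.chunks, { type: this.mimeType });
    this.chunks = [];
    return recording;
  }
}
```

In the extension itself, the stream passed to the recorder would likely come from `chrome.tabCapture.capture({ audio: true, video: false }, callback)`, with `finish()` called once the user stops the recording.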

Challenges we ran into

The greatest challenge we faced was passing the audio file through our extension. What we record from the tab is a Blob object that has to be sent to our server. Our original structure for the extension didn't work: we couldn't send a message from the background to the content script without the object getting deleted. We therefore restructured the extension to send the Blob directly from the background. Once the file reached the server, we had trouble sending it in a format that the API accepts.
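A likely reason the object got lost is that messages between parts of an extension are serialized as JSON, and a Blob does not survive that. Sending from the background avoids the message hop entirely. Here is a minimal sketch of that approach, under stated assumptions: the function names, the `audio` field name, the `recording.webm` file name, and the `serverUrl` parameter are all placeholders, not what our server actually expects.

```typescript
// Build the multipart body in the background script, where the Blob is still intact.
// The field name "audio" and the default file name are assumptions for illustration.
function buildUploadForm(recording: Blob, fileName: string = "recording.webm"): FormData {
  const form = new FormData();
  form.append("audio", recording, fileName);
  return form;
}

// POST the recording straight to the server; serverUrl is a placeholder.
// No Content-Type header is set by hand: fetch derives the multipart
// boundary from the FormData body.
async function uploadRecording(recording: Blob, serverUrl: string): Promise<string> {
  const response = await fetch(serverUrl, {
    method: "POST",
    body: buildUploadForm(recording),
  });
  if (!response.ok) {
    throw new Error(`Upload failed with status ${response.status}`);
  }
  return response.text();
}
```

Keeping the form construction separate from the network call makes it easy to check what is being sent before involving the server or the transcription API.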

What's next for chrome-transcribe

Once we're able to send a local audio file to the API and get a transcription back, the next step is to find a way to interleave the audio from the microphone with the audio from the video call. That way, we would have a transcript of both sides of the conversation. From there, we would need to stream the audio in smaller chunks so that the transcription comes back in separate, readable parts, closer to how we expect captions to work.
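One detail the chunked approach would have to handle is ordering: if several chunks are being transcribed at once, the results may come back out of order. The sketch below shows one possible way to deal with that. `CaptionAssembler` is a hypothetical helper, and it assumes each chunk is numbered from 0 when it is recorded.

```typescript
// Hypothetical helper: releases chunk transcripts in recording order,
// even when the transcription responses arrive out of order.
class CaptionAssembler {
  private pending = new Map<number, string>();
  private nextIndex = 0;

  // Store the transcript for chunk `index`, then return every caption
  // that can now be shown without skipping an earlier chunk.
  add(index: number, text: string): string[] {
    this.pending.set(index, text);
    const ready: string[] = [];
    while (this.pending.has(this.nextIndex)) {
      const caption = this.pending.get(this.nextIndex) as string;
      this.pending.delete(this.nextIndex);
      ready.push(caption);
      this.nextIndex += 1;
    }
    return ready;
  }
}
```

For example, if the transcript for chunk 1 arrives before the one for chunk 0, `add(1, ...)` returns nothing, and the later `add(0, ...)` returns both captions in order.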
