Speech Toolkit Visualization Module

Note: this is a remote submission, so I will not be able to present it in person. I have written a detailed description instead.

Idea and Impact

Among speech recognition researchers, it is well known that there is no "universal speech recognition system"; a recognizer must be developed for every user and acoustic condition. This is difficult because every user and acoustic condition presents unique challenges, and researchers must be able to identify why their recognizer is failing.

In this project, we propose a web tool that makes it easy to visualize the errors occurring in speech recognition, which in turn supports iterative development of any recognizer.

Implementation

I used the following tools:

RubyMine
Yeoman
Chrome Web Inspector

I used the following libraries:

dc.js
crossfilter.js
d3.js
jQuery
Bootstrap

How to use

Go to the submission link

You can filter speech instances by demographics, speech properties, and recognition results, using drag-select and click-select on the graphs.

The speech instances that match your filters are listed in the table below the graphs. This way, a speech expert can identify where their recognizer is making mistakes and what steps should be taken to improve it.
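
To make the filtering idea concrete, here is a minimal sketch in plain JavaScript. It does not use crossfilter.js or dc.js, and it is not the module's actual code; it only illustrates how several active filters could combine to decide which speech instances reach the table. The records, the field names (`gender`, `age`, `wordErrorRate`), and the helper functions are all hypothetical.

```javascript
// Hypothetical speech instances; every field name here is an assumption.
const instances = [
  { id: 1, gender: "F", age: 34, wordErrorRate: 0.12 },
  { id: 2, gender: "M", age: 61, wordErrorRate: 0.41 },
  { id: 3, gender: "F", age: 67, wordErrorRate: 0.38 },
  { id: 4, gender: "M", age: 25, wordErrorRate: 0.09 },
];

// A range filter stands in for a drag-select on a graph axis: [min, max).
function inRange(field, min, max) {
  return (row) => row[field] >= min && row[field] < max;
}

// A value filter stands in for a click-select on a category.
function oneOf(field, values) {
  return (row) => values.includes(row[field]);
}

// A row appears in the table only if it passes every active filter.
function selectInstances(rows, filters) {
  return rows.filter((row) => filters.every((passes) => passes(row)));
}

// Older speakers with a high error rate: selects instances 2 and 3.
const selected = selectInstances(instances, [
  inRange("age", 60, 100),
  inRange("wordErrorRate", 0.3, 1.0),
]);
console.log(selected.map((row) => row.id));
```

Adding or removing a filter from the list and re-running `selectInstances` corresponds to changing a selection on one of the graphs and seeing the table update.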

Future work

Currently, this module reads its data from a JSON file. It can easily be modified to work with data streamed from a server. Other future work includes making the results shareable and building a GitHub-like community of speech experts around the tool, where they can comment on and verify the results, and where speech-course instructors can guide their students.
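
One possible shape for the move from a JSON file to server data is sketched below in plain JavaScript. It is an assumption about how this could be done, not the module's existing code: the store, its `add` and `size` methods, and the record fields are hypothetical, and the transport (for example polling or a WebSocket) is deliberately left out.

```javascript
// Parse one JSON payload (a whole file or a single server message) into rows.
function parseBatch(jsonText) {
  const parsed = JSON.parse(jsonText);
  return Array.isArray(parsed) ? parsed : [parsed];
}

// Hypothetical store: keeps all rows seen so far and calls onChange
// (for example, a function that redraws the graphs) after each batch.
function createInstanceStore(onChange) {
  const rows = [];
  return {
    add(jsonText) {
      rows.push(...parseBatch(jsonText));
      onChange(rows);
    },
    size() {
      return rows.length;
    },
  };
}

let redraws = 0;
const store = createInstanceStore(() => {
  redraws += 1;
});
store.add('[{"id": 1, "wordErrorRate": 0.12}]'); // contents of the initial JSON file
store.add('{"id": 2, "wordErrorRate": 0.41}'); // a later message from the server
console.log(store.size(), redraws);
```

With this arrangement the initial file and later server messages go through the same `add` path, so the rest of the page would not need to know where a record came from.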

Updates

Malav Bhavsar started this project — Mar 21, 2014 05:42 PM EDT
