Inspiration

While volunteering at long-term care centers, I met many seniors with speech-language disorders. People, even their own families, had trouble understanding what they wanted to say. I hoped to offer a solution using machine learning.

What it does

For this project, I limited the scope to simple commands. The tool recognizes spoken commands (such as yes/no, directions, and numbers) and outputs their orthographic transcription.

How we built it

Dataset: TORGO (from the University of Toronto)
ML model: a convolutional neural network
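To make the idea of a CNN command classifier concrete, here is a minimal NumPy sketch of a forward pass (convolution, ReLU, global average pooling, dense layer, softmax). It is only an illustration under assumptions, not the architecture used in this project: the label set, the input size, the layer sizes, and the random (untrained) weights are all hypothetical.

```python
import numpy as np

# Hypothetical label set; the project covers yes/no, directions, and numbers.
LABELS = ["yes", "no", "up", "down"]

def conv2d_valid(x, kernels):
    """Cross-correlate a 2-D input (H, W) with kernels of shape (K, kh, kw)."""
    k, kh, kw = kernels.shape
    h, w = x.shape
    out = np.empty((k, h - kh + 1, w - kw + 1))
    for i in range(out.shape[1]):
        for j in range(out.shape[2]):
            patch = x[i:i + kh, j:j + kw]
            # Broadcasting applies every kernel to the same patch.
            out[:, i, j] = np.sum(kernels * patch, axis=(1, 2))
    return out

def softmax(z):
    """Numerically stable softmax over a 1-D vector."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

def forward(spec, kernels, dense_w, dense_b):
    """conv -> ReLU -> global average pool -> dense -> softmax."""
    feat = np.maximum(conv2d_valid(spec, kernels), 0.0)
    pooled = feat.mean(axis=(1, 2))  # one value per kernel, shape (K,)
    return softmax(dense_w @ pooled + dense_b)

rng = np.random.default_rng(0)
spec = rng.normal(size=(40, 50))  # stand-in for a 40-band, 50-frame spectrogram
kernels = rng.normal(scale=0.1, size=(8, 3, 3))          # untrained weights
dense_w = rng.normal(scale=0.1, size=(len(LABELS), 8))   # untrained weights
dense_b = np.zeros(len(LABELS))

probs = forward(spec, kernels, dense_w, dense_b)
predicted = LABELS[int(np.argmax(probs))]
```

With random weights the prediction is meaningless; in a real setup the kernels and dense weights would be learned from labeled recordings, most likely with a deep learning framework rather than hand-written loops.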

Challenges we ran into

  1. Not enough speech data for training the model
  2. Finding the best architecture for the CNN
  3. Preprocessing raw speech data in general
  4. Finding pre-existing models for transfer learning
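For the preprocessing challenge, one common approach is to turn each raw waveform into a log-power spectrogram before feeding it to a CNN. The sketch below uses only NumPy and shows one way that could look; it is not the project's actual pipeline. The 16 kHz sample rate, 25 ms frames, and 10 ms hop are assumed values, and `log_spectrogram` is a hypothetical helper name.

```python
import numpy as np

def log_spectrogram(signal, sample_rate=16000, frame_ms=25, hop_ms=10, eps=1e-10):
    """Return a log-power spectrogram of shape (n_frames, frame_len // 2 + 1).

    All default parameter values are assumptions chosen for illustration.
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    hop_len = int(sample_rate * hop_ms / 1000)
    # Zero-pad very short clips so that at least one full frame exists.
    if len(signal) < frame_len:
        signal = np.pad(signal, (0, frame_len - len(signal)))
    n_frames = 1 + (len(signal) - frame_len) // hop_len
    window = np.hanning(frame_len)
    # Slice the waveform into overlapping, windowed frames.
    frames = np.stack([
        signal[i * hop_len:i * hop_len + frame_len] * window
        for i in range(n_frames)
    ])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # eps avoids log(0) on silent frames.
    return np.log(power + eps)

# Usage: one second of a 440 Hz tone as a stand-in for a recording.
t = np.arange(16000) / 16000
tone = np.sin(2 * np.pi * 440 * t)
spec = log_spectrogram(tone)
```

Each row of `spec` is one time frame and each column one frequency bin, so the result can be treated like a single-channel image.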

Accomplishments that we're proud of

Too many! Just having a (kinda) working end model already makes me proud. This is also my first personal computer science project, so I learned a lot.

What we learned

  1. How to execute and manage a machine learning project, from looking for data to fine-tuning the model
  2. How to use a CNN for speech recognition
  3. Data augmentation techniques, and the pros and cons of data augmentation itself
  4. Basic web development with Vue.js front-end and Flask back-end
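As an illustration of the data augmentation point above, here are two simple waveform-level techniques: adding Gaussian noise at a target signal-to-noise ratio and shifting the clip in time. This is a sketch of common techniques, not necessarily the ones used in this project; the function names and parameter values are hypothetical.

```python
import numpy as np

def add_noise(signal, snr_db, rng):
    """Add white Gaussian noise so the result has roughly the given SNR in dB."""
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(scale=np.sqrt(noise_power), size=signal.shape)
    return signal + noise

def time_shift(signal, max_shift, rng):
    """Shift the clip by a random number of samples, padding with zeros."""
    shift = int(rng.integers(-max_shift, max_shift + 1))
    shifted = np.zeros_like(signal)
    if shift > 0:
        shifted[shift:] = signal[:-shift]   # delay: silence at the start
    elif shift < 0:
        shifted[:shift] = signal[-shift:]   # advance: silence at the end
    else:
        shifted[:] = signal
    return shifted

# Usage on a synthetic one-second clip (16 kHz is an assumed sample rate).
rng = np.random.default_rng(0)
clean = np.sin(2 * np.pi * 220 * np.arange(16000) / 16000)
noisy = add_noise(clean, snr_db=20, rng=rng)
shifted = time_shift(clean, max_shift=1600, rng=rng)
```

One trade-off to keep in mind: augmentation adds variety to a small dataset, but synthetic distortions may not resemble the variation found in real atypical speech, so aggressive settings can hurt as easily as help.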

What's next for Atypical Speech Recognition

  1. Find more data
  2. Go beyond simple commands (i.e., real speech recognition)
  3. Learn more about state-of-the-art speech recognition methods, and adapt them to the target population using results from dysarthria research
