Inspiration
While volunteering at long-term care centers, I met many seniors with speech-language disorders. People, even their own families, had trouble understanding what they wanted to say. I hoped to provide a solution using machine learning.
What it does
For the scope of this project, I focused on simple commands only. The tool recognizes spoken commands (like yes/no, directions, numbers) and outputs their orthographic transcription.
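As a rough illustration of the last step (turning a classifier's output into text), here is a minimal sketch. It assumes the model produces one raw score per command; the `COMMANDS` list, the `transcribe` helper, and the 0.5 confidence threshold are hypothetical and not taken from the project.

```python
import math

# Hypothetical command vocabulary; the project's actual label set is not listed.
COMMANDS = ["yes", "no", "left", "right", "up", "down", "one", "two"]


def softmax(scores):
    """Convert raw classifier scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def transcribe(scores, labels=COMMANDS, threshold=0.5):
    """Return (text, confidence); text is None when no label is confident enough."""
    if len(scores) != len(labels):
        raise ValueError("expected one score per label")
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    text = labels[best] if probs[best] >= threshold else None
    return text, probs[best]


# Example: the second score dominates, so the command is read as "no".
text, confidence = transcribe([0.1, 5.0, 0.2, 0.0, 0.1, 0.3, 0.2, 0.1])
```

Returning `None` below the threshold is one possible design choice: for hard-to-understand speech, asking the speaker to repeat may be safer than guessing.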
How we built it
- Dataset: TORGO (from the University of Toronto)
- ML model: a convolutional neural network (CNN)
Challenges we ran into
- Not enough speech data for training the model
- Finding the best architecture for the CNN
- Preprocessing raw speech data in general
- Finding pre-existing models for transfer learning
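To make the preprocessing challenge more concrete, here is a minimal sketch of one common way to prepare raw clips for a fixed-size CNN input: peak normalization, silence trimming, and padding or cropping to a fixed length. This is not necessarily the pipeline used in this project. The function names, the 0.02 silence threshold, and the default length of 16000 samples (one second at an assumed 16 kHz sampling rate) are all assumptions for illustration.

```python
def normalize(samples):
    """Scale samples so the loudest one has magnitude 1.0."""
    peak = max((abs(s) for s in samples), default=0.0)
    return [s / peak for s in samples] if peak > 0 else list(samples)


def trim_silence(samples, threshold=0.02):
    """Drop leading and trailing samples quieter than `threshold`."""
    loud = [i for i, s in enumerate(samples) if abs(s) >= threshold]
    return samples[loud[0]:loud[-1] + 1] if loud else []


def fix_length(samples, length):
    """Crop or zero-pad at the end so every clip has exactly `length` samples."""
    return samples[:length] + [0.0] * max(0, length - len(samples))


def preprocess(samples, length=16000):
    """Normalize, trim, then force a fixed length (assumed: 1 s at 16 kHz)."""
    return fix_length(trim_silence(normalize(samples)), length)


# Example: a tiny clip with silence on both sides, forced to 4 samples.
clip = preprocess([0.0, 0.0, 0.5, -0.25, 0.0], length=4)
```

In practice the fixed-length waveform would usually be converted to a spectrogram before being fed to the CNN; that step is left out here to keep the sketch dependency-free.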
Accomplishments that we're proud of
Too many! Just having a (kinda) working end model already makes me proud. This is also my first personal computer science project, so I really learned a lot.
What we learned
- How to execute and manage a machine learning project, from looking for data to fine-tuning the model
- How to use a CNN for speech recognition
- Data augmentation techniques, and the pros and cons of data augmentation itself
- Basic web development with Vue.js front-end and Flask back-end
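To illustrate the data augmentation point above, here is a minimal sketch of three widely used waveform augmentations: random time shift, random gain, and additive Gaussian noise. These are generic techniques and may differ from the ones actually used in this project. All function names and default values (`max_shift=1600`, gain range 0.8 to 1.2, `noise_level=0.005`) are assumptions.

```python
import random


def time_shift(samples, max_shift, rng):
    """Shift the clip by up to `max_shift` samples either way, zero-filling the gap."""
    shift = rng.randint(-max_shift, max_shift)
    n = len(samples)
    padded = [0.0] * max(shift, 0) + list(samples) + [0.0] * max(-shift, 0)
    start = max(-shift, 0)
    return padded[start:start + n]


def scale_gain(samples, low, high, rng):
    """Multiply the whole clip by one random gain factor in [low, high]."""
    gain = rng.uniform(low, high)
    return [s * gain for s in samples]


def add_noise(samples, noise_level, rng):
    """Add zero-mean Gaussian noise with standard deviation `noise_level`."""
    return [s + rng.gauss(0.0, noise_level) for s in samples]


def augment(samples, rng, max_shift=1600, noise_level=0.005):
    """Produce one randomly perturbed copy of a clip, keeping its length."""
    out = time_shift(samples, max_shift, rng)
    out = scale_gain(out, 0.8, 1.2, rng)
    return add_noise(out, noise_level, rng)


# Example: generate three augmented variants of the same (dummy) clip.
rng = random.Random(0)  # seeded so the variants are reproducible
variants = [augment([0.1] * 100, rng, max_shift=10) for _ in range(3)]
```

One trade-off worth keeping in mind: augmented clips are derived from the same recordings, so they increase the number of training examples without adding new speakers or new utterances.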
What's next for Atypical Speech Recognition
- Find more data
- Go beyond simple commands (i.e., real speech recognition)
- Learn more about state-of-the-art speech recognition methods and adapt them to the target population using findings from dysarthria research
Built With
- flask
- javascript
- jupyter
- python
- vue.js