Inspiration
The inspiration for this project came from one of our group members, a Chinese major, who wanted to build an app that helps learners of Chinese practice their tones. Programs like Duolingo and HelloChinese exist as online tools for learning Chinese, but their detection of correct tone usage is often faulty, which limits how much they can help learners improve their pronunciation of Chinese tones. We hope to fill that gap with ToneTutor, our Chinese tone-checking tool.
What it does
ToneTutor is a tool that takes a user's speech, analyzes it, and provides example voice clips so the user can compare their own output with the desired output and improve their tones.
How we built it
The ToneTutor website was built with JavaScript. The speech analysis is done in Python with the help of librosa, a Python library for music and audio analysis, and ToneNet, a convolutional neural network developed by Gao Qiang, Sun Shutao, and Yang Yaping of the Communication University of China.
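As a rough illustration of the step between the audio analysis and the neural network, here is a minimal sketch of fitting a spectrogram to a fixed input size. It is not the project's actual code: the function names, the 64-frame target width, and the choice to pad with the quietest value are all assumptions made for illustration, and the spectrogram is a plain nested list standing in for the array a library such as librosa would produce.

```python
def fit_to_frames(spec, n_frames, pad_value=0.0):
    """Center-crop or right-pad each row of a 2-D spectrogram to n_frames columns."""
    fitted = []
    for row in spec:
        if len(row) >= n_frames:
            start = (len(row) - n_frames) // 2
            fitted.append(list(row[start:start + n_frames]))
        else:
            fitted.append(list(row) + [pad_value] * (n_frames - len(row)))
    return fitted


def min_max_scale(spec):
    """Rescale all values to [0, 1]; a constant input maps to all zeros."""
    flat = [v for row in spec for v in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:
        return [[0.0 for _ in row] for row in spec]
    return [[(v - lo) / span for v in row] for row in spec]


def to_model_input(spec, n_frames=64):
    """Return a nested list shaped (1, n_mels, n_frames, 1).

    The layout is batch, height, width, channel (channels-last).
    n_frames=64 is a hypothetical target width, not a documented value.
    """
    quietest = min(min(row) for row in spec)  # pad with the quietest value
    scaled = min_max_scale(fit_to_frames(spec, n_frames, pad_value=quietest))
    return [[[[v] for v in row] for row in scaled]]


# Tiny example: 2 mel bands x 3 frames (values in dB), padded to 4 frames.
example = to_model_input([[-80.0, -40.0, 0.0], [-80.0, -60.0, -20.0]], n_frames=4)
```

The trailing singleton dimension is the single "grayscale" channel that image-style convolutional networks expect; the leading one is a batch of size one.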
Challenges we ran into
Our team ran into multiple problems, including, but not limited to:
- Difficulty integrating Python with JavaScript
- Finding a solution to the lack-of-invariance problem with regard to Chinese tones
- Formatting spectrogram data into acceptable input for the neural network to evaluate
- Neural network runtime
- Generally underestimating the difficulty of the scope of the project as a whole
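To make the lack-of-invariance challenge above concrete: the same tone is produced at very different absolute pitches by different speakers, so raw fundamental frequency (F0) values are not directly comparable. One common workaround, sketched here under our own assumptions and not necessarily what ToneTutor does, is to z-score the logarithm of F0 against a sample of the speaker's voice. The function name and the convention of marking unvoiced frames as `None` are hypothetical.

```python
import math
from statistics import mean, pstdev


def _is_voiced(f):
    """A frame counts as voiced if it carries a positive F0 value."""
    return f is not None and f > 0


def normalize_f0(f0_hz, speaker_sample_hz=None):
    """Z-score log-F0 values so contours from different speakers are comparable.

    f0_hz: F0 values in Hz for one utterance; unvoiced frames are None or <= 0.
    speaker_sample_hz: optional larger sample of the same speaker's F0 used to
        estimate the mean and spread; defaults to the utterance itself.
    """
    sample = f0_hz if speaker_sample_hz is None else speaker_sample_hz
    logs = [math.log(f) for f in sample if _is_voiced(f)]
    if len(logs) < 2:
        raise ValueError("need at least two voiced frames to normalize")
    mu, sigma = mean(logs), pstdev(logs)
    if sigma == 0:
        # A perfectly flat sample has no spread; report every voiced frame as 0.
        return [0.0 if _is_voiced(f) else None for f in f0_hz]
    return [(math.log(f) - mu) / sigma if _is_voiced(f) else None for f in f0_hz]


# A low voice and a voice one octave higher producing the same rising shape
# end up with the same normalized contour.
low_voice = normalize_f0([100.0, 125.0, 150.0])
high_voice = normalize_f0([200.0, 250.0, 300.0])
```

Normalizing against a single syllable discards its overall height, which matters for distinguishing level tones, so in practice the mean and spread would more plausibly be estimated from a longer sample of the speaker's speech passed as `speaker_sample_hz`.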
Accomplishments that we're proud of
This is the first hackathon that any of our group members has ever attended, so we're just really proud to have managed to put together anything at all!
What we learned
We all learned a lot over these couple of days, including:
- The complexity of the prosody of Mandarin Chinese (especially the contextual and general variation of Mandarin tones)
- How to use JavaScript to build a website
- How to use Keras (a Python deep learning API)
- How to use GitHub
- The value of good teamwork!
What's next for ToneTutor
We had to strip out a lot of our initial ideas for this tool's functionality because of time constraints. Our next step would be to expand ToneTutor to include more in-depth comparisons between the user's speech and the target output in order to give better feedback. Additionally, ToneTutor should be able to take in a wider range of Chinese characters (or even longer strings of characters) for practice, possibly even letting users input a specific character that they want tutoring on.
On a larger scale, ToneTutor could be developed into a full language learning website complete with user accounts and the ability for teachers to arrange curricula on it, allowing students to track their progress and teachers to assign series of exercises to their classes.
Works Cited/Acknowledgements
- Alzohairi, Reema. "How to Record Audio in JavaScript". Medium, 2021. https://ralzohairi.medium.com/audio-recording-in-javascript-96eed45b75ee
- Chen, Y., & Gussenhoven, C. (2008). Emphasis and tonal implementation in Standard Chinese. Journal of Phonetics, 36, 724-746. https://doi.org/10.1016/j.wocn.2008.06.003
- Gao, Q., Sun, S., & Yang, Y. (2019). ToneNet: A CNN Model of Tone Classification of Mandarin Chinese. Proc. Interspeech 2019, 3367-3371. doi: 10.21437/Interspeech.2019-1483
- Gartzman, Dalya. "Getting to Know the Mel Spectrogram". Towards Data Science, 2019.
- Jongman, A., Wang, Y., Moore, C. B., & Sereno, J. (2006). Perception and Production of Mandarin Chinese Tones. Handbook of Chinese Psycholinguistics, 209-217. doi: 10.1017/CBO9780511550751.020.
- Keating, P., & Kuo, G. (2012). Comparison of speaking fundamental frequency in English and Mandarin. The Journal of the Acoustical Society of America, 132(2), 1050-1060. doi: 10.1121/1.4730893.
- Michigan State University. Tone Perfect: Multimodal Database for Mandarin Chinese. https://tone.lib.msu.edu/
- Rose, P. (1987). Considerations in the normalisation of the fundamental frequency of linguistic tone. Speech Communication, 6(4), 343-351. https://doi.org/10.1016/0167-6393(87)90009-4
- Xu, Y. (2017). Intonation in Chinese. In Encyclopedia of Chinese Language and Linguistics. W. Behr, Y. Gu, Z. Handel, C.-T. J. Huang, and J. Myers (Eds.). Boston: Brill, pp. 458-466.
- Yuan, J. (2011). Perception of intonation in Mandarin Chinese. The Journal of the Acoustical Society of America, 130(6), 4063-4069. doi: 10.1121/1.3651818.