Inspiration
Ahead of my music-based dissertation, I wanted to try a different music-based project and gain some experience with digital signal processing. A common compliment for guitarists is that 'they make the guitar sing', so I decided to turn this into a reality.
What it does
Given a lyric sheet in a text file, Vox reads in each word or syllable and synthesises speech for it into a WAV file. Each file is then loaded into an audio buffer using libsndfile. Vox then uses onset detection to detect when a note is struck on the guitar. Each time a note strike is detected, the next piece of speech is output.
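To give a rough feel for what "detecting a note strike" means, here is a minimal sketch of an energy-based onset detector. This is not what Vox uses (the project relies on onsetsds); it is a much cruder stand-in, and the class name EnergyOnsetDetector, its parameters and their default values are all hypothetical. It simply flags a block of samples as an onset when its mean energy jumps well above that of the previous block.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-in for a real onset detector.
// A block counts as a note strike when its mean energy is both above a
// small floor and several times larger than the previous block's energy.
class EnergyOnsetDetector {
public:
    // ratio:     how many times louder than the previous block counts as a strike.
    // minEnergy: energy floor, so near-silent noise never triggers.
    explicit EnergyOnsetDetector(float ratio = 4.0f, float minEnergy = 1e-4f)
        : ratio_(ratio), minEnergy_(minEnergy) {
        assert(ratio > 1.0f);
    }

    // Feed one block of mono samples; returns true if it looks like an onset.
    bool process(const float* block, std::size_t n) {
        if (n == 0) return false;

        float energy = 0.0f;
        for (std::size_t i = 0; i < n; ++i) energy += block[i] * block[i];
        energy /= static_cast<float>(n);

        const bool onset = energy > minEnergy_ && energy > ratio_ * prevEnergy_;
        prevEnergy_ = energy;  // remember this block for the next comparison
        return onset;
    }

private:
    float ratio_;
    float minEnergy_;
    float prevEnergy_ = 0.0f;
};
```

In an audio callback, process() would be called once per incoming block, and a true result would trigger the next piece of speech. A sustained loud note only triggers once, because the following blocks are no louder than the one before them.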
How I built it
I used the wdl-ol templates (built on Cockos' WDL) to generate desktop applications and VST plugins. The first step was to use an executable called tts.exe to generate the speech files. libsndfile then opened these and converted them into 44.1 kHz buffers that could be output directly in real time. The buffers are cycled through as note strikes are made on the guitar, so if a note is played while one sound is still playing, playback switches immediately to the next one!
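The switching behaviour described above can be sketched roughly as follows. This is an illustration under assumptions, not the actual plugin code: ClipSequencer, onOnset and render are hypothetical names, the clips are assumed to be mono float buffers already loaded into memory, and wrapping back to the first clip after the last one is my reading of "looped around".

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical sketch: steps through pre-loaded speech clips, one per note strike.
class ClipSequencer {
public:
    explicit ClipSequencer(std::vector<std::vector<float>> clips)
        : clips_(std::move(clips)) {}

    // Call when a note strike is detected. Jumps to the start of the next
    // clip, cutting off whatever is still playing. The first strike starts
    // clip 0; after the last clip it wraps around to the first (assumption).
    void onOnset() {
        if (clips_.empty()) return;
        current_ = started_ ? (current_ + 1) % clips_.size() : 0;
        started_ = true;
        pos_ = 0;
    }

    // Fill `out` with the next n samples. Outputs silence before the first
    // strike and once the current clip has finished.
    void render(float* out, std::size_t n) {
        for (std::size_t i = 0; i < n; ++i) {
            if (started_ && pos_ < clips_[current_].size())
                out[i] = clips_[current_][pos_++];
            else
                out[i] = 0.0f;
        }
    }

    std::size_t currentIndex() const { return current_; }

private:
    std::vector<std::vector<float>> clips_;
    std::size_t current_ = 0;
    std::size_t pos_ = 0;
    bool started_ = false;
};
```

In the audio callback, render() would run on every block and onOnset() would be called whenever the onset detector fires. Because onOnset() resets the read position, a new strike interrupts the current syllable straight away.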
Challenges I ran into
Most of the libraries I used turned out, surprisingly, to work very nicely. The main difficulty was mapping the notes played on my guitar to the pitch of the text-to-speech output. Although I found autocorrelation and pitch-shifting libraries, these proved too slow to run in real time, to the point that the sounds were no longer played. The only way I could have solved this was to roll my own versions of the algorithms. Since these involve some advanced physics, this was not something I could complete at 3am, so I decided it best to leave them out (for now) ;)
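For context, a naive autocorrelation pitch estimate looks something like the sketch below. It is not taken from any of the libraries I tried; the function name estimatePitchHz and the default search range of 70 Hz to 1000 Hz are assumptions made for illustration. It slides the signal against a delayed copy of itself and picks the delay (lag) where the two line up best, which for a single clean note is roughly one period.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical sketch of monophonic pitch estimation by brute-force
// autocorrelation. Returns the estimated fundamental in Hz, or 0.0 if the
// buffer is too short or no positive correlation peak is found.
double estimatePitchHz(const std::vector<float>& x, double sampleRate,
                       double minHz = 70.0, double maxHz = 1000.0) {
    assert(sampleRate > 0.0 && minHz > 0.0 && maxHz > minHz);

    // Convert the frequency search range into a range of lags (in samples).
    const std::size_t minLag = static_cast<std::size_t>(sampleRate / maxHz);
    const std::size_t maxLag = static_cast<std::size_t>(sampleRate / minHz);
    if (minLag == 0 || x.size() <= maxLag) return 0.0;

    std::size_t bestLag = 0;
    double best = 0.0;
    for (std::size_t lag = minLag; lag <= maxLag; ++lag) {
        double sum = 0.0;
        for (std::size_t i = 0; i + lag < x.size(); ++i)
            sum += static_cast<double>(x[i]) * x[i + lag];
        // Normalise by the number of terms so longer lags are not penalised.
        sum /= static_cast<double>(x.size() - lag);
        if (sum > best) {
            best = sum;
            bestLag = lag;
        }
    }
    return bestLag != 0 ? sampleRate / static_cast<double>(bestLag) : 0.0;
}
```

The nested loop is the catch: every candidate lag touches nearly every sample, so the cost grows with buffer length times the number of lags. That is cheap for a one-off test but adds up quickly when it has to run on every audio block. The result is also quantised to whole-sample lags, so it is only approximate.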
Accomplishments that I'm proud of
Apart from the parts of the hack that I found to be infeasible in 24 hours, I managed to complete the project on my own (a first for me at a hackathon). I also stayed on schedule, which gave me a lot of extra time to investigate pitch detection before concluding that it wasn't going to be feasible.
What I learned
This is the first audio plugin I have written, and I have learnt a lot about the structure of such applications as well as about many of the techniques used in audio processing, especially onset detection and reading WAV files into raw audio buffers.
What's next for Vox
Polyphonic pitch detection to map the notes on my guitar to the synthesised speech output, and a prettier UI :)
Built With
- asio
- c++
- libsndfile
- onsetsds
- vst
- wdl-ol