Inspiration

The God Emperor himself

What it does

Takes in a user's speech and Trumpifies it, so that it sounds like Trump is saying what you said.

How we built it

The voice conversion backend is built on top of SPTK. Speech analysis and synthesis are done with the WORLD speech analysis/synthesis library (Morise et al., 2016). A tool was written in C to convert a GMM into a differential GMM (Kobayashi et al., 2014). The dereverberation algorithm of Lebart et al. (2001) is implemented in GNU Octave to pre-process the noisy training data. All of these tools are glued together in a makefile.
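As a rough illustration of the GMM-to-differential-GMM step, the sketch below converts one mixture component of a joint GMM over source and target features [x, y] into a component over [x, d], where d = y - x. It is not the C tool described above: the function name, the block-matrix argument layout, and the nested-list representation are assumptions made for this example. It only applies the linear change of variables d = y - x to the component mean and covariance; mixture weights are unaffected by that change.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def _add(a: Matrix, b: Matrix) -> Matrix:
    """Element-wise sum of two equally sized matrices."""
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def _sub(a: Matrix, b: Matrix) -> Matrix:
    """Element-wise difference of two equally sized matrices."""
    return [[p - q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def to_differential_component(
    mu_x: Vector,
    mu_y: Vector,
    s_xx: Matrix,
    s_xy: Matrix,
    s_yx: Matrix,
    s_yy: Matrix,
) -> Tuple[Vector, Matrix, Matrix, Matrix]:
    """Hypothetical helper: re-express one joint component in terms of d = y - x.

    Returns (mu_d, s_xd, s_dx, s_dd). The source-side parameters mu_x and
    s_xx are unchanged by the substitution, so they are not returned.
    """
    # Mean of the difference is the difference of the means.
    mu_d = [my - mx for mx, my in zip(mu_x, mu_y)]
    # Cov(x, y - x) = S_xy - S_xx
    s_xd = _sub(s_xy, s_xx)
    # Cov(y - x, x) = S_yx - S_xx
    s_dx = _sub(s_yx, s_xx)
    # Cov(y - x, y - x) = S_xx + S_yy - S_xy - S_yx
    s_dd = _sub(_add(s_xx, s_yy), _add(s_xy, s_yx))
    return mu_d, s_xd, s_dx, s_dd
```

With a differential model of this kind, conversion estimates a spectral difference that can be applied to the input waveform directly, which is the idea behind the direct waveform modification in Kobayashi et al. (2014).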

Training

Features: 24th-order mel-cepstral coefficients; 5th-order band aperiodicity; static features + first-order differentials

GMM: 8 mixture components, 10 iterations of EM training

Pitch transformation: mean + standard deviation of log F0
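A pitch transformation based on the mean and standard deviation of log F0 is commonly done by normalizing each voiced frame against the source speaker's statistics and rescaling to the target's. The sketch below shows that idea under some assumptions: F0 is given in Hz per frame, unvoiced frames are marked with 0, and the function names are made up for this example rather than taken from the project.

```python
import math
from typing import Iterable, List, Tuple


def log_f0_stats(f0: Iterable[float]) -> Tuple[float, float]:
    """Mean and standard deviation of log F0 over voiced frames (f0 > 0).

    Assumes at least one voiced frame is present.
    """
    logs = [math.log(v) for v in f0 if v > 0.0]
    mean = sum(logs) / len(logs)
    var = sum((v - mean) ** 2 for v in logs) / len(logs)
    return mean, math.sqrt(var)


def convert_f0(
    f0: Iterable[float],
    src_stats: Tuple[float, float],
    tgt_stats: Tuple[float, float],
) -> List[float]:
    """Map source F0 to the target speaker's log-F0 mean and spread.

    Assumes the source standard deviation is non-zero.
    """
    (mu_s, sd_s), (mu_t, sd_t) = src_stats, tgt_stats
    out = []
    for v in f0:
        if v > 0.0:
            # Normalize in the log domain, then rescale to the target statistics.
            out.append(math.exp((math.log(v) - mu_s) / sd_s * sd_t + mu_t))
        else:
            # Unvoiced frames stay unvoiced.
            out.append(0.0)
    return out
```

For example, `convert_f0(user_f0, log_f0_stats(user_f0), log_f0_stats(target_f0))` would shift and scale a user's pitch contour toward the target speaker's range, where `user_f0` and `target_f0` are placeholder names for the two F0 tracks.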

Data: 40 parallel sentences with a total duration of about 2 min 40 s, including silence. The Trump speech was taken from the Republican National Convention (they've got some tremendous reverberation).

Express

An Express server hosts a GUI that lets users submit their own speech files. The server then processes the speech with the SPTK-based backend above and plays the Trumpified speech using the Howler.js library. Styling was added through Bootstrap and interactivity through jQuery.

References

Y. Stylianou, O. Cappé, and E. Moulines. "Continuous probabilistic transform for voice conversion." IEEE Transactions on Speech and Audio Processing 6, no. 2 (1998): 131-142.

K. Kobayashi, T. Toda, G. Neubig, S. Sakti, and S. Nakamura. "Statistical singing voice conversion with direct waveform modification based on the spectrum differential." In INTERSPEECH, pp. 2514-2518. 2014.

K. Lebart, J. M. Boucher, and P. N. Denbigh. "A new method based on spectral subtraction for speech dereverberation." Acta Acustica united with Acustica 87, no. 3 (2001): 359-366.

M. Morise, F. Yokomori, and K. Ozawa. "WORLD: a vocoder-based high-quality speech synthesis system for real-time applications." IEICE Transactions on Information and Systems E99-D, no. 7 (2016): 1877-1884.

Challenges we ran into

Merging the ML libraries so they could process files from the Express server

Time to train the model

Reverberation in interviews and test recordings

Accomplishments that we're proud of

The Express server is able to accept uploaded files.

The converted speech was able to sound similar to Trump.

What we learned

Dereverberation, Express server backend tools, and JavaScript, plus practical experience with GMM-based voice conversion.

What's next for TalkTrumpToMe

Natural Language Processing Trump Speech Synthesizer
