Inspiration

Typing commands all the time grew tiring, so I looked for some way to make a task I repeat constantly easier. What first came to mind was Discord's smooth and versatile voice channels, which gamers already rely on to communicate. I realized I could build a bot that uses the Discord API to tie search commands to voice, freeing up the fingers for intense gaming.

What it does

The Discord voice recognition bot can listen in on, and understand, what you say! After you activate the bot and let it listen to your voice channel, it picks up on certain keywords that trigger specific events. These events include requesting YouTube songs and playlists, skipping and resetting playlist queues, and conducting image searches on Imgur. Without you pressing any key or button, the bot listens to whatever you say and responds only when necessary. Oh, and there's also a repeat-after-me text-to-speech feature (so, speech-to-speech...)!
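As a rough illustration of the keyword idea above, here is a minimal sketch of how a transcript (the text that comes back from speech-to-text) could be mapped to an event. The trigger words, the command names, and the `matchCommand` function are all hypothetical. This write-up does not list the bot's actual keywords, so treat this as one possible approach rather than the bot's real code.

```javascript
// Hypothetical trigger words and command names, chosen only for illustration.
// The real bot's keywords are not listed in this write-up.
const TRIGGERS = [
  { keyword: "play", command: "youtube.play" },
  { keyword: "skip", command: "queue.skip" },
  { keyword: "reset", command: "queue.reset" },
  { keyword: "image", command: "imgur.search" },
  { keyword: "repeat", command: "tts.repeat" },
];

// Scan a transcript word by word and return the first matching command,
// with everything said after the keyword as its argument.
// Returns null when no keyword is present, so the bot can stay silent.
function matchCommand(transcript) {
  const words = transcript.toLowerCase().trim().split(/\s+/).filter(Boolean);
  for (let i = 0; i < words.length; i++) {
    const trigger = TRIGGERS.find((t) => t.keyword === words[i]);
    if (trigger) {
      return {
        command: trigger.command,
        argument: words.slice(i + 1).join(" "),
      };
    }
  }
  return null;
}
```

Returning `null` for ordinary conversation is what lets the bot respond "only when necessary": anything without a trigger word is simply ignored.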

How I built it

The program is written mainly in Node.js, along with JSON. I used the discord.js library for Discord/bot interactions and the Wit.ai API for natural language processing and speech-to-text conversion. I also used the YouTube Data API and the Imgur API to build the main features of the program.

Challenges I ran into

Figuring out how to capture voice channel activity from Discord was probably the hardest part of the programming. Learning and understanding how to use callbacks was also quite tiring.

Accomplishments that I'm proud of

I'm proud that I eventually saw the project through and built a program that does something so seemingly simple. Translating speech to text and then using that text to carry out other tasks was a really motivating idea.

What I learned

I learned how to use Node.js, make HTTP requests, and wrestle with callbacks.

What's next for Discord Voice Recognition Bot

Doing anything I can do with commands, but with speech:

  • Google searches
  • Swear jars
  • News retrieval
  • Ease of use for people who cannot type efficiently
