Inspiration

Mobile devices already offer extensive built-in voice activation through services like Siri and through features in individual apps. Yet Discord, one of the most popular social messaging applications right now, does not offer the same level of voice activation found in many phone applications. This inspired us to create CociBot, a fully voice-activated Discord bot that lets you use any of its commands just by speaking out loud. Through voice recognition, we sought to make a more accessible, easier-to-use bot that people can interact with without constantly opening Discord.

What it does

CociBot currently has two main functions, both activated purely through voice commands. The first is a simple command that has the bot send a message to a text channel on behalf of the user who activates it; the message is whatever the user says after the command. This lets users who are occupied with other tasks still send messages and participate in conversations without having to pull up a keyboard and type. The second command is CociBot's current primary function: the user states a search query, and the bot returns a list of three URLs from Google's search engine, each with a two-sentence summary. When someone using Discord needs to look up information quickly, they can now do so with their voice within Discord without opening a browser. The bot also filters out the sponsored content that often appears at the top of search results in a standard browser.
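As a rough illustration of how a transcribed sentence could be split into one of the two commands and its spoken argument, here is a minimal sketch. The trigger phrases ("send message", "search for") and the names `TRIGGERS`, `VoiceCommand`, and `parse_command` are assumptions made for this example; the write-up does not say which phrases CociBot actually listens for.

```python
from typing import NamedTuple, Optional

# Hypothetical trigger phrases mapped to command names (assumed, not from CociBot).
TRIGGERS = {
    "send message": "message",
    "search for": "search",
}


class VoiceCommand(NamedTuple):
    name: str      # "message" or "search"
    argument: str  # the words spoken after the trigger phrase


def parse_command(transcript: str) -> Optional[VoiceCommand]:
    """Return the command found at the start of a transcript, or None."""
    text = transcript.strip()
    lowered = text.lower()
    for phrase, name in TRIGGERS.items():
        if lowered.startswith(phrase):
            # Drop the trigger phrase and any punctuation the transcriber
            # placed around the remaining words.
            argument = text[len(phrase):].strip(" ,.:")
            if argument:
                return VoiceCommand(name, argument)
    return None
```

A transcript that does not start with a known phrase, or that has nothing after the phrase, yields `None`, so the bot can simply ignore ordinary conversation.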

How we built it

We used the Python library discord.py to build and initialize our Discord bot so it could start accepting commands from a user. For voice recognition, we used Python's SpeechRecognition library to accept input from the user's mic, then OpenAI's Whisper neural net to transcribe the speech into clear text that we interpret as commands. For our search functionality, we used the Google API client to submit the user's search query and get back a list of links from Google's search engine. For each link, we used BeautifulSoup to extract the raw text displayed on the page, then gave both the raw text and the page's URL to GPT-3.5, which produced a summary of the page's content to display alongside the URL in the bot's message.
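The page-text and summary step can be sketched roughly as follows. To keep the example self-contained, it uses the standard library's `html.parser` as a stand-in for BeautifulSoup, and it only builds the prompt string rather than calling GPT-3.5. The function names, the prompt wording, and the `max_chars` limit are all assumptions for illustration, not CociBot's actual code.

```python
from html.parser import HTMLParser


class _VisibleText(HTMLParser):
    """Collects text nodes, skipping script/style/noscript contents."""

    SKIP = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def visible_text(html: str) -> str:
    """Return the human-readable text of an HTML document as one string."""
    parser = _VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def build_summary_prompt(url: str, page_text: str, max_chars: int = 4000) -> str:
    """Build a summarization prompt; max_chars is an assumed cap on page text."""
    excerpt = page_text[:max_chars]
    return (
        "Summarize the following web page in two sentences.\n"
        f"URL: {url}\n"
        f"Page text: {excerpt}"
    )
```

Truncating the page text before building the prompt is one simple way to keep long pages within a model's input limit; the limit chosen here is arbitrary.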

Challenges we ran into

The biggest challenge we ran into was configuring audio input for our voice recognition. We were originally going to use Better Discord, an application that provides plugin and developer support. However, Better Discord recently removed the ability to use third-party plugins, so accessing the microphone this way was not possible. After more research, we settled on the SpeechRecognition library, which supports various speech APIs and uses PyAudio for microphone input. The API we chose for voice recognition was the Whisper API by OpenAI. Its recognition is extremely accurate, even down to the punctuation in user input. Getting this configured was challenging, but we're proud of the results.
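A minimal sketch of the microphone-to-text step with SpeechRecognition is shown below. It assumes the library's local Whisper recognizer (`recognize_whisper`, which needs the `openai-whisper` package installed) rather than the hosted API, along with PyAudio for `Microphone`. The function names, the `"base"` model size, and the time limits are illustrative choices, not the project's actual configuration.

```python
def clean_transcript(raw: str) -> str:
    """Collapse runs of whitespace and trim the ends of a transcript."""
    return " ".join(raw.split())


def listen_and_transcribe(model: str = "base", phrase_time_limit: float = 10.0) -> str:
    """Record one phrase from the default microphone and transcribe it.

    The import is done here so the helper above can be used without
    SpeechRecognition or PyAudio installed.
    """
    import speech_recognition as sr  # third-party: SpeechRecognition

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:  # requires PyAudio
        # Sample background noise briefly so quiet speech is still detected.
        recognizer.adjust_for_ambient_noise(source, duration=0.5)
        audio = recognizer.listen(source, phrase_time_limit=phrase_time_limit)

    # Assumed: local Whisper model; the hosted API uses a different method.
    return clean_transcript(recognizer.recognize_whisper(audio, model=model))
```

Calling `listen_and_transcribe()` blocks until a phrase has been captured, so in a bot it would typically run in a background thread or executor rather than directly on the event loop.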

Accomplishments that we're proud of

For one of our team members (Tucker), this was his first hackathon. Getting exposure to APIs, AI, and version control in a small group project was a unique experience that one doesn't typically get on the job or in the classroom. Both of us were able to express ourselves creatively and technically to overcome challenges and meet the criteria of the categories we targeted. This was a net positive experience for both of us, and we are more proficient because of it.

What we learned

Getting familiar with the libraries we used and configuring our search feature gave us a better understanding of how to use AI with search queries and audio interpretation. There was also a lot of miscellaneous learning: both of us feel more confident in our Python skills, having picked up new keywords and features as well as new ways to use features we already knew. All of the tools and frameworks used in this project will be valuable to our futures as programmers, and this experience was a great way to familiarize ourselves with them. We also learned how to communicate and compromise as a team to meet the needs of the category we chose while pursuing our own personal interests.

What's next for Voice-Activated Discord Bot - CociBot

With more time, we would like to add many smaller features that give users more ways to interact with the bot, whether through other interactions with a Discord server or through activities a user can take part in using only their voice. We would also like to integrate a text-to-speech module so the bot can reply to a user's query and potentially hold a conversation.
