Inspiration

All of us enjoy playing games in our free time, and we want to enable those who can't enjoy them conventionally to do so.

What it does

EchoPlay translates your voice into game actions. We built EchoPlay from the ground up to be universal to any game you can imagine. It's fully customizable, with a user-friendly interface for configuring everything from word-to-key mappings to button hold times.

How we built it

We came into the project with two big goals:

  • Voice input to game action with as low latency as possible.
  • Ability to work with any game, plus an easy-to-use GUI for customization.

Voice input:

  • We first found a library called RealtimeSTT that does real-time transcription, and confirmed it was the best starting point for us.
  • We then spent time tuning the hyperparameters to make it as fast as possible.
  • When hyperparameter tuning alone wasn't enough, we made our own changes to the library to speed it up further.
  • Finally, we cleaned up the model's output before passing it to the backend that maps words to key presses.
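The cleanup step above can be sketched roughly as follows (the function name and exact rules are illustrative, not our actual code):

```python
import re

def clean_transcription(text):
    """Normalize raw transcription output before word-to-key mapping.

    Lowercases, strips punctuation, and splits into words so the backend
    only ever sees a clean token stream.
    """
    text = text.lower()
    text = re.sub(r"[^a-z0-9\s]", "", text)  # drop punctuation and symbols
    return text.split()
```

For example, `clean_transcription("Jump! Left, left.")` returns `["jump", "left", "left"]`.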

Voice to key press:

  • After receiving the cleaned data from the voice model, we look each word up in a dictionary of synonyms so that words the model mishears still map to the right keys — for example, if it transcribes "why" instead of "y".
  • We also built a logic system to remove duplicates from the input stream.
  • We implemented the entire backend with multiprocessing, which further improved efficiency.
  • Veterans United recommended that we add a feature that finds homophones for each command to account for different accents. This transformed our product and took it to the next level.
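A minimal sketch of the synonym/homophone lookup described above (the tables and key bindings are illustrative examples, not our full configuration):

```python
from typing import Optional

# Misheard or homophone transcriptions fold into the intended word.
SYNONYMS = {
    "why": "y",
    "sea": "c",
    "see": "c",
}

# Example word-to-key bindings a user might configure in the GUI.
KEY_BINDINGS = {
    "jump": "space",
    "left": "a",
    "right": "d",
    "y": "y",
}

def word_to_key(word: str) -> Optional[str]:
    """Resolve a transcribed word to a key press, folding homophones first."""
    canonical = SYNONYMS.get(word, word)
    return KEY_BINDINGS.get(canonical)
```

So `word_to_key("why")` resolves to the `"y"` key even though the model misheard the word, while unknown words return `None` and are ignored.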

Challenges we ran into

  • No model on the market was fast enough for our purposes, so we had to spend a lot of time making the model faster. This involved changing hyperparameters, rewriting parts of the library's structure, and pruning the RealtimeSTT library.
  • The model also tends to output a lot of duplicates. That wouldn't have been a problem if we weren't aiming for minimum latency, because the model eventually corrects itself and outputs only the final transcription. We had to build multiple levels of logic and a custom data structure to filter the transcribed words for duplicates.
  • Because optimizing the voice model took so long, we had to develop the backend in parallel with no idea of how the two would fit together. This was a huge concern during initial development.
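The sliding-window duplicate filter described above can be sketched like this (the window length and structure are illustrative; our actual data structure differs):

```python
import time
from collections import deque

class DuplicateFilter:
    """Drop a word if the same word was already emitted within the
    last `window` seconds; otherwise let it through."""

    def __init__(self, window=0.5):
        self.window = window
        self.recent = deque()  # (word, timestamp) pairs, oldest first

    def accept(self, word, now=None):
        now = time.monotonic() if now is None else now
        # Evict entries that have aged out of the window.
        while self.recent and now - self.recent[0][1] > self.window:
            self.recent.popleft()
        if any(w == word for w, _ in self.recent):
            return False  # duplicate within the window: suppress it
        self.recent.append((word, now))
        return True
```

With a 0.5 s window, a repeated "jump" 0.2 s after the first is suppressed, while the same word arriving a full second later passes through.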

Accomplishments that we're proud of

  • We finished our Minimum Viable Product (MVP) well ahead of schedule. Because we communicated effectively during development, the backend logic and the voice model fit together almost perfectly without needing many changes.
  • Our voice model has very low latency (the fastest we could find), and the transcription is fast enough that we can play a fast-paced game such as Trackmania with no noticeable lag.
  • Our duplicate filtering works as intended and catches almost all duplicates.
  • Our GUI is intuitive and easy to use.
  • EchoPlay can be customized for different games and user phrases.

What we learned

  • All of us had prior experience with Python, but not with the specific libraries and tools used in this project. We had to learn libraries like Tkinter, multiprocessing, openai-whisper, RealtimeSTT, and vgamepad.
  • Aside from libraries, we also sharpened our fundamentals. To filter out duplicates, we implemented a custom data structure and revisited queues and sliding-window strategies.
  • We also learned the importance of a good Git branching scheme. In previous hackathons, we always ran into merge conflicts from editing the same file on the same branch. This time we used the GitHub Flow branching strategy to manage our repository and prevent merge conflicts. The small effort of setting up and following the strategy paid dividends, completely eliminating merge conflicts.

What's next for EchoPlay

  • We want to keep improving functionality and ease of use: polishing the GUI and improving the model.
  • Eventually we want to open source the code base so anybody can use EchoPlay for free.
