I have always wanted to build a robot that I could control with my voice. So when I got the chance to create something for this hackathon, I decided to build one!

What it does

My project is built around a Flask application running on a Raspberry Pi. On the Flask webpage, you can record yourself saying a command and upload the recording to the webserver on the Pi. The audio file is then sent to the Google Speech-to-Text API, which returns the transcribed text. Finally, a Python script interprets the text and controls the motors on a small robot I built.
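The upload-and-transcribe step could be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's actual code: the `/upload` route, the `audio` form field, `UPLOAD_DIR`, `create_app` and `transcribe_wav` are all hypothetical names. It relies on the third-party Flask and SpeechRecognition packages, which are imported inside the functions that need them.

```python
import tempfile
from pathlib import Path

# Hypothetical location for uploaded recordings.
UPLOAD_DIR = Path(tempfile.gettempdir()) / "voice_robot_uploads"


def transcribe_wav(path):
    """Send a WAV file to Google's recognizer via the SpeechRecognition library."""
    import speech_recognition as sr  # third-party: pip install SpeechRecognition

    recognizer = sr.Recognizer()
    with sr.AudioFile(str(path)) as source:
        audio = recognizer.record(source)  # read the whole file into memory
    try:
        return recognizer.recognize_google(audio)
    except sr.UnknownValueError:
        return ""  # the speech could not be understood


def create_app(transcribe=transcribe_wav):
    """Build a Flask app with one route that accepts a recording and returns text."""
    from flask import Flask, jsonify, request  # third-party: pip install flask

    app = Flask(__name__)

    @app.route("/upload", methods=["POST"])
    def upload():
        uploaded = request.files.get("audio")  # "audio" is an assumed field name
        if uploaded is None:
            return jsonify(error="no audio file in request"), 400
        UPLOAD_DIR.mkdir(exist_ok=True)
        wav_path = UPLOAD_DIR / "command.wav"
        uploaded.save(str(wav_path))
        # Returning JSON lets the page show the result without a reload.
        return jsonify(text=transcribe(wav_path))

    return app
```

Passing the transcriber in as a parameter makes it possible to try the route with a stand-in function before wiring up the real speech service.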

How I built it

When I first decided to build this project, I had to research how to create a dynamically updating webpage with Flask. I also had to read the documentation for the SpeechRecognition Python library and figure out how to record a user's microphone from a webpage using Recorder.js. After all this, I coded up a webpage that could record your voice and tell you what the Google Speech API thinks you said. In my second iteration, I added support for specific commands, like "go forward" or "turn left". Finally, I combined my Flask webserver with a simple robot so that it could be controlled by voice.
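Command support of this kind can be as simple as a lookup table. The sketch below is an assumption about how it might look: only "go forward" and "turn left" come from the write-up, while the other two phrases, the action names and `find_command` are made up for illustration.

```python
# Hypothetical phrase table: spoken phrase -> action name.
COMMANDS = {
    "go forward": "forward",
    "go backward": "backward",  # assumed, not mentioned in the write-up
    "turn left": "left",
    "turn right": "right",      # assumed, not mentioned in the write-up
}


def find_command(transcript):
    """Return the action for a transcript that starts with a known phrase, else None."""
    text = transcript.lower().strip()
    for phrase, action in COMMANDS.items():
        # Match the phrase alone, or the phrase followed by further words.
        if text == phrase or text.startswith(phrase + " "):
            return action
    return None


print(find_command("Turn left"))    # left
print(find_command("sing a song"))  # None
```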

Challenges I ran into

One of the first challenges I ran into was figuring out how to update my webpage dynamically without making the user reload the page. After a lot of research, I found a way to do so that wasn't too long or complicated and didn't require separate libraries. The next challenge was driving my robot's motors. For some reason, whenever I ran my code from the command line, it would run the motors at the correct speed, but when I ran it from IDLE, the motors would run much slower. I eventually decided to simply use system commands to run my motor code instead of creating custom modules.
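Running the motor code through a system command might look something like this. It is a sketch, not the project's code: it assumes `subprocess.run` from the standard library is used to launch the script, and `run_motor_script` is a hypothetical helper name.

```python
import subprocess
import sys


def run_motor_script(script_path, *args, timeout=30):
    """Run a script in a fresh Python process, much like typing the command in a shell.

    Returns the exit code and whatever the script printed.
    """
    command = [sys.executable, str(script_path), *map(str, args)]
    completed = subprocess.run(
        command, capture_output=True, text=True, timeout=timeout
    )
    return completed.returncode, completed.stdout
```

With a motor script called `motors.py` (a made-up filename), the call would be `run_motor_script("motors.py", "forward", 3)`. Because the script runs in its own process, it behaves the same as when it is started from the command line.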

Accomplishments that I'm proud of

I am proud of finding a way to customize how long the robot moves. The solution came when I realized I could split the transcribed text from the Google Speech API into a list of words, take the number of seconds the user specified, and pass it to my motor code as a command line argument. That let me keep running my motor code from the command line while still having a degree of customizability.
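The list-splitting idea can be illustrated with a short sketch. The function names, the one-second default, the `motors.py` filename and the argument layout are all assumptions made for the example. It also assumes the transcript contains the number as digits ("5"), not as a word ("five").

```python
def split_command(transcript, default_seconds=1):
    """Split a transcript into a command phrase and a duration in whole seconds."""
    words = transcript.lower().split()
    # Drop a trailing unit so "go forward 5 seconds" parses like "go forward 5".
    if words and words[-1] in ("second", "seconds"):
        words.pop()
    seconds = default_seconds
    if words and words[-1].isdecimal():
        seconds = int(words.pop())
    return " ".join(words), seconds


def to_argv(transcript, script="motors.py"):
    """Build the command line for a hypothetical motor script."""
    phrase, seconds = split_command(transcript)
    return ["python3", script, phrase.replace(" ", "_"), str(seconds)]


print(split_command("go forward 5 seconds"))  # ('go forward', 5)
print(to_argv("turn left 2"))  # ['python3', 'motors.py', 'turn_left', '2']
```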

What's next for Speech Controlled Robot

Next, I want to add more comprehensive error handling so that when a user says something like "go forward potato", the Flask application will tell them that they used an invalid number or argument. I also want to make my script more streamlined and faster, as it sometimes slows down.
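One possible shape for that check is sketched below. It is only an idea for future work, not existing code. It assumes every command is a two-word phrase optionally followed by a whole number of seconds. `KNOWN_PHRASES` and `check_command` are hypothetical names.

```python
# The two phrases mentioned in this write-up; a real list would be longer.
KNOWN_PHRASES = ("go forward", "turn left")


def check_command(transcript):
    """Return an error message for an invalid command, or None if it looks valid."""
    words = transcript.lower().split()
    phrase, extra = " ".join(words[:2]), words[2:]
    if phrase not in KNOWN_PHRASES:
        return f"Unknown command: {phrase!r}"
    if len(extra) > 1:
        return f"Too many arguments: {' '.join(extra)!r}"
    if extra and not extra[0].isdecimal():
        return f"Invalid number of seconds: {extra[0]!r}"
    return None


print(check_command("go forward potato"))  # Invalid number of seconds: 'potato'
print(check_command("go forward 3"))       # None
```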
