Nowadays, speech recognition has become a hot topic in technology. With higher accuracy and faster recognition, this technology can benefit people's daily lives and revolutionize our lifestyle.
We wanted to bring this technology into daily life and use it to build an advanced product.
What it does
In general, our prototype takes the user's speech input and transforms it into a voice command; each command triggers a different task.
Our central control unit takes the user's speech input and recognizes it, then forwards the result to the robot via Bluetooth. For example, commands such as “Go” and “Stop” are sent to our VCV (Voice Controlled Vehicle).
Once the VCV receives a command, it executes the task, controlling the motors to go, stop, or turn.
Our prototype can also recognize some "emotion" in the user's input. For example, if a command is spoken at high volume, the user probably wants that function executed in a hurry, so our VCV reacts to that emotional input by, for example, speeding up or turning faster.
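The volume-based "emotion" behavior described above can be sketched as follows. This is a minimal illustrative sketch; the threshold and scaling values are assumptions, not the values used on the real VCV.

```python
# Sketch of the volume-based "emotion" mapping: a loud command is
# treated as urgent, so the vehicle reacts faster. Threshold and
# speed values below are illustrative assumptions.

def rms_volume(samples):
    """Root-mean-square amplitude of raw microphone samples."""
    return (sum(s * s for s in samples) / len(samples)) ** 0.5

def speed_for_command(command, volume, base_speed=100, loud_threshold=0.5):
    """Map a recognized command plus its input volume to a motor speed.

    A volume above the threshold marks the command as urgent, so the
    vehicle speeds up (or turns more sharply) in response.
    """
    if command == "stop":
        return 0
    urgent = volume > loud_threshold
    return base_speed * 2 if urgent else base_speed
```

For example, a quiet "go" yields the base speed, while a loud "go" doubles it.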
How we built it
- Language: C++, Python, Arduino
- Library: CMU Sphinx
- Tool: ROS
We created a ROS node that sends the speech recognition results to the robot via Bluetooth. The results are generated by the CMU Sphinx library, and we designed the speech dictionary to fit our own needs.
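CMU Sphinx (pocketsphinx) can spot a small command vocabulary from a keyword-list file, where each line is a phrase followed by a detection threshold. Below is a minimal sketch of generating such a list; the words and thresholds are illustrative, not our actual tuned dictionary.

```python
# Sketch of generating a CMU Sphinx keyword-list (.kws) file for a
# small command vocabulary. Phrases and thresholds are illustrative;
# a lower threshold makes the keyword spotter stricter.

COMMANDS = {
    "go": 1e-20,
    "stop": 1e-20,
    "turn left": 1e-30,
    "turn right": 1e-30,
}

def build_kws(commands):
    """Render the vocabulary in pocketsphinx's 'phrase /threshold/' format."""
    lines = (f"{phrase} /{thr:g}/" for phrase, thr in sorted(commands.items()))
    return "\n".join(lines) + "\n"
```

The resulting file would be passed to the recognizer's keyword-spotting mode, and the recognized phrase forwarded over the Bluetooth link by the ROS node.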
We used SOLIDWORKS to build the robot model and manufactured it with a laser cutter.
The electrical design of our VCV contains the following parts:
- Bluetooth communication with the central computer
  - An HM-10 Bluetooth module connected to an Arduino UNO board communicates with the central control unit
- High-current motor control system
  - A self-designed dual H-bridge circuit controls the high-power DC motors, with a 6 A maximum current limit
- User voice input control
  - An op-amp amplifies the user's voice input level
- Emotion and volume detection circuit
  - Four electret microphones on the VCV detect the volume and emotion of the user's input
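The dual H-bridge direction logic above can be sketched as a truth table. This is a minimal Python sketch, assuming each motor side has two direction inputs (IN1, IN2); the command names and pin pairings are illustrative, and the real firmware runs on the Arduino UNO.

```python
# Sketch of dual H-bridge direction logic. For each motor, the two
# inputs (IN1, IN2) select the state: (1, 0) drives forward, (0, 1)
# reverses, and (0, 0) brakes. Pairings below are illustrative.

FORWARD, REVERSE, BRAKE = (1, 0), (0, 1), (0, 0)

def bridge_states(command):
    """Map a voice command to (left motor, right motor) H-bridge inputs."""
    table = {
        "go":    (FORWARD, FORWARD),
        "stop":  (BRAKE, BRAKE),
        "left":  (REVERSE, FORWARD),  # spin in place toward the left
        "right": (FORWARD, REVERSE),  # spin in place toward the right
    }
    return table[command]
```

On the Arduino, each of these logic levels would be written to the corresponding H-bridge input pin, with PWM setting the speed.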
Challenges we ran into
We ran into several challenges and problems while building our prototype:
Speech recognition delay
Natural-language recognition on Google Cloud Platform takes a long time to process, which can also delay real-time control of the VCV.
Voice filter circuit
There are four electret microphones on our prototype VCV. Since the microphones we had were very basic, designing a proper voice input filter and amplifier circuit was a big challenge.
Dropped distance recognition plan
The initial plan for the on-board microphones was to also detect the user's distance when a command is sent. However, because of the voice filter circuit design challenge, we could not get a sufficiently clean voice signal, so we had to give up on detecting distance from voice input.
Accomplishments that we're proud of
We achieved a great synthesis of software and hardware: we designed and manufactured our robot, soldered and optimized the electrical circuits, and incorporated state-of-the-art voice recognition techniques. We are proud of our robot, and we embraced the diversity of our team, truly believing that together we are stronger.
What we learned
- Used the Google Cloud Platform API and CMU Sphinx to recognize speech
- Configured Bluetooth and serial-port read/write
- Familiarized ourselves with Arduino
- Filtered and amplified voice signals
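The serial-port read/write we learned about can be sketched as a small framing helper for the Bluetooth link. This is a minimal sketch, assuming newline-terminated ASCII commands; the port name and baud rate mentioned below are assumptions.

```python
# Sketch of framing voice commands for the HM-10 serial Bluetooth link.
# Assumes newline-terminated lowercase ASCII commands, which the
# Arduino side would read line by line.

def encode_command(command):
    """Frame a command as newline-terminated ASCII bytes."""
    return command.strip().lower().encode("ascii") + b"\n"

def send_command(port, command):
    """Write a framed command to an open serial-port-like object."""
    port.write(encode_command(command))
```

With pyserial, an opened `serial.Serial("/dev/rfcomm0", 9600)` object could be passed as `port`; here those values are placeholders, not our actual configuration.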
What's next for Voice Controlled Vehicle
Given the challenges and problems we encountered, the following are the next steps for our VCV:
- Improve microphone input control
  - Switch to new microphone devices
  - Design a new voice filter and amplifier circuit that achieves good noise filtering and a stronger voice command input signal
- Improve speech recognition detection
  - Train a new model for speech input recognition
- Improve adaptability to different languages
  - Train new models for different languages and accents so that more people can use our device, not just English speakers