Autonomous vehicles are becoming more prevalent every year. This broad category includes everything from Roombas to drones to Mars rovers. As one might expect, a lot of software has to be written to make sure these vehicles navigate their environments properly, not to mention complete the tasks they were built for. We realized that recent advances in reinforcement learning make it possible to prototype a workflow that automatically generates a self-preservation safety protocol for any autonomous (or RC!) vehicle. In theory, this approach could stop Roombas from hitting people, level out a spiraling quadcopter, and prevent a Mars rover from meeting an untimely end.
What it does
The workflow proceeds as follows: model your AV or RC robot in a virtual environment (we use OpenAI's Gym) and create a simulation of the environment the vehicle will travel in. Use a learning algorithm (described below) to reward the bot whenever it remains "stable." This type of reinforcement learning, known as continuous deep Q-learning, works with any defined set of inputs. That means you simply specify which sensors are on the vehicle and continuously feed their readings to the algorithm. The algorithm learns how to remain "stable" from those sensor inputs and from the reward it gets for not crashing. By the end of this workflow, you will have trained an algorithm that keeps your robot safe automatically, without hand-writing any specific protocols or maneuvers.
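The reward-for-stability loop above can be sketched in a few lines. This is only an illustration under stated assumptions: `CorridorEnv` is a hypothetical 1-D toy environment (not the project's physics engine), its `reset`/`step` methods mirror the Gym interface without importing Gym, the reward values are made up, and plain tabular Q-learning stands in for the deep Q-network described above.

```python
import random


class CorridorEnv:
    """Hypothetical toy environment: a car in a corridor with a wall at each end.

    The observation is the car's cell index, standing in for a discretized
    ultrasonic distance reading.
    """

    ACTIONS = (-1, 0, 1)  # reverse, hold, forward

    def __init__(self, length=7, max_steps=20, seed=0):
        self.length = length
        self.max_steps = max_steps
        self.rng = random.Random(seed)

    def reset(self):
        # Start in any cell that is not a wall cell.
        self.pos = self.rng.randrange(1, self.length - 1)
        self.steps = 0
        return self.pos

    def step(self, action):
        self.pos += self.ACTIONS[action]
        self.steps += 1
        crashed = self.pos <= 0 or self.pos >= self.length - 1
        # Assumed reward scheme: +1 for every "stable" step, -10 for a crash.
        reward = -10.0 if crashed else 1.0
        done = crashed or self.steps >= self.max_steps
        return self.pos, reward, done, {"crashed": crashed}


def train(env, episodes=2000, alpha=0.1, gamma=0.9, epsilon=0.2):
    """Tabular Q-learning with epsilon-greedy exploration."""
    n_actions = len(env.ACTIONS)
    q = [[0.0] * n_actions for _ in range(env.length)]
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            if env.rng.random() < epsilon:
                action = env.rng.randrange(n_actions)  # explore
            else:
                action = max(range(n_actions), key=lambda a: q[state][a])  # exploit
            next_state, reward, done, info = env.step(action)
            # Only a crash is truly terminal; a time-limit cutoff still bootstraps.
            if info["crashed"]:
                target = reward
            else:
                target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q
```

Calling `train(CorridorEnv())` returns a table with one row of action values per corridor cell. In cells next to a wall, the action that drives into the wall should end up with a clearly lower value than the alternatives, which is the signal a safety layer can act on.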
What we built
To demo this idea, we built a custom acrylic RC car with an Arduino and a Wi-Fi card. We used ultrasonic sensors to measure distances from each edge of the robot. We used OpenAI Gym to build a custom physics engine and environment for this vehicle, and ran the learning algorithm on Google's Deep Learning VM. We can show, in both simulation and the real world, how the algorithm can be used to reject unsafe actions and replace them with safe ones, preventing collisions and preserving the robot.
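One way the reject-and-replace step could work is sketched below. The writeup does not say how an action is judged unsafe, so this assumes the learned action values are compared directly; `safe_action` and its `margin` threshold are hypothetical names chosen for illustration.

```python
def safe_action(q_values, proposed, margin=1.0):
    """Return (action, overridden).

    The proposed action is kept unless its estimated value trails the best
    available action by more than `margin` (an assumed threshold), in which
    case the best action is substituted.
    """
    best = max(range(len(q_values)), key=lambda a: q_values[a])
    if q_values[best] - q_values[proposed] > margin:
        return best, True
    return proposed, False


# Example: the driver asks for action 0, which the model values far below
# the alternatives, so the filter substitutes the best-valued action.
action, overridden = safe_action([-10.0, 2.5, 3.0], proposed=0)
```

A relative margin like this lets harmless commands from the driver pass through unchanged and only intervenes when the model expects a much worse outcome.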
Challenges I ran into
There was a lot to accomplish technically, from building the robot to coding the environment (and physics engine), training the RL model, and setting up the server connection. Thanks to a diverse team (two CS, two ECE), we were able to parallelize the work and get a lot done.
Accomplishments that I'm proud of
- Building a physics engine
- Laser cutting a car
- Successfully implementing the algorithm
What I learned
We learned reinforcement learning principles and applications, along with how hardware, sensors, and code interface with one another.
What's next for Auto-safety for the Autonomous
Next, we would like to explore how this same approach can be applied in a much more difficult context, such as a quadcopter.