Abstract
The recent natural disasters in Turkey and Syria shocked all of us over the past weeks. As engineers, this got us thinking about how we could help in such scenarios. Our project aims to help by building a small rover equipped with a thermal camera to identify victims in difficult conditions and send their coordinates to rescuers. We use machine learning to distinguish human heat signatures from other sources, and pose detection to let a person guide the robot. During the project we encountered many challenges across the system, including building a reliable model despite the low resolution, noise, low refresh rate, and other limitations of the thermal camera. Nonetheless, we believe that, given the technical constraints, we successfully demonstrated the potential of the system while leaving room for future development.
Hardware
Our hardware is a two-wheeled rover equipped with a 32x24 thermal array camera, a long-range IR distance sensor, a short-range IR distance sensor, a 1.8-inch TFT display, an Arduino Uno with a motor shield, a Raspberry Pi 3 B+, and an RGB LED. It was powered by six AA batteries in series, bucked down to 5V via a buck converter module (to ensure clean, stable power) to supply the Raspberry Pi over USB.
Logic
As described in the diagram below, the Atmega was responsible for driving the DC motors with PWM, controlling the LEDs, and reading the analog values from the distance sensors via the ADC and converting them to distances. The Raspberry Pi, on the other hand, was responsible for fetching the image from the camera, running the ML model on it, hosting the visualization dashboard for the user, and sending the appropriate commands to the Atmega via UART.
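To make the ADC-to-distance step concrete, here is a minimal sketch of the conversion, shown in Python for readability even though on the rover it runs as C on the Atmega. It assumes a Sharp GP2Y0A-style analog IR sensor whose output voltage falls off roughly as a power law with distance; the constants are illustrative and would come from calibrating the actual sensors.

```python
ADC_MAX = 1023  # 10-bit ADC on the Atmega
V_REF = 5.0     # ADC reference voltage in volts

def adc_to_distance_cm(raw: int, a: float = 27.86, b: float = -1.15) -> float:
    """Convert a raw 10-bit ADC reading to an approximate distance in cm.

    Assumes a Sharp GP2Y0A-style sensor; (a, b) are illustrative power-law
    fit parameters, not calibrated values from our rover.
    """
    voltage = raw * V_REF / ADC_MAX
    if voltage <= 0.0:
        return float("inf")  # no reflection detected / out of range
    return a * voltage ** b

if __name__ == "__main__":
    for raw in (100, 300, 600):
        print(f"raw={raw:4d} -> {adc_to_distance_cm(raw):6.1f} cm")
```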
On the software side, the main logic for the rover ran on the Raspberry Pi, since it has far more computing power than the Atmega and was responsible for fetching the image and running inference on it. It would then send the appropriate command to the Atmega (e.g. 'turn left', 'stop', 'get distance short') via UART and read back the response, if any; a sketch of this loop is shown below. The Atmega is thus fairly agnostic to its inputs and mostly just follows what the Pi tells it. The one exception is collision avoidance, which is managed entirely by the Atmega: it simply disobeys the Pi's command to move forward if doing so would make the rover collide. This spares the Pi from fetching the distance reading on every cycle, which would slow down the entire system and drastically increase the collision-detection response time, since fetching the image already takes most of each cycle. As a result, the Atmega checks for collisions continuously, even while the Pi is busy fetching an image or running inference, without consuming any Pi resources.
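Below is a minimal sketch of that Pi-side loop, assuming pyserial for the UART link. The exact command strings, baud rate, and newline framing are assumptions, and capture_frame() and predict_command() are hypothetical placeholders for the thermal camera driver and the ML model.

```python
# Sketch of the Pi-side control loop (assumptions noted above).
import serial

def capture_frame():
    """Placeholder: fetch one 32x24 frame from the thermal camera."""
    return None

def predict_command(frame) -> str:
    """Placeholder: run the pose model and map its output to a command."""
    return "forward"

def main():
    with serial.Serial("/dev/serial0", 115200, timeout=0.5) as uart:
        while True:
            frame = capture_frame()           # the slowest step in the loop
            command = predict_command(frame)  # e.g. 'turn left', 'stop'
            uart.write((command + "\n").encode())
            # The Atmega may silently refuse 'forward' if its own distance
            # sensors indicate an imminent collision, so the Pi never has to
            # block on a distance read during normal driving.
            reply = uart.readline().decode().strip()
            if reply:  # e.g. the answer to 'get distance short'
                print("atmega:", reply)

if __name__ == "__main__":
    main()
```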
Dashboard
We also built a web dashboard for the user (seen above). It is a simple web server hosted on the Pi, made primarily for data collection. Its interface allowed us to type in the label of the current image (e.g. 'left') and press a save button, which stored the image in the appropriate folder with the correct name via the API; a sketch of that API is shown below. This system let us efficiently collect more than 2000 unique images of different poses to train our model. The dashboard also controls the rover via 'RUN' and 'STOP' buttons, which start and stop the loop that runs inference and sends commands based on the captured images.
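Here is a rough sketch of what that data-collection API could look like, assuming Flask; the framework, endpoint names, and file layout are all assumptions rather than our exact implementation. The key idea is that each POST to /save files the latest frame into a folder named after its label, which is how the labeled training set is organized.

```python
# Hypothetical Flask sketch of the dashboard API (not the exact server code).
import os
import time
from flask import Flask, request

app = Flask(__name__)
DATA_DIR = "dataset"
latest_frame = b""  # placeholder: updated elsewhere by the camera loop

@app.route("/save", methods=["POST"])
def save():
    label = request.form["label"]  # e.g. 'left', 'stop'
    folder = os.path.join(DATA_DIR, label)
    os.makedirs(folder, exist_ok=True)
    name = os.path.join(folder, f"{label}_{int(time.time() * 1000)}.bin")
    with open(name, "wb") as f:
        f.write(latest_frame)      # store the raw thermal frame
    return {"saved": name}

@app.route("/run", methods=["POST"])
def run():
    # Would flip a flag that starts the inference/command loop.
    return {"running": True}

@app.route("/stop", methods=["POST"])
def stop():
    return {"running": False}

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8000)
```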
Conclusions
We are quite happy with the results we achieved and the overall system we built, despite the challenges along the way. We were able to apply a very large breadth of knowledge throughout the project, which touched on 3D modeling, circuit design, bare-metal programming, communication protocols, machine learning on the edge, and even some web development. This is reflected in the variety of programming languages used: C, Python, and JavaScript, in addition to the 3D model files.

We encountered many challenges in the stages described above. One main challenge was getting a reliable machine learning model to control our rover. Due to the camera's very low resolution, noise, and low refresh rate, we had to scale back the complexity of the movements we could detect. Originally, we would have liked multiple distinct poses to control different aspects of the rover, but these limitations greatly restricted our capabilities. For instance, the resolution was so low that an arm could be only one pixel wide in the image, which made more complex poses very difficult to identify.

Another, more hardware-related challenge was the limited AA batteries. Although we don't have exact figures, we know our system was quite power-hungry, mostly due to the Pi, so the batteries, with their high internal resistance, heated up and drained very quickly in use, even while the rover was stationary. The voltage drop across that internal resistance was also significant and potentially impaired the buck converter, which needs some headroom between its input voltage and its 5V output. As a result, we needed fresh batteries for almost every work session.

In the future, we could improve many aspects of the system, depending on our preferences. On the hardware side, we could get a better camera and add more sensors (e.g. an IMU and GPS) for a more complete system that better simulates a rescue rover. On the software side, we could work toward a more reliable model, add more poses, and implement better controls for the rover (e.g. PID control for collision distance, or Kalman filtering of encoder and IMU data for robot movement). Finally, we could improve the mechanical structure of the rover with a larger chassis, more powerful motors, and a better power source.
Built With
- c
- javascript
- python
- raspberry-pi
- tensorflow