We started by trying to follow the ceiling lights, which first got us from checkpoint 0 to checkpoint 1.
We then began using monocular RGB-to-depth models and had success with glpn-nyu.
We tried 2D SLAM and 2D pattern matching on a horizontal slice of the depth image. However, the depth images seemed too homogeneous for reliable pattern matching and orientation estimation.
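As a rough illustration of the horizontal-slice idea, here is a minimal sketch that turns one row of a depth image into 2D points, similar to a laser scan. It is not our actual code: the pinhole camera model, the `hfov_deg` field-of-view parameter, and the function names `depth_row_to_points` and `depth_spread` are all assumptions made for the example, and the depth values are treated as metric.

```python
import math


def depth_row_to_points(depth_row, hfov_deg):
    """Convert one row of depth values into (x, z) points in the camera frame.

    Assumes a pinhole camera whose horizontal field of view is hfov_deg
    (a hypothetical parameter; the real camera intrinsics may differ).
    """
    width = len(depth_row)
    # Focal length in pixels, derived from the assumed field of view.
    fx = (width / 2.0) / math.tan(math.radians(hfov_deg) / 2.0)
    # Principal point at the center of the row.
    cx = (width - 1) / 2.0
    points = []
    for u, z in enumerate(depth_row):
        # Back-project pixel column u at depth z to a lateral offset x.
        x = (u - cx) * z / fx
        points.append((x, z))
    return points


def depth_spread(depth_row):
    """Range of depth values in the row; a small spread means little structure to match."""
    return max(depth_row) - min(depth_row)


# Example: a nearly flat slice, like the homogeneous depth images we ran into.
row = [2.0, 2.0, 2.1, 2.0, 2.0]
scan = depth_row_to_points(row, hfov_deg=90.0)
spread = depth_spread(row)
```

A slice with a small `depth_spread` gives a scan matcher very few distinctive features, which is one way to picture why matching and orientation estimation were unreliable for us.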
We pivoted to end-to-end control (a fine-tuned ResNet-50) on the depth image. This gave us semi-consistent control from checkpoint 0 to 1. We began working on checkpoint 1 to 2 and beyond but ran out of time for data collection and training. We also experimented with translating the depth image to generate more training data.
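The translation augmentation can be sketched roughly as follows. This is an illustrative version under stated assumptions, not our training pipeline: the function names `translate` and `augment`, the edge-replication padding, and the shift values are all hypothetical, and it uses plain nested lists so it runs without dependencies (a real pipeline would more likely use NumPy or PyTorch tensors). Whether the control label should change with the shift is left open here.

```python
def translate(img, dx, dy):
    """Shift a 2D depth image by (dx, dy) pixels, replicating edge values.

    img is a list of rows; positive dx shifts content right, positive dy down.
    """
    h = len(img)
    w = len(img[0])
    out = []
    for y in range(h):
        # Source row, clamped to the image so exposed borders repeat the edge.
        sy = min(max(y - dy, 0), h - 1)
        row = []
        for x in range(w):
            sx = min(max(x - dx, 0), w - 1)
            row.append(img[sy][sx])
        out.append(row)
    return out


def augment(img, shifts):
    """Return one translated copy of img per (dx, dy) pair in shifts."""
    return [translate(img, dx, dy) for dx, dy in shifts]


# Example: generate three shifted variants of a tiny depth image.
depth = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
]
variants = augment(depth, [(1, 0), (-1, 0), (0, 1)])
```

Edge replication avoids introducing zero-depth borders, which a depth-based controller could misread as a very close obstacle.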
See the videos for demos.
Built With
- commabody