Cozmo Depth Map
monocular absolute depth estimation using robotics x deep learning
[Codebase] [Presentation Slides]
Partner: Akshath Burra
In my senior year I took Cognitive Robotics, a course in which you program Cozmo, a robot with a camera sensor. Our goal was to estimate the absolute depth of every pixel seen by the camera.
Implementation
Thanks to CMU, Cozmo also had access to a ~8 GB GPU. For our final project, my partner Akshath and I decided to use MiDaS, a monocular depth model, to predict depth for every frame that Cozmo sees.
MiDaS only predicts relative depth, so its depth map is not grounded in real-world distances. However, when Cozmo sees a light cube, a special object with an ArUco marker, he knows how far away that cube is. Using light cubes as a sparse depth signal, we calculate an optimal scaling factor and multiply the relative MiDaS depth map by it, which gives an accurate, real-world-scaled depth map of the image that can be queried at any pixel. Feel free to look at the slides linked above for a full explanation and proof of optimality for our scaling factor!
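To make the scaling step concrete, here is a minimal sketch of one way such a factor could be computed. It assumes "optimal" means least-squares optimal, i.e. the scale s that minimizes the sum of (s * r_i - d_i)^2 over the cube pixels, where r_i is the relative depth at a cube and d_i is its known distance. The slides may use a different criterion, so treat this as an illustration rather than our exact code. The function names, the toy depth map, and the millimeter units are all made up for the example.

```python
def optimal_scale(relative_depths, known_distances):
    """Least-squares scale s minimizing sum((s * r_i - d_i) ** 2).

    Setting the derivative to zero gives s = sum(r_i * d_i) / sum(r_i ** 2).
    """
    if len(relative_depths) != len(known_distances):
        raise ValueError("need one known distance per relative depth sample")
    denominator = sum(r * r for r in relative_depths)
    if denominator == 0:
        raise ValueError("relative depths are all zero; scale is undefined")
    numerator = sum(r * d for r, d in zip(relative_depths, known_distances))
    return numerator / denominator


def scale_depth_map(relative_map, scale):
    """Multiply every pixel of a relative depth map (list of rows) by scale."""
    return [[scale * value for value in row] for row in relative_map]


# Toy 2x2 relative depth map standing in for a model prediction (hypothetical values).
relative_map = [
    [0.5, 1.0],
    [1.5, 2.0],
]

# Hypothetical sparse signal: (row, col) pixels where light cubes were seen,
# and the distance to each cube in millimeters.
cube_pixels = [(0, 1), (1, 1)]
cube_distances_mm = [200.0, 400.0]

samples = [relative_map[row][col] for row, col in cube_pixels]
scale = optimal_scale(samples, cube_distances_mm)
absolute_map = scale_depth_map(relative_map, scale)

# Query the scaled map at any pixel, including ones with no cube.
print(scale)
print(absolute_map[1][0])
```

With more cubes in view, each one adds a term to both sums, so a single noisy cube measurement has less influence on the final scale.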
Demo