Visual Object Perception in Unstructured Environments

AUTHOR

Changhyun Choi

Thesis Committee

Prof. Henrik I. Christensen (Advisor), School of Interactive Computing, College of Computing

Prof. James M. Rehg, School of Interactive Computing, College of Computing

Prof. Irfan Essa, School of Interactive Computing, College of Computing

Prof. Anthony Yezzi, School of Electrical & Computer Engineering, College of Engineering

Prof. Dieter Fox, Department of Computer Science & Engineering, University of Washington

SUMMARY

As robotic systems move from well-controlled settings to increasingly unstructured environments, they are required to operate in highly dynamic and cluttered scenarios. Finding an object, estimating its pose, and tracking its pose over time within such scenarios are challenging problems. Although various approaches have been developed to tackle these problems, the scope of objects addressed and the robustness of solutions remain limited. In this thesis, we target robust object perception in unstructured environments using visual sensory information, which spans from the traditional monocular camera to the more recently emerged RGB-D sensor. Toward this goal, we address four critical challenges to robust 6-DOF object pose estimation and tracking that current state-of-the-art approaches have, as yet, failed to solve.

The first challenge is how to increase the scope of objects by allowing visual perception to handle both textured and textureless objects. A large number of 3D object models are widely available in online object model databases, and these object models provide significant prior information including geometric shapes and photometric appearances. We note that using both geometric and photometric attributes available from these models enables us to handle both textured and textureless objects. This thesis presents our efforts to broaden the spectrum of objects to be handled by combining geometric and photometric features.

The second challenge is how to dependably estimate and track the pose of an object despite background clutter. Difficulties in object perception rise with the degree of clutter. Background clutter is likely to lead to false measurements, and false measurements tend to result in inaccurate pose estimates. To tackle significant clutter in backgrounds, we present two multiple-pose-hypothesis frameworks: a particle filtering framework for tracking and a voting framework for pose estimation.
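To make the idea of maintaining multiple pose hypotheses concrete, the following is a minimal sketch of a generic bootstrap particle filter on a one-dimensional state. It is not the thesis's 6-DOF tracker and does not model clutter; the function names, the random-walk motion model, the Gaussian likelihood, and all noise values are assumptions chosen for illustration only.

```python
import math
import random


def particle_filter_step(particles, weights, measurement, rng,
                         motion_noise=0.05, meas_noise=0.2):
    """One predict/update/resample cycle of a bootstrap particle filter.

    All parameter names and noise values are illustrative assumptions.
    """
    # Predict: diffuse each hypothesis with a random-walk motion model.
    predicted = [p + rng.gauss(0.0, motion_noise) for p in particles]

    # Update: reweight each hypothesis by a Gaussian measurement likelihood.
    likelihoods = [math.exp(-0.5 * ((measurement - p) / meas_noise) ** 2)
                   for p in predicted]
    new_weights = [w * l for w, l in zip(weights, likelihoods)]
    total = sum(new_weights)
    if total == 0.0:
        # Every hypothesis was rejected; fall back to uniform weights.
        new_weights = [1.0 / len(predicted)] * len(predicted)
    else:
        new_weights = [w / total for w in new_weights]

    # Resample: draw hypotheses in proportion to their weights.
    resampled = rng.choices(predicted, weights=new_weights, k=len(predicted))
    uniform = [1.0 / len(resampled)] * len(resampled)
    return resampled, uniform


def estimate(particles, weights):
    """Weighted mean of the hypotheses."""
    return sum(p * w for p, w in zip(particles, weights))


# Toy run: track a static state of 0.5 from noisy scalar measurements.
rng = random.Random(0)
particles = [rng.uniform(-1.0, 1.0) for _ in range(500)]
weights = [1.0 / len(particles)] * len(particles)
true_state = 0.5
for _ in range(20):
    z = true_state + rng.gauss(0.0, 0.2)
    particles, weights = particle_filter_step(particles, weights, z, rng)
pose_estimate = estimate(particles, weights)
```

Because many hypotheses survive each step, a single false measurement shifts the weights without immediately discarding the correct hypothesis, which is the property that makes this family of filters attractive under clutter.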

Handling of object discontinuities during tracking, such as severe occlusions, disappearances, and blurring, presents another important challenge. In an ideal scenario, a tracked object is visible throughout the entirety of tracking. However, when an object happens to be occluded by other objects or disappears due to the motions of the object or the camera, difficulties ensue. Because the continuous tracking of an object is critical to robotic manipulation, we propose to devise a method to measure tracking quality and to re-initialize tracking as necessary.
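As a rough illustration of measuring tracking quality and triggering re-initialization, the sketch below flags a tracker for re-initialization when a quality score stays low for several consecutive frames. The quality score used here (the mean of per-hypothesis likelihoods), the `TrackingMonitor` class, and its threshold and patience values are all hypothetical; the summary does not specify which measure the thesis uses.

```python
class TrackingMonitor:
    """Flags re-initialization after sustained low tracking quality.

    The quality measure and default values are illustrative assumptions.
    """

    def __init__(self, quality_threshold=0.2, patience=3):
        self.quality_threshold = quality_threshold
        self.patience = patience
        self.low_count = 0  # consecutive low-quality frames seen so far

    def update(self, likelihoods):
        """Return True when tracking should be re-initialized."""
        # Quality: mean unnormalized likelihood over all hypotheses.
        quality = sum(likelihoods) / len(likelihoods) if likelihoods else 0.0
        if quality < self.quality_threshold:
            self.low_count += 1
        else:
            self.low_count = 0
        if self.low_count >= self.patience:
            self.low_count = 0  # reset after requesting re-initialization
            return True
        return False


# Toy run: a brief dip is tolerated, a sustained dip triggers a reset.
monitor = TrackingMonitor(quality_threshold=0.2, patience=3)
good, bad = [0.8, 0.6], [0.05, 0.1]
decisions = [monitor.update(frame)
             for frame in [good, bad, bad, good, bad, bad, bad]]
```

Requiring several consecutive low-quality frames is one way to avoid restarting on a momentary blur while still reacting to a prolonged occlusion or disappearance.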

The final challenge we address is performing these tasks within real-time constraints. Our particle filtering and voting frameworks, while time-consuming, are composed of repetitive, simple and independent computations. Inspired by that observation, we propose to run massively parallelized frameworks on a GPU for those robotic perception tasks which must operate within strict time constraints.

Dissertation

Changhyun Choi, “Visual Object Perception in Unstructured Environments,” Robotics Ph.D., School of Interactive Computing, College of Computing, Georgia Institute of Technology, Dec. 2014. [ pdf (51 MB) | pdf with color href (51 MB) | slides (64 MB) ]

Acknowledgments

First of all, I would like to thank my advisor, Henrik Christensen, for his kind advice and thoughtful guidance. I appreciate that he believed in me all along, even when I lacked confidence in my own strength. I am also grateful to my doctoral thesis committee, Jim Rehg, Irfan Essa, Anthony Yezzi, and Dieter Fox, for their constructive comments and priceless suggestions. Thanks to you all, my thesis has been much improved since my proposal draft.

Meeting kind friends in a strange foreign place has been a valuable present. I would like to thank my CogRob lab mates: Alex Trevor, John Rogers, Jake Huckaby, Carlos Nieto, Akansel Cosgun, Sasha Lambert, Rahul Sawhney, Kimoon Lee, Sungtae An, Siddharth Choudhary, Pushkar Kolhe, and Jayasree Kumar. I sincerely thank Heni Ben Amor for giving valuable comments on both my thesis and my prospective career. It was also a precious experience to study with brilliant Robotics Ph.D. students, such as Richard Roberts, Yong-Dian Jian, Baris Akgun, Duy-Nguyen Ta, Tucker Hermans, Misha Novitzky, Neil Dantam, Maya Cakmak, and Crystal Chao. I would also like to thank Kathy Cheek and Nina White for their kind administrative help.

Outside of Gatech, I had the great opportunity to collaborate with amazing researchers, including Yuichi Taguchi, Ming-Yu Liu, Srikumar Ramalingam, and Oncel Tuzel at MERL as well as Ross Knepper, Andrew Spielberg, and Mehmet Dogar at MIT. Study and research funding during my Ph.D. from General Motors, Boeing Company, MERL, KFAS, and Google Summer of Code is greatly appreciated.

Finally, I am deeply indebted to my father and mother for giving me the freedom to pursue my dream and to my sister Goeun for her warm support from a distance.