This page lists some of the research projects that I am currently working on or have worked on in the past.
ASPN (All Source Positioning and Navigation) This project seeks to enable low-cost, robust, and seamless navigation solutions for users on any operational platform and in any environment. We collaborate with academic researchers at Georgia Institute of Technology and MIT, and with government agencies. We are currently developing the real-time navigation algorithms and systems needed for rapid integration and reconfiguration of any combination of sensors, using approaches based on factor graphs. In our framework, each sensor measurement is encoded as a factor. This provides an adaptable and flexible foundation for any plug-and-play sensor. So far we have tested our system with 19 different sensor types (57 sensors in total, since each sensor type includes multiple kinds of sensors) mounted on dismount, ground, and aerial platforms... [more] |
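To illustrate the idea of encoding each sensor measurement as a factor, here is a minimal Python sketch. It is not the ASPN implementation: it assumes a 1D position, linear measurements, and a batch least-squares solve, whereas a real system would use nonlinear factors and an incremental solver. The names `Factor` and `solve` are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass


@dataclass
class Factor:
    """One measurement: sum(coeffs[i] * x[i]) should equal value, with noise sigma."""
    coeffs: dict   # variable index -> coefficient
    value: float
    sigma: float


def solve(factors, n_vars):
    """Weighted linear least squares over all factors (normal equations)."""
    A = [[0.0] * n_vars for _ in range(n_vars)]
    b = [0.0] * n_vars
    for f in factors:
        w = 1.0 / (f.sigma ** 2)          # information weight of this factor
        for i, ci in f.coeffs.items():
            b[i] += w * ci * f.value
            for j, cj in f.coeffs.items():
                A[i][j] += w * ci * cj
    # Gaussian elimination with partial pivoting.
    for col in range(n_vars):
        pivot = max(range(col, n_vars), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n_vars):
            m = A[r][col] / A[col][col]
            for c in range(col, n_vars):
                A[r][c] -= m * A[col][c]
            b[r] -= m * b[col]
    # Back substitution.
    x = [0.0] * n_vars
    for r in reversed(range(n_vars)):
        s = sum(A[r][c] * x[c] for c in range(r + 1, n_vars))
        x[r] = (b[r] - s) / A[r][r]
    return x


# Three positions x0, x1, x2 linked by a prior and two odometry factors.
factors = [
    Factor({0: 1.0}, 0.0, 0.1),             # prior: x0 = 0
    Factor({0: -1.0, 1: 1.0}, 1.0, 0.1),    # odometry: x1 - x0 = 1
    Factor({1: -1.0, 2: 1.0}, 1.0, 0.1),    # odometry: x2 - x1 = 1
]
estimate = solve(factors, 3)

# "Plugging in" a new sensor only means appending another factor.
factors.append(Factor({2: 1.0}, 2.3, 0.1))  # absolute position fix on x2
fused = solve(factors, 3)
```

The point of the sketch is the last two lines: adding a sensor changes the list of factors, not the solver, which is what makes the representation convenient for reconfigurable sensor sets.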
Vision-Aided Navigation for Large-Area Augmented Reality We develop a real-time augmented reality training/gaming system using head mounted displays (HMDs) to fulfill the requirements of the FITE-JCTD project supported by JFCOM/ONR. The core component of this system is a novel approach we created to provide stable real-time pose estimation for users over large areas. The pose tracking (3D head orientation and 3D head location) is achieved using only sensors (video cameras and an inertial measurement unit) mounted on the individual users. Users wearing the system can interact with virtual actors inserted into real scenes using head mounted displays...[more] |
Tracking in GPS-Denied and Vision-Impaired Environments We develop a real-time system that tracks the user's 6-degree-of-freedom head pose in GPS-denied and vision-impaired environments (such as smoky scenes) by incorporating multiple low-cost sensors (cameras, IMUs, and range radios) mounted on the user. The system is based on an error-state Kalman filter algorithm that we propose, which fuses local measurements from visual odometry, global measurements from landmark matching against a pre-built visual landmark database, and ranging measurements from either static or dynamic ranging radios...
[more] |
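The fusion pattern described above (local motion increments, global position fixes, and range measurements) can be sketched with a toy error-state filter in Python. This is an illustrative 1D example under strong assumptions, not the filter from the project: the state is a single position, the class `ErrorStateFilter1D` and its method names are hypothetical, and all noise values are made up.

```python
class ErrorStateFilter1D:
    """Toy 1D error-state Kalman filter (hypothetical, for illustration only).

    The nominal position is propagated with odometry increments; the filter
    tracks only the variance of the error around that nominal value.
    """

    def __init__(self, x0, p0):
        self.x = x0   # nominal position
        self.p = p0   # variance of the position error

    def predict(self, delta, q):
        # Local measurement (e.g. an odometry increment) moves the nominal
        # state; the error variance grows by the process noise q.
        self.x += delta
        self.p += q

    def _update(self, innovation, h, r):
        s = h * self.p * h + r         # innovation variance
        k = self.p * h / s             # Kalman gain
        self.x += k * innovation       # inject estimated error, then reset it
        self.p = (1.0 - k * h) * self.p

    def update_position(self, z, r):
        # Global measurement (e.g. a position fix from a landmark match).
        self._update(z - self.x, 1.0, r)

    def update_range(self, z, beacon, r):
        # Range to a radio at a known 1D position; h is d|x - beacon|/dx.
        predicted = abs(self.x - beacon)
        h = 1.0 if self.x >= beacon else -1.0
        self._update(z - predicted, h, r)


f = ErrorStateFilter1D(x0=0.0, p0=1.0)
f.predict(delta=1.0, q=0.01)                 # odometry step
p_after_predict = f.p
f.update_position(z=1.2, r=0.25)             # global position fix
p_after_fix = f.p
f.update_range(z=8.9, beacon=10.0, r=0.04)   # range to a static radio
```

Each update shrinks the error variance `f.p`, while the prediction step grows it; a full 6-DOF filter follows the same predict/update structure with vector states and Jacobians in place of the scalar `h`.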
Reconstructing 3D Objects for Robot Manipulation We propose an approach to automatically reconstruct "entire" 3D objects from a single 2D image by using prior 3D shape models of object classes. The prior 3D shape models, defined as a collection of oriented primitive shapes centered at fixed 3D positions, can be learned from a few labeled images for each class. The 3D class model can then be used to estimate the 3D shape of an object instance, including occluded parts, from a single image. The reconstructed 3D shape is sufficiently accurate for a robot to estimate the pose of an object and successfully grasp it, even in situations where the part to be grasped is not visible in the input image...[more] |
Automatic Object Pop-up We construct a system that uses prior 3D class models (3D Potemkin models) to automatically reconstruct 3D objects from a single image. To achieve complete automation of the reconstruction process, we describe an approach involving several steps: detection, segmentation, part registration, and 3D object creation. We can then use the resulting 3D object to construct realistic 3D "pop-up" models from photos...
[more] |
Constructing 3D Class Models We develop the 3D Potemkin model, which can be learned from a small set of labeled views of an object, to represent the target object class. The learned 3D Potemkin model can be used to enable existing detection systems to reconstruct the 3D shapes of detected objects. Informally, the 3D Potemkin (3DP) model can be viewed as a collection of 3D planar shapes, one for each part, which are arranged in three dimensions. This model can also be viewed as a simplification of a detailed 3D model using a small set of 3D planar polygons...
[more] |
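The informal description of the model as "a collection of 3D planar shapes, one for each part" can be pictured with a small Python sketch. This is only an illustration of that data structure, not the learned 3D Potemkin model: the `PlanarPart` class, the hand-written box-like parts, and the orthographic projection are all assumptions made for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class PlanarPart:
    """One object part, approximated by a planar polygon placed in 3D."""
    name: str
    vertices: list   # list of (x, y, z) polygon corners


def rotate_y(point, angle):
    """Rotate a 3D point about the vertical (y) axis."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x + s * z, y, -s * x + c * z)


def project(parts, azimuth):
    """Orthographic 2D view of every part after rotating the whole model."""
    views = {}
    for part in parts:
        rotated = [rotate_y(v, azimuth) for v in part.vertices]
        views[part.name] = [(x, y) for x, y, _ in rotated]
    return views


def width(points):
    """Horizontal extent of a projected polygon."""
    xs = [x for x, _ in points]
    return max(xs) - min(xs)


# A hand-made two-part model: a side panel and a front panel of a box.
parts = [
    PlanarPart("side", [(0, 0, 0), (2, 0, 0), (2, 1, 0), (0, 1, 0)]),
    PlanarPart("front", [(2, 0, 0), (2, 1, 0), (2, 1, -1), (2, 0, -1)]),
]

view_a = project(parts, 0.0)           # looking straight at the side panel
view_b = project(parts, math.pi / 2)   # model turned by 90 degrees
```

In `view_a` the side panel is seen at full width and the front panel collapses to a line; in `view_b` the roles swap. Because each part keeps its own plane, a single model can predict how the parts appear from different viewpoints.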
Learning to Generate Novel Views of Objects for Class Recognition Multi-view object class recognition can be achieved using existing approaches for single-view object class recognition, by treating different views as entirely independent classes. This strategy requires a large amount of training data for many viewpoints, which can be costly to obtain. We describe a method for constructing a model from as few as two views of an object of the target class, and using that model to transform images of objects from one view to several other views, effectively multiplying their value for class recognition. These transformed images are "virtual training examples" of previously seen objects from novel views...
[more] |
Matching Interest Points Using Affine Invariant Concentric Circles We present a new method for reliable matching between different images. This method finds complete region correspondences between concentric circles and the corresponding projected ellipses centered on interest points. It matches interest points by exploiting all the available luminance information in the regions under affine transformation.
[paper] |