UBC Theses and Dissertations
Object volume reconstruction and pose estimation from monocular video
Nasiri Avanaki, Alireza
Acquiring a three-dimensional perception of an object or a scene from regular (single-camera, 2-D) video is a trivial task for humans. Automating this task has been, and still is, one of the major problems of computer vision. The new approach introduced in this thesis focuses on volume reconstruction of an object from image sequences taken by a single camera. One of the numerous applications of this approach is 3-D object tracking in video, which can be used in very low bit-rate customized video transmission schemes. A multi-objective pose estimation method is introduced that computes the relative pose of the object between two input frames. One advantage of this method is that it does not use any feature points, so it does not suffer from problems with feature point detection and tracking. The method also does not assume any model for the object at the outset, so it can be applied to an arbitrary object. The method does, however, require a depth-map, which is not readily available from an image sequence. To overcome this requirement, an iterative scheme is employed. The first round of pose estimation between consecutive frames is performed assuming flat depth-maps. Pose estimates are then adjusted to reduce the error by maximizing a novel quality factor for shape-from-silhouette volume reconstruction. Shape-from-silhouette is applied to construct a 3-D model (volume), which provides depth-maps for the next round of pose estimation. The feedback loop terminates when the pose estimates change little compared with those produced by the previous iteration. Based on our theoretical study of the proposed system, a test of convergence to a given set of poses is devised. To handle input sequences with unknown frame order, the input sequence undergoes a preprocessing stage in which the frames are re-ordered to obtain the most accurate pose estimation.
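The control flow of the iterative scheme can be sketched roughly as follows. This is only an illustrative skeleton, not the thesis implementation: the four callables (`estimate_pose`, `adjust_poses`, `build_volume`, `depth_from_volume`), the representation of a pose as a NumPy vector, and the parameters `tol` and `max_iters` are all hypothetical placeholders for the stages named in the abstract.

```python
import numpy as np


def iterate_pose_and_volume(frames, estimate_pose, adjust_poses,
                            build_volume, depth_from_volume,
                            tol=1e-3, max_iters=20):
    """Alternate pose estimation and volume reconstruction until poses settle.

    All callables are hypothetical stand-ins for the stages in the abstract:
      estimate_pose(frame_a, frame_b, depth_a) -> pose vector (np.ndarray)
      adjust_poses(poses, frames)              -> adjusted list of poses
      build_volume(frames, poses)              -> volume (any object)
      depth_from_volume(volume, poses, i)      -> depth map for frame i
    """
    # The first round assumes flat depth maps (all zeros here, an assumption).
    depth_maps = [np.zeros(f.shape[:2]) for f in frames]
    previous = None
    for _ in range(max_iters):
        # Relative pose between each pair of consecutive frames.
        poses = [estimate_pose(frames[i], frames[i + 1], depth_maps[i])
                 for i in range(len(frames) - 1)]
        # Adjustment step (in the thesis, driven by a reconstruction quality factor).
        poses = adjust_poses(poses, frames)
        volume = build_volume(frames, poses)
        # Stop when the poses barely changed since the previous iteration.
        if previous is not None:
            change = max(float(np.max(np.abs(p - q)))
                         for p, q in zip(poses, previous))
            if change < tol:
                break
        # The reconstructed volume supplies depth maps for the next round.
        depth_maps = [depth_from_volume(volume, poses, i)
                      for i in range(len(frames))]
        previous = poses
    return poses, volume
```

The stopping rule here is a simple maximum absolute change between iterations; the thesis may use a different measure of how much the pose estimates have changed.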
A theoretical validity criterion for volume reconstruction by shape-from-silhouette is established. This criterion is used to produce a volume reconstruction quality factor, which plays an important role in pose estimation adjustment. The reliable performance of our system is demonstrated through several simulations carried out on both synthetic and real image sequences. The effects of pose sampling rate, distribution of pose samples, and error in input pose on the quality of shape-from-silhouette volume reconstruction are studied. It is shown that high levels of pose error cannot be compensated for by an increase in pose sampling rate, and that volume reconstruction at high pose sampling rates is more sensitive to pose error.
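To make the role of pose error in shape-from-silhouette concrete, here is a minimal voxel-carving sketch. It rests on several simplifying assumptions that do not come from the thesis: orthographic projection, rotation about a single vertical axis, a synthetic box as the object, and an alternating pose error of ±0.5 rad. The `recall` measure (the fraction of true object voxels that survive carving) is a hypothetical stand-in and is not the quality factor proposed in the thesis.

```python
import numpy as np


def make_grid(n=48, half=1.0):
    """Return an (n**3, 3) array of voxel centres in [-half, half]^3."""
    axis = np.linspace(-half, half, n)
    x, y, z = np.meshgrid(axis, axis, axis, indexing="ij")
    return np.stack([x.ravel(), y.ravel(), z.ravel()], axis=1)


def project(points, theta, n_pix=64, half=1.5):
    """Orthographic projection after rotating by theta about the y axis."""
    c, s = np.cos(theta), np.sin(theta)
    u = c * points[:, 0] + s * points[:, 2]
    v = points[:, 1]
    uv = np.stack([u, v], axis=1)
    pix = np.floor((uv + half) / (2 * half) * n_pix).astype(int)
    return np.clip(pix, 0, n_pix - 1)


def silhouette(points, occupied, theta, n_pix=64):
    """Binary silhouette of the occupied voxels seen at angle theta."""
    pix = project(points, theta, n_pix)
    img = np.zeros((n_pix, n_pix), dtype=bool)
    img[pix[occupied, 0], pix[occupied, 1]] = True
    return img


def carve(points, silhouettes, thetas, n_pix=64):
    """Keep only voxels that fall inside every silhouette (the visual hull)."""
    keep = np.ones(len(points), dtype=bool)
    for img, theta in zip(silhouettes, thetas):
        pix = project(points, theta, n_pix)
        keep &= img[pix[:, 0], pix[:, 1]]
    return keep


def recall(hull, occupied):
    """Fraction of true object voxels that survive carving."""
    return (hull & occupied).sum() / occupied.sum()


points = make_grid()
# Synthetic elongated box, so that rotation errors are visible in the result.
box = ((np.abs(points[:, 0]) <= 0.8)
       & (np.abs(points[:, 1]) <= 0.2)
       & (np.abs(points[:, 2]) <= 0.2))

true_thetas = np.linspace(0.0, 2 * np.pi, 8, endpoint=False)
sils = [silhouette(points, box, t) for t in true_thetas]

# Carving with the exact poses never removes a true object voxel.
hull_exact = carve(points, sils, true_thetas)

# Carving with erroneous poses (alternating +/-0.5 rad, an arbitrary choice).
pose_error = 0.5 * (-1.0) ** np.arange(len(true_thetas))
hull_noisy = carve(points, sils, true_thetas + pose_error)

print("recall with exact poses:", recall(hull_exact, box))
print("recall with pose error: ", recall(hull_noisy, box))
```

With exact poses the carved hull contains the whole object, whereas carving against the same silhouettes with erroneous poses removes part of it. Varying the number of views and the size of `pose_error` in this toy setup is one way to explore the kind of trade-off between pose sampling rate and pose error described above.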