UBC Theses and Dissertations
Using unlabeled 3D motion examples for human activity understanding Gupta, Ankur
We demonstrate how a large collection of unlabeled motion examples can help us in understanding human activities in a video. Recognizing human activity in monocular videos is a central problem in computer vision with wide-ranging applications in robotics, sports analysis, and healthcare. Obtaining annotated data to learn from videos in a supervised manner is tedious, time-consuming, and not scalable to a large number of human actions. To address these issues, we propose an unsupervised, data-driven approach that only relies on 3d motion examples in the form of human motion capture sequences. The first part of the thesis deals with adding view-invariance to the standard action recognition task, i.e., identifying the class of activity given a short video sequence. We learn a view-invariant representation of human motion from 3d examples by generating synthetic features. We demonstrate the effectiveness of our method on a standard dataset with results competitive to the state of the art. Next, we focus on the problem of 3d pose estimation in realistic videos. We present a non-parametric approach that does not rely on a motion model built for a specific action. Thus, our method can deal with video sequences featuring multiple actions. We test our 3d pose estimation pipeline on a challenging professional basketball sequence.
Item Citations and Data
Attribution-NonCommercial-NoDerivatives 4.0 International