UBC Theses and Dissertations
Towards automatic broadcast of team sports
Chen, Jianhui
Sports is the social glue of society: it allows people to interact with each other and appreciate games irrespective of their social status, age and ethnicity. Automatic sports broadcasting produces video streams from vision sensors without human intervention. The goal is to predict where cameras should look and which camera should be on air. The technique can benefit millions of people, as most people participate in sports by watching television or Internet broadcasts. The target team sports include basketball, soccer and ice hockey, in which players change positions quickly during play. We exclude sports such as baseball and cricket, in which players hold relatively stable positions. Automatic sports broadcasting covers areas such as statistics, commentary and camera control. We provide solutions for automatically setting camera parameters, such as camera orientation angles and locations, using computer vision. We restrict our attention to static pan-tilt-zoom (PTZ) cameras for television or live Internet broadcasting. We propose three essential components of autonomous broadcasting: camera calibration, camera planning and camera selection. By learning from human demonstrations, our work can predict camera angles for single-camera systems and camera viewpoints for multi-camera systems. We obtain human demonstrations from existing videos produced by professional camera operators. These videos implicitly encode camera angles and, when multiple cameras are used, camera IDs. Because the camera angles are not directly available, we first propose two novel camera calibration methods. We evaluate these methods against previous algorithms and find that ours are both more accurate and faster. With labeled data from human operators, we develop two methods for smooth camera planning that predict camera pan angles. The first method directly incorporates temporal consistency into a data-driven predictor. The second method optimizes the camera trajectory in overlapping temporal windows. We show that both outperform previous methods in the literature. We also propose two methods for selecting a broadcast camera view from multiple candidate camera views. The first method uses deep features for camera selection. The second method augments the training data with Internet videos. In soccer games, we demonstrate results comparable to selections made by human operators.
Attribution-NonCommercial-NoDerivatives 4.0 International