- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Faculty Research and Publications /
- HalluciNet-ing Spatiotemporal Representations Using...
Open Collections
UBC Faculty Research and Publications
HalluciNet-ing Spatiotemporal Representations Using a 2D-CNN Parmar, Paritosh; Morris, Brendan
Abstract
Spatiotemporal representations learned using 3D convolutional neural networks (CNN) are currently used in state-of-the-art approaches for action-related tasks. However, 3D-CNN are notorious for being memory and compute resource intensive as compared with more simple 2D-CNN architectures. We propose to hallucinate spatiotemporal representations from a 3D-CNN teacher with a 2D-CNN student. By requiring the 2D-CNN to predict the future and intuit upcoming activity, it is encouraged to gain a deeper understanding of actions and how they evolve. The hallucination task is treated as an auxiliary task, which can be used with any other action-related task in a multitask learning setting. Thorough experimental evaluation, it is shown that the hallucination task indeed helps improve performance on action recognition, action quality assessment, and dynamic scene recognition tasks. From a practical standpoint, being able to hallucinate spatiotemporal representations without an actual 3D-CNN can enable deployment in resource-constrained scenarios, such as with limited computing power and/or lower bandwidth. We also observed that our hallucination task has utility not only during the training phase, but also during the pre-training phase.
Item Metadata
| Title |
HalluciNet-ing Spatiotemporal Representations Using a 2D-CNN
|
| Creator | |
| Publisher |
Multidisciplinary Digital Publishing Institute
|
| Date Issued |
2021-09-08
|
| Description |
Spatiotemporal representations learned using 3D convolutional neural networks (CNN) are currently used in state-of-the-art approaches for action-related tasks. However, 3D-CNN are notorious for being memory and compute resource intensive as compared with more simple 2D-CNN architectures. We propose to hallucinate spatiotemporal representations from a 3D-CNN teacher with a 2D-CNN student. By requiring the 2D-CNN to predict the future and intuit upcoming activity, it is encouraged to gain a deeper understanding of actions and how they evolve. The hallucination task is treated as an auxiliary task, which can be used with any other action-related task in a multitask learning setting. Thorough experimental evaluation, it is shown that the hallucination task indeed helps improve performance on action recognition, action quality assessment, and dynamic scene recognition tasks. From a practical standpoint, being able to hallucinate spatiotemporal representations without an actual 3D-CNN can enable deployment in resource-constrained scenarios, such as with limited computing power and/or lower bandwidth. We also observed that our hallucination task has utility not only during the training phase, but also during the pre-training phase.
|
| Subject | |
| Genre | |
| Type | |
| Language |
eng
|
| Date Available |
2021-10-12
|
| Provider |
Vancouver : University of British Columbia Library
|
| Rights |
CC BY 4.0
|
| DOI |
10.14288/1.0402498
|
| URI | |
| Affiliation | |
| Citation |
Signals 2 (3): 604-618 (2021)
|
| Publisher DOI |
10.3390/signals2030037
|
| Peer Review Status |
Reviewed
|
| Scholarly Level |
Faculty
|
| Rights URI | |
| Aggregated Source Repository |
DSpace
|
Item Media
Item Citations and Data
Rights
CC BY 4.0