- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Faculty Research and Publications /
- HalluciNet-ing Spatiotemporal Representations Using...
Open Collections
UBC Faculty Research and Publications
HalluciNet-ing Spatiotemporal Representations Using a 2D-CNN Parmar, Paritosh; Morris, Brendan
Abstract
Spatiotemporal representations learned using 3D convolutional neural networks (CNN) are currently used in state-of-the-art approaches for action-related tasks. However, 3D-CNN are notorious for being memory and compute resource intensive as compared with more simple 2D-CNN architectures. We propose to hallucinate spatiotemporal representations from a 3D-CNN teacher with a 2D-CNN student. By requiring the 2D-CNN to predict the future and intuit upcoming activity, it is encouraged to gain a deeper understanding of actions and how they evolve. The hallucination task is treated as an auxiliary task, which can be used with any other action-related task in a multitask learning setting. Thorough experimental evaluation, it is shown that the hallucination task indeed helps improve performance on action recognition, action quality assessment, and dynamic scene recognition tasks. From a practical standpoint, being able to hallucinate spatiotemporal representations without an actual 3D-CNN can enable deployment in resource-constrained scenarios, such as with limited computing power and/or lower bandwidth. We also observed that our hallucination task has utility not only during the training phase, but also during the pre-training phase.
Item Metadata
Title |
HalluciNet-ing Spatiotemporal Representations Using a 2D-CNN
|
Creator | |
Publisher |
Multidisciplinary Digital Publishing Institute
|
Date Issued |
2021-09-08
|
Description |
Spatiotemporal representations learned using 3D convolutional neural networks (CNN) are currently used in state-of-the-art approaches for action-related tasks. However, 3D-CNN are notorious for being memory and compute resource intensive as compared with more simple 2D-CNN architectures. We propose to hallucinate spatiotemporal representations from a 3D-CNN teacher with a 2D-CNN student. By requiring the 2D-CNN to predict the future and intuit upcoming activity, it is encouraged to gain a deeper understanding of actions and how they evolve. The hallucination task is treated as an auxiliary task, which can be used with any other action-related task in a multitask learning setting. Thorough experimental evaluation, it is shown that the hallucination task indeed helps improve performance on action recognition, action quality assessment, and dynamic scene recognition tasks. From a practical standpoint, being able to hallucinate spatiotemporal representations without an actual 3D-CNN can enable deployment in resource-constrained scenarios, such as with limited computing power and/or lower bandwidth. We also observed that our hallucination task has utility not only during the training phase, but also during the pre-training phase.
|
Subject | |
Genre | |
Type | |
Language |
eng
|
Date Available |
2021-10-12
|
Provider |
Vancouver : University of British Columbia Library
|
Rights |
CC BY 4.0
|
DOI |
10.14288/1.0402498
|
URI | |
Affiliation | |
Citation |
Signals 2 (3): 604-618 (2021)
|
Publisher DOI |
10.3390/signals2030037
|
Peer Review Status |
Reviewed
|
Scholarly Level |
Faculty
|
Rights URI | |
Aggregated Source Repository |
DSpace
|
Item Media
Item Citations and Data
Rights
CC BY 4.0