Deep learning based multi-modal image analysis for enhanced situation awareness and environmental perception

UBC Theses and Dissertations

Featured Collection

UBC Theses and Dissertations

Deep learning based multi-modal image analysis for enhanced situation awareness and environmental perception Liu, Shuo

Abstract

Situation awareness (SA) plays an important role in surveillance and security applications. In a SA system, the perception capability is fundamental and essential. However, the accurate perception of the key elements and events in the complex and dynamic environment is a challenging task, due to the target scale variations, complex backgrounds, and poor illumination conditions. The research presented in this thesis aims to address these challenges by developing deep learning approaches for enhanced environmental perception. Firstly, a deep learning based multi-modal image fusion method is proposed for the automatic target detection task, in which the information from visible, thermal, and temporal images is fused with a multi-channel convolutional neural network (CNN). Compared with the conventional multi-modal image fusion methods, the fusion strategy in the proposed method can be learned automatically at the training stage rather than by a hand-craft design. Secondly, the deep multi-modal image fusion (DMIF) is further improved in a new framework, where the multi-modal image fusion, region proposal, and region-wise classification modules are integrated into an end-to-end neural network. Thus, it becomes more efficient to train and optimize the neural network. In addition, a deeper CNN is also implemented. The comprehensive experiments demonstrate the proposed DMIF can successfully address the challenges that arise from target scale variations and complex backgrounds in a dynamic environment. Finally, to enhance the environmental perception at dark night, a deep learning based thermal image translation (namely IR2VI) method is presented. As the visible camera usually does not work at the dark night without sufficient illumination, the multi-modal image fusion methods for context enhancement will not function properly in this specific situation. The proposed IR2VI method is able to translate the nighttime thermal images to the daytime human favorable visible image. Experimental results show the superiority of the IR2VI over the state of the arts.

Item Metadata

Title	Deep learning based multi-modal image analysis for enhanced situation awareness and environmental perception
Creator	Liu, Shuo
Publisher	University of British Columbia
Date Issued	2018
Description	Situation awareness (SA) plays an important role in surveillance and security applications. In a SA system, the perception capability is fundamental and essential. However, the accurate perception of the key elements and events in the complex and dynamic environment is a challenging task, due to the target scale variations, complex backgrounds, and poor illumination conditions. The research presented in this thesis aims to address these challenges by developing deep learning approaches for enhanced environmental perception. Firstly, a deep learning based multi-modal image fusion method is proposed for the automatic target detection task, in which the information from visible, thermal, and temporal images is fused with a multi-channel convolutional neural network (CNN). Compared with the conventional multi-modal image fusion methods, the fusion strategy in the proposed method can be learned automatically at the training stage rather than by a hand-craft design. Secondly, the deep multi-modal image fusion (DMIF) is further improved in a new framework, where the multi-modal image fusion, region proposal, and region-wise classification modules are integrated into an end-to-end neural network. Thus, it becomes more efficient to train and optimize the neural network. In addition, a deeper CNN is also implemented. The comprehensive experiments demonstrate the proposed DMIF can successfully address the challenges that arise from target scale variations and complex backgrounds in a dynamic environment. Finally, to enhance the environmental perception at dark night, a deep learning based thermal image translation (namely IR2VI) method is presented. As the visible camera usually does not work at the dark night without sufficient illumination, the multi-modal image fusion methods for context enhancement will not function properly in this specific situation. The proposed IR2VI method is able to translate the nighttime thermal images to the daytime human favorable visible image. Experimental results show the superiority of the IR2VI over the state of the arts.
Genre	Thesis/Dissertation
Type	Text
Language	eng
Date Available	2018-08-31
Provider	Vancouver : University of British Columbia Library
Rights	Attribution-NonCommercial-NoDerivatives 4.0 International
DOI	10.14288/1.0371852
URI	http://hdl.handle.net/2429/67036
Degree	Master of Applied Science - MASc
Program	Electrical Engineering
Affiliation	Applied Science, Faculty of; Engineering, School of (Okanagan)
Degree Grantor	University of British Columbia
Graduation Date	2018-09
Campus	UBCO
Scholarly Level	Graduate
Rights URI	http://creativecommons.org/licenses/by-nc-nd/4.0/
Aggregated Source Repository	DSpace

Open Collections

UBC Theses and Dissertations

UBC Theses and Dissertations

Deep learning based multi-modal image analysis for enhanced situation awareness and environmental perception Liu, Shuo

Abstract

Item Metadata

Item Media

Item Citations and Data

Rights