UBC Theses and Dissertations


Deep learning based multi-modal image analysis for enhanced situation awareness and environmental perception

Liu, Shuo


Situation awareness (SA) plays an important role in surveillance and security applications. In an SA system, the perception capability is fundamental. However, accurately perceiving the key elements and events in a complex and dynamic environment is challenging because of target scale variations, complex backgrounds, and poor illumination. The research presented in this thesis aims to address these challenges by developing deep learning approaches for enhanced environmental perception. Firstly, a deep learning based multi-modal image fusion method is proposed for automatic target detection, in which information from visible, thermal, and temporal images is fused with a multi-channel convolutional neural network (CNN). Unlike conventional multi-modal image fusion methods, the proposed method learns its fusion strategy automatically at the training stage rather than relying on a hand-crafted design. Secondly, the deep multi-modal image fusion (DMIF) method is further improved in a new framework, in which the multi-modal image fusion, region proposal, and region-wise classification modules are integrated into a single end-to-end neural network. This integration makes the network more efficient to train and optimize. In addition, a deeper CNN is implemented. Comprehensive experiments demonstrate that the proposed DMIF can successfully address the challenges that arise from target scale variations and complex backgrounds in a dynamic environment. Finally, to enhance environmental perception at night, a deep learning based thermal image translation method, named IR2VI, is presented. Because a visible camera usually does not work at night without sufficient illumination, multi-modal image fusion methods for context enhancement do not function properly in this situation. The proposed IR2VI method translates nighttime thermal images into daytime visible images that are easier for humans to interpret. Experimental results show that IR2VI outperforms state-of-the-art methods.
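To make the idea of a learned fusion of aligned modalities concrete, the following is a minimal sketch and not the thesis's implementation. It computes a per-pixel weighted sum over three aligned single-channel images, which is what one 1x1 convolution filter does on a three-channel input. The function name `fuse_pixelwise`, the tiny sample images, and the weight values are all hypothetical; in a multi-channel CNN the weights would be learned during training rather than fixed by hand, and the thesis's network is considerably deeper.

```python
def fuse_pixelwise(visible, thermal, temporal, weights, bias=0.0):
    """Fuse three aligned single-channel images with a per-pixel weighted sum.

    Each image is a list of rows of floats. The result equals the output of
    one 1x1 convolution filter applied to the three stacked channels.
    The weights are fixed here for illustration (hypothetical values);
    a CNN would learn them from data.
    """
    if not (len(visible) == len(thermal) == len(temporal)):
        raise ValueError("all modalities must have the same number of rows")

    w_vis, w_thr, w_tmp = weights
    fused = []
    for row_v, row_t, row_m in zip(visible, thermal, temporal):
        if not (len(row_v) == len(row_t) == len(row_m)):
            raise ValueError("all modalities must have the same row length")
        fused.append([
            w_vis * v + w_thr * t + w_tmp * m + bias
            for v, t, m in zip(row_v, row_t, row_m)
        ])
    return fused


# Hypothetical 2x2 inputs, already registered to the same pixel grid.
visible = [[0.2, 0.8], [0.4, 0.6]]
thermal = [[1.0, 0.0], [0.5, 0.5]]
temporal = [[0.0, 1.0], [0.0, 0.0]]

fused = fuse_pixelwise(visible, thermal, temporal, weights=(0.5, 0.3, 0.2))
```

A hand-crafted fusion rule would fix such weights in advance; the approach described in the abstract instead leaves them as trainable parameters so that the detection objective determines how the modalities are combined.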



Attribution-NonCommercial-NoDerivatives 4.0 International