Adversarial attacks on multi-modal 3D detection models

Adversarial attacks on multi-modal 3D detection models Abdelfattah, Mazen

Abstract

Modern Autonomous Vehicles (AVs) rely on sensory data, often acquired by cameras and LiDARs, to perceive the world around them. To operate the vehicle safely and effectively, Artificial Intelligence (AI) is used to process this data and detect objects of interest around the vehicle. For 3D object detection, recent advances in deep learning have produced state-of-the-art multi-modal models that use Deep Neural Nets (DNNs) to process camera images and LiDAR point clouds. While DNN-based models are powerful and accurate, they may be vulnerable to adversarial attacks, which introduce a small change to a model’s input that can cause large errors in its output. These attacks have been heavily investigated for models that operate on camera images only, and more recently for point cloud processing models. However, they have rarely been investigated for models that use both modalities, as is often the case in modern AVs. To address this gap, we propose a realistic adversarial attack on such multi-modal 3D detection models. We place a 3D adversarial object on a vehicle with the aim of hiding this object’s host vehicle from detection by powerful multi-modal 3D detectors. The object’s shape and texture are trained so that it can prevent a specific model from detecting any host vehicle in any scene. 3D detection models are often based on either a cascaded architecture, where each input modality is processed consecutively, or a fusion architecture, where features from multiple inputs are extracted and fused simultaneously. We use our attack to study the vulnerabilities of representative models of these architectures to realistic adversarial attacks and to understand the effects of multi-modal learning on the robustness of a model. Our experiments show that a single adversarial object is capable of hiding its host vehicle 55.6% of the time from the cascaded model and 63.19% of the time from the fusion model. This vulnerability was found to be mainly due to RGB image features, which were much less robust to adversarial scene changes than the point cloud features.
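
The hiding percentages quoted in the abstract can be read as a rate over host vehicles. The sketch below shows one plausible way to compute such a rate. It assumes the rate is the fraction of host vehicles detected in the unmodified scene that are no longer detected once the adversarial object is attached; the abstract does not spell out the exact definition, so the thesis may define it differently. The function name `hiding_rate` and the sample flags are hypothetical and are not taken from the thesis.

```python
from typing import Sequence


def hiding_rate(detected_clean: Sequence[bool], detected_attacked: Sequence[bool]) -> float:
    """Fraction of host vehicles found in the clean scene that are missed under attack.

    Assumed definition, for illustration only.
    """
    if len(detected_clean) != len(detected_attacked):
        raise ValueError("inputs must have the same length")
    # Only vehicles the detector finds without the object can count as "hidden".
    eligible = [after for before, after in zip(detected_clean, detected_attacked) if before]
    if not eligible:
        raise ValueError("no host vehicle was detected in the clean scenes")
    hidden = sum(1 for after in eligible if not after)
    return hidden / len(eligible)


# Made-up detection flags for five host vehicles (not thesis data).
clean = [True, True, True, False, True]
attacked = [False, True, False, False, False]
rate = hiding_rate(clean, attacked)
print(f"{rate:.2%}")
```

Vehicles the detector already misses in the clean scene are excluded here so that the rate reflects only the effect of the adversarial object.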

Item Metadata

Title	Adversarial attacks on multi-modal 3D detection models
Creator	Abdelfattah, Mazen
Publisher	University of British Columbia
Date Issued	2021
Genre	Thesis/Dissertation
Type	Text
Language	eng
Date Available	2021-04-22
Provider	Vancouver : University of British Columbia Library
Rights	Attribution-NonCommercial-NoDerivatives 4.0 International
DOI	10.14288/1.0396930
URI	http://hdl.handle.net/2429/77927
Degree	Master of Applied Science - MASc
Program	Electrical and Computer Engineering
Affiliation	Applied Science, Faculty of; Electrical and Computer Engineering, Department of
Degree Grantor	University of British Columbia
Graduation Date	2021-05
Campus	UBCV
Scholarly Level	Graduate
Rights URI	http://creativecommons.org/licenses/by-nc-nd/4.0/
Aggregated Source Repository	DSpace
