UBC Faculty Research and Publications
Efficient Layer-Wise Cross-View Calibration and Aggregation for Multispectral Object Detection
He, Xiao; Yang, Tong; Yan, Tingzhou; Li, Hongtao; Ge, Yang; Ren, Zhijun; Liu, Zhe; Jiang, Jiahe; Tang, Chang
Abstract
Multispectral object detection is a fundamental task with a wide range of practical applications. In particular, combining visible (RGB) and infrared (IR) images can offer complementary information that enhances detection performance in different weather scenarios. However, existing methods generally align features across modalities using the region proposals of two-stage detectors, which are often slow and unsuitable for large-scale applications. To overcome this challenge, we introduce a novel one-stage oriented detector for RGB-infrared object detection called the Layer-wise Cross-Modality calibration and Aggregation (LCMA) detector. LCMA employs a layer-wise strategy to achieve cross-modality alignment by using the proposed inter-modality spatial-reduction attention. Moreover, we design a Gated Coupled Filter in each layer to capture semantically meaningful features while ensuring that well-aligned foreground object information is obtained before it is forwarded to the detection head. This removes the need for a region proposal step for alignment, enabling direct category and bounding box predictions in a unified one-stage oriented detector. Extensive experiments on two challenging datasets demonstrate that the proposed LCMA outperforms state-of-the-art methods in terms of both accuracy and computational efficiency, indicating the efficacy of our approach in exploiting multi-modality information for robust and efficient multispectral object detection.
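The abstract does not give the formulation of the inter-modality spatial-reduction attention, so the following is only a rough sketch of the general idea under stated assumptions, not the authors' implementation. It assumes that each location in one modality's feature map (the query) attends to a spatially downsampled copy of the other modality's feature map (the keys and values). Average pooling stands in for whatever learned reduction the paper uses, learned projections are omitted, and the names `avg_pool` and `cross_modality_sr_attention` are hypothetical.

```python
import math


def avg_pool(feature_map, ratio):
    """Reduce an H x W grid of C-dim vectors by averaging ratio x ratio blocks."""
    height, width = len(feature_map), len(feature_map[0])
    channels = len(feature_map[0][0])
    pooled = []
    for top in range(0, height - ratio + 1, ratio):
        for left in range(0, width - ratio + 1, ratio):
            block = [feature_map[top + dy][left + dx]
                     for dy in range(ratio) for dx in range(ratio)]
            pooled.append([sum(vec[c] for vec in block) / len(block)
                           for c in range(channels)])
    return pooled  # flat list of (H // ratio) * (W // ratio) tokens


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_modality_sr_attention(query_map, context_map, ratio=2):
    """Each query-modality location attends to pooled context-modality tokens.

    Assumption: query/key/value projections are treated as identity, and
    average pooling replaces any learned spatial reduction.
    """
    context_tokens = avg_pool(context_map, ratio)
    channels = len(context_tokens[0])
    scale = 1.0 / math.sqrt(channels)
    output = []
    for row in query_map:
        out_row = []
        for query in row:
            # Scaled dot-product scores against every pooled context token.
            scores = [scale * sum(q * k for q, k in zip(query, key))
                      for key in context_tokens]
            weights = softmax(scores)
            attended = [sum(w * tok[c] for w, tok in zip(weights, context_tokens))
                        for c in range(channels)]
            # Residual connection: keep the query feature, add attended context.
            out_row.append([q + a for q, a in zip(query, attended)])
        output.append(out_row)
    return output


# Toy 4 x 4 feature maps with 2 channels per location.
rgb = [[[1.0, 0.0] for _ in range(4)] for _ in range(4)]
ir = [[[0.0, 1.0] for _ in range(4)] for _ in range(4)]
fused = cross_modality_sr_attention(rgb, ir, ratio=2)
print(len(fused), len(fused[0]), fused[0][0])
```

In this sketch, each of the H·W query locations scores only (H/r)·(W/r) pooled tokens rather than all H·W locations of the other modality, which is the source of the efficiency gain usually associated with spatial-reduction attention.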
Item Metadata
| Field | Value |
| --- | --- |
| Title | Efficient Layer-Wise Cross-View Calibration and Aggregation for Multispectral Object Detection |
| Publisher | Multidisciplinary Digital Publishing Institute |
| Date Issued | 2026-01-23 |
| Language | eng |
| Date Available | 2026-02-13 |
| Provider | Vancouver : University of British Columbia Library |
| Rights | CC BY 4.0 |
| DOI | 10.14288/1.0451491 |
| Citation | Electronics 15 (3): 498 (2026) |
| Publisher DOI | 10.3390/electronics15030498 |
| Peer Review Status | Reviewed |
| Scholarly Level | Faculty; Researcher |
| Aggregated Source Repository | DSpace |