26 October 2022 Exploring fusion techniques in U-Net and DeepLab V3 architectures for multi-modal land cover classification
Kevin Qiu, Lina E. Budde, Dimitri Bulatov, Dorota Iwaszczuk
Abstract
Many deep learning architectures exist for semantic segmentation. In this paper, their application to multi-modal remote sensing data is examined. Two well-known network architectures, U-Net and DeepLab V3+, developed originally for RGB image data, are modified to accept additional input channels, such as near-infrared or depth information. In both networks, ResNet101 is used as the backbone, while the data-preprocessing steps, including data augmentation, are identical. We compare both networks and experiment with different fusion techniques in U-Net and with hyper-parameters for weighting the input channels for fusion in DeepLab V3+. We also evaluate the effect of pre-training on RGB and non-RGB data. The results show a marginally better performance of the DeepLab V3+ model compared to U-Net, while for certain classes, such as vehicles, U-Net yields slightly superior accuracy.
© (2022) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Kevin Qiu, Lina E. Budde, Dimitri Bulatov, and Dorota Iwaszczuk "Exploring fusion techniques in U-Net and DeepLab V3 architectures for multi-modal land cover classification", Proc. SPIE 12268, Earth Resources and Environmental Remote Sensing/GIS Applications XIII, 122680T (26 October 2022); https://doi.org/10.1117/12.2636144
KEYWORDS: RGB color model, Buildings, Convolution, Image segmentation, Computer programming, Data modeling, Image fusion